
Question


Question 2. Branch delay on a heavily pipelined architecture


Sometimes, architectures are very heavily pipelined, to get a fast clock cycle time. I might not mind a higher CPI, if the clock cycle time is very fast!

Suppose that in a certain mix of code:
16% of instructions are conditional branches.
60% of conditional branches are taken.
1% of instructions are jumps (unconditional branches, always taken).
The average CPI of non-branch instructions is 1.2.

A heavily pipelined architecture with fourteen pipeline stages calculates branch addresses (for both unconditional and conditional branches) in the third stage, storing the address in the pipeline stage register between the third and fourth stages before it can be used. It calculates the condition for conditional branches in the tenth stage, storing the condition bit in the pipeline stage register between the ninth and tenth stages before it can be used.

(Throughout, be sure you clearly state how many cycles of stall are needed for conditional and unconditional branches, respectively. Draw a pipeline stage diagram, and think about when exactly you can update the PC in each situation.)

a) If the architecture uses a freeze-the-pipeline strategy, what is the CPI?
b) If the architecture uses a predict-not-taken strategy, what is the CPI, and what is the speed-up relative to the freeze-the-pipeline architecture?
c) If the architecture uses a predict-taken strategy, what is the CPI, and what is the speed-up relative to the freeze-the-pipeline architecture? (Note that you still need the branch address before you can branch, even if you predict-taken.)
d) Branch predictors are pieces of hardware that predict the condition bit of a branch based on run-time conditions at the time the branch starts executing. What is the (theoretical) best CPI that could be achieved if the machine were able to use a branch predictor to perfectly predict whether a branch is taken? What is the corresponding speed-up relative to the freeze-the-pipeline architecture?
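One way to organize the arithmetic for parts a) through d) is a small script that takes the stall counts as inputs. This is only a sketch under stated assumptions, not the official solution. The question does not give the stall counts or the base CPI of a branch, so the constants `CPI_BRANCH_BASE`, `ADDR_STALL`, and `COND_STALL` below are assumptions: a branch is taken to cost 1 cycle plus its stalls, the target address is assumed usable for a fetch in cycle 4 (2 stall cycles), and the condition is assumed usable for a fetch in cycle 11 (9 stall cycles). A different reading of the pipeline registers would change those two stall counts, so check them against your own pipeline diagram.

```python
# Sketch: CPI under different branch-handling strategies.
# Values taken from the question:
F_COND = 0.16      # fraction of instructions that are conditional branches
F_JUMP = 0.01      # fraction of instructions that are jumps
P_TAKEN = 0.60     # fraction of conditional branches that are taken
CPI_OTHER = 1.2    # average CPI of non-branch instructions

# ASSUMPTIONS (not stated in the question; adjust to match your diagram):
CPI_BRANCH_BASE = 1.0  # a branch costs 1 cycle before any stalls
ADDR_STALL = 2         # stalls until the branch target address can be fetched
COND_STALL = 9         # stalls until the branch condition can steer the fetch


def cpi(taken_stall, not_taken_stall, jump_stall=ADDR_STALL):
    """Average CPI given the stall cycles for each branch outcome."""
    f_other = 1.0 - F_COND - F_JUMP
    cond_cpi = (CPI_BRANCH_BASE
                + P_TAKEN * taken_stall
                + (1.0 - P_TAKEN) * not_taken_stall)
    jump_cpi = CPI_BRANCH_BASE + jump_stall
    return f_other * CPI_OTHER + F_COND * cond_cpi + F_JUMP * jump_cpi


strategies = {
    # Freeze: every conditional branch waits for the condition.
    "freeze": cpi(COND_STALL, COND_STALL),
    # Predict not taken: only taken branches pay the condition delay.
    "predict-not-taken": cpi(COND_STALL, 0),
    # Predict taken: taken branches wait for the address;
    # not-taken branches are discovered late and pay the condition delay.
    "predict-taken": cpi(ADDR_STALL, COND_STALL),
    # Perfect predictor: taken branches still wait for the address.
    "perfect predictor": cpi(ADDR_STALL, 0),
}

for name, value in strategies.items():
    # Speed-up is the CPI ratio, assuming equal clock rate and instruction count.
    speedup = strategies["freeze"] / value
    print(f"{name:18s} CPI = {value:.3f}  speed-up vs freeze = {speedup:.3f}")
```

Under these assumed stall counts the script gives CPIs of about 2.626, 2.050, 1.954, and 1.378 for the four strategies. Changing `ADDR_STALL` or `COND_STALL` re-derives every part at once, which is useful if your pipeline diagram gives different stall counts.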
