Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

( 3 0 points ) Consider the following instruction sequence running on the five - stage pipelined processor: beq x 1 1 , x 1

(30 points) Consider the following instruction sequence running on the five-stage pipelined
processor:
beq x11, x12, Label
sd x15,0(23)
ld x15,0(x24)
add 11,6,12
sub x11,x13,x12
Assume x11x12.
Note that in the following questions, structural hazards are considered only in (a) and (b).
(a)(10 points) Assume the processor predicts each branch instruction to be not taken. If we only
have one memory (for both instructions and data), there is a structural hazard every time we
need to fetch an instruction in the same cycle in which another instruction accesses data. To
guarantee the processor to work correctly, this structural hazard must always be resolved in
favor of the instruction that accesses data. In other words, there is a hazard detection unit in
the IF stage, and if a structural hazard occurs, the instruction in the IF stage needs to stall for
that cycle. What is the total execution time of this instruction sequence? We have learned
that data hazards can be eliminated by adding NOPs to the code, so can you do the same with
this structural hazard and why? (b)(5 points) Assume we use the same processor in (a). What is the minimum number of cycles
you can achieve by adjusting the order of the instructions without losing the correctness?
Also give the new sequence of instructions after re-ordering.
(c)(5 points) Assuming Stall on Branch (i.e., wait until the branch outcome is determined before
fetching next instruction), what speedup is achieved on this instruction sequence if branch
outcomes are determined in the ID stage, relative to the execution where branch outcomes are
determined in the MEM stage?
(d)(5 points) Assume the processor predicts each branch instruction to be not taken. Also assume
each individual pipeline stage of IF, ID, EX, MEM, and WB has the latency of 210ps,160
ps,220ps,180ps, and 100ps, respectively. If we change load/store instructions to use a
register (without an offset) as the address, these instructions no longer need to use the ALU.
As a result, MEM and EX stages can be overlapped and the pipeline has only 4 stages.
Assuming this change does not affect the clock period, what speedup is achieved in this
instruction sequence compared to the original five-stage one?
(e)(5 points) Given the pipeline stage latencies in (d), repeat the speedup calculation of (d) by
considering the (possible) change in the clock period as follows. When EX and MEM are
done in a single stage (called EX/MEM stage), most of their work can be done in parallel. As
a result, the EX/MEM stage now has a latency that is the larger of the original two, plus 25ps
needed for the work that could not be done in parallel.
image text in transcribed

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Securing SQL Server Protecting Your Database From Attackers

Authors: Denny Cherry

1st Edition

1597496251, 978-1597496254

More Books

Students also viewed these Databases questions

Question

* What is the importance of soil testing in civil engineering?

Answered: 1 week ago

Question

Explain the concept of shear force and bending moment in beams.

Answered: 1 week ago