Question
(6 points - Correctness) More Hazards. Given the following program:

    loop: lw   $t2, 4($t0)
          ori  $t2, $t2, 8
          sw   $t2, 0($t0)
          addi $t0, $t0, 4
          bne  $t0, $t1, loop
          add  $t4, $t3, $t5

(a) Rearrange and/or modify the instructions in the given program to increase performance. Do not remove instructions; the number of instructions must remain the same. Assume a 5-stage pipelined processor with forwarding wherever possible. Also assume that branches are resolved in Decode, that branches are always predicted not-taken, and that there is no architectural branch delay slot. Make sure the program remains functionally correct after your changes.

(b) Compare the average steady-state CPI of the loop before (the given program) and after the modifications from part (a). Assume that branches are resolved in Decode, that branches are always predicted not-taken, and that there is no branch delay slot. Also assume that, for branch instructions, data is forwarded from the beginning of the M stage to the comparator in the D stage. If necessary, consider the case where the branch is taken 100 times before execution leaves the loop.
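One way to check the CPI comparison in part (b) is a small stall-counting model. The Python sketch below is not the hidden expert solution; it is a minimal model under these assumptions: the third instruction of the loop is sw $t2, 0($t0); an ALU result can feed the next instruction's EX stage with no stall; a load result costs one stall for an immediately following consumer; a branch compares in D and therefore needs its operands one cycle earlier than other instructions; a taken branch costs one flushed fetch slot. The reordering shown (addi moved up, sw offset changed to -4) is one possible answer to part (a), and the names min_gap, cycles_per_iteration and average_cpi are hypothetical.

```python
# Each instruction: (kind, destination register or None, source registers).
ORIGINAL = [
    ("load",   "$t2", ["$t0"]),          # lw   $t2, 4($t0)
    ("alu",    "$t2", ["$t2"]),          # ori  $t2, $t2, 8
    ("store",  None,  ["$t2", "$t0"]),   # sw   $t2, 0($t0)   (mnemonic assumed)
    ("alu",    "$t0", ["$t0"]),          # addi $t0, $t0, 4
    ("branch", None,  ["$t0", "$t1"]),   # bne  $t0, $t1, loop
]

# One possible reordering: addi fills the load-use slot and moves away
# from bne; the sw offset becomes -4 because $t0 is already incremented.
REORDERED = [
    ("load",   "$t2", ["$t0"]),          # lw   $t2, 4($t0)
    ("alu",    "$t0", ["$t0"]),          # addi $t0, $t0, 4
    ("alu",    "$t2", ["$t2"]),          # ori  $t2, $t2, 8
    ("store",  None,  ["$t2", "$t0"]),   # sw   $t2, -4($t0)
    ("branch", None,  ["$t0", "$t1"]),   # bne  $t0, $t1, loop
]


def min_gap(producer_kind, consumer_kind):
    """Minimum distance in cycles between producer and consumer decode."""
    gap = 2 if producer_kind == "load" else 1  # load data is ready after MEM
    if consumer_kind == "branch":
        gap += 1                               # operands needed in D, not EX
    return gap


def cycles_per_iteration(loop, taken_penalty=1):
    """Steady-state cycles for one loop iteration that ends in a taken branch."""
    decode = []        # decode cycle of each instruction
    last_writer = {}   # register -> index of its most recent producer
    for i, (kind, dest, srcs) in enumerate(loop):
        cycle = decode[-1] + 1 if decode else 0
        for reg in srcs:
            if reg in last_writer:
                j = last_writer[reg]
                cycle = max(cycle, decode[j] + min_gap(loop[j][0], kind))
        decode.append(cycle)
        if dest is not None:
            last_writer[dest] = i
    return decode[-1] + 1 + taken_penalty


def average_cpi(loop, taken=100, taken_penalty=1):
    """CPI over taken+1 iterations; the final, not-taken branch is not flushed."""
    per_iter = cycles_per_iteration(loop, taken_penalty)
    total_cycles = taken * per_iter + (per_iter - taken_penalty)
    return total_cycles / ((taken + 1) * len(loop))


if __name__ == "__main__":
    for name, loop in (("original", ORIGINAL), ("reordered", REORDERED)):
        cycles = cycles_per_iteration(loop)
        print(name, cycles, "cycles per iteration, CPI", cycles / len(loop))
```

Under this model the given loop needs 8 cycles per iteration (5 instructions, one load-use stall, one stall before bne, one flushed slot), a steady-state CPI of 1.6. The reordered loop needs 6 cycles, a CPI of 1.2. The model ignores pipeline fill time and assumes no hazard carries across iterations, which holds here because the flushed slot separates bne from the next lw.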