Question
5. (28 pts.) The following compiled code runs on the 5-stage pipelined MIPS processor with 1.6 GHz clock speed and supply voltage of 1.2 V. RF write and read can be done in one cycle.

            addi $s0, $zero, 1000
    again:  beq  $s0, $zero, out
            addi $s0, $s0, -4
            lw   $s4, 0($s1)
            add  $s2, $s4, $s2
            add  $s5, $s2, $s0
            addi $s1, $s0, 5
            j    again
    out:    add  $v0, $zero, $s2

a) (6 pts.) Assume no implementation for hazard detection, no branch delay scheduling, and the branch condition is evaluated in the Execution stage. What is the execution time, after taking into account all stalls?

b) (6 pts.) Given the following characterized power parameters per pipeline stage, calculate the average power dissipation while running the given code on the processor. Assume the processor has no signal switching at any stage during stall cycles.

    Stage   Cdyn (nF)   Istatic (A)
    IF      0.2         2
    ID      0.8         1.4
    EX      1.6         0.6
    MEM     0.4         2.5
    WB      1           1.2

c) (10 pts.) The compiler has been enhanced to implement instruction reordering (but still no branch delay scheduling). Show reordered code for performance improvement, and calculate how much average CPI reduction can be achieved with your code.

d) (6 pts.) What is the total number of clock cycles that can be reduced compared to your answer in part (a) by implementing each of the following branch prediction schemes:
   i. Static prediction: Predict-Taken,
   ii. Dynamic 1-bit prediction with initial state of Predict-Not-Taken,
   iii. Dynamic 2-bit prediction with the below state diagram and initial state of Weakly-Predict-Not-Taken.

[Figure: 2-bit predictor state diagram with two Predict Taken states and two Predict Not Taken states, connected by Taken / Not taken transitions.]
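As a starting point for part (d), the sketch below counts how often each prediction scheme would mispredict the loop's beq. It is only an illustration under stated assumptions: the loop counter is taken to be $s0, starting at 1000 and decremented by 4 per iteration (the register names in the listing are partly illegible), and the 2-bit predictor is assumed to be the usual saturating counter, since the state diagram itself is not recoverable here. All function names are hypothetical. Turning misprediction counts into saved clock cycles still requires the per-misprediction penalty from part (a), which this sketch does not model, and the unconditional `j again` is not modeled either.

```python
def loop_branch_outcomes(start=1000, step=4):
    """Outcomes of `beq $s0, $zero, out` in program order (True = taken).

    Assumes $s0 starts at `start` and drops by `step` each iteration.
    """
    outcomes = []
    s0 = start
    while True:
        taken = (s0 == 0)      # beq is taken only when the counter hits zero
        outcomes.append(taken)
        if taken:
            break
        s0 -= step             # addi $s0, $s0, -4
    return outcomes


def mispredictions_static_taken(outcomes):
    """Static predict-taken: every not-taken outcome is a misprediction."""
    return sum(1 for taken in outcomes if not taken)


def mispredictions_one_bit(outcomes, initial_taken=False):
    """1-bit predictor: predict whatever the branch did last time."""
    state = initial_taken
    misses = 0
    for taken in outcomes:
        if state != taken:
            misses += 1
        state = taken
    return misses


def mispredictions_two_bit(outcomes, initial_state=1):
    """2-bit saturating counter (assumed form of the missing diagram).

    States: 0 = strongly not taken, 1 = weakly not taken,
            2 = weakly taken,       3 = strongly taken.
    """
    state = initial_state
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        state = min(3, state + 1) if taken else max(0, state - 1)
    return misses


if __name__ == "__main__":
    outcomes = loop_branch_outcomes()
    print("branch evaluations:", len(outcomes))
    print("static predict-taken misses:", mispredictions_static_taken(outcomes))
    print("1-bit misses:", mispredictions_one_bit(outcomes))
    print("2-bit misses:", mispredictions_two_bit(outcomes))
```

Under these assumptions the beq is evaluated 251 times (250 not taken, then 1 taken), so Predict-Taken mispredicts on every loop iteration while both dynamic schemes miss only on the final exit.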