Answered step by step
Verified Expert Solution
Question
1 Approved Answer
3. For this question, check all that apply. Given the circumstances described in question 1 above, which of the following changes by itself would yield
3. For this question, check all that apply. Given the circumstances described in question 1 above, which of the following changes by itself would yield at least 2X speedup? Using a bigger battery and fan, the clock can run at 1GHz Adding a cache memory reduces both Load and Store to 8 CPI An improved design reduces the CPI for ALU instructions from 8 to 4 A new compiler reduces the number of ALU instructions from 100 to 30 A multi-cycle ALU makes ALU instructions take 10 CPI, but allows a 3ns clock period 1. Given this processor hardware design, add control states to the following to implement an average instruction (as decoded by the when below), such that avg $rd, $rs, $rt gives rd the average of the other two values. You're probably used to averaging by adding and then dividing by 2, but that actually doesn't work -- because the add can go out of range. Instead, use the algorithm that rd=((rs&rt)+((rs^rt)>>1)), which is a trick from Prof. Dietz's MAGIC algorithms page. when op() op(1) Avg Start: PCout, MARin, MEMread, Yin CONST(4), ALUadd, Zin, UNTILmfc MDRout, I Rin Zout, PCin, JUMPonop HALT /* Should end here on undecoded op */ Avg: SELrs, REGout, Yin SELrt, REGout, ALUand, Zin Zout, MDRin SELTS, REGout, Yin SELrt, REGout, ALUxor, Zin CONST(1), Yin Zout, ALUsrl, Zin MDRout, Yin Zout, ALUadd, Zin Zout, SELrd, REGin, JUMP(Start) 3. For this question, check all that apply. Given the circumstances described in question 1 above, which of the following changes by itself would yield at least 2X speedup? Using a bigger battery and fan, the clock can run at 1GHz Adding a cache memory reduces both Load and Store to 8 CPI An improved design reduces the CPI for ALU instructions from 8 to 4 A new compiler reduces the number of ALU instructions from 100 to 30 A multi-cycle ALU makes ALU instructions take 10 CPI, but allows a 3ns clock period 1. Given this processor hardware design, add control states to the following to implement an average instruction (as decoded by the when below), such that avg $rd, $rs, $rt gives rd the average of the other two values. You're probably used to averaging by adding and then dividing by 2, but that actually doesn't work -- because the add can go out of range. Instead, use the algorithm that rd=((rs&rt)+((rs^rt)>>1)), which is a trick from Prof. Dietz's MAGIC algorithms page. when op() op(1) Avg Start: PCout, MARin, MEMread, Yin CONST(4), ALUadd, Zin, UNTILmfc MDRout, I Rin Zout, PCin, JUMPonop HALT /* Should end here on undecoded op */ Avg: SELrs, REGout, Yin SELrt, REGout, ALUand, Zin Zout, MDRin SELTS, REGout, Yin SELrt, REGout, ALUxor, Zin CONST(1), Yin Zout, ALUsrl, Zin MDRout, Yin Zout, ALUadd, Zin Zout, SELrd, REGin, JUMP(Start)
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started