
Question


When counting the number of cycles for a loop, find out when (in which cycle) the first instruction in the loop is fetched and when it is fetched again (assuming the loop repeats), and then calculate the difference of cycle numbers. The difference is the number of cycles an iteration takes. For example, if the first instruction in a loop is fetched in cycle 1 and fetched again in cycle 11, each iteration takes 10 cycles.
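The counting rule above can be sketched in a few lines of Python. The function name `cycles_per_iteration` and its list argument are hypothetical, introduced only for illustration; the input is assumed to be the cycle numbers in which the first instruction of the loop is fetched, read off a pipeline diagram.

```python
def cycles_per_iteration(fetch_cycles):
    """Return the gap between consecutive fetches of the loop's first instruction.

    fetch_cycles: cycle numbers (hypothetical input, taken from a pipeline
    diagram) in which the first instruction of the loop is fetched.
    """
    return [later - earlier
            for earlier, later in zip(fetch_cycles, fetch_cycles[1:])]


# The example from the text: first fetch in cycle 1, fetched again in cycle 11.
print(cycles_per_iteration([1, 11]))  # [10]
```

Passing more than two fetch cycles gives one gap per iteration, which is a quick way to check that the loop has reached a steady state.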


4. (30 points) In this exercise, we will look at how a common vector loop runs on a statically scheduled version of the MIPS pipeline. (We will consider dynamic scheduling in later assignments.) The loop is the so-called DAXPY loop, the central operation in Gaussian elimination. It implements the vector operation Y = a*X + Y. Here is the MIPS code for the loop:

    foo: L.D     F2, 0(R1)      ; load X[i]
         MULT.D  F4, F2, F0     ; multiply a*X[i]
         L.D     F6, 0(R2)      ; load Y[i]
         ADD.D   F6, F4, F6
         S.D     F6, 0(R2)      ; store Y[i]
         DADDI   R1, R1, 8      ; increment X pointer
         DADDI   R2, R2, 8      ; increment Y pointer
         DADDI   R3, R3, 1      ; increment the counter
         BNE     R3, R9, foo    ; loop if not done

Instructions ending in '.D' are double-precision floating-point operations. R3 is the loop counter. It is incremented every iteration, and the loop exits when R3 reaches R9.

For this problem, use the MIPS pipeline shown in Figure C.35 (in Section C.5). The pipeline latencies are listed in Figure C.34 (the numbers are also shown on the slide "Latencies and initiation intervals for functional units"). Assume that results are fully forwarded to the beginning of each functional unit (but not to the MEM stages). The MEM stage of a load instruction completes in 1 clock cycle. The branch instruction is resolved in the EXE stage and it is not delayed. Branches are predicted not-taken.

a) Show a pipeline diagram for this instruction sequence, starting from cycle 1 and ending at the cycle where the first L.D enters the ID stage in the second iteration. How many clock cycles does each loop iteration take?

b) Construct a table that shows when (in which cycle) each instruction enters the execution stage, and the number of cycles the instruction has to wait in the ID stage. An example is shown below.

    Instruction              EXE Start   Stalls
    L.D     F2, 0(R1)        3
    MULT.D  F4, F2, F0       5           1
    L.D     F6, 0(R2)
    ADD.D   F6, F4, F6
    S.D     F6, 0(R2)
    DADDI   R1, R1, 8
    DADDI   R2, R2, 8
    DADDI   R3, R3, 1
    BNE     R3, R9, foo
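As a reference for what the loop computes, here is a minimal Python sketch of DAXPY. The function name `daxpy` and the use of Python lists are assumptions made for illustration; the comments map each statement to the MIPS instruction it mirrors, with `a` standing in for the value held in F0. It models only the arithmetic, not the pipeline timing.

```python
def daxpy(a, x, y):
    """Update y in place so that y[i] = a * x[i] + y[i] for every i."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for i in range(len(x)):      # DADDI R3, R3, 1 / BNE R3, R9, foo
        xi = x[i]                # L.D    F2, 0(R1)
        prod = a * xi            # MULT.D F4, F2, F0
        yi = y[i]                # L.D    F6, 0(R2)
        y[i] = prod + yi         # ADD.D  F6, F4, F6 then S.D F6, 0(R2)
    return y


print(daxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # [12.0, 24.0, 36.0]
```

The data dependences visible here (the multiply needs the loaded X[i], the add needs both the product and the loaded Y[i], the store needs the sum) are the ones that cause stalls in the pipeline diagram.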

