Exercise 7.3 Consider the following piece of C code:

    for (j=2; j<1000; j++)
        D[j] = D[j-1] + D[j-2];

The MIPS code corresponding to the above fragment is:

          DADDIU r2, r2, 999
    loop: L.D    f1, -16(r1)
          L.D    f2, -8(r1)
          ADD.D  f3, f1, f2
          S.D    f3, 0(r1)
          DADDIU r1, r1, 8
          BNE    r1, r2, loop

Instructions have the following associated latencies (in cycles):

    ADD.D   L.D   S.D   DADDIU   BNE
    3       5     1     1        3

7.3.1 [10] <7.2> How many cycles does it take for all instructions in a single iteration of the above loop to execute?
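As a starting point for 7.3.1, the latencies can simply be added up for the six instructions in the loop body. The sketch below does only that, and it rests on an assumption that is not stated in the exercise: instructions do not overlap at all, so each one finishes before the next begins. Under that reading the sum is 5 + 5 + 3 + 1 + 1 + 3 = 18 cycles; a pipelined model with overlap would give a different count. The function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Latencies from the exercise's table, in cycles. */
enum { LAT_ADD_D = 3, LAT_L_D = 5, LAT_S_D = 1, LAT_DADDIU = 1, LAT_BNE = 3 };

/* Sum the latencies of a sequence of instructions.
 * ASSUMPTION (not from the exercise): no overlap between instructions,
 * i.e. each instruction completes before the next one starts. */
int serial_cycles(const int *latencies, size_t count)
{
    assert(latencies != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += latencies[i];
    return total;
}

/* One iteration of the loop body: L.D, L.D, ADD.D, S.D, DADDIU, BNE. */
int one_iteration_serial_cycles(void)
{
    const int body[] = {
        LAT_L_D, LAT_L_D, LAT_ADD_D, LAT_S_D, LAT_DADDIU, LAT_BNE
    };
    return serial_cycles(body, sizeof body / sizeof body[0]);
}
```

Calling one_iteration_serial_cycles() returns 18 under the no-overlap assumption.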
7.3.2 [10] <7.2> When an instruction in a later iteration of a loop depends upon a data value produced in an earlier iteration of the same loop, we say that there is a loop-carried dependence between iterations of the loop. Identify the loop-carried dependences in the above code. Identify the dependent program variable and assembly-level registers. You can ignore the loop induction variable j.
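One way to see the loop-carried dependence is to run the same iterations in a different order and check whether the result changes. The sketch below is an illustration only: the array size N, the initial values, and all function names are assumptions chosen to keep the example small (the exercise itself uses 1000 elements and does not give initial values).

```c
#include <assert.h>
#include <stddef.h>

#define N 10  /* hypothetical small size for illustration */

/* The loop from the exercise, iterations in program order. */
void fill_in_order(double *D, size_t n)
{
    assert(n >= 2);
    for (size_t j = 2; j < n; j++)
        D[j] = D[j - 1] + D[j - 2];
}

/* The same iterations executed last-to-first. Each iteration now reads
 * D[j-1] and D[j-2] before the iterations that produce them have run. */
void fill_reversed(double *D, size_t n)
{
    assert(n >= 2);
    for (size_t j = n - 1; j >= 2; j--)
        D[j] = D[j - 1] + D[j - 2];
}

/* Returns D[N-1] after running one of the two orders on a fresh array
 * with D[0] = D[1] = 1 and all other elements 0. */
double last_value(int reversed)
{
    double D[N] = { 1.0, 1.0 };
    if (reversed)
        fill_reversed(D, N);
    else
        fill_in_order(D, N);
    return D[N - 1];
}
```

With these inputs, last_value(0) is 55.0 while last_value(1) is 0.0. The order of iterations matters because iteration j reads the values that iterations j-1 and j-2 wrote.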
7.3.3 [10] <7.2> Loop unrolling was described in Chapter 4. Apply loop unrolling to this loop and then consider running this code on a 2-node distributed memory message-passing system. Assume that we are going to use message passing as described in Section 7.4, where we introduce a new operation send(x, y) that sends to node x the value y, and an operation receive() that waits for the value being sent to it.
Assume that send operations take a cycle to issue (i.e., later instructions on the same node can proceed on the next cycle), but take 4 cycles to be received on the receiving node. Receive instructions stall execution on the node where they are executed until they receive a message. Produce a schedule for the two nodes; assume an unroll factor of 4 for the loop body (i.e., the loop body will appear 4 times). Compute the number of cycles it will take for the loop to run on the message-passing system.
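The unrolling step on its own can be sketched at the C level as follows. This is a minimal illustration, not a schedule for the two nodes: it does not model send, receive, or any latencies. The loop runs 998 iterations (j = 2 to 999), which is not a multiple of 4, so the sketch assumes a cleanup loop for the 2 leftover iterations. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Original (rolled) loop from the exercise. */
void recurrence_rolled(double *D, size_t n)
{
    assert(D != NULL);
    for (size_t j = 2; j < n; j++)
        D[j] = D[j - 1] + D[j - 2];
}

/* Same computation with the loop body repeated 4 times per trip. */
void recurrence_unrolled4(double *D, size_t n)
{
    assert(D != NULL);
    size_t j = 2;

    /* Main unrolled loop: runs while four more elements remain. */
    for (; j + 3 < n; j += 4) {
        D[j]     = D[j - 1] + D[j - 2];
        D[j + 1] = D[j]     + D[j - 1];
        D[j + 2] = D[j + 1] + D[j];
        D[j + 3] = D[j + 2] + D[j + 1];
    }

    /* Cleanup loop for iterations left over when the trip count
     * is not a multiple of 4 (2 iterations when n is 1000). */
    for (; j < n; j++)
        D[j] = D[j - 1] + D[j - 2];
}

/* Runs both versions on identical inputs (assumed initial values
 * D[0] = 0, D[1] = 1) and returns 1 if every element matches. */
int unrolled_matches_rolled(void)
{
    enum { N = 1000 };
    static double a[N], b[N];

    a[0] = b[0] = 0.0;
    a[1] = b[1] = 1.0;
    recurrence_rolled(a, N);
    recurrence_unrolled4(b, N);

    for (size_t i = 0; i < N; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Within the unrolled body, each statement still reads the result of the statement just before it, so the four copies form a dependence chain rather than four independent operations. That chain is what makes splitting the work across two nodes require communication.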
7.3.4 [10] <7.2> The latency of the interconnect network plays a large role in the efficiency of message-passing systems. How fast does the interconnect need to be in order to obtain any speed-up from using the distributed system described in Exercise 7.3.3?
