Question
… iteration of the shown loop. Assume that the data values of the arrays A, B, and C are already in vector registers, so there are no loads and stores in this program. Hint: Notice that there are … instructions in each thread. A warp in the GPU consists of … threads, and there are … SIMD lanes in the GPU.
for (i = …; i …; i…)
if (B[i] …)
A[i] = A[i] * C[i];
B[i] = B[i] + D[i];
D[i] = D[i] + …;
C[i] = C[i] + …;
How many warps does it take to execute this program?
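The arithmetic behind the warp count can be sketched as follows, assuming each loop iteration is mapped to one GPU thread. The function names and the example values (64 iterations, 32 threads per warp, 8 SIMD lanes, 4 instructions per thread) are hypothetical placeholders, not the values from the problem statement, which must be substituted.

```python
def warps_needed(iterations: int, threads_per_warp: int) -> int:
    """Warps required when each loop iteration maps to one thread (ceiling division)."""
    return -(-iterations // threads_per_warp)


def passes_per_instruction(threads_per_warp: int, simd_lanes: int) -> int:
    """Passes through the SIMD lanes needed to execute one warp-wide instruction."""
    return -(-threads_per_warp // simd_lanes)


def lane_passes_total(iterations: int, instructions_per_thread: int,
                      threads_per_warp: int, simd_lanes: int) -> int:
    """Rough total of lane passes, assuming no stalls and that every
    instruction is issued for every warp (masked-off threads still occupy lanes)."""
    return (warps_needed(iterations, threads_per_warp)
            * instructions_per_thread
            * passes_per_instruction(threads_per_warp, simd_lanes))


# Hypothetical example values; replace with the numbers given in the problem.
print(warps_needed(64, 32))              # 2
print(passes_per_instruction(32, 8))     # 4
print(lane_passes_total(64, 4, 32, 8))   # 32
```

The ceiling division matters when the iteration count is not a multiple of the warp size: a partially filled warp still counts as a whole warp.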
B. The code in part A, excluding the test of B[i], is now executed on a vector processor with … vector registers of … elements each. This time the vector processor has to load the vectors from memory and store them at the end of the loop, that is, after the operations on them have finished executing. The processor has two load/store (L/S) units, one add (ADDVV/ADDVI) unit, and one multiply (MULVV) unit, and does not support chaining.
The vector operations are shown below:
A = A * C;
B = B + D;
D = D + …;
C = C + …;
The vector code is shown below:
I1   LV A
I2   LV C
I3   MULVV A, A, C
I4   LV B
I5   LV D
I6   ADDVV B, B, D
I7   SV A
I8   ADDVI D, D, …
I9   SV B
I10  ADDVI C, C, …
I11  SV D
I12  SV C
a. Show how the code will be executed using the table below. Use … squares in the horizontal direction for each operation.
b. Assume that chaining is supported and show how the code can be executed in the table below. Use … squares in the horizontal direction for each operation. When chaining, start the operation that you chain to at least one square to the right of the operation that you chain from. There will be only one chaining path from one functional unit to another functional unit.
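One way to check a hand-drawn timing table is a small scheduling model. The sketch below is a simplification under stated assumptions, not the expected solution: every operation occupies its unit for `length` squares (a hypothetical value of 4 is used in the example), instructions issue in program order, start-up latency is ignored, and the restriction to a single chaining path between functional units is not modelled. Without chaining, a dependent operation waits for the earlier one to finish; with chaining, it may start one square after the earlier one starts.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    unit: str      # "LS", "ADD" or "MUL"
    reads: tuple   # vector registers read
    writes: tuple  # vector registers written


# The twelve instructions from the listing; immediates are irrelevant for timing.
PROGRAM = [
    Op("LV A", "LS", (), ("A",)),
    Op("LV C", "LS", (), ("C",)),
    Op("MULVV A,A,C", "MUL", ("A", "C"), ("A",)),
    Op("LV B", "LS", (), ("B",)),
    Op("LV D", "LS", (), ("D",)),
    Op("ADDVV B,B,D", "ADD", ("B", "D"), ("B",)),
    Op("SV A", "LS", ("A",), ()),
    Op("ADDVI D,D", "ADD", ("D",), ("D",)),
    Op("SV B", "LS", ("B",), ()),
    Op("ADDVI C,C", "ADD", ("C",), ("C",)),
    Op("SV D", "LS", ("D",), ()),
    Op("SV C", "LS", ("C",), ()),
]

UNITS = {"LS": 2, "ADD": 1, "MUL": 1}  # functional units given in the problem


def schedule(program, length, chaining=False):
    """Return a list of (start, end) squares, one pair per instruction."""
    free_at = {unit: [0] * count for unit, count in UNITS.items()}
    times = []
    prev_start = 0  # in-order issue: never start before the previous instruction
    for j, op in enumerate(program):
        earliest = prev_start
        for i in range(j):
            earlier = program[i]
            conflict = (set(earlier.writes) & set(op.reads)      # RAW
                        or set(earlier.reads) & set(op.writes)   # WAR
                        or set(earlier.writes) & set(op.writes)) # WAW
            if conflict:
                start_i, end_i = times[i]
                earliest = max(earliest, start_i + 1 if chaining else end_i)
        units = free_at[op.unit]
        k = min(range(len(units)), key=units.__getitem__)  # earliest-free unit
        start = max(earliest, units[k])
        units[k] = start + length
        times.append((start, start + length))
        prev_start = start
    return times


def total_time(program, length, chaining=False):
    return max(end for _, end in schedule(program, length, chaining))


# Hypothetical operation length of 4 squares.
print(total_time(PROGRAM, 4))                 # 24
print(total_time(PROGRAM, 4, chaining=True))  # 18
```

With these assumptions the unchained run groups into six back-to-back sets of operations (6 × 4 = 24 squares), while chaining lets MULVV and ADDVV overlap the loads that feed them. The two load/store units remain the bottleneck in both cases, since eight of the twelve instructions are loads or stores.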