Question
Computer architecture
Question 5) We define the SIMD utilization of a program run on a GPU as the fraction of SIMD lanes that are kept busy with active threads during the run of a program.
The following code segment is run on a GPU. Each thread executes a single iteration of the shown loop. Assume that the data values of the arrays A, B, and C are already in vector registers so there are no loads and stores in this program. (Hint: Notice that there are 4 instructions in each thread.)
A warp in the GPU consists of 64 threads, and there are 64 SIMD lanes in the GPU:
for (i = 0; i < 1024768; i++) {
    if (B[i] < 4444) {
        A[i] = A[i] * C[i];
        B[i] = A[i] + B[i];
        C[i] = B[i] + 1;
    }
}
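Since each thread runs one loop iteration and a warp holds 64 threads, the iteration count maps onto warps by a ceiling division. The following is a minimal sketch in C, assuming threads are packed into warps in index order; the names `WARP_SIZE` and `warps_needed` are illustrative and not part of the original question.

```c
/* Assumption from the question: 64 threads per warp. */
#define WARP_SIZE 64

/* Number of warps needed to cover `iterations` threads,
 * one thread per loop iteration (ceiling division).
 * Hypothetical helper, not part of the original question. */
long warps_needed(long iterations) {
    return (iterations + WARP_SIZE - 1) / WARP_SIZE;
}
```

Under these assumptions, `warps_needed(1024768)` evaluates to 16012, because 1024768 is an exact multiple of 64 and every warp is full.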
(a) How many warps does it take to execute this program?
(b) When we measure the SIMD utilization for this program with one input set, we find that it is 67 / 256. What can you say about arrays A, B, and C? Be precise. (Hint: Look at the if branch. What can you say about A, B, and C?)
(c) Is it possible for this program to yield a SIMD utilization of 100% (circle one)? If YES, what should be true about arrays A, B, and C for the SIMD utilization to be 100%? Be precise. If NO, explain why not.
(d) Is it possible for this program to yield a SIMD utilization of 25% (circle one)? If YES, what should be true about arrays A, B, and C for the SIMD utilization to be 25%? Be precise. If NO, explain why not.
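One way to reason about parts (b) through (d) is to count busy lane-slots per warp. The sketch below assumes a simple divergence model that the question does not spell out: the branch instruction runs with all 64 lanes active, the three body instructions are issued for a warp only if at least one of its threads takes the branch, lanes of threads that do not take the branch sit idle during those three instructions, and every warp is full. The type `lane_slots` and the function `simd_lane_slots` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Assumption from the question: 64 threads per warp, 64 SIMD lanes. */
#define WARP_SIZE 64
/* The three assignments inside the if body. */
#define BODY_INSTRUCTIONS 3

/* Busy and total lane-slots accumulated over a run.
 * Utilization is busy / total. */
typedef struct {
    long busy;
    long total;
} lane_slots;

/* taken[w] is the number of threads in warp w for which B[i] < 4444.
 * Hypothetical model, see the assumptions stated above. */
lane_slots simd_lane_slots(const int *taken, size_t num_warps) {
    lane_slots s = {0, 0};
    for (size_t w = 0; w < num_warps; w++) {
        /* Branch instruction: every lane evaluates the condition. */
        s.busy += WARP_SIZE;
        s.total += WARP_SIZE;
        /* Body instructions are issued only if some thread takes the branch. */
        if (taken[w] > 0) {
            s.busy += (long)BODY_INSTRUCTIONS * taken[w];
            s.total += (long)BODY_INSTRUCTIONS * WARP_SIZE;
        }
    }
    return s;
}
```

For instance, under this model a warp in which exactly one thread takes the branch contributes 64 + 3 = 67 busy lane-slots out of 4 × 64 = 256, while a warp in which no thread takes the branch issues only the branch instruction, so all of its issued lane-slots are busy.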