Question
Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14–18 with x86-64 for different types of integer and floating-point data. By applying the same sort of transformations we used to turn the abstract program combine1 into the more efficient combine4, we get the following code:
/* Inner product. Accumulate in temporary */
void inner4(vec_ptr u, vec_ptr v, data_t *dest) {
    long i;
    long length = vec_length(u);
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
Our measurements show that this function has a CPE of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:
# Inner loop of inner4. data_t = double. OP = *.
# udata in %rbp, vdata in %rax, sum in %xmm0, i in %rcx, limit in %rbx
.L15:                                    # loop:
    vmovsd 0(%rbp,%rcx,8), %xmm1         # Get udata[i]
    vmulsd (%rax,%rcx,8), %xmm1, %xmm1   # Multiply by vdata[i]
    vaddsd %xmm1, %xmm0, %xmm0           # Add to sum
    addq   $1, %rcx                      # Increment i
    cmpq   %rbx, %rcx                    # Compare i:limit
    jl     .L15                          # If <, goto loop
For x86-64, our measurements of a version that uses loop unrolling alone give a CPE of 1.07 for integer data but still 3.01 for floating-point data.
Write a version of the inner product procedure described above that uses 6 × 1a loop unrolling to enable greater parallelism (six-way unrolling, one accumulator, and a reassociation transformation). Measurements for this function give a CPE of 1.10 for integer data and 1.05 for floating-point data.
Step by Step Solution
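One possible answer is sketched below; it is an illustration under stated assumptions, not a definitive implementation. It assumes data_t is double and uses a minimal, hypothetical vector type (vec_rec, with a new_vec helper) in place of the real vec_ptr abstraction, which the question does not show. inner6x1 is plain six-way unrolling, included only for contrast. inner6x1a adds the reassociation: the six products are summed inside parentheses first, so only one addition per iteration lies on the loop-carried dependency through sum.

```c
#include <assert.h>
#include <stdlib.h>

typedef double data_t;               /* assumption: data_t is double */

/* Hypothetical stand-in for the vec_ptr abstraction used by inner4 */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Hypothetical helper: allocate a zero-filled vector of len elements */
static vec_ptr new_vec(long len) {
    vec_ptr v = malloc(sizeof(vec_rec));
    assert(v != NULL);
    v->len = len;
    v->data = calloc(len > 0 ? (size_t) len : 1, sizeof(data_t));
    assert(v->data != NULL);
    return v;
}

/* 6 x 1: six-way unrolling, one accumulator, no reassociation.
   The expression parses as ((sum + p0) + p1) + ..., so every
   addition still updates sum in sequence. */
void inner6x1(vec_ptr u, vec_ptr v, data_t *dest) {
    long i;
    long length = vec_length(u);
    long limit = length - 5;
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < limit; i += 6) {
        sum = sum + udata[i]   * vdata[i]   + udata[i+1] * vdata[i+1]
                  + udata[i+2] * vdata[i+2] + udata[i+3] * vdata[i+3]
                  + udata[i+4] * vdata[i+4] + udata[i+5] * vdata[i+5];
    }
    for (; i < length; i++) {        /* finish any remaining elements */
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}

/* 6 x 1a: same unrolling, but the products are combined in parentheses
   before being added to sum (the reassociation transformation). */
void inner6x1a(vec_ptr u, vec_ptr v, data_t *dest) {
    long i;
    long length = vec_length(u);
    long limit = length - 5;
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < limit; i += 6) {
        sum = sum + (udata[i]   * vdata[i]   + udata[i+1] * vdata[i+1]
                   + udata[i+2] * vdata[i+2] + udata[i+3] * vdata[i+3]
                   + udata[i+4] * vdata[i+4] + udata[i+5] * vdata[i+5]);
    }
    for (; i < length; i++) {        /* finish any remaining elements */
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

Because reassociation changes the order of the floating-point additions, inner6x1a can differ from inner4 in the last bits for general double inputs. For integer data that does not overflow, the results are identical.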