Question

Write a version of the inner product procedure described in Problem 5.13 in the textbook that uses 6×6 loop unrolling. Our measurements for this function with x86-64 give a CPE of 1.06 for integer data and 1.01 for floating-point data.

What factor limits the performance to a CPE of 1.00?

Fill in the missing parts of the code below.

    /* Inner Product. 6 x 6 unrolling */
    void inner_u6x6(vec_ptr u, vec_ptr v, data_t *dest)
    {
        long length = ...;
        long limit = ...;
        data_t *udata = get_vec_start(u);
        data_t *vdata = get_vec_start(v);
        data_t sum0 = (data_t) 0;
        data_t sum1 = (data_t) 0;
        data_t sum2 = (data_t) 0;
        data_t sum3 = (data_t) 0;
        data_t sum4 = (data_t) 0;
        data_t sum5 = (data_t) 0;

        /* Do 6 elements at a time */
        for (...) {
            sum0 = ...;
            sum1 = ...;
            sum2 = ...;
            sum3 = ...;
            sum4 = ...;
            sum5 = ...;
        }

        /* Finish off any remaining elements */
        for (...) {
            ...
        }
        *dest = ...;
    }
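One way the skeleton might be completed is sketched below. This is a hedged illustration, not the textbook's official solution. The `vec_rec` struct, `vec_length`, and `get_vec_start` are minimal stand-ins assumed here for the textbook's vector interface, and `data_t` is assumed to be `long` so that results are exact. The bound `limit = length - 5`, the loop index `i`, the length check, and the order in which the six partial sums are combined are likewise choices made for this sketch.

```c
#include <assert.h>

/* Assumption: integer data so results are exact; the textbook also uses
   floating-point types for data_t. */
typedef long data_t;

/* Hypothetical stand-in for the textbook's vector type. */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

/* Hypothetical stand-ins for the textbook's accessor functions. */
static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Inner product with 6 x 6 unrolling: six elements per iteration,
   accumulated into six independent partial sums. */
void inner_u6x6(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    long limit = length - 5;   /* last i for which i+5 is still in range */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum0 = (data_t) 0;
    data_t sum1 = (data_t) 0;
    data_t sum2 = (data_t) 0;
    data_t sum3 = (data_t) 0;
    data_t sum4 = (data_t) 0;
    data_t sum5 = (data_t) 0;

    /* Sketch-only sanity check, not part of the original skeleton. */
    assert(vec_length(v) >= length);

    /* Do 6 elements at a time */
    for (i = 0; i < limit; i += 6) {
        sum0 = sum0 + udata[i]     * vdata[i];
        sum1 = sum1 + udata[i + 1] * vdata[i + 1];
        sum2 = sum2 + udata[i + 2] * vdata[i + 2];
        sum3 = sum3 + udata[i + 3] * vdata[i + 3];
        sum4 = sum4 + udata[i + 4] * vdata[i + 4];
        sum5 = sum5 + udata[i + 5] * vdata[i + 5];
    }

    /* Finish off any remaining elements */
    for (; i < length; i++) {
        sum0 = sum0 + udata[i] * vdata[i];
    }

    *dest = (sum0 + sum1) + (sum2 + sum3) + (sum4 + sum5);
}
```

With six independent accumulators, the additions no longer form a single dependency chain, so the latency of the add is unlikely to be the limit. Each element still needs two loads, one from `udata` and one from `vdata`. Assuming the textbook's reference machine has two load units that can each start one load per cycle, load throughput caps the loop at one element per cycle, which would explain a CPE floor of 1.00. For floating-point `data_t`, splitting the sum across accumulators changes the order of additions and may change rounding slightly.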
