Question: Consider the following loop that adds a constant to a vector (we discussed this earlier). There's quite a lot of overhead associated with the solitary SIMD instruction. Suppose you were designing a new ISA that implemented operations like paddb. How would you make the code more efficient?

        movq  mm1, c        ; load constant into mm1 (8 copies)
        mov   cx, 3         ; set up loop counter for three trips (8 x 3 = 24)
        mov   esi, 0        ; set pointer to 0 (use as index into vector)
Next:   movq  mm0, x[esi]   ; Repeat: load 8 bytes into mm0 using indexed addressing
        paddb mm0, mm1      ;   now do 8 bytes of the vector addition
        movq  x[esi], mm0   ;   store 8 bytes of result in x
        add   esi, 8        ;   increment index by 8
        loop  Next          ; Until all done
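To make the overhead concrete, the loop above can be modelled in C. This is only an illustrative sketch: the names paddb_model, add_constant, LANES, TRIPS and the instruction counter are assumptions introduced here and are not part of the original question. The model also assumes that paddb is the wraparound (non-saturating) packed byte add, so each lane is computed modulo 256.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch: 8 byte lanes per MMX register,
   three trips round the loop, 24 bytes in the vector x. */
enum { LANES = 8, TRIPS = 3, N = LANES * TRIPS };

/* Models paddb: lane-wise 8-bit addition that wraps modulo 256. */
static void paddb_model(uint8_t dst[LANES], const uint8_t src[LANES]) {
    for (int i = 0; i < LANES; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Models the whole loop on x and returns how many assembly
   instructions the original code would have executed. */
static int add_constant(uint8_t x[N], uint8_t c) {
    uint8_t mm0[LANES], mm1[LANES];
    int executed = 0;

    memset(mm1, c, LANES);           executed++;  /* movq  mm1, c      */
    int cx = TRIPS;                  executed++;  /* mov   cx, 3       */
    int esi = 0;                     executed++;  /* mov   esi, 0      */
    do {
        memcpy(mm0, x + esi, LANES); executed++;  /* movq  mm0, x[esi] */
        paddb_model(mm0, mm1);       executed++;  /* paddb mm0, mm1    */
        memcpy(x + esi, mm0, LANES); executed++;  /* movq  x[esi], mm0 */
        esi += LANES;                executed++;  /* add   esi, 8      */
        executed++;                               /* loop  Next        */
    } while (--cx != 0);

    assert(esi == N);  /* every byte of x has been visited exactly once */
    return executed;
}
```

Calling add_constant on a 24-byte vector returns 18 under this model: 3 setup instructions plus 5 per trip over 3 trips. Only 3 of those 18 are paddb, which is the overhead the question refers to; the remaining instructions load, store, advance the index and test the counter.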

Step by Step Solution

This problem goes to the heart of ISA design: how do we make operations more efficient? Here the paral...
