Question: Consider the following loop that adds a constant to a vector (we discussed this earlier). There's quite a lot of overhead associated with the solitary SIMD instruction. Suppose you were designing a new ISA that implemented operations like paddb. How would you make the code more efficient?
![Assembly listing of the MMX loop that adds a constant to a vector using paddb](https://dsd5zvtm8ll6.cloudfront.net/images/question_images/1705/7/3/4/43865ab7126a93911705734434459.jpg)
            movq    mm1, c          ;load constant into mm1 (8 copies)
            mov     cx, 3           ;set up loop counter for three trips (8 x 3 = 24)
            mov     esi, 0          ;set pointer to 0 (use as index into vector)
    Next:   movq    mm0, x[esi]     ;Repeat: load 8 bytes into mm0 using indexed addressing
            paddb   mm0, mm1        ;  now do 8 bytes of the vector addition
            movq    x[esi], mm0     ;  store 8 bytes of result in x
            add     esi, 8          ;  increment index by 8
            loop    Next            ;Until all done
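To make the listing easier to reason about, here is a minimal scalar sketch in C of what the loop computes. It is a reference model only, not the intended answer to the question. The function names `paddb_model` and `add_constant` are hypothetical, and the sketch assumes `c` is a single byte value replicated into all eight lanes, as the first comment in the listing suggests.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of paddb: add eight byte lanes independently,
   each lane wrapping modulo 256 with no carry into its neighbour. */
static void paddb_model(uint8_t dst[8], const uint8_t src[8]) {
    for (size_t lane = 0; lane < 8; ++lane) {
        dst[lane] = (uint8_t)(dst[lane] + src[lane]);
    }
}

/* Model of the whole loop: x holds 8 * trips bytes and c is added to
   every byte. Variable names mirror the registers in the listing.
   Assumption: trips == 0 means "do nothing" in this model. */
static void add_constant(uint8_t *x, size_t trips, uint8_t c) {
    uint8_t mm1[8];
    memset(mm1, c, sizeof mm1);           /* movq mm1, c (8 copies)  */
    for (size_t esi = 0; trips > 0; --trips, esi += 8) {
        uint8_t mm0[8];
        memcpy(mm0, x + esi, 8);          /* movq mm0, x[esi]        */
        paddb_model(mm0, mm1);            /* paddb mm0, mm1          */
        memcpy(x + esi, mm0, 8);          /* movq x[esi], mm0        */
    }                                     /* add esi, 8 / loop Next  */
}
```

Calling `add_constant(x, 3, c)` on a 24-byte array corresponds to the three trips in the listing. Each trip maps onto five instructions in the original loop, of which only `paddb` does arithmetic; the load, store, index update, and loop control are the overhead the question refers to.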
Step by Step Solution
This problem goes to the heart of ISA design: how do we make operations more efficient? Here the paral...
