Question: Now assume that we can use scatter-gather loads and stores (LVI and SVI). Assume that tiPL, tiPR, clL, clR, and clP are arranged consecutively in memory. For example, if seq_length==500, the tiPR array would begin 500 * 4 bytes after the tiPL array. How does this affect the way you can write the VMIPS code for this kernel? Assume that you can initialize vector registers with integers using the following technique, which would, for example, initialize vector register V1 with values (0,0,2000,2000):

LI R2,0
SW R2,vec
SW R2,vec+4
LI R2,2000
SW R2,vec+8
SW R2,vec+12
LV V1,vec

Assume the maximum vector length is 64. Is there any way performance can be improved using gather-scatter loads? If so, by how much?
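The index-vector technique in the question can be illustrated with a small Python simulation. This is only a sketch of the usual indexed-load semantics (element i is fetched from base + index[i]), not VMIPS code and not the exercise's solution. The base addresses 0x1000 and 0x8000, the dictionary used as memory, and the helper names store_words, lv, lvi, and demo are all hypothetical choices made for illustration.

```python
WORD_BYTES = 4  # 4-byte elements, matching the question's "500 * 4 bytes"


def store_words(memory, base, values):
    """Store consecutive words starting at byte address `base` (like repeated SW)."""
    for i, value in enumerate(values):
        memory[base + i * WORD_BYTES] = value


def lv(memory, base, length):
    """Unit-stride vector load (like LV): read `length` consecutive words."""
    return [memory[base + i * WORD_BYTES] for i in range(length)]


def lvi(memory, base, index_vector):
    """Indexed (gather) load (like LVI): element i comes from base + index_vector[i]."""
    return [memory[base + offset] for offset in index_vector]


def demo():
    seq_length = 500
    memory = {}  # hypothetical byte-addressed memory, one entry per word

    # Hypothetical base address; tiPR follows tiPL directly, as the question states.
    ti_pl_base = 0x1000
    ti_pr_base = ti_pl_base + seq_length * WORD_BYTES  # 2000 bytes later
    store_words(memory, ti_pl_base, [("tiPL", i) for i in range(seq_length)])
    store_words(memory, ti_pr_base, [("tiPR", i) for i in range(seq_length)])

    # Build the index vector in memory with scalar stores, then load it (SW ... LV V1,vec).
    vec_base = 0x8000  # hypothetical address of `vec`
    store_words(memory, vec_base, [0, 0, 2000, 2000])
    v1 = lv(memory, vec_base, 4)

    # A single gather now pulls elements from both arrays into one vector register.
    gathered = lvi(memory, ti_pl_base, v1)
    return v1, gathered


if __name__ == "__main__":
    index_vector, values = demo()
    print(index_vector)
    print(values)
```

Because the arrays are consecutive, byte offsets of 0 and 2000 from the tiPL base reach the first elements of tiPL and tiPR respectively, so one indexed load can fill a vector register with values drawn from more than one array.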
Step by Step Solution
In this case the 16 values could be loaded into each vector register pe...
