Qatar University
College of Engineering
Homework 3
CMPE 364 Microprocessor Based Design
Spring 2018
[Graded out of 30 points]

Submit your answers typed on Blackboard by the end of the day on Thursday [date illegible].

Consider the following C++ functions that process arrays of positive short (16-bit) integers. Zero is used as a sentinel value to indicate the last element of the array. Assume that you know beforehand that the size of your array is large and that the allocated size of the array is a multiple of 4 elements.

You will need to write assembly code and then optimize it to reduce the total number of execution cycles. Below, you are asked to provide the average number of cycles per array element. Since the size of the array is large, you can ignore the cycles outside the loop (i.e., ignore the initialization instructions and the instructions after exiting from the loop).

For each of the C++ functions given below, write ARM assembly subroutines using APCS as follows:

i. [3 points/routine] Straightforward literal conversion of the C++ code to un-optimized ARM assembly code. Provide the following:
   a. How many cycles does this subroutine take per array element?

ii. [6 points/routine] Optimize the code using pre-loading. Provide the following:
   a. Highlight the changes you made to optimize the code and explain your [...]
   b. How many cycles does this subroutine take per array element?
   c. What is the speedup achieved over the un-optimized code?
   d. Discuss the disadvantages of your optimization, if any.

iii. [6 points/routine] Further optimize the code using loop unrolling (process 4 elements at a time). Provide the following:
   a. Highlight the changes you made to optimize the code and explain your [...]
   b. How many cycles, on average, does this subroutine take per array element?
   c. What is the speedup achieved over the un-optimized code?
   d. Discuss the disadvantages of your optimization, if any.
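
To make the loop-unrolling idea in part iii concrete, here is a minimal C++ sketch. The assignment's actual functions are not reproduced in this text, so the summing routine and the names `sum_simple` and `sum_unrolled4` are hypothetical stand-ins. The sketch also assumes that the slots after the sentinel, up to the next multiple of 4, are initialized and readable. It is an illustration of the technique, not the expected answer, which must be written in ARM assembly.

```cpp
#include <cstdint>

// Hypothetical routine (not from the assignment): sums a zero-terminated
// array of positive 16-bit integers, one element per loop iteration.
std::uint32_t sum_simple(const std::uint16_t* a) {
    std::uint32_t sum = 0;
    while (*a != 0) {   // test the sentinel, then accumulate
        sum += *a;
        ++a;
    }
    return sum;
}

// Same result, handling 4 elements per loop iteration. Loading a[0..3]
// up front is safe only under the stated assumption that the allocated
// size is a multiple of 4 and the slots after the sentinel are readable.
std::uint32_t sum_unrolled4(const std::uint16_t* a) {
    std::uint32_t sum = 0;
    for (;;) {
        // All four loads are issued before any value is used.
        const std::uint16_t e0 = a[0], e1 = a[1], e2 = a[2], e3 = a[3];
        if (e0 == 0) break;
        sum += e0;
        if (e1 == 0) break;
        sum += e1;
        if (e2 == 0) break;
        sum += e2;
        if (e3 == 0) break;
        sum += e3;
        a += 4;         // one pointer update per 4 elements
    }
    return sum;
}
```

The unrolled version still tests every element against the sentinel, but it shares one pointer update and one backward jump among four elements. The cost is larger code and a dependence on the multiple-of-4 allocation guarantee.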