Question
I need solutions ASAP!
Q3. Consider an instruction pipeline with six stages and no branch prediction: S1, S2, S3, S4, S5 (the execution (ALU) stage), and S6. The stage delays for S1, S2, S3, S4, S5, and S6 are 12 nsec, 11 nsec, 16 nsec, 6 nsec, 7 nsec, and 15 nsec, respectively. There are intermediate storage buffers after each stage, and the delay of each buffer is 3 nsec. A program consisting of 24 instructions I_1, I_2, ..., I_24 is executed on the pipelined processor. Instruction I_8 is the only branch instruction, and its branch target is I_14. If the branch is taken during the execution of this program, what is the time needed to complete the program? (5M)

Q2. Consider the following sequence of instructions executed on a 3-stage pipeline architecture.

Instruction Seq. No.   Instruction
1.   and $t3, $t1, $t2
2.   lw  $t1, 0($t4)
3.   sub $t5, $t3, $t4
4.   add $t0, $t1, $t2
5.   add $t1, $t3, $t6
6.   sw  $t1, 4($t4)
7.   lw  $t2, 4($t4)
8.   or  $t3, $t5, $t6

a. Draw the pipeline implementation diagram for the execution of the above instructions with stall implementation only, and calculate the total number of clock cycles required to complete the execution of all the instructions. Also identify the type of hazard between any pair of instructions above by filling in a table with the following columns: Instruction Seq. No., Instruction Seq. No., Due to which Register, Type of hazard.
b. Draw the pipeline implementation diagram for the execution of the above instructions with forwarding and stall implementation, and calculate the total number of clock cycles required to complete the execution of all the instructions.
c. Draw the pipeline implementation diagram for the execution of the above instructions with code reordering and stall implementation, and calculate the total number of clock cycles required to complete the execution of all the instructions. (10M)

Q4. Consider a 32-bit processor with a 32 KB direct-mapped L1 cache that uses a block size of 16 words (1 W = 4 B). It has an L2 cache of 128 KB with 8-way associativity and a block size of 8 words. The system uses a byte-addressable 256 MB DRAM system. Upon running a program, 32 consecutive fixed-length instructions (each instruction is one word) starting at main memory address 0xFECA720 are executed. These instructions operate on an array A of 64 words, with starting address 0x09F7CF4. Assuming the caches are initially empty, indicate the non-empty sets (in decimal) in the L1 cache and the L2 cache after the execution of the program. (10M)

Step by Step Solution
There are 3 Steps involved in it
Step: 1
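For Q3, the arithmetic can be sketched as below. This is a rough sketch under assumptions the question does not state outright: the clock period is the slowest stage plus one buffer delay, and with no branch prediction the fetch of the target I_14 waits until I_8 has left S5, which costs 4 stall cycles. The function and constant names are made up for illustration.

```python
# Sketch for Q3: completion time of a pipelined program with one taken branch.
# Assumptions: the clock period is the slowest stage plus one buffer delay,
# and the branch target is fetched only after the branch finishes S5.

STAGE_DELAYS_NS = [12, 11, 16, 6, 7, 15]  # S1..S6, from the question
BUFFER_DELAY_NS = 3
TOTAL_INSTRUCTIONS = 24
BRANCH_INDEX = 8       # I_8 is the only branch
TARGET_INDEX = 14      # I_14 is the branch target
RESOLVE_STAGE = 5      # S5 is the execution (ALU) stage


def completion_time_ns():
    """Return (cycle_ns, executed, stalls, cycles, total_ns)."""
    cycle_ns = max(STAGE_DELAYS_NS) + BUFFER_DELAY_NS
    # I_1..I_8 run, I_9..I_13 are skipped, I_14..I_24 run.
    executed = BRANCH_INDEX + (TOTAL_INSTRUCTIONS - TARGET_INDEX + 1)
    # Stages S1..S4 behind the branch hold no useful work while it resolves.
    stalls = RESOLVE_STAGE - 1
    # First instruction needs all stages; each later one adds a cycle.
    cycles = len(STAGE_DELAYS_NS) + (executed - 1) + stalls
    return cycle_ns, executed, stalls, cycles, cycles * cycle_ns


if __name__ == "__main__":
    cycle_ns, executed, stalls, cycles, total_ns = completion_time_ns()
    print(f"cycle = {cycle_ns} ns, executed = {executed}, stalls = {stalls}")
    print(f"cycles = {cycles}, total = {total_ns} ns")
```

Under these assumptions the program takes 28 cycles of 19 ns each, that is 532 ns. A different convention for when the target fetch may begin would change the stall count and therefore the total.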
Step: 2
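For Q4, the set indices can be checked with a short script. This is only a sketch, and it assumes several things the question leaves open: both caches are unified (instructions and data share them), every word of array A is actually accessed, and a cache holds only the blocks that contain accessed bytes. The helper names are hypothetical.

```python
# Sketch for Q4: which cache sets end up non-empty.
# Assumptions: unified L1 and L2, every array word is accessed, and only
# blocks containing accessed bytes are brought into each cache.

WORD_BYTES = 4
INSTR_START, INSTR_BYTES = 0xFECA720, 32 * WORD_BYTES   # 32 instructions
ARRAY_START, ARRAY_BYTES = 0x09F7CF4, 64 * WORD_BYTES   # array A, 64 words


def num_sets(cache_bytes, block_bytes, ways):
    """Number of sets in a set-associative cache (ways=1 is direct mapped)."""
    return cache_bytes // (block_bytes * ways)


def touched_sets(start, n_bytes, block_bytes, n_sets):
    """Sorted set indices of all blocks overlapping [start, start + n_bytes)."""
    first_block = start // block_bytes
    last_block = (start + n_bytes - 1) // block_bytes
    return sorted({b % n_sets for b in range(first_block, last_block + 1)})


def non_empty_sets():
    l1_block, l2_block = 16 * WORD_BYTES, 8 * WORD_BYTES
    l1_sets = num_sets(32 * 1024, l1_block, ways=1)
    l2_sets = num_sets(128 * 1024, l2_block, ways=8)
    result = {}
    for name, block, sets in (("L1", l1_block, l1_sets), ("L2", l2_block, l2_sets)):
        instr = touched_sets(INSTR_START, INSTR_BYTES, block, sets)
        data = touched_sets(ARRAY_START, ARRAY_BYTES, block, sets)
        result[name] = sorted(set(instr) | set(data))
    return result


if __name__ == "__main__":
    for level, sets in non_empty_sets().items():
        print(level, sets)
```

Both caches come out with 512 sets. Under the stated assumptions, L1 sets 156 to 158 (instructions) and 499 to 503 (array A) are non-empty, and L2 sets 313 to 316 and 487 to 495 are non-empty. If L2 were instead filled with every 32-byte block covered by a fetched 64-byte L1 block, a few neighbouring L2 sets would also be occupied.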
Step: 3