CMP 334 Homework 10: TINY ISA CPI

For the implementation of the TINY instruction set architecture sketched in the slides for day 17 of class:

    ALU instructions require                              4 cycles to execute
    Load/store instructions require                       6 cycles to execute
    Ordinary unconditional branch instructions require    4 cycles to execute
    Subprogram call instructions require                  5 cycles to execute
    Conditional branch instructions (not taken) require   4 cycles to execute
    Conditional branch instructions (taken) require       5 cycles to execute

For an important suite of applications, half of the instructions are found to be ALU instructions, 20% loads, 10% stores, and 20% branches. Of the branches, one tenth are subprogram calls (unconditional branches that save the return address; rT is not SO), three tenths are ordinary unconditional branches, two tenths are conditional branch instructions where the branch does not end up being taken, and four tenths are conditional branches that are taken.

1) What is the average CPI for this instruction mix on this implementation of the TINY instruction set architecture?
   Hint: This is a weighted average problem.
   a) Before you do any arithmetic, show the equation(s) you will use to obtain a solution. (Briefly explain any variables you invent for this step.)
   b) Then substitute numeric values for the variables of the preceding step.
   c) Without using a calculator, estimate the numeric value of the formula from the preceding step. (This step is for our mutual edification. You will not be graded on it.)
   d) (If you are unsure of your answer from the preceding step) use a calculator to obtain an answer that is correct to three significant digits.

2) Suppose that, in a radically different implementation of the TINY ISA, every instruction requires six cycles (3 to fetch and 3 to execute), but the execution phase of one instruction can be overlapped with the fetch phase of the next instruction.
Since two instructions are being processed at once, the effective CPI for this implementation is 3 cycles per instruction. What is the speedup of this implementation over that of #1?

3) Suppose that the overlap of #2 were only 90% effective (1 in 10 instructions cannot be fetched until the preceding instruction completes execution). What would be the effective CPI for this implementation?
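
One way to check the arithmetic for all three questions is a short script such as the sketch below. It is not part of the assignment, and every name in it (CYCLES, MIX, average_cpi, overlapped_cpi) is hypothetical. It assumes the cycle counts pair with the instruction classes in the order they are listed above, that the speedup in #2 is a plain ratio of CPIs (same clock period and instruction count in both implementations), and that an instruction whose fetch cannot be overlapped in #3 costs the full six cycles.

```python
# Cycles per instruction class (assumed pairing of counts with classes).
CYCLES = {
    "alu": 4,
    "load_store": 6,
    "branch_unconditional": 4,
    "call": 5,
    "branch_not_taken": 4,
    "branch_taken": 5,
}

# Branches are 20% of all instructions; each branch kind is a share of that.
BRANCH_SHARE = 0.20

# Fraction of all executed instructions that falls in each class.
MIX = {
    "alu": 0.50,
    "load_store": 0.20 + 0.10,  # loads plus stores share one cycle count
    "call": BRANCH_SHARE * 0.1,
    "branch_unconditional": BRANCH_SHARE * 0.3,
    "branch_not_taken": BRANCH_SHARE * 0.2,
    "branch_taken": BRANCH_SHARE * 0.4,
}


def average_cpi(cycles, mix):
    """Weighted average: sum over classes of (fraction * cycles)."""
    return sum(mix[name] * cycles[name] for name in mix)


def overlapped_cpi(effective_fraction, overlapped=3, full=6):
    """CPI when only some instructions get their fetch overlapped.

    Assumption: an overlapped instruction costs `overlapped` cycles and
    any other instruction costs the `full` fetch-plus-execute time.
    """
    return effective_fraction * overlapped + (1 - effective_fraction) * full


if __name__ == "__main__":
    cpi_1 = average_cpi(CYCLES, MIX)
    cpi_2 = overlapped_cpi(1.0)
    cpi_3 = overlapped_cpi(0.9)
    print(f"#1 average CPI:      {cpi_1:.3g}")
    print(f"#2 speedup over #1:  {cpi_1 / cpi_2:.3g}")
    print(f"#3 effective CPI:    {cpi_3:.3g}")
```

Under these assumptions the script gives an average CPI of about 4.7 for #1, a speedup of about 1.57 for #2, and an effective CPI of about 3.3 for #3. A different pairing of cycle counts with instruction classes would change the first two results.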