Question
6. (20 pts) This problem examines the performance of moving from a single-core to a multi-core system. We begin by running on a single-core 2GHz CPU with three classes of instructions, with the CPI and instruction counts as shown:

Class     CPI   IC
Math      2     2.40E+09
L/S       10    1.40E+09
Branch    5     1.20E+08
Total           3.92E+09

As we move to multi-core, we discover that we don't realize the speedup you might expect, because there's a coordination tax for all but the branch instructions. In our case, our core efficiency is 70%. So for p processors, for our Math and Load/Store instructions, our per-processor instruction count is 1/(0.7*p) * the number of instructions the single-core system handled. But because multiple cores are running in parallel, the total execution time is that of any one of the processors. So by calculating the cycles for one processor, we can calculate the elapsed time. Do so in the following table.

a. (10 pts) [5] <COD 1.7> Find the total execution time for this program on 1, 2, 4, 8, 16, and 32 cores, and show the speedup relative to the single-processor result. Use the following table for your answers:

# of    Math      Math  L/S       L/S   # branch  Branch  per-p     exec time
cores   IC/core   CPI   IC/core   CPI   instr     CPI     cycles    (sec)      speedup
1       2.40E+09  2     1.40E+09  10    1.20E+08  5       1.94E+10  9.7
2                                 10
4                                 10
8                                 10
16                                10
32                                10

b. (5 pts) [10] Is it possible to reduce the CPI of your load/store commands on a single-core system to match the execution time in the 4-core configuration above? What formula would you use? What would the new CPI be?
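The arithmetic for parts (a) and (b) can be sketched in a few lines of Python. This is a minimal sketch, not the page's official solution. It rests on two assumptions that the problem text does not spell out: the per-processor branch instruction count stays at 1.20E+08 for every core count (the text only says branches are exempt from the coordination tax), and the 1/(0.7*p) scaling is not applied at p = 1, since the table's first row uses the unscaled counts. The function names are illustrative.

```python
CLOCK_HZ = 2e9      # 2 GHz clock, from the problem statement
EFFICIENCY = 0.7    # core efficiency, from the problem statement

# Single-core CPI and instruction count per class, from the problem's table.
MATH_CPI, MATH_IC = 2, 2.40e9
LS_CPI, LS_IC = 10, 1.40e9
BRANCH_CPI, BRANCH_IC = 5, 1.20e8


def per_core_cycles(p, ls_cpi=LS_CPI):
    """Cycles executed by one processor in a p-core system."""
    # ASSUMPTION: no scaling at p = 1 (matches the pre-filled table row).
    scale = 1.0 if p == 1 else 1.0 / (EFFICIENCY * p)
    # ASSUMPTION: the branch count per processor does not change with p.
    return (MATH_CPI * MATH_IC * scale
            + ls_cpi * LS_IC * scale
            + BRANCH_CPI * BRANCH_IC)


def exec_time(p, ls_cpi=LS_CPI):
    """Elapsed time in seconds: per-core cycles divided by clock rate."""
    return per_core_cycles(p, ls_cpi) / CLOCK_HZ


def ls_cpi_to_match(target_time):
    """Part (b): single-core L/S CPI that gives target_time.

    Solves  target_time * clock = MATH_CPI*MATH_IC + x*LS_IC
                                  + BRANCH_CPI*BRANCH_IC   for x.
    A non-positive result would mean the target is unreachable
    by changing the L/S CPI alone.
    """
    other_cycles = MATH_CPI * MATH_IC + BRANCH_CPI * BRANCH_IC
    return (target_time * CLOCK_HZ - other_cycles) / LS_IC


if __name__ == "__main__":
    base = exec_time(1)
    print("cores  cycles      time(s)  speedup")
    for p in (1, 2, 4, 8, 16, 32):
        t = exec_time(p)
        print(f"{p:<5d}  {per_core_cycles(p):.3e}  {t:7.3f}  {base / t:7.3f}")
    print(f"L/S CPI to match 4 cores: {ls_cpi_to_match(exec_time(4)):.3f}")
```

Under these assumptions, the single-core row reproduces the 1.94E+10 cycles and 9.7 s already given in the table, the 4-core time comes out near 3.66 s (a speedup of about 2.65), and the single-core L/S CPI needed to match it is about 1.37. If the branch count were instead divided across cores, the numbers would differ.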