Question
This is from the exercises of the textbook "Parallel Computer Organization and Design" by Michael Dubois, Murali Annavaram, and Per Stenstrom. Does anyone have the exercise solutions for this book? Thanks!
Two improvements are considered to a base machine with a load/store ISA and in which floating-point arithmetic instructions are implemented by software handlers. The first improvement is to add hardware floating-point arithmetic units to speed up floating-point arithmetic instructions. It is estimated that the time taken by each floating-point instruction can be reduced by a factor of 10 with the new hardware. The second improvement is to add more first-level data cache to speed up the execution of loads and stores. It is estimated that, with the same amount of additional on-chip cache real estate as for the floating-point units, loads and stores can be sped up by a factor of 2 over the base machine.
Let Ffp and Fls be the fraction of execution time spent in floating-point and load/store
instructions respectively. The executions of these two sets of instructions are nonoverlapping
in time.
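Because the two sets of instructions do not overlap in time, Amdahl's law can be applied to each improvement on its own. The following is a minimal sketch in Python, not the textbook's solution; the names amdahl_speedup, speedup_fp and speedup_ls are hypothetical helpers introduced here, and only the factors 10 and 2 come from the problem statement.

```python
FP_FACTOR = 10.0  # floating-point instructions become 10x faster (problem statement)
LS_FACTOR = 2.0   # loads and stores become 2x faster (problem statement)


def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the base execution time
    is accelerated by `factor` and the rest is unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def speedup_fp(f_fp: float) -> float:
    """Speedup from the floating-point units: 1 / (1 - 0.9 * Ffp)."""
    return amdahl_speedup(f_fp, FP_FACTOR)


def speedup_ls(f_ls: float) -> float:
    """Speedup from the larger data cache: 1 / (1 - 0.5 * Fls)."""
    return amdahl_speedup(f_ls, LS_FACTOR)
```

For example, speedup_fp(0.5) evaluates 1 / (0.5 + 0.05) and speedup_ls(0.5) evaluates 1 / (0.5 + 0.25).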
(a) Using Amdahl's speedup, what should the relation be between the fractions Ffp and
Fls such that the addition of the floating-point units is better than the addition of cache
space?
(b) Suppose that, instead of being given the values of fractions Ffp and Fls, you are given the
fraction of floating-point instructions and the fraction of loads and stores. You are also
given the average number of cycles taken by floating-point operations and loads/stores.
Can you still find out which improvement is better based on these numbers? Explain
why and how. Can you still estimate the maximum speedups for each improvement
using Amdahl's law? Why?
(c) What are fractions Ffp and Fls such that a speedup of 50% (or 1.5) is achieved for each
improvement deployed separately?
(d) It is decided to deploy the floating-point unit first and to add cache space later on.
In the original workload, fractions Ffp and Fls are 30% and 20%, respectively. What
is the maximum speedup obtained by upgrading to the floating-point units? Assuming
that this maximum speedup is achieved by the floating-point unit upgrade, what is
the maximum speedup of the cache upgrade with respect to the floating-point unit
upgrade?
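One possible way to check answers to parts (a) through (d) numerically is sketched below in Python. This is an attempt under stated assumptions, not the official solution: all function names are hypothetical, "maximum speedup" in part (d) is ambiguous, so the sketch lets the caller choose between the stated factors (10 and 2) and the Amdahl limit (an infinite factor), and part (b) is read as comparing cycles saved per instruction.

```python
import math

FP_FACTOR = 10.0  # from the problem statement
LS_FACTOR = 2.0   # from the problem statement


def amdahl_speedup(fraction, factor):
    """Amdahl's law for a fraction of base time accelerated by factor."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def fp_is_better(f_fp, f_ls):
    """(a) Compare the time saved by each upgrade.
    FP wins when 0.9 * Ffp > 0.5 * Fls, i.e. Ffp > (5/9) * Fls."""
    return f_fp * (1 - 1 / FP_FACTOR) > f_ls * (1 - 1 / LS_FACTOR)


def fp_is_better_from_counts(frac_fp, cpi_fp, frac_ls, cpi_ls):
    """(b) With instruction fractions and per-class cycle counts, the
    cycles saved per instruction can still be compared. The speedup
    itself is not computable without the overall CPI of the workload,
    which these four numbers alone do not determine."""
    saved_fp = frac_fp * cpi_fp * (1 - 1 / FP_FACTOR)
    saved_ls = frac_ls * cpi_ls * (1 - 1 / LS_FACTOR)
    return saved_fp > saved_ls


def fraction_for_speedup(target, factor):
    """(c) Invert Amdahl's law:
    1/target = 1 - F * (1 - 1/factor)  =>  F = (1 - 1/target) / (1 - 1/factor)."""
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / factor)


def cache_speedup_after_fp(f_fp, f_ls, fp_factor, ls_factor):
    """(d) Speedup of the cache upgrade measured against the machine that
    already has the FP units. Relies on FP and load/store time not
    overlapping, so load/store time is unchanged by the FP upgrade."""
    time_after_fp = (1.0 - f_fp) + f_fp / fp_factor  # base time normalized to 1
    f_ls_new = f_ls / time_after_fp                  # load/store share of the new time
    return amdahl_speedup(f_ls_new, ls_factor)


if __name__ == "__main__":
    print("(a) FP better for Ffp=0.3, Fls=0.2:", fp_is_better(0.3, 0.2))
    print("(c) Ffp for 1.5x:", fraction_for_speedup(1.5, FP_FACTOR))  # 10/27
    print("(c) Fls for 1.5x:", fraction_for_speedup(1.5, LS_FACTOR))  # 2/3
    print("(d) FP, factor 10:", amdahl_speedup(0.3, FP_FACTOR))       # 1/0.73
    print("(d) FP, limit:", amdahl_speedup(0.3, math.inf))            # 1/0.7
    print("(d) cache after FP limit, factor 2:",
          cache_speedup_after_fp(0.3, 0.2, math.inf, LS_FACTOR))
    print("(d) cache after FP limit, limit:",
          cache_speedup_after_fp(0.3, 0.2, math.inf, math.inf))
```

Under the limit reading of part (d), the floating-point upgrade leaves 70% of the base time, so loads and stores grow from 20% to 0.2/0.7 of the new execution time before the cache upgrade is applied.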