Chip multiprocessors (CMPs) have multiple cores and their caches on a single chip. CMP on-chip L2 cache
Question:
Assume the following hit latencies:
1. Which cache design is better for each of these benchmarks? Use data to support your conclusion.
2. Shared cache latency increases with the CMP size. Choose the best design if the shared cache latency doubles. Off-chip bandwidth becomes the bottleneck as the number of CMP cores increases. Choose the best design if off-chip memory latency doubles.
3. Discuss the pros and cons of shared vs. private L2 caches for single-threaded, multi-threaded, and multiprogrammed workloads, and reconsider them if on-chip L3 caches are present.
4. Assume both benchmarks have a base CPI of 1 (ideal L2 cache). If a non-blocking cache raises the average number of concurrent L2 misses from 1 to 2, how much performance improvement does this provide over a shared L2 cache? How much improvement can be achieved over a private L2 cache?
5. Assume new generations of processors double the number of cores every 18 months. To maintain the same level of per-core performance, how much more off-chip memory bandwidth is needed for a processor released in three years?
6. Consider the entire memory hierarchy. What kinds of optimizations can improve the number of concurrent misses?
Step by Step Answer:
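The arithmetic behind parts 4 and 5 can be sketched in a few lines of Python. This is a first-order model and not the textbook's worked solution. It assumes that off-chip bandwidth demand scales linearly with core count, and that overlapping misses divide the effective miss penalty evenly. The function names are hypothetical, and the miss rate and miss penalty in the usage lines are placeholders, because the exercise's latency and miss-rate tables are not reproduced here.

```python
def bandwidth_scale(months: float, doubling_period_months: float = 18.0) -> float:
    """Growth factor in core count after `months`, assuming one doubling per period.

    Assumption: per-core bandwidth demand is constant, so total off-chip
    bandwidth must grow by the same factor as the core count.
    """
    return 2.0 ** (months / doubling_period_months)


def cpi(base_cpi: float, misses_per_instr: float,
        miss_penalty_cycles: float, concurrent_misses: float = 1.0) -> float:
    """Base CPI plus L2 miss stall cycles per instruction.

    Assumption: with k misses in flight on average, each miss costs
    miss_penalty_cycles / k stall cycles (a simple overlap model).
    """
    return base_cpi + misses_per_instr * miss_penalty_cycles / concurrent_misses


def speedup_from_overlap(base_cpi: float, misses_per_instr: float,
                         miss_penalty_cycles: float,
                         before: float = 1.0, after: float = 2.0) -> float:
    """Speedup from raising the average number of concurrent misses."""
    old = cpi(base_cpi, misses_per_instr, miss_penalty_cycles, before)
    new = cpi(base_cpi, misses_per_instr, miss_penalty_cycles, after)
    return old / new


if __name__ == "__main__":
    # Part 5: three years is 36 months, i.e. two doubling periods.
    print(bandwidth_scale(36))  # 4.0

    # Part 4: placeholder inputs only; substitute each benchmark's miss rate
    # and the shared or private L2 miss penalty from the exercise's tables.
    print(speedup_from_overlap(base_cpi=1.0, misses_per_instr=0.01,
                               miss_penalty_cycles=100))
```

Under these assumptions, part 5 comes out to four times the off-chip bandwidth. For part 4, the same function is evaluated once with the shared-L2 parameters and once with the private-L2 parameters.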
Computer Organization and Design: The Hardware/Software Interface
ISBN: 978-0124077263
5th edition
Authors: David A. Patterson, John L. Hennessy