Exercise 7.9 Considering the CC-NUMA system described in Exercise 7.8, assume that the system has four nodes, each with a single-core CPU (each CPU has its own L1 data cache and L2 data cache). The L1 data cache is store-through, whereas the L2 data cache is write-back. Assume that the system has a workload where one CPU writes to an address, and the other CPUs all read the data that is written. Also assume that the address written to is initially only in memory and not in any local cache. Also, after the write, assume that the updated block is only present in the L1 and L2 caches of the core performing the write.

7.9.1 [10] <7.3> For a system that maintains coherency using cache-based block status, describe the internode traffic that will be generated as each of the four cores writes to a unique address, after which each address written to is read from by each of the remaining three cores.

7.9.2 [10] <7.3> For a directory-based coherency mechanism, describe the internode traffic generated when executing the same code pattern.

7.9.3 [20] <7.3> Repeat Exercises 7.9.1 and 7.9.2 assuming that each CPU is now a multicore CPU, with four cores per CPU, each maintaining an L1 data cache, but provided with a shared L2 data cache across the four cores. Each core will perform the write, followed by reads by each of the 15 other cores.

7.9.4 [10] <7.3> Consider the system described in Exercise 7.9.3, now assuming that each core writes to two different bytes stored in the same cache block. How does this impact bus traffic? Explain.
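
One way to reason about 7.9.1 and 7.9.2 is to count internode messages under a simplified model. The sketch below is only an illustration, not the textbook's answer, and it rests on assumptions that are not stated in the exercise: the cache-based scheme broadcasts every miss to all other nodes, the directory scheme sends point-to-point requests to a home node, the home node of each address is the node that writes it, and each read is answered by a single data reply. The function names snoop_messages and directory_messages are hypothetical.

```python
def snoop_messages(nodes: int) -> int:
    """Internode messages for the write-then-read pattern, broadcast scheme.

    Assumption: every miss is broadcast to all other nodes, and the block's
    memory is local to the writer, so the write needs no remote data reply.
    """
    per_address = 0
    # Write miss: the writer broadcasts an ownership request to the others.
    per_address += nodes - 1
    # Each of the remaining nodes then reads the block once.
    for _ in range(nodes - 1):
        per_address += nodes - 1  # read request broadcast to all other nodes
        per_address += 1          # one data reply from the node holding the block
    # Every node writes one unique address, so the pattern repeats per node.
    return per_address * nodes


def directory_messages(nodes: int) -> int:
    """Internode messages for the same pattern, directory scheme.

    Assumption: the home directory of each address lives on the writer's
    node, so the write miss is resolved locally with no internode traffic.
    """
    per_address = 0
    for _ in range(nodes - 1):
        per_address += 1  # read request sent point-to-point to the home node
        per_address += 1  # data reply from the home node (also the owner here)
    return per_address * nodes


if __name__ == "__main__":
    print("broadcast:", snoop_messages(4))
    print("directory:", directory_messages(4))
```

Under these assumptions the broadcast scheme generates far more internode messages than the directory scheme, because each miss is sent to every node rather than to one home node. Placing the home node elsewhere, or counting write-backs and acknowledgements, would change the totals.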
