(8 marks) Question 1: Nehalem is the microarchitecture of the first generation of Intel Core i7 processors. It consists of 2, 4, or 8 cores, each with 64 KB of L1 cache per core (a 32 KB L1 instruction cache and a 32 KB L1 data cache, both with 256-bit cache blocks), a unified 256 KB 8-way set associative L2 cache (with 64-byte blocks) per core, and a shared 8 MB 16-way set associative L3 cache (with 64-byte blocks). The 4-way associative L1 instruction cache has a latency of 3 CPU cycles, while the 8-way associative L1 data cache has a latency of 4 cycles. The L2 cache access time is 10 cycles, and the L3 access time is 40 cycles. The processor has a 36-bit address bus.

(a) Based on the information given above, complete the table below.

(b) The processor is a Core i7-930, which has a clock rate of 2.8 GHz and works with a memory module whose access time for a 64-byte block of data is 88.57 ns. For a particular program, the L1 D-Cache miss rate is 10%, the L2 miss rate is 7%, and the L3 miss rate is 4% (all are global miss rates). Find the average data memory access time (ADMAT) in ns.

(c) To save cost, either the L2 cache or the L3 cache is to be removed from the processor in (b). Based on ADMAT, explain whether L2 or L3 should be chosen for removal.

(a) Answer:

| Caches | L1 I-Cache (per core) | L1 D-Cache (per core) | L2 Cache (per core) | L3 Cache |
|---|---|---|---|---|
| Data Cache Capacity (in Bytes) | | | | |
| Cache Block Size (in Bytes) | | | | |
| Degree of Associativity | | | | |
| Depth | | | | |
| Tag Width (in bits) | | | | |
| Index Width (in bits) | | | | |
| Offset Width (in bits) | | | | |
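
The arithmetic behind parts (a) to (c) can be sketched in Python. This is one possible working, not an official answer key, and it rests on several assumptions that the question does not state: "Depth" is taken to mean the number of sets, addresses are byte addresses on the 36-bit bus, each cache latency is paid by every access that reaches that level, and the global miss rates of the remaining levels stay unchanged when a level is removed. The names `cache_geometry` and `admat_ns` are hypothetical helpers introduced only for this illustration.

```python
KB = 1024
MB = 1024 * KB
ADDRESS_BITS = 36            # from the question
CYCLE_NS = 1 / 2.8           # 2.8 GHz clock, so one cycle is about 0.357 ns
MEM_NS = 88.57               # memory access time for one 64-byte block

def cache_geometry(capacity_bytes, block_bytes, ways, address_bits=ADDRESS_BITS):
    """Return sets ("depth", by assumption) and tag/index/offset widths in bits."""
    sets = capacity_bytes // (block_bytes * ways)
    offset_bits = block_bytes.bit_length() - 1   # log2 of a power of two
    index_bits = sets.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return {"sets": sets, "tag": tag_bits, "index": index_bits, "offset": offset_bits}

# (capacity in bytes, block size in bytes, associativity); 256-bit blocks are 32 bytes
CACHES = {
    "L1 I-Cache": (32 * KB, 32, 4),
    "L1 D-Cache": (32 * KB, 32, 8),
    "L2 Cache": (256 * KB, 64, 8),
    "L3 Cache": (8 * MB, 64, 16),
}
GEOMETRY = {name: cache_geometry(*spec) for name, spec in CACHES.items()}

def admat_ns(cache_levels, memory_fraction):
    """cache_levels: (latency in cycles, fraction of all accesses reaching the level)."""
    cycles = sum(latency * fraction for latency, fraction in cache_levels)
    return cycles * CYCLE_NS + memory_fraction * MEM_NS

# (b) L1 D-Cache -> L2 -> L3 -> memory, using the global miss rates 10%, 7%, 4%
full = admat_ns([(4, 1.0), (10, 0.10), (40, 0.07)], 0.04)
# (c) without L2: L1 misses go straight to L3 (assumes L3 still misses 4% globally)
no_l2 = admat_ns([(4, 1.0), (40, 0.10)], 0.04)
# (c) without L3: L2 misses go straight to memory (assumes L2 still misses 7% globally)
no_l3 = admat_ns([(4, 1.0), (10, 0.10)], 0.07)

for name, geometry in GEOMETRY.items():
    print(name, geometry)
print(f"ADMAT full: {full:.2f} ns, without L2: {no_l2:.2f} ns, without L3: {no_l3:.2f} ns")
```

Under these assumptions the sketch gives an ADMAT of roughly 6.33 ns with all three levels, about 6.40 ns without L2, and about 7.99 ns without L3. Removing L2 therefore costs far less than removing L3, which suggests L2 as the candidate for removal. A different reading of "Depth" or of how miss rates change after removal would alter the numbers.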