Question:

Chapter 1:

1. In Example 1 of Section 1.2.1, we assumed that the cache miss penalty was 20 cycles. With modern processors running at a frequency of 1 to 3 GHz, the cache miss penalty can reach several hundred cycles (we will see how this can be somewhat mitigated by a cache hierarchy). Keeping all other parameters the same as in the example, plot CPI versus cache miss penalty as the latter varies between 20 and 500 cycles (choose appropriate intervals). Do your computations argue for the threat of a memory wall, whereby loading instructions and data could potentially dominate the execution time?
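The sweep asked for here can be sketched in a few lines of Python. The parameters of Example 1 are not reproduced in this question, so `BASE_CPI` and `MISSES_PER_INSTRUCTION` below are hypothetical placeholders to be replaced with the values from Section 1.2.1. The sketch also assumes the usual additive model, in which each miss stalls the processor for the full penalty.

```python
# HYPOTHETICAL placeholders: replace with the values from Example 1, Section 1.2.1.
BASE_CPI = 1.0                 # CPI with a perfect cache (assumed)
MISSES_PER_INSTRUCTION = 0.05  # cache misses per instruction (assumed)


def cpi_with_misses(base_cpi, misses_per_instruction, miss_penalty):
    """Additive stall model: CPI = base CPI + misses/instruction * penalty."""
    return base_cpi + misses_per_instruction * miss_penalty


def sweep(penalties, base_cpi=BASE_CPI, mpi=MISSES_PER_INSTRUCTION):
    """Return (penalty, CPI, fraction of cycles spent stalled) for each penalty."""
    rows = []
    for penalty in penalties:
        cpi = cpi_with_misses(base_cpi, mpi, penalty)
        stall_fraction = (cpi - base_cpi) / cpi
        rows.append((penalty, cpi, stall_fraction))
    return rows


# Tabulate from 20 to 500 cycles in steps of 40; feed the rows to any plotting tool.
for penalty, cpi, stall_fraction in sweep(range(20, 501, 40)):
    print(f"{penalty:4d} cycles  CPI = {cpi:7.3f}  stalled = {stall_fraction:6.1%}")
```

The stalled fraction is the quantity that speaks to the memory-wall question: in this model it approaches 1 as the penalty grows, whatever the placeholder values are.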

Chapter 2:

Assume that you want to use only one ALU for the basic five-stage pipeline of Section 2.1.2. A problem arises when you implement branches in that the ALU needs to be used twice: once for branch target computation and once for making the register comparison. Design the data path and control unit with this constraint. What are the performance implications?

Assume that it takes 1 cycle to access and return the information on a data-TLB hit and 1 cycle to access and return the information on a data-cache hit. A data-TLB miss takes 300 cycles to resolve, and a data-cache miss takes 100 cycles to resolve. The data-TLB hit rate is 0.99, and the data-cache hit rate is 0.95. What is the average memory access time for a load data reference? Now the data cache is an L1 D-cache and is backed up by an L2 unified cache. Every data reference that misses in L1 has a 60% chance of hitting in L2. A miss in L1 followed by a hit in L2 has a latency of 10 cycles. What is the average memory access time for a load data reference in this new configuration?
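One way to set up the arithmetic is sketched below. The problem statement leaves some modeling choices open, so the sketch makes three assumptions explicit: the TLB lookup and the cache access happen one after the other, each quoted miss figure is the total latency of that case (not an extra cost on top of the hit time), and an access that misses in both L1 and L2 still costs the original 100 cycles. A different reading changes the numbers slightly.

```python
def expected_latency(hit_rate, hit_cycles, miss_cycles):
    """Expected cycles for one lookup, weighting hit and miss cases by probability."""
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


# Data-TLB: 1 cycle on a hit, 300 cycles on a miss, hit rate 0.99.
tlb_cycles = expected_latency(0.99, 1, 300)

# Configuration 1: single data cache, 1 cycle on a hit, 100 cycles on a miss.
# ASSUMPTION: TLB and cache are accessed serially, so their times add.
amat_single_level = tlb_cycles + expected_latency(0.95, 1, 100)

# Configuration 2: an L1 miss hits in L2 60% of the time (10 cycles total).
# ASSUMPTION: the remaining 40% go to memory and cost the original 100 cycles.
l1_miss_cycles = expected_latency(0.60, 10, 100)
amat_two_level = tlb_cycles + expected_latency(0.95, 1, l1_miss_cycles)

print(f"TLB contribution:      {tlb_cycles:.2f} cycles")
print(f"AMAT, single cache:    {amat_single_level:.2f} cycles")
print(f"AMAT, L1 backed by L2: {amat_two_level:.2f} cycles")
```

Under these assumptions the three printed values are 3.99, 9.94, and 7.24 cycles.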

As a designer you are asked to evaluate three possible options for an on-chip write-through data cache. Some of the design options (associativity) and performance consequences (miss rate, miss penalty) are described in the table below:

Data cache options                   Miss rate   Miss penalty
Cache A: Direct mapped               0.08        4 cycles
Cache B: Two-way set-associative     0.04        6 cycles
Cache C: Four-way set-associative    0.02        8 cycles

(a) Assume that load instructions have a CPI of 1.2 if they hit in the cache and a CPI of 1.2 + (miss penalty) otherwise. All other instructions have a CPI of 1.2. The instruction mix is such that 20% of instructions are loads. What is the CPI for each configuration (you'll need to keep three decimal digits, i.e., compute the CPI as x.xxx)? Which cache would you choose if CPI is the determining factor?

(b) Assume now that if the direct-mapped cache is used, the cycle time is 20 ns. If the two-way set-associative cache is used, the cycle time is 22 ns. If the four-way is used, the cycle time is 24 ns. What is the average time per instruction? Which cache would you choose if average time per instruction is the determining factor?
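Parts (a) and (b) reduce to two short formulas, sketched here in Python. The sketch takes the statement literally: only loads that miss pay the penalty, so the average CPI is 1.2 plus the load fraction times the miss rate times the miss penalty, and the average time per instruction is that CPI times the cycle time. The names `CACHES`, `cpi`, and `time_per_instruction_ns` are illustrative.

```python
BASE_CPI = 1.2        # CPI of every instruction when loads hit
LOAD_FRACTION = 0.20  # 20% of instructions are loads

# name: (miss rate, miss penalty in cycles, cycle time in ns), from the table and part (b)
CACHES = {
    "A (direct mapped)": (0.08, 4, 20),
    "B (two-way)":       (0.04, 6, 22),
    "C (four-way)":      (0.02, 8, 24),
}


def cpi(miss_rate, miss_penalty):
    """Average CPI: only the loads that miss add the miss penalty."""
    return BASE_CPI + LOAD_FRACTION * miss_rate * miss_penalty


def time_per_instruction_ns(miss_rate, miss_penalty, cycle_ns):
    """Average time per instruction = CPI * cycle time."""
    return cpi(miss_rate, miss_penalty) * cycle_ns


for name, (miss_rate, penalty, cycle_ns) in CACHES.items():
    print(f"Cache {name:18s} CPI = {cpi(miss_rate, penalty):.3f}  "
          f"time = {time_per_instruction_ns(miss_rate, penalty, cycle_ns):.3f} ns")
```

With this reading the CPIs are 1.264, 1.248, and 1.232 for caches A, B, and C, and the times per instruction are 25.280 ns, 27.456 ns, and 29.568 ns. The four-way cache therefore has the lowest CPI, while the direct-mapped cache has the lowest time per instruction.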

(c) In the case of the two-way set-associative cache, the replacement algorithm is LRU. In the case of the four-way set-associative cache, the replacement algorithm is such that the most recently used (MRU) line is not replaced; the choice of which of the other three is replaced is random and not part of the logic associated with each line in the cache. Indicate what bits are needed to implement the replacement algorithms for each line.
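One possible encoding for part (c), offered as a sketch and not as the textbook's answer, is a single MRU bit per line. Exactly one line in a set has its bit set at any time. With two ways, the line whose bit is clear is by definition the least recently used one, so the same bit implements LRU. With four ways, the victim is drawn at random from the three lines whose bit is clear, and that random choice needs no per-line state. The class name `NotMRUSet` and its methods are hypothetical.

```python
import random


class NotMRUSet:
    """One cache set with a single MRU bit per line (illustrative model)."""

    def __init__(self, ways, rng=None):
        # Arbitrary initial state: way 0 is marked as most recently used.
        self.mru_bits = [way == 0 for way in range(ways)]
        self.rng = rng or random.Random()

    def touch(self, way):
        """On a hit or a fill, set this line's bit and clear all the others."""
        self.mru_bits = [i == way for i in range(len(self.mru_bits))]

    def victim(self):
        """Pick a line to replace among those whose MRU bit is clear."""
        candidates = [i for i, bit in enumerate(self.mru_bits) if not bit]
        return self.rng.choice(candidates)


# Two-way set: only one candidate exists, so the policy is exactly LRU.
two_way = NotMRUSet(ways=2)
two_way.touch(1)
print("two-way victim:", two_way.victim())

# Four-way set: the MRU line is never chosen; the pick among the rest is random.
four_way = NotMRUSet(ways=4, rng=random.Random(0))
four_way.touch(2)
print("four-way victim:", four_way.victim())
```

In hardware the two-way case could equally keep one bit per set, since the two per-line bits are always complements of each other.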
