Question
A short program loop goes through a 16 kB array one word at a time, reads a number from the array, adds a random number, and stores the result in the corresponding entry in another array that is located in memory immediately following the first array. An outer loop repeats the above operation 100 times.

The 64-bit processor, operating at a clock frequency of 4 GHz, is pipelined, has 48 address lines, and has three levels of caches with a 64 B block size. Each of the L1 caches has 512 sets, 2-way set-associativity, and an alternate cache replacement policy. The L2 cache is a 4-way set-associative 512 kB structure, whereas the L3 cache is a 4 MB, 8-way set-associative structure; the L2 and L3 caches employ a pseudo-LRU cache replacement policy. Write-back and write-allocate strategies are used in the L2 and L3 caches, but simpler write hit and miss policies are used with the L1 cache. The virtual address contains 52 bits plus 12 bits for security and PID/ASN, the page size is 64 kB, and each of the page table caches contains 40 entries. Miss penalties for the L1, L2 and L3 caches are 10, 20 and 50 clock cycles (cc), respectively.

a. What is the size (in bytes) of each TLB?

b. Compute the numbers of index, tag and block offset bits in each cache.

c. Write the MIPS-64 assembly code to implement the problem described in the first sentence of this question. Assume that R30 comes up with a random number every time it is read. Also, assume that register R1 holds the address of the first byte of the source array.

d. Explain the steps required for the processor to fetch and execute the first load instruction that reads the first element of the source array in the very first iteration of this program (note: this is not necessarily the first instruction in the program). Remember that some of the required information may not be available, so misses might result. Also, remember that the size of the displacement field in the instruction is limited. Assume that main memory always contains the needed information, whether or not any level of cache also has it.

e. Calculate the number of accesses to the TLBs, every cache, and main memory when this program executes. Calculate the number of misses in each of the storage structures when this program is executed.

f. Calculate the time taken to execute this program in milliseconds.

g. Every processor, regardless of its word size, has a byte-addressable memory. Why?

h. BONUS: How will the miss rates change if the size of each array is 128 kB?
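
As a starting point for part b, the address field widths follow directly from the cache parameters given above. The following is a minimal sketch, not a full solution. It assumes that the caches are indexed and tagged with the 48-bit physical address (from the "48 address lines"), that kB and MB mean 1024 B and 1024 kB, and that every cache size is a power of two. The helper names (cache_fields, sets_from_capacity) are hypothetical and chosen only for illustration.

```python
# Sketch for part b: split a physical address into tag / index / block offset.
# Assumptions (not stated as such in the question): caches are physically
# addressed with 48 bits, and kB/MB are binary units.

PHYSICAL_ADDRESS_BITS = 48   # "48 address lines"
BLOCK_SIZE_BYTES = 64        # block size shared by all three cache levels
KB = 1024
MB = 1024 * KB


def log2_exact(n):
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1) != 0:
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def sets_from_capacity(capacity_bytes, ways, block_size=BLOCK_SIZE_BYTES):
    """Number of sets = capacity / (associativity * block size)."""
    return capacity_bytes // (ways * block_size)


def cache_fields(num_sets, block_size=BLOCK_SIZE_BYTES,
                 address_bits=PHYSICAL_ADDRESS_BITS):
    """Return the tag, index and block offset widths in bits."""
    offset_bits = log2_exact(block_size)
    index_bits = log2_exact(num_sets)
    tag_bits = address_bits - index_bits - offset_bits
    return {"tag": tag_bits, "index": index_bits, "offset": offset_bits}


# L1: the number of sets (512) is given directly.
l1 = cache_fields(512)
# L2: 512 kB, 4-way set-associative.
l2 = cache_fields(sets_from_capacity(512 * KB, 4))
# L3: 4 MB, 8-way set-associative.
l3 = cache_fields(sets_from_capacity(4 * MB, 8))

for name, fields in (("L1", l1), ("L2", l2), ("L3", l3)):
    print(name, fields)
```

Under these assumptions the block offset is 6 bits at every level, and the tag shrinks as the number of sets grows. If the L1 caches are instead virtually indexed or tagged, the address width passed to cache_fields would need to change accordingly.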