Question
2.39 [10] Repeat Exercise 2.38, but this time use LDXR/STXR to perform an atomic update of the shvar variable directly, without using lock() and unlock(). Note that in this exercise there is no variable lk.
2.40 [5] Using your code from Exercise 2.38 as an example, explain what happens when two processors begin to execute this critical section at the same time, assuming that each processor executes exactly one instruction per cycle.
2.41 Assume for a given processor the CPI of arithmetic instructions is 1, the CPI of load/store instructions is 10, and the CPI of branch instructions is 3. Assume a program has the following instruction breakdowns: 500 million arithmetic instructions, 300 million load/store instructions, 100 million branch instructions.
2.41.1 [5] Suppose that new, more powerful arithmetic instructions are added to the instruction set. On average, through the use of these more powerful arithmetic instructions, we can reduce the number of arithmetic instructions needed to execute a program by 25%, while increasing the clock cycle time by only 10%. Is this a good design choice? Why?
2.41.2 [5] Suppose that we find a way to double the performance of arithmetic instructions. What is the overall speedup of our machine? What if we find a way to improve the performance of arithmetic instructions by 10 times?
2.42 Assume that for a given program 70% of the executed instructions are arithmetic, 10% are load/store, and 20% are branch.
2.42.1 [5] Given this instruction mix and the assumption that an arithmetic instruction requires two cycles, a load/store instruction takes six cycles, and a branch instruction takes three cycles, find the average CPI.
2.42.2 [5] For a 25% improvement in performance, how many cycles, on average, may an arithmetic instruction take if load/store and branch instructions are not improved at all?
2.42.3 [5] For a 50% improvement in performance, how many cycles, on average, may an arithmetic instruction take if load/store and branch instructions are not improved at all?
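As a starting point for Exercises 2.41 and 2.42, the baseline figures can be worked out with a short Python sketch. It assumes the usual model in which execution time is cycles multiplied by clock cycle time; the function and variable names are illustrative and do not come from the exercises.

```python
# Minimal sketch of the baseline arithmetic for Exercises 2.41 and 2.42.
# Assumption: execution time = total cycles * clock cycle time.
# All names below are illustrative, not taken from the textbook.

def total_cycles(counts, cpi):
    """Sum of (instruction count * cycles per instruction) over all classes."""
    return sum(counts[k] * cpi[k] for k in counts)

def average_cpi(mix, cpi):
    """Weighted average CPI, where mix holds the fraction of each class."""
    return sum(mix[k] * cpi[k] for k in mix)

# Exercise 2.41: instruction counts and per-class CPI as stated.
counts_241 = {"arith": 500_000_000, "ldst": 300_000_000, "branch": 100_000_000}
cpi_241 = {"arith": 1, "ldst": 10, "branch": 3}
baseline_cycles = total_cycles(counts_241, cpi_241)  # 3,800 million cycles

# Exercise 2.41.1: 25% fewer arithmetic instructions, 10% longer cycle time.
counts_new = dict(counts_241, arith=375_000_000)
relative_time = total_cycles(counts_new, cpi_241) * 1.10 / baseline_cycles

# Exercise 2.42.1: instruction mix and per-class cycle counts as stated.
mix_242 = {"arith": 0.70, "ldst": 0.10, "branch": 0.20}
cpi_242 = {"arith": 2, "ldst": 6, "branch": 3}
baseline_cpi = average_cpi(mix_242, cpi_242)

print(baseline_cycles, relative_time, baseline_cpi)
```

Under these assumptions the Exercise 2.41 baseline is 3,800 million cycles, and the modified design of 2.41.1 takes slightly longer than the baseline (a relative time above 1), because the load/store cycles dominate and are unchanged while every cycle becomes 10% slower. The average CPI for 2.42.1 comes out at about 2.6.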
Most of these questions build on Exercise 2.38, which reads:
2.38 [10] Write the LEGv8 assembly code to implement the following C code: lock(lk); shvar=max(shvar,x); unlock(lk); Assume that the address of the lk variable is in X0, the address of the shvar variable is in X1, and the value of variable x is in X2. Your critical section should not contain any function calls. Use LDXR/STXR instructions to implement the lock() operation, the unlock() operation is simply an ordinary store instruction.
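The retry behaviour that Exercises 2.39 and 2.40 ask about can be illustrated with a toy Python simulation. This is not LEGv8 code and not a model of real hardware: the ExclusiveMemory class, its method names, and the CPU labels are hypothetical, and the only property borrowed from LDXR/STXR is that a store-exclusive returns 0 on success and nonzero when the reservation has been lost.

```python
# Toy simulation of a load-exclusive / store-exclusive retry loop.
# Assumption: one memory word, one reservation per CPU; real exclusive
# monitors are more involved. All names here are hypothetical.

class ExclusiveMemory:
    def __init__(self, value):
        self.value = value
        self.reserved_by = set()

    def load_exclusive(self, cpu):
        # LDXR analogue: read the word and record a reservation for this CPU.
        self.reserved_by.add(cpu)
        return self.value

    def store_exclusive(self, cpu, new_value):
        # STXR analogue: return 0 on success, 1 if the reservation was lost.
        if cpu not in self.reserved_by:
            return 1
        self.value = new_value
        self.reserved_by.clear()  # a successful store cancels all reservations
        return 0

def atomic_max(mem, cpu, x):
    # Retry until the store-exclusive succeeds, as an LDXR/STXR loop would.
    while True:
        old = mem.load_exclusive(cpu)
        if mem.store_exclusive(cpu, max(old, x)) == 0:
            return

# Two CPUs enter at the same time: both load before either one stores.
shvar = ExclusiveMemory(5)
a = shvar.load_exclusive("cpu0")
b = shvar.load_exclusive("cpu1")
s0 = shvar.store_exclusive("cpu0", max(a, 7))  # succeeds, clears cpu1's reservation
s1 = shvar.store_exclusive("cpu1", max(b, 9))  # fails, so cpu1 must retry
atomic_max(shvar, "cpu1", 9)                   # the retry reloads 7 and stores 9
```

In this interleaving the first store wins, the second store reports failure, and the losing CPU repeats its load and store against the updated value, so neither update is lost.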