Questions and Answers of Computer Organization Design

If on average we need to access memory once every 100 cycles, what is the impact on our application? On a CC-NUMA system, the cost of accessing non-local memory can limit our ability to utilize
Loop unrolling was described in Chapter 4. Apply loop unrolling to this loop and then consider running this code on a 2-node distributed memory message passing system. Assume that we are going to use
First, write down a list of your daily activities that you typically do on a weekday. For instance, you might get out of bed, take a shower, get dressed, eat breakfast, dry your hair, brush your
Find all hazards in this instruction sequence for a 5-stage pipeline with and then without forwarding. Problems in this exercise refer to the following instruction sequences: a. b. ADD R1, R2,
Control hazards can be eliminated by adding branch delay slots. How many delay slots must follow each branch if we want to eliminate all control hazards in this processor? This exercise is intended to
What is the total latency of an LW instruction in a pipelined and non-pipelined processor? In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this
This exercise is intended to help you understand the cost/complexity/performance trade-offs of forwarding in a pipelined processor. Problems in this exercise refer to pipelined datapaths from Figure
This exercise explores some of the tradeoffs involved in pipelining, such as clock cycle time and utilization of hardware resources. The first three problems in this exercise refer to the following
If the loop exits after executing only two iterations, draw a pipeline diagram for your MIPS code from 4.28.1 executed on a 2-issue processor shown in Figure 4.69. Assume the processor has perfect
In this exercise we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor fetches the following
Repeat 4.27.1, but this time assume that the instruction in the delay slot also causes a hardware error exception when it is in MEM stage. Exercise 4.27.1: Assume that this branch is correctly predicted
The first three problems in this exercise refer to the execution of the following instruction in the pipelined datapath from Figure 4.51, and assume the following clock cycle time, ALU latency, and
Consider a datapath similar to the one in Figure 4.11, but for a processor that only has one type of instruction: unconditional PC-relative branch. What would the cycle time be for this
This exercise is intended to help you better understand the last pitfall from failure to consider pipelining in instruction set design. The first four problems in this exercise refer to the following
This exercise explores how exception handling affects control unit design and processor clock cycle time. The first three problems in this exercise refer to the following MIPS instruction that
Different instructions utilize different hardware blocks in the basic single-cycle implementation. The next three problems in this exercise refer to the following instruction: Which resources (blocks)
What are the values of control signals generated by the control in Figure 4.2 for this instruction? [Figure 4.2] Different instructions utilize different hardware blocks in the basic single-cycle
Assuming there are no stalls and that 60% of all conditional branches are taken, in what percentage of clock cycles does the branch adder in the EX stage generate a value that is actually
This exercise is designed to help you understand the discussion of the “Pipelining is easy” fallacy. The first four problems in this exercise refer to the following MIPS instruction: Describe a pipelined
The first three problems in this exercise refer to the following MIPS instruction: As this instruction executes, what is kept in each register located between two pipeline stages? a. b. SW R16,-100
Assuming there are no stalls, how often (percentage of all cycles) do we actually need to use all three register ports (two reads and a write) in the same cycle? Problems in this exercise assume that
What is the speedup that would be achieved by using four branch delay slots to reduce control hazards in this processor? Assume that there are no data dependences between instructions and that all
Describe the requirements of forwarding and hazard detection units for your datapath from 4.34.1. Exercise 4.34.1: Describe a pipelined datapath needed to support only this instruction. Your datapath
If there is a separate handler address for each exception, show how the pipeline organization must be changed to be able to handle this exception. You can assume that the addresses of these handlers
Which registers need to be read, and which registers are actually read? The first three problems in this exercise refer to the following MIPS instruction: a. b. SW R16,-100 (R6) OR R2, R1,
In a 4-issue processor with these pipeline parameters, how many branch instructions can be expected to be “in progress” (already fetched but not yet committed) at any given time? The remaining
What do you get if you take the square root of B and then multiply that value by itself? What should you get? Do for both single and double precision floating point numbers. (Write a program to do
Write down the binary representation of the decimal number assuming it was stored using the single precision IBM format (base 16, instead of base 2, with 7 bits of exponent). The following table shows
Problems in this exercise refer to the following logic block: Does this block contain logic only, flip-flops only, or both? Logic Block a. Small Multiplexor (Mux) with four 8-bit data inputs b. Small
For this problem, assume that all branches are perfectly predicted (this eliminates all control hazards) and that no delay slots are used. If we only have one memory (for both instructions and data),
Write an MIPS assembly language program to calculate the product of the signed integers A and B. State if you are using the approach given in 3.4.4 or 3.4.5. Problem 3.4.4: When multiplying signed
Assume A and B are unsigned 8-bit integers. Calculate A + B using saturating arithmetic. The result should be written in decimal. Show your work. The following table also shows pairs of decimal
Write an MIPS assembly language program to calculate the sum of A and B, assuming they are stored in the modified 16-bit NVIDIA format described in 3.11.2. Assume 1 guard, 1 round bit, and 1 sticky
Describe in detail one technique for performing floating point division in a digital computer. Be sure to include references to the sources you used. The following table shows further pairs of decimal
Write an MIPS assembly language program to perform the multiplication of A and B using Booth’s algorithm. The following table shows further pairs of hexadecimal numbers. a. b. A F6 08 B 7F 55
Using a table similar to that shown in Figure 3.11, calculate A divided by B using the hardware described in Figure 3.12. You should show the contents of each register on each step. Assume A and B are
Assume A and B are signed 8-bit decimal integers stored in two’s complement format. Calculate A – B using saturating arithmetic. The result should be written in decimal. Show your work. The
In the following exercise, the data table contains various MIPS logical operations. You will be asked to find the result of these operations given values for registers $t0 and $t1. Assume that $t0 =
How does the performance of non-restoring and nonperforming division compare? Demonstrate by showing the number of steps necessary to calculate A divided by B using each method. Assume A and B are
Based on your answers to 3.13.4 and 3.13.5, does (A × B) × C = A × (B × C)? Problem 3.13.4: Calculate (A × B) × C by hand, assuming A, B, and C are stored in the modified 16-bit NVIDIA format
Write an MIPS assembly language program to calculate A divided by B, using the approach described in Figure 3.12. Assume A and B are signed integers. [Figure 3.12] The following table shows further pairs
Translate this instruction into MIPS micro-operations. This exercise is intended to help you better understand the last pitfall from failure to consider pipelining in instruction set design. The first
If there is no forwarding or hazard detection, insert NOPs to ensure correct execution. This exercise is intended to help you understand the relationship between forwarding, hazard detection, and ISA
What is the value of the instruction word? In this exercise we examine the operation of the single-cycle datapath for a particular instruction. Problems in this exercise refer to the following MIPS
If many (e.g., 1,000,000) iterations of this loop are executed, determine the fraction of all register reads that are useful in a 2-issue static superscalar processor. In this exercise, we consider
To avoid lengthening the critical path of the datapath shown in Figure 4.24, how much time can the control unit take to generate the MemWrite signal? In this exercise we examine how the clock cycle
Draw the pipeline execution diagram for this code, assuming there are no delay slots and that branches execute in the EX stage. This exercise is intended to help you understand the relationship
Implement the logic for the Control signal 1. Your circuit should directly implement the given expression (do not reorganize the expression to “optimize” it), using NOT gates and 2-input AND, OR,
What is the CPI achieved by a 2-issue static superscalar processor on this program? In this exercise, we make several assumptions. First, we assume that an N-issue superscalar processor can execute
Stall cycles due to mispredicted branches increase the CPI. What is the extra CPI due to mispredicted branches with the always-taken predictor? Assume that branch outcomes are determined in the EX
What CPI would be achieved if the MIPS version of this loop is executed on a 1-issue processor with static scheduling and a 5-stage pipeline? Problems in this exercise refer to the following loop,
Design a circuit with 1-bit data inputs and a 1-bit data output that accomplishes this operation serially, starting with the least-significant bit. In a serial implementation, the circuit is
What is the accuracy of always-taken and always-not-taken predictors for this sequence of branch outcomes? This exercise examines the accuracy of various branch predictors for the following repeating
What must be changed in the pipelined datapath to add this instruction to the MIPS ISA? In this exercise, we examine how the ISA affects pipeline design. Problems in this exercise refer to the
How many instructions are expected to be executed between the time one branch misprediction is detected and the time the next branch misprediction is detected? Problems in this exercise assume that
How many register read ports should the processor have to avoid any resource hazards due to register reads? This exercise explores how branch prediction affects performance of a deeply pipelined
Convert A into a binary number. What makes base 8 (octal) an attractive numbering system for representing values in computers? The following table also shows pairs of octal numbers.
Convert A into a binary number. What makes base 16 (hexadecimal) an attractive numbering system for representing values in computers? The following table also shows pairs of hexadecimal numbers.
Write an MIPS assembly language program to calculate the sum of A and B, assuming they are stored using the format described in 3.11.1. Now modify the program to calculate the sum assuming the format
For each stage of the pipeline, determine the values of exception-related control signals from Figure 4.66 as this instruction passes through that pipeline stage. This exercise explores how exception
If the only thing we need to do in a processor is fetch consecutive instructions, what would the cycle time be? Problems in this exercise assume that logic blocks needed to implement a processor’s
For each stage of the pipeline, what are the values of the control signals asserted by this instruction in that pipeline stage? The first three problems in this exercise refer to the execution of the
What are the outputs of the sign-extend and the jump “Shift left 2” unit (near the top of Figure 4.24) for this instruction word? In this exercise we examine in detail how an instruction is
Assume that this branch is correctly predicted as taken, but then the instruction at “Label” is an undefined instruction. Describe what is done in each pipeline stage for each cycle, starting
If we use no forwarding, what fraction of cycles are we stalling due to data hazards? This exercise is intended to help you understand the cost/complexity/performance trade-offs of forwarding in a
Translate this C code into MIPS instructions. Your translation should be direct, without rearranging instructions to achieve better performance. In this exercise we compare the performance of 1-issue
Which parts of the basic single-cycle datapath are used by all of these instructions? Which parts are the least utilized? This exercise explores some of the tradeoffs involved in pipelining, such as
What is the clock cycle time if the only types of instructions we need to support are ALU instructions (ADD, AND, etc.)? In this exercise we examine how latencies of individual components of the
What is the clock cycle time in a pipelined and non-pipelined processor? In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume
Let us assume that processor testing is done by filling the PC, registers, and data and instruction memories with some values (you can choose which values), letting a single instruction execute, then
Which existing blocks (if any) can be used for this instruction? The basic single-cycle MIPS implementation in Figure 4.2 can only implement some instructions. New instructions can be added to an
Find all data dependences in this instruction sequence. Problems in this exercise refer to the following instruction sequences: a. b. ADD R1, R2, R1 LW R2,0 (R1) LW R1,4 (R1) OR R3, R1, R2 LW R1,0
How much energy is spent to execute an ADD instruction in a single-cycle design and in the 5-stage pipelined design? This exercise explores energy efficiency and its relationship with performance.
Which exceptions can each of these instructions trigger? For each of these exceptions, specify the pipeline stage in which it is detected. This exercise explores how exception handling affects
Using a table similar to that shown in Figure 3.11, calculate A divided by B using the hardware described in Figure 3.12. You should show the contents of each register on each step. Assume A and B
Calculate A + (B + C) by hand, assuming A, B, and C are stored in the modified 16-bit NVIDIA format described in 3.11.2 (and also described in the text). Assume 1 guard, 1 round bit, and 1 sticky
Write an MIPS assembly language program to calculate A divided by B using non-restoring division. Assume A and B are 6-bit signed (two’s complement) integers. Figure 3.10 describes a restoring
Use a flow chart (or a high-level code snippet) to describe how the algorithm works. Division is so time-consuming and difficult that the CRAY T3E Fortran Optimization guide states, “The best
Calculate (A × B) + (A × C) by hand, assuming A, B, and C are stored in the modified 16-bit NVIDIA format described in 3.11.2 (and also described in the text). Assume 1 guard, 1 round bit, and 1
If this bit pattern is placed into the Instruction Register, what MIPS instruction will be executed? In a Von Neumann architecture, groups of bits have no intrinsic meanings by themselves. What a bit
What is the sum of A and B if they represent signed 12-bit octal numbers stored in sign-magnitude format? The result should be written in octal. Show your work. The book shows how to add and subtract
What is the sum of A and B if they represent signed 16-bit hexadecimal numbers stored in sign-magnitude format? The result should be written in hexadecimal. Show your work. Hexadecimal (base 16) is
NVIDIA has a “half” format, which is similar to IEEE 754 except that it is only 16 bits wide. The leftmost bit is still the sign bit, the exponent is 5 bits wide and stored in excess-56 format,
Write down the bit pattern in the mantissa assuming a floating point format that uses Binary Coded Decimal (base 10) numbers in the mantissa instead of base 2. Assume there are 24 bits, and you do
Assume A and B are signed 8-bit decimal integers stored in sign magnitude format. Calculate A – B. Is there overflow, underflow, or neither? Overflow occurs when a result is too large to be
Write an MIPS assembly language program to calculate the product of unsigned integers A and B, using the approach described in Figure 3.4. [Figure 3.4] Let’s look in more detail at multiplication. We
Write an MIPS assembly language program to calculate the product of A and B, assuming they are stored using the format described in 3.11.1. Now modify the program to calculate the sum assuming the
Calculate the time necessary to perform a multiply using the approach given in Figure 3.8 if an integer is A bits wide and an adder takes B time units. [Figure 3.8] For many reasons, we would like to
Write an MIPS assembly language program that performs a multiplication on signed integers using shifts and adds, using the approach described in 3.6.1. In this exercise we will look at a couple of
Write an MIPS assembly language program to calculate A divided by B, using the approach described in Figure 3.9. Assume A and B are unsigned 6-bit integers. [Figure 3.9] Let’s look in more detail at
Based on your answers to 3.13.1 and 3.13.2, does (A + B) + C = A + (B + C)? Operations performed on fixed-point integers behave the way one expects—the commutative, associative, and distributive
How does the performance of restoring and non-restoring division compare? Demonstrate by showing the number of steps necessary to calculate A divided by B using each method. Assume A and B are 6-bit
Write an MIPS assembly language program to perform division using the algorithm. Division is so time-consuming and difficult that the CRAY T3E Fortran Optimization guide states, “The best strategy
Based on your answers to 3.14.1 and 3.14.2, does (A × B) + (A × C) = A × (B + C)? The Associative law is not the only one that does not always hold in dealing with floating point numbers. There
What decimal number does the bit pattern represent if it is a floating point number? Use the IEEE 754 standard. In a Von Neumann architecture, groups of bits have no intrinsic meanings by themselves.
Convert A into a decimal number, assuming it is unsigned. Repeat assuming it is stored in sign-magnitude format. Show your work. The book shows how to add and subtract binary and decimal numbers. However,
Convert A into a decimal number, assuming it is unsigned. Repeat assuming it is stored in sign-magnitude format. Show your work. Hexadecimal (base 16) is also a commonly used numbering system for
The Hewlett-Packard 2114, 2115, and 2116 used a format with the leftmost 16 bits being the mantissa stored in two’s complement format, followed by another 16-bit field which had the leftmost 8 bits
Write down the bit pattern assuming that we are using base 15 numbers in the mantissa instead of base 2. (Base 16 numbers use the symbols 0–9 and A–F. Base 15 numbers would use 0–9 and A–E.)
When multiplying signed numbers, one way to get the correct answer is to convert the multiplier and multiplicand to positive numbers, save the original signs, and then adjust the final value
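
The sign-handling approach in the last item above (multiply the magnitudes, then fix up the sign) can be sketched in a few lines. This is a minimal illustration in Python rather than MIPS assembly, and the function names `shift_add_multiply` and `signed_multiply` are hypothetical, not taken from the textbook. It assumes the rule that the product is negative exactly when the original signs differ.

```python
def shift_add_multiply(multiplicand: int, multiplier: int) -> int:
    """Unsigned shift-and-add multiply; both arguments must be non-negative."""
    product = 0
    while multiplier:
        # Add the shifted multiplicand whenever the current multiplier bit is 1.
        if multiplier & 1:
            product += multiplicand
        multiplicand <<= 1  # next bit carries twice the weight
        multiplier >>= 1    # move on to the next multiplier bit
    return product


def signed_multiply(a: int, b: int) -> int:
    """Multiply signed integers by working on magnitudes and restoring the sign."""
    # Save the original signs: the result is negative only if they differ.
    negative = (a < 0) != (b < 0)
    magnitude = shift_add_multiply(abs(a), abs(b))
    return -magnitude if negative else magnitude
```

For instance, `signed_multiply(-6, 7)` multiplies 6 by 7 as unsigned values and then negates the result because the operand signs differ.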
