Question


Posted on Sep 26, 2024


In the classic 5-stage pipeline, it is proposed to predict branches as always taken instead of always untaken. The branch instruction is decoded in ID and its target address is computed in ID. At the end of ID, a conditional branch is always assumed taken and IF is systematically flushed. Then, in the EX stage, the branch condition is evaluated. If the branch is verified taken, execution continues. However, if the branch is verified untaken, the IF and ID stages are flushed and the instruction at branch_PC + 4 is fetched.

(a) What is the fraction f of branches that should be taken so that the design with branches predicted always taken is a better choice than the design with branches predicted always untaken?

(b) A hint bit is associated with each conditional branch instruction. The compiler sets the hint bit to steer the hardware prediction to taken, and it resets the hint bit to steer the hardware prediction to untaken. The hint bit is known in the decode stage, so the two hardwired schemes (always taken and always untaken) can be applied to each branch instruction with no additional loss of cycles. What should the success rate of the compiler's prediction be so that the performance of this approach is always better than the hardware scheme where a branch is predicted always taken? What should the success rate of the compiler's prediction be so that the performance of this approach is always better than the hardware scheme where a branch is predicted always untaken? The compiler prediction success rates for taken and untaken branches are assumed equal. Please use the following variables in your solution: f is the fraction of taken branches; X is the success rate of the compiler prediction algorithm (i.e., the fraction of branches that are accurately predicted by the compiler). X should be a function of f in both cases.

(c) Take the 5-stage pipeline with perfect branch handling (optimum, no cycle ever wasted on branches) as the baseline. Compare the energy per instruction (EPI) of this baseline with the following cases: always predicted untaken; always predicted taken; compiler-based prediction with a hint bit and a 5% misprediction rate, uniform over all predictions. To make this problem tractable, assume that each stage of the pipeline consumes the same energy per clock (whatever the instruction, even after an instruction has become a noop) and that the energy needed to flush a stage is negligible. Also assume that the fraction of instructions that are branches is denoted by b.
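
One way to read the penalties implied by the question is sketched below in Python. This is a minimal model under stated assumptions, not an official solution: the per-branch cycle costs (2 cycles whenever IF and ID are flushed in EX, 1 cycle when only IF is flushed at the end of ID), the baseline CPI of 1 with no other stalls, and all function names are assumptions introduced here for illustration.

```python
from fractions import Fraction


def penalty_untaken(f):
    """Average cycles lost per branch when predicting always untaken.

    Assumption: a taken branch is only discovered in EX, so IF and ID
    are flushed (2 cycles); an untaken branch costs nothing.
    """
    return 2 * f


def penalty_taken(f):
    """Average cycles lost per branch when predicting always taken.

    Assumption: IF is flushed at the end of ID for every branch (1 cycle
    if the branch turns out taken); an untaken branch is corrected in EX
    by flushing IF and ID (2 cycles in total).
    """
    return f * 1 + (1 - f) * 2


def penalty_hint(f, x):
    """Average cycles lost per branch with a compiler hint bit.

    x is the compiler success rate, assumed equal for taken and untaken
    branches, as the question states.
    """
    taken = f * (x * 1 + (1 - x) * 2)          # right hint: 1, wrong hint: 2
    untaken = (1 - f) * (x * 0 + (1 - x) * 2)  # right hint: 0, wrong hint: 2
    return taken + untaken


def relative_epi(b, penalty):
    """EPI relative to the perfect-branch baseline.

    Assumption: every stage burns the same energy every clock, so EPI
    scales with cycles per instruction, and the baseline CPI is 1.
    """
    return 1 + b * penalty


if __name__ == "__main__":
    f = Fraction(1, 2)   # hypothetical fraction of taken branches
    b = Fraction(1, 5)   # hypothetical fraction of branch instructions
    x = Fraction(19, 20) # 5% misprediction rate from part (c)
    print("untaken:", relative_epi(b, penalty_untaken(f)))
    print("taken:  ", relative_epi(b, penalty_taken(f)))
    print("hint:   ", relative_epi(b, penalty_hint(f, x)))
```

Under this reading, the two hardwired schemes break even at f = 2/3, and the hint-bit scheme matches always taken at X = f/(2 - f) and always untaken at X = 2(1 - f)/(2 - f). The values of f and b in the usage example are placeholders, not data from the question.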