Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Consider the reservation table in Figure 8.4. Suppose that the processor includes forwarding logic that is able to tell that instruction A is writing to

Consider the reservation table in Figure 8.4. Suppose that the processor includes forwarding logic that is able to tell that instruction A is writing to the same register that instruction B is reading from, and that therefore the result written by A can be forwarded directly to the ALU before the write is done. Assume the forwarding logic itself takes no time. Give the revised reservation table. How many cycles are lost to the pipeline bubble?
image text in transcribed
reached the memory stage before the PC can he updated. The instactions that follow in memory will have been fetched and will be at the decode and esvecute stages already by the time it is determined that those instructions should ot in fact be enacutod Like data harands, there are multiple lechniques fordcaling with coelharands A delayed branch simply documents the fact that the beanch will be taken some number of cyclkes after it is encountered, and leaves it upto the programmer (or compiler) to ensure that the instructions that follow the conditional branch instnaction are either harmless like no-ops) or do useful work that does not depend on whother the beanch is taken. An interlock proviades hardware to itsert pipeline bubles as meodedl jst as with data harards In the most elaborate technique, speculative esecution hwae estimutes whether the branch is likely to be taken, and begins eecuting the instructions it expects to evecute. If its expectation is not met, then it undoes any sade effects (ach as regier writes) that the speculatively executed instructions caused Except for explicit pipelines and delayed beanches, all of these tochniges introdace vari- aility in the timing of execution of an instruction seqaence. Analysis of the timing of segisner bank read egister bank wne cyde Figure 8.4: Reservation table for the pipeline shown in Figure 82 with interlocks assuming that rstrucion B reads a register that is wrtw, by rtutioh A. Lee &Seshia, Intreduction to Embedded Systems 227 8.2. PARALLELISM a program can become extremely difficult when there is a decp pipcline with elaborate forwarding and speculation. Explicit pipelines are selatively common in DSP processors, which are ofiten applied in contests where precise timing is essential Out-of-onder and speculative execution are common in peneral purpose processors whene timing matters quirements of the application and avoid processors whene the requis evel of sming Achieving high performance demands parallelism in the handware Such parallelism can take two broad Sorms, multicoee archiloctures, describad laier in Section 824, ee instruction-level parallelism (ILP, which is the subject of this section. A processoe supporting ILP is able to perform multiple indepemdent aperations in each instruction cyck. We discuss four major forms of ILP: CISC instructions, subwond parallelism, su perscalar, and VI.IW CISC Instructions A processor with complex (and typicalily, rather specialined) insaructioes is calleda CISC machine (complex instruction set computer). The p behind sach processors is distinetly different from that of RISC machines (redaced instruction set cempaters) Patterson and Ditzel, 1980) DSPs are typically CISC machines and inclade intructions specifically supporting FIR filtering (and ofien other algonithms sech as FFTs (fast Fourier transforms) and Viterbi decoding). In fact, so qualify as a DSP a processor must be able to perform FIR filering in one instrection cycle per tap Esample 8.6: The Texas Instuments TM532054s family of DSP processoes s imended to be used in power-constrained embedded applicatices that demand personal diital assistants (PDAs) The inner locp of an FIR computation (3I)is RPT numberofTaps reached the memory stage before the PC can he updated. The instactions that follow in memory will have been fetched and will be at the decode and esvecute stages already by the time it is determined that those instructions should ot in fact be enacutod Like data harands, there are multiple lechniques fordcaling with coelharands A delayed branch simply documents the fact that the beanch will be taken some number of cyclkes after it is encountered, and leaves it upto the programmer (or compiler) to ensure that the instructions that follow the conditional branch instnaction are either harmless like no-ops) or do useful work that does not depend on whother the beanch is taken. An interlock proviades hardware to itsert pipeline bubles as meodedl jst as with data harards In the most elaborate technique, speculative esecution hwae estimutes whether the branch is likely to be taken, and begins eecuting the instructions it expects to evecute. If its expectation is not met, then it undoes any sade effects (ach as regier writes) that the speculatively executed instructions caused Except for explicit pipelines and delayed beanches, all of these tochniges introdace vari- aility in the timing of execution of an instruction seqaence. Analysis of the timing of segisner bank read egister bank wne cyde Figure 8.4: Reservation table for the pipeline shown in Figure 82 with interlocks assuming that rstrucion B reads a register that is wrtw, by rtutioh A. Lee &Seshia, Intreduction to Embedded Systems 227 8.2. PARALLELISM a program can become extremely difficult when there is a decp pipcline with elaborate forwarding and speculation. Explicit pipelines are selatively common in DSP processors, which are ofiten applied in contests where precise timing is essential Out-of-onder and speculative execution are common in peneral purpose processors whene timing matters quirements of the application and avoid processors whene the requis evel of sming Achieving high performance demands parallelism in the handware Such parallelism can take two broad Sorms, multicoee archiloctures, describad laier in Section 824, ee instruction-level parallelism (ILP, which is the subject of this section. A processoe supporting ILP is able to perform multiple indepemdent aperations in each instruction cyck. We discuss four major forms of ILP: CISC instructions, subwond parallelism, su perscalar, and VI.IW CISC Instructions A processor with complex (and typicalily, rather specialined) insaructioes is calleda CISC machine (complex instruction set computer). The p behind sach processors is distinetly different from that of RISC machines (redaced instruction set cempaters) Patterson and Ditzel, 1980) DSPs are typically CISC machines and inclade intructions specifically supporting FIR filtering (and ofien other algonithms sech as FFTs (fast Fourier transforms) and Viterbi decoding). In fact, so qualify as a DSP a processor must be able to perform FIR filering in one instrection cycle per tap Esample 8.6: The Texas Instuments TM532054s family of DSP processoes s imended to be used in power-constrained embedded applicatices that demand personal diital assistants (PDAs) The inner locp of an FIR computation (3I)is RPT numberofTaps

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Oracle Database 10g Insider Solutions

Authors: Arun R. Kumar, John Kanagaraj, Richard Stroupe

1st Edition

0672327910, 978-0672327919

More Books

Students also viewed these Databases questions

Question

2. What recommendations will you make to the city council?

Answered: 1 week ago

Question

3. The group answers the questions.

Answered: 1 week ago