Question
(a) A superscalar processor may speculatively execute loads even when one or more earlier stores have not yet computed their memory addresses. In practice, we would need to restart execution from the speculative load if a memory-carried dependency is subsequently detected.
  (i) With the help of some additional hardware it is possible to record which loads cause such ordering violations. Briefly outline how this could be done and how such a record could be used to help improve performance. [3 marks]
  (ii) Describe why such a scheme may unnecessarily delay the issuing of a load even when the mechanism correctly recalls that the load has led to an order violation between a store and a load in the past. [4 marks]
(b) Why might it also be advantageous for a superscalar processor to predict whether a particular load will hit or miss in the processor's L1 data cache? [3 marks]
(c) You are asked to design hardware to run artificial neural network applications in a high-performance and energy-efficient manner. Such workloads can typically make good use of many multiply-accumulate (MAC) units operating in parallel and narrow datatypes. Your system is required to support a range of different neural networks that vary considerably in the type of computations they perform. You consider three approaches: (1) to use a multicore processor; (2) to design a single domain-specific accelerator; (3) to compose your design from two or more domain-specific accelerators where each is specialised for different types of neural network.
  (i) What are the advantages and disadvantages of each approach? [6 marks]
  (ii) Describe one possible way of organising the multicore processor and a possible choice for the architecture(s) of its individual cores. Briefly justify your design decisions.
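As an illustration of the kind of record the first part of the architecture question has in mind, the sketch below models a small table of saturating counters indexed by the load's program counter. It is only a toy software model under stated assumptions: the class name, table size, counter width, threshold and index hash are all hypothetical choices, not something the question specifies.

```python
class LoadViolationPredictor:
    """Toy model: 2-bit saturating counters indexed by load PC (hypothetical design)."""

    def __init__(self, entries=1024, threshold=2):
        self.entries = entries          # assumed table size
        self.threshold = threshold      # counter value at which a load is held back
        self.counters = [0] * entries

    def _index(self, load_pc):
        # Assumes 4-byte instructions: drop the two low bits, then wrap into the table.
        return (load_pc >> 2) % self.entries

    def should_wait(self, load_pc):
        """True if the load should wait until earlier store addresses are known."""
        return self.counters[self._index(load_pc)] >= self.threshold

    def record_violation(self, load_pc):
        """A speculatively issued load was found to have bypassed a conflicting store."""
        self.counters[self._index(load_pc)] = 3  # saturate so the load waits next time

    def record_safe(self, load_pc):
        """The load completed without conflicting with any earlier in-flight store."""
        i = self._index(load_pc)
        self.counters[i] = max(0, self.counters[i] - 1)


predictor = LoadViolationPredictor()
predictor.record_violation(0x400100)
print(predictor.should_wait(0x400100))  # the load at this PC is now held back
```

Note that a table like this remembers only that a load has conflicted, not which store it conflicted with, so a held-back load waits for every earlier unresolved store. Different loads can also share a table entry. Both effects are relevant to part (ii).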
(a) Compute the local alignment between the sequences GATTACA and TATACG using the following rules: match score = +5, mismatch = -3, gap penalty = -4. Discuss how the alignment depends on the choice of match score, mismatch score and gap penalty. [5 marks]
(b) Discuss how a local alignment algorithm allows identification of internal sequence duplications. [3 marks]
(c) Define the UPGMA algorithm and state and justify its complexity. What is the output of the algorithm given the distance matrix of the species X1, X2, X3, X4 below?
species  X1  X2  X3
X2        2
X3        4   4
X4        6   6   6
[4 marks]
(d) Discuss a method to perform random access in DNA-based storage memory. [4 marks]
(e) Discuss with one example the complexity of the Gillespie algorithm and comment on the main differences with respect to a deterministic approach.
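The local alignment in part (a) can be checked mechanically. The following is a minimal sketch of Smith-Waterman dynamic programming with a linear gap penalty, using the scores given in the question as defaults. The function name and the tie-breaking order in the traceback (diagonal, then up, then left) are assumptions made here; other tie-breaking orders can place the gap differently while giving the same score.

```python
def smith_waterman(a, b, match=5, mismatch=-3, gap=-4):
    """Return (best_score, aligned_a, aligned_b) for a local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; negative scores are clamped to zero.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("GATTACA", "TATACG")
print(score)
print(top)
print(bottom)
```

With the scores from the question the best local score is 16, aligning ATTAC against ATAC with a single gap (four matches at +5, one gap at -4). Re-running with different `match`, `mismatch` and `gap` values is a quick way to explore the second half of part (a). For instance, the ungapped match TAC/TAC scores only 15 here, so a sufficiently harsher gap penalty would make it preferable to the gapped alignment.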
b) The stability of a sampled-data system can be checked by examining the characteristic polynomial corresponding to the pulse transfer function of the system.
(i) Discuss the use of the bilinear transformation z = (r+1)/(r+2) to determine the stability of a sampled-data control system.
(ii) The characteristic polynomial corresponding to the pulse transfer function of a sampled-data system is f(z) = z^3 + 5.94z^2 + 7.2z - 0.368. Investigate the stability of the system.
c) A discrete-time system can be simulated on a digital computer in the same way as a continuous-time system on an analogue computer. The only difference is that integrators are replaced by delays. This implies that the blocks containing s^{-1} are replaced by blocks containing z^{-1}. Use this approach to draw the simulation diagram for a system represented by the transfer function G(z) = ( 3z+ 4z )/( z 1.2z +0.45 z - 0.05)
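The stability question for f(z) = z^3 + 5.94z^2 + 7.2z - 0.368 can be sanity-checked numerically. The sketch below does not use the bilinear transformation from the question. Instead it applies the Jury conditions for a cubic directly in the z-plane, and also looks for a sign change of f on the real axis outside the unit circle. The function names are assumptions made for this example.

```python
def horner(coeffs, z):
    """Evaluate a polynomial given highest-power-first coefficients."""
    value = 0.0
    for c in coeffs:
        value = value * z + c
    return value


def jury_cubic_stable(a3, a2, a1, a0):
    """Jury conditions for a3*z^3 + a2*z^2 + a1*z + a0, assuming a3 > 0.

    Returns (stable, checks), where checks maps each condition to True/False.
    All roots lie inside the unit circle only if every condition holds.
    """
    coeffs = [a3, a2, a1, a0]
    checks = {
        "f(1) > 0": horner(coeffs, 1.0) > 0,
        "f(-1) < 0": horner(coeffs, -1.0) < 0,
        "|a0| < a3": abs(a0) < a3,
        "|a0^2 - a3^2| > |a0*a2 - a1*a3|":
            abs(a0 * a0 - a3 * a3) > abs(a0 * a2 - a1 * a3),
    }
    return all(checks.values()), checks


f = [1.0, 5.94, 7.2, -0.368]
stable, checks = jury_cubic_stable(*f)
for name, ok in checks.items():
    print(name, ok)
print("stable:", stable)

# Independent check: a sign change of f between z = -2 and z = -1
# means there is a real root with magnitude greater than 1.
print(horner(f, -2.0), horner(f, -1.0))
```

For this polynomial the first three conditions hold but the last one fails (about 0.865 against about 9.386), so the system is unstable. This agrees with f(-2) being positive and f(-1) negative, which places a real root between -2 and -1, outside the unit circle.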