Question
(a) Consider an alphabet of 5 symbols whose probabilities are as follows:

A: 1/16, B: 1/4, C: 1/8, D: 1/16, E: 1/2

One of these symbols has been selected at random and you need to discover which symbol it is by asking 'yes/no' questions that will be truthfully answered.
(i) What would be the most efficient sequence of such questions that you could ask in order to discover the selected symbol? [2 marks]
(ii) By what principle can you claim that each of your proposed questions in the sequence is maximally informative? [2 marks]
(iii) On average, how many such questions will need to be asked before the symbol is discovered? What is the entropy of the symbol set? [2 marks]
(iv) Construct a uniquely decodable prefix code for the symbols. Explain why it is uniquely decodable and why it has the prefix property. [2 marks]
(v) Relate the bits in the code words forming your prefix code to the 'yes/no' questions that you proposed in (i). [2 marks]

(b) Explain how the bits in an IrisCode are set by phase sequencing. Discuss how quantisation of the complex plane into phase quadrants sets each pair of bits; why it is beneficial for quadrant codes to form a Gray Code; how much entropy is thereby typically extracted from iris images; and why such bit sequences enable extremely efficient identity searches and matching. [5 marks]

(c) Consider a noisy analog communication channel of bandwidth 1 MHz, which is perturbed by additive white Gaussian noise whose total spectral power is $N_0 = 1$. Continuous signals are transmitted across such a channel, with average transmitted power $P = 1{,}000$. Give a numerical estimate for the channel capacity, in bits per second, of this noisy channel. Then, for a channel having the same bandwidth but whose signal-to-noise ratio $P/N_0$ is four times better, repeat your numerical estimate of capacity in bits per second. [5 marks]
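The arithmetic behind part (c) can be sketched with the Shannon-Hartley formula $C = W \log_2(1 + \mathrm{SNR})$. This is only a sketch: it assumes that the ratio $P/N_0$ given in the question can be used directly as the linear signal-to-noise ratio, and the function name `shannon_capacity` is introduced here for illustration.

```python
import math

def shannon_capacity(bandwidth_hz, snr):
    """Capacity in bits per second for a linear (not dB) signal-to-noise ratio."""
    return bandwidth_hz * math.log2(1.0 + snr)

W = 1e6                # bandwidth: 1 MHz
snr = 1000.0 / 1.0     # assumption: SNR = P / N0 with P = 1000, N0 = 1

c1 = shannon_capacity(W, snr)        # original channel
c2 = shannon_capacity(W, 4 * snr)    # SNR four times better

print(f"capacity at SNR={snr:.0f}: {c1 / 1e6:.2f} Mbit/s")
print(f"capacity at SNR={4 * snr:.0f}: {c2 / 1e6:.2f} Mbit/s")
```

Under this assumption, quadrupling the signal-to-noise ratio adds roughly $\log_2 4 = 2$ bits per second per hertz, rather than quadrupling the capacity.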
Continuous Mathematics
The complex form of the Fourier series is:
where $c_k$ is a complex number and $c_k = c_{-k}^{*}$.
(a) Prove that the complex coefficient, $c_k$, encodes the amplitude and phase
coefficients, $A_k$ and $\phi_k$, in the alternative form:
An interprocess communication environment is based on synchronous message
passing. A server is to be designed to support a moderate number of simultaneous
client requests.
Clients send a request message to the server, continue in parallel with server
operation, then wait for the server's reply message.
Discuss the design of the server's interaction with the clients. Include any problems
you foresee and discuss alternative solutions to them. [20 marks]
CST.2001.4.3
3 Further Java
(a) Describe how mutual-exclusion locks provided by the synchronized keyword
can be used to control access to shared data structures. In particular you
should be clear about the behaviour of concurrent invocations of different
synchronized methods on the same object, or of the same synchronized method
on different objects. [6 marks]
(b) Consider the following class definition:
class Example implements Runnable {
public static Object o = new Object();
int count = 0;
public void run() {
while (true) {
synchronized (o) {
(a) A software module controls a car park of known capacity. enter() and exit() are triggered when cars enter and leave via the barriers. Give pseudocode for the enter and exit procedures (i) if the module is a monitor [8 marks] (ii) if the programming language in which the module is written provides only semaphores [4 marks] (b) Outline the implementation of (i) semaphores [4 marks] (ii) monitors [4 marks]
In a proposed, next-generation banking system a number of transactions are to be scheduled to run concurrently: Debit (D) transactions to make payments from customer accounts to a credit card company. Interest (I) transactions to add daily interest to customer account balances. Transfer (T) transactions which first check whether the source account contains sufficient funds then either abort or continue the transfer from source to destination accounts. Customer x is running a T to transfer 1000 from A to B. Customer y is running a T to transfer 200 from B to A. (a) Discuss the potential for interference between any of these transactions. [7 marks] (b) Demonstrate the effect of concurrency control based on strict two-phase locking in relation to the discussion in (a). [8 marks] (c) Comment on the scope of concurrency control in relation to the discussion in (a). [5 marks] [Hint: you may assume that operations on bank account objects, such as debit, credit and add-interest are atomic.]
You are asked to write a Prolog program to work with binary trees. Your code should not rely on any library predicates and you should assume that the interpreter is running without occurs checking. (a) Describe a data representation scheme for such trees in Prolog and demonstrate it by encoding the tree shown above. [3 marks] (b) Implement a Prolog predicate bfs/2 which effects a breadth-first traversal of a tree passed as the first argument and unifies the resulting list with its second argument. For example, when given the tree shown above as the first argument the predicate should unify the second argument with the list [3,2,7,4,2,5]. [4 marks] (c) Explain why the bfs/2 predicate might benefit from being converted to use difference lists. [2 marks] (d) Implement a new predicate diffbfs/2 which makes use of a difference list to exploit the benefit you identified in part (c). Your predicate should take the same arguments as bfs/2. [6 marks] (e) A friend observes that a clause in diffbfs/2 will need to contain an empty difference list and proposes two possible ways of representing it, either []-[] or A-A. Consider your implementation of diffbfs/2. For each use of an empty difference list, justify your choice and explain what can go wrong using the alternative form. [2 marks]

(a) Suppose that R(A, B, C) is a relational schema with functional dependencies F = {A, B → C, C → B}. (i) Is this schema in 3NF? Explain. [2 marks] (ii) Is this schema in BCNF? Explain. [2 marks] (b) Decomposition plays an important role in database design. (i) Define what is meant by a lossless-join decomposition. [2 marks] (ii) Define what is meant by a dependency preserving decomposition. [2 marks] (c) Let R(A, B, C, D, E) be a relational schema with the following functional dependencies (i) What is the closure of {A, B}? [2 marks] (ii) What is the closure of {B, E}? [2 marks] (iii) Decompose the schema to BCNF in two different ways. In each case, are all dependencies preserved? Explain. [4 + 4 marks]
For a transaction model based on objects and object operation time-stamps: (a) (i) Define how conflict may be specified in terms of object operation semantics. (ii) Give an example of conflicting operations. (iii) Give an example of non-conflicting operations that would be defined as conflicting under read-write semantics. [3 marks] (b) Define the necessary and sufficient condition for two transactions to be serialisable. Give an example of a non-serialisable execution of a pair of transactions. [3 marks] (c) Define the necessary and sufficient condition for any number of transactions to be serialisable. [1 mark] (d) Discuss how the following methods of providing concurrency control in database systems enforce the properties defined above. (i) Strict two-phase locking. [4 marks] (ii) Strict timestamp ordering. [4 marks] (iii) Optimistic concurrency control. [5 marks]
(a) In the context of virtual memory management: (i) ... implemented? [4 marks] (ii) What is meant by temporal locality of reference? [2 marks] (iii) How does the assumption of temporal locality of reference influence page replacement decisions? Illustrate your answer by briefly describing an appropriate page replacement algorithm or algorithms. [3 marks] (iv) What is meant by spatial locality of reference? [2 marks] (v) In what ways does the assumption of spatial locality of reference influence the design of the virtual memory system? [3 marks] (b) Buses are used to connect devices to the processor. (i) Describe with the aid of a diagram the operation of a synchronous bus. [4 marks] (ii) In what ways does an asynchronous bus differ? [2 marks]
Consider an operating system that uses hardware support for paging to provide virtual memory to applications. (a) (i) Explain how the hardware and operating system support for paging combine to prevent one process from accessing another's memory. [3 marks] (ii) Explain how space and time overheads arise from use of paging, and how the Translation Lookaside Buffer (TLB) mitigates the time overheads. [3 marks] (b) Consider a system with a five level page table where each level in the page table is indexed by 9 bits and pages are 4 kB in size. A TLB is provided that is indexed by the first 57 bits of the address provided by the process, and achieves a 90% hit rate. A main memory access takes 40 ns while an access to the TLB takes 10 ns. The maximum memory read bandwidth is 100 GB/s. (i) What is the effective memory access latency? [4 marks] (ii) A colleague suggests replacing the system above with one that provides 80 GB/s memory read bandwidth and main memory access latency of 30 ns. Explain whether you should accept the replacement or not, and why. [4 marks] (c) A creative engineer suggests structuring the TLB so that not all the bits of the presented address need match to result in a hit. Suggest how this might be achieved, and what might be the costs and benefits of doing so. [6 marks]
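One common way to model the effective access latency asked for in part (b)(i) is sketched below. The model is an assumption, not something the question states: the TLB is probed on every access, a miss triggers a sequential page-table walk costing one main memory access per level, there are no page-walk caches, and the final data access always costs one further memory access. The function name `effective_access_ns` is hypothetical.

```python
def effective_access_ns(hit_rate, tlb_ns, mem_ns, levels):
    """Average latency of one memory reference under the simple model above."""
    # TLB hit: probe the TLB, then access the data.
    hit_cost = tlb_ns + mem_ns
    # TLB miss: probe the TLB, walk every page-table level, then access the data.
    miss_cost = tlb_ns + levels * mem_ns + mem_ns
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost

# Figures from the question: 90% hit rate, 10 ns TLB, 40 ns memory, 5 levels.
latency = effective_access_ns(hit_rate=0.9, tlb_ns=10, mem_ns=40, levels=5)
print(f"effective access latency: {latency:.1f} ns")
```

The same function can be called with `mem_ns=30` to explore the trade-off in part (b)(ii), although bandwidth does not appear in this latency-only model and would need to be weighed separately.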
(a) Describe two quantitative and two qualitative techniques for analysing the usability of a software product. [4 marks] (b) Compare the costs and benefits of the quantitative techniques. [6 marks] (c) Compare the costs and benefits of the qualitative techniques. [6 marks] (d) If restricted to a single one of these techniques when designing a new online banking system, which would you choose and why?
(a) Suppose that women who live beyond the age of 80 outnumber men in the same age group by three to one. How much information, in bits, is gained by learning that a person who lives beyond 80 is male? [2 marks]
(b) Consider n discrete random variables, named $X_1, X_2, \ldots, X_n$, of which $X_i$ has entropy $H(X_i)$, the largest being $H(X_L)$. What is the upper bound on the joint entropy $H(X_1, X_2, \ldots, X_n)$ of all these random variables, and under what condition will this upper bound be reached? What is the lower bound on the joint entropy $H(X_1, X_2, \ldots, X_n)$? [3 marks]
(c) If discrete symbols from an alphabet S having entropy H(S) are encoded into blocks of length n symbols, we derive a new alphabet of symbol blocks $S^n$. If the occurrence of symbols is independent, then what is the entropy $H(S^n)$ of this new alphabet of symbol blocks? [2 marks]
(d) Consider an asymmetric communication channel whose input source is the binary alphabet X = {0, 1} with probabilities {0.5, 0.5} and whose outputs Y are also this binary alphabet {0, 1}, but with asymmetric error probabilities. Thus an input 0 is flipped with probability $\alpha$, but an input 1 is flipped with probability $\beta$, giving this channel matrix:

$$p(y \mid x) = \begin{pmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{pmatrix}$$

(i) Give the probabilities of both outputs, p(Y = 0) and p(Y = 1). [2 marks]
(ii) Give all the values of $(\alpha, \beta)$ that would maximise the capacity of this channel, and state what that capacity then would be. [3 marks]
(iii) Give all the values of $(\alpha, \beta)$ that would minimise the capacity of this channel, and state what that capacity then would be. [3 marks]
(e) In order for a variable length code having N codewords with bit lengths ... [1 mark]
(f) The information in continuous signals which are strictly bandlimited (lowpass or bandpass) is quantised, in that such continuous signals can be completely represented by a finite set of discrete samples. Describe two theorems about how discrete samples suffice for exact reconstruction of continuous bandlimited signals, even at all the points between the sampled values. [4 marks]
(a) Codd's 1970 paper introduced the Relational Model of data to address the difficulties of building database applications using the technology that was available at the time. (i) What problems were encountered by database developers before Codd introduced the Relational Model? [1 mark] (ii) Describe the basic elements of the Model, and explain what is meant by a relational schema. [4 marks] (iii) Explain how a formal schema can assist both the application database designer and a database application programmer. What if any are the disadvantages of adopting a mathematical description of database structure? [5 marks] (b) In 1976 Peter Chen introduced the Entity Relationship (E-R) Model to support a more natural description of real world data. (i) Describe the basic elements of the Model, and explain some of the choices available to the database designer. [4 marks] (ii) Explain what is meant by a foreign key in the relational model. How could you use foreign keys to represent a database described by an E-R model in relational form? To what extent are the two approaches to data modelling complementary? [6 marks]
(a) A two state Markov process emits the letters {A, B, C, D, E} with the probabilities shown for each state. Changes of state can occur when some of the symbols are generated, as indicated by the arrows.

4.2 Information sources with memory

We will wish to consider sources with memory, so we also consider Markov processes. Our four event process (a symbol is generated on each edge) is shown graphically together with a two state Markov process for the alphabet {A, B, C, D, E} in figure 17. We can then solve for the state occupancy using flow equations (this example is trivial).

... process with states $\{S_1, S_2, \ldots, S_n\}$, with transition probabilities $p_i(j)$ being the probability of moving from state $S_i$ to state $S_j$ (with the emission of some symbol). First we can define the entropy of each state in the normal manner:

$$H_i = -\sum_j p_i(j) \log_2 p_i(j)$$

and then the entropy of the system to be the sum of these individual state entropy values weighted with the state occupancy $P_i$ (calculated from the flow equations):

$$H = \sum_i P_i H_i = -\sum_i \sum_j P_i \, p_i(j) \log_2 p_i(j) \qquad (45)$$

Clearly for a single state, we have the entropy of the memoryless source.

4.3 The Source Coding theorem

Often we wish to efficiently represent the symbols generated by some source. We shall consider encoding the symbols as binary digits.

(i) What are the state occupancy probabilities? [1 mark]
(ii) What is the probability of the letter string AD being emitted? [1 mark]
(iii) What is the entropy of State 1, what is the entropy of State 2, and what is the overall entropy of this symbol generating process? [5 marks]

(b) A fair coin is secretly flipped until the first head occurs. Let X denote the number of flips required. The flipper will truthfully answer any "yes-no" questions about his experiment, and we wish to discover thereby the value of X as efficiently as possible.
(i) What is the most efficient possible sequence of such questions? Justify your answer. [2 marks]
(ii) On average, how many questions should we need to ask? Justify your answer. [2 marks]
(iii) Relate the sequence of questions to the bits in a uniquely decodable prefix code for X. [1 mark]

(c) Define complex Gabor wavelets, restricting yourself to one-dimensional functions if you wish, and list four key properties that make such wavelets useful for encoding and compressing information, as well as for pattern recognition. Explain how their self-Fourier property and their closure under multiplication (i.e. the product of any two of them is yet again a Gabor wavelet) gives them also closure under convolution. Mention one disadvantage of such wavelets for reconstructing data from their projection coefficients. [8 marks]
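The occupancy-weighted entropy of equation (45) can be sketched in a few lines of Python. The function names and the two-state numbers below are hypothetical illustrations, not the values from the exam figure (which is not reproduced here); the single-state check uses a memoryless five-symbol source with probabilities 1/2, 1/4, 1/8, 1/16, 1/16.

```python
import math

def state_entropy(probs):
    """H_i = -sum_j p_i(j) log2 p_i(j); zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def markov_entropy(occupancy, transitions):
    """H = sum_i P_i * H_i, with P_i the occupancy probability of state i."""
    return sum(P_i * state_entropy(p_i)
               for P_i, p_i in zip(occupancy, transitions))

# A single state reduces to the entropy of a memoryless source.
memoryless = markov_entropy([1.0], [[1/2, 1/4, 1/8, 1/16, 1/16]])

# Hypothetical two-state process: state 1 emits one of two equiprobable
# symbols, state 2 emits a single symbol with certainty.
two_state = markov_entropy([0.5, 0.5], [[0.5, 0.5], [1.0]])

print(memoryless, two_state)
```

The occupancy probabilities passed in are assumed to have been obtained already from the flow equations; the sketch does not solve for them.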
In P2P systems, a peer can come or go without warning. Thus, when designing a DHT, we also must be concerned about maintaining the DHT overlay in the presence of such peer churn. To get a big-picture understanding of how this could be accomplished, let's once again consider the circular DHT in Figure 2.27(a). To handle peer churn, we will now require each peer to track (that is, know the IP address of) its first and second successors; for example, peer 4 now tracks both peer 5 and peer 8. We also require each peer to periodically verify that its two successors are alive (for example, by periodically sending ping messages to them and asking for responses).

Let's now consider how the DHT is maintained when a peer abruptly leaves. For example, suppose peer 5 in Figure 2.27(a) abruptly leaves. In this case, the two peers preceding the departed peer (4 and 3) learn that 5 has departed, since it no longer responds to ping messages. Peers 4 and 3 thus need to update their successor state information. Let's consider how peer 4 updates its state:

1. Peer 4 replaces its first successor (peer 5) with its second successor (peer 8).
2. Peer 4 then asks its new first successor (peer 8) for the identifier and IP address of its immediate successor (peer 10). Peer 4 then makes peer 10 its second successor.

In the homework problems, you will be asked to determine how peer 3 updates its overlay routing information.

Having briefly addressed what has to be done when a peer leaves, let's now consider what happens when a peer wants to join the DHT. Let's say a peer with identifier 13 wants to join the DHT, and at the time of joining, it only knows about peer 1's existence in the DHT. Peer 13 would first send peer 1 a message, saying "what will be 13's predecessor and successor?" This message gets forwarded through the DHT until it reaches peer 12, who realizes that it will be 13's predecessor and that its current successor, peer 15, will become 13's successor.
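The two-step update that peer 4 performs when its first successor departs can be sketched as follows. This is a minimal in-memory sketch, not a networked implementation: the `Peer` class, `build_ring`, and `repair_first_successor` are hypothetical names, the ping is replaced by an `alive` flag, and the ring membership (1, 3, 4, 5, 8, 10, 12, 15) is assumed from the peer identifiers mentioned in the text.

```python
class Peer:
    """A DHT peer that tracks its first and second successors."""
    def __init__(self, ident):
        self.ident = ident
        self.first = None    # first successor
        self.second = None   # second successor
        self.alive = True    # stands in for "responds to ping messages"

def build_ring(ids):
    """Wire up a circular DHT in which every peer knows its two successors."""
    ordered = sorted(ids)
    peers = {i: Peer(i) for i in ordered}
    n = len(ordered)
    for k, i in enumerate(ordered):
        peers[i].first = peers[ordered[(k + 1) % n]]
        peers[i].second = peers[ordered[(k + 2) % n]]
    return peers

def repair_first_successor(peer):
    """Apply steps 1 and 2 if the first successor no longer answers pings."""
    if not peer.first.alive:
        # Step 1: promote the second successor to first successor.
        peer.first = peer.second
        # Step 2: ask the new first successor for its immediate successor.
        peer.second = peer.first.first

peers = build_ring([1, 3, 4, 5, 8, 10, 12, 15])
peers[5].alive = False                 # peer 5 abruptly leaves
repair_first_successor(peers[4])
print(peers[4].first.ident, peers[4].second.ident)
```

The sketch deliberately covers only the case the text walks through (a failed first successor); the case of a failed second successor, which peer 3 faces, is left out, as it is in the text.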
Next, peer 12 sends this predecessor and successor information to peer 13. Peer 13 can now join the DHT by making peer 15 its successor and by notifying peer 12 that it should change its immediate successor to 13.

DHTs have been finding widespread use in practice. For example, BitTorrent uses the Kademlia DHT to create a distributed tracker. In BitTorrent, the key is the torrent identifier and the value is the IP addresses of all the peers currently participating in the torrent [Falkner 2007, Neglia 2007]. In this manner, by querying the DHT with a torrent identifier, a newly arriving BitTorrent peer can determine the peer that is responsible for the identifier (that is, for tracking the peers in the torrent). After having found that peer, the arriving peer can query it for a list of other peers in the torrent.

2.7 Socket Programming: Creating Network Applications

Now that we've looked at a number of important network applications, let's explore how network application programs are actually created. Recall from Section 2.1 that a typical network application consists of a pair of programs, a client program and a server program, residing in two different end systems. When these two programs are executed, a client process and a server process are created, and these processes communicate with each other by reading from, and writing to, sockets. When creating a network application, the developer's main task is therefore to write the code for both the client and server programs.

There are two types of network applications. One type is an implementation whose operation is specified in a protocol standard, such as an RFC or some other standards document; such an application is sometimes referred to as "open," since the rules specifying its operation are known to all. For such an implementation, the client and server programs must conform to the rules dictated by the RFC. For example, the client program could be an implementation of the client side of the FTP protocol, described in Section 2.3 and explicitly defined in RFC 959; similarly, the server program could be an implementation of the FTP server protocol, also explicitly defined in RFC 959. If one developer writes code for the client program and another developer writes code for the server program, and both developers carefully follow the rules of the RFC, then the two programs will be able to interoperate. Indeed, many of today's network applications involve communication between client and server programs that have been created by independent developers, for example, a Firefox browser communicating with an Apache Web server, or a BitTorrent client communicating with a BitTorrent tracker.

The other type of network application is a proprietary network application. In this case the client and server programs employ an application-layer protocol that has not been openly published in an RFC or elsewhere. A single developer (or development team) creates both the client and server programs, and the developer has complete control over what goes in the code. But because the code does not implement an open protocol, other independent developers will not be able to develop code that interoperates with the application.
During the development phase, one of the first decisions the developer must make is whether the application is to run over TCP or over UDP. Recall that TCP is connection oriented and provides a reliable byte-stream channel through which data flows between two end systems. UDP is connectionless and sends independent packets of data from one end system to the other, without any guarantees about delivery. Recall also that when a client or server program implements a protocol defined by an RFC, it should use the well-known port number associated with the protocol; conversely, when developing a proprietary application, the developer must be careful to avoid using such well-known port numbers. (Port numbers were briefly discussed in Section 2.1. They are covered in more detail in Chapter 3.)
We introduce UDP and TCP socket programming by way of a simple UDP application and a simple TCP application, both presented in Python. We could have written the code in Java, C, or C++, but we chose Python mostly because Python clearly exposes the key socket concepts. With Python there are fewer lines of code, and each line can be explained to the novice programmer without difficulty. But there's no need to be frightened if you are not familiar with Python. You should be able to easily follow the code if you have experience programming in Java, C, or C++.
Approach to solving the question: Each search process can be considered to be a tree traversal. The object of the search is to find a path from the initial state to a goal state using a tree. The number of nodes generated might be huge, and in practice many of the nodes would not be needed. The secret of a good search routine is to generate only those nodes that are likely to be useful, rather than constructing the whole tree explicitly. The rules are used to represent the tree implicitly, and nodes are created explicitly only if they are actually to be of use.
The following issues arise when searching:
The tree can be searched forward from the initial node to the goal state or backward from the goal state to the initial state.
To select applicable rules, it is critical to have an efficient procedure for matching rules against states.
How should each node of the search process be represented? This is the knowledge representation problem or the frame problem. In games, an array suffices; in other problems, more complex data structures are needed.
Finally, in terms of data structures, taking the water jug problem as typical, do we use a graph or a tree? The breadth-first structure does take note of all nodes generated, but the depth-first one can be modified.
Check duplicate nodes
1. Examine all nodes that have already been generated to see whether the new node is already present.
2. If it does not exist, add it to the graph.
3. If it already exists, then:
a. Set the node that is being expanded to point to the already existing node corresponding to its successor rather than to the new one. The new one can be thrown away.
b. If the best or shortest path is being determined, check whether this path is better or worse than the old one. If worse, do nothing. If better, save the new path and propagate the change in length through the chain of successor nodes if necessary.
Example: Tic-Tac-Toe
State spaces are good representations for board games such as Tic-Tac-Toe. The position of a game can be described by the contents of the board and the player whose turn is next. The board can be represented as an array of 9 cells, each of which may contain an X or O or be empty.
State: Player to move next (X or O); board configuration.
Operators: Change an empty cell to X or O.
Start State: Board empty; X's turn.
Terminal States: Three X's in a row; three O's in a row; all cells full.
Search Tree
The sequence of states formed by possible moves is called a search tree. Each level of the tree is called a ply.
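The Tic-Tac-Toe formulation above can be sketched in Python as follows. This is an illustrative encoding, not one prescribed by the text: the names `START`, `winner`, `is_terminal`, `successors` and `reachable_states` are hypothetical, the board is a 9-tuple of `"X"`, `"O"` or `" "`, and the `seen` set plays the role of the duplicate-node check listed earlier (a state that already exists is not added again).

```python
from typing import Iterator, Optional, Set, Tuple

Board = Tuple[str, ...]      # 9 cells, each "X", "O" or " " (empty)
State = Tuple[Board, str]    # (board configuration, player to move next)

# The eight rows, columns and diagonals that decide a win.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

# Start state: board empty, X's turn.
START: State = ((" ",) * 9, "X")


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(state: State) -> bool:
    """Terminal: three X's in a row, three O's in a row, or all cells full."""
    board, _ = state
    return winner(board) is not None or " " not in board


def successors(state: State) -> Iterator[State]:
    """Operator: the player to move changes one empty cell to their mark."""
    board, player = state
    if is_terminal(state):
        return
    next_player = "O" if player == "X" else "X"
    for i, cell in enumerate(board):
        if cell == " ":
            yield board[:i] + (player,) + board[i + 1:], next_player


def reachable_states(start: State = START) -> Set[State]:
    """Explore the state space, generating each distinct state only once."""
    seen = {start}
    stack = [start]
    while stack:
        for child in successors(stack.pop()):
            if child not in seen:    # duplicate check: skip existing nodes
                seen.add(child)
                stack.append(child)
    return seen


if __name__ == "__main__":
    print(len(list(successors(START))), "moves from the start state")
    print(len(reachable_states()), "distinct reachable states")
```

Successor states are produced lazily by a generator, so nodes exist only when the search actually asks for them.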
Since the same state may be reachable by different sequences of moves, the state space may in general be a graph. It may be treated as a tree for simplicity, at the cost of duplicating states.
Solving problems using search
Given an informal description of the problem, construct a formal description as a state space:
Define a data structure to represent the state.
Make a representation for the initial state from the given data.
Write programs to represent operators that change a given state representation to a new state representation.
Write a program to detect terminal states.
Choose an appropriate search technique:
How large is the search space?
How well structured is the domain?
What knowledge about the domain can be used to guide the search?
Many traditional search algorithms are used in AI applications. For complex problems, the traditional algorithms are unable to find solutions within practical time and space limits. Consequently, many special techniques have been developed that use heuristic functions. The algorithms that use heuristic functions are called heuristic algorithms. Heuristic algorithms are not really intelligent; they appear to be intelligent because they achieve better performance. Heuristic algorithms are more efficient because they take advantage of feedback from the data to direct the search path. Uninformed search algorithms, or brute-force algorithms, search through all possible candidates in the search space, checking whether each candidate satisfies the problem statement. Informed search algorithms use heuristic functions that are specific to the problem and apply them to guide the search through the search space, to try to reduce the amount of time spent searching. A good heuristic will make an informed search dramatically outperform any uninformed search: for example, in the Traveling Salesman Problem (TSP), where the goal is to find a good solution rather than the best solution.
In such problems, the search proceeds by using current information about the problem to predict which path is closer to the goal and following it, although this does not always guarantee finding the best possible solution. Such techniques help in finding a solution within reasonable time and space (memory). Some prominent intelligent search algorithms are listed below:
1. Generate and Test Search
2. Best-first Search
3. Greedy Search
4. A* Search
5. Constraint Search
6. Means-ends analysis
There are more algorithms; they are either improvements or combinations of these.
Hierarchical Representation of Search Algorithms: A hierarchical representation of most search algorithms is illustrated below. The representation begins with two types of search:
Uninformed Search: Also called blind, exhaustive or brute-force search, it uses no information about the problem to guide the search and therefore may not be very efficient.
Informed Search: Also called heuristic or intelligent search, this uses information about the problem to guide the search, usually an estimate of the distance to a goal state, and is therefore efficient, but the search may not always be possible.
The first requirement of a good control strategy is that it causes motion: in a game-playing program, pieces move on the board, and in the water jug problem, jugs are filled. A control strategy that causes no motion will never lead to a solution. The second requirement is that it is systematic, that is, it corresponds to the need for global motion as well as for local motion. Clearly, it would be neither rational to fill a jug and empty it repeatedly, nor worthwhile to move a piece round and round the board in a cycle. We shall initially consider two systematic approaches for searching. Searches can be classified by the order in which operators are tried: depth-first, breadth-first, bounded depth-first.
Advantages 1.
Guaranteed to find an optimal solution (in terms of shortest number of steps to reach the goal). 2. Can always find a goal node if one exists (complete). Disadvantages 1. High storage requirement: exponential with tree depth. Depth-first search A search strategy that extends the current path as far as possible before backtracking to the last choice point and trying the next alternative path is called Depth-first search (DFS). This strategy does not guarantee that the optimal solution has been found. In this strategy, search reaches a satisfactory solution more rapidly than breadth first, an advantage when the search space is large. Algorithm Depth-first search applies operators to each newly generated state, trying to drive directly toward the goal. 1. If the starting state is a goal state, quit and return success. 2. Otherwise, do the following until success or failure is signalled: a. Generate a successor E to the starting state. If there are no more successors, then signal failure. b. Call Depth-first Search with E as the starting state. c. If success is returned signal success; otherwise, continue in the loop. Advantages 1. Low storage requirement: linear with tree depth. 2. Easily programmed: function call stack does most of the work of maintaining state of the search. Disadvantages 1. May find a sub-optimal solution (one that is deeper or more costly than the best solution). 2. Incomplete: without a depth bound, may not find a solution even if one exists. 2.4.2.3 Bounded depth-first search Depth-first search can spend much time (perhaps infinite time) exploring a very deep path that does not contain a solution, when a shallow solution exists. An easy way to solve this problem is to put a maximum depth bound on the search. Beyond the depth bound, a failure is generated automatically without exploring any deeper. Problems: 1. It's hard to guess how deep the solution lies. 2. 
If the estimated depth is too deep (even by 1), the computer time used is dramatically increased, by a factor of b^extra (where b is the branching factor and extra is the number of excess levels). 3. If the estimated depth is too shallow, the search fails to find a solution; all that computer time is wasted.
To find a solution in reasonable time, rather than a complete solution in unlimited time, we use heuristics. 'A heuristic function is a function that maps from problem state descriptions to measures of desirability, usually represented as numbers'. Heuristic search methods use knowledge about the problem domain and choose promising operators first. These heuristic search methods use heuristic functions to evaluate the next state towards the goal state. To find a solution using the heuristic technique, one should carry out the following steps:
1. Add domain-specific information to select the best path along which to continue searching.
2. Define a heuristic function h(n) that estimates the 'goodness' of a node n. Specifically, h(n) = estimated cost (or distance) of the minimal-cost path from n to a goal state.
3. The term heuristic means 'serving to aid discovery'; it is an estimate, based on domain-specific information that is computable from the current state description, of how close we are to a goal.
Finding a route from one city to another city is an example of a search problem in which different search orders and the use of heuristic knowledge are easily understood.
1. State: The current city in which the traveller is located.
2. Operators: Roads linking the current city to other cities.
3. Cost Metric: The cost of taking a given road between cities.
The most important technique for achieving deep modules is information hiding. This technique was first described by David Parnas. The basic idea is that each module should encapsulate a few pieces of knowledge, which represent design decisions. The knowledge is embedded in the module's implementation but does not appear in its interface, so it is not visible to other modules.
The information hidden within a module usually consists of details about how to implement some mechanism. Here are some examples of information that might be hidden within a module: How to store information in a B-tree, and how to access it efficiently. How to identify the physical disk block corresponding to each logical block within a file. How to implement the TCP network protocol. How to schedule threads on a multi-core processor. How to parse JSON documents. The hidden information includes data structures and algorithms related to the mechanism. It can also include lower-level details such as the size of a page, and it can include higher-level concepts that are more abstract, such as an assumption that most files are small. Information hiding reduces complexity in two ways. First, it simplifies the interface to a module. The interface reflects a simpler, more abstract view of the module's functionality and hides the details; this reduces the cognitive load on developers who use the module. For instance, a developer using a B-tree class need not worry about the ideal fanout for nodes in the tree or how to keep the tree balanced. Second, information hiding makes it easier to evolve the system. If a piece of information is hidden, there are no dependencies on that information outside the module containing the information, so a design change related to that information will affect only the one module. For example, if the TCP protocol changes (to introduce a new mechanism for congestion control, for instance), the protocol's implementation will have to be modified, but no changes should be needed in higher-level code that uses TCP to send and receive data. When designing a new module, you should think carefully about what information can be hidden in that module. If you can hide more information, you should also be able to simplify the module's interface, and this makes the module deeper. 
Note: hiding variables and methods in a class by declaring them private isn't the same thing as information hiding. Private elements can help with information hiding, since they make it impossible for the items to be accessed directly from outside the class. However, information about the private items can still be exposed through public methods such as getter and setter methods. When this happens the nature and usage of the variables are just as exposed as if the variables were public. The best form of information hiding is when information is totally hidden within a module, so that it is irrelevant and invisible to users of the module. However, partial information hiding also has value. For example, if a particular feature or piece of information is only needed by a few of a class's users, and it is accessed through separate methods so that it isn't visible in the most common use cases, then that information is mostly hidden. Such information will create fewer dependencies than information that is visible to every user of the class. Information leakage The opposite of information hiding is information leakage. Information leakage occurs when a design decision is reflected in multiple modules. This creates a dependency between the modules: any change to that design decision will require changes to all of the involved modules. If a piece of information is reflected in the interface for a module, then by definition it has been leaked; thus, simpler interfaces tend to correlate with better information hiding. However, information can be leaked even if it doesn't appear in a module's interface. Suppose two classes both have knowledge of a particular file format (perhaps one class reads files in that format and the other class writes them). Even if neither class exposes that information in its interface, they both depend on the file format: if the format changes, both classes will need to be modified. 
Back-door leakage like this is more pernicious than leakage through an interface, because it isn't obvious. Information leakage is one of the most important red flags in software design. One of the best skills you can learn as a software designer is a high level of sensitivity to information leakage. If you encounter information leakage between classes, ask yourself "How can I reorganize these classes so that this particular piece of knowledge only affects a single class?" If the affected classes are relatively small and closely tied to the leaked information, it may make sense to merge them into a single class. Another possible approach is to pull the information out of all of the affected classes and create a new class that encapsulates just that information. However, this approach will be effective only if you can find a simple interface that abstracts away from the details; if the new class exposes most of the knowledge through its interface, then it won't provide much value (you've simply replaced back-door leakage with leakage through an interface).
HTTP is a mechanism used by Web browsers to communicate with Web servers. When a user clicks on a link in a Web browser or submits a form, the browser uses HTTP to send a request over the network to a Web server. Once the server has processed the request, it sends a response back to the browser; the response normally contains a new Web page to display. The HTTP protocol specifies the format of requests and responses, both of which are represented textually.
Rather than returning a single parameter, the method returns a reference to the Map used internally to store all of the parameters. This method is shallow, and it exposes the internal representation used by the HTTPRequest class to store parameters. Any change to that representation will result in a change to the interface, which will require modifications to all callers.
When implementations are modified, the changes often involve changes in the representation of key data structures (to improve performance, for example). Thus, it's important to avoid exposing internal data structures as much as possible. This approach also makes more work for callers: a caller must first invoke getParams, then it must call another method to retrieve a specific parameter from the Map. Finally, callers must realize that they should not modify the Map returned by getParams, since that will affect the internal state of the HTTPRequest. Here is a better interface for retrieving parameter values: public String getParameter(String name) { ... } public int getIntParameter(String name) { ... }
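The same interface idea can be sketched in Python. This is a hypothetical analogue of the Java signatures above, not the original class: the constructor argument, the snake_case method names, the use of `urllib.parse.parse_qsl` for decoding, and the choice to raise `KeyError` for a missing parameter are all assumptions made for illustration.

```python
from typing import Dict
from urllib.parse import parse_qsl


class HTTPRequest:
    """Hypothetical request object that hides how parameters are stored."""

    def __init__(self, query_string: str) -> None:
        # Internal representation: a plain dict today, but callers never
        # see it, so it could change without affecting them.
        self._params: Dict[str, str] = dict(parse_qsl(query_string))

    def get_parameter(self, name: str) -> str:
        """Return one decoded parameter value (KeyError if absent)."""
        return self._params[name]

    def get_int_parameter(self, name: str) -> int:
        """Return one parameter converted to int (ValueError if malformed)."""
        return int(self.get_parameter(name))


if __name__ == "__main__":
    request = HTTPRequest("name=a%20b&count=3")
    print(request.get_parameter("name"))
    print(request.get_int_parameter("count"))
```

Callers get a single value in one call and never hold a reference to the internal dictionary, so they cannot modify the request's state by accident.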
Explanation
The most common mistake made by students was to divide their code into a large number of shallow classes, which led to information leakage between the classes. One team used two different classes for receiving HTTP requests; the first class read the request from the network connection into a string, and the second class parsed the string. This is an example of a temporal decomposition ("first we read the request, then we parse it"). Information leakage occurred because an HTTP request can't be read without parsing much of the message; for example, the Content-Length header specifies the length of the request body, so the headers must be parsed in order to compute the total request length. As a result, both classes needed to understand most of the structure of HTTP requests, and parsing code was duplicated in both classes. This approach also created extra complexity for callers, who had to invoke two methods in different classes, in a particular order, to receive a request. Because the classes shared so much information, it would have been better to merge them into a single class that handles both request reading and parsing. This provides better information hiding, since it isolates all knowledge of the request format in one class, and it also provides a simpler interface to callers (just one method to invoke).
This example illustrates a general theme in software design: information hiding can often be improved by making a class slightly larger. One reason for doing this is to bring together all of the code related to a particular capability (such as parsing an HTTP request), so that the resulting class contains everything related to that capability. A second reason for increasing the size of a class is to raise the level of the interface; for example, rather than having separate methods for each of three steps of a computation, have a single method that performs the entire computation. This can result in a simpler interface.
Both of these benefits apply in the example of the previous paragraph: combining the classes brings together all of the code related to parsing an HTTP request, and it replaces two externally-visible methods with one. The combined class is deeper than the original classes.
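A rough Python sketch of the merged design is shown below. It is not the students' code or any particular server's API: the class shape, the `read_from` method name, and the simplified handling of the message format (ASCII headers, no chunked encoding, no error handling) are assumptions chosen to show why reading and parsing belong together.

```python
import io
from dataclasses import dataclass, field
from typing import BinaryIO, Dict


@dataclass
class HTTPRequest:
    """Hypothetical combined class: reads and parses in one step."""
    method: str
    target: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""

    @classmethod
    def read_from(cls, stream: BinaryIO) -> "HTTPRequest":
        """Read one request from the stream; the only call callers need."""
        method, target, _version = stream.readline().decode("ascii").split()

        headers: Dict[str, str] = {}
        while True:
            line = stream.readline().decode("ascii").rstrip("\r\n")
            if not line:              # blank line ends the header section
                break
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()

        # The body length is known only after the headers are parsed,
        # which is why reading cannot be separated from parsing.
        length = int(headers.get("content-length", "0"))
        return cls(method, target, headers, stream.read(length))


if __name__ == "__main__":
    raw = (b"POST /submit HTTP/1.1\r\n"
           b"Host: example.com\r\n"
           b"Content-Length: 5\r\n"
           b"\r\n"
           b"hello")
    request = HTTPRequest.read_from(io.BytesIO(raw))
    print(request.method, request.target, request.body)
```

All knowledge of the request format lives in one place, and a caller invokes a single method instead of two methods in a required order.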
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Suggest an Optimized TLB Scheme
Part (a): Structuring the TLB for Partial Address Matching
Suggestion for optimization: To achieve partial matching in a Translation Lookaside Buffer (TLB), the idea is to use ...
Step: 2
Step: 3