Question

1 Approved Answer

Posted on Oct 12, 2024

fCHAPTER 14 Server Farms: M/M/k and M/M/k/k In today's high-volume world, almost no websites, compute centers, or call centers consist of just a single server.

\fCHAPTER 14 Server Farms: M/M/k and M/M/k/k In today's high-volume world, almost no websites, compute centers, or call centers consist of just a single server. Instead a \"server farm\" is used. The server farm is a collection of servers that work together to handle incoming requests. Each request might be routed to a different server, so that servers \"share\" the incoming load. From a practical perspective, server farms are often preferable to a single \"super-fast\" server because of their low cost (many slow servers are cheaper than a single fast one) and their flexibility (it is easy to increase/decrease capacity as needed by adding/removing servers). These practical features have made server farms ubiquitous. In this chapter, we study server farms where there is a single queue of requests and where each server, when free, takes the next request off the queue to work on. Specifically, there are no queues at the individual servers. We defer discussion of models with queues at the individual servers to the exercises and later chapters. The two systems we consider in this chapter are the M/M/k system and the M/M/k/k system. In both, the first \"M\" indicates that we have memoryless interarrival times, and the second \"M\" indicates memoryless service times. The third field denotes that k servers share a common pool of arriving jobs. For the M/M/k system, there is no capacity constraint, and this common pool takes the form of an unbounded FCFS queue, as shown later in Figure 14.3, where each server, when free, grabs the job at the head of the queue to work on. For the M/M/k/k system shown in Figure 14.1, there is a capacity constraint of k jobs. This means that there is no room for a queue. If a job arrives to find all k servers busy, then the job is dropped. Because the analysis of the M/M/k/k is easier, we begin with that, in Section 14.2, and then go on to the M/M/k, in Section 14.3. Before we discuss these systems, it will be helpful to revisit the concept of time-reversibility, this time for CTMCs rather than DTMCs. We do this in Section 14.1. 14.1 Time-Reversibility for CTMCs We start by reviewing terminology used in CTMCs. Question: Can you properly define the following terms: qij , i qij , i , i Pij ? Answer: Recall that qij is the rate of transitions from state i to state j , given that the CTMC is in state i. That is, qij is the label on the arrow from i to j in the Markov transition diagram for the CTMC. If i is the limiting probability that the CTMC is in 253 Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 254 server farms: m/m/k and m/m/k/k state i, then i qij is the rate of transitions from state i to state j . Likewise j qj i is the rate of transitions from state j to state i. Recall also that i is the total rate of transitions leaving i, given that we are in state i, and i Pij denotes the rate of transitions leaving state i and going to state j , given that we are in state i, i.e., i Pij = qij . Thus i i denotes the rate of transitions leaving state i, and i i Pij denotes the rate of transitions leaving i and going to j . Recall the useful time-reversibility theorem for DTMCs, Theorem 9.34, which said that, for an aperiodic and irreducible DTMC, if we can find xi 's such that \u0005 xi = 1 and xi Pij = xj Pj i , i, j i then these xi 's are the limiting probabilities. We now prove a counterpart to this theorem for CTMCs. Definition 14.1 A CTMC is time-reversible if, for all states i and j , the rate of transitions from state i to state \u0004 j equals the rate of transitions from state j to state i (i.e., i qij = j qj i , where i i = 1). Lemma 14.2 Given an irreducible CTMC, suppose we can find xi 's such that \u0005 xi = 1 and xi qij = xj qj i , i, j i where qij is the rate of transitions from state i to state j given that the MC is in state i. Then, 1. The xi 's are the limiting probabilities of the CTMC. 2. The CTMC is time-reversible. Proof What we need to prove here is that the xi 's are the i 's (the limiting probabilities). We are given that, i, j , xi qij = xj qj i . Thus, \u0005 xi qij = xj i \u0005 \u0005 xi qij = xj \u0005 i \u0005 i xi qij = xj j j Pj i \u0005 i \u0005 qj i i Pj i i xi qij = xj j . i Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.2 m/m/k/k loss system 255 Since these are the balance equations for the CTMC, by Theorem 12.6 it then follows that the xi 's must be the i 's. Thus it further follows that i qij = j qj i , i, j hence the CTMC is time-reversible. Question: What is an example of a CTMC that is not time-reversible? Answer: Consider a chain that has an arc from state i to state j labeled qij , but no arc from state j to state i. Then the rate of going from state i to state j is i qij , but the rate of going from state j to state i is zero. Question: Recall that the M/M/1 chain is a birth-death process. Are all birth-death processes time-reversible? Answer: Yes. Here's a proof: First observe that during any period of time, t, the number of transitions from state i to state i + 1 is within 1 of the number of transitions from state i + 1 to state i. The reason for this is that you cannot repeat going from i to i + 1 without first going back to i again - and the only way to go back to state i is to make a transition from i + 1 to i. Thus the long-run rate of transitions (number of transitions divided by time) from state i to state i + 1 is equal to the rate of transitions from i + 1 to i (as time gets big, that \"difference of 1\" can be ignored). As in the case of DTMCs, it is often helpful to write the time-reversibility equations and see if a solution to these can be found. If so, that solution also represents the limiting distribution. If not, then one needs to go back to the balance equations. Fortunately, we will see that the CTMCs for the M/M/k/k and the M/M/k are both birth-death processes, and hence the time-reversibility equations will be solvable. 14.2 M/M/k/k Loss System The M/M/k/k queueing system is also called the k-server loss system. Jobs arrive according to a Poisson process, with some average arrival rate, . Job sizes are Exponentially distributed with rate . There are k servers that can each hold one job. The system only has capacity for k jobs total; if an arrival shows up and sees all k servers already busy with a job, the arrival is dropped. The M/M/k/k loss system originally arose from the early phone switching systems that could handle at most k calls simultaneously, as shown in Figure 14.1. An incoming call request could be picked up and serviced by any one of the k circuits. However, if none of the k circuits was free, the phone call request was dropped. The duration of a phone call was assumed to be Exponentially distributed. Another application for the k -server loss system is a system that maintains virtual connections between nodes A and B in a network. Only k virtual connections are allowed. Each incoming request for a virtual connection is given one; however, if all k virtual connections are in use, the request is rejected. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 256 server farms: m/m/k and m/m/k/k Server 1 Server 2 Phone calls come in Phone call may be serviced by any one of the k circuits. However, if none is free, the phone call is lost. Server k Duration of phone call ~ Exp() Figure 14.1. The k-server loss system: M/M/k/k. The key question in these types of systems is, \"What is the fraction of jobs that are lost?\" This fraction is known as the blocking probability, Pblock . We determine the blocking probability by modeling the M/M/k/k queueing system as a CTMC. Question: What should the state space be? Answer: The state represents the number of busy servers in the system. The CTMC is shown in Figure 14.2. 0 1 k 2 2 3 Figure 14.2. An M/M/k/k queueing system modeled using a CTMC. The state represents the number of busy servers. We solve the time-reversibility equations to determine the limiting probabilities for the states as shown in Table 14.1. Table 14.1. Time-reversibility equations for the M/M/k/k State Time-reversibility equation Simplified equation 0 0 = 1 1 = 1 1 = 2 2 2 = 2 2 = 3 3 3 = k1 k 1 = k k k = 0 \u001e \u001f2 1 2! 0 1 3! 0 1 k! 0 \u001e \u001f3 We guess that \u001e \u001fk \u0002 \u0003i 1 0 . i = i! (14.1) Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.2 m/m/k/k loss system 257 We can verify that this is correct by substituting back into\u0004 the time-reversibility equation k for i . Finally, we determine 0 such that the equation i=0 i = 1 is satisfied: k \u0005 i = 1 i=0 k \u0002 \u0003i \u0005 1 0 = 1 i! i=0 1 \u001e \u001fi 0 = \u0004 k i=0 1 i! Therefore, substituting back into equation (14.1), we obtain \u001e \u001fi / i! i = \u0004 \u001e \u001f j . k 1 j =0 j! The blocking probability, Pblock , is the probability that an arrival finds all k servers busy. By PASTA, this is the limiting probability that the chain is in state k . Thus, \u001e \u001fk / k! Pblock = k = \u0004 \u001e \u001fj . (14.2) k j =0 1 j! Equation (14.2) is called the Erlang-B formula. Question: There is an easy way to remember the formula for Pblock by relating it to the Poisson distribution. Can you see what it is? Hint: Multiply both the numerator and denominator by e . Lemma 14.3 Let X Poisson \u001e \u001f . Then e ( )k /k! =\u0004 \u001e \u001fj k e j =0 Pblock = 1 j! P {X = k} P {X k} (14.3) The applicability of the Erlang-B formula stems from the fact that it is independent of the service time distribution. That is, this same formula arises when the service demand has a mean of 1 with any probability distribution. This is known as an insensitivity result, because the result depends only on the mean of the distribution. Insensitivity results are always quite striking when they occur, because it is much more typical that queueing behavior is highly influenced by the distribution of the service time. Insensitivity results often occur in situations where there is no queue. We will see several other insensitivity results during the course of this book. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 258 server farms: m/m/k and m/m/k/k 14.3 M/M/k Figure 14.3 illustrates the M/M/k queueing system. As in the fixed-capacity system, the k servers draw from the same pool of incoming jobs, except that this time the pool has infinite space. Whenever a server becomes free, it takes the next job from the pool. The \"pool\" is just an FCFS queue with unbounded capacity. Server 1 Server 2 Server k Figure 14.3. An M/M/k queueing system. We can model the number of jobs in the M/M/k queueing system as a CTMC as shown in Figure 14.4. 0 1 2 2 k k+1 k+2 3 Figure 14.4. An M/M/k queueing system modeled using a CTMC. We write the time-reversibility equations as shown in Table 14.2. Table 14.2. Time-reversibility equations for the M/M/k State Time-reversibility equation Simplified equation 0 0 = 1 1 = 1 1 = 2 2 2 = 2 2 = 3 3 3 = k1 k 1 = k k k = k k = k + 1 k k + 1 = 0 \u001e \u001f2 1 2! 0 1 3! 0 1 k! 0 \u001e \u001f3 \u001e \u001fk \u001e \u001fk + 1 1 1 k! k 1 1 k! k2 \u001e \u001fk + 2 k+1 k + 1 = k + 2 k k + 2 = 0 0 Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.3 Therefore, m/m/k \u001e \u001fi 1 i! 0 i = \u001e \u001f i 1 k! 259 if i k \b 1 ik k . 0 if i > k Let's express these equations in terms of the system utilization. Question: But what is the system utilization? Answer: For an M/M/k, the system utilization is defined as = k where is the arrival rate into the system in jobs/sec, and k represents the total service capacity of the system in jobs/sec. Note: In a typical system, the term \"system utilization\" is not well defined because different devices may be running at different utilizations. The term \"utilization\" is thus typically reserved for a single device. An M/M/k queue is an exception in that system utilization is well defined because, by symmetry, the utilization (load) is the same at all servers. Specifically, consider the fraction of time that a particular server is busy. That server, by symmetry, sees an arrival rate of k and experiences a service rate of . Hence the utilization at that server is k . But this is also the utilization at every server. Let R denote the expected number of busy processors. Question: What is R? Answer: R = /. To see this, consider that each server is busy with probability and there are k servers. Thus the expected number of jobs in service is R = E [number in service] = k = . k k (14.4) Observe that R can also be viewed as the minimal resource requirement (i.e., the minimum number of servers required to handle the arrival rate). For example, if the average arrival rate is = 30 jobs/sec, and the service rate of each server is = 5 jobs/sec, then we have a minimal resource requirement of R = / = 30/5 = 6 servers needed just to maintain stability. These definitions are used throughout the book, so we state them again for reference: Definition 14.4 For an M/M/k with average arrival rate and service rate , the system utilization or load is denoted by , where = . k Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 260 server farms: m/m/k and m/m/k/k The resource requirement is denoted by R, where R= . R can also be viewed as the minimum number of servers needed to keep the system stable, or as the expected number of servers that are busy, or as the expected number of jobs in service. Using , we rewrite the equations for i as follows: \u0012 (k ) i 0 if i k i! i = i k k 0 if i > k k! Finally, we need to determine 0 : 0 + i + i=1 \u0018 0 1 + k 1 \u0005 (k)i i=1 0 k 1 \u0005 i! \u0018 k 1 \u0005 (k)i i=0 i! + \u0005 i = 1 i=k \u0005 i i=k k! k \u0019 k =1 \u0019 k k k + =1 k! 1 \u00191 \u0018 k 1 \u0005 (k)i (k)k + 0 = i! k!(1 ) i=0 Probability That an Arriving Job Has to Queue Having found the stationary probabilities, we now find the probability that an arriving job has to queue, PQ . Observe that PQ is the probability that an arrival finds all k servers busy. PQ = P {An arrival finds all servers busy} = P {An arrival sees k jobs in system} = Limiting probability that there are k jobs in system \u0005 = i (by PASTA) i=k kk \u0005 i 0 k! i=k = (k)k 0 = k!(1 ) where 0 = \u0018 k 1 \u0005 (k)i i=0 i! (k)k + k!(1 ) \u00191 (14.5) Equation (14.5) is the famous Erlang-C formula. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.3 m/m/k 261 It is interesting to compare the probability that an arrival finds all servers busy in an M/M/k, PQ , with the probability that an arrival finds all servers busy in an M/M/k/k, Pblock . Question: Intuitively, which system will have a higher probability that all k servers are busy? Answer: The M/M/k system will. Question: Why? Answer: In the M/M/k system, jobs can arrive during the time that the k servers are busy. These jobs do not disappear but rather queue up, thus creating more work for later and thus affecting the future probability that the system is busy. Theorem 14.5 relates the blocking probability for the M/M/k/k to the queueing probability for the M/M/k. Theorem 14.5 Let Pblock denote the blocking probability for the M/M/k/k and PQ the queueing probability for the M/M/k. Let denote the load (system utilization) for the M/M/k. Then (1 )PQ Pblock = . (14.6) 1 PQ Proof have Observing that the M/M/k/k chain is contained within the M/M/k chain, we M/M/k/k Pblock = P {N = k | N k} M/M/k M/M/k = P {N = k} M/M/k P {N k} M/M/k = = P {N k} M/M/k P {N > k} 1 P {N > k} M/M/k PQ PQ 1 PQ where the last line follows from the fact that, beyond state k , the M/M/k looks like an M/M/k M/M/k = P {N k} . M/M/1 with load , hence P {N > k} Expected Number in the Queue We can now calculate the expected number in the queue portion of the M/M/k: M/M/k E [NQ ] = \u0005 i (i k) i=k = 0 \u0005 i k k i=k k! (i k) Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 262 server farms: m/m/k and m/m/k/k k k k \u0005 ik = 0 (i k) k! i=k = 0 k k k \u0005 i i k! i=0 k k k 1 k! (1 )2 = PQ 1 = 0 Question: Explain why E [NQ ] = PQ . 1 Answer: E [NQ ] = E [NQ | queueing] P {queueing} + E [NQ | no queueing] P {no queueing} . But E [NQ | no queueing] = 0. So E [NQ ] = E [NQ | queueing] PQ . Now consider the CTMC for the M/M/k, given that there is queueing. That CTMC looks identical to the CTMC for an M/M/1, where the M/M/1 has arrival rate and service rate k. Specifically, E [NQ | queueing] for the M/M/k is just the expected number of jobs in system for an M/M/1 queue, where the M/M/1 queue has load = k and . So, mean number of jobs 1 E [NQ ] = PQ . 1 (14.7) Finishing up, we can derive the remaining performance metrics for the M/M/k easily as follows: 1 1 E [TQ ] = E [NQ ] = PQ (14.8) 1 1 1 1 = PQ + 1 + k E [N ] = E [T ] = PQ 1 E [T ] = E [TQ ] + (14.9) (14.10) As a check, observe that E [Number being served] = E [N ] E [NQ ] = k = = R. This is the expected result from (14.4). Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.4 comparison of three server organizations 263 14.4 Comparison of Three Server Organizations Consider the following three different server organizations, all having arrival rate , total service rate k, and load = k , shown in Figure 14.5. Under frequency-division multiplexing (FDM), traffic is split into k separate channels. Under the M/M/k, the traffic is lumped together, but the service capacity is split. Under the M/M/1, nothing is split. We want to determine which of these configurations is best for minimizing mean response time. /k /k /k M/M/k M/M/1 Frequency-division multiplexing k Figure 14.5. Frequency-division multiplexing, M/M/1, and M/M/k, all with load = k . Question: Which is better in terms of mean response time: FDM or the M/M/1? Answer: Obviously the M/M/1. Each job experiences a k times higher arrival rate under the M/M/1, but also a k times higher service rate. Thus we expect the mean response time to be k times lower for the M/M/1. Computing these response times, we have: E [T ] E [T ] FDM M/M/1 = 1 = 1 . k k = k . k (14.11) (14.12) By comparing equations (14.11) and (14.12), it is obvious that M/M/1 is k times better than FDM. Question: How does the M/M/1 system compare with the M/M/k system? Answer: Recall that for the M/M/k, from (14.9) E [T ] where = , k M/M/k = 1 1 PQ + 1 and PQ is the probability an arrival is forced to queue. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 264 server farms: m/m/k and m/m/k/k To compare the M/M/k with the M/M/1, consider E [T ] M/M/k E [T ] M/M/1 = 1 PQ 1 1 1 + 1 1 M/M/k + k(1 ). = PQ = PQM/M/k + (14.13) Now consider two cases. Case 1: 0 As the load drops, the probability of queueing drops, so PQM/M/k 0, and expression (14.13) is approximately 0 + k = k . Thus, the M/M/1 is k times faster than M/M/k. Question: Explain intuitively why this makes sense. Answer: In the M/M/k, under light load, most servers are idle. The few servers that are busy serve the jobs they get at rate . By contrast, in the M/M/1, with the same light load, every job gets served with rate k. Thus jobs complete more quickly in the M/M/1. Case 2: 1 With the load high, PQM/M/k 1, and so expression (14.13) is approximately 1. Thus, the M/M/k and M/M/1 have the same response time. Question: Explain why this makes sense. Answer: Because the load is high, there are always jobs in the queue. Thus, the state of the CTMC for the M/M/k is always greater than k . This portion of the CTMC looks like Figure 14.6. Thus the M/M/k under high load behaves just like an M/M/1 with arrival rate and service rate k. k k k Figure 14.6. M/M/k queue under high load. 14.5 Readings We stated that the k -server loss system exhibits an interesting insensitivity property, whereby the distribution of the number of jobs in the loss system depends only on the mean job size, not on its distribution. For a unique and interesting proof of the insensitivity property for this system and several others, we refer the reader to [178], pp. 202-09. 14.6 Exercises 14.1 Comparison of Three Server Organizations Repeat the comparison of three server organizations from Section 14.4. This time, however, assume k = 2 and derive exact closed-form formulas for E [T ] for each of the three architectures shown in Figure 14.5. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.6 exercises 265 14.2 Scherr's Thesis 1965 Once upon a time, back in 1965, an MIT student named Allan Scherr needed to analyze the Compatible Time-Sharing System (CTSS). CTSS was an early time-sharing system in which user programs were swapped in and out of main memory with only one complete program in memory at a time. Because there was no overlap of program execution and swapping, Scherr modeled the sum of the program execution time and swapping time as the CPU service time, S . He modeled the CTSS as a simple interactive system with N terminals and one CPU as shown in Figure 14.7. For Scherr's system, N was 60, the mean CPU service time was E [S] = 0.8 seconds, and the mean user think time was E [Z] = 35 seconds. To determine the mean response time of the system, E [R], Scherr made the false assumption that Z and S were Exponentially distributed random variables. This assumption allowed him to set up a CTMC and solve for the mean response time of the system. Everyone was surprised when the mean response time that Scherr got via his analysis was in fact very close to the measured mean response time of the system, given all his simplifications, so Scherr got a PhD and won a bunch of awards. His thesis is online. Don't you wish it was still 1965? N = 60 E{Z} = 35 seconds CPU E{S} = .8 seconds Figure 14.7. Scherr's CTSS model. (a) Solve Scherr's problem as he did, by making the Exponential assumptions and setting up a CTMC. Determine the limiting probabilities (can you apply the time-reversibility equations?). Write out an expression for E [R]. Now plug in Scherr's numbers and determine E [R] (you will need to write a small program to do the sum). (b) Now use operational analysis (see Chapters 6 and 7), which is distributionindependent, to obtain asymptotic bounds for E [R] in Scherr's problem (remember to determine N ). If you have done it all right, you may be wondering at this point why Scherr himself did not use operational analysis. Turns out operational analysis did not exist until the early 1970s. 14.3 M/M/2/3 Figure 14.8 shows a 2-server system with a waiting room that can hold only 1 job. Any arrival that sees 3 jobs already in the system is dropped. Jobs arrive from outside according to a Poisson process with rate = 1. Whenever a server finishes serving a job, it grabs the job from the waiting area, if there is one. Job sizes are Exponentially distributed with rate = 1. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 266 server farms: m/m/k and m/m/k/k Poisson () Figure 14.8. The M/M/2/3 system. (a) Draw a CTMC where the state represents the total number of jobs in the system. (b) Suppose that there are exactly 2 jobs in the system. What is the probability that a job arrives before a job completes? (c) Use your CTMC to determine the probability that the system is idle (both servers are idle). (d) What is the throughput of the system? (e) What is E [N ], the expected number of jobs in the system? (f) What is E [T ], the expected response time (for those jobs not dropped)? (g) Consider the process of arrivals to the system that are not dropped. Is this a Poisson process? Why or why not? 14.4 The Infinite Help Desk (M/M/) Imagine that you could call up a company for customer service and never get the message, \"We're sorry; all of our service representatives are busy with other customers . . . \" This can be modeled by a queueing system with an infinite number of servers. Concretely, consider the M/M/ queueing system, where interarrival times are Exponential with rate and the service times are Exponential with rate , but where there are an infinite number of servers (k = ). (a) Draw a state diagram for the continuous-time Markov chain of this system. (b) Derive the limiting probabilities. You need to get a closed-form expression here that is simple and easy to recognize. (c) From the limiting probabilities, derive a closed-form expression for the expected number of jobs in the system, E [N ]. (d) Applying Little's Law gives you E [T ]. Does E [T ] make sense? Explain. 14.5 M/M/2 with Heterogeneous Servers Consider a variant of the M/M/2 queue where the service rates of the two processors are not identical. Denote the service rate of the first processor by 1 and the service rate of the second processor by 2 , where 1 > 2 . In the case of heterogeneous servers, the rule is that when both servers are idle, the faster server is scheduled for service before the slower one. Define the utilization, , for this system to be = 1 + . 2 Set up a CTMC and determine the mean number of jobs in the system and the mean response time. You should get E [N ] = 1 A(1 )2 (14.14) Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 14.6 exercises 267 where A= 1 1 2 (1 + 2) + . ( + 2 ) 1 (14.15) 14.6 Is Load Balancing Good? + More on Closed vs. Open Systems Consider the server farm shown in Figure 14.9. The arrival stream is a Poisson process with rate , and job sizes are Exponentially distributed. Each job with probability p is sent to Host 1, which has service rate 1 , and with probability 1 p is sent to Host 2, which has service rate 2 . There is a queue at each host. (a) Assume 1 = 2 . Either prove or disprove that E [TQ ] and E [T ] are always minimized when p is chosen to balance the load. (b) Now assume 1 \u000e= 2 . Either prove or disprove that E [TQ ] and E [T ] are always minimized when p is chosen to balance the load. (c) Continue to assume 1 \u000e= 2 , but now suppose that we have a closed system with zero think time and large MPL, N , where Figure 14.9 represents the central subsystem in the closed system. Either prove or disprove that E [TQ ] and E [T ] are always minimized when p is chosen to balance the load. FCFS 1 p Poisson () FCFS 1-p 2 Figure 14.9. Distributed server system. 14.7 Throwing Away Servers Suppose your computer center currently consists of a single server of speed . Jobs arrive according to a Poisson process with rate , and their service times are Exponentially distributed. Suppose the current response time is considered intolerable by the users. A second, faster server, running at speed (for > 1), is purchased and added to the system in a heterogeneous M/M/2 structure with a single queue as in Exercise 14.5. Denote the load (utilization) of the M/M/2 system by . Denote the mean response time of the M/M/2 system by E [T ]. (a) Use the result in (14.14) from Exercise 14.5 to derive a formula for E [T ], the mean response time of the M/M/2 system with heterogeneous servers. (b) A hotshot consultant walks in and makes the radical proposal of disconnecting the original server entirely (i.e., simply letting the faster server run by itself). Clearly this makes sense with respect to power, but the consultant claims this is also a win for E [T ]. For $400/hr, what is the consultant thinking? Come up with an instance, in terms of , 2 , and 1 for which the consultant is right. Also, explain intuitively what is happening. [If you Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020 268 server farms: m/m/k and m/m/k/k find this problem interesting, you can think about a general criterion under which the consultant would be right . . . Throwing away servers is fun!] 14.8 Comparison of Multi-server Architectures with Heterogeneous Servers This problem asks you to apply (14.14), but does not require you to have solved Exercise 14.5. Consider four different server configurations. In all of these the outside arrival process is Poisson with rate . Also the service time at Host 1 (respectively, Host 2) is Exponentially distributed with rate 1 (respectively, 2 ), where 1 = 2 , 1. (1) An M/M/2 heterogeneous server system. (2) A server farm consisting of two hosts, each with its own queue. Every incoming job is immediately dispatched to Host 1 (with probability p) or to Host 2 (with probability 1 p). The probabilities p and 1 p are chosen so as to balance load at the two hosts. (3) A server farm consisting of two hosts, each with its own queue. Every incoming job is immediately dispatched to Host 1 (with probability p) or to Host 2 (with probability 1 p). The probabilities p and 1 p are chosen so as to minimize mean response time. (4) A server farm consisting of two hosts, each with its own queue. Every incoming job is immediately dispatched to Host 1 (with probability p) or to Host 2 (with probability 1 p), where we set p = 1. Observe that = = . 1 + 2 2 + 2 Now consider the following table: Two identical hosts: 1 = 2 = 1 Host 1 is faster: 1 = 4, 2 = 1 = low = 0.25 Fill in . . . (use = 0.5) Fill in . . . (use = 1.25) = high = 0.75 Fill in . . . (use = 1.5) Fill in . . . (use = 3.75) Your job is to fill in each entry of this table with a ranking of the four server configurations in order from greatest mean response time to least mean response time; for example, Tconfig1 > Tconfig2 = Tconfig3 > Tconfig4 . You may use \">\" or \"=\" signs, but may not use ambiguous signs like . Note: For configuration (4) observe that the correspondence between and does not make sense. Thus when evaluating configuration (4) please just use the value. It will help if you first try to think about the problem using intuition. Feel free to use Mathematica to compare your expressions. Downloaded from http:/www.cambridge.org/core. Ulsan National Institute of Science and Technology, on 05 Oct 2016 at 17:00:16, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms. http://dx.doi.org/10.1017/CBO9781139226424.020