Question

1 Approved Answer

Posted on Nov 04, 2024


Question. Consider a single-server system that can hold at most two customers (including the one being served). In each hour, a new customer enters the system with probability 1/2 unless there are already 2 customers in the system. Assume that a new arrival occurs at the end of each hour. At the beginning of each hour, the server can choose a configuration if there is a customer in the system. If the configuration is fast, one customer is served and leaves the system in that hour with probability 0.7. If the configuration is slow, this probability decreases to 0.4. A revenue of 80 TL is obtained for each customer whose service is completed. The costs of the slow and fast configurations are 10 TL and 15 TL per hour, respectively. The hourly discount rate is 0.8. We would like to maximize the total expected discounted profit over an infinite horizon.

a. Formulate the problem as an MDP model by clearly defining the states, decision sets, transition probabilities, and expected rewards.

b. Find the optimal policy using Policy Iteration, where the initial policy is to use the fast configuration whenever there is at least one customer in the system.

c. Develop an LP model for the MDP. Assume that you have solved the LP optimally. What is the optimal solution, which constraints in your model are binding, and why? (Do not solve the LP model; you can determine the binding constraints by observing the optimal solution found in part b.)
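One way to check a hand solution to parts a and b is a short numerical sketch. This is not the approved answer, only an illustration under stated assumptions: the state is the number of customers at the start of the hour; service completes before the end-of-hour arrival; an arrival is blocked only if two customers are present at that moment; and 0.8 is treated as the discount factor. The names `transitions`, `q_value`, `evaluate`, and `policy_iteration` are hypothetical helpers. Policy evaluation is done here by repeated sweeps rather than by solving the linear system exactly.

```python
GAMMA = 0.8                              # hourly discount factor (from the question)
ARRIVAL = 0.5                            # arrival probability at the end of an hour
REVENUE = 80.0                           # TL per completed service
SERVICE = {"fast": 0.7, "slow": 0.4}     # service completion probabilities
COST = {"fast": 15.0, "slow": 10.0}      # TL per hour
STATES = (0, 1, 2)                       # customers in the system
ACTIONS = {0: ("idle",), 1: ("fast", "slow"), 2: ("fast", "slow")}


def transitions(state, action):
    """Return {next_state: probability}.

    Assumption: service finishes before the end-of-hour arrival, and an
    arrival is blocked only if two customers are present at that moment.
    """
    p = 0.0 if action == "idle" else SERVICE[action]
    probs = {}
    for served, p_served in ((1, p), (0, 1.0 - p)):
        if p_served == 0.0:
            continue
        remaining = state - served
        p_in = ARRIVAL if remaining < 2 else 0.0
        for arrived, p_arr in ((1, p_in), (0, 1.0 - p_in)):
            if p_arr == 0.0:
                continue
            nxt = remaining + arrived
            probs[nxt] = probs.get(nxt, 0.0) + p_served * p_arr
    return probs


def reward(action):
    """Expected one-hour profit: expected revenue minus configuration cost."""
    if action == "idle":
        return 0.0
    return REVENUE * SERVICE[action] - COST[action]


def q_value(state, action, v):
    """Expected discounted profit of taking `action` now, then following v."""
    future = sum(p * v[n] for n, p in transitions(state, action).items())
    return reward(action) + GAMMA * future


def evaluate(policy, sweeps=300):
    """Approximate the policy's value function by repeated sweeps."""
    v = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        v = {s: q_value(s, policy[s], v) for s in STATES}
    return v


def policy_iteration(policy):
    """Alternate evaluation and improvement until the policy is stable."""
    while True:
        v = evaluate(policy)
        improved = dict(policy)
        for s in STATES:
            best = max(ACTIONS[s], key=lambda a: q_value(s, a, v))
            # Switch only on a strict improvement to avoid cycling on ties.
            if q_value(s, best, v) > q_value(s, policy[s], v) + 1e-9:
                improved[s] = best
        if improved == policy:
            return policy, v
        policy = improved


policy, values = policy_iteration({0: "idle", 1: "fast", 2: "fast"})
print(policy)
print({s: round(x, 2) for s, x in values.items()})
```

Under these assumptions the initial all-fast policy survives the first improvement step, with values of roughly 99.5, 149.3, and 172.5 TL for states 0, 1, and 2. A different reading of the arrival rule (for example, blocking based on the count at the start of the hour) changes the transition probabilities and may change the result. For part c, the binding LP constraints would be the ones corresponding to the state-action pairs this policy selects.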