Question
Question 1. You are designing software to control the operation of a service machine. In each time period, a new job may arrive with probability 0.25. If the machine is processing another job, the new job is queued, but no more than 10 jobs can fit in the queue; job requests that cannot be queued are lost. Only one new job can arrive in each time period. The processing speed can be set (for each period) to the normal level or to the high-speed level. The probability that a job is completed in one unit of time is 0.25 at the normal level of service and 0.75 at the high-speed level. Normal service costs 1 unit per time interval in energy and amortization, while high-speed service costs 3.6 units. Each waiting job costs 0.3 per time interval. If a job is lost, a penalty of 20 units is assessed.

(Q1.a) Formulate an infinite-horizon average-cost Markov decision problem to determine the optimal speed of service at each time so that the average cost per time period is minimal.
(Q1.b) Determine an optimal policy using the policy iteration method.
(Q1.c) Solve the problem using linear programming.
(Q1.d) For what discount factor is the discounted infinite-horizon problem equivalent to the average-reward problem in this context?

Question 2. Each quarter, the marketing manager of a retail store divides the customers into two groups based on their purchase behavior in the previous quarter. The groups are denoted by L and H. The manager wishes to determine to which group of customers a catalog should be sent. The cost of sending a catalog is $15 per customer. If a customer from group L receives a catalog, the expected purchase in the current quarter is $20; otherwise it is $10. If a customer from group H receives a catalog, the expected purchase in the current quarter is $50; otherwise it is $25.

Furthermore, if a customer from group L receives a catalog, the probability that the customer stays in group L for the next quarter is 0.3; otherwise it is 0.5. If a customer from group H receives a catalog, the probability that the customer stays in group H for the next quarter is 0.8; otherwise it is 0.4.

(Q2.a) Formulate an average-reward problem to help the manager.
(Q2.b) Determine an optimal policy using the policy iteration method.
(Q2.c) Solve the problem using linear programming.
(Q2.d) Formulate the dual of the linear programming problem in (Q2.c). What is the optimal solution of the dual problem, and what is its meaning?
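One possible way to approach (Q2.a) and (Q2.b) is sketched below in Python. It is a minimal illustration, not a definitive solution, and it rests on modeling assumptions that the question does not state explicitly: the one-quarter reward per customer is taken as the expected purchase minus the $15 catalog cost when a catalog is sent, the states are the two groups L and H, and every stationary policy induces a single recurrent class (so a single gain g exists). The names `REWARD`, `P_TO_H`, `evaluate`, `improve`, and `policy_iteration` are hypothetical helpers introduced for this example. Exact fractions are used so that ties in the improvement step are detected reliably.

```python
from fractions import Fraction as F

STATES = ("L", "H")
ACTIONS = ("send", "skip")

# Assumed one-quarter net reward per customer:
# expected purchase, minus the $15 catalog cost when a catalog is sent.
REWARD = {
    ("L", "send"): F(20 - 15), ("L", "skip"): F(10),
    ("H", "send"): F(50 - 15), ("H", "skip"): F(25),
}

# Probability that the customer is in group H next quarter.
P_TO_H = {
    ("L", "send"): F(7, 10), ("L", "skip"): F(5, 10),
    ("H", "send"): F(8, 10), ("H", "skip"): F(4, 10),
}


def evaluate(policy):
    """Policy evaluation: solve g + h(s) = r(s) + sum_j p(j|s) h(j), with h(L) = 0."""
    r_l = REWARD["L", policy["L"]]
    r_h = REWARD["H", policy["H"]]
    p_lh = P_TO_H["L", policy["L"]]      # L -> H
    p_hl = 1 - P_TO_H["H", policy["H"]]  # H -> L
    # Subtracting the two evaluation equations eliminates g.
    h_h = (r_h - r_l) / (p_lh + p_hl)
    gain = r_l + p_lh * h_h
    return gain, {"L": F(0), "H": h_h}


def improve(policy, bias):
    """Policy improvement: pick a maximizing action, keeping the current one on ties."""
    new_policy = {}
    for s in STATES:
        def q(a):
            # h(L) = 0, so only the transition into H contributes.
            return REWARD[s, a] + P_TO_H[s, a] * bias["H"]
        best = max(ACTIONS, key=q)
        new_policy[s] = policy[s] if q(policy[s]) == q(best) else best
    return new_policy


def policy_iteration(policy):
    """Alternate evaluation and improvement until the policy stops changing."""
    while True:
        gain, bias = evaluate(policy)
        new_policy = improve(policy, bias)
        if new_policy == policy:
            return policy, gain, bias
        policy = new_policy


if __name__ == "__main__":
    policy, gain, bias = policy_iteration({"L": "skip", "H": "skip"})
    print(policy, float(gain), {s: float(v) for s, v in bias.items()})
```

Under these assumptions, starting from the policy that sends no catalogs, the iteration should settle on sending a catalog to both groups, with an average reward of 85/3 (about $28.33) per customer per quarter. A different reading of the rewards, for example ignoring the catalog cost, would change the numbers and possibly the policy.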