Question
Buck and Bill are twin brothers who work at a gas station and have a counterfeiting business on the side. Each day a decision is made as to which brother will go to the gas station; the other one stays home and runs the business in the basement. On each day that the machine is working properly, it is estimated to print 60 $20 bills. However, the machine is somewhat unreliable and breaks down frequently. If the machine is not working at the beginning of the day, Buck can have it in working order at the beginning of the next day with probability 0.6. If Bill works on the machine, the probability drops to 0.5. If Bill operates the machine when it is working, it will be in working condition the next day with probability 0.6. If Buck operates the machine, it will be broken the next day with probability 0.6. Assume for simplicity that all breakdowns happen at the end of the day. The brothers would now like to decide which one should stay home each day to maximize the long-run average reward. Use policy iteration to compute the optimal policy.
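One way to set the problem up is as a two-state average-reward Markov decision process, and the sketch below runs policy iteration on it. It is only an illustration under stated assumptions, not a worked textbook solution. The state is whether the machine is broken or working at the start of the day, and the action is which brother stays home. The reward is assumed to be 60 x $20 = $1200 on a working day and $0 on a broken day. Gas station wages are assumed to be the same for both brothers and are left out. All names (`P_WORK`, `REWARD`, `evaluate`, `improve`, `policy_iteration`) are made up for this example.

```python
# States: the machine's condition at the start of the day.
BROKEN, WORKING = 0, 1
# Actions: which brother stays home with the machine.
ACTIONS = ("Buck", "Bill")

# P_WORK[state][action] = probability the machine is working tomorrow.
# Buck operating a working machine breaks it w.p. 0.6, so it works w.p. 0.4.
P_WORK = {
    BROKEN: {"Buck": 0.6, "Bill": 0.5},
    WORKING: {"Buck": 0.4, "Bill": 0.6},
}

# Assumed daily reward: 60 bills x $20 when working, nothing when broken.
REWARD = {BROKEN: 0.0, WORKING: 60 * 20.0}


def evaluate(policy):
    """Policy evaluation: solve g + h(s) = r(s) + sum_s' P(s'|s) h(s'),
    with the normalization h(BROKEN) = 0, in closed form for two states."""
    p01 = P_WORK[BROKEN][policy[BROKEN]]          # broken -> working
    p10 = 1.0 - P_WORK[WORKING][policy[WORKING]]  # working -> broken
    h_working = (REWARD[WORKING] - REWARD[BROKEN]) / (p01 + p10)
    gain = REWARD[BROKEN] + p01 * h_working
    return gain, {BROKEN: 0.0, WORKING: h_working}


def improve(policy, h):
    """Policy improvement: pick the action maximizing r(s) + E[h(next state)]."""
    new_policy = {}
    for s in (BROKEN, WORKING):
        def q(a, s=s):
            p = P_WORK[s][a]
            return REWARD[s] + p * h[WORKING] + (1.0 - p) * h[BROKEN]
        best = max(ACTIONS, key=q)
        # Keep the current action on ties so the loop cannot cycle.
        new_policy[s] = policy[s] if q(policy[s]) >= q(best) - 1e-12 else best
    return new_policy


def policy_iteration(policy):
    """Alternate evaluation and improvement until the policy stops changing."""
    while True:
        gain, h = evaluate(policy)
        new_policy = improve(policy, h)
        if new_policy == policy:
            return policy, gain
        policy = new_policy


# Start from an arbitrary policy: Bill repairs, Buck operates.
policy, gain = policy_iteration({BROKEN: "Bill", WORKING: "Buck"})
print("stay home when broken: ", policy[BROKEN])
print("stay home when working:", policy[WORKING])
print("long-run average reward per day:", gain)
```

Under these assumptions the iteration settles on Buck staying home when the machine is broken and Bill staying home when it is working. The machine is then working on a long-run fraction 0.6 of days, which gives an average reward of about $720 per day.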