Question
An Uber driver works in only two cities: City A and City B. In each city, the driver may either wait and accept a ride or travel to the other city without a passenger. When the driver is in City A, the probability of finding a ride within the city is 1/2; otherwise, the ride is from City A to City B. When the driver is in City B, the probability of finding a ride within the city is 2/3; otherwise, the ride is from City B to City A. On average, the driver earns $8 per ride in City A, $15 per ride in City B, and $20 per ride between Cities A and B. However, Uber charges the driver $5 if the driver decides to change cities without taking a ride. Assume that when the driver chooses to change cities, this counts as a ride.
Formulate this problem as a Markov decision process in which the objective is to maximize the expected total earnings for the next five rides (Assume the driver will head back home after finishing these rides). Please indicate the five basic components of the MDP (decision epochs, states, actions, transition probabilities, and rewards/costs).
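One possible reading of the five components can be sketched in Python. This is an illustrative formulation under stated assumptions, not an official solution: the decision epochs are the moments just before each of the five rides, the state is the driver's current city, the actions are named `wait` and `switch` (hypothetical labels), a switch moves the driver to the other city with probability 1 at a cost of $5 and uses up one ride, and nothing is earned after the fifth ride. The `backward_induction` helper is likewise a name introduced here for illustration.

```python
from fractions import Fraction as F

# Decision epochs: one before each of the five rides (finite horizon).
HORIZON = 5
# States: the city the driver is currently in.
STATES = ("A", "B")
# Actions (hypothetical labels): wait for a ride, or change cities empty.
ACTIONS = ("wait", "switch")

# Transition probabilities and rewards taken from the problem statement:
# (state, action) -> list of (next_state, probability, reward in dollars).
OUTCOMES = {
    ("A", "wait"): [("A", F(1, 2), 8), ("B", F(1, 2), 20)],
    ("B", "wait"): [("B", F(2, 3), 15), ("A", F(1, 3), 20)],
    ("A", "switch"): [("B", F(1), -5)],  # assumed deterministic move, $5 charge
    ("B", "switch"): [("A", F(1), -5)],
}


def backward_induction(horizon=HORIZON):
    """Return (values, policy), both indexed by the number of rides remaining."""
    # Assumption: no terminal reward once the five rides are finished.
    values = {0: {s: F(0) for s in STATES}}
    policy = {}
    for k in range(1, horizon + 1):
        values[k], policy[k] = {}, {}
        for s in STATES:
            # Expected immediate reward plus expected value of the next state.
            q = {
                a: sum(p * (r + values[k - 1][s2]) for s2, p, r in OUTCOMES[(s, a)])
                for a in ACTIONS
            }
            best = max(q, key=q.get)
            values[k][s] = q[best]
            policy[k][s] = best
    return values, policy


values, policy = backward_induction()
for k in range(HORIZON, 0, -1):
    print(k, {s: (policy[k][s], float(values[k][s])) for s in STATES})
```

Here `values[5]["A"]` is the expected total earnings when the driver starts in City A with five rides remaining, and `policy[k][s]` is the chosen action in city `s` with `k` rides remaining. Exact fractions are used so the 1/2, 2/3, and 1/3 probabilities are not rounded.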