Question
Reinforcement Learning: The Q-learning Algorithm
Please write code in Python that produces the same outputs as in the pictures, but on a bigger grid. Please use plain Python and DO NOT use OpenAI's gym package!
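For orientation, the tabular Q-learning update named in the title could be sketched roughly as follows. This is a minimal sketch, not a full solution: the helper names (`make_q_table`, `choose_action`, `q_update`) and the values of `alpha`, `gamma`, and `epsilon` are assumptions chosen for illustration, not taken from the question.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    # Unseen states start with all-zero action values.
    return defaultdict(lambda: [0.0] * n_actions)


def choose_action(q, state, n_actions, epsilon, rng=random):
    # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q[state]
    return values.index(max(values))


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    # Q(s, a) <- Q(s, a) + alpha * (target - Q(s, a)),
    # where target = r + gamma * max_a' Q(s', a') (just r at episode end).
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

A training loop would call `choose_action` to pick a move, apply it in the environment, and then call `q_update` with the observed reward and next state.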
The taxi driving problem:
There are four designated locations in the grid world, indicated by Red, Green, Yellow, and Blue.
When the episode starts, the taxi starts at a random square and the passenger is at a random location (R, G, Y, or B).
The taxi drives to the passenger's location, picks up the passenger, drives to the passenger's destination (another one of the four designated locations), and then drops off the passenger. While doing so, our taxi driver needs to drive carefully to avoid hitting any wall. Once the passenger is dropped off, the episode ends.
What are the actions the agent can choose from at each step?
drive down
drive up
drive right
drive left
pick up a passenger
drop off a passenger
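The six actions above could be represented as a small enumeration. This is only a sketch: the integer codes and the `MOVES` table are illustrative choices, and rows are assumed to grow downward.

```python
from enum import IntEnum


class Action(IntEnum):
    DOWN = 0
    UP = 1
    RIGHT = 2
    LEFT = 3
    PICKUP = 4
    DROPOFF = 5


# (row, col) displacement for the four movement actions.
# Assumption: row 0 is the top of the grid, so "down" increases the row index.
MOVES = {
    Action.DOWN: (1, 0),
    Action.UP: (-1, 0),
    Action.RIGHT: (0, 1),
    Action.LEFT: (0, -1),
}
```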
And the states?
Possible taxi positions: one for each cell of the grid.
Possible locations of the passenger: R, G, Y, B, plus the case when the passenger is in the taxi.
Destination locations: R, G, Y, or B.
Multiplying these three counts gives the total number of states.
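One way to count and index the states for an arbitrary grid size is sketched below. The function names and the particular encoding order are assumptions; the counts of 5 passenger locations and 4 destinations follow from the lists above.

```python
N_PASSENGER_LOCS = 5  # R, G, Y, B, or in the taxi
N_DESTINATIONS = 4    # R, G, Y, B


def n_states(n_rows, n_cols):
    # taxi positions x passenger locations x destinations
    return n_rows * n_cols * N_PASSENGER_LOCS * N_DESTINATIONS


def encode(row, col, passenger, dest, n_cols):
    # Pack the four components into a single integer state index.
    cell = row * n_cols + col
    return (cell * N_PASSENGER_LOCS + passenger) * N_DESTINATIONS + dest


def decode(state, n_cols):
    # Inverse of encode: unpack the index back into its components.
    state, dest = divmod(state, N_DESTINATIONS)
    cell, passenger = divmod(state, N_PASSENGER_LOCS)
    row, col = divmod(cell, n_cols)
    return row, col, passenger, dest
```

With this encoding, a Q-table can be a plain array with `n_states(n_rows, n_cols)` rows and one column per action.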
What about rewards?
A default per-step penalty. Why a penalty and not simply no reward? Because we want to encourage the agent to take the shortest time, by penalizing each extra step. This is what you would expect from a taxi driver, isn't it?
A reward for delivering the passenger to the correct destination.
A penalty for executing a pickup or dropoff at the wrong location.
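The reward scheme above might be written as a single function. This is a sketch under stated assumptions: the three numeric values are placeholders chosen for illustration, not taken from the question, and the function signature is hypothetical.

```python
STEP_REWARD = -1       # assumed value: small penalty for every step
DELIVERY_REWARD = 20   # assumed value: bonus for a correct dropoff
ILLEGAL_PENALTY = -10  # assumed value: pickup/dropoff at the wrong location


def reward(action, taxi_pos, passenger_pos, in_taxi, dest_pos):
    """Return the reward for taking `action` in the described situation.

    `action` is one of "down", "up", "right", "left", "pickup", "dropoff".
    `passenger_pos` is ignored when the passenger is already in the taxi.
    """
    if action == "pickup":
        legal = (not in_taxi) and taxi_pos == passenger_pos
        return STEP_REWARD if legal else ILLEGAL_PENALTY
    if action == "dropoff":
        if in_taxi and taxi_pos == dest_pos:
            return DELIVERY_REWARD
        return ILLEGAL_PENALTY
    # Any movement action costs one step.
    return STEP_REWARD
```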
Random agent baseline
Before you start implementing any complex algorithm, you should always build a baseline model.
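A random-agent baseline without gym could look roughly like the sketch below. It is deliberately simplified: the `GridTaxi` class is hypothetical, only the outer boundary acts as a wall, the four designated locations are placed in the corners, and the reward values are assumed rather than taken from the question.

```python
import random

ACTIONS = ["down", "up", "right", "left", "pickup", "dropoff"]
MOVES = {"down": (1, 0), "up": (-1, 0), "right": (0, 1), "left": (0, -1)}


class GridTaxi:
    """Simplified taxi world (assumption: no interior walls)."""

    def __init__(self, n_rows=8, n_cols=8, rng=None):
        self.n_rows, self.n_cols = n_rows, n_cols
        self.rng = rng if rng is not None else random.Random()
        # Assumption: R, G, Y, B sit in the four corners of the grid.
        self.stops = [(0, 0), (0, n_cols - 1),
                      (n_rows - 1, 0), (n_rows - 1, n_cols - 1)]

    def reset(self):
        self.taxi = (self.rng.randrange(self.n_rows),
                     self.rng.randrange(self.n_cols))
        # Pick a pickup stop and a different destination stop.
        self.passenger, self.dest = self.rng.sample(range(4), 2)
        return self.taxi, self.passenger, self.dest

    def step(self, action):
        """Apply one action and return (reward, done)."""
        if action in MOVES:
            dr, dc = MOVES[action]
            # Clamp to the grid: bumping into the boundary leaves the taxi in place.
            r = min(max(self.taxi[0] + dr, 0), self.n_rows - 1)
            c = min(max(self.taxi[1] + dc, 0), self.n_cols - 1)
            self.taxi = (r, c)
            return -1, False
        if action == "pickup":
            if self.passenger is not None and self.taxi == self.stops[self.passenger]:
                self.passenger = None  # None means "in the taxi"
                return -1, False
            return -10, False
        # action == "dropoff"
        if self.passenger is None and self.taxi == self.stops[self.dest]:
            return 20, True
        return -10, False


def run_random_episode(env, max_steps=10_000):
    """Act uniformly at random; return (steps taken, total reward)."""
    env.reset()
    total = 0
    for t in range(1, max_steps + 1):
        reward, done = env.step(env.rng.choice(ACTIONS))
        total += reward
        if done:
            return t, total
    return max_steps, total
```

Calling `run_random_episode(GridTaxi(8, 8, rng=random.Random(0)))` over many episodes gives step counts and returns that a trained agent should comfortably beat.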