Question

Gridworld - Q Learning
Create a 5×5 grid world.
An agent moves around the grid.
Four possible actions.
One goal state.
Reward at the goal = +5, and at another terminal state = -5.
Reward elsewhere = 0.
Any action that would take the agent outside the boundary: reward = -1.
Run 100,000 episodes.
Keep a fixed random number seed.
Plot the converged policy and value function for this grid world.
Do this for gamma = 0.1, 0.5 and 0.9, with epsilon = 0.1.
For gamma = 0.9, plot the number of steps to reach the goal across episodes for epsilon = 0.1, 0.3 and 0.5.
For all of the above, keep the learning rate alpha = 0.1.
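
The setup described above can be sketched roughly as follows. This is only a minimal illustration of tabular Q-learning with an epsilon-greedy policy, not a reference solution. The question does not say where the goal, the -5 terminal state, or the start state are, so `GOAL`, `TRAP` and `START` below are assumed positions; the four actions are assumed to be up, down, left and right; an out-of-bounds move is assumed to leave the agent in place; and `max_steps` is an added safety cap. All function names are hypothetical.

```python
import random

SIZE = 5                      # 5x5 grid, states are (row, col)
GOAL = (4, 4)                 # assumed location of the +5 goal
TRAP = (2, 2)                 # assumed location of the -5 terminal state
START = (0, 0)                # assumed start state
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumed: up, down, left, right


def step(state, action):
    """Return (next_state, reward, done) for one move."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE):
        return state, -1.0, False      # hit the boundary: stay put, reward -1
    if (r, c) == GOAL:
        return (r, c), 5.0, True
    if (r, c) == TRAP:
        return (r, c), -5.0, True
    return (r, c), 0.0, False


def q_learning(episodes, gamma, epsilon, alpha=0.1, seed=0, max_steps=1000):
    """Tabular Q-learning; returns the Q-table and the step count per episode."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    q = {(r, c): [0.0] * 4 for r in range(SIZE) for c in range(SIZE)}
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = START, False, 0
        while not done and steps < max_steps:
            if rng.random() < epsilon:             # explore
                a = rng.randrange(4)
            else:                                  # exploit, ties broken at random
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Terminal states have no future value, so the target is the reward.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state, steps = nxt, steps + 1
        # Counts steps until the episode ends (goal, trap, or the step cap).
        steps_per_episode.append(steps)
    return q, steps_per_episode


def greedy_policy_and_values(q):
    """Derive the greedy action (as an arrow) and state value max_a Q(s, a)."""
    arrows = "^v<>"
    policy = {s: arrows[max(range(4), key=lambda i: q[s][i])] for s in q}
    values = {s: max(q[s]) for s in q}
    return policy, values
```

Calling `q_learning(100_000, gamma, 0.1)` for each gamma in 0.1, 0.5 and 0.9 and passing the resulting table to `greedy_policy_and_values` would give the quantities to plot; the second return value of `q_learning(100_000, 0.9, epsilon)` gives the per-episode step counts for each epsilon. Plotting itself is left out of the sketch.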