
Question

Consider applying the Q-learning algorithm to the same grid world as in Problem 1. Assume that the table of q-values is initialized to 0. Assume the agent begins in state S7 and then travels clockwise around the perimeter of the grid until it reaches the absorbing goal state, completing the first training episode. Assume that = 0.8 and that = 1.

(a) Determine which q(s, a) values are modified as a result of this episode, and give their revised values.

(b) Assume that the agent now performs a second identical episode. Determine which q(s, a) values are modified as a result of this episode, and give their revised values.

(c) Assume that the agent now performs a third identical episode. Determine which q(s, a) values are modified as a result of this episode, and give their revised values.
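
The mechanics behind parts (a) to (c) can be sketched with the standard tabular Q-learning update, q(s, a) ← q(s, a) + alpha · (r + gamma · max over a' of q(s', a') − q(s, a)). The code below is a minimal illustration, not the solution to this problem: the grid and rewards from Problem 1 are not reproduced here, so the three-state chain, the action name "go", the reward of 10 on entering the goal, and the choice of a learning rate of 1 with a discount factor of 0.8 are all hypothetical. The function name `run_episode` is also an assumption made for this sketch.

```python
from collections import defaultdict


def run_episode(q, transitions, alpha, gamma):
    """Apply the tabular Q-learning update along one recorded episode.

    transitions: list of (state, action, reward, next_state, terminal) tuples.
    Returns a dict of the (state, action) entries whose value changed.
    """
    changed = {}
    for state, action, reward, next_state, terminal in transitions:
        if terminal:
            # An absorbing state contributes no future value.
            best_next = 0.0
        else:
            # Greedy value of the next state; 0.0 if it has no entries yet.
            best_next = max(
                (v for (s, _), v in q.items() if s == next_state),
                default=0.0,
            )
        old_value = q[(state, action)]
        target = reward + gamma * best_next
        new_value = old_value + alpha * (target - old_value)
        if new_value != old_value:
            changed[(state, action)] = new_value
        q[(state, action)] = new_value
    return changed


# Hypothetical chain A -> B -> C -> goal (NOT the grid from Problem 1).
# Assumed reward: 10 on the step that enters the goal, 0 otherwise.
episode = [
    ("A", "go", 0.0, "B", False),
    ("B", "go", 0.0, "C", False),
    ("C", "go", 10.0, "GOAL", True),
]

q = defaultdict(float)  # every q-value starts at 0
alpha, gamma = 1.0, 0.8  # assumed learning rate and discount factor

first = run_episode(q, episode, alpha, gamma)
second = run_episode(q, episode, alpha, gamma)
third = run_episode(q, episode, alpha, gamma)

for label, changed in (("first", first), ("second", second), ("third", third)):
    print(label, changed)
```

Under these assumptions, each repetition of the same path changes exactly one entry: the reward information moves one step further back from the goal per episode, because every update reads the next state's value as it stood before that state was revisited. The same pattern is what parts (a), (b), and (c) ask you to trace on the actual grid.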
