Question

Question 2 (RL) [50 points, 12.5 points per part]: Consider the following grid world with five different states. The actions are move east, west, south, and north, plus exit when the agent is in a terminal state.

(a) We would like to use model-based learning with the following four observations. What are the estimated transition and reward functions based on these observations?

(b) Implement direct evaluation, a model-free learning method, on those four observations and calculate the value of each state. Assume γ = 0.9.

(c) We would like to use TD learning and Q-learning to find the values of these states. Suppose that we have the following observed transitions (s, a, s', r): (B, East, C, 3), (C, South, E, 3), (C, East, E, 4), (D, West, C, 1), (A, South, C, 3). The initial value of each state is 0. Assume that γ = 0.9 and α = 0.4. What are the learned values from TD learning after all five observations? Show the process of computing these values.

(d) What are the learned Q-values from Q-learning after all five observations? Show the process of computing these values.
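The updates asked for in parts (c) and (d) can be traced with a short script. The following is a minimal sketch, not the graded solution. It assumes the two symbols dropped from the question are the discount factor γ = 0.9 and the learning rate α = 0.4, that the five transitions and rewards are exactly as transcribed above, and that any state-action pair not yet visited keeps its initial Q-value of 0. The names `td_learning`, `q_learning`, and `TRANSITIONS` are illustrative only.

```python
from collections import defaultdict

GAMMA = 0.9  # discount factor (assumed meaning of the "=0.9" in the question)
ALPHA = 0.4  # learning rate (assumed meaning of the "=0.4" in the question)

# Observed transitions (s, a, s', r), copied from part (c) as transcribed.
TRANSITIONS = [
    ("B", "East", "C", 3),
    ("C", "South", "E", 3),
    ("C", "East", "E", 4),
    ("D", "West", "C", 1),
    ("A", "South", "C", 3),
]


def td_learning(transitions, alpha=ALPHA, gamma=GAMMA):
    """TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    values = defaultdict(float)  # every state starts at 0
    for s, _action, s_next, r in transitions:
        sample = r + gamma * values[s_next]
        values[s] += alpha * (sample - values[s])
    return dict(values)


def q_learning(transitions, alpha=ALPHA, gamma=GAMMA):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    q = defaultdict(float)  # every (state, action) pair starts at 0
    for s, a, s_next, r in transitions:
        # Unvisited actions in s' still hold the initial value 0,
        # so 0 is always one of the candidates for the max.
        seen = [v for (state, _), v in q.items() if state == s_next]
        best_next = max([0.0] + seen)
        sample = r + gamma * best_next
        q[(s, a)] += alpha * (sample - q[(s, a)])
    return dict(q)


if __name__ == "__main__":
    for state, v in sorted(td_learning(TRANSITIONS).items()):
        print(f"V({state}) = {v:.4f}")
    for (state, action), v in sorted(q_learning(TRANSITIONS).items()):
        print(f"Q({state}, {action}) = {v:.4f}")
```

Under these assumptions, TD learning ends with V(A) = 2.0352, V(B) = 1.2, V(C) = 2.32, V(D) = 1.2352, and V(E) = 0. Q-learning ends with Q(B, East) = 1.2, Q(C, South) = 1.2, Q(C, East) = 1.6, Q(D, West) = 0.976, and Q(A, South) = 1.776. The order of the observations matters: the fourth and fifth updates bootstrap from the value of C produced by the second and third.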
