Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 30, 2024

Question 2 (RL) [50 points - each part 12.5 points]: Consider the following grid world with five different states. The actions are move east, west,

image text in transcribed

Question 2 (RL) [50 points - each part 12.5 points]: Consider the following grid world with five different states. The actions are move east, west, south, north, and exit if it is in a terminal state. (a) We would like to use Model-based learning using the following four observations. What is the estimated Transition and reward based on these observations? (b) Implement direct evaluation as a model-free based learning based on those four observations and calculate the value states for each state. Assume =0.9. (c) We would like to use TD learning and Q-learning to find the values of these states. Suppose that we have the following observed transitions (s,a,s,r) : (B, East, C,3), (C, South, E, 3), (C, East, E,4) , (D, West, C,1), (A,South,C,3) The initial value of each state is 0 . Assume that =0.9 and =0.4. What are the learned values from TD learning after all five observations? Show the process of computing these values. (d) What are the learned Q-values from Q-learning after all five observations? Show the process of computing these values

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Modern Database Management

Modern Database Management

Authors: Jeff Hoffer, Ramesh Venkataraman, Heikki Topi

12th edition

133544613, 978-0133544619

More Books

Students also viewed these Databases questions

Question

★★★★★

Globe Corporation, a new environmental control company, initiated a performance based stock option plan for its management on January 1, 2010. The plan provided for the granting of a variable number...

Answered: 1 week ago

Question

★★★★★

Percentage of Completion Problem: The Nifty Construction Company entered into a long-term contract in 2013. The contract price was $1,600,000, and the company initially estimated the total cost of...

Answered: 1 week ago

Question

★★★★★

Which attributes were difficult to locate or in unexpected places in the dataset?

Answered: 1 week ago

Question

★★★★★

The Vernom Corporation produces and sells to wholesalers a highly successful line of summer lotion and insect repellents. Vernom has decided to diversify to stabilize sales throughout the year. A...

Answered: 1 week ago

Question

★★★★★

Question 2 (RL) [50 points - each part 12.5 points]: Consider the following grid world with five different states. The actions are move east, west, south, north, and exit if it is in a terminal...

Answered: 1 week ago

Question

★★★★★

I found an article from indeed which is a source I really enjoy. I have found multiple jobs on that job board and even posted several. We have found multiple applicants through indeed as well. Some...

Answered: 1 week ago

Question

★★★★★

When is it appropriate to use a root cause analysis

Answered: 1 week ago

Question

★★★★★

.Calculate Enterprise value using the information below ( $ million) Market cap: $3million debt: $2million cash: $2million account receivable: $6Million 2.Which one is correct? ( ) higher PEG...

Answered: 1 week ago

Question

★★★★★

Propose and explain intrinsic and extrinsic strategies that facilitate employee motivation to accomplish the mission and vision goals at Hilton Hotels.

Answered: 1 week ago

Question

★★★★★

A paintball gun uses a liquid CO2 cannister as a propellant for the balls of paint. The tank is at 293 K, with the CO2 near the exit of the cannister valve in gaseous form. At this condition, the...

Answered: 1 week ago

Question

★★★★★

Georgia, Inc. has collected the following data on one of its products. The actual cost of direct materials used is: Direct materials standard (2 lbs @ $3/lb) Total direct materials cost...

Answered: 1 week ago

Previous Question Next Question