Question

MDPs (6 parts, 50 points total). The following problems take place in various scenarios of the gridworld MDP. In all cases, A is the start state and double-rectangle states are exit states. From an exit state, the only action available is Exit, which results in the listed reward and ends the game (by moving into a terminal state X, not shown). From non-exit states, the agent can choose either the Left or the Right action, which moves the agent in the corresponding direction. There are no living rewards; the only non-zero rewards come from exiting the grid. Throughout this problem, assume that value iteration begins with initial values V_0(s) = 0 for all states s.

First, consider the following minigrid. For now, the discount is γ = 1 and legal movement actions will always succeed (so the state transition function is deterministic).

[Figure: minigrid, not included in the transcribed text]

a) (5 points) What is the optimal value V*(A)?

b) (5 points) When running value iteration, remember that we start with V_0(s) = 0 for all s. What is the first iteration k for which V_k(A) will be non-zero?

c) (5 points) What will V_k(A) be when it is first non-zero?

d) (10 points) After how many iterations k will we have V_k(A) = V*(A)? If they will never become equal, write "never".

Now the situation is as before, but the discount γ is less than 1.

e) (15 points) If γ = 0.5, what is the optimal value V*(A)?

f) (10 points) For what range of values γ of the discount will it be optimal to go Right from A? Remember that 0 ≤ γ ≤ 1. Write "all" or "none" if all or no legal values of γ have this property.
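Because the minigrid figure is missing from the text, the parts above cannot be answered from this page alone. What follows is a minimal sketch of the value iteration the question describes, run on a hypothetical layout that is an assumption and not the grid from the question: a five-cell corridor with exit states at both ends (rewards +1 and +10) and A placed next to the +1 exit. The function name `value_iteration`, the index-based state encoding, and all rewards are illustrative choices.

```python
from typing import Dict, List


def value_iteration(
    exit_rewards: Dict[int, float],  # hypothetical: cell index -> Exit reward
    n_cells: int,
    gamma: float,
    iterations: int,
) -> List[List[float]]:
    """Return [V_0, V_1, ..., V_iterations] for a 1-D corridor gridworld."""
    values = [0.0] * n_cells          # V_0(s) = 0 for all states s
    history = [values[:]]
    for _ in range(iterations):
        new_values = []
        for s in range(n_cells):
            if s in exit_rewards:
                # Exit is the only action: collect the reward and move to
                # the terminal state X, whose value is always 0.
                new_values.append(exit_rewards[s])
            else:
                # Left and Right always succeed and carry no living reward,
                # so the backup is gamma times the best neighbour value.
                neighbours = [n for n in (s - 1, s + 1) if 0 <= n < n_cells]
                new_values.append(max(gamma * values[n] for n in neighbours))
        values = new_values
        history.append(values[:])
    return history


# Hypothetical layout (NOT the grid from the question):
# index:   0      1     2     3     4
# cell:  [+1]     A     .     .   [+10]
EXITS = {0: 1.0, 4: 10.0}
A = 1

for gamma in (1.0, 0.5):
    history = value_iteration(EXITS, n_cells=5, gamma=gamma, iterations=6)
    print(f"gamma={gamma}:", [v[A] for v in history])
```

On this assumed layout with γ = 1, V_k(A) first becomes non-zero at k = 2 (value 1, from the nearby exit) and reaches its optimal value 10 at k = 4, once the +10 reward has propagated back to A. With γ = 0.5 it settles at 1.25. These numbers describe only the hypothetical corridor; the same procedure applies to the original minigrid once its layout and rewards are known.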
