
Question


[Figure: the grid world and the reward of each cell; image not transcribed]

1. Optimal Policy (4pt) An agent lives in the 2×3 world shown above. Once it reaches the top-right cell, the only action it can take is to exit, receiving a reward of +10. In any other cell, the agent can go east, west, north, or south. If it tries to move outside the borders of the grid, it bounces off the wall and stays put. In all these cases, it receives the reward of the cell that it lands on, as shown in the figure. We assume a stochastic transition model in which the agent goes in the direction it selects 70% of the time; the rest of the time it moves at right angles to the intended direction (15% to the right and 15% to the left). If an intended or unintended move is impossible, it is still attempted but results in the agent remaining in the same state and collecting the reward associated with that cell. Assuming no discounting (γ = 1), please answer the following questions:

(i) What is the optimal policy for r = 0? Justify your answer by explaining intuitively why this value of r leads to this policy.

(ii) What is the optimal policy for r = +3? Justify your answer by explaining intuitively why this value of r leads to this policy.
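The transition model described in the question can be sketched in code, which may help when reasoning about either policy. This is only a minimal illustration under stated assumptions: the grid is taken to have 2 rows and 3 columns, cells are indexed as (row, column) with row 0 at the top, and the names `step` and `transition_distribution` are hypothetical helpers, not part of the original problem. The cell rewards and the exit action in the top-right cell are deliberately left out, since the figure is not available.

```python
from collections import defaultdict

# Assumption: the "2x3 world" has 2 rows and 3 columns; row 0 is the top row.
ROWS, COLS = 2, 3

# Row/column offsets for each compass direction.
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}

# Directions at right angles to each intended direction.
LEFT_OF = {"north": "west", "west": "south", "south": "east", "east": "north"}
RIGHT_OF = {"north": "east", "east": "south", "south": "west", "west": "north"}


def step(state, direction):
    """Move one cell in `direction`; bounce off the wall and stay put if blocked."""
    row, col = state
    d_row, d_col = MOVES[direction]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < ROWS and 0 <= new_col < COLS:
        return (new_row, new_col)
    return state


def transition_distribution(state, action):
    """Return {next_state: probability} for a non-exit cell.

    70% intended direction, 15% to its left, 15% to its right, as stated
    in the question. Outcomes that land in the same cell are summed.
    """
    dist = defaultdict(float)
    outcomes = ((action, 0.70), (LEFT_OF[action], 0.15), (RIGHT_OF[action], 0.15))
    for direction, prob in outcomes:
        dist[step(state, direction)] += prob
    return dict(dist)
```

For example, calling `transition_distribution((1, 0), "west")` from the bottom-left cell shows that walking into a wall keeps the agent in place with most of the probability mass, which is the kind of behavior the question asks you to reason about when r is positive.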

