
Question



Consider the following gridworld. The available actions at any given state are North, East, West, and South. There are 2 states with +5 and -5 rewards, as shown in the figure. They are also terminal states, where the agent can take an exit action. The grey cell is a blocked state that your agent cannot move into. In a state where taking an action bumps the agent into a nearby wall, the state of the agent does not change, i.e., the agent ends up in the same cell. The discount factor in this gridworld is 0.9, and the transition probability of taking an action at a given state is 0.8. The agent can end up in a different state than expected with equal probability. You can take the exit action at a terminal state with probability 1. (16 Points)

[Figure: gridworld with the +5 and -5 terminal cells and the grey blocked cell]

(a) Perform 1 iteration of the value iteration algorithm. Draw the policy in the gridworld, marked with arrows, after the iteration. Show your calculations for each state.

(b) Perform 2 iterations of the value iteration algorithm. Draw the policy in the gridworld, marked with arrows, after the iteration. Show your calculations for each state.
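
The figure is not available here, so the exact grid cannot be reproduced. As a rough sketch of how the requested iterations could be computed, the Python below runs value iteration on a hypothetical 3x4 layout. The grid size, the positions of the +5, -5, and blocked cells, the zero living reward, the all-zero starting values, and the reading that the remaining 0.2 probability is split equally between the two directions perpendicular to the intended one are all assumptions, not details taken from the question. Substitute the real layout from the figure before relying on the numbers.

```python
# Sketch of value iteration for a gridworld. Layout is HYPOTHETICAL:
# the original figure is missing, so cell positions below are assumed.
ROWS, COLS = 3, 4
TERMINALS = {(0, 3): 5.0, (1, 3): -5.0}  # assumed positions of +5 / -5
BLOCKED = {(1, 1)}                        # assumed position of grey cell
GAMMA = 0.9        # discount factor (from the question)
P_INTENDED = 0.8   # probability the intended move happens (from the question)

MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
# Assumption: the remaining 0.2 is split equally between the two
# directions perpendicular to the intended one (0.1 each).
SIDEWAYS = {"N": ("W", "E"), "S": ("W", "E"), "E": ("N", "S"), "W": ("N", "S")}


def step(state, direction):
    """Return the cell reached by moving; bumping a wall or the blocked cell keeps the agent in place."""
    r, c = state
    dr, dc = MOVES[direction]
    nxt = (r + dr, c + dc)
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS) or nxt in BLOCKED:
        return state
    return nxt


def q_value(values, state, action):
    """Expected discounted value of taking `action` (living reward assumed to be 0)."""
    slip = (1.0 - P_INTENDED) / 2
    outcomes = [(P_INTENDED, action)] + [(slip, d) for d in SIDEWAYS[action]]
    return sum(p * GAMMA * values[step(state, d)] for p, d in outcomes)


def value_iteration(n_iters):
    """Run n_iters synchronous Bellman updates starting from all-zero values."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) not in BLOCKED]
    values = {s: 0.0 for s in states}
    policy = {}
    for _ in range(n_iters):
        new_values = {}
        for s in states:
            if s in TERMINALS:
                # Exit action succeeds with probability 1 and pays the reward.
                new_values[s] = TERMINALS[s]
                policy[s] = "exit"
            else:
                qs = {a: q_value(values, s, a) for a in MOVES}
                best = max(qs, key=qs.get)  # ties go to the first of N, E, S, W
                new_values[s] = qs[best]
                policy[s] = best
        values = new_values
    return values, policy


if __name__ == "__main__":
    for k in (1, 2):
        values, policy = value_iteration(k)
        print(f"After {k} iteration(s):")
        for r in range(ROWS):
            print("  ".join(
                " blocked " if (r, c) in BLOCKED
                else f"{values[(r, c)]:5.2f}{policy[(r, c)][0]:>4}"
                for c in range(COLS)))
```

Under these assumptions, the first iteration only assigns the exit rewards to the two terminal cells, and every other cell stays at 0. The second iteration then propagates them one step: a cell directly next to the +5 cell gets 0.8 x 0.9 x 5 = 3.6 when the side moves lead only to zero-valued cells.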
