
Question



Question 2 - Value Iteration [35 points]

In this question, you will use an applet to improve your understanding of value iteration. You can find the applet at https://artint.info/demos/mdp/vi.html

Note: modern browsers don't seem to like Java. There are workarounds, but a far less painful way to access the applet is to make sure you have the Java appletviewer installed (which you should if you have the JDK installed), and then run the following from your command line:

    appletviewer https://artint.info/demos/mdp/vi.html

(You may need to first navigate to the directory where the appletviewer program is located.)

There are some questions listed on that website; for this assignment, please disregard those questions and answer only the following ones.

In this assignment, we use a discount factor of 0.9, initial values of U^(0)(s) = 0 for all s, and the "absorbing states" option (explained in detail on the website with the applet). We refer to states as (x,y), meaning the state in the x-th column and the y-th row: e.g. (1,1) for the state at the top left, and (10,1) for the state at the top right.

(a) (10 points) The figure below shows the values U^(1)(s) in each state, that is, the values after one step of value iteration. We will focus on the entry in a single state, namely state (10,8), the state to the right of the absorbing state with reward 10 (which is located at (9,8)). Show in detail how the value of state (10,8) is computed from the values U(s) of the previous iteration.

[Figure: screenshot of the Value Iteration applet, with Step, Reset, Discount, Initial Value, and Absorbing States controls]
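As a rough illustration of the kind of computation part (a) asks for, here is a minimal sketch of a single value-iteration update for one state. The function name bellman_backup, the transition probabilities, the rewards, and the neighbouring values are all hypothetical and chosen only to keep the arithmetic easy to follow; they are not the applet's actual dynamics, which are described on the applet's website. Only the discount factor of 0.9 comes from the question.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]
# For each action, a list of (probability, next_state, reward) outcomes.
Outcomes = List[Tuple[float, State, float]]


def bellman_backup(outcomes_by_action: Dict[str, Outcomes],
                   values: Dict[State, float],
                   discount: float = 0.9) -> float:
    """Return the updated value of one state from the previous values.

    For each action, compute the expected immediate reward plus the
    discounted previous value of the successor state, then take the
    best action.
    """
    return max(
        sum(p * (r + discount * values.get(s2, 0.0)) for p, s2, r in outcomes)
        for outcomes in outcomes_by_action.values()
    )


# Hypothetical previous-iteration values (NOT read from the applet).
previous_values = {(9, 8): 10.0, (10, 8): 0.0, (10, 7): 0.0}

# Hypothetical transition model for state (10, 8) (NOT the applet's):
# "left" reaches (9, 8) with probability 0.8 and otherwise stays put;
# "up" always reaches (10, 7). All rewards here are 0.
outcomes_for_10_8 = {
    "left": [(0.8, (9, 8), 0.0), (0.2, (10, 8), 0.0)],
    "up": [(1.0, (10, 7), 0.0)],
}

new_value = bellman_backup(outcomes_for_10_8, previous_values)
print(new_value)
```

With these made-up numbers, "left" is worth 0.8 * (0 + 0.9 * 10) + 0.2 * (0 + 0.9 * 0) = 7.2 and "up" is worth 0, so the update picks 7.2. Answering part (a) follows the same pattern, but with the applet's real transition probabilities, rewards, and the values shown in the figure.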

