Question

A soccer robot R is on a fast break toward the goal, starting in position 1. From positions 1 through 3, it can either shoot (S) or dribble the ball forward (D). From 4 it can only shoot. If it shoots, it either scores a goal (state G) or misses (state M). If it dribbles, it either advances a square or loses the ball, ending up in state M.

[Figure: the robot R and squares 1, 2, 3, 4 in a row leading to the goal.]

In this Markov Decision Process (MDP), the states are 1, 2, 3, 4, G, and M, where G and M are terminal states. The transition model depends on the parameter $y$, which is the probability of dribbling successfully (i.e., advancing a square). Assume a discount of $\gamma = 1$. For $k \in \{1, 2, 3, 4\}$, we have

$P(G \mid k, S) = \frac{k}{6}$
$P(M \mid k, S) = 1 - \frac{k}{6}$
$P(k+1 \mid k, D) = y$
$P(M \mid k, D) = 1 - y$
$R(k, S, G) = 1$

and rewards are 0 for all other transitions.

(a) (3 points) Denote by $V^\pi$ the value function for the specific policy $\pi$. What is $V^\pi(1)$ for the policy that always shoots?

(b) (4 points) Denote by $Q^*(s, a)$ the value of a q-state $(s, a)$, which is the expected utility when starting with action $a$ at state $s$, and thereafter acting optimally. What is $Q^*(3, D)$ in terms of $y$?

(c) (5 points) Denote by $V_t(s)$ the value of a state $s$ at iteration $t$, which is the expected utility when starting in state $s$ and acting optimally. Using $y$, complete the first two iterations ($t = 1, 2$) of value iteration. Iteration 0 corresponds to having value 0 in every state: $V_0(1) = V_0(2) = V_0(3) = V_0(4) = 0$.
Hint: Recall that $V_{t+1}(s) = \max_{a \in A} \sum_{s'} P(s' \mid s, a) \, (R(s, a, s') + V_t(s'))$.

(d) (3 points) For what range of values of $y$ is $Q^*(3, S) \ge Q^*(3, D)$?
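
The value-iteration update in the hint can be tried out numerically. The sketch below is not the expert solution. It assumes the model as read from the problem statement: a discount of 1, P(G|k, S) = k/6, P(k+1|k, D) = y, and a reward of 1 only for scoring. The function names `q_values` and `value_iteration` are illustrative choices, not part of the question.

```python
from fractions import Fraction


def q_values(k, V, y):
    """One-step lookahead values for square k, given value estimates V.

    Assumes discount 1, P(G|k,S) = k/6 and P(k+1|k,D) = y, as read from
    the problem statement. Terminal states G and M have value 0.
    """
    # Shooting: reward 1 with probability k/6, otherwise reward 0 and state M.
    q = {"S": Fraction(k, 6)}
    if k < 4:
        # Dribbling: advance to k+1 with probability y, otherwise lose the ball (M).
        q["D"] = y * V[k + 1]
    return q


def value_iteration(y, iterations):
    """Return the list [V_0, V_1, ..., V_iterations] for squares 1 through 4."""
    V = {k: Fraction(0) for k in range(1, 5)}  # iteration 0: all values are 0
    history = [V]
    for _ in range(iterations):
        V = {k: max(q_values(k, V, y).values()) for k in range(1, 5)}
        history.append(V)
    return history


if __name__ == "__main__":
    # Example value of y chosen only for illustration.
    for t, V in enumerate(value_iteration(Fraction(3, 4), 2)):
        print(t, {k: str(v) for k, v in V.items()})
```

Because every value is 0 at iteration 0, dribbling contributes nothing in the first update, so each square's value at iteration 1 equals its shooting probability. Passing `y` as a `Fraction` keeps the results exact, which makes them easier to compare against a hand derivation in terms of y.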
