Question
A soccer robot R is on a fast break toward the goal, starting in position 1. From positions 1 through 3, it can either shoot (S) or dribble the ball forward (D). From position 4 it can only shoot. If it shoots, it either scores a goal (state G) or misses (state M). If it dribbles, it either advances a square or loses the ball, ending up in state M.

[Figure: robot R and positions 1, 2, 3, 4 in a row leading to the goal.]

In this Markov Decision Process (MDP), the states are 1, 2, 3, 4, G, and M, where G and M are terminal states. The transition model depends on the parameter $y$, which is the probability of dribbling successfully (i.e., advancing a square). Assume a discount of $\gamma = 1$. For $k \in \{1, 2, 3, 4\}$, we have

$P(G \mid k, S) = \frac{k}{6}$, $P(M \mid k, S) = 1 - \frac{k}{6}$, $P(k+1 \mid k, D) = y$, $P(M \mid k, D) = 1 - y$, $R(k, S, G) = 1$,

and rewards are 0 for all other transitions.

(a) (3 points) Denote by $V^\pi$ the value function for the specific policy $\pi$. What is $V^\pi(1)$ for the policy that always shoots?

(b) (4 points) Denote by $Q^*(s, a)$ the value of a q-state $(s, a)$, which is the expected utility when starting with action $a$ at state $s$, and thereafter acting optimally. What is $Q^*(3, D)$ in terms of $y$?

(c) (5 points) Denote by $V_t(s)$ the value of a state $s$ at iteration $t$, which is the expected utility when starting in state $s$ and acting optimally. Using $y$, complete the first two iterations ($t = 1, 2$) of value iteration. Iteration 0 corresponds to having value 0 in every state: $V_0(1) = V_0(2) = V_0(3) = V_0(4) = 0$.

Hint: Recall that $V_{t+1}(s) = \max_{a \in A} \sum_{s'} P(s' \mid s, a) \left( R(s, a, s') + V_t(s') \right)$.

(d) (3 points) For what range of values of $y$ is $Q^*(3, S) \ge Q^*(3, D)$?
Step by Step Solution
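The quantities asked for in the question can be checked numerically with a small value-iteration sketch. This is not the site's solution, only an illustration. It assumes the readings $\gamma = 1$ and $P(G \mid k, S) = k/6$, which are reconstructed from the garbled problem text, and the names `q_shoot`, `q_dribble`, and `value_iteration` are hypothetical helpers. Exact fractions are used so results can be compared with hand calculations.

```python
from fractions import Fraction

GAMMA = 1                      # assumed discount (reconstructed from the problem text)
POSITIONS = (1, 2, 3, 4)       # non-terminal states; G and M are terminal with value 0


def q_shoot(k):
    """Value of shooting from k: reward 1 with assumed probability k/6, else 0."""
    return Fraction(k, 6)


def q_dribble(k, y, values):
    """Value of dribbling from k: reach k+1 with probability y, else M (value 0)."""
    return y * GAMMA * values[k + 1]


def value_iteration(y, iterations):
    """Run the given number of Bellman backups starting from V_0 = 0 everywhere."""
    values = {k: Fraction(0) for k in POSITIONS}
    for _ in range(iterations):
        updated = {}
        for k in POSITIONS:
            options = [q_shoot(k)]
            if k < 4:          # dribbling is not available from position 4
                options.append(q_dribble(k, y, values))
            updated[k] = max(options)
        values = updated
    return values


if __name__ == "__main__":
    y = Fraction(9, 10)        # example dribble success probability (arbitrary choice)
    for t in (1, 2):
        print(t, value_iteration(y, t))
    v_star = value_iteration(y, 4)   # four backups suffice: the longest path is 1-2-3-4-G
    print("Q*(3,S) =", q_shoot(3), " Q*(3,D) =", q_dribble(3, y, v_star))
```

Under these assumptions the always-shoot policy gives state 1 the value `q_shoot(1)`, and comparing `q_shoot(3)` with `q_dribble(3, y, v_star)` for a few values of `y` shows where shooting from position 3 stops being preferable.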