Question: 1. Given the grid world in figure 18.12, if the reward on reaching the goal is 100 and γ = 0.9, manually calculate Q∗(s, a), V∗(s), and the actions of the optimal policy.
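The hand calculation can be cross-checked with a short script. For deterministic moves the optimal values satisfy Q∗(s, a) = r(s, a) + γ max over a′ of Q∗(s′, a′) and V∗(s) = max over a of Q∗(s, a). The sketch below is only an illustration: figure 18.12 is not reproduced here, so the 2x3 layout, the goal position, and the names `step` and `solve` are all hypothetical. It also assumes that the reward of 100 is received on the move that enters the goal, that every other move earns 0, and that the episode ends at the goal.

```python
# Hypothetical layout: a 2x3 grid with the goal in the top-right cell.
# This is NOT the grid of figure 18.12; substitute the real layout before use.
ROWS, COLS = 2, 3
GOAL = (0, 2)
GAMMA = 0.9
GOAL_REWARD = 100.0  # assumed to be paid on the move that enters the goal
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Deterministic move: return (next_state, reward), or None if it leaves the grid."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS):
        return None
    return nxt, (GOAL_REWARD if nxt == GOAL else 0.0)


def solve(sweeps=50):
    """Value iteration: repeat the Bellman optimality backup for a fixed number of sweeps."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    V = {s: 0.0 for s in states}  # V at the goal stays 0 because the episode ends there
    Q = {}
    for _ in range(sweeps):
        for s in states:
            if s == GOAL:
                continue
            for a in ACTIONS:
                move = step(s, a)
                if move is None:
                    continue  # moves that leave the grid are treated as unavailable
                nxt, reward = move
                Q[(s, a)] = reward + GAMMA * V[nxt]
            V[s] = max(Q[(s, a)] for a in ACTIONS if (s, a) in Q)
    # Greedy policy: in each state pick an action with the largest Q value.
    policy = {
        s: max((a for a in ACTIONS if (s, a) in Q), key=lambda a: Q[(s, a)])
        for s in states
        if s != GOAL
    }
    return Q, V, policy


if __name__ == "__main__":
    Q, V, policy = solve()
    for s in sorted(V):
        print(s, round(V[s], 2), policy.get(s, "goal"))
```

Under these assumptions each value is 100 multiplied by 0.9 once for every extra move needed to reach the goal: cells next to the goal get V∗ = 100, cells two moves away get 90, and so on. The same pattern should hold in the real grid, provided the reward convention matches the one assumed here.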
