Question

You go to a Halloween party at a mysterious haunted house. Through exploration, you discover
that it has the following characteristics. You can either be scared or not scared, and you can either be
upstairs or downstairs. If you are scared, running up or down the stairs costs you a unit of energy
(reward = -1) but changes your scared state. If you aren't scared, you can run up or down the stairs
with a unit energy cost but also a +1 reward for continuing to not be scared (so overall reward = 0).
There are more ghosts downstairs than upstairs, so sitting still in a scared state is worse downstairs
(reward = -3) than upstairs (reward = -2). If you sit still while you are not scared, a ghost will pop
out and scare you 25% of the time (-2 reward). Otherwise you will remain happy (+2 reward).
(a) Formulate this problem as an MDP (for the sake of uniformity, formulate it as a continuing
discounted problem with gamma = 0.9). The rewards are specified in the description. Explicitly
give the state set, action set, transition probabilities, and reward expectations $R^a_{ss'}$ (expected
rewards for state-action-next-state triples as a three-argument function $R : S \times A \times S \to \mathbb{R}$; hint:
see equation 3.6 in Reinforcement Learning, Second Edition, by Sutton and Barto).
(b) Starting with the policy of running from every state, perform ONE step of policy iteration (by
hand). This means evaluate the given policy and then do policy improvement ONCE. Show
all steps. What is the resulting policy? (Hint: use iterative policy evaluation.)
(c) Explicitly write the Bellman optimality equations for this problem.
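
Parts (a) and (b) can be checked numerically with a short Python sketch. It is one possible reading of the problem, not a verified solution: the state labels (`US`, `UN`, `DS`, `DN`), the table `P`, and the helper names are hypothetical, and it assumes that running always changes floors, that running while scared makes you not scared, and that sitting still while scared leaves the state unchanged.

```python
GAMMA = 0.9  # discount factor given in part (a)

# Hypothetical state labels: U/D = upstairs/downstairs, S/N = scared/not scared.
STATES = ["US", "UN", "DS", "DN"]
ACTIONS = ["run", "sit"]

# P[s][a] lists (probability, next_state, reward) triples.
# Assumption: sitting while scared keeps you in the same state.
P = {
    "US": {"run": [(1.0, "DN", -1.0)], "sit": [(1.0, "US", -2.0)]},
    "DS": {"run": [(1.0, "UN", -1.0)], "sit": [(1.0, "DS", -3.0)]},
    "UN": {"run": [(1.0, "DN", 0.0)],
           "sit": [(0.25, "US", -2.0), (0.75, "UN", 2.0)]},
    "DN": {"run": [(1.0, "UN", 0.0)],
           "sit": [(0.25, "DS", -2.0), (0.75, "DN", 2.0)]},
}


def q_value(v, s, a):
    """Expected one-step return of action a in state s under values v."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in P[s][a])


def evaluate_policy(policy, tol=1e-10):
    """Iterative policy evaluation with in-place sweeps."""
    v = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            new = q_value(v, s, policy[s])
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            return v


def improve_policy(v):
    """Greedy policy improvement with respect to v."""
    return {s: max(ACTIONS, key=lambda a: q_value(v, s, a)) for s in STATES}


# One step of policy iteration, starting from "run from every state".
initial_policy = {s: "run" for s in STATES}
v_run = evaluate_policy(initial_policy)
improved_policy = improve_policy(v_run)
```

Under these assumptions, evaluating the all-run policy gives a value of -1 for both scared states and 0 for both not-scared states. Greedy improvement then keeps `run` in the scared states and switches to `sit` in the not-scared states, where sitting is worth 0.25(-2 + 0.9(-1)) + 0.75(2 + 0) = 0.775 against 0 for running.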

