Question

Consider training an MDP using the following sequence of states, actions, and rewards:

S1, reward 0, action 1

S2, reward 10, action 1

S2, reward 10, action 2

S1, reward 0, action 1

S2, reward 10, action 2

S1, reward 0, action 2

S3, reward 0, action 1

S3, reward 0, action 1

S4, reward 100, action 1

S4, reward 100, action 2

S2, reward 10

(a) Suppose you use certainty-equivalent learning to calculate the J* values. Fill in the table below, using a discount factor of 0.5.

State: S1 S2 S3 S4
J* Value:
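
A minimal Python sketch of certainty-equivalent learning for this sequence follows. It rests on several assumptions that the question does not state explicitly: the reward belongs to the state being left, transition probabilities are estimated as observed frequencies, pairs that were never observed (here (S3, 2)) are ignored, and the resulting Bellman equations are solved by value iteration. The names `estimate_model` and `value_iteration` are illustrative only.

```python
from collections import defaultdict

GAMMA = 0.5  # discount factor given in the question

# Observed sequence as (state, reward, action); the final state has no action.
trajectory = [
    ("S1", 0, 1), ("S2", 10, 1), ("S2", 10, 2), ("S1", 0, 1),
    ("S2", 10, 2), ("S1", 0, 2), ("S3", 0, 1), ("S3", 0, 1),
    ("S4", 100, 1), ("S4", 100, 2), ("S2", 10, None),
]


def estimate_model(traj):
    """Estimate state rewards and transition probabilities from counts."""
    counts = defaultdict(lambda: defaultdict(int))
    rewards = {}
    for (s, r, a), (s_next, _, _) in zip(traj, traj[1:]):
        counts[(s, a)][s_next] += 1
        rewards[s] = r
    last_state, last_reward, _ = traj[-1]
    rewards[last_state] = last_reward
    probs = {}
    for sa, nxt in counts.items():
        total = sum(nxt.values())
        probs[sa] = {s2: n / total for s2, n in nxt.items()}
    return rewards, probs


def value_iteration(rewards, probs, gamma, sweeps=200):
    """Iterate J(s) = r(s) + gamma * max_a sum_s' P(s'|s,a) J(s')."""
    J = {s: 0.0 for s in rewards}
    for _ in range(sweeps):
        J = {
            s: rewards[s] + gamma * max(
                sum(p * J[s2] for s2, p in nxt.items())
                for (s0, _), nxt in probs.items() if s0 == s
            )
            for s in rewards
        }
    return J


rewards, probs = estimate_model(trajectory)
J_star = value_iteration(rewards, probs, GAMMA)
for state in sorted(J_star):
    print(state, round(J_star[state], 2))
```

Under these assumptions the estimated model gives (S3, 1) a 50/50 split between S3 and S4, and the iteration settles at roughly J*(S1) = 33.33, J*(S2) = 26.67, J*(S3) = 66.67, J*(S4) = 200. A different reward convention or a different treatment of unobserved pairs would change these numbers.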

(b) Suppose you instead use Q-learning. Assume that all Q-values are initialized to 0. Fill in the table below to show how the Q-values change after the first six transitions, using discount factor = 0.5 and learning rate = 0.5.

State, Action Pair: (S1, 1) (S1, 2) (S2, 1) (S2, 2) (S3, 1) (S3, 2) (S4, 1) (S4, 2)
Q-value at start: 0 0 0 0 0 0 0 0

Q-value after observing S1, reward 0, action 1, next state S2:

Q-value after observing S2, reward 10, action 1, next state S2:

Q-value after observing S2, reward 10, action 2, next state S1:

Q-value after observing S1, reward 0, action 1, next state S2:

Q-value after observing S2, reward 10, action 2, next state S1:

Q-value after observing S1, reward 0, action 2, next state S3:
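
The six updates in part (b) can be traced with a short Python sketch. It assumes the standard tabular rule Q(s, a) ← (1 − α) Q(s, a) + α (r + γ max over a' of Q(s', a')), with r taken as the reward listed for the state being left; the question does not spell out either convention, and the function name `q_learning` is illustrative.

```python
GAMMA = 0.5  # discount factor given in the question
ALPHA = 0.5  # learning rate given in the question
STATES = ["S1", "S2", "S3", "S4"]
ACTIONS = [1, 2]

# First six transitions as (state, reward, action, next_state).
transitions = [
    ("S1", 0, 1, "S2"),
    ("S2", 10, 1, "S2"),
    ("S2", 10, 2, "S1"),
    ("S1", 0, 1, "S2"),
    ("S2", 10, 2, "S1"),
    ("S1", 0, 2, "S3"),
]


def q_learning(transitions, alpha, gamma):
    """Apply one tabular Q-learning update per transition.

    Returns a list with a snapshot of the Q-table after each update.
    """
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    history = []
    for s, r, a, s_next in transitions:
        # Bootstrapped target: immediate reward plus discounted best next value.
        target = r + gamma * max(q[(s_next, b)] for b in ACTIONS)
        q[(s, a)] = (1 - alpha) * q[(s, a)] + alpha * target
        history.append(dict(q))
    return history


history = q_learning(transitions, ALPHA, GAMMA)
for step, snapshot in enumerate(history, start=1):
    row = [snapshot[(s, a)] for s in STATES for a in ACTIONS]
    print(step, row)
```

Under these assumptions only four entries ever move: Q(S2, 1) becomes 5 after the second transition, Q(S2, 2) becomes 5 after the third and 7.8125 after the fifth, and Q(S1, 1) becomes 1.25 after the fourth. Q(S1, 2) stays at 0 after the sixth transition because every Q-value for S3 is still 0.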
