Question:

Consider four different ways to derive the value of αk from k in Q-learning (note that for Q-learning with varying αk, there must be a different count k for each state–action pair).

(i) Let αk = 1/k.

(ii) Let αk = 10/(9 + k).

(iii) Let αk = 0.1.

(iv) Let αk = 0.1 for the first 10,000 steps, αk = 0.01 for the next 10,000 steps, αk = 0.001 for the next 10,000 steps, αk = 0.0001 for the next 10,000 steps, and so on.
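
The four schedules and the per-pair count can be sketched in Python as below. This is only an illustrative sketch, not the book's code: the function names, the dictionary-based `Q` and `counts` tables, and the default `gamma=0.9` are assumptions made here, and schedule (iv) is assumed to count "steps" with the same per-pair count k as the other schedules.

```python
from collections import defaultdict

# Step-size schedules (i)-(iv); k is the visit count, starting at 1.
def alpha_i(k):
    return 1.0 / k

def alpha_ii(k):
    return 10.0 / (9 + k)

def alpha_iii(k):
    return 0.1

def alpha_iv(k):
    # 0.1 for k = 1..10,000, then 0.01 for the next 10,000, and so on.
    return 10.0 ** -(1 + (k - 1) // 10_000)

def q_update(Q, counts, s, a, r, s_next, actions, alpha_fn, gamma=0.9):
    """One Q-learning update using a separate count k for each (s, a) pair."""
    counts[(s, a)] += 1
    k = counts[(s, a)]
    # Standard Q-learning target: reward plus discounted best next value.
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha_fn(k) * (target - Q[(s, a)])
    return Q[(s, a)]

# Hypothetical usage: tables default to 0 for unseen pairs.
Q = defaultdict(float)
counts = defaultdict(int)
q_update(Q, counts, s=0, a="right", r=1.0, s_next=1,
         actions=["left", "right"], alpha_fn=alpha_ii)
```

Passing the schedule in as `alpha_fn` makes it easy to run the same domain once per schedule and compare the resulting Q-values.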

(a) Which of these will converge to the true Q-value in theory?
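
As a starting point for part (a), the usual sufficient conditions on step sizes from stochastic approximation can be checked against each schedule. The block below is a sketch of that check, not the book's worked answer; the standard convergence results for Q-learning also assume that every state–action pair is visited infinitely often.

```latex
% Sufficient step-size conditions:
\sum_{k=1}^{\infty} \alpha_k = \infty
\qquad\text{and}\qquad
\sum_{k=1}^{\infty} \alpha_k^{2} < \infty .

% (i)   \alpha_k = 1/k:       \sum 1/k = \infty,        \sum 1/k^2 = \pi^2/6 < \infty    (both hold)
% (ii)  \alpha_k = 10/(9+k):  \sum 10/(9+k) = \infty,   \sum 100/(9+k)^2 < \infty        (both hold)
% (iii) \alpha_k = 0.1:       \sum 0.1 = \infty,        \sum 0.01 = \infty               (second fails)
% (iv)  blocks of 10^4 steps: \sum \alpha_k = 10^{4}\sum_{j\ge 1} 10^{-j} = 10^{4}/9 < \infty  (first fails)
```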

(b) Which converges to the true Q-value in practice (i.e., in a reasonable number of steps)? Try it for more than one domain.

(c) Which are able to adapt if the environment changes slowly?
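
One rough way to explore part (c) is to track a single slowly drifting value with each schedule. The sketch below is a deliberate simplification, not a Q-learning domain: a noiseless scalar target stands in for one Q-value, and the function name, drift rate, and step count are assumptions chosen for illustration.

```python
def final_tracking_error(alpha_fn, steps=30_000, drift=0.001):
    """Follow a linearly drifting target; return the final absolute error."""
    estimate = 0.0
    target = 0.0
    for k in range(1, steps + 1):
        target = drift * k  # the "true" value changes slowly over time
        estimate += alpha_fn(k) * (target - estimate)
    return abs(target - estimate)

# The four schedules from the question, keyed by their labels.
schedules = {
    "i": lambda k: 1.0 / k,
    "ii": lambda k: 10.0 / (9 + k),
    "iii": lambda k: 0.1,
    "iv": lambda k: 10.0 ** -(1 + (k - 1) // 10_000),
}

errors = {name: final_tracking_error(fn) for name, fn in schedules.items()}
for name, err in errors.items():
    print(f"({name}) final error: {err:.4f}")
```

In this toy setting the constant step size stays close to the moving target, while αk = 1/k reduces to a running average of all past targets and falls well behind. A real experiment should repeat the comparison on actual domains with noisy rewards.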

Related Book:

Artificial Intelligence: Foundations Of Computational Agents

ISBN: 9781009258197

3rd Edition

Authors: David L. Poole, Alan K. Mackworth

Question Posted: Oct 22, 2024 03:01 AM