Question:
Compare the different parameter settings for Q-learning for the game of Example 13.2 (page 585) (the “monster game” in AIPython (aipython.org)).
In particular, compare the following situations:
(i) α varies, with step size(c) = 1/c, and the Q-values are initialized to 0.0.
(ii) α varies, with step size(c) = 10/(9 + c), and the Q-values are initialized to 0.0.
(iii) α varies (using whichever of (i) and (ii) is better) and the Q-values are initialized to 5.0.
(iv) α is fixed to 0.1 and the Q-values are initialized to 0.0.
(v) α is fixed to 0.1 and the Q-values are initialized to 5.0.
(vi) Some other parameter settings.
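The settings above can be sketched as a step-size function paired with an initial Q-value. This is a minimal illustration, not the AIPython API: all names are hypothetical, c is assumed to be the number of times a state-action pair has been updated (starting at 1), and the discount of 0.9 is an assumed value. Setting (iii) is left out because it depends on which of (i) and (ii) turns out better.

```python
# Hypothetical helper names; none of this is the AIPython API.

def alpha_inverse(c):
    """Setting (i): step size 1/c, where c is the visit count (assumed c >= 1)."""
    return 1 / c

def alpha_slow_decay(c):
    """Setting (ii): step size 10/(9 + c); starts at 1 and decays more slowly than 1/c."""
    return 10 / (9 + c)

def alpha_fixed(c):
    """Settings (iv) and (v): constant step size 0.1, ignoring c."""
    return 0.1

# (step-size function, initial Q-value) for the fully specified settings.
SETTINGS = {
    "i": (alpha_inverse, 0.0),
    "ii": (alpha_slow_decay, 0.0),
    "iv": (alpha_fixed, 0.0),
    "v": (alpha_fixed, 5.0),
}

def q_update(q, visits, s, a, r, s_next, actions, alpha_fun, q_init=0.0, discount=0.9):
    """One tabular Q-learning update; discount=0.9 is an assumed value."""
    visits[(s, a)] = visits.get((s, a), 0) + 1
    alpha = alpha_fun(visits[(s, a)])
    old = q.get((s, a), q_init)  # unseen pairs take the initial Q-value
    best_next = max(q.get((s_next, b), q_init) for b in actions)
    q[(s, a)] = old + alpha * (r + discount * best_next - old)
    return q[(s, a)]
```

With Q-values initialized to 5.0, an unvisited pair looks better than it probably is, so the first updates pull the estimate down; with 0.0 there is no such optimistic bias.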
For each of these, carry out multiple runs and compare:
(a) the distributions of minimum values
(b) the zero crossing
(c) the asymptotic slope for the policy that includes exploration
(d) the asymptotic slope for the policy that does not include exploration (to test this, after the algorithm has explored, set the exploitation parameter to 100% and run additional steps).
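The measures (a) to (d) can be computed from a recorded series of cumulative reward per step. The sketch below assumes each run is stored as a plain list of cumulative rewards. The function names and the exact definitions are one reasonable reading, not taken from AIPython: the zero crossing is the first step at or after the minimum where the cumulative reward is non-negative, and the asymptotic slope is the average reward per step over a final window.

```python
# Hypothetical metric helpers for one run, given as a list of cumulative rewards.

def minimum_value(cumulative):
    """(a) Lowest cumulative reward reached during the run."""
    return min(cumulative)

def zero_crossing(cumulative):
    """(b) First step at or after the minimum with cumulative reward >= 0, else None."""
    i_min = cumulative.index(min(cumulative))
    for i in range(i_min, len(cumulative)):
        if cumulative[i] >= 0:
            return i
    return None

def asymptotic_slope(cumulative, tail):
    """(c)/(d) Average reward per step over the final `tail` steps."""
    if not 0 < tail < len(cumulative):
        raise ValueError("tail must be between 1 and len(cumulative) - 1")
    return (cumulative[-1] - cumulative[-1 - tail]) / tail

def summarize_runs(runs, tail):
    """Collect the three measures for each run so their distributions can be compared."""
    return [
        {
            "min": minimum_value(run),
            "zero_crossing": zero_crossing(run),
            "slope": asymptotic_slope(run, tail),
        }
        for run in runs
    ]
```

For (c), apply `asymptotic_slope` to the end of the series recorded while exploring; for (d), apply it to the additional steps recorded after exploration is switched off.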
Which of these settings would you recommend? Why?
Step by Step Answer:
Artificial Intelligence: Foundations Of Computational Agents
ISBN: 9781009258197
3rd Edition
Authors: David L. Poole, Alan K. Mackworth