Question:
Exercise 11.8 Compare the different parameter settings for the game of Example 11.8 (page 464). In particular compare the following situations:
(a) α varies, and the Q-values are initialized to 0.0.
(b) α varies, and the Q-values are initialized to 5.0.
(c) α is fixed to 0.1, and the Q-values are initialized to 0.0.
(d) α is fixed to 0.1, and the Q-values are initialized to 5.0.
(e) Some other parameter settings.
For each of these, carry out multiple runs and compare the distributions of the minimum value and the zero crossing of the cumulative reward, the asymptotic slope for the policy that includes exploration, and the asymptotic slope for the policy that does not include exploration.
To do the last task, after the algorithm has converged, set the exploitation parameter to 100% and run a large number of additional steps.
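One way to organize such an experiment is sketched below in Python. This is only an illustration under stated assumptions, not the book's code or a solution to the exercise. The environment in `step` is a hypothetical five-state chain standing in for the game of Example 11.8, which is not reproduced here. "α varies" is assumed to mean α = 1/k, where k counts the visits to a state-action pair. The names `run` and `summarize` and the values of `epsilon`, `gamma`, and the step counts are illustrative choices. Learning is left switched on during the greedy phase.

```python
import random
import statistics

N_STATES, ACTIONS = 5, (0, 1)  # hypothetical chain: action 0 = left, 1 = right

def step(state, action):
    """Toy stand-in environment (an assumption, NOT the game of Example 11.8)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    if nxt == N_STATES - 1:
        return 0, 10.0   # goal reached: reward 10, restart at state 0
    return nxt, -1.0     # every other move costs 1

def summarize(cumulative, explore_steps):
    """Extract the four statistics from a cumulative-reward trace."""
    explore = cumulative[:explore_steps]
    low = min(explore)
    i_low = explore.index(low)
    # First step at or after the minimum where cumulative reward is back to >= 0.
    crossing = next((i for i in range(i_low, len(explore)) if explore[i] >= 0), None)
    half = explore_steps // 2
    return {
        "minimum": low,
        "zero_crossing": crossing,
        # Average reward per step over the second half of the exploring phase.
        "slope_explore": (explore[-1] - explore[half]) / (explore_steps - 1 - half),
        # Average reward per step over the purely greedy phase.
        "slope_greedy": (cumulative[-1] - explore[-1]) / (len(cumulative) - explore_steps),
    }

def run(alpha, q_init, explore_steps=5000, greedy_steps=5000,
        epsilon=0.2, gamma=0.9, seed=0):
    """Q-learning; alpha=None means alpha = 1/k (assumed meaning of 'alpha varies')."""
    rng = random.Random(seed)
    q = {(s, a): q_init for s in range(N_STATES) for a in ACTIONS}
    visits = dict.fromkeys(q, 0)
    state, total, cumulative = 0, 0.0, []
    for t in range(explore_steps + greedy_steps):
        if t < explore_steps and rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore
        else:
            best = max(q[state, a] for a in ACTIONS)
            action = rng.choice([a for a in ACTIONS if q[state, a] == best])
        nxt, reward = step(state, action)
        visits[state, action] += 1
        a_k = alpha if alpha is not None else 1.0 / visits[state, action]
        target = reward + gamma * max(q[nxt, a] for a in ACTIONS)
        q[state, action] += a_k * (target - q[state, action])
        state = nxt
        total += reward
        cumulative.append(total)
    return summarize(cumulative, explore_steps)

if __name__ == "__main__":
    settings = {"(a)": (None, 0.0), "(b)": (None, 5.0),
                "(c)": (0.1, 0.0), "(d)": (0.1, 5.0)}
    for label, (alpha, q_init) in settings.items():
        results = [run(alpha, q_init, seed=s) for s in range(5)]
        medians = {key: statistics.median(r[key] for r in results)
                   for key in ("minimum", "slope_explore", "slope_greedy")}
        crossings = [r["zero_crossing"] for r in results]
        print(label, medians, "zero crossings:", crossings)
```

Each call to `run` gives one sample of the four statistics, so repeating it over many seeds gives the distributions the exercise asks about. To use the sketch for the actual exercise, `step` would need to be replaced by an implementation of the game from Example 11.8.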
Artificial Intelligence Foundations Of Computational Agents
ISBN: 9780521519007
1st Edition
Authors: David L. Poole, Alan K. Mackworth