Question:
Exercise 11.7 For the plot of the total reward as a function of time as in Figure 11.12 (page 474), the minimum and zero crossing are only meaningful statistics when balancing positive and negative rewards is reasonable behavior. Suggest what should replace these statistics when zero is not an appropriate definition of reasonable behavior. [Hint: Think about the cases that have only positive reward or only negative reward.]
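To make the two statistics in the question concrete, here is a minimal sketch of how they could be read off a sequence of per-step rewards. It assumes the "zero crossing" means the first step after the minimum at which the accumulated reward is back at or above zero. The function names and the `baseline_per_step` parameter are hypothetical and not from the book; the parameter is only one possible reading of the hint, measuring the total against a reference reward rate instead of against zero.

```python
from itertools import accumulate


def cumulative_reward(rewards):
    """Return the running total of reward after each step."""
    return list(accumulate(rewards))


def minimum_and_crossing(rewards, baseline_per_step=0.0):
    """Return (minimum, crossing_step) of the total reward relative to a baseline.

    baseline_per_step is a hypothetical reference reward rate (an assumption,
    not the book's definition). With the default of 0.0 this reduces to the
    plain minimum and zero crossing. Steps are numbered from 1.
    """
    totals = cumulative_reward(rewards)
    # Subtract what the baseline would have accumulated by each step.
    relative = [total - baseline_per_step * t
                for t, total in enumerate(totals, start=1)]
    minimum = min(relative)
    if minimum >= 0:
        # The total never dropped below the baseline, so there is no crossing.
        return minimum, None
    min_step = relative.index(minimum) + 1
    # Look for the first step after the minimum that is back at or above zero.
    for t in range(min_step + 1, len(relative) + 1):
        if relative[t - 1] >= 0:
            return minimum, t
    return minimum, None


if __name__ == "__main__":
    mixed = [-1, -2, -1, 1, 2, 3, 2]        # positive and negative rewards
    print(minimum_and_crossing(mixed))
    positive_only = [1, 1, 2, 4, 5, 6]      # rewards that are never negative
    print(minimum_and_crossing(positive_only))
    print(minimum_and_crossing(positive_only, baseline_per_step=3.0))
```

With only positive rewards and the default baseline, the total never drops below zero, so the function reports no crossing; this is the situation the hint points at.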
Step by Step Answer:
Related Book For
Artificial Intelligence Foundations Of Computational Agents
ISBN: 9780521519007
1st Edition
Authors: David L. Poole, Alan K. Mackworth