Question:

Exercise 9.2 Consider the following decision network:

This diagram models a decision about whether to cheat at two different time instances.
Suppose P(watched) = 0.4, P(trouble1 | cheat1, watched) = 0.8, and Trouble1 is true with probability 0 for the other cases. Suppose the conditional probability P(Trouble2 | Cheat2, Trouble1, Watched) is given by the following table:

Cheat2  Trouble1  Watched  P(Trouble2 = t)
t       t         t        1.0
t       t         f        0.3
t       f         t        0.8
t       f         f        0.0
f       t         t        0.3
f       t         f        0.3
f       f         t        0.0
f       f         f        0.0

Suppose the utility is given by:

Trouble2  Cheat2  Utility
t         t       30
t         f       0
f         t       100
f         f       70

(a) What is an optimal decision function for the variable Cheat2? Show what factors are created. Please try to do it by hand, and then check it with the AIspace.org applet.

(b) What is an optimal policy? What is the value of an optimal policy? Show the tables created.

(c) What is an optimal policy if the probability of being watched goes up?

(d) What is an optimal policy when the rewards for cheating are reduced?

(e) What is an optimal policy when the instructor is less forgiving (or less forgetful) of previous cheating?
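
A minimal sketch, in Python, of how the hand computation for parts (a) and (b) could be checked numerically. Because the diagram is not reproduced here, the sketch assumes that Cheat2 is chosen after observing Cheat1 and Trouble1, that Watched is never observed, and that the utility depends only on Trouble2 and Cheat2 as in the table. The names cheat2_factor and solve are illustrative only; they do not come from the exercise or from the AIspace applet.

```python
from itertools import product

BOOL = (True, False)
P_WATCHED = 0.4  # P(watched), from the exercise statement


def p_trouble1(cheat1, watched):
    """P(Trouble1 = true | Cheat1, Watched): 0.8 only if cheating while watched."""
    return 0.8 if (cheat1 and watched) else 0.0


# P(Trouble2 = true | Cheat2, Trouble1, Watched), keyed in that order.
P_TROUBLE2 = {
    (True, True, True): 1.0, (True, True, False): 0.3,
    (True, False, True): 0.8, (True, False, False): 0.0,
    (False, True, True): 0.3, (False, True, False): 0.3,
    (False, False, True): 0.0, (False, False, False): 0.0,
}

# Utility keyed by (Trouble2, Cheat2).
UTILITY = {(True, True): 30, (True, False): 0,
           (False, True): 100, (False, False): 70}


def cheat2_factor(p_watched=P_WATCHED):
    """Sum out Watched and Trouble2, leaving a factor on (Cheat1, Trouble1, Cheat2)."""
    factor = {}
    for c1, t1, c2 in product(BOOL, repeat=3):
        total = 0.0
        for w, t2 in product(BOOL, repeat=2):
            pw = p_watched if w else 1.0 - p_watched
            pt1 = p_trouble1(c1, w) if t1 else 1.0 - p_trouble1(c1, w)
            p2 = P_TROUBLE2[(c2, t1, w)]
            pt2 = p2 if t2 else 1.0 - p2
            total += pw * pt1 * pt2 * UTILITY[(t2, c2)]
        factor[(c1, t1, c2)] = total
    return factor


def solve(p_watched=P_WATCHED):
    """Return (Cheat2 decision function, Cheat1 choice, policy value, factor).

    ASSUMPTION: Cheat2 may depend on Cheat1 and Trouble1 only.
    """
    factor = cheat2_factor(p_watched)
    cheat2_policy, after_cheat2 = {}, {}
    # Maximise over Cheat2 in each observable context (Cheat1, Trouble1).
    # Contexts with probability 0 score 0 for both choices; ties pick True.
    for c1, t1 in product(BOOL, repeat=2):
        best = max(BOOL, key=lambda c2: factor[(c1, t1, c2)])
        cheat2_policy[(c1, t1)] = best
        after_cheat2[(c1, t1)] = factor[(c1, t1, best)]
    # Sum out Trouble1, then maximise over Cheat1.
    cheat1_values = {c1: sum(after_cheat2[(c1, t1)] for t1 in BOOL)
                     for c1 in BOOL}
    cheat1 = max(BOOL, key=cheat1_values.get)
    return cheat2_policy, cheat1, cheat1_values[cheat1], factor


if __name__ == "__main__":
    policy, cheat1, value, factor = solve()
    for key in sorted(factor, reverse=True):
        print("Cheat1, Trouble1, Cheat2 =", key, "->", round(factor[key], 4))
    print("Cheat2 decision function:", policy)
    print("Cheat1:", cheat1, "value:", round(value, 4))
```

Under the same assumptions, calling solve with a larger p_watched gives one way to explore part (c), while parts (d) and (e) could be explored by editing UTILITY or P_TROUBLE2 and rerunning.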
