Show that if the reward function $r$ of a Markov decision problem is bounded in absolute value

Question:

Show that if the reward function $r$ of a Markov decision problem is bounded in absolute value by a constant $c$, then for any policy $\mathbf{u}$, the infinite horizon discounted value function of $\mathbf{u}$ with discount factor $\alpha$ is bounded in absolute value by $c /(1-\alpha)$.
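
One way to see the bound is sketched here. The question does not fix a notation, so the following are assumptions: $0 \le \alpha < 1$, the state and action at time $t$ are written $x_t$ and $u_t$, and the discounted value of policy $\mathbf{u}$ from initial state $x$ is $V_{\mathbf{u}}(x) = \mathbb{E}_{\mathbf{u}}\left[\sum_{t=0}^{\infty} \alpha^t r(x_t, u_t) \mid x_0 = x\right]$. Under these assumptions, the triangle inequality and the geometric series give:

$$
\begin{aligned}
\left|V_{\mathbf{u}}(x)\right|
&= \left|\,\mathbb{E}_{\mathbf{u}}\left[\sum_{t=0}^{\infty} \alpha^t r(x_t, u_t) \,\middle|\, x_0 = x\right]\right| \\
&\le \mathbb{E}_{\mathbf{u}}\left[\sum_{t=0}^{\infty} \alpha^t \left|r(x_t, u_t)\right| \,\middle|\, x_0 = x\right] \\
&\le \sum_{t=0}^{\infty} \alpha^t c \\
&= \frac{c}{1-\alpha}.
\end{aligned}
$$

The second inequality uses $|r| \le c$ at every step, and the last equality requires $\alpha < 1$ so that the geometric series converges. Because the sum is dominated termwise by $c\,\alpha^t$, it converges absolutely, so the value function is well defined and the bound holds for every initial state $x$ and every policy $\mathbf{u}$.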
