Infinite dynamic programming
Theorem
Under the above conditions there is a stationary optimal strategy π. The
value function solves the Bellman equation:
\[ V(s) = \max_{a \in \Phi(s)} \bigl( r(s,a) + \beta V(f(s,a)) \bigr). \]
Since r is bounded, the value function is also bounded. Conversely, if a
bounded function solves the equation, then it is the value function of the problem.
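The uniqueness statement suggests a simple numerical check: iterate the right-hand side of the Bellman equation from any bounded starting guess until it stops changing. The following is a minimal sketch, not the method of these slides. It assumes a finite state set and a discount factor β in (0, 1), and the two-state model in it (`states`, `feasible_actions`, `reward`, `transition`) is a hypothetical toy example standing in for Φ, r and f.

```python
def value_iteration(states, feasible_actions, reward, transition, beta,
                    tol=1e-10, max_iter=10_000):
    """Iterate V <- max_a [ r(s,a) + beta * V(f(s,a)) ] until the update is below tol."""
    V = {s: 0.0 for s in states}  # any bounded starting guess works
    for _ in range(max_iter):
        V_new = {
            s: max(reward(s, a) + beta * V[transition(s, a)]
                   for a in feasible_actions(s))
            for s in states
        }
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new
    return V


# Hypothetical toy model: two states, action 0 = stay, action 1 = switch.
states = [0, 1]

def feasible_actions(s):   # plays the role of Phi(s)
    return [0, 1]

def reward(s, a):          # plays the role of r(s, a); bounded by 1
    return 1.0 if s == 1 else 0.0

def transition(s, a):      # plays the role of f(s, a)
    return s if a == 0 else 1 - s

V = value_iteration(states, feasible_actions, reward, transition, beta=0.5)
print(V)
```

For this toy model the equation can also be solved by hand: staying in state 1 forever gives V(1) = 1 + 0.5 V(1), so V(1) = 2, and switching from state 0 gives V(0) = 0.5 V(1) = 1. The iteration converges to these values.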
Definition
A Markovian strategy is stationary if it is independent of the time
parameter t.
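To illustrate the definition, a stationary strategy can be represented as a map from states to actions with no time argument. The sketch below is a hedged illustration on a hypothetical two-state model (the names `reward`, `transition`, `feasible_actions` and the values in `V` are assumptions chosen for the example, with β = 0.5). It reads off a strategy by taking, in each state, a maximizer of the right-hand side of the Bellman equation, and then applies that same rule in every period.

```python
beta = 0.5
states = [0, 1]

def feasible_actions(s):   # hypothetical Phi(s): 0 = stay, 1 = switch
    return [0, 1]

def reward(s, a):          # hypothetical r(s, a)
    return 1.0 if s == 1 else 0.0

def transition(s, a):      # hypothetical f(s, a)
    return s if a == 0 else 1 - s

# Values satisfying the Bellman equation for this toy model:
# V(1) = 1 + 0.5 * V(1) = 2 and V(0) = 0.5 * V(1) = 1.
V = {0: 1.0, 1: 2.0}

# A stationary strategy: the action depends on the state s only, not on t.
pi = {
    s: max(feasible_actions(s),
           key=lambda a: reward(s, a) + beta * V[transition(s, a)])
    for s in states
}

# Apply the same rule pi in every period t and accumulate discounted rewards.
s, total = 0, 0.0
for t in range(60):
    a = pi[s]              # no dependence on t
    total += beta ** t * reward(s, a)
    s = transition(s, a)

print(pi, total)
```

In this example the strategy switches out of state 0 and stays in state 1, and the discounted reward collected from state 0 matches V(0) up to truncation of the horizon.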