552 Dynamic Programming
Consider the first subproblem by starting at the final stage,i=1. If the input to this
stages 2 is specified, then according to the principle of optimality,x 1 must be selected
to optimizeR 1. Irrespective of what happens to the other stages,x 1 must be selected
such thatR 1 (x 1 , s 2 ) s an optimum for the inputi s 2. If the optimum is denoted asf 1 ∗,
we have
f 1 ∗(s 2 ) =opt
x 1
[R 1 (x 1 , s 2 )] (9.11)
This is called aone-stage policysince once the input states 2 is specified, the optimal
values ofR 1 ,x 1 , ands 1 are completely defined. Thus Eq. (9.11) is a parametric equation
giving the optimumf 1 ∗as a function of the input parameters 2.
Next, consider the second subproblem by grouping the last two stages together.
Iff 2 ∗denotes the optimum objective value of the second subproblem for a specified
value of the inputs 3 , we have
f 2 ∗(s 3 ) =opt
x 1 ,x 2
[R 2 (x 2 , s 3 )+R 1 (x 1 , s 2 )] (9.12)
The principle of optimality requires thatx 1 be selected so as to optimizeR 1 for a given
s 2. Sinces 2 can be obtained oncex 2 ands 3 are specified, Eq. (9.12) can be written
as
f 2 ∗(s 3 ) =opt
x 2
[R 2 (x 2 , s 3 )+f 1 ∗(s 2 )] (9.13)
Thusf 2 ∗represents the optimal policy for thetwo-stage subproblem. It can be seen
that the principle of optimality reduced the dimensionality of the problem from two
[in Eq. (9.12)] to one [in Eq. (9.13)]. This can be seen more clearly by rewriting
Eq. (9.13) using Eq. (9.10) as
f 2 ∗(s 3 ) =opt
x 2
[R 2 (x 2 , s 3 )+f 1 ∗{t 2 (x 2 , s 3 )}] (9.14)
In this form it can be seen that for a specified inputs 3 , the optimum is determined solely
by a suitable choice of the decision variablex 2. Thus the optimization problem stated
in Eq. (9.12), in which bothx 2 andx 1 are to be simultaneously varied to produce the
optimumf 2 ∗, is reduced to two subproblems defined by Eqs. (9.11) and (9.13). Since
the optimization of each of these subproblems involves only a single decision variable,
the optimization is, in general, much simpler.
This idea can be generalized and theith subproblem defined by
fi∗(si+ 1 ) = opt
xi,xi− 1 ,...,x 1
[Ri(xi, si+ 1 )+Ri− 1 (xi− 1 , si) +· · · +R 1 (x 1 , s 2 ) ] (9.15)
which can be written as
fi∗(si+ 1 ) =opt
xi
[Ri(xi, si+ 1 )+fi∗− 1 (si)] (9.16)
wherefi∗− 1 denotes the optimal value of the objective function correspo nding to the last
i−1 stages, andsiis the input to the stagei− 1. The original problem in Eq. (9.15)
requires the simultaneous variation ofidecision variables,x 1 , x 2 ,... , xi, to determine