6.13 Quasi-Newton Methods
The basic iterative process used in Newton's method is given by Eq. (6.86):
$X_{i+1} = X_i - [J_i]^{-1}\, \nabla f(X_i)$  (6.93)
where the Hessian matrix $[J_i]$ is composed of the second partial derivatives of $f$ and varies with the design vector $X_i$ for a nonquadratic (general nonlinear) objective function $f$. The basic idea behind the quasi-Newton or variable metric methods is to approximate either $[J_i]$ by another matrix $[A_i]$, or $[J_i]^{-1}$ by another matrix $[B_i]$, using only the first partial derivatives of $f$. If $[J_i]^{-1}$ is approximated by $[B_i]$, Eq. (6.93) can be expressed as
$X_{i+1} = X_i - \lambda_i^* [B_i]\, \nabla f(X_i)$  (6.94)
where $\lambda_i^*$ can be considered as the optimal step length along the direction

$S_i = -[B_i]\, \nabla f(X_i)$  (6.95)
It can be seen that the steepest descent method is obtained as a special case of Eq. (6.95) by setting $[B_i] = [I]$.
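As an illustration, the iterative scheme of Eqs. (6.94) and (6.95) can be sketched in a few lines of code. The following is a minimal sketch, not the book's algorithm: it assumes NumPy and SciPy are available, and the names `quasi_newton_step`, `f`, `grad_f`, and `B_i` are illustrative. Passing the identity matrix for `B_i` reproduces the steepest descent step noted above.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def quasi_newton_step(f, grad_f, X_i, B_i):
    """One iteration of Eq. (6.94), with a one-dimensional line search."""
    S_i = -B_i @ grad_f(X_i)                      # search direction, Eq. (6.95)
    res = minimize_scalar(lambda lam: f(X_i + lam * S_i))
    lam_star = res.x                              # optimal step length lambda_i*
    return X_i + lam_star * S_i                   # update, Eq. (6.94)

# With [B_i] = [I], the step reduces to steepest descent
f = lambda X: X[0]**2 + 2 * X[1]**2
grad_f = lambda X: np.array([2 * X[0], 4 * X[1]])
X_next = quasi_newton_step(f, grad_f, np.array([1.0, 1.0]), np.eye(2))
```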
Computation of $[B_i]$. To implement Eq. (6.94), an approximate inverse of the Hessian matrix, $[B_i] \equiv [A_i]^{-1}$, is to be computed. For this, we first expand the gradient of $f$ about an arbitrary reference point, $X_0$, using Taylor's series as
$\nabla f(X) \approx \nabla f(X_0) + [J_0](X - X_0)$  (6.96)
If we pick two points $X_i$ and $X_{i+1}$ and use $[A_i]$ to approximate $[J_0]$, Eq. (6.96) can be rewritten as

$\nabla f_{i+1} = \nabla f(X_0) + [A_i](X_{i+1} - X_0)$  (6.97)

$\nabla f_i = \nabla f(X_0) + [A_i](X_i - X_0)$  (6.98)
Subtracting Eq. (6.98) from (6.97) yields
$[A_i]\, d_i = g_i$  (6.99)
where
$d_i = X_{i+1} - X_i$  (6.100)

$g_i = \nabla f_{i+1} - \nabla f_i$  (6.101)
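Since the expansion (6.96) is exact when $f$ is quadratic, Eq. (6.99) then holds exactly with $[A_i] = [J_0]$. The short check below verifies Eqs. (6.99)–(6.101) numerically; the quadratic test function and the two points are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Quadratic test function f(X) = 0.5 X^T A X + b^T X, whose Hessian is A
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([-1.0, 2.0])
grad = lambda X: A @ X + b

X_i   = np.array([0.0, 0.0])
X_ip1 = np.array([1.0, -1.0])
d_i = X_ip1 - X_i                  # Eq. (6.100)
g_i = grad(X_ip1) - grad(X_i)      # Eq. (6.101)

# Eq. (6.99): [A_i] d_i = g_i holds exactly with [A_i] = A for a quadratic
assert np.allclose(A @ d_i, g_i)
```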
The solution of Eq. (6.99) for $d_i$ can be written as

$d_i = [B_i]\, g_i$  (6.102)

where $[B_i] = [A_i]^{-1}$ denotes an approximation to the inverse of the Hessian matrix, $[J_0]^{-1}$. It can be seen that Eq. (6.102) represents a system of $n$ equations in the $n^2$ unknown elements of the matrix $[B_i]$. Thus for $n > 1$, the choice of $[B_i]$ is not unique, and one would like to choose a $[B_i]$ that is closest to $[J_0]^{-1}$ in some sense. Numerous techniques have been suggested in the literature for the computation of $[B_i]$ as the iterative process progresses.
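Before turning to those techniques, a short sketch can make the non-uniqueness concrete: for $n = 2$ it constructs two different matrices that both satisfy Eq. (6.102). The rank-one construction and the specific vectors are illustrative assumptions, not taken from the text.

```python
import numpy as np

d_i = np.array([1.0, -1.0])
g_i = np.array([3.0, -2.0])

# One solution of [B_i] g_i = d_i, Eq. (6.102): a rank-one matrix
B1 = np.outer(d_i, g_i) / (g_i @ g_i)

# Adding any term w v^T with v orthogonal to g_i preserves Eq. (6.102),
# so for n > 1 infinitely many matrices [B_i] satisfy it
w = np.array([0.5, 0.5])
v = np.array([2.0, 3.0])           # v @ g_i = 2*3 + 3*(-2) = 0
B2 = B1 + np.outer(w, v)

assert np.allclose(B1 @ g_i, d_i)
assert np.allclose(B2 @ g_i, d_i)
```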