6.12 Marquardt Method
The steepest descent method reduces the function value when the design vector $X_i$ is away from the optimum point $X^*$. The Newton method, on the other hand, converges fast when the design vector $X_i$ is close to the optimum point $X^*$. The Marquardt method [6.15] attempts to take advantage of both the steepest descent and Newton methods. This method modifies the diagonal elements of the Hessian matrix, $[J_i]$, as

$$[\tilde{J}_i] = [J_i] + \alpha_i [I] \qquad (6.88)$$
where $[I]$ is an identity matrix and $\alpha_i$ is a positive constant that ensures the positive definiteness of $[\tilde{J}_i]$ when $[J_i]$ is not positive definite. It can be noted that when $\alpha_i$ is sufficiently large (on the order of $10^4$), the term $\alpha_i [I]$ dominates $[J_i]$ and the inverse of the matrix $[\tilde{J}_i]$ becomes

$$[\tilde{J}_i]^{-1} = \bigl[ [J_i] + \alpha_i [I] \bigr]^{-1} \approx \bigl[ \alpha_i [I] \bigr]^{-1} = \frac{1}{\alpha_i} [I] \qquad (6.89)$$
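As a quick numerical check of Eq. (6.89), the sketch below compares the exact inverse of $[J] + \alpha [I]$ with $(1/\alpha)[I]$ for a large $\alpha$; the matrix $J$ is an arbitrary example, not taken from the text.

```python
import numpy as np

J = np.array([[4.0, 1.0],
              [1.0, 3.0]])   # arbitrary symmetric Hessian, for illustration only
alpha = 1.0e4

exact = np.linalg.inv(J + alpha * np.eye(2))   # [J~]^(-1)
approx = np.eye(2) / alpha                     # (1/alpha)[I], Eq. (6.89)

# The elementwise error is on the order of 1e-8, negligible
# next to the entries of size 1/alpha = 1e-4.
print(np.max(np.abs(exact - approx)))
```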
Thus if the search direction $S_i$ is computed as

$$S_i = -[\tilde{J}_i]^{-1} \nabla f_i \qquad (6.90)$$
$S_i$ becomes a steepest descent direction for large values of $\alpha_i$. In the Marquardt method, the value of $\alpha_i$ is taken to be large at the beginning and then reduced to zero gradually as the iterative process progresses. Thus as the value of $\alpha_i$ decreases from a large value to zero, the characteristics of the search method change from those of the steepest descent method to those of the Newton method. The iterative process of a modified version of the Marquardt method can be described as follows (a code sketch is given after the steps).
1. Start with an arbitrary initial point $X_1$ and constants $\alpha_1$ (on the order of $10^4$), $c_1$ ($0 < c_1 < 1$), $c_2$ ($c_2 > 1$), and $\varepsilon$ (on the order of $10^{-2}$). Set the iteration number as $i = 1$.
2. Compute the gradient of the function, $\nabla f_i = \nabla f(X_i)$.
3. Test for the optimality of the point $X_i$. If $\|\nabla f_i\| = \|\nabla f(X_i)\| \le \varepsilon$, $X_i$ is optimum and hence stop the process. Otherwise, go to step 4.
4. Find the new vector $X_{i+1}$ as
$$X_{i+1} = X_i + S_i = X_i - \bigl[ [J_i] + \alpha_i [I] \bigr]^{-1} \nabla f_i \qquad (6.91)$$
5. Compare the values of $f_{i+1}$ and $f_i$. If $f_{i+1} < f_i$, go to step 6. If $f_{i+1} \ge f_i$, go to step 7.
6. Set $\alpha_{i+1} = c_1 \alpha_i$, $i = i + 1$, and go to step 2.
7. Set $\alpha_i = c_2 \alpha_i$ and go to step 4.
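The following is a minimal sketch of steps 1 to 7 in Python with NumPy. The function names `f`, `grad`, and `hess`, the default values of $c_1$ and $c_2$, and the Rosenbrock test problem are illustrative assumptions, not part of the text; the linear system of Eq. (6.91) is also solved directly rather than by forming the matrix inverse.

```python
import numpy as np

def marquardt(f, grad, hess, x1, alpha1=1.0e4, c1=0.5, c2=2.0,
              eps=1.0e-2, max_iter=500):
    """Sketch of the modified Marquardt method (steps 1-7)."""
    x = np.asarray(x1, dtype=float)                  # step 1
    alpha = alpha1
    for _ in range(max_iter):
        g = grad(x)                                  # step 2
        if np.linalg.norm(g) <= eps:                 # step 3: optimality test
            break
        while True:
            # Step 4, Eq. (6.91): solve ([J_i] + alpha_i [I]) S_i = -grad f_i
            s = np.linalg.solve(hess(x) + alpha * np.eye(x.size), -g)
            if f(x + s) < f(x):        # step 5
                x = x + s
                alpha *= c1            # step 6: reduce alpha, new iteration
                break
            alpha *= c2                # step 7: increase alpha, redo step 4
    return x

# Illustrative test: Rosenbrock's function, minimum at (1, 1).
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
hess = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0]**2), -400 * x[0]],
                           [-400 * x[0], 200.0]])
print(marquardt(f, grad, hess, [-1.2, 1.0]))         # approx. [1., 1.]
```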
An advantage of this method is the absence of the step size $\lambda_i$ along the search direction $S_i$. In fact, the algorithm above can be modified by introducing an optimal step length in Eq. (6.91) as

$$X_{i+1} = X_i + \lambda_i^* S_i = X_i - \lambda_i^* \bigl[ [J_i] + \alpha_i [I] \bigr]^{-1} \nabla f_i \qquad (6.92)$$
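The sketch below shows one iteration of Eq. (6.92), assuming SciPy's scalar minimizer as a stand-in for any of the one-dimensional minimization techniques discussed earlier; the helper name `marquardt_step` is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def marquardt_step(f, grad, hess, x, alpha):
    """One iteration of Eq. (6.92) with an optimal step length."""
    # Search direction S_i of Eq. (6.90)
    s = np.linalg.solve(hess(x) + alpha * np.eye(x.size), -grad(x))
    # lambda_i^*: minimize f along S_i by a one-dimensional search
    lam = minimize_scalar(lambda t: f(x + t * s)).x
    return x + lam * s
```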