Pattern Recognition and Machine Learning

(Jeff_L) #1

Appendix E Lagrange Multipliers


Lagrange multipliers, also sometimes calledundetermined multipliers, are used to
find the stationary points of a function of several variables subject to one or more
constraints.
Consider the problem of finding the maximum of a functionf(x 1 ,x 2 )subject to
a constraint relatingx 1 andx 2 , which we write in the form


g(x 1 ,x 2 )=0. (E.1)

One approach would be to solve the constraint equation (E.1) and thus expressx 2 as
a function ofx 1 in the formx 2 =h(x 1 ). This can then be substituted intof(x 1 ,x 2 )
to give a function ofx 1 alone of the formf(x 1 ,h(x 1 )). The maximum with respect
tox 1 could then be found by differentiation in the usual way, to give the stationary
valuex 1 , with the corresponding value ofx 2 given byx 2 =h(x 1 ).
One problem with this approach is that it may be difficult to find an analytic
solution of the constraint equation that allowsx 2 to be expressed as an explicit func-
tion ofx 1. Also, this approach treatsx 1 andx 2 differently and so spoils the natural
symmetry between these variables.
A more elegant, and often simpler, approach is based on the introduction of a
parameterλcalled a Lagrange multiplier. We shall motivate this technique from
a geometrical perspective. Consider aD-dimensional variablexwith components
x 1 ,...,xD. The constraint equationg(x)=0then represents a(D−1)-dimensional
surface inx-space as indicated in Figure E.1.
We first note that at any point on the constraint surface the gradient∇g(x)of
the constraint function will be orthogonal to the surface. To see this, consider a point
xthat lies on the constraint surface, and consider a nearby pointx+that also lies
on the surface. If we make a Taylor expansion aroundx,wehave


g(x+)g(x)+T∇g(x). (E.2)

Because bothxandx+lie on the constraint surface, we haveg(x)=g(x+)and
henceT∇g(x) 0. In the limit‖‖→ 0 we haveT∇g(x)=0, and becauseis


707
Free download pdf