Mathematical Methods for Physics and Engineering : A Comprehensive Guide

(Darren Dugan) #1



The other possibility is that $\lambda$ is an independent parameter and not a function of the parameters $\mathbf{a}$. In this case, the extended log-likelihood function is


$$\ln L = N\ln\lambda - \lambda + \sum_{i=1}^{N} \ln P(x_i|\mathbf{a}), \tag{31.89}$$

where we have omitted terms not depending on $\lambda$ or $\mathbf{a}$. Differentiating with respect to $\lambda$ and setting the result equal to zero, we find that the ML estimate of $\lambda$ is simply


$$\hat{\lambda} = N.$$
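
The differentiation step can be written out explicitly. Only the first two terms of (31.89) depend on $\lambda$, so the sum over $\ln P(x_i|\mathbf{a})$ drops out; the second-derivative check is added here for completeness and is not part of the original text.

```latex
\frac{\partial \ln L}{\partial \lambda} = \frac{N}{\lambda} - 1 = 0
\quad\Longrightarrow\quad \hat{\lambda} = N,
\qquad
\left.\frac{\partial^2 \ln L}{\partial \lambda^2}\right|_{\lambda = \hat{\lambda}}
= -\frac{N}{\hat{\lambda}^2} = -\frac{1}{N} < 0 .
```

The negative second derivative indicates that the stationary point is a maximum of $\ln L$ with respect to $\lambda$.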

By differentiating (31.89) with respect to the parameters $a_i$ and setting the results equal to zero, we obtain the usual ML estimates $\hat{a}_i$ of their values. In this case, however, the errors in our estimates will be larger, in general, than those in the standard likelihood approach, since they must include the effect of statistical uncertainty in the parameter $\lambda$.


31.6 The method of least squares

The method of least squares is, in fact, just a special case of the method of maximum likelihood. Nevertheless, it is so widely used as a method of parameter estimation that it has acquired a special name of its own. At the outset, let us suppose that a data sample consists of a set of pairs $(x_i, y_i)$, $i = 1, 2, \ldots, N$. For example, these data might correspond to the temperature $y_i$ measured at various points $x_i$ along some metal rod.


For the moment, we will suppose that the $x_i$ are known exactly, whereas there exists a measurement error (or noise) $n_i$ on each of the values $y_i$. Moreover, let us assume that the true value of $y$ at any position $x$ is given by some function $y = f(x; \mathbf{a})$ that depends on the $M$ unknown parameters $\mathbf{a}$. Then


$$y_i = f(x_i; \mathbf{a}) + n_i.$$

Our aim is to estimate the values of the parameters $\mathbf{a}$ from the data sample.
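
To make the data model concrete, the following is a minimal sketch that generates a synthetic sample of pairs $(x_i, y_i)$. Everything specific in it is an assumption made for illustration and does not come from the text: the straight-line form of `f`, the values in `a_true`, the noise level `sigma`, and the choice of independent zero-mean Gaussian noise.

```python
import numpy as np


def f(x, a):
    """Hypothetical model for illustration: a straight line a[0] + a[1] * x."""
    return a[0] + a[1] * x


rng = np.random.default_rng(seed=0)

a_true = np.array([20.0, 1.5])   # illustrative "true" parameters (M = 2)
x = np.linspace(0.0, 10.0, 8)    # positions x_i, treated as known exactly
sigma = 0.5                      # illustrative noise standard deviation

# Noise n_i: independent, zero-mean Gaussian (an assumption of this sketch).
n = rng.normal(loc=0.0, scale=sigma, size=x.size)

# Data model: y_i = f(x_i; a) + n_i
y = f(x, a_true) + n
```

In a real experiment only `x` and `y` would be available; `a_true` and `n` are visible here only because the sample is simulated.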


Bearing in mind the central limit theorem, let us suppose that the $n_i$ are drawn from a Gaussian distribution with no systematic bias and hence zero mean. In the most general case the measurement errors $n_i$ might not be independent but be described by an $N$-dimensional multivariate Gaussian with non-trivial covariance matrix $\mathsf{N}$, whose elements $N_{ij} = \mathrm{Cov}[n_i, n_j]$ we assume to be known. Under these assumptions it follows from (30.148) that the likelihood function is


$$L(\mathbf{x}, \mathbf{y}; \mathbf{a}) = \frac{1}{(2\pi)^{N/2}\,|\mathsf{N}|^{1/2}} \exp\left[-\tfrac{1}{2}\chi^2(\mathbf{a})\right],$$
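
A minimal numerical sketch of this likelihood follows. It assumes the standard quadratic form $\chi^2(\mathbf{a}) = (\mathbf{y} - \mathbf{f})^{\mathrm{T}} \mathsf{N}^{-1} (\mathbf{y} - \mathbf{f})$ for a multivariate Gaussian, where $\mathbf{f}$ has elements $f(x_i; \mathbf{a})$. The function names and the example numbers are hypothetical.

```python
import numpy as np


def chi_squared(y, f_vals, cov):
    """Assumed form: chi^2 = r^T cov^{-1} r, with residuals r = y - f."""
    r = y - f_vals
    # Solve cov @ z = r rather than forming the inverse explicitly.
    return float(r @ np.linalg.solve(cov, r))


def log_likelihood(y, f_vals, cov):
    """ln L = -(1/2) [ N ln(2 pi) + ln|cov| + chi^2 ]."""
    n_points = y.size
    _, logdet = np.linalg.slogdet(cov)  # log-determinant of the covariance
    chi2 = chi_squared(y, f_vals, cov)
    return -0.5 * (n_points * np.log(2.0 * np.pi) + logdet + chi2)


# Illustrative numbers: three points with independent errors of variance 4.
y_obs = np.array([1.0, 2.0, 3.0])
f_model = np.array([1.0, 1.0, 1.0])
cov = 4.0 * np.eye(3)

chi2_example = chi_squared(y_obs, f_model, cov)
lnL_example = log_likelihood(y_obs, f_model, cov)
```

For a diagonal covariance matrix, as in this example, the quadratic form reduces to a sum of squared residuals each divided by its variance. Working with $\ln L$ rather than $L$ avoids numerical underflow when $N$ is large.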