Mathematical Methods for Physics and Engineering: A Comprehensive Guide


31.7 HYPOTHESIS TESTING


In the last equality, we rewrote the expression in matrix notation by defining the column vector $\mathbf{f}$ with elements $f_i = f(x_i; \mathbf{a})$. The value $\chi^2(\hat{\mathbf{a}})$ at this minimum can be used as a statistic to test the null hypothesis $H_0$, as follows. The $N$ quantities $y_i - f(x_i; \mathbf{a})$ are Gaussian distributed. However, provided the function $f(x_i; \mathbf{a})$ is linear in the parameters $\mathbf{a}$, the equations (31.98) that determine the least-squares estimate $\hat{\mathbf{a}}$ constitute a set of $M$ linear constraints on these $N$ quantities. Thus, as discussed in subsection 30.15.2, the sampling distribution of the quantity $\chi^2(\hat{\mathbf{a}})$ will be a chi-squared distribution with $N - M$ degrees of freedom (d.o.f.), which has the expectation value and variance
\[
E[\chi^2(\hat{\mathbf{a}})] = N - M \quad\text{and}\quad V[\chi^2(\hat{\mathbf{a}})] = 2(N - M).
\]

Thus we would expect the value of $\chi^2(\hat{\mathbf{a}})$ to lie typically in the range $(N - M) \pm \sqrt{2(N - M)}$. A value lying outside this range may suggest that the assumed model for the data is incorrect. A very small value of $\chi^2(\hat{\mathbf{a}})$ is usually an indication that the model has too many free parameters and has ‘over-fitted’ the data. More commonly, the assumed model is simply incorrect, and this usually results in a value of $\chi^2(\hat{\mathbf{a}})$ that is larger than expected.
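This sampling distribution can be illustrated numerically. Below is a minimal Monte Carlo sketch (not from the text) that assumes a straight-line model with $N = 10$ points, $M = 2$ fitted parameters and a known error $\sigma$; it repeatedly generates data under $H_0$, refits by least squares and checks that the mean and variance of $\chi^2(\hat{\mathbf{a}})$ come out close to $N - M$ and $2(N - M)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative settings (not from the text): straight-line model y = m*x + c
N, M = 10, 2              # number of data points and of fitted parameters
sigma = 0.5               # known Gaussian measurement error on each y_i
m_true, c_true = 1.0, 0.5
x = np.linspace(1.0, 5.0, N)

chi2_values = []
for _ in range(20000):
    # Generate data under the null hypothesis H0 (the model is correct)
    y = m_true * x + c_true + rng.normal(0.0, sigma, N)
    # Least-squares estimates of the M = 2 parameters
    m_hat, c_hat = np.polyfit(x, y, 1)
    # chi^2 evaluated at the least-squares minimum
    chi2_values.append(np.sum(((y - m_hat * x - c_hat) / sigma) ** 2))

chi2_values = np.array(chi2_values)
print("sample mean    :", chi2_values.mean(), "  (expect N - M =", N - M, ")")
print("sample variance:", chi2_values.var(), "  (expect 2(N - M) =", 2 * (N - M), ")")
```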


One can choose to perform either a one-tailed or a two-tailed test on the value of $\chi^2(\hat{\mathbf{a}})$. It is usual, for a given significance level $\alpha$, to define the one-tailed rejection region to be $\chi^2(\hat{\mathbf{a}}) > k$, where the constant $k$ satisfies
\[
\int_k^\infty P(\chi^2_n)\, d\chi^2_n = \alpha, \qquad (31.127)
\]
and $P(\chi^2_n)$ is the PDF of the chi-squared distribution with $n = N - M$ degrees of freedom (see subsection 30.9.4).
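In practice the constant $k$ in (31.127) is just the $(1 - \alpha)$ quantile of the chi-squared distribution with $n$ degrees of freedom, so it can be obtained numerically. A minimal sketch using scipy (the values $\alpha = 0.05$ and $n = 8$ anticipate the worked example below):

```python
from scipy.stats import chi2

alpha = 0.05   # one-tailed significance level
n = 8          # degrees of freedom, n = N - M

# k satisfies: integral from k to infinity of P(chi^2_n) d(chi^2_n) = alpha,
# i.e. k is the (1 - alpha) quantile of the chi-squared distribution
k = chi2.ppf(1.0 - alpha, df=n)
print(k)       # approximately 15.5
```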


An experiment produces the following data sample pairs $(x_i, y_i)$:

$x_i$:  1.85  2.72  2.81  3.06  3.42  3.76  4.31  4.47  4.64  4.99
$y_i$:  2.26  3.10  3.80  4.11  4.74  4.31  5.24  4.03  5.69  6.57

where the $x_i$-values are known exactly but each $y_i$-value is measured only to an accuracy of $\sigma = 0.5$. At the one-tailed 5% significance level, test the null hypothesis $H_0$ that the underlying model for the data is a straight line $y = mx + c$.

These data are the same as those investigated in section 31.6 and plotted in figure 31.9. As shown previously, the least-squares estimates of the slope $m$ and intercept $c$ are given by
\[
\hat{m} = 1.11 \quad\text{and}\quad \hat{c} = 0.4. \qquad (31.128)
\]
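The estimates (31.128) are easy to verify numerically; a short sketch assuming an ordinary (unweighted) least-squares fit, which suffices here because every $y_i$ carries the same error $\sigma$:

```python
import numpy as np

x = np.array([1.85, 2.72, 2.81, 3.06, 3.42, 3.76, 4.31, 4.47, 4.64, 4.99])
y = np.array([2.26, 3.10, 3.80, 4.11, 4.74, 4.31, 5.24, 4.03, 5.69, 6.57])

# Straight-line least-squares fit y = m*x + c
m_hat, c_hat = np.polyfit(x, y, 1)
print(m_hat, c_hat)   # approximately 1.11 and 0.4, as in (31.128)
```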

Since the error on each $y_i$-value is drawn independently from a Gaussian distribution with standard deviation $\sigma$, we have
\[
\chi^2(\mathbf{a}) = \sum_{i=1}^{N} \left[ \frac{y_i - f(x_i; \mathbf{a})}{\sigma} \right]^2
                   = \sum_{i=1}^{N} \left[ \frac{y_i - m x_i - c}{\sigma} \right]^2. \qquad (31.129)
\]


Inserting the values (31.128) into (31.129), we obtain $\chi^2(\hat{m}, \hat{c}) = 11.5$. In our case, the number of data points is $N = 10$ and the number of fitted parameters is $M = 2$. Thus, the number of degrees of freedom is $n = N - M = 8$.
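The whole test can be sketched end to end; the code below is illustrative rather than the text's own computation, and it assumes scipy for the critical value:

```python
import numpy as np
from scipy.stats import chi2

x = np.array([1.85, 2.72, 2.81, 3.06, 3.42, 3.76, 4.31, 4.47, 4.64, 4.99])
y = np.array([2.26, 3.10, 3.80, 4.11, 4.74, 4.31, 5.24, 4.03, 5.69, 6.57])
sigma = 0.5                                  # known error on each y_i

# Least-squares straight-line fit and chi^2 at the minimum, as in (31.129)
m_hat, c_hat = np.polyfit(x, y, 1)
chi2_min = np.sum(((y - m_hat * x - c_hat) / sigma) ** 2)

# One-tailed 5% rejection region chi^2 > k, with n = N - M degrees of freedom
N, M = len(x), 2
k = chi2.ppf(0.95, df=N - M)

print(f"chi^2(m_hat, c_hat) = {chi2_min:.1f}")   # about 11.5
print(f"critical value k    = {k:.1f}")          # about 15.5
print("reject H0" if chi2_min > k else "cannot reject H0 at the 5% level")
```

Since the observed value $\chi^2 \approx 11.5$ lies below the critical value $k \approx 15.5$, it falls outside the rejection region, so $H_0$ (a straight-line model) cannot be rejected at the one-tailed 5% significance level.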
