STATISTICS
Suppose that two classes of students take the same mathematics examination and the
following percentage marks are obtained:
Class 1: 66 62 34 55 77 80 55 60 69 47 50
Class 2: 64 90 76 56 81 72 70
Assuming that the two sets of examination marks are drawn from Gaussian distributions,
test the hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ at the 5% significance level.
The variances of the two samples are $s_1^2 = (12.8)^2$ and $s_2^2 = (10.3)^2$ and the sample sizes
are $N_1 = 11$ and $N_2 = 7$. Thus, we have
\[
u^2 = \frac{N_1 s_1^2}{N_1 - 1} = 180.2 \quad\text{and}\quad v^2 = \frac{N_2 s_2^2}{N_2 - 1} = 123.8,
\]
where we have taken $u^2$ to be the larger value. Thus, $F = u^2/v^2 = 1.46$ to two decimal
places. Since the first sample contains eleven values and the second contains seven values,
we take $n_1 = 10$ and $n_2 = 6$. Consulting table 31.4, we see that, at the 5% significance
level, $F_{\rm crit} = 4.06$. Since our value lies comfortably below this, we conclude that there is
no statistical evidence for rejecting the hypothesis that the two samples were drawn from
Gaussian distributions with a common variance.
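The arithmetic of this example can be checked with a short script. The following is a minimal sketch rather than a general-purpose routine: it uses only the Python standard library, the variable names are our own, and the critical value 4.06 is simply copied from the table quoted above instead of being computed. Because the script works with unrounded sample variances, the two variance estimates come out slightly different from the rounded figures in the text, although $F$ still rounds to 1.46.

```python
from statistics import variance

# Percentage marks quoted in the example
class_1 = [66, 62, 34, 55, 77, 80, 55, 60, 69, 47, 50]
class_2 = [64, 90, 76, 56, 81, 72, 70]

# statistics.variance divides by N - 1, so it returns N s^2 / (N - 1) directly
est_1 = variance(class_1)
est_2 = variance(class_2)

# The larger estimate goes in the numerator so that F >= 1;
# n1 and n2 are the degrees of freedom of numerator and denominator
if est_1 >= est_2:
    u2, v2 = est_1, est_2
    n1, n2 = len(class_1) - 1, len(class_2) - 1
else:
    u2, v2 = est_2, est_1
    n1, n2 = len(class_2) - 1, len(class_1) - 1

F = u2 / v2

# Assumption: 5% critical value for (n1, n2) = (10, 6), taken from the text
F_CRIT = 4.06
reject_h0 = F > F_CRIT

print(f"u^2 = {u2:.1f}, v^2 = {v2:.1f}, F = {F:.2f}")
print(f"n1 = {n1}, n2 = {n2}, reject H0 at 5%: {reject_h0}")
```

For other sample sizes the hard-coded critical value would have to be replaced by the appropriate entry from a table of the $F$-distribution.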
It is also common to define the variable $z = \tfrac{1}{2}\ln F$, the distribution of which
can be found straightforwardly from (31.126). This is a useful change of variable
since it can be shown that, for large values of $n_1$ and $n_2$, the variable $z$ is
distributed approximately as a Gaussian with mean $\tfrac{1}{2}(n_2^{-1} - n_1^{-1})$ and variance
$\tfrac{1}{2}(n_2^{-1} + n_1^{-1})$.
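As an illustration of this change of variable, the sketch below evaluates $z$ and the mean and variance of the approximating Gaussian for the values $F = 1.46$, $n_1 = 10$ and $n_2 = 6$ found in the preceding example. The variable names are our own, and it should be borne in mind that the approximation is stated only for large $n_1$ and $n_2$, so with degrees of freedom this small the standardised value is at best a rough guide.

```python
import math

# Values taken from the worked example in the text
F = 1.46
n1, n2 = 10, 6

# Change of variable z = (1/2) ln F
z = 0.5 * math.log(F)

# Mean and variance of the approximating Gaussian (valid for large n1, n2)
z_mean = 0.5 * (1.0 / n2 - 1.0 / n1)
z_var = 0.5 * (1.0 / n2 + 1.0 / n1)

# Number of standard deviations by which z lies above the approximate mean
z_score = (z - z_mean) / math.sqrt(z_var)

print(f"z = {z:.4f}, mean = {z_mean:.4f}, variance = {z_var:.4f}")
print(f"standardised value = {z_score:.2f}")
```

The standardised value is well below one, which is at least consistent with the conclusion reached from the tabulated $F$-distribution.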
31.7.7 Goodness of fit in least-squares problems
We conclude our discussion of hypothesis testing with an example of a
goodness-of-fit test. In section 31.6, we discussed the use of the method of least squares in
estimating the best-fit values of a set of parameters $\mathbf{a}$ in a given model $y = f(x; \mathbf{a})$
for a data set $(x_i, y_i)$, $i = 1, 2, \ldots, N$. We have not addressed, however, the question
of whether the best-fit model $y = f(x; \hat{\mathbf{a}})$ does, in fact, provide a good fit to the
data. In other words, we have not considered thus far how to verify that the
functional form $f$ of our assumed model is indeed correct. In the language of
hypothesis testing, we wish to distinguish between the two hypotheses
\[
H_0: \text{model is correct} \quad\text{and}\quad H_1: \text{model is incorrect}.
\]
Given the vague nature of the alternative hypothesis $H_1$, we clearly cannot use
the generalised likelihood-ratio test. Nevertheless, it is still possible to test the
null hypothesis $H_0$ at a given significance level $\alpha$.
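The least-squares estimates of the parameters $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_M$, as discussed in
section 31.6, are those values that minimise the quantity
\[
\chi^2(\mathbf{a}) = \sum_{i,j=1}^{N} [y_i - f(x_i; \mathbf{a})]\,(\mathsf{N}^{-1})_{ij}\,[y_j - f(x_j; \mathbf{a})] = (\mathbf{y} - \mathbf{f})^{\rm T}\,\mathsf{N}^{-1}\,(\mathbf{y} - \mathbf{f}).
\]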
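To make the structure of this quadratic form concrete, the following sketch evaluates $\chi^2$ for a given set of residuals and a given matrix. Everything specific in it is an assumption made for illustration: the matrix is taken to be the covariance matrix of the measurement errors, the model is a hypothetical straight line $f(x; \mathbf{a}) = a_1 + a_2 x$, and the data values and helper names (`solve_linear`, `chi_squared`) are invented. Rather than forming the inverse matrix explicitly, the code solves a linear system for $\mathsf{N}^{-1}(\mathbf{y} - \mathbf{f})$ using plain Gaussian elimination.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so that the inputs are left unchanged
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Choose the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def chi_squared(y, f_vals, cov):
    """Return (y - f)^T cov^{-1} (y - f) for lists y, f_vals and matrix cov."""
    resid = [yi - fi for yi, fi in zip(y, f_vals)]
    weighted = solve_linear(cov, resid)  # cov^{-1} (y - f)
    return sum(ri * wi for ri, wi in zip(resid, weighted))


# Hypothetical data and a hypothetical straight-line model f(x; a) = a1 + a2 x
x_data = [0.0, 1.0, 2.0]
y_data = [1.1, 1.9, 3.2]
a1, a2 = 1.0, 1.0
f_vals = [a1 + a2 * x for x in x_data]

# Assumed independent errors with standard deviation 0.2 on every point
sigma = 0.2
cov = [[sigma**2 if i == j else 0.0 for j in range(3)] for i in range(3)]

chi2_example = chi_squared(y_data, f_vals, cov)
print(f"chi^2 = {chi2_example:.3f}")
```

When the errors are independent, as assumed here, the matrix is diagonal and the double sum reduces to the familiar sum of squared residuals, each divided by its variance; the general routine is needed only when the errors are correlated.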