608 STATISTICS AND PROBABILITY
Face    Observed     Expected
        frequency    frequency
 1         49           40
 2         35           40
 3         32           40
 4         46           40
 5         49           40
 6         29           40
Determine the χ^2-value for this distribution.
[10.2]
- The numbers of telephone calls received by
the switchboard of a company in 200 five-
minute intervals are shown in the distribution
below.
Number of    Observed     Expected
calls        frequency    frequency
 0              11           16
 1              44           42
 2              53           52
 3              46           42
 4              24           26
 5              12           14
 6               7            6
 7               3            2
Calculate the χ^2-value for this data.
[3.16]
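As a check on the two bracketed answers above, the sketch below sums (o − e)^2/e over every row of each table. The function name chi_square and the list names are illustrative choices rather than anything defined in the text, and the frequencies are those read from the two tables above.

```python
def chi_square(observed, expected):
    """Return the sum of (o - e)^2 / e over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Die faces 1 to 6: observed frequencies against an expected 40 per face.
die_observed = [49, 35, 32, 46, 49, 29]
die_expected = [40] * 6

# Telephone calls: 0 to 7 calls per five-minute interval.
calls_observed = [11, 44, 53, 46, 24, 12, 7, 3]
calls_expected = [16, 42, 52, 42, 26, 14, 6, 2]

print(round(chi_square(die_observed, die_expected), 2))      # 10.2
print(round(chi_square(calls_observed, calls_expected), 2))  # 3.16
```

Both results agree with the answers quoted in square brackets.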
63.2 Fitting data to theoretical distributions
For theoretical distributions such as the binomial,
Poisson and normal distributions, expected frequencies
can be calculated. For example, from the theory of the
binomial distribution, the probability of having
0, 1, 2, ..., n defective items in a sample of n items
can be determined from the successive terms of
(q + p)^n, where p is the defect rate and q = 1 − p.
These probabilities can be used to determine the
expected frequencies of having 0, 1, 2, ..., n defective
items. As a result of counting the number of defective
items when sampling, the observed frequencies are
obtained. The expected and observed frequencies can be
compared by means of a Chi-square test and predictions
can be made as to whether the differences are due to
random errors, due to some fault in the method of
sampling, or due to the assumptions made.
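The calculation of expected frequencies from the terms of (q + p)^n can be sketched as follows. The function name and the figures used (samples of 3 items, a defect rate of 0.1 and 1000 samples) are hypothetical values chosen only for illustration; they do not come from the text.

```python
from math import comb


def binomial_expected_frequencies(n, p, samples):
    """Expected number of samples containing 0, 1, ..., n defective items.

    Term k of the expansion of (q + p)^n is C(n, k) * q^(n - k) * p^k,
    the probability of exactly k defective items in a sample of n.
    """
    q = 1 - p
    return [samples * comb(n, k) * q ** (n - k) * p ** k for k in range(n + 1)]


# Hypothetical illustration: samples of 3 items, defect rate 0.1, 1000 samples.
frequencies = binomial_expected_frequencies(n=3, p=0.1, samples=1000)
for defectives, frequency in enumerate(frequencies):
    print(defectives, round(frequency, 1))
```

With these assumed figures the expected frequencies are approximately 729, 243, 27 and 1 for 0, 1, 2 and 3 defective items respectively, and they total the 1000 samples.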
As for normal and t distributions, a table is available
for relating various calculated values of χ^2 to those
likely because of random variations, at various levels
of confidence. Such a table is shown in Table 63.1. In
Table 63.1, the column on the left denotes the number of
degrees of freedom, ν, and when the χ^2-values refer to
fitting data to theoretical distributions, the number of
degrees of freedom is usually (N − 1), where N is the
number of rows in the table from which χ^2 is
calculated. However, when the population parameters
such as the mean and standard deviation are based on
sample data, the number of degrees of freedom is given
by ν = N − 1 − M, where M is the number of estimated
population parameters. An application of this is shown
in Problem 4.
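The rule ν = N − 1 − M can be written as a small helper. The function name and the two example cases are illustrative assumptions: a table of 6 rows with no estimated parameters, and a table of 8 rows with one parameter estimated from the sample.

```python
def degrees_of_freedom(num_rows, num_estimated_parameters=0):
    """Return nu = N - 1 - M for a goodness-of-fit table.

    num_rows is N, the number of rows used to calculate chi-square;
    num_estimated_parameters is M, the number of population parameters
    estimated from the sample data (0 when none are estimated).
    """
    nu = num_rows - 1 - num_estimated_parameters
    if nu < 1:
        raise ValueError("too few rows for the number of estimated parameters")
    return nu


print(degrees_of_freedom(6))     # 5: six rows, no estimated parameters
print(degrees_of_freedom(8, 1))  # 6: eight rows, one estimated parameter
```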
The columns of the table headed χ^2_{0.995}, χ^2_{0.99},
... give the percentile of χ^2-values corresponding to
levels of confidence of 99.5%, 99%, ... (i.e. levels of
significance of 0.005, 0.01, ...). On the far right of
the table, the columns headed ..., χ^2_{0.01},
χ^2_{0.005} also correspond to levels of confidence of
... 99%, 99.5%, and are used to predict the ‘too good
to be true’ type results, where the fit obtained is so
good that the method of sampling must be suspect.
The method in which χ^2-values are used to test the
goodness of fit of data to probability distributions is
shown in the following problems.
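The two-sided use of the table described above can be sketched as a simple decision rule. The function name, the wording of the three verdicts and the sample χ^2-values are illustrative assumptions. The critical values 11.1 and 1.15 are the standard χ^2_{0.95} and χ^2_{0.05} percentiles for ν = 5, rounded to three significant figures; in practice they would be read from Table 63.1.

```python
def assess_fit(chi_sq, upper_critical, lower_critical):
    """Classify a calculated chi-square value against two percentile values.

    upper_critical: percentile for the chosen level of significance
                    (for example chi^2_0.95 for significance 0.05).
    lower_critical: percentile used for the 'too good to be true' check
                    (for example chi^2_0.05 for 95% confidence).
    """
    if chi_sq > upper_critical:
        return "reject the hypothesis"
    if chi_sq < lower_critical:
        return "fit is too good to be true"
    return "accept the hypothesis"


# Assumed critical values for nu = 5: chi^2_0.95 = 11.1, chi^2_0.05 = 1.15.
for value in (12.4, 5.0, 0.8):  # hypothetical calculated chi-square values
    print(value, assess_fit(value, upper_critical=11.1, lower_critical=1.15))
```

Passing the critical values in as arguments keeps the sketch independent of any particular table or number of degrees of freedom.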
Problem 2. As a result of a survey carried out
of 200 families, each with five children, the dis-
tribution shown below was produced. Test the
null hypothesis that the observed frequencies
are consistent with male and female births being
equally probable, assuming a binomial distribu-
tion, a level of significance of 0.05 and a ‘too
good to be true’ fit at a confidence level of 95%.
Number of boys (B)    Number of
and girls (G)         families
5B, 0G                   11
4B, 1G                   35
3B, 2G                   69
2B, 3G                   55
1B, 4G                   25
0B, 5G                    5
To determine the expected frequencies
Using the usual binomial distribution symbols, let p
be the probability of a male birth and q = 1 − p be
the probability of a female birth. The probabilities
of having 5 boys, 4 boys, ..., 0 boys are given by the
successive terms of the expansion of (q + p)^n. Since