CHI-SQUARE AND DISTRIBUTION-FREE TESTS 611
o − e    (o − e)^2    (o − e)^2 / e
   1          1           0.0250
   5         25           0.3906
  −8         64           1.2308
  −1          1           0.0357
   1          1           0.0909
   2          4           1.0000
   0          0           0.0000
\[
\chi^2 = \sum \left\{ \frac{(o - e)^2}{e} \right\} = 2.773
\]
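As a quick arithmetic check, the column total can be reproduced in a few lines of Python. This is only a sketch: the variable names are illustrative, and the values are the ones printed in the table above (the expected frequencies themselves are not repeated here, so the last column is summed as given).

```python
# (o - e) values and (o - e)^2 / e contributions, copied from the table above
deviations = [1, 5, -8, -1, 1, 2, 0]
contributions = [0.0250, 0.3906, 1.2308, 0.0357, 0.0909, 1.0000, 0.0000]

# Middle column of the table: the squared deviations
squared = [d ** 2 for d in deviations]

# Chi-square is the sum of the (o - e)^2 / e column
chi_square = sum(contributions)

print(squared)
print(round(chi_square, 3))
```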
To test the significance of the χ^2-value
The number of degrees of freedom is ν = N − 1,
where N is the number of rows in the table above,
giving ν = 7 − 1 = 6. The percentile value of χ^2 is
determined from Table 63.1, for (χ^2_{0.99}, ν = 6), and
is 16.8. Since the calculated value of χ^2 (i.e. 2.773)
is smaller than the percentile value, the hypothesis
that the grit deposition is according to a Poisson
distribution is accepted. For a confidence level
of 99%, the (χ^2_{0.01}, ν = 6) value is obtained from
Table 63.1, and is 0.872. Since the calculated value
of χ^2 is greater than this value, the fit is not ‘too
good to be true’.
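The two comparisons made above can be written as a small helper. This is a minimal sketch, assuming the percentile values are read from Table 63.1 by hand; the function name `assess_fit` and its parameters are hypothetical and not part of the text.

```python
def assess_fit(chi_square, upper_percentile, lower_percentile):
    """Return (hypothesis_accepted, too_good_to_be_true).

    upper_percentile: tabulated chi-square value at the significance level
    lower_percentile: tabulated chi-square value used for the 'too good' check
    """
    accepted = chi_square < upper_percentile
    too_good = chi_square < lower_percentile
    return accepted, too_good


rows = 7
degrees_of_freedom = rows - 1  # nu = N - 1 = 6

# Percentile values for nu = 6, as quoted from Table 63.1 in the text
accepted, too_good = assess_fit(2.773, upper_percentile=16.8, lower_percentile=0.872)
print(degrees_of_freedom, accepted, too_good)
```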
Problem 4. The diameters of a sample of 500
rivets produced by an automatic process have the
following size distribution.
Diameter (mm)    Frequency
    4.011            12
    4.015            47
    4.019            86
    4.023           123
    4.027           107
    4.031            97
    4.035            28
Test the null hypothesis that the diameters of
the rivets are normally distributed at a level of
significance of 0.05 and also determine if the
distribution gives a ‘too good’ fit at a level of
confidence of 90%.
To determine the expected frequencies
In order to determine the expected frequencies, the
mean and standard deviation of the distribution
are required. These population parameters, μ and
σ, are based on sample data, x̄ and s, and an
allowance is made in the number of degrees of
freedom used for estimating the population parameters
from sample data.
The sample mean,
\[
\bar{x} = \frac{12(4.011) + 47(4.015) + 86(4.019) + 123(4.023) + 107(4.027) + 97(4.031) + 28(4.035)}{500}
        = \frac{2012.176}{500} = 4.024
\]
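The weighted mean can be checked numerically. The sketch below uses the diameters and frequencies from the problem statement; the variable names are illustrative only.

```python
# Class mid-point diameters (mm) and their frequencies, from the problem statement
diameters = [4.011, 4.015, 4.019, 4.023, 4.027, 4.031, 4.035]
frequencies = [12, 47, 86, 123, 107, 97, 28]

n = sum(frequencies)  # sample size
weighted_sum = sum(f * x for f, x in zip(frequencies, diameters))
mean = weighted_sum / n

print(n, round(weighted_sum, 3), round(mean, 3))
```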
The sample standard deviation s is given by:
\[
s = \sqrt{\left[ \frac{12(4.011 - 4.024)^2 + 47(4.015 - 4.024)^2 + \cdots + 28(4.035 - 4.024)^2}{500} \right]}
  = \sqrt{\frac{0.017212}{500}} = 0.00587
\]
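The same calculation can be sketched in Python. As in the working above, the mean is taken as the rounded value 4.024 and the sum of squares is divided by 500; carrying the unrounded mean instead would shift the result slightly. Variable names are illustrative.

```python
import math

diameters = [4.011, 4.015, 4.019, 4.023, 4.027, 4.031, 4.035]
frequencies = [12, 47, 86, 123, 107, 97, 28]

n = sum(frequencies)
mean = 4.024  # rounded sample mean, as used in the text

# Frequency-weighted sum of squared deviations from the mean
sum_of_squares = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, diameters))

# Standard deviation with divisor n, matching the formula above
s = math.sqrt(sum_of_squares / n)

print(round(sum_of_squares, 6), round(s, 5))
```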
The class boundaries for the diameters are 4.009
to 4.013, 4.013 to 4.017, and so on, and are shown
in column 2 of Table 63.2. Using the theory of the
normal probability distribution, the probability for
each class and hence the expected frequency is
calculated as shown in Table 63.2.
In column 3, the z-values corresponding to the
class boundaries are determined using z = (x − x̄)/s,
which in this case is z = (x − 4.024)/0.00587. The area
between a z-value in column 3 and the mean of the
distribution at z = 0 is determined using the table of
partial areas under the standardized normal distribution
curve given in Table 58.1 on page 561, and is
shown in column 4. By subtracting the area between
the mean and the z-value of the lower class boundary
from that of the upper class boundary, the area and
hence the probability of a particular class is obtained,
and is shown in column 5. There is one exception
in column 5, corresponding to class boundaries of
4.021 and 4.025, where the areas are added to give
the probability of the 4.023 class. This is because
these areas lie immediately to the left and right of the
mean value. Column 6 is obtained by multiplying the
probabilities in column 5 by the sample number, 500.
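The column-by-column procedure can be sketched with Python's standard library. This is an illustration under stated assumptions, not the tabular method itself: it uses `statistics.NormalDist` for the cumulative normal distribution instead of reading partial areas from Table 58.1, so the class probability is simply the difference of two cumulative values and no special case is needed for the class that straddles the mean. Because the printed table works with z rounded to two decimal places, the figures may differ slightly from Table 63.2.

```python
from statistics import NormalDist

mean, s, n = 4.024, 0.00587, 500

# Class boundaries (column 2): 4.009 to 4.013, 4.013 to 4.017, and so on
boundaries = [4.009, 4.013, 4.017, 4.021, 4.025, 4.029, 4.033, 4.037]

# Column 3: z-values of the class boundaries
z_values = [(x - mean) / s for x in boundaries]

# Cumulative area up to each boundary under the standardized normal curve
standard_normal = NormalDist()  # mean 0, standard deviation 1
cumulative = [standard_normal.cdf(z) for z in z_values]

# Column 5: probability of each class; column 6: expected frequency
probabilities = [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]
expected = [n * p for p in probabilities]

for lo, hi, e in zip(boundaries, boundaries[1:], expected):
    print(f"{lo:.3f} to {hi:.3f}: expected frequency {e:.1f}")
print(f"total expected: {sum(expected):.1f}")
```

The total printed at the end falls a little short of 500, since the tails outside the first and last class boundaries are not included.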
The sum of column 6 is not equal to 500 because
the areas under the standardized normal curve for
z-values of less than −2.56 and more than 2.21 are
neglected. The error introduced by doing this is 10
in 500, i.e. 2%, and is acceptable in most problems
of this type. If it is not acceptable, each expected
frequency can be increased by the percentage error.
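One way to apply this correction is sketched below. It assumes the expected frequencies are computed from the cumulative normal distribution via `statistics.NormalDist` rather than read from the tables, so the shortfall comes out close to, but not exactly, the 10 in 500 quoted above. The variable names are illustrative.

```python
from statistics import NormalDist

mean, s, n = 4.024, 0.00587, 500
boundaries = [4.009, 4.013, 4.017, 4.021, 4.025, 4.029, 4.033, 4.037]

# Expected frequency of each class from the fitted normal distribution
fitted = NormalDist(mean, s)
cumulative = [fitted.cdf(x) for x in boundaries]
expected = [n * (hi - lo) for lo, hi in zip(cumulative, cumulative[1:])]

# Frequency lost by neglecting the two tails, as a fraction of the sample
shortfall = n - sum(expected)
percentage_error = shortfall / n

# Increase each expected frequency by the percentage error
adjusted = [e * (1 + percentage_error) for e in expected]

print(f"percentage error: {100 * percentage_error:.1f}%")
print(f"adjusted total:   {sum(adjusted):.1f}")
```

Increasing every class by the percentage error brings the total very close to 500 without reaching it exactly; dividing each expected frequency by the fraction of the area retained would make the total exactly 500, but that is a different adjustment from the one described in the text.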