J
Statistics and probability
63 Chi-square and distribution-free tests
63.1 Chi-square values
The significance tests introduced in Chapter 62 rely
very largely on the normal distribution. For large
sample numbers where z-values are used, the mean
of the samples and the standard error of the means of
the samples are assumed to be normally distributed
(central limit theorem). For small sample numbers
where t-values are used, the population from which
samples are taken should be approximately normally
distributed for the t-values to be meaningful.
Chi-square tests (pronounced KY and denoted by the
Greek letter χ), which are introduced in this chapter,
do not rely on the population or a sampling statistic
such as the mean or standard error of the means being
normally distributed. Significance tests based on
z- and t-values are concerned with the parameters of a
distribution, such as the mean and the standard deviation,
whereas Chi-square tests are concerned with
the individual members of a set and are associated
with non-parametric tests.
Observed and expected frequencies
The results obtained from trials are rarely exactly
the same as the results predicted by statistical theories.
For example, if a coin is tossed 100 times, it
is unlikely that the result will be exactly 50 heads
and 50 tails. Suppose that five people each toss a
coin 100 times and note the number of heads
obtained. Let the results obtained be as shown
below.
Person               A   B   C   D   E
Observed frequency  43  54  60  48  57
Expected frequency  50  50  50  50  50
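The coin-tossing trial described above can be imitated on a computer. The following is a minimal sketch only, assuming a fair coin modelled by Python's standard `random` module; the function names `count_heads` and `simulate_people` and the seed value are illustrative choices and are not part of the text. The simulated counts will generally differ from the observed frequencies in the table.

```python
import random


def count_heads(tosses, rng):
    """Toss a fair coin `tosses` times and return the number of heads."""
    return sum(1 for _ in range(tosses) if rng.random() < 0.5)


def simulate_people(people=5, tosses=100, seed=1):
    """Return one simulated head count per person (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    return [count_heads(tosses, rng) for _ in range(people)]


observed = simulate_people()
expected = [50] * len(observed)  # a fair coin gives 100 * 0.5 = 50 heads

# Print person, simulated observed frequency, expected frequency
for person, o, e in zip("ABCDE", observed, expected):
    print(person, o, e)
```

Changing the seed gives a different set of head counts, which illustrates why observed frequencies rarely match the expected value of 50 exactly.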
A measure of the discrepancy existing between
the observed frequencies shown in row 2 and the
expected frequencies shown in row 3 can be determined
by calculating the Chi-square value. The
Chi-square value is defined as follows:
$$\chi^2 = \sum \left\{ \frac{(o - e)^2}{e} \right\},$$
where o and e are the observed and expected
frequencies respectively.
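The definition translates directly into a short function. This is a sketch under the assumption that the observed and expected frequencies are supplied as two equal-length lists; the name `chi_square` and the two-category frequencies used to exercise it are made up for illustration and do not come from the text.

```python
def chi_square(observed, expected):
    """Return the sum of (o - e)^2 / e over paired frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up frequencies, chosen only to keep the arithmetic easy to check:
# (8 - 10)^2 / 10 + (12 - 10)^2 / 10 = 0.4 + 0.4 = 0.8
print(chi_square([8, 12], [10, 10]))
```

Each expected frequency appears as a divisor, so the function is only meaningful when every expected frequency is non-zero.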
Problem 1. Determine the Chi-square value
for the coin-tossing data given above.
The χ^2 value for the given data may be calculated
by using a tabular approach as shown below.
Person  Observed      Expected      o − e  (o − e)^2  (o − e)^2 / e
        frequency, o  frequency, e
A            43            50         −7       49         0.98
B            54            50          4       16         0.32
C            60            50         10      100         2.00
D            48            50         −2        4         0.08
E            57            50          7       49         0.98
$$\chi^2 = \sum \left\{ \frac{(o - e)^2}{e} \right\} = 4.36$$
Hence the Chi-square value χ^2 = 4.36.
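The tabular working of Problem 1 can be checked with a few lines of Python. The sketch below uses the coin-tossing frequencies given in the text; the variable names and the print layout are illustrative assumptions.

```python
people = ["A", "B", "C", "D", "E"]
observed = [43, 54, 60, 48, 57]
expected = [50, 50, 50, 50, 50]

# Build one row per person: o, e, o - e, (o - e)^2 and (o - e)^2 / e
rows = []
for person, o, e in zip(people, observed, expected):
    diff = o - e
    rows.append((person, o, e, diff, diff ** 2, diff ** 2 / e))

print("Person   o   e  o-e  (o-e)^2  (o-e)^2/e")
for person, o, e, diff, sq, term in rows:
    print(f"{person:<6} {o:>3} {e:>3} {diff:>4} {sq:>8} {term:>10.2f}")

# The Chi-square value is the sum of the final column
chi_square_value = sum(row[-1] for row in rows)
print(f"Chi-square value = {chi_square_value:.2f}")  # 4.36, as in Problem 1
```

Keeping every intermediate column in `rows` mirrors the hand calculation, so any single entry can be compared with the table.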
If the value of χ^2 is zero, then the observed and
expected frequencies agree exactly. The greater the
difference between the χ^2-value and zero, the greater
the discrepancy between the observed and expected
frequencies.
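The behaviour described above can be seen numerically. In this sketch every one of five observed frequencies is assumed to exceed its expected value of 50 by the same amount d; the values of d are hypothetical and are chosen only to show the trend.

```python
def chi_square(observed, expected):
    """Return the sum of (o - e)^2 / e over paired frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [50] * 5
values = []
for d in (0, 1, 5, 10):  # hypothetical deviations from the expected value
    observed = [e + d for e in expected]
    values.append(chi_square(observed, expected))
    print(d, values[-1])
```

With d = 0 the value is zero, and it increases as d increases, since each term contributes d^2 / 50.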
Now try the following exercise.
Exercise 226 Problems on determining Chi-square values
- A dice is rolled 240 times and the observed
and expected frequencies are as shown.