Basic Statistics

CHI-SQUARE TESTS FOR FREQUENCY TABLES: TWO-BY-TWO TABLES 153

Figure 11.1 Chi-square distribution with 1 d.f. (horizontal axis: χ², 0 to 6)

In Table 11.6 we are free to fill in the frequency in just one cell and still keep all
the same totals, so there is 1 d.f. As soon as we know that 11 respondents
smoked and had low vital capacity, then, to keep all the row and column totals as
given in the table, we can fill in the rest of Table 11.6 by subtraction. For example,
since the total number of smokers is 30, the number of smokers without a low vital
capacity must be 30 - 11 = 19.
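The subtraction argument can be sketched in a few lines of Python. This is only an illustration: the count 11 and the smoker total 30 come from the text, while the nonsmoker total (90) and the low-vital-capacity total (21) are assumed values, since Table 11.6 itself is not reproduced here. The function name `fill_two_by_two` is hypothetical.

```python
def fill_two_by_two(cell, row1_total, row2_total, col1_total):
    """Complete a 2x2 table from one cell count and the fixed totals."""
    a = cell                # row 1, column 1 (the one freely chosen cell)
    b = row1_total - a      # row 1, column 2, by subtraction
    c = col1_total - a      # row 2, column 1, by subtraction
    d = row2_total - c      # row 2, column 2, by subtraction
    return [[a, b], [c, d]]

# 11 and 30 are from the text; 90 and 21 are ASSUMED totals for illustration.
table = fill_two_by_two(cell=11, row1_total=30, row2_total=90, col1_total=21)
smokers_not_low = table[0][1]   # 30 - 11
print(table)
```

Once the single cell is chosen, every other entry is forced by the totals, which is why the table has only 1 d.f.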
In our example, the proper d.f. to use, then, is 1, so we look for 10.178 in the first
line of Table A.4. The 7.88 under .995 indicates that 99.5% of all the chi-squares are
< 7.88, so a value as large as our χ² = 10.178 is expected to occur < .5% of the time
when the null hypothesis is true.
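The table lookup can be checked numerically. For 1 d.f. only, the upper-tail probability of a chi-square value x equals erfc(sqrt(x/2)), because a chi-square with 1 d.f. is the square of a standard normal variable. The sketch below uses Python's standard library; the function name `chi2_upper_tail_1df` is an assumption made for this illustration and does not come from the text.

```python
import math

def chi2_upper_tail_1df(x):
    """P(chi-square with 1 d.f. >= x), for x >= 0.

    Uses the identity with the standard normal: P(Z**2 >= x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

p_tabled = chi2_upper_tail_1df(7.88)      # tabled value under .995
p_observed = chi2_upper_tail_1df(10.178)  # computed chi-square from the example

print(f"P(chi-square >= 7.88)   = {p_tabled:.4f}")
print(f"P(chi-square >= 10.178) = {p_observed:.4f}")
```

The first probability should come out close to .005, matching the table, and the second should be smaller still.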
If we wish to use α = .05, we certainly would reject the null hypothesis. Even if
we used an α = .01 or .005, we would still reject the null hypothesis. We will reject
the null hypothesis if the computed chi-square is greater than or equal to the tabled
chi-square in Table A.4 for 1 d.f. and for our chosen value of α. Figure 11.1 depicts
the chi-square distribution with 1 d.f. From Figure 11.1 it is obvious that
a value ≥ 10.178 is highly unlikely. We therefore decide that there is an association
between smoking and low vital capacity in 50-year-olds in the population.
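The decision rule can be written out as a short sketch. The critical values below are the standard upper-tail chi-square values for 1 d.f., rounded to two decimals. Table A.4 is not reproduced here, so the dictionary is an assumed stand-in for its first line, and the names `CRITICAL_1DF` and `reject_null` are hypothetical.

```python
# Standard upper-tail critical values of chi-square with 1 d.f.
# (ASSUMED stand-in for the first line of Table A.4.)
CRITICAL_1DF = {0.05: 3.84, 0.01: 6.63, 0.005: 7.88}

def reject_null(chi_square, alpha):
    """Reject H0 when the computed chi-square >= the tabled value for alpha."""
    return chi_square >= CRITICAL_1DF[alpha]

# Apply the rule to the computed chi-square from the example.
decisions = {alpha: reject_null(10.178, alpha) for alpha in CRITICAL_1DF}
print(decisions)
```

With the computed value 10.178, the rule rejects the null hypothesis at all three levels of α, in agreement with the text.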
The chi-square test that we just made is in a sense a two-sided test. That is, we would
reject the null hypothesis of smoking being independent of low vital capacity whether
low vital capacity were much more common or much less common among smokers than
expected. The odds ratio can be used to
give the direction of the association. Alternatively, examining the differences between
the observed frequencies and the expected frequencies can provide information on
how to interpret the results. In Table 11.6 we can see that the smokers had a higher
observed number of respondents who had low vital capacity (11) than was expected
(5.25), so the association is one of smoking being positively associated with low vital
capacity. Examining these differences between the observed and expected is quite
easy for tables with only two rows and columns but is sometimes more difficult when
the table has more than two rows or columns. The analysis of tables with more than
two rows and/or columns is explained in Section 11.4.
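Both ways of reading the direction of the association can be sketched together. Only the smoker row (11 and 19) is stated in the text. The nonsmoker row below is an assumption, chosen so that the expected count (5.25) and the chi-square (10.178) match the values quoted in this section; it should not be read as the actual Table 11.6.

```python
# Rows: smokers, nonsmokers. Columns: low vital capacity, not low.
observed = [[11, 19],   # from the text
            [10, 80]]   # ASSUMED counts, for illustration only

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Observed minus expected, cell by cell.
differences = [[o - e for o, e in zip(o_row, e_row)]
               for o_row, e_row in zip(observed, expected)]

# Pearson chi-square: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e
                 for o_row, e_row in zip(observed, expected)
                 for o, e in zip(o_row, e_row))

# Odds ratio: (a * d) / (b * c); a value above 1 means a positive association.
(a, b), (c, d) = observed
odds_ratio = (a * d) / (b * c)

print(expected[0][0], differences[0][0], round(chi_square, 3), round(odds_ratio, 2))
```

A positive difference in the smokers, low-vital-capacity cell and an odds ratio above 1 both point the same way: smokers have more low vital capacity than independence would predict.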
