Basic Statistics

CATEGORICAL DATA: ANALYSIS OF TWO-WAY FREQUENCY TABLES

Note that the ties in Table 11.4 (a and d) are ignored. An approximate standard error of
the paired odds ratio is estimated by first taking the natural logarithm of OR = 2.875
to obtain 1.056 (see Schlesselman [1982]). Next, for paired samples we compute


$$\operatorname{se}[\ln(\mathrm{OR})] = \sqrt{\frac{b + c}{bc}} = .4105$$

and the 95% confidence interval

$$\ln(\mathrm{OR}) \pm 1.96(.4105) = 1.056 \pm .805$$

or

$$.251 < \ln(\omega) < 1.86$$

Taking the antilogarithm of the endpoints of the confidence interval for $\ln(\omega)$ yields an
approximate 95% confidence interval of $1.28 < \omega < 6.42$, again using $e^{.251}$ and $e^{1.86}$.
Here again, the lower confidence limit is > 1 and indicates a positive association.
Other methods of computing confidence intervals for matched samples are available
(van Belle et al. [2004]).
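
The interval above can be reproduced with a few lines of Python. This is a minimal sketch, not a definitive implementation: the function name `paired_odds_ratio_ci` is hypothetical, and the discordant counts b = 23 and c = 8 are an assumption, chosen only because they give OR = 2.875 and a standard error of about .4105; the actual entries of Table 11.4 are not shown in this passage.

```python
import math

def paired_odds_ratio_ci(b, c, z=1.96):
    """Approximate confidence interval for a paired odds ratio.

    b and c are the two discordant-pair counts; the ties (a and d) are ignored.
    """
    odds_ratio = b / c
    log_or = math.log(odds_ratio)
    # Standard error of ln(OR) for paired samples: sqrt((b + c) / (b * c))
    se_log_or = math.sqrt((b + c) / (b * c))
    # Build the interval on the log scale, then take antilogarithms
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper

# Assumed counts (not taken from the text): they reproduce OR = 2.875
odds_ratio, se_log_or, lower, upper = paired_odds_ratio_ci(b=23, c=8)
print(f"OR = {odds_ratio:.3f}, se = {se_log_or:.4f}")
print(f"95% CI: {lower:.3f} < omega < {upper:.3f}")
```

Because the code carries full precision through the calculation, its endpoints can differ in the last digit from the values in the text, which are obtained from the rounded limits .251 and 1.86.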

11.3 CHI-SQUARE TESTS FOR FREQUENCY TABLES:
TWO-BY-TWO TABLES

First we present the use of chi-square tests for frequency tables with two rows and
two columns. Then the use of chi-square tests when there are more than two rows or
two columns is discussed.

11.3.1 Chi-square Test for a Single Sample: Two-by-Two Tables

In Table 11.1 the observed frequencies are shown from a single sample of 50-year-olds,
and two measures of association (relative risk and the odds ratio) were presented that
can be used in analyzing such data. We turn now to the chi-square test for association,
a widely used test for determining whether any association exists. Results for this
test are widely available in statistical programs.
The question we now ask is whether or not there is a significant association between
smoking and vital capacity in 50-year-olds. The null hypothesis that we will test is
that smoking and vital capacity are independent (i.e., are not associated).
To perform this test, we first calculate what are called the expected frequencies for
each of the four cells of the table. This is done as follows: If low vital capacity is
independent of smoking, the proportion of smokers with low vital capacity should
equal the proportion of nonsmokers with low vital capacity. Equivalently, it should equal
the proportion with low vital capacity for the combined smokers and nonsmokers (21/120 = .175).
In the first row of Table 11.1, a = 11 and b = 10. We call A and B the expected
frequencies for the first row, and choose them so that


$$\frac{A}{30} = \frac{21}{120} \tag{11.1}$$
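
Solving equation (11.1) for A is a one-line calculation, sketched here in Python. The helper name `expected_frequency` is hypothetical. The second column total of 90 is an assumption derived as 120 - 30, since only the first column total appears in the equation.

```python
def expected_frequency(row_total, col_total, n):
    """Expected count for one cell under independence: row_total * col_total / n."""
    return row_total * col_total / n

n = 120                       # combined sample size (from 21/120)
row1_total = 21               # first-row total, a + b = 11 + 10
col1_total = 30               # first-column total, the denominator under A in (11.1)
col2_total = n - col1_total   # assumption: the remaining 90 fall in the second column

A = expected_frequency(row1_total, col1_total, n)
B = expected_frequency(row1_total, col2_total, n)
print(A, B)  # 5.25 15.75
```

The two expected frequencies sum to the first-row total of 21, and each is the same proportion (.175) of its column total, which is what independence requires.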
