The Essentials of Biostatistics for Physicians, Nurses, and Clinicians

132 CHAPTER 8 Contingency Tables

So the investigators were right in thinking that the status of the disease
had a confounding effect on the result in the combined table, and the
analysis should have been done only on the separate groups. Thus, Simpson's
paradox is not a true paradox, but rather a misunderstanding about the
proportions in the tables.
Another way to avoid the occurrence of Simpson's paradox would be
stratification: make sure that there are sufficiently large numbers of
terminal and nonterminal patients. Also, through randomization we can make
sure that an equal number in each group get the new treatment as get the
old. The stratification can force any ratio of nonterminal to terminal; a
1 to 1 balance is not necessary, but an approach that creates a near 1 to 1
balance will do the job.
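
The stratified randomization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name stratified_assignment, the patient identifiers, and the stratum sizes are all hypothetical, and a real trial would use a prespecified allocation scheme (for example, permuted blocks) rather than this simple shuffle.

```python
import random

def stratified_assignment(patients, seed=0):
    """Assign treatment arms separately within each stratum.

    patients: list of (patient_id, stratum) pairs, where stratum is,
    for example, "terminal" or "nonterminal".
    Returns a dict mapping patient_id to "new" or "old".
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Group patient ids by stratum.
    by_stratum = {}
    for patient_id, stratum in patients:
        by_stratum.setdefault(stratum, []).append(patient_id)

    # Within each stratum, shuffle and give the first half the new treatment.
    # If a stratum has an odd size, the extra patient goes to the old arm.
    assignment = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)
        half = len(ids) // 2
        for position, patient_id in enumerate(ids):
            assignment[patient_id] = "new" if position < half else "old"
    return assignment

# Hypothetical enrollment: 6 terminal and 4 nonterminal patients.
patients = [(f"T{i}", "terminal") for i in range(6)]
patients += [(f"N{i}", "nonterminal") for i in range(4)]
arms = stratified_assignment(patients)
```

Because the arms are balanced inside each stratum, the terminal and nonterminal patients are split evenly between treatments whatever the overall ratio of nonterminal to terminal patients is, which is what prevents the reversal seen in the combined table.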


8.3 THE GENERAL R × C TABLE


The R × C table is a generalization of the 2 × 2 table, where the column
variable can have two or more categories, their number denoted by C, and
the row variable can also have two or more categories, their number denoted
by R. The chi-square statistic has the same form, but as mentioned earlier,
the asymptotic distribution under the null hypothesis is a central
chi-square with (R − 1)(C − 1) degrees of freedom, compared with 1 for the
2 × 2 table.
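
As a minimal sketch of the computation just described, the following Python function forms the expected count in each cell from the row and column totals under independence, sums (observed − expected)² / expected over all cells, and returns the statistic together with (R − 1)(C − 1). The function name chi_square_statistic and the counts in the example are hypothetical; they are not the data of any table in this chapter.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for an R x C table of counts.

    table: list of R rows, each a list of C observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom

# Hypothetical 3 x 3 counts, chosen only to make the arithmetic easy to follow.
example = [
    [10, 20, 30],
    [20, 20, 20],
    [30, 20, 10],
]
stat, dof = chi_square_statistic(example)
print(stat, dof)
```

In this made-up table every row and column total is 60 out of 180, so each expected count is 20; only the four corner cells differ from 20, each contributing 100/20 = 5 to the statistic.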
To illustrate, we will look at an example of a 3 × 3 table. The data
is a sample taken from a registry of women with breast cancer. The
research problem is to see if there is a relationship between the ethnicity
of the patient and the stage of the cancer. The three ethnicities
considered are Caucasian, African American, and Asian. The three stages are
called in situ, local, and distant. The data in the 3 × 3 table is given
next in Table 8.5.
The chi-square statistic is again obtained by taking the observed
minus expected, squared, divided by the expected in each of the nine cells
and summing them together. We see that the Asians seem to be very
different from their expected value under the independence model.
Also, the in situ stage has all ethnicities with totals very different from
their expected values. The chi-square statistic is 552.0993. The degrees
of freedom for the chi-square are (3 − 1)(3 − 1) = 2 × 2 = 4. With 4
degrees of freedom, a value of 18.467 corresponds to a p-value of 0.001.
So the p-value for a chi-square value of 552.0993 is far smaller than 0.001.
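
The tail area for a statistic this large can be checked directly. When the degrees of freedom are even, the upper-tail probability of the chi-square distribution has a simple closed form, exp(−x/2) times a finite sum of powers of x/2, so no statistical library is needed. The sketch below uses that closed form; the function name chi_square_upper_tail_even_df is hypothetical, and the function deliberately refuses odd degrees of freedom, for which this formula does not apply.

```python
import math

def chi_square_upper_tail_even_df(x, degrees_of_freedom):
    """Return P(X > x) for a chi-square variable with an even number of df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if degrees_of_freedom <= 0 or degrees_of_freedom % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    if x < 0:
        raise ValueError("x must be nonnegative")

    half_x = x / 2.0
    term = 1.0   # the k = 0 term, (x/2)^0 / 0!
    total = 1.0
    for k in range(1, degrees_of_freedom // 2):
        term *= half_x / k   # builds (x/2)^k / k! from the previous term
        total += term
    return math.exp(-half_x) * total

# p-value for the statistic reported in the text, with (3 - 1)(3 - 1) = 4 df.
p_value = chi_square_upper_tail_even_df(552.0993, 4)
print(p_value)
```

For 4 degrees of freedom the sum has only two terms, so the p-value is exp(−x/2)(1 + x/2), which for x = 552.0993 is vanishingly small.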
Thus far, all the tables we have studied had plenty of counts in each
cell. So the chi-square test is highly appropriate and gives results very
