The odds ratio has at least two things in its favor. In the first place, it can be calculated in
situations in which a true risk ratio cannot be. In a retrospective study, where we find a group of
people with heart attacks and another group of people without heart attacks, and look back to
see if they took aspirin, we can’t really calculate risk. Risk is future oriented. If we give 1000
people aspirin and withhold it from 1000 others, we can look at these people ten years down the
road and calculate the risk (and risk ratio) of heart attacks. But if we take 1000 people with heart
attacks and 1000 without and look backward, we can’t really calculate risk because we have sampled
heart attack patients at far greater than their normal rate in the population (50% of our sample
has had a heart attack, but certainly 50% of the population does not suffer from heart attacks).
But we can always calculate odds ratios. And, when we are talking about low probability events,
such as having a heart attack, the odds ratio is usually a very good estimate of what the risk ratio
would be.^9 (Sackett, Deeks, & Altman (1996), referred to above, agree that this is one case where
an odds ratio is useful—and it is useful primarily because in this case it is so close to a relative
risk.) The odds ratio is equally valid for prospective, retrospective, and cross-sectional sampling
designs. That is important. However, when you do have a prospective study the risk ratio can be
computed and actually comes closer to the way we normally think about risk.
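The two claims in this paragraph can be checked with a little arithmetic. The sketch below is only an
illustration: the probabilities and cell counts are hypothetical, chosen for convenience, and the
function names are made up for this example. The first part holds the risk ratio at 2 and shows how
the odds ratio drifts away from it as the event becomes more common. The second part takes an
idealized prospective table, keeps every heart attack patient but only 5% of the others (roughly what
retrospective sampling does), and compares the two statistics before and after.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def risk_ratio(p1, p2):
    """Ratio of two risks (probabilities)."""
    return p1 / p2


def odds_ratio(p1, p2):
    """Ratio of the odds implied by two probabilities."""
    return odds(p1) / odds(p2)


# Part 1: hypothetical pairs of risks, each with a risk ratio of 2.
for p1, p2 in [(0.02, 0.01), (0.20, 0.10), (0.60, 0.30)]:
    print(f"p1 = {p1:.2f}, p2 = {p2:.2f}: "
          f"RR = {risk_ratio(p1, p2):.2f}, OR = {odds_ratio(p1, p2):.2f}")


def table_stats(a, b, c, d):
    """Return (risk ratio, odds ratio) for a 2 x 2 table.

    a, b = event / no event in group 1; c, d = event / no event in group 2.
    """
    rr = (a / (a + b)) / (c / (c + d))
    or_ = (a * d) / (b * c)
    return rr, or_


# Part 2: hypothetical counts, not data from any study.
# Prospective: aspirin (20 attacks, 980 none), no aspirin (40 attacks, 960 none).
cohort = (20, 980, 40, 960)
# Retrospective-style: same heart attack patients, but only 5% of the others.
retro = (20, 49, 40, 48)

cohort_rr, cohort_or = table_stats(*cohort)
retro_rr, retro_or = table_stats(*retro)
print(f"prospective:   RR = {cohort_rr:.3f}, OR = {cohort_or:.3f}")
print(f"retrospective: RR = {retro_rr:.3f}, OR = {retro_or:.3f}")
```

With the risk ratio fixed at 2, the odds ratio is about 2.02 for the rare event, 2.25 for the
moderately common one, and 3.5 for the common one. In the second part the odds ratio is the same in
both tables, while the "risk ratio" computed from the retrospective-style table no longer matches
the prospective value of 0.5.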
A second important advantage of the odds ratio is that taking the natural log of the odds
ratio [ln(OR)] gives us a statistic that is extremely useful in a variety of situations. Two of these
are logistic regression and log-linear models, both of which are discussed later in the book.
I don’t expect most people to be excited by the fact that a logarithmic transformation of the
odds ratio has interesting statistical properties, but that is a very important point nonetheless.
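One property that helps explain why ln(OR) is convenient can be shown in a few lines. This is a
minimal sketch using hypothetical counts and a made-up helper name, not an example from the studies
discussed here. The odds ratio itself is lopsided: values favoring one group are squeezed between 0
and 1, while values favoring the other run from 1 upward. Taking the natural log centers the scale
at 0, so swapping which group goes in the numerator changes only the sign.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2 x 2 table: (a / b) divided by (c / d).

    a, b = event / no event in group 1; c, d = event / no event in group 2.
    """
    return (a / b) / (c / d)


# Hypothetical counts: group 1 has 20 events and 980 non-events,
# group 2 has 40 events and 960 non-events.
or_1_vs_2 = odds_ratio(20, 980, 40, 960)
or_2_vs_1 = odds_ratio(40, 960, 20, 980)

log_1_vs_2 = math.log(or_1_vs_2)
log_2_vs_1 = math.log(or_2_vs_1)

print(f"OR (1 vs 2) = {or_1_vs_2:.3f}, ln(OR) = {log_1_vs_2:.3f}")
print(f"OR (2 vs 1) = {or_2_vs_1:.3f}, ln(OR) = {log_2_vs_1:.3f}")
```

The two odds ratios are reciprocals of each other (about 0.49 and 2.04), which are not equally far
from 1, but their logs are equal in size and opposite in sign.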
Odds Ratios in 2 × k Tables
When we have a simple 2 × 2 table the calculation of the odds ratio (or the risk ratio) is
straightforward. We simply take the ratio of the two odds (or risks). But when the table is a
2 × k table things are a bit more complicated because we have three or more sets of odds,
and it is not clear what should form our ratio. Sometimes odds ratios here don’t make much
sense, but sometimes they do—especially when the levels of one variable form an ordered
series. The data from Jankowski’s study of sexual abuse offer a good illustration. These
data are reproduced in Table 6.11.
Because this study was looking at how adult abuse is influenced by earlier childhood
abuse, it makes sense to use the group who suffered no childhood abuse as the reference
group. We can then take the odds ratio of each of the other groups against this one. For example,
^9 The odds ratio can be defined as $OR = RR \left( \frac{1 - p_2}{1 - p_1} \right)$, where OR = odds ratio, RR = relative risk, $p_1$ is the
population proportion of heart attacks in one group, and $p_2$ is the population proportion of heart attacks in the
other group. When those two proportions are close to 0, they nearly cancel each other and $OR \approx RR$.
Table 6.11 Adult sexual abuse related to prior childhood sexual abuse

Number of Child    Abused as Adult
Abuse Categories        No     Yes   Total    Risk    Odds
0                      512      54     566    .095    .106
1                      227      37     264    .140    .163
2                       59      15      74    .203    .254
3–4                     18      12      30    .400    .667
Total                  816     118     934    .126    .145
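
The comparison against the reference group described above can be sketched directly from the counts
in Table 6.11. This is only an illustrative calculation: the variable and function names are made up
for the example, and because it works from the raw counts rather than the rounded odds in the table,
the results may differ in the last digit from values computed by hand from the rounded figures.

```python
# Counts from Table 6.11: (not abused as adult, abused as adult)
# for each number of childhood abuse categories.
table = {
    "0": (512, 54),
    "1": (227, 37),
    "2": (59, 15),
    "3-4": (18, 12),
}


def risk(no, yes):
    """Proportion of the group that was abused as an adult."""
    return yes / (no + yes)


def odds(no, yes):
    """Odds of adult abuse: abused divided by not abused."""
    return yes / no


# The group with no childhood abuse serves as the reference group.
reference = "0"
ref_odds = odds(*table[reference])
ref_risk = risk(*table[reference])

odds_ratios = {}
risk_ratios = {}
for category, (no, yes) in table.items():
    odds_ratios[category] = odds(no, yes) / ref_odds
    risk_ratios[category] = risk(no, yes) / ref_risk
    print(f"{category:>3}: risk = {risk(no, yes):.3f}, odds = {odds(no, yes):.3f}, "
          f"RR vs {reference} = {risk_ratios[category]:.2f}, "
          f"OR vs {reference} = {odds_ratios[category]:.2f}")
```

The reference group's ratios are 1 by construction, and both the risk ratio and the odds ratio rise
steadily across the ordered categories, with the odds ratio pulling further ahead of the risk ratio
as adult abuse becomes more common in the group.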