REFERENCE
Chapter 11: Analysis of Matched Data Using Logistic Regression
In contrast, consider a case-control study
involving 100 matched pairs. Suppose that the
outcome variable is lung cancer and that con-
trols are matched to cases on age, race, sex, and
location. Suppose also that smoking status, a
potential confounder denoted as SMK, is not
matched but is nevertheless determined for
both cases and controls, and that the primary
exposure variable of interest, labeled as E, is
some dietary characteristic, such as whether or
not a subject has a high-fiber diet.
Because the study design involves matching, a
logistic model to analyze these data must control
for the matching by using dummy variables
to reflect the different matching strata, each
of which involves a different matched pair.
Assuming the model has an intercept, the
model will need 99 dummy variables to incor-
porate the 100 matched pairs. Besides these
variables, the model contains the exposure variable
E, the covariable SMK, and perhaps even
an interaction term of the form E × SMK.
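To make the structure of this model concrete, the following is a minimal sketch of its design matrix, assuming NumPy and randomly generated 0/1 values for E and SMK. The data and the variable names (pair_id, dummies, X) are hypothetical and serve only to show how the columns are laid out.

```python
import numpy as np

rng = np.random.default_rng(0)  # made-up data, for illustration only

n_pairs = 100
# Each matched pair contributes two subjects (one case, one control).
pair_id = np.repeat(np.arange(n_pairs), 2)

# Hypothetical 0/1 values for the exposure E and smoking status SMK.
E = rng.integers(0, 2, size=2 * n_pairs)
SMK = rng.integers(0, 2, size=2 * n_pairs)

intercept = np.ones((2 * n_pairs, 1))
# One indicator column per matched pair, omitting the first pair,
# which acts as the reference stratum because the model has an intercept.
dummies = (pair_id[:, None] == np.arange(1, n_pairs)[None, :]).astype(float)

# Columns: intercept, 99 stratum dummies, E, SMK, and the product E x SMK.
X = np.column_stack([intercept, dummies, E, SMK, E * SMK])
print(X.shape)  # (200, 103)
```

Each dummy column contains exactly two 1s, one for the case and one for the control of that pair.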
To obtain the number of parameters in the
model, we must count the one intercept,
the coefficients of the 99 dummy variables,
the coefficient of E, the coefficient of SMK,
and the coefficient of the product term E ×
SMK. The total number of parameters is 103.
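The tally above can be written as a short helper. This is only an illustrative sketch: the function name count_parameters and its arguments are assumptions introduced here, and the no-intercept branch reflects the usual convention that a model without an intercept carries one dummy variable per stratum.

```python
def count_parameters(n_pairs, n_other_terms, intercept=True):
    """Tally parameters for a dummy-variable logistic model of matched pairs."""
    n_intercept = 1 if intercept else 0
    # With an intercept, one stratum is the reference, so n_pairs - 1 dummies.
    n_dummies = n_pairs - 1 if intercept else n_pairs
    return n_intercept + n_dummies + n_other_terms


# Three other terms: E, SMK, and the product E x SMK.
total = count_parameters(n_pairs=100, n_other_terms=3)
print(total)  # 103
```

Because the number of dummy variables is tied to the number of matched pairs, the parameter count grows whenever more pairs are added to the study.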
Because there are 100 matched pairs in the
study, the total number of subjects is, therefore,
200. This situation requires conditional ML estimation
because the number of parameters, 103, is quite large
relative to the number of subjects, 200.
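As a rough illustration of why the conditional approach helps here, the sketch below evaluates the conditional log-likelihood for 1:1 matched pairs, assuming NumPy. In this form the intercept and the matching-stratum terms cancel within each pair, so only the coefficients of E, SMK, and E × SMK remain to be estimated. The data are randomly generated, and the function names (design, conditional_loglik) are hypothetical.

```python
import numpy as np


def design(E, SMK):
    """Columns for the three terms of interest: E, SMK, and E x SMK."""
    return np.column_stack([E, SMK, E * SMK]).astype(float)


def conditional_loglik(beta, x_case, x_control):
    """Conditional log-likelihood for 1:1 matched pairs.

    x_case and x_control have shape (n_pairs, p); row i holds the
    covariates of the case and of the control in pair i.
    """
    diff = (x_case - x_control) @ beta
    # Each pair contributes
    #   log[ exp(b'x_case) / (exp(b'x_case) + exp(b'x_control)) ]
    # which equals -log(1 + exp(-diff)); stratum terms have cancelled.
    return -np.sum(np.logaddexp(0.0, -diff))


rng = np.random.default_rng(1)  # made-up data, for illustration only
n_pairs = 100
x_case = design(rng.integers(0, 2, n_pairs), rng.integers(0, 2, n_pairs))
x_control = design(rng.integers(0, 2, n_pairs), rng.integers(0, 2, n_pairs))

# With all three coefficients set to 0, every pair contributes log(1/2).
ll_null = conditional_loglik(np.zeros(3), x_case, x_control)
```

Maximizing this function over the three coefficients involves 3 parameters rather than 103, which is the practical advantage of conditioning in a matched design.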
A detailed discussion of logistic regression for
matched data is provided in Chap. 11.
EXAMPLE: Conditional Preferred
Case-control study
100 matched pairs
D = lung cancer
Matching variables:
age, race, sex, location
Other variables:
SMK (a confounder)
E (dietary characteristic)
Logistic model for matching:
uses dummy variables for
matching strata
99 dummy variables for 100
strata
E, SMK, and E × SMK also in
model
Number of parameters =
1 + 99 + 3 = 103
(1 intercept; 99 dummy variables; 3 for E, SMK, and E × SMK)
(large relative to 100 matched pairs, n = 200)