hypertensive subject with a 220 cholesterol count is exp(0.1960) = 1.2166. The
estimated odds ratio for CAT = 1 vs. CAT = 0 for a nonhypertensive subject with a
220 cholesterol count is exp(2.5278) = 12.5262. The table titled “Contrast Results”
gives the chi-square test statistic (53.16) and p-value (<0.0001) for the likelihood ratio
test on the two interaction terms.
C. Events/Trials Format
The Evans County dataset evans.dat contains individual level data. Each observation
represents an individual subject. PROC LOGISTIC and PROC GENMOD also accommodate
summarized binomial data in which each observation contains a count of the
number of events and trials for a particular pattern of covariates. The dataset EVANS2
summarizes the 609 observations of the EVANS data into eight observations, where
each observation contains a count of the number of events and trials for a particular
pattern of covariates. The dataset contains five variables described below:
CASES – number of coronary heart disease cases
TOTAL – number of subjects at risk in the stratum
CAT – serum catecholamine level (1 = high, 0 = normal)
AGEGRP – dichotomized age variable (1 = age ≥ 55, 0 = age < 55)
ECG – electrocardiogram abnormality (1 = abnormal, 0 = normal)
The code to produce the dataset is shown next. The dataset is small enough that it can
be easily entered manually.
DATA EVANS2;
INPUT CASES TOTAL CAT AGEGRP ECG;
CARDS;
17 274 0 0 0
15 122 0 1 0
7 59 0 0 1
5 32 0 1 1
1 8 1 0 0
9 39 1 1 0
3 17 1 0 1
14 58 1 1 1
;
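As an alternative to manual entry, the same kind of summary could be derived from the individual level data. The following is a minimal sketch, not the method used to build EVANS2: it assumes a SAS dataset named EVANS with one row per subject and variables CHD (1 = case, 0 = noncase), CAT, AGE (in years), and ECG. The names EVANS_GRP and EVANS2_CHECK are hypothetical, chosen here for illustration.

```sas
/* Assumption: EVANS is an individual level SAS dataset with variables
   CHD (1 = case, 0 = noncase), CAT, AGE (in years), and ECG. */
DATA EVANS_GRP;
  SET EVANS;
  AGEGRP = (AGE >= 55);   /* 1 if age is 55 or older, 0 otherwise */
RUN;

/* One output row per observed CAT x AGEGRP x ECG pattern */
PROC SQL;
  CREATE TABLE EVANS2_CHECK AS
  SELECT SUM(CHD) AS CASES,    /* number of events in the stratum   */
         COUNT(*) AS TOTAL,    /* number of subjects in the stratum */
         CAT, AGEGRP, ECG
  FROM EVANS_GRP
  GROUP BY CAT, AGEGRP, ECG;
QUIT;

/* Column sums as a sanity check on the summarized counts */
PROC MEANS DATA=EVANS2_CHECK SUM;
  VAR CASES TOTAL;
RUN;
```

If the assumptions hold, the sum of TOTAL should equal the 609 subjects in the EVANS data, and the eight rows can be compared against the values entered after the CARDS statement.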
To run a logistic regression on the summarized data EVANS2, the response is put into
an EVENTS/TRIALS form for either PROC LOGISTIC or PROC GENMOD. The
model is stated as follows:
\[ \operatorname{logit} P(\mathrm{CHD}=1 \mid \mathbf{X}) = \beta_0 + \beta_1\,\mathrm{CAT} + \beta_2\,\mathrm{AGEGRP} + \beta_3\,\mathrm{ECG} \]
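It may help to write out what this model implies for the summarized data. In the sketch below, the row index i and the symbol p_i are notation introduced here for illustration; they do not appear in the SAS output. Each of the eight rows of EVANS2 is treated as a binomial count:

```latex
\mathrm{CASES}_i \sim \mathrm{Binomial}(\mathrm{TOTAL}_i,\; p_i), \qquad i = 1, \dots, 8,
\\[4pt]
\operatorname{logit} p_i = \beta_0 + \beta_1\,\mathrm{CAT}_i + \beta_2\,\mathrm{AGEGRP}_i + \beta_3\,\mathrm{ECG}_i
```

Because all subjects in a row share the same covariate values, the binomial likelihood differs from the individual level likelihood only by a constant factor, so the two forms of the data give the same maximum likelihood estimates for this set of predictors. With no interaction terms in the model, the odds ratio for CAT = 1 vs. CAT = 0, adjusted for AGEGRP and ECG, is exp(β1).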
The code to run the model in PROC LOGISTIC using the dataset EVANS2 is:
PROC LOGISTIC DATA=EVANS2;
MODEL CASES/TOTAL = CAT AGEGRP ECG;
RUN;