The variables are defined as follows:
FRACTURE – A dichotomous variable coded 1 for a knee fracture, 0 for no knee
fracture (obtained from X-ray)
FLEX – A dichotomous variable for the ability to flex the knee, coded 0=yes, 1=no
WEIGHT – A dichotomous variable for the ability to put weight on the knee,
coded 0=yes, 1=no
AGECAT – A dichotomous variable for patient’s age, coded 0 if age < 55, 1 if
age ≥ 55
HEAD – A dichotomous variable for injury to the knee head, coded 0=no,
1=yes
PETELLAR – A dichotomous variable for injury to the patella, coded 0=no,
1=yes
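As an aside, the 0/1 coding of AGECAT can be sketched in a few lines of Python. Python is not one of the packages covered in this appendix, and the function name `code_agecat` and the sample ages are hypothetical, chosen only to show how a continuous age maps to the indicator.

```python
def code_agecat(age):
    """Return 1 if age is 55 or older, 0 otherwise (the AGECAT coding)."""
    return 1 if age >= 55 else 0

# Hypothetical ages for illustration only; not taken from the knee fracture data.
sample_ages = [23, 54, 55, 71]
agecat = [code_agecat(a) for a in sample_ages]
print(agecat)  # [0, 0, 1, 1]
```

The remaining dichotomous variables follow the same pattern, with the category coded 1 stated in each definition.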
We first illustrate how to perform analyses of these datasets using SAS, followed by
SPSS, and finally Stata. Not all of the output produced from each procedure will be
presented, as some of the output is extraneous to our discussion.
SAS
Analyses are carried out in SAS by using the appropriate SAS procedure on a SAS
dataset. Each SAS procedure begins with the word PROC. The following SAS
procedures are used to perform the analyses in this appendix.
- PROC LOGISTIC – This procedure can be used to run logistic regression
(unconditional and conditional), general polytomous logistic regression, and
ordinal logistic regression using the proportional odds model.
- PROC GENMOD – This procedure can be used to run generalized linear models
(GLM – including unconditional logistic regression and ordinal logistic
regression) and GEE models.
- PROC GLIMMIX – This procedure can be used to run generalized linear mixed
models (GLMMs).
The capabilities of these procedures are not limited to performing the analyses listed
above. However, our goal is to demonstrate only the types of modeling presented in
this text.
Unconditional Logistic Regression
A. PROC LOGISTIC
The first illustration presented is an unconditional logistic regression with PROC
LOGISTIC using the Evans County data. The dichotomous outcome variable is CHD
and the predictor variables are: CAT, AGE, CHL, ECG, SMK, and HPT. Two
interaction terms, CH and CC, are also included. CH is the product CAT × HPT,
while CC is the product CAT × CHL. The variables representing the interaction
terms have already been included in the datasets.
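Although the product terms are already present in the datasets, it may help to see how they are formed. The following Python sketch is illustrative only: the helper `add_interactions` and the subject's values are hypothetical and are not drawn from the Evans County data.

```python
def add_interactions(record):
    """Return a copy of the record with CH = CAT*HPT and CC = CAT*CHL added."""
    out = dict(record)
    out["CH"] = out["CAT"] * out["HPT"]
    out["CC"] = out["CAT"] * out["CHL"]
    return out

# Hypothetical subject for illustration; not an actual observation.
subject = {"CAT": 1, "HPT": 1, "CHL": 220}
with_terms = add_interactions(subject)
print(with_terms["CH"], with_terms["CC"])  # 1 220
```

Because CAT is a 0/1 variable, CH equals HPT and CC equals CHL when CAT = 1, and both are 0 when CAT = 0.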
The model is stated as follows:
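\[
\operatorname{logit} P(\mathrm{CHD}=1 \mid \mathbf{X}) = \beta_0 + \beta_1 \mathrm{CAT} + \beta_2 \mathrm{AGE} + \beta_3 \mathrm{CHL} + \beta_4 \mathrm{ECG} + \beta_5 \mathrm{SMK} + \beta_6 \mathrm{HPT} + \beta_7 \mathrm{CH} + \beta_8 \mathrm{CC}
\]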
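To make the structure of this model concrete, the sketch below evaluates the logit for one subject and converts it to a probability with the inverse logit, P = 1 / (1 + exp(-logit)). It is written in Python purely for illustration. The function names and the subject's values are hypothetical, and the coefficients are placeholders set to zero; they are not estimates from the Evans County data.

```python
import math

PREDICTORS = ["CAT", "AGE", "CHL", "ECG", "SMK", "HPT", "CH", "CC"]

def logit_chd(x, b0, b):
    """Linear predictor: b0 plus the sum of b[k] * x[k] over the predictors."""
    return b0 + sum(b[name] * x[name] for name in PREDICTORS)

def prob_chd(x, b0, b):
    """Inverse logit: converts the linear predictor to P(CHD = 1 | X)."""
    return 1.0 / (1.0 + math.exp(-logit_chd(x, b0, b)))

# Placeholder coefficients (all zero); NOT estimates from any fitted model.
b0 = 0.0
b = {name: 0.0 for name in PREDICTORS}

# Hypothetical subject; CH and CC are the products CAT*HPT and CAT*CHL.
x = {"CAT": 1, "AGE": 50, "CHL": 220, "ECG": 0, "SMK": 1, "HPT": 1,
     "CH": 1, "CC": 220}

print(prob_chd(x, b0, b))  # 0.5, because the logit is 0 when every coefficient is 0
```

In practice the coefficients would be replaced by the maximum likelihood estimates reported by the software.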