Appendix: Computer Programs for Logistic Regression
In this appendix, we provide examples of computer programs to carry out
unconditional logistic regression, conditional logistic regression, polytomous
logistic regression, ordinal logistic regression, and GEE logistic regression.
This appendix does not
give an exhaustive survey of all computer packages currently available, but rather is
intended to describe the similarities and differences among a sample of the most
widely used packages. The software packages that we consider are SAS version 9.2,
SPSS version 16.0, and Stata version 10.0. A detailed description of these packages is
beyond the scope of this appendix. Readers are referred to the built-in Help functions
for each program for further information.
The computer syntax and output presented in this appendix are obtained from
running models on five datasets. We provide each of these datasets on an
accompanying disk in four forms: (1) as text datasets (with a .dat extension),
(2) as SAS version 9 datasets (with a .sas7bdat extension), (3) as SPSS datasets
(with a .sav extension), and (4) as Stata datasets (with a .dta extension). Each
dataset is described below. We suggest making backup copies of the datasets prior to use to
avoid accidentally overwriting the originals.
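One way to make such backup copies is sketched below in Python, a language this appendix does not otherwise use. The function name backup_datasets and the directory names in the usage note are hypothetical, chosen only for illustration. The sketch assumes the dataset files sit together in a single directory and are identified by the four extensions listed above.

```python
import shutil
from pathlib import Path

# Extensions of the four dataset forms described above.
DATASET_EXTENSIONS = {".dat", ".sas7bdat", ".sav", ".dta"}


def backup_datasets(source_dir, backup_dir):
    """Copy every dataset file in source_dir into backup_dir.

    Returns the list of backup paths written by this call. A backup that
    already exists is left untouched, so a later run cannot overwrite it.
    (Hypothetical helper for illustration only.)
    """
    source = Path(source_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source.iterdir()):
        # Skip subdirectories and files that are not one of the four forms.
        if path.is_file() and path.suffix.lower() in DATASET_EXTENSIONS:
            destination = target / path.name
            if not destination.exists():
                shutil.copy2(path, destination)  # copy2 also keeps timestamps
                copied.append(destination)
    return copied
```

For example, backup_datasets("datasets", "datasets_backup") would copy evans.dat and its companion files from a directory named datasets into datasets_backup, assuming those directory names match where the files were placed.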
DATASETS
- Evans County dataset (evans.dat)
The evans.dat dataset is used to demonstrate a standard (unconditional) logistic
regression. The Evans County dataset is discussed in Chap. 2. The data are from
a cohort study in which 609 white males were followed for 7 years, with coronary
heart disease as the outcome of interest. The variables are defined as follows:
ID – The subject identifier. Each observation has a unique identifier since there
is one observation per subject.
CHD – A dichotomous outcome variable indicating the presence (coded 1) or
absence (coded 0) of coronary heart disease.