CAT – A dichotomous predictor variable indicating high (coded 1) or normal
(coded 0) catecholamine level.
AGE – A continuous variable for age (in years).
CHL – A continuous variable for cholesterol.
SMK – A dichotomous predictor variable indicating whether the subject ever
smoked (coded 1) or never smoked (coded 0).
ECG – A dichotomous predictor variable indicating the presence (coded 1) or
absence (coded 0) of electrocardiogram abnormality.
DBP – A continuous variable for diastolic blood pressure.
SBP – A continuous variable for systolic blood pressure.
HPT – A dichotomous predictor variable indicating the presence (coded 1) or
absence (coded 0) of high blood pressure. HPT is coded 1 if the systolic
blood pressure is greater than or equal to 160 or the diastolic blood
pressure is greater than or equal to 95.
CH and CC – Product terms CAT × HPT and CAT × CHL, respectively.
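The derived variables above (HPT, CH, and CC) follow directly from the other variables, so they can be recomputed as a check. The following is a minimal Python sketch of those definitions; the function name derive_terms and the sample values are hypothetical and are not taken from the dataset.

```python
def derive_terms(cat, chl, sbp, dbp):
    """Return (HPT, CH, CC) for one subject from the stated definitions."""
    # HPT is 1 if systolic BP >= 160 or diastolic BP >= 95, else 0.
    hpt = 1 if (sbp >= 160 or dbp >= 95) else 0
    # CH and CC are the product terms CAT x HPT and CAT x CHL.
    ch = cat * hpt
    cc = cat * chl
    return hpt, ch, cc


if __name__ == "__main__":
    # Made-up subject: high catecholamine, cholesterol 200, SBP 160, DBP 80.
    print(derive_terms(cat=1, chl=200, sbp=160, dbp=80))
```

Comparing these recomputed values with the stored HPT, CH, and CC columns is one way to confirm that a dataset was read in correctly.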
- MI dataset (mi.dat)
This dataset is used to demonstrate conditional logistic regression. The MI dataset is
discussed in Chap. 11. The study is a case-control study that involves 117 subjects in
39 matched strata. Each stratum contains three subjects, one of whom is a case
diagnosed with myocardial infarction while the other two are matched controls. The
variables are defined as follows:
MATCH – A variable indicating the subject’s matched stratum. Each stratum
contains one case and two controls and is matched on age, race, sex, and
hospital status.
PERSON – The subject identifier. Each observation has a unique identifier since
there is one observation per subject.
MI – A dichotomous outcome variable indicating the presence (coded 1) or
absence (coded 0) of myocardial infarction.
SMK – A dichotomous variable indicating whether the subject is (coded 1) or is
not (coded 0) a current smoker.
SBP – A continuous variable for systolic blood pressure.
ECG – A dichotomous predictor variable indicating the presence (coded 1) or
absence (coded 0) of electrocardiogram abnormality.
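The matched design described above (39 strata × 3 subjects = 117 subjects, with one case and two controls per stratum) can be verified before fitting a conditional model. The sketch below is one possible check in Python. It assumes the data have already been read into (MATCH, PERSON, MI) tuples, since the column layout of mi.dat is not described here; the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def check_matched_strata(rows, n_cases=1, n_controls=2):
    """Return the sorted list of strata that violate the expected design.

    rows: iterable of (match, person, mi) tuples, with mi coded 0 or 1.
    """
    # counts[stratum] = [number of controls (MI=0), number of cases (MI=1)]
    counts = defaultdict(lambda: [0, 0])
    for match, _person, mi in rows:
        counts[match][mi] += 1
    return [
        stratum
        for stratum, (controls, cases) in sorted(counts.items())
        if cases != n_cases or controls != n_controls
    ]


if __name__ == "__main__":
    # Made-up rows: stratum 1 is well formed, stratum 2 has two cases.
    sample = [(1, 1, 1), (1, 2, 0), (1, 3, 0),
              (2, 4, 1), (2, 5, 0), (2, 6, 1)]
    print(check_matched_strata(sample))
```

An empty list indicates that every stratum has the expected composition of one case and two controls.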
- Cancer dataset (cancer.dat)
This dataset is used to demonstrate polytomous and ordinal logistic regression. The
cancer dataset, discussed in Chaps. 12 and 13, is part of a study of cancer survival
(Hill et al., 1995). The study involves 288 women who had been diagnosed with
endometrial cancer. The variables are defined as follows:
ID – The subject identifier. Each observation has a unique identifier since there
is one observation per subject.
GRADE – A three-level ordinal outcome variable indicating tumor grade.
The grades are well differentiated (coded 0), moderately differentiated
(coded 1), and poorly differentiated (coded 2).
RACE – A dichotomous variable indicating whether the race of the subject is
black (coded 1) or white (coded 0).