Introductory Biostatistics


interaction terms: mother’s weight × smoking, mother’s weight × hypertension,
and mother’s weight × uterine irritability. The basic idea is to see if any of
the other variables would modify the effect of the mother’s weight on the
response (having a low-birth-weight baby).
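As a minimal sketch of how such product terms might be formed, the snippet below multiplies the mother's weight by each of the other three covariates. The field names (`weight`, `smoking`, `hypertension`, `uterine_irritability`) and the example record are hypothetical and are not taken from Table 11.14.

```python
def add_weight_interactions(row):
    """Return a copy of row with three weight-by-covariate product terms added."""
    out = dict(row)
    for other in ("smoking", "hypertension", "uterine_irritability"):
        # Product term: mother's weight times the other covariate
        out["weight_x_" + other] = row["weight"] * row[other]
    return out


# Hypothetical record, for illustration only (not from the book's data)
example = {"weight": 120, "smoking": 1, "hypertension": 0, "uterine_irritability": 1}
expanded = add_weight_interactions(example)
print(expanded)
```

With 0/1 indicators, each product equals the mother's weight when the factor is present and zero otherwise, so its coefficient measures how the weight effect changes in that subgroup.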



  1. With the original four variables, we obtained ln L = -16.030.

  2. With all seven variables, four original plus three products, we obtained
    ln L = -14.199.


Therefore, we have


$$\chi^2_{LR} = 2\,[\ln L(\hat{\beta};\ \text{seven variables}) - \ln L(\hat{\beta};\ \text{four original variables})] = 3.662;\quad 3\ \text{df};\quad p\ \text{value} > 0.10$$

indicating a rather weak level of interactions.
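The arithmetic behind this test can be checked with a short sketch. It takes the two log likelihoods as -16.030 and -14.199 (the signs that reproduce the reported statistic of 3.662) and uses 7 - 4 = 3 degrees of freedom. The helper `chi2_sf` is a hypothetical name; it evaluates the chi-square upper tail for integer degrees of freedom from the standard closed-form series, using only the Python standard library.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{j=0}^{df/2 - 1} h^j / j!
        return exp(-h) * sum(h ** j / gamma(j + 1) for j in range(df // 2))
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{j=1}^{(df-1)/2} h^(j - 1/2) / Gamma(j + 1/2)
    upper = (df - 1) // 2 + 1
    return erfc(sqrt(h)) + exp(-h) * sum(
        h ** (j - 0.5) / gamma(j + 0.5) for j in range(1, upper)
    )


loglik_four = -16.030   # four original variables
loglik_seven = -14.199  # four original variables plus three products
df = 7 - 4              # number of added product terms

lr_stat = 2.0 * (loglik_seven - loglik_four)
p_value = chi2_sf(lr_stat, df)
print(f"LR chi-square = {lr_stat:.3f}, df = {df}, p = {p_value:.2f}")
```

The statistic comes out at 3.662 and the p value at roughly 0.30, well above 0.10, which agrees with the conclusion that the interactions are weak.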


Stepwise Regression

In many applications our major interest is to identify
important risk factors. In other words, we wish to identify from many available
factors a small subset of factors that relate significantly to the outcome (e.g.,
the disease under investigation). In that identification process, of course, we
wish to avoid a large type I (false positive) error. In a regression analysis, a type
I error corresponds to including a predictor that has no real relationship to the
outcome; such an inclusion can greatly confuse interpretation of the regression
results. In a standard multiple regression analysis, this goal can be achieved by
using a strategy that adds to or removes from a regression model one factor at
a time according to a certain order of relative importance. Therefore, the two
important steps are as follows:



  1. Specify a criterion or criteria for selecting a model.

  2. Specify a strategy for applying the criterion or criteria chosen.


The process follows the outline of Chapter 5 for logistic regression, combining
the forward selection and backward elimination in the stepwise process,
with selection at each step based on the likelihood ratio chi-square test. SAS’s
PROC PHREG does have an automatic stepwise option to implement these
features.
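The stepwise strategy described above can be sketched as follows. This is an illustration of the general idea rather than the algorithm SAS implements: at each pass the candidate with the smallest 1-df likelihood ratio p value enters if it falls below the entry level, and then the weakest variable in the model is dropped if its p value exceeds the stay level. The function names, the `loglik` callback, the variable names `x1` to `x3`, and the toy log-likelihood values are all hypothetical; the 0.10 and 0.15 defaults mirror the levels chosen in Example 11.18.

```python
from math import erfc, sqrt


def lr_pvalue_1df(loglik_big, loglik_small):
    """p value of a 1-df likelihood ratio chi-square test between nested models."""
    stat = max(0.0, 2.0 * (loglik_big - loglik_small))
    return erfc(sqrt(stat / 2.0))  # chi-square upper tail with 1 df


def stepwise(candidates, loglik, p_enter=0.10, p_stay=0.15, max_steps=50):
    """Forward selection plus backward elimination; loglik maps a frozenset to ln L."""
    model = []
    for _ in range(max_steps):
        changed = False
        # Forward step: find the most significant variable not yet in the model
        base = loglik(frozenset(model))
        best = None
        for v in candidates:
            if v in model:
                continue
            p = lr_pvalue_1df(loglik(frozenset(model + [v])), base)
            if best is None or p < best[1]:
                best = (v, p)
        if best is not None and best[1] < p_enter:
            model.append(best[0])
            changed = True
        # Backward step: find the least significant variable currently in the model
        if model:
            full = loglik(frozenset(model))
            worst = None
            for v in model:
                reduced = frozenset(u for u in model if u != v)
                p = lr_pvalue_1df(full, loglik(reduced))
                if worst is None or p > worst[1]:
                    worst = (v, p)
            if worst[1] > p_stay:
                model.remove(worst[0])
                changed = True
        if not changed:
            break
    return model


# Toy log likelihood with made-up additive gains (illustration only, not real data)
GAINS = {"x1": 3.0, "x2": 1.5, "x3": 0.2}


def toy_loglik(variables):
    return -20.0 + sum(GAINS[v] for v in variables)


selected = stepwise(["x1", "x2", "x3"], toy_loglik)
print(selected)
```

In practice `loglik` would refit the regression model for each subset of covariates and return its maximized log likelihood. Keeping the entry level below the stay level, as here, makes it unlikely that a variable is added and then removed in the same pass.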


Example 11.18 Refer to the data for low-birth-weight babies in Example
11.11 (Table 11.14) with all four covariates: mother’s weight, smoking,
hypertension, and uterine irritability. This time we perform a stepwise regression
analysis in which we specify that a variable has to be significant at the 0.10
level before it can enter into the model and that a variable in the model has to
be significant at 0.15 for it to remain in the model (most standard computer
programs allow users to make these selections; default values are available).


