Introductory Biostatistics


  4. Repeat steps 2 and 3 for those variables not yet in the model. At any
     subsequent step, if none meets the criterion in step 3, no more variables
     are included in the model and the process is terminated.

Backward elimination procedure

  1. Fit the multiple regression model containing all available independent
     variables.

  2. Select the least important factor according to a certain predetermined
     criterion; this is done by considering one factor at a time and treating
     each as though it were the last variable to enter.

  3. Test for the significance of the factor selected in step 2 and determine,
     according to a certain predetermined criterion, whether or not to delete
     this factor from the model.

  4. Repeat steps 2 and 3 for those variables still in the model. At any
     subsequent step, if none meets the criterion in step 3, no more variables
     are removed from the model and the process is terminated.
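
The backward elimination loop above can be sketched in Python with ordinary least squares via NumPy. The data, the F cutoff of 4.0, and the helper names are illustrative assumptions, not from the text:

```python
import numpy as np

def reg_ss(X, y):
    """Regression sum of squares for a least-squares fit with intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return np.sum((Xd @ beta - y.mean()) ** 2)

def backward_eliminate(X, y, f_crit=4.0):
    """Repeatedly drop the least important variable (smallest partial F)
    until every remaining variable meets the retention criterion."""
    cols = list(range(X.shape[1]))
    ss_total = np.sum((y - y.mean()) ** 2)
    while cols:
        n, r = len(y), len(cols)
        full = reg_ss(X[:, cols], y)
        mse = (ss_total - full) / (n - r - 1)
        # treat each variable as though it were the last to enter
        partial_f = {j: (full - reg_ss(X[:, [c for c in cols if c != j]], y)) / mse
                     for j in cols}
        worst = min(partial_f, key=partial_f.get)
        if partial_f[worst] >= f_crit:
            break           # all remaining variables are significant
        cols.remove(worst)  # delete the least important factor and refit
    return cols
```

The cutoff `f_crit` plays the role of the "predetermined criterion" in steps 2 and 3; in practice it is often stated as an F-to-remove or a p-value threshold.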

Stepwise regression procedure. Stepwise regression is a modified version of
forward regression that permits reexamination, at every step, of the variables
incorporated in the model in previous steps. A variable entered at an early
stage may become superfluous at a later stage because of its relationship with
other variables now in the model; the information it provides becomes
redundant. That variable may be removed if it meets the elimination criterion,
and the model is refitted with the remaining variables before the forward
process resumes. The entire process, one step forward followed by one step
backward, continues until no more variables can be added or removed.
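
The alternating forward/backward cycle just described can be sketched as a driver loop. This is a minimal illustration, not the book's algorithm verbatim: the entry and removal cutoffs, the simulated data, and the function names are assumptions.

```python
import numpy as np

def _ssr(X, y, cols):
    """Regression sum of squares for the model using predictor columns `cols`."""
    Xd = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return np.sum((Xd @ beta - y.mean()) ** 2)

def stepwise(X, y, f_enter=4.0, f_remove=4.0):
    """One step forward (add the best candidate if its partial F passes
    f_enter), then one step backward (drop a previously entered variable
    whose partial F has fallen below f_remove), until nothing changes."""
    ss_total = np.sum((y - y.mean()) ** 2)
    cols = []
    while True:
        changed = False
        # forward step: examine each variable not yet in the model
        out = [j for j in range(X.shape[1]) if j not in cols]
        if out:
            base = _ssr(X, y, cols)
            best = max(out, key=lambda j: _ssr(X, y, cols + [j]))
            new = _ssr(X, y, cols + [best])
            mse = (ss_total - new) / (len(y) - len(cols) - 2)
            if (new - base) / mse >= f_enter:
                cols.append(best)
                changed = True
        # backward step: reexamine variables entered at earlier stages
        if len(cols) > 1:
            full = _ssr(X, y, cols)
            mse = (ss_total - full) / (len(y) - len(cols) - 1)
            worst = min(cols, key=lambda j: full -
                        _ssr(X, y, [c for c in cols if c != j]))
            drop_f = (full - _ssr(X, y, [c for c in cols if c != worst])) / mse
            if drop_f < f_remove:
                cols.remove(worst)  # now superfluous given the other variables
                changed = True
        if not changed:
            return cols
```

Setting `f_remove` no larger than `f_enter` prevents a variable from being dropped immediately after it is entered.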


Criteria  For the first step of the forward selection procedure, decisions are
based on individual score test results [t test, (n - 2) df]. In subsequent steps,
both forward and backward, the decision is made as follows. Suppose that
there are r independent variables already in the model and a decision is needed
in the forward selection process. Two regression models are now fitted, one
with all r current X's included to obtain the regression sum of squares (SSR_1)
and one with all r X's plus the X under investigation to obtain the regression
sum of squares (SSR_2). Define the mean square due to addition (or elimination)
as


MSR = (SSR_2 - SSR_1) / 1

Then the decision concerning the candidate variable is based on

F = MSR / MSE

an F test at (1, n - r - 1) degrees of freedom.
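
As a numerical illustration of this criterion, the partial F for a candidate variable can be computed directly from the two fitted models. The simulated data, sample size, and variable names below are made up for the example:

```python
import numpy as np

def reg_ss(X, y):
    """Regression (model) sum of squares, intercept included."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return np.sum((Xd @ beta - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)            # already in the model (r = 1)
x2 = rng.normal(size=n)            # candidate X under investigation
y = 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)

ssr1 = reg_ss(x1.reshape(-1, 1), y)          # all r current X's
ssr2 = reg_ss(np.column_stack([x1, x2]), y)  # r X's plus the candidate
msr = (ssr2 - ssr1) / 1                      # mean square due to addition
# MSE from the larger model's residual sum of squares
mse = (np.sum((y - y.mean()) ** 2) - ssr2) / (n - 3)
f_stat = msr / mse                           # refer to the F distribution
```

A large `f_stat` (relative to the chosen F critical value) indicates that the candidate variable adds significant explanatory power beyond the X's already in the model.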


304 CORRELATION AND REGRESSION
