Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

c. Yes. The high VDP value on F × BASE suggests
that this product term is involved in a collinearity
problem. Such a problem was not previously found
or even considered when using SAS’s LOGISTIC
procedure to evaluate interaction. If it is decided
that F × BASE is collinear with other terms, then it
should be dropped from the model before any
further modeling is carried out.
d. The “best” choice is iii.


  1. a. Suggested strategy: For each subject in the dataset,
    compute DeltaBetas for the variables F and BASE
    in your initial model and in your final “best” model.
    Using plots of these DeltaBetas for each model,
    identify any subjects whose plot is “extreme”
    relative to the entire dataset. Do not use Cook’s
    distance‐type measures since such measures
    combine the influence of all variables in the model,
    whereas the study focus is on the effect of the F and/or
    BASE variables. One problem with using
    DeltaBetas, however, is that such measures detect
    influence on a log OR scale rather than on the
    OR = exp[β] scale.
    b. Any subject who is identified to be an influential
    observation may nevertheless be correctly
    measured on all predictor variables, so the
    researcher must still decide whether such a subject
    should be included in the study. A conservative
    approach is to drop from the data only those
    influential subjects whose measurements contain
    errors that cannot be corrected.

  2. There is no well‐established method for reducing the
    number of tests performed when carrying out a
    modeling strategy to determine a “best” model. One
    approach is to drop from the model any collection of
    variables found to be not significant using a “chunk”
    test. Bonferroni‐type corrections are questionable
    because the researcher does not know in advance how
    many tests will be performed.
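
Returning to the scale issue noted in answer 1a: a DeltaBeta is a difference on the log OR scale, so exponentiating it gives the multiplicative change in the estimated OR. The following is a minimal Python sketch, assuming the DeltaBeta is the unstandardized difference between the coefficient estimated from all subjects and the coefficient estimated with one subject removed (some software reports a standardized version instead). The function name and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def or_change_from_delta_beta(beta_full, delta_beta):
    """Translate a DeltaBeta (log OR scale) into its effect on the OR scale.

    Assumption: delta_beta = beta_full - beta_without_subject, unstandardized.
    """
    beta_without = beta_full - delta_beta
    or_full = math.exp(beta_full)        # OR estimated from all subjects
    or_without = math.exp(beta_without)  # OR estimated with the subject removed
    ratio = or_full / or_without         # equals exp(delta_beta)
    return or_full, or_without, ratio


# Hypothetical values, for illustration only.
or_full, or_without, ratio = or_change_from_delta_beta(beta_full=0.8, delta_beta=0.3)
print(f"OR (all subjects):    {or_full:.3f}")
print(f"OR (subject removed): {or_without:.3f}")
print(f"Ratio of ORs:         {ratio:.3f}")
```

The same DeltaBeta always corresponds to the same ratio of ORs, but the absolute change in the OR depends on the size of the coefficient itself, which is why influence judged on the log OR scale may look different on the OR scale.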


Chapter 9

1. The data listing is in subject-specific (SS) format.
Even though the data listing is not provided as part of
the question, the fact that one of the predictors is a
continuous variable indicates that it would not be
convenient or useful to try to determine the distinct
covariate patterns in the data.



  1. There are 186 covariate patterns (i.e., unique profiles).
    The main reason for this is that the model contains the
    continuous variable AGE. If, instead, AGE were a
    binary variable, the model would contain only 2^4 = 16
    covariate patterns.
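
The count of covariate patterns can be checked by counting the distinct rows of predictor values. The sketch below is a minimal Python illustration under the assumption of four predictors. The helper name `count_covariate_patterns` and the small data listing are hypothetical and are not the dataset from the exercise.

```python
from itertools import product


def count_covariate_patterns(rows):
    """Count distinct covariate patterns (unique profiles of predictor values)."""
    return len({tuple(row) for row in rows})


# Four binary predictors: every possible profile is one of 2^4 combinations.
all_binary_profiles = list(product([0, 1], repeat=4))
n_binary = count_covariate_patterns(all_binary_profiles)

# Hypothetical listing: (AGE, X1, X2, X3) with AGE continuous.
# Subjects who differ only in AGE still form separate patterns.
listing = [
    (34, 0, 1, 0),
    (35, 0, 1, 0),
    (35, 0, 1, 0),  # duplicate of the previous profile
    (52, 1, 1, 0),
    (61, 1, 0, 1),
]
n_listing = count_covariate_patterns(listing)

print(n_binary, n_listing)
```

With only binary predictors the number of patterns is capped at 16 no matter how many subjects there are. A continuous predictor removes that cap, so the number of patterns can approach the number of subjects.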

