c. Yes. The high VDP value on F × BASE suggests
that this product term is involved in a collinearity
problem. Such a problem was not previously found
or even considered when using SAS’s LOGISTIC
procedure to evaluate interaction. If it is decided
that F × BASE is collinear with other terms, then it
should be dropped from the model before any
further modeling is carried out.
d. The “best” choice is iii.
- a. Suggested strategy: For each subject in the dataset,
compute DeltaBetas for the variables F and BASE
in your initial model and in your final “best” model.
Using plots of these DeltaBetas for each model,
identify any subjects whose plot is “extreme”
relative to the entire dataset. Do not use Cook’s
distance‐type measures since such measures
combine the influence of all variables in the model,
whereas the study focus is on the effect of the F and/or
BASE variables. One problem with using
DeltaBetas, however, is that such measures detect
influence on a log OR scale rather than on the
OR = exp[β] scale.
b. Any subject who is identified to be an influential
observation may nevertheless be correctly
measured on all predictor variables, so the
researcher must still decide whether such a subject
should be included in the study. A conservative
approach is to drop from the data only those
influential subjects whose measurements have
errata that cannot be corrected.
- There is no well‐established method for reducing the
number of tests performed when carrying out a
modeling strategy to determine a “best” model. One
approach is to drop from the model any collection of
variables found to be not significant using a “chunk”
test. Bonferroni‐type corrections are questionable
because the researcher does not know in advance how
many tests will be performed.
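The DeltaBeta strategy suggested above for identifying influential subjects can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the data are a made-up toy dataset with a single binary predictor x (a stand-in for a variable such as F), the helper `fit_logistic` is a hypothetical hand-written Newton-Raphson fit rather than any package routine, and each DeltaBeta is taken to be the full-data estimate minus the estimate obtained after refitting without the subject. Statistical packages often report an approximation to this quantity rather than an exact refit.

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g, using the closed-form 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data (not from the text): x is a binary exposure, y the outcome.
# Cell counts: (x=0,y=0): 6, (x=0,y=1): 3, (x=1,y=0): 2, (x=1,y=1): 5.
data = [(0, 0)] * 6 + [(0, 1)] * 3 + [(1, 0)] * 2 + [(1, 1)] * 5
xs = [x for x, _ in data]
ys = [y for _, y in data]

_, b1_full = fit_logistic(xs, ys)

def delta_beta(i):
    """Change in the slope (log OR scale) when subject i is deleted."""
    _, b1_drop = fit_logistic(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    return b1_full - b1_drop

delta_betas = [delta_beta(i) for i in range(len(data))]

# Index of the subject with the largest absolute DeltaBeta.
most_influential = max(range(len(data)), key=lambda i: abs(delta_betas[i]))

for i, d in enumerate(delta_betas):
    # exp(d) expresses the same change as a ratio on the OR scale.
    print(i, data[i], round(d, 3), round(math.exp(d), 3))
```

In this toy dataset the two subjects with x = 1 and y = 0 have the largest absolute DeltaBeta, log 2 on the log OR scale; dropping either one doubles the estimated OR from 5 to 10. Printing exp(DeltaBeta) alongside DeltaBeta is one way to read the influence on the OR scale as well.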
Chapter 9
1. The data listing is in subject-specific (SS) format.
Even though the data listing is not provided as part of
the question, the fact that one of the predictors is a
continuous variable indicates that it would not be
convenient or useful to try to determine the distinct
covariate patterns in the data.
- There are 186 covariate patterns (i.e., unique profiles).
The main reason for this is that the model contains the
continuous variable AGE. If, instead, AGE were a
binary variable, the model would contain at most 2^4 = 16
covariate patterns.
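The effect of a continuous predictor on the number of covariate patterns can be illustrated with a small sketch. Everything in it is hypothetical: the six subjects, the three binary predictors, and the age cutoff of 45 used to dichotomize AGE are assumptions made for illustration and are not taken from the dataset in the question.

```python
from itertools import product

# Hypothetical subject-specific (SS) records:
# three binary predictors followed by AGE in years.
subjects = [
    (0, 1, 0, 34),
    (1, 1, 0, 51),
    (0, 1, 0, 47),
    (1, 0, 1, 29),
    (1, 1, 0, 62),
    (0, 1, 0, 38),
]

# A covariate pattern is a distinct combination of predictor values.
patterns_continuous = set(subjects)

# Same subjects with AGE dichotomized at an assumed cutoff of 45 years.
patterns_binary = {(a, b, c, int(age >= 45)) for a, b, c, age in subjects}

# With four binary predictors there are 2**4 possible profiles.
possible_binary = list(product([0, 1], repeat=4))

print(len(patterns_continuous))  # 6: every subject has a unique profile
print(len(patterns_binary))      # 4: subjects now share profiles
print(len(possible_binary))      # 16
```

With AGE kept continuous, nearly every subject forms its own pattern; once AGE is binary, the number of patterns cannot exceed 16 no matter how many subjects are added.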