Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)


We recommend (linear or logistic
models):


• Require CNI >> 30
• Focus on VDPs for largest CNI
• Address largest CNI before other CNIs


Sequential approach
↓
Drop variable, refit model, readdress
collinearity, continue until no collinearity

Option 1 (most popular) – Correcting Collinearity:
Drop a variable from the model

Example: VDPs identify X₁, X₂, X₁X₂
↓
Drop X₁X₂

Dropping a collinear variable:

• Does not mean variable is nonsignificant
• Indicates dropped variable cannot be assessed with other collinear variables


Option 2 – Correcting Collinearity:
Define a new (interpretable) variable:

• Does not make sense for product terms
• Can combine height and weight into BMI = weight/height^2


For either linear or logistic models, we recommend
requiring that the largest CNI be “considerably”
larger than 30 before deciding that one’s
model is unreliable, and then focusing on the
VDPs corresponding to the largest CNI before
addressing any other CNI.

This viewpoint is essentially a sequential
approach in that it recommends addressing
the most likely collinearity problem before
considering any additional collinearity problems.

Once a collinearity problem has been detected,
the most popular option for correcting
the problem is to drop one of the variables
identified (by the VDPs) to be a source of the
problem. If, for example, the VDPs identify two
main effects and their product, the typical
solution is to drop the product term from the model.

Nevertheless, when such a term is dropped
from the model, this does not mean that the
term is nonsignificant, but rather that having
such a term with other variables in the model
makes the model unreliable. So, by dropping
an interaction term in such a case, we indicate
that this interaction cannot be assessed, rather
than that it is nonsignificant.
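
The sequential procedure described above can be sketched in Python for the linear-model case. This is a minimal illustration under stated assumptions, not the text's own implementation. The condition indices (CNIs) and variance decomposition proportions (VDPs) are computed from the design matrix with columns scaled to unit length, a common convention for linear models. Diagnostics for a logistic model would be computed from the fitted model, which this sketch does not cover. The function names, the 0.5 VDP cutoff, the "*" naming convention for product terms, and the simulated data are all hypothetical choices made for the example.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Return (cni, vdp) for a design matrix X (intercept column included).

    cni[k] is the condition index of component k; vdp[k, j] is the share of
    the variance of coefficient j associated with component k.
    """
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)       # scale columns to unit length
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    cni = s[0] / s                           # s is sorted in descending order
    phi = Vt**2 / (s**2)[:, None]            # phi[k, j] = v_jk^2 / s_k^2
    vdp = phi / phi.sum(axis=0)              # each column sums to 1
    return cni, vdp

def drop_until_stable(X, names, threshold=30.0, vdp_cut=0.5):
    """Drop one variable at a time until the largest CNI is <= threshold.

    Assumed rule: among variables with a high VDP on the largest CNI, drop a
    product term (name containing "*") if there is one; otherwise drop the
    non-intercept variable with the highest VDP.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    dropped = []
    while X.shape[1] > 1:
        cni, vdp = collinearity_diagnostics(X)
        k = int(np.argmax(cni))              # address the largest CNI first
        if cni[k] <= threshold:
            break
        involved = [j for j in range(len(names)) if vdp[k, j] >= vdp_cut]
        candidates = [j for j in involved if names[j] != "Intercept"]
        if len(involved) < 2 or not candidates:
            break                            # no clear pair of collinear terms
        products = [j for j in candidates if "*" in names[j]]
        pool = products if products else candidates
        j = max(pool, key=lambda idx: vdp[k, idx])
        dropped.append(names[j])
        X = np.delete(X, j, axis=1)          # recompute on the reduced model
        del names[j]
    return names, dropped

# Simulated (hypothetical) data: two main effects and their product.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(5.0, 1.0, n)
x2 = rng.normal(5.0, 1.0, n)
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
names = ["Intercept", "X1", "X2", "X1*X2"]

kept, dropped = drop_until_stable(X, names)
print("kept:", kept)
print("dropped:", dropped)
```

Because the text asks for a CNI "considerably" larger than 30, a stricter analysis might pass a higher `threshold`. A dropped term should be reported as one that could not be assessed, not as a nonsignificant one.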

A second option for correcting collinearity is to
define a new variable from the variables causing
the problem, provided this new variable is
(conceptually and/or clinically) interpretable.

Combining collinear variables will rarely make
sense if a product term is a source of the
problem. However, if, for example, main effect
variables such as height and weight were involved,
then the “derived” variable BMI (= weight/
height^2) might be used to replace both height
and weight in the model.
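
As a small illustration of this second option, the sketch below builds a design matrix in which a single BMI column replaces the separate height and weight columns. It uses the standard definition of BMI, weight in kilograms divided by the square of height in meters. The variable names, the example values, and the extra `age` covariate are hypothetical.

```python
import numpy as np

def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    weight_kg = np.asarray(weight_kg, dtype=float)
    height_m = np.asarray(height_m, dtype=float)
    return weight_kg / height_m**2

# Hypothetical measurements for three subjects.
height_m = np.array([1.60, 1.75, 1.82])
weight_kg = np.array([55.0, 70.0, 90.0])
age = np.array([34.0, 51.0, 47.0])           # another covariate kept as is

# Design matrix with BMI in place of both height and weight.
X_new = np.column_stack([np.ones(3), age, bmi(weight_kg, height_m)])
print(X_new)
```

The derived variable is only worth using if it is interpretable in the study context. Collinearity in the reduced model would still need to be reassessed.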

Presentation: IV. Collinearity 273