Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

We recommend (linear or logistic models):

Require CNI >> 30

Focus on VDPs for largest CNI
Address largest CNI before
other CNIs

Sequential approach
⇒ Drop variable, refit model, readdress collinearity, continue until no collinearity

Option 1 (most popular) – Correcting Collinearity:
Drop a variable from the model

EXAMPLE
VDPs identify X1, X2, X1X2
⇒ Drop X1X2

Dropping a collinear variable:

Does not mean variable is
nonsignificant

Indicates dropped variable
cannot be assessed with other
collinear variables

Option 2 – Correcting Collinearity:
Define a new (interpretable)
variable:

Does not make sense for
product terms

Can combine height and weight
into BMI = weight/height^2

For either linear or logistic models, we recommend that the largest CNI be “considerably” larger than 30 before deciding that one’s model is unreliable, and then focusing on VDPs corresponding to the largest CNI before addressing any other CNI.

This viewpoint is essentially a sequential approach in that it recommends addressing the most likely collinearity problem before considering any additional collinearity problems.
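The diagnostics named above can be sketched numerically. The block below is a minimal illustration, not the procedure used in this text: it assumes the linear-model form of the diagnostics, computed from the singular value decomposition of the design matrix after scaling each column to unit length. For a logistic model the diagnostics may instead be based on the estimated information matrix, which this sketch does not do. The function name `collinearity_diagnostics`, the simulated data, and the 0.5 VDP cutoff are assumptions made for illustration.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Return condition indices (CNIs) and variance decomposition proportions (VDPs).

    X is a 2-D array whose columns are the model terms (include the
    intercept column if the model has one). Assumes no exactly redundant
    column, so every singular value is nonzero.
    """
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)            # scale columns to unit length
    _, d, Vt = np.linalg.svd(Xs, full_matrices=False)
    cni = d.max() / d                             # one CNI per singular value
    phi = (Vt.T ** 2) / d ** 2                    # phi[j, k]: term j, component k
    vdp = phi / phi.sum(axis=1, keepdims=True)    # each row sums to 1
    return cni, vdp

# Simulated (hypothetical) data: two main effects and their product.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(10, 2, n)
x2 = rng.normal(10, 2, n)
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])

cni, vdp = collinearity_diagnostics(X)
worst = int(np.argmax(cni))                       # component with the largest CNI
flagged = np.where(vdp[:, worst] > 0.5)[0]        # terms with a high VDP there (assumed cutoff)
print("largest CNI:", cni[worst])
print("columns flagged for the largest CNI:", flagged)
```

Only the VDPs in the column for the largest CNI are inspected here, which mirrors the recommendation to deal with the largest CNI before any other.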

Once a collinearity problem has been determined, the most popular option for correcting the problem is to drop one of the variables identified (by the VDPs) to be a source of the problem. If, for example, the VDPs identify two main effects and their product, the typical solution is to drop the product term from the model.

Nevertheless, when such a term is dropped from the model, this does not mean that the term is nonsignificant, but rather that having such a term with the other variables in the model makes the model unreliable. So, by dropping an interaction term in such a case, we indicate that this interaction cannot be assessed, rather than that it is nonsignificant.
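The drop-and-recheck cycle can be sketched as a loop. This is a hedged illustration under several assumptions: the diagnostics use the linear-model (scaled design matrix) form, the "refit" step is represented only by recomputing the diagnostics on the reduced design matrix, and the rule for choosing which flagged term to drop (`default_pick`, which prefers a term whose name contains `*`) is hypothetical. In practice that choice is a judgment made by the analyst, and the cutoff may be set well above 30.

```python
import numpy as np

def cni_vdp(X):
    """Condition indices and variance decomposition proportions of X."""
    Xs = X / np.linalg.norm(X, axis=0)            # unit-length columns
    _, d, Vt = np.linalg.svd(Xs, full_matrices=False)
    phi = (Vt.T ** 2) / d ** 2
    return d.max() / d, phi / phi.sum(axis=1, keepdims=True)

def default_pick(flagged_names):
    """Hypothetical rule: drop a product term if one is flagged, else the last term."""
    products = [t for t in flagged_names if "*" in t]
    return products[-1] if products else flagged_names[-1]

def sequential_drop(X, names, cni_cut=30.0, vdp_cut=0.5, pick=default_pick):
    """Drop one term at a time until the largest CNI is at or below cni_cut."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    dropped = []
    while X.shape[1] > 1:
        cni, vdp = cni_vdp(X)
        worst = int(np.argmax(cni))               # address the largest CNI first
        if cni[worst] <= cni_cut:
            break                                 # no remaining collinearity problem
        flagged = [names[j] for j in range(len(names))
                   if vdp[j, worst] > vdp_cut and names[j] != "intercept"]
        if not flagged:
            break                                 # nothing droppable was identified
        term = pick(flagged)
        j = names.index(term)
        X = np.delete(X, j, axis=1)               # drop the term, then recheck
        names.pop(j)
        dropped.append(term)
    return names, dropped

# Simulated (hypothetical) data: two main effects and their product.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(10, 2, n)
x2 = rng.normal(10, 2, n)
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
names = ["intercept", "x1", "x2", "x1*x2"]

remaining, dropped = sequential_drop(X, names)
print("dropped:", dropped)
print("remaining:", remaining)
```

A term listed in `dropped` is one that could not be assessed alongside the others; the loop says nothing about its significance.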

A second option for correcting collinearity is to define a new variable from the variables causing the problem, provided this new variable is (conceptually and/or clinically) interpretable.

Combining collinear variables will rarely make sense if a product term is a source of the problem. However, if, for example, main effect variables such as height and weight were involved, then the “derived” variable BMI (= weight/height^2) might be used to replace both height and weight in the model.
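As a small illustration of replacing two variables with one derived variable, the sketch below computes BMI using its standard definition, weight in kilograms divided by the square of height in metres. The function name `bmi`, the units, and the data values are assumptions made for the example.

```python
import numpy as np

def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    weight_kg = np.asarray(weight_kg, dtype=float)
    height_m = np.asarray(height_m, dtype=float)
    return weight_kg / height_m ** 2

# Hypothetical subjects.
height = np.array([1.60, 1.75, 1.82, 1.68])       # metres
weight = np.array([55.0, 70.0, 90.0, 62.0])       # kilograms

# Original design: intercept, height, weight.
X_old = np.column_stack([np.ones(4), height, weight])
# Revised design: intercept and the single derived variable.
X_new = np.column_stack([np.ones(4), bmi(weight, height)])
print(X_new)
```

The revised design has one fewer column, so the model estimates a single BMI effect in place of separate height and weight effects.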

Presentation: IV. Collinearity 273
