Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)


Get to know your data!


Perform thorough descriptive analyses before modeling.

• Useful for finding data errors
• Gain insight about your data

Descriptive analyses include the following:

• Frequency tables
• Summary statistics
• Correlations
• Scatter plots
• Histograms
VII. Hierarchically Well-Formulated Models


Initial model structure: HWF


Model contains all lower-order
components


It is important to do thorough descriptive analyses
before modeling. Get to know your data!
It is possible to run many models and not know
that you have only two smokers in your dataset.
Descriptive analyses are also useful for finding
errors. An individual’s age may be incorrectly
recorded as 699 rather than 69, and you may
never know that from reading model output.

Descriptive analyses include obtaining frequency
tables for categorical variables, univariate
summary statistics (means, variances, quartiles,
max, min, etc.) for continuous variables,
bivariate cross tables, bivariate correlations,
scatter plots, and histograms. Descriptive
analyses can be performed both before and after
the variable specification stage. Often more
insight is gained from a descriptive analysis
than from modeling.
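
The checks described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a prescribed workflow: the records, the column names `age` and `smoker`, the helper function names, and the plausible age range of 0 to 120 are all assumptions made up for the example.

```python
from collections import Counter
from statistics import mean, quantiles

# Hypothetical records; the values are invented for illustration only.
records = [
    {"age": 54, "smoker": "no"},
    {"age": 61, "smoker": "no"},
    {"age": 699, "smoker": "yes"},  # data-entry error: should be 69
    {"age": 47, "smoker": "no"},
    {"age": 72, "smoker": "yes"},
    {"age": 58, "smoker": "no"},
]


def frequency_table(rows, column):
    """Count how often each value of a categorical variable occurs."""
    return Counter(row[column] for row in rows)


def summarize(rows, column):
    """Univariate summary statistics for a continuous variable."""
    values = sorted(row[column] for row in rows)
    q1, median, q3 = quantiles(values, n=4)
    return {
        "n": len(values),
        "min": values[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": values[-1],
        "mean": mean(values),
    }


def out_of_range(rows, column, low, high):
    """Return the rows whose value falls outside an assumed plausible range."""
    return [row for row in rows if not low <= row[column] <= high]


smoking = frequency_table(records, "smoker")
age_summary = summarize(records, "age")
suspect = out_of_range(records, "age", 0, 120)  # 0-120 is an assumed range

print("Smoking status:", dict(smoking))
print("Age summary:", age_summary)
print("Suspect ages:", suspect)
```

The frequency table shows how few smokers there are, and the maximum in the age summary exposes the mistyped value. Neither would be visible from model output alone.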

When choosing the V and W variables to be
included in the initial model, the investigator
must ensure that the model has a certain
structure to avoid possibly misleading results. This
structure is called a hierarchically well-formulated
model, abbreviated as HWF, which we define
and illustrate in this section.

A hierarchically well-formulated model is a
model satisfying the following characteristic:
Given any variable in the model, all lower-
order components of the variable must also
be contained in the model.

To understand this definition, let us look at an
example of a model that is not hierarchically
well formulated. Consider the model given in
logit form as
$\operatorname{logit} P(X) = \alpha + \beta E + \gamma_1 V_1 + \gamma_2 V_2 + \delta_1 E V_1 + \delta_2 E V_2 + \delta_3 E V_1 V_2$.

For this model, let us focus on the three-factor
product term $E V_1 V_2$. This term has the following
lower-order components: $E$, $V_1$, $V_2$, $E V_1$,
$E V_2$, and $V_1 V_2$. Note that the last component
$V_1 V_2$ is not contained in the model. Thus, the
model is not hierarchically well formulated.
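
The HWF condition can be checked mechanically. The sketch below is one possible way to do it, under the assumption that each model term is written as a tuple of variable names (so `("E", "V1", "V2")` stands for the product term E V1 V2). The function names are hypothetical and not part of any library.

```python
from itertools import combinations


def lower_order_components(term):
    """All proper, non-empty sub-products of a term.

    A term is a tuple of variable names, e.g. ("E", "V1", "V2").
    A repeated name, e.g. ("E", "E"), represents a power such as E squared.
    """
    term = tuple(sorted(term))
    return {
        combo
        for size in range(1, len(term))
        for combo in combinations(term, size)
    }


def missing_components(model_terms):
    """Lower-order components required by some term but absent from the model."""
    present = {tuple(sorted(term)) for term in model_terms}
    missing = set()
    for term in present:
        missing |= lower_order_components(term) - present
    return missing


def is_hwf(model_terms):
    """True if every term's lower-order components are also in the model."""
    return not missing_components(model_terms)


# The example model: E, V1, V2, E*V1, E*V2, E*V1*V2 (the intercept is implicit).
model = [
    ("E",),
    ("V1",),
    ("V2",),
    ("E", "V1"),
    ("E", "V2"),
    ("E", "V1", "V2"),
]

print(missing_components(model))  # {('V1', 'V2')}
print(is_hwf(model))              # False
print(is_hwf(model + [("V1", "V2")]))  # True
```

Adding the term V1 V2 to the model supplies the one missing lower-order component, after which the check passes.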

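EXAMPLE
Not HWF model:
$\operatorname{logit} P(X) = \alpha + \beta E + \gamma_1 V_1 + \gamma_2 V_2 + \delta_1 E V_1 + \delta_2 E V_2 + \delta_3 E V_1 V_2$

Components of $E V_1 V_2$:
$E$, $V_1$, $V_2$, $E V_1$, $E V_2$, $V_1 V_2$
($V_1 V_2$ is not in the model)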
