Key difference, GEE vs. GLM score
equations: GEE allow for multiple
responses per subject

GEE model parameters (three types):
- Regression parameters (β)
  Express relationship between
  predictors and outcome.
- Correlation parameters (α)
  Express within-cluster
  correlation; user specifies C_i.
- Scale factor (φ)
  Accounts for extra variation
  of Y.

The key difference between these estimating
equations and the score equations presented
in the previous section is that these estimating
equations are generalized to allow for multiple
responses from each subject rather than just
one response. Y_i and μ_i now represent a
collection of responses (i.e., vectors), and W_i
represents the variance–covariance matrix for
all of the ith subject's responses.
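
For reference, the contrast can be written side by side. This is a sketch in the usual GEE notation, not a reproduction of the previous section's layout: the number of subjects K and the predictor index j = 0, 1, ..., p are assumed names.

```latex
% GLM score equations: one response Y_i per subject, one equation per beta_j
\sum_{i=1}^{K} \frac{\partial \mu_i}{\partial \beta_j}\,
  [\operatorname{var}(Y_i)]^{-1}\,(Y_i - \mu_i) = 0,
  \qquad j = 0, 1, \ldots, p

% GEE: a vector of responses Y_i per subject, with working covariance W_i
\sum_{i=1}^{K}
  \left(\frac{\partial \boldsymbol{\mu}_i}{\partial \boldsymbol{\beta}}\right)^{\!\top}
  \mathbf{W}_i^{-1}\,(\mathbf{Y}_i - \boldsymbol{\mu}_i) = \mathbf{0}
```

The two have the same structure; the scalar variance of a single response is replaced by the matrix W_i, and the scalar residual by a vector of residuals.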
There are three types of parameters in a GEE
model. These are as follows.
- The regression parameters (β) express the
  relationship between the predictors and the
  outcome. Typically, for epidemiological analyses,
  it is the regression parameters (or regression
  coefficients) that are of primary interest.
  The other parameters contribute to the accuracy
  and integrity of the model but are often
  considered "nuisance parameters". For a logistic
  regression, it is the regression parameter
  estimates that allow for the estimation of
  odds ratios.
- The correlation parameters (α) express
  the within-cluster correlation. To run a GEE
  model, the user specifies a correlation structure
  (C_i), which provides a framework for the
  modeling of the correlation between responses
  from the same subject. The choice of correlation
  structure can affect both the estimates
  and the corresponding standard errors of the
  regression parameters.
- The scale factor (φ) accounts for overdispersion
  or underdispersion of the response.
  Overdispersion means that the data show
  more variation in the response variable than
  is assumed from the modeling of the
  mean–variance relationship; underdispersion
  means that they show less.
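
A small sketch may help show how the three parameter types fit together for one subject. It assumes a logistic model, an exchangeable correlation structure, and the usual GEE working covariance W_i = φ A_i^(1/2) C_i A_i^(1/2), where A_i is the diagonal matrix of binomial variances μ(1 − μ). The function names, the design matrix, and the values of β, α, and φ are all hypothetical, chosen only for illustration.

```python
import math

def fitted_means(X, beta):
    """Logistic means for one subject: mu = 1 / (1 + exp(-x'beta)) per row."""
    return [1.0 / (1.0 + math.exp(-sum(x * b for x, b in zip(row, beta))))
            for row in X]

def exchangeable_corr(n, alpha):
    """n x n working correlation: 1 on the diagonal, alpha everywhere else."""
    return [[1.0 if j == k else alpha for k in range(n)] for j in range(n)]

def working_covariance(mu, C, phi):
    """W = phi * A^(1/2) C A^(1/2), with A = diag(mu * (1 - mu))."""
    sd = [math.sqrt(m * (1.0 - m)) for m in mu]
    n = len(mu)
    return [[phi * sd[j] * C[j][k] * sd[k] for k in range(n)] for j in range(n)]

# Hypothetical subject with three responses: intercept plus one binary predictor.
X_i = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
beta = [0.0, math.log(2.0)]  # illustrative values, not estimates from any data
alpha = 0.3                  # assumed within-subject correlation
phi = 1.0                    # scale factor; 1 means no extra variation

mu_i = fitted_means(X_i, beta)             # role of the regression parameters
C_i = exchangeable_corr(len(mu_i), alpha)  # role of the correlation parameter
W_i = working_covariance(mu_i, C_i, phi)   # the scale factor multiplies W_i
odds_ratio = math.exp(beta[1])             # odds ratio for the binary predictor

print(mu_i)
print(W_i)
print(odds_ratio)
```

Setting phi above 1 inflates every entry of W_i by the same factor, which is how extra variation in the response is absorbed without changing the regression or correlation parameters.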
526 14. Logistic Regression for Correlated Data: GEE