XII. Generalizing the “Score-like” Equations to Form GEE Models
GEE models:
For cluster-correlated data
Model parameters: $\beta$ and $\alpha$
$\beta$: regression parameters
$\alpha$: correlation parameters
Matrix notation used to describe GEE
Matrices needed specific to each subject (cluster): $\mathbf{Y}_i$, $\boldsymbol{\mu}_i$, $\mathbf{D}_i$, $\mathbf{C}_i$, and $\mathbf{W}_i$
\[
\mathbf{Y}_i =
\left\{
\begin{array}{c}
Y_{i1} \\
Y_{i2} \\
\vdots \\
Y_{in_i}
\end{array}
\right\}
\quad \text{vector of $i$th subject's observed responses}
\]
\[
\boldsymbol{\mu}_i =
\left\{
\begin{array}{c}
\mu_{i1} \\
\mu_{i2} \\
\vdots \\
\mu_{in_i}
\end{array}
\right\}
\quad \text{vector of $i$th subject's mean responses}
\]
$\mathbf{C}_i$ = working correlation matrix ($n_i \times n_i$)
The estimating equations we have presented so
far have assumed one response per subject.
The estimating equations for GEE are “score-like”
equations that can be used when there are
several responses per subject or, more generally,
when the data are clustered and contain
within-cluster correlation. Besides the
regression parameters ($\beta$) that are also present
in a GLM, GEE models contain correlation
parameters ($\alpha$) to account for within-cluster
correlation.
The most convenient way to describe GEE
involves the use of matrices. Matrices are
needed because there are several responses
per subject and, correspondingly, a correlation
structure to be considered. Representing these
estimating equations in other ways becomes
very complicated.
Matrices and vectors are indicated by the use
of bold letters. The matrices that are needed
are specific to each subject (i.e., the $i$th subject),
where each subject has $n_i$ responses. The
matrices are denoted as $\mathbf{Y}_i$, $\boldsymbol{\mu}_i$, $\mathbf{D}_i$, $\mathbf{C}_i$, and $\mathbf{W}_i$
and defined as follows:
$\mathbf{Y}_i$ is the vector (i.e., collection) of the $i$th
subject’s observed responses.
$\boldsymbol{\mu}_i$ is a vector of the $i$th subject’s mean
responses. The mean responses are modeled
as functions of the predictor variables and the
regression coefficients (as in GLM).
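As an illustration of how a mean response depends on the predictors and regression coefficients, the $j$th mean response of the $i$th subject can be written with a link function $g$, as in a GLM. The predictor notation $X_{hij}$ (the $h$th of $p$ predictors for the $j$th response of subject $i$) and the choice of the logit link are assumptions made here for illustration; neither is defined in the text above.

\[
g(\mu_{ij}) = \beta_0 + \sum_{h=1}^{p} \beta_h X_{hij},
\qquad
\mu_{ij} = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{h=1}^{p} \beta_h X_{hij}\right)\right]}
\quad \text{(logit link)}
\]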
$\mathbf{C}_i$ is the $n_i \times n_i$ correlation matrix containing
the correlation parameters. $\mathbf{C}_i$ is often referred
to as the working correlation matrix.
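The three objects defined so far can be sketched for a single subject with three responses. This is a minimal illustration, not the authors' computation: the logit mean model, the exchangeable form of the working correlation matrix (one common correlation for every pair of responses), and all numeric values, including the hypothetical names `beta`, `alpha`, and `X_i`, are assumptions chosen only to make the definitions concrete.

```python
import math

def logistic_mean(beta, x_row):
    """Mean response under an assumed logit link: 1 / (1 + exp(-x'beta))."""
    eta = sum(b * x for b, x in zip(beta, x_row))
    return 1.0 / (1.0 + math.exp(-eta))

def exchangeable_corr(n_i, alpha):
    """n_i x n_i matrix with 1 on the diagonal and alpha elsewhere
    (an assumed structure; other working structures are possible)."""
    return [[1.0 if j == k else alpha for k in range(n_i)]
            for j in range(n_i)]

# Hypothetical values for one subject (cluster) with n_i = 3 responses.
beta = [-1.0, 0.5]           # regression parameters: intercept, slope
alpha = 0.3                  # single correlation parameter
X_i = [[1.0, 0.0],           # one row per response: intercept term, predictor
       [1.0, 1.0],
       [1.0, 2.0]]

Y_i = [0, 1, 1]                                      # observed responses
mu_i = [logistic_mean(beta, row) for row in X_i]     # mean responses
C_i = exchangeable_corr(len(Y_i), alpha)             # working correlation

print("Y_i  =", Y_i)
print("mu_i =", [round(m, 4) for m in mu_i])
for row in C_i:
    print(row)
```

Note that the dimension of `C_i` follows from the number of responses for that subject, so subjects with different numbers of responses have working correlation matrices of different sizes.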
524 14. Logistic Regression for Correlated Data: GEE