Infant Care Study: Sample Data
From three infants: five (of nine)
observations listed for each
IDNO   MO  OUTCOME  BIRTHWGT  GENDER  DIARRHEA
00282   1  0        2000      Male    0
00282   2  0        2000      Male    0
00282   3  1        2000      Male    1
  ...
00282   8  0        2000      Male    1
00282   9  0        2000      Male    0
....................................
00283   1  0        2950      Female  0
00283   2  0        2950      Female  0
00283   3  1        2950      Female  0
  ...
00283   8  0        2950      Female  0
00283   9  0        2950      Female  0
....................................
00287   1  1        3250      Male    1
00287   2  1        3250      Male    1
00287   3  0        3250      Male    0
  ...
00287   8  0        3250      Male    0
00287   9  0        3250      Male    0
....................................
IDNO: identification number
MO: observation month (provides
order to subject-specific
measurements)
OUTCOME: dichotomized z-score
(values can change month to month)
Independent variables:
- Time-dependent variable: can
 vary month to month within a
 cluster
 DIARRHEA: dichotomized
 variable for presence of
 symptoms
- Time-independent variables:
 do not vary month to month
 within a cluster
 BIRTHWGT
 GENDER
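
The distinction between time-dependent and time-independent variables can be checked directly on data in this layout. The following is a minimal sketch in Python, using only the 15 rows listed above (not the complete study data). The names ROWS, COLUMNS, and varies_within_cluster are illustrative assumptions, not part of the original analysis.

```python
from collections import defaultdict

# Column order of the listed sample rows.
COLUMNS = ("IDNO", "MO", "OUTCOME", "BIRTHWGT", "GENDER", "DIARRHEA")

# The 15 listed rows: months 1, 2, 3, 8, 9 for each of three infants.
ROWS = [
    ("00282", 1, 0, 2000, "Male", 0),
    ("00282", 2, 0, 2000, "Male", 0),
    ("00282", 3, 1, 2000, "Male", 1),
    ("00282", 8, 0, 2000, "Male", 1),
    ("00282", 9, 0, 2000, "Male", 0),
    ("00283", 1, 0, 2950, "Female", 0),
    ("00283", 2, 0, 2950, "Female", 0),
    ("00283", 3, 1, 2950, "Female", 0),
    ("00283", 8, 0, 2950, "Female", 0),
    ("00283", 9, 0, 2950, "Female", 0),
    ("00287", 1, 1, 3250, "Male", 1),
    ("00287", 2, 1, 3250, "Male", 1),
    ("00287", 3, 0, 3250, "Male", 0),
    ("00287", 8, 0, 3250, "Male", 0),
    ("00287", 9, 0, 3250, "Male", 0),
]


def varies_within_cluster(rows, column, cluster="IDNO"):
    """Return True if `column` takes more than one value inside any cluster."""
    cluster_idx = COLUMNS.index(cluster)
    column_idx = COLUMNS.index(column)
    values_by_cluster = defaultdict(set)
    for row in rows:
        values_by_cluster[row[cluster_idx]].add(row[column_idx])
    return any(len(values) > 1 for values in values_by_cluster.values())


# Label each independent variable according to whether it varies within a cluster.
classification = {
    name: "time-dependent" if varies_within_cluster(ROWS, name) else "time-independent"
    for name in ("DIARRHEA", "BIRTHWGT", "GENDER")
}
print(classification)
```

On the listed rows, DIARRHEA is labeled time-dependent and BIRTHWGT and GENDER are labeled time-independent. A check of this kind can only show that a variable is constant in the rows examined; whether a variable is time-independent by design is a property of the study.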
On the left, we present data on three infants to
illustrate the layout for correlated data. Five
of nine monthly observations are listed per
infant. In the complete data on 136 infants,
each child had at least 5 months of observations,
and 126 (92.6%) had complete data for
all 9 months.

The variable IDNO is the number that identifies
each infant. The variable MO indicates
which month the outcome measurement was
taken. This variable is used to provide order for
the data within a cluster. Not all clustered data
have an inherent order to the observations
within a cluster; however, in longitudinal studies
such as this, specific measurements are
ordered over time.

The variable OUTCOME is the dichotomized
weight-for-height z-score indicating the presence
or absence of wasting. Notice that the
outcome can change values from month to
month within a cluster.

The independent variable DIARRHEA can also
change values month to month. If symptoms of
diarrhea are present in a given month, then the
variable is coded 1; otherwise it is coded 0.
DIARRHEA is thus a time-dependent variable. This
contrasts with the variables BIRTHWGT and
GENDER, which do not vary within a cluster
(i.e., do not change month to month). BIRTHWGT
and GENDER are time-independent variables.

14. Logistic Regression for Correlated Data: GEE
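
The cluster structure described above (observations grouped by IDNO, ordered by MO, with some infants missing months) can be summarized with a short routine. The sketch below is in Python and uses hypothetical toy records, not the study data; the function names observations_per_cluster and share_complete are assumptions introduced for illustration. The last line simply checks the percentage quoted in the text.

```python
from collections import defaultdict


def observations_per_cluster(records):
    """Map each cluster ID to its observation months, sorted in time order."""
    months = defaultdict(list)
    for idno, mo in records:
        months[idno].append(mo)
    return {idno: sorted(mos) for idno, mos in months.items()}


def share_complete(clusters, n_months=9):
    """Fraction of clusters observed in every month 1..n_months."""
    full = set(range(1, n_months + 1))
    complete = sum(1 for mos in clusters.values() if set(mos) == full)
    return complete / len(clusters)


# Hypothetical toy input (NOT the study data):
# infant "A" has all 9 months; infant "B" is missing months 4 through 7.
records = [("A", m) for m in range(1, 10)] + [("B", m) for m in (9, 1, 8, 2, 3)]

clusters = observations_per_cluster(records)
print(clusters["B"])             # months for infant "B", in time order
print(share_complete(clusters))  # share of infants with all 9 months

# Arithmetic check of the figure quoted in the text: 126 of 136 infants.
pct_complete = round(100 * 126 / 136, 1)
print(pct_complete)
```

Sorting by MO within each IDNO makes the within-cluster order explicit, which matters for longitudinal data where the sequence of measurements carries meaning.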
