is defined for each person as the cumulative risk over the follow-up period:
$$r = \sum x\omega$$
where summation is over each square in Figure 10.1 entered by the follow-up
line, $x$ the time spent in a square, and $\omega$ the corresponding death rate for the
given age–period combination. For the cohort, the individual values of $r$ are
added to give the expected number of deaths:
$$m = \sum r$$
For various statistical analyses, the observed number of deaths $d$ may be
treated as a Poisson variable with mean $\theta = m\rho$, where $\rho$ is the relative risk of
being a member of the cohort as compared to the standard population.
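The calculation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the follow-up records, the death rates, and the observed count `d` are made-up numbers, and the names `cohort` and `cumulative_risk` are hypothetical. The ratio of observed to expected deaths is used here as a simple estimate of the relative risk, which follows from treating the observed count as Poisson with mean equal to the expected count times the relative risk.

```python
# Each person's follow-up is a list of (x, omega) pairs, one per
# age-period square entered by the follow-up line:
#   x     = time (years) spent in the square
#   omega = death rate of the standard population for that square
# All numbers below are hypothetical and chosen only for illustration.
cohort = [
    [(2.0, 0.010), (3.0, 0.020)],
    [(5.0, 0.020)],
    [(1.0, 0.010), (4.0, 0.030)],
]


def cumulative_risk(follow_up):
    """Cumulative risk for one person: r = sum of x * omega."""
    return sum(x * omega for x, omega in follow_up)


# Expected number of deaths for the cohort: m = sum of r over persons.
m = sum(cumulative_risk(person) for person in cohort)

d = 1  # hypothetical observed number of deaths in the cohort

# Observed / expected as a simple estimate of the relative risk.
rho_hat = d / m

print(f"expected deaths m = {m:.3f}, estimated relative risk = {rho_hat:.2f}")
```

With so few expected deaths the estimate is very imprecise; the sketch is meant only to show how the person-level risks accumulate into the cohort total.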
10.2 TESTING GOODNESS OF FIT
A goodness-of-fit test is used when one wishes to decide if an observed
distribution of frequencies is incompatible with some hypothesized distribution. The
Poisson is a very special distribution; its mean and its variance are equal.
Therefore, given a sample of count data $\{x_i\}_{i=1}^{n}$, we often wish to know
whether these data provide sufficient evidence to indicate that the sample
did not come from a Poisson-distributed population. The hypotheses are as
follows:
$H_0$: The sampled population is distributed as Poisson.
$H_A$: The sampled population is not distributed as Poisson.
The most frequent violation is overdispersion: the variance is larger than the
mean. The implication is serious: an analysis that assumes the Poisson model
often underestimates the standard error(s) and thus wrongly inflates the level of
significance.
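As an informal first look at overdispersion, one can compare the sample variance with the sample mean, since the two should be close under a Poisson model. The sketch below uses a made-up sample; the variable names are hypothetical, and the variance-to-mean ratio is only a descriptive check, not the formal test.

```python
from statistics import mean, variance

# Hypothetical sample of counts, for illustration only.
counts = [0, 1, 1, 2, 0, 3, 7, 0, 1, 5]

xbar = mean(counts)      # sample mean
s2 = variance(counts)    # sample variance (divisor n - 1)

# Under a Poisson model the ratio should be near 1;
# a ratio well above 1 suggests overdispersion.
ratio = s2 / xbar

print(f"mean = {xbar:.2f}, variance = {s2:.2f}, variance/mean = {ratio:.2f}")
```

A ratio noticeably above 1, as in this invented sample, is a cue to examine the Poisson assumption more carefully before relying on Poisson-based standard errors.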
The test statistic is the familiar Pearson chi-square:
$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ and $E_i$ refer to the $i$th observed and expected frequencies, respectively
(we used the notations $x_{ij}$ and $\hat{m}_{ij}$ in Chapter 6). In this formula, $k$ is the
number of groups for which observed and expected frequencies are available. When
the null hypothesis is true, the test statistic is distributed as chi-square with
$(k - 2)$ degrees of freedom; 1 degree of freedom was lost because the mean
needs to be estimated and 1 was lost because of the constraint
$\sum O_i = \sum E_i$. It