Introductory Biostatistics


with

$$\beta_0^* = \beta_0 + \ln\frac{\theta_1}{\theta_0}$$

where $\theta_1$ is the probability that a case was sampled and $\theta_0$ is the
probability that a control was sampled. This result has the following
implications for a case–control study.



  1. We cannot estimate individual risks, or the relative risk, unless $\theta_0$
    and $\theta_1$ are known, which is unlikely. The value of the intercept
    provided by the computer output is therefore meaningless.

  2. However, since we have the same $\beta_1$ as with a prospective model, we
    can still estimate the odds ratio and, if the rare disease assumption
    applies, interpret the numerical result as an approximate relative risk.
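
The two points above can be checked numerically. The following is a minimal sketch, assuming that the sampling probabilities depend only on case or control status and not on the covariate; the parameter values and the helper name `prob_case_given_sampled` are hypothetical and chosen only for illustration. It applies Bayes' theorem to obtain the probability of being a case among sampled subjects, then confirms that the slope is unchanged while the intercept moves by $\ln(\theta_1/\theta_0)$.

```python
import math


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_case_given_sampled(x, beta0, beta1, theta1, theta0):
    """P(case | x, subject was sampled), obtained from Bayes' theorem."""
    p = expit(beta0 + beta1 * x)  # population risk at covariate value x
    return theta1 * p / (theta1 * p + theta0 * (1.0 - p))


# Hypothetical values, for illustration only.
beta0, beta1 = -4.0, 0.7      # prospective (population) model
theta1, theta0 = 0.9, 0.01    # sampling probabilities for cases and controls

shift = math.log(theta1 / theta0)

for x in (0.0, 1.0, 2.0):
    in_sample = logit(prob_case_given_sampled(x, beta0, beta1, theta1, theta0))
    shifted = beta0 + shift + beta1 * x
    print(f"x = {x}: sample logit = {in_sample:.6f}, shifted model = {shifted:.6f}")

# Slope recovered from the sampled population (difference of logits at x=1 and x=0).
slope_in_sample = (
    logit(prob_case_given_sampled(1.0, beta0, beta1, theta1, theta0))
    - logit(prob_case_given_sampled(0.0, beta0, beta1, theta1, theta0))
)
```

Because `slope_in_sample` agrees with `beta1`, the odds ratio is estimable from the case–control sample, whereas the intercept cannot be separated from the unknown shift.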


9.1.6 Overdispersion


This section introduces a new issue of practical importance: overdispersion.
The presentation involves somewhat more advanced statistical concepts, such as
the variance of the binomial distribution, which was introduced only briefly in
Chapter 3. For that reason, student readers, especially beginners, may skip
this section without loss of continuity. Logistic regression is based on the
point binomial, or Bernoulli, distribution; its mean is $\pi$ and its variance
is $\pi(1 - \pi)$. If we use the ratio of the actual variance to this
model-implied variance as a dispersion parameter, it is 1 in a standard
logistic model, less than 1 in an underdispersed model, and greater than 1 in
an overdispersed model. Overdispersion is common in practice and is a serious
concern: an analysis that assumes the logistic model will often underestimate
the standard error(s) and thus overstate the level of significance.
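
One way overdispersion can arise is unaccounted heterogeneity among subjects. The sketch below is an illustration under stated assumptions, not an analysis from the text: it takes a hypothetical subgroup size of 10 and compares a pure binomial count with success probability 0.5 against a population in which half of the subgroups have probability 0.2 and half have 0.8. Both have the same mean, but the mixed population has a much larger variance than the binomial model implies.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k events among n subjects."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def mean_var(pmf):
    """Mean and variance of a count distribution given as a list of probabilities."""
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var


n = 10  # hypothetical subgroup size

# Pure binomial: every subgroup shares the same probability 0.5.
pure = [binom_pmf(k, n, 0.5) for k in range(n + 1)]

# Heterogeneous population: half the subgroups have probability 0.2, half 0.8.
mixed = [
    0.5 * binom_pmf(k, n, 0.2) + 0.5 * binom_pmf(k, n, 0.8)
    for k in range(n + 1)
]

mean_pure, var_pure = mean_var(pure)
mean_mixed, var_mixed = mean_var(mixed)

# Ratio of the actual variance to the variance implied by the binomial model.
dispersion = var_mixed / var_pure
print(f"means: {mean_pure:.2f} vs {mean_mixed:.2f}")
print(f"variances: {var_pure:.2f} vs {var_mixed:.2f}, ratio = {dispersion:.2f}")
```

A ratio well above 1 means that standard errors computed under the binomial assumption would be too small for such data.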


Measuring and Monitoring Dispersion. After a logistic regression model is
fitted, dispersion is measured by the scaled deviance or the scaled Pearson
chi-square, that is, the deviance or Pearson chi-square divided by its degrees
of freedom. The deviance is defined as twice the difference between the maximum
achievable log likelihood and the log likelihood at the maximum likelihood
estimates of the regression parameters. Suppose that the data contain
replications, forming $m$ subgroups of subjects with identical covariate
values; then the Pearson chi-square and deviance are given by


$$X_P^2 = \sum_i \sum_j \frac{(r_{ij} - n_i p_{ij})^2}{n_i p_{ij}}$$

$$X_D^2 = 2 \sum_i \sum_j r_{ij} \log\frac{r_{ij}}{n_i p_{ij}}$$
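
The two statistics can be computed directly from grouped data. The following is a minimal sketch for a binary response, so that the index $j$ runs over the two levels event and non-event. The function name `dispersion_statistics`, the example counts, and the fitted probabilities are hypothetical; the fitted probabilities are taken as given rather than estimated, the deviance includes the factor of 2 implied by "twice the difference" in log likelihoods, and the degrees of freedom are assumed to be the number of subgroups minus the number of estimated parameters.

```python
import math


def dispersion_statistics(events, totals, fitted, n_params):
    """Pearson chi-square, deviance, and scaled versions for grouped binary data.

    events[i] : observed number of events in subgroup i
    totals[i] : number of subjects n_i in subgroup i
    fitted[i] : fitted event probability for subgroup i (taken as given)
    n_params  : number of estimated regression parameters; the degrees of
                freedom are assumed to be (number of subgroups - n_params)
    """
    pearson = 0.0
    deviance = 0.0
    for r, n, p in zip(events, totals, fitted):
        # j runs over the two response levels: event and non-event.
        for r_ij, p_ij in ((r, p), (n - r, 1.0 - p)):
            expected = n * p_ij
            pearson += (r_ij - expected) ** 2 / expected
            if r_ij > 0:  # a zero count contributes nothing to the deviance
                deviance += 2.0 * r_ij * math.log(r_ij / expected)
    df = len(events) - n_params
    return {
        "pearson": pearson,
        "deviance": deviance,
        "df": df,
        "scaled_pearson": pearson / df,
        "scaled_deviance": deviance / df,
    }


# Hypothetical data: two subgroups and an intercept-only model (one parameter).
# The fitted values are placeholders, not maximum likelihood estimates.
result = dispersion_statistics(
    events=[5, 4], totals=[10, 8], fitted=[0.25, 0.5], n_params=1
)
for name, value in result.items():
    print(f"{name}: {value}")
```

Scaled values close to 1 are consistent with the logistic model; values well above 1 suggest overdispersion.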
