Andrew M. Jones 585
robust variant of GLM that is less sensitive to outliers. In response to the problem
of selecting the appropriate link and variance functions, Basu and Rathouz
(2005) suggest a flexible semiparametric extension of the GLM. Their model
incorporates a Box–Cox transformation into the link function, which includes the
log-link as a special case along with other power functions of $y$. The model, which
is labeled the extended estimating equations (EEE) approach, also allows for flexible
specifications of the variance, using the power variance and quadratic variance
families to nest common distributions such as the Poisson, gamma, inverse Gaussian
and negative binomial. Basu et al. (2006) apply the EEE method to claims data
on the incremental costs associated with heart failure.
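
As a rough illustration of how these families nest the familiar special cases, the sketch below codes a Box–Cox link and the two variance families in Python. It assumes the usual parameterizations (link $(\mu^{\lambda}-1)/\lambda$ with the log as the $\lambda \to 0$ limit, power variance $\theta_1\mu^{\theta_2}$, quadratic variance $\theta_1\mu + \theta_2\mu^2$); the function names are hypothetical, and the code does not estimate the EEE model itself.

```python
import numpy as np

def box_cox_link(mu, lam):
    """Box-Cox link on the mean: (mu**lam - 1) / lam, log(mu) when lam == 0."""
    mu = np.asarray(mu, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(mu)            # log-link as the limiting special case
    return (mu**lam - 1.0) / lam

def power_variance(mu, theta1, theta2):
    """Power variance family: theta1 * mu**theta2.
    theta2 = 1: Poisson-type, theta2 = 2: gamma-type, theta2 = 3: inverse Gaussian-type."""
    mu = np.asarray(mu, dtype=float)
    return theta1 * mu**theta2

def quadratic_variance(mu, theta1, theta2):
    """Quadratic variance family: theta1 * mu + theta2 * mu**2.
    theta1 = 1, theta2 = 0: Poisson; theta1 = 1, theta2 > 0: negative binomial-type."""
    mu = np.asarray(mu, dtype=float)
    return theta1 * mu + theta2 * mu**2

# Illustrative values only (assumed, not taken from any data set).
mu_grid = np.array([0.5, 1.0, 2.0, 10.0])
near_log = box_cox_link(mu_grid, 1e-6)   # should be close to log(mu_grid)
exact_log = box_cox_link(mu_grid, 0.0)
```

Letting the data choose $\lambda$, $\theta_1$ and $\theta_2$ is what removes the need to commit to one link and one distribution in advance.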
12.4 Methods for dealing with unobserved heterogeneity and dependence
12.4.1 Deviations and conditional estimates
Consider a linear panel data regression model with repeated measurements
($t = 1, \ldots, T_i$) for a sample of $n$ individuals ($i = 1, \ldots, n$):

$$y_{it} = x_{it}'\beta + u_i + \varepsilon_{it}. \tag{12.16}$$

Correlation between the unobservable individual effects ($u_i$) and the regressors ($x_{it}$)
will lead to omitted variable bias and inconsistent estimates of the $\beta$s. The
individual effects can be swept from the equation by transforming variables into
deviations from their within-group means, or by using orthogonal deviations,
based on the mean of the future values of the variables. Applying least squares
to the mean deviations gives the covariance or within-groups estimator of $\beta$.
Similarly, the model could be estimated in first differences to eliminate the individual
effects. Identification of $\beta$ rests on there being sufficient variation in the
regressors over time within individuals, so the estimators may perform poorly when
that variation is limited.
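
The transformations described above can be sketched on simulated data. The example below is a minimal illustration under assumed values (a balanced panel, a single regressor, true $\beta = 1$, and an individual effect correlated with the regressor by construction); the helper names `slope` and `forward_orthogonal_deviations` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, beta = 2000, 5, 1.0                       # assumed design, balanced panel

u = rng.normal(size=(n, 1))                     # individual effects u_i
x = 0.8 * u + rng.normal(size=(n, T))           # regressor correlated with u_i
y = beta * x + u + rng.normal(size=(n, T))      # model (12.16) with scalar x

def slope(x, y):
    """No-intercept least-squares slope for a single regressor."""
    x, y = x.ravel(), y.ravel()
    return float(x @ y / (x @ x))

def forward_orthogonal_deviations(z):
    """Scaled deviation of each observation from the mean of its future values."""
    n_obs, n_periods = z.shape
    out = np.empty((n_obs, n_periods - 1))
    for t in range(n_periods - 1):
        remaining = n_periods - t - 1           # number of future periods
        scale = np.sqrt(remaining / (remaining + 1))
        out[:, t] = scale * (z[:, t] - z[:, t + 1:].mean(axis=1))
    return out

# Pooled OLS ignores u_i (variables centred on grand means to absorb the intercept).
b_pooled = slope(x - x.mean(), y - y.mean())

# Within-groups estimator: deviations from individual means.
b_within = slope(x - x.mean(axis=1, keepdims=True),
                 y - y.mean(axis=1, keepdims=True))

# First differences also eliminate u_i.
b_fd = slope(np.diff(x, axis=1), np.diff(y, axis=1))

# Orthogonal deviations: same estimate as within-groups in a balanced panel.
b_orth = slope(forward_orthogonal_deviations(x), forward_orthogonal_deviations(y))

print(b_pooled, b_within, b_fd, b_orth)
```

In this design the pooled estimate should sit well above the true value, because the omitted effect is positively correlated with the regressor, while the three transformed estimators should be close to it. Shrinking the within-individual variation in `x` (for example, by scaling down its idiosyncratic component) would make those estimates noticeably noisier.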
Many of the outcomes used in health economics are binary or ordered categorical
measures, such as SAH. Fixed effects panel data methods, which allow for
a correlation between the individual effect and the regressors of the model, are
not, however, readily available for categorical data due to the incidental parameter
problem, which means that the individual effect cannot in general be swept out of
the model by taking deviations. For binary data the problem can be surmounted by
using the conditional fixed effects logit, which uses a sufficient statistic to eliminate
the individual effect from the log-likelihood function (Chamberlain, 1980). In the
case of the logistic regression, the within-individual sum of $y_{it}$ is a sufficient statistic
for the individual effect and conditional maximum likelihood (ML) estimates are consistent. Although
the conditional logit provides consistent parameter estimates, the approach has
practical drawbacks for the researcher. First, by only using observations that have
within-individual variation in the outcome and in the regressors, the method often
leads to a substantial reduction in sample size. Second, it is hard to calculate partial
effects of a variable of interest due to the inherent lack of information on