604 Panel Data Methods
selective admission that is influenced by unobservables, such as unmeasured severity,
that are also associated with the quality of outcomes. Geweke et al. (2003) find
evidence of patient selection among 78,848 Medicare patients treated in 114 hospitals
in Los Angeles County between 1989 and 1992. They focus on patients aged
over 65 with a diagnosis of pneumonia taken from administrative data on hospital
discharges collected by the State of California Office of Statewide Health Planning
and Development. The quality of the clinical outcomes is measured by deaths
in hospital within ten days of admission. A structural probit model for deaths is
coupled with a reduced form multinomial probit model for the patient’s choice of
hospital, allowing correlation in the error terms to capture patient selection. The
system of equations is estimated by a Bayesian approach using the MCMC simu-
lator of the posterior distribution. Gibbs sampling with data augmentation breaks
the estimation into steps, first simulating the latent dependent variables and then
estimating the linear simultaneous equations system. The model is identified by
using distances from the patients’ homes to the hospitals as an instrument. The
raw data and simple probits do not show a relationship between hospital size and
mortality rates, but the MCMC results reveal a U-shaped relationship, with better
quality in the smallest and largest hospitals.
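The two-step logic of Gibbs sampling with data augmentation can be illustrated on a much simpler problem than the one Geweke et al. (2003) tackle. The sketch below is not their model: it is a single-equation binary probit with an intercept, one regressor and a flat prior on the coefficients, in the spirit of the standard data augmentation sampler for probit models. The function names (`simulate_probit`, `draw_latent`, `gibbs_probit`) and the simulated data are assumptions made purely for illustration. Each sweep first simulates the latent dependent variable from a truncated normal and then draws the coefficients from the normal posterior of a linear regression on that latent variable.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def simulate_probit(n, b0, b1, seed=0):
    """Simulate illustrative data: y = 1 if b0 + b1*x + e > 0, with e ~ N(0, 1)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if b0 + b1 * xi + rng.gauss(0.0, 1.0) > 0 else 0 for xi in x]
    return x, y


def draw_latent(mean, y, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y == 1, else to z <= 0."""
    p0 = STD_NORMAL.cdf(-mean)  # P(z <= 0)
    u = rng.uniform(p0, 1.0) if y == 1 else rng.uniform(0.0, p0)
    # Keep u strictly inside (0, 1); for extreme means this clamp is only approximate.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(u)


def gibbs_probit(x, y, n_iter=800, burn_in=200, seed=0):
    """Gibbs sampler with data augmentation for a two-parameter probit (flat prior)."""
    rng = random.Random(seed)
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    # V = (X'X)^{-1} for X = [1, x], and its Cholesky factor.
    v11, v12, v22 = sxx / det, -sx / det, n / det
    l11 = math.sqrt(v11)
    l21 = v12 / l11
    l22 = math.sqrt(v22 - l21 * l21)
    b0, b1 = 0.0, 0.0
    draws = []
    for it in range(n_iter):
        # Step 1: simulate the latent dependent variable given the coefficients.
        z = [draw_latent(b0 + b1 * xi, yi, rng) for xi, yi in zip(x, y)]
        # Step 2: draw the coefficients from N((X'X)^{-1} X'z, (X'X)^{-1}).
        sz = sum(z)
        sxz = sum(xi * zi for xi, zi in zip(x, z))
        m0 = v11 * sz + v12 * sxz
        m1 = v12 * sz + v22 * sxz
        e0, e1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        b0 = m0 + l11 * e0
        b1 = m1 + l21 * e0 + l22 * e1
        if it >= burn_in:
            draws.append((b0, b1))
    return draws


if __name__ == "__main__":
    x, y = simulate_probit(300, -0.5, 1.0, seed=1)
    draws = gibbs_probit(x, y, seed=2)
    print("posterior mean of intercept:", sum(d[0] for d in draws) / len(draws))
    print("posterior mean of slope:", sum(d[1] for d in draws) / len(draws))
```

In the application described above the second step would involve a system of simultaneous equations with correlated errors rather than a single regression, but the alternation between imputing latent variables and updating parameters follows the same pattern.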
Deb et al. (2006b) start with a conventional two-part model for medical expenditure.
This is applied to data on ambulatory care, which has 17% zero observations,
and hospital care, which has 94% zero observations and which exhibits positive
skewness and excess kurtosis. The data are drawn from the US MEPS for 1996–
2001, giving six repeated cross-sections and 20,460 observations. The standard
two-part set-up is used with a binary choice equation and a conditional regression
for the logarithm of expenditure. However, this is augmented by a multinomial
probit model to allow for the endogenous selection of insurance plans, which fall
into three categories: HMO, PPO (preferred provider organization) and FFS (fee-for-service).
To capture the possibility of selection bias the error terms from the
insurance equations (u) are assumed to be linearly associated with the error terms
in the two parts of the model for expenditure:
\[
\begin{aligned}
\varepsilon_1 &= u'\delta + \upsilon \\
\varepsilon_2 &= u'\pi + \tau
\end{aligned}
\tag{12.38}
\]
This assumes that the εs are only conditionally independent given u and relaxes the
usual assumption that the two parts of the model are independent. As in Geweke
et al. (2003), the full system of equations is estimated by Bayesian MCMC and
Bayes factors are used to construct a test for the exogeneity of the choice of insurance
plan. This test shows evidence of substantial selection bias. Having estimated
the model, the authors show how to define and compute estimated treatment
effects for the impact of insurance plan on expenditure, using data augmentation
to impute the latent variables. Note that the approach used to compute
the treatment effects involves a standard retransformation for log-scale data
and therefore relies on a strong assumption about the absence of heteroskedasticity
in the expenditure data (Manning and Mullahy, 2001).
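To see why retransformation from the log scale depends on homoskedasticity, consider a minimal two-part calculation. The sketch below is an illustration under stated assumptions, not the procedure used by Deb et al. (2006b): it fits the conditional log-expenditure regression by ordinary least squares with a single regressor and uses Duan's smearing factor, one common homoskedastic retransformation, whereas the source does not say which form of retransformation the authors apply. The function names, the toy data and the probability of positive expenditure are all hypothetical. The key point is that a single scalar factor multiplies every prediction, which is only valid when the log-scale error variance does not vary with the covariates.

```python
import math


def ols_fit(x, y):
    """Closed-form OLS of y on an intercept and a single regressor x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def smearing_factor(residuals):
    """Smearing estimate of E[exp(error)]: the sample mean of exp(residual)."""
    return sum(math.exp(r) for r in residuals) / len(residuals)


def expected_expenditure(x_new, p_positive, intercept, slope, smear):
    """Two-part prediction: P(y > 0 | x) * exp(intercept + slope * x) * smear."""
    return p_positive * math.exp(intercept + slope * x_new) * smear


if __name__ == "__main__":
    # Hypothetical positive-expenditure observations on the log scale.
    x = [0.0, 1.0, 2.0, 3.0]
    log_y = [1.2, 1.3, 1.8, 2.7]
    intercept, slope = ols_fit(x, log_y)
    residuals = [ly - (intercept + slope * xi) for xi, ly in zip(x, log_y)]
    smear = smearing_factor(residuals)
    # 0.06 is an illustrative probability of any use, not an estimate from the source.
    print(expected_expenditure(2.0, 0.06, intercept, slope, smear))
```

If the error variance on the log scale differs across insurance plans or other covariates, the correction factor would have to vary with them too, and applying one common factor would bias the estimated treatment effects.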