Palgrave Handbook of Econometrics: Applied Econometrics


Discrete Choice Modeling


Health Care data (our appellation) were used in Riphahn, Wambach and Million
(2003) to analyze utilization of the German health care system. The dataset is
an unbalanced panel of 7,293 individual families observed over seven periods. It is
part of the GSOEP and can be downloaded from the archive site of the Journal of
Applied Econometrics
(http://qed.econ.queensu.ca/jae/2003-v18.4/riphahn-wambach-million/).
We will use these data to illustrate the single equation and panel
data binary and ordered choice models and the models for counts presented in sections
11.3–11.6. The second dataset is also widely used to illustrate multinomial choice
models. These data, from Hensher and Greene (e.g., 2003), are a survey of 210
travelers between Sydney and Melbourne who chose among four modes: air, train,
bus and car. We will use these data to illustrate a few multinomial choice models
in section 11.7.


11.3 Binary choice


The second fundamental building block in the development of discrete choice
models (after the model of random utility) is the basic model for choice between
two alternatives. We would formulate this in a random utility framework with the
utility of two choices:


$$U_{i,1} = x_{i,1}'\beta + z_i'\gamma + \varepsilon_{i,1}$$

$$U_{i,2} = x_{i,2}'\beta + z_i'\gamma + \varepsilon_{i,2}.$$

For convenience at this point, we assume there is a single choice made, so $T_i = 1$.
The utility functions are in the index form, with characteristics and attributes and
common (generic) coefficients. The random terms, $\varepsilon_{i,1}$ and
$\varepsilon_{i,2}$, represent unmeasured influences on utility. (Looking forward,
without these random terms, the model would imply that, with sufficient data and
consistent parameter estimators, utility could be “observed” exactly, which seems
improbable at best.) Consistent with the earlier description, the analyst observes
the choice most preferred by the individual, that is, the one with the greater
utility, say choice 1. Thus, the observed outcome reveals that:
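$$U_{i,1} > U_{i,2},$$

or:

$$x_{i,1}'\beta + z_i'\gamma + \varepsilon_{i,1} > x_{i,2}'\beta + z_i'\gamma + \varepsilon_{i,2},$$

or:

$$(x_{i,1}'\beta - x_{i,2}'\beta) + (z_i'\gamma - z_i'\gamma) > (\varepsilon_{i,2} - \varepsilon_{i,1}), \tag{11.1}$$

or:

$$(x_{i,1} - x_{i,2})'\beta > (\varepsilon_{i,2} - \varepsilon_{i,1}).$$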
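
The step from the utility comparison to the differenced inequality can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `simulate_choices`, the coefficient values, the sample size and the standard normal draws for the attributes, characteristics and random terms are all assumptions made for illustration (the model as stated so far does not specify a distribution). It compares the choice implied by the two utilities with the choice implied by the differenced form, and then repeats the draw with a different value of γ to show that the common term z′γ has no effect on the observed choice.

```python
import numpy as np


def simulate_choices(gamma, n=10_000, seed=0):
    """Simulate a two-alternative random utility choice.

    Returns two boolean arrays: "choice 1 preferred" computed from the
    utilities directly, and from the differenced inequality.
    All numerical values here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5])      # hypothetical generic coefficients

    x1 = rng.normal(size=(n, 2))      # attributes of alternative 1
    x2 = rng.normal(size=(n, 2))      # attributes of alternative 2
    z = rng.normal(size=(n, 1))       # characteristics of the individual
    e1 = rng.normal(size=n)           # assumed N(0, 1) random terms
    e2 = rng.normal(size=n)

    common = z @ np.atleast_1d(gamma)  # z'gamma, identical in both utilities
    u1 = x1 @ beta + common + e1
    u2 = x2 @ beta + common + e2

    by_utility = u1 > u2                           # U_{i,1} > U_{i,2}
    by_difference = (x1 - x2) @ beta > (e2 - e1)   # differenced form
    return by_utility, by_difference


by_u, by_d = simulate_choices(gamma=0.8)
print("share agreeing, utility vs. differenced form:", np.mean(by_u == by_d))

# Same seed, so the same draws; only gamma changes.
by_u_alt, _ = simulate_choices(gamma=5.0)
print("share agreeing across two values of gamma:", np.mean(by_u == by_u_alt))
```

Apart from floating-point rounding in near-tie cases, both shares should equal one: the observed choice depends only on the attribute differences and the difference of the random terms, so data on choices alone carry no information about γ in this specification.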


This exercise reveals several identification problems in the model as stated so far.
First, we have implicitly assumed that, in the event that the two utilities are equal,
the consumer chooses alternative 2. This is a normalization: recall that we assumed
