William Greene 481
earlier that the individual makes exactly one choice. Second, it is evident that, in
describing the choice process in this fashion, it is the relative values of the
attributes of the choices that matter: the difference between $x_{i,1}$ and $x_{i,2}$ is the
determinant of the observed outcome, not the specific values of either. Third, note
that the choice-invariant component, $z_i$, has fallen out of the choice process.
The implication is that, unless the characteristics influence the utilities differently,
it is not possible to measure their impact on the choice process. Fourth, $\varepsilon_{i,1}$ and $\varepsilon_{i,2}$
are random variables with so far unspecified means and variances. With respect to
the means, if they are $\mu_1$ and $\mu_2$, only $\mu_2 - \mu_1$ enters the choice. As such, if the
means were $\mu_1 + \varphi$ and $\mu_2 + \varphi$, the same outcome would be observed. These means
cannot be measured with observed data, so at least one is normalized to zero.
Finally, consider the outcome of scaling both utilities by an arbitrary constant,
$\sigma$. The new random components would be $\sigma\varepsilon_{i,1} = \varepsilon^*_{i,1}$ and $\sigma\varepsilon_{i,2} = \varepsilon^*_{i,2}$, and $\beta$
and $\gamma$ would be scaled likewise. However, this scaling of the model would have
no impact on the observed outcome in the last line of equation (11.1). The same
choice would be observed whatever positive value $\sigma$ takes. Thus, there is yet one
more indeterminacy in the model. This can be resolved in several ways. The most
common expedient is to normalize the scaling of the random components to one.
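
The three indeterminacies described above can be checked numerically. The following is a minimal sketch, not part of the original development: it assumes utilities of the form $U_{i,j} = x_{i,j}'\beta + z_i'\gamma + \varepsilon_{i,j}$ for $j = 1, 2$, with alternative 1 chosen when $U_{i,1} > U_{i,2}$. The data, the parameter values, the normal draws, and the helper `choose` are all hypothetical and serve only to illustrate the argument.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 2

# Hypothetical data and parameters, chosen only for illustration.
x1 = rng.normal(size=(n, k))   # attributes of alternative 1
x2 = rng.normal(size=(n, k))   # attributes of alternative 2
z = rng.normal(size=(n, 1))    # choice-invariant characteristic
beta = np.array([0.5, -1.0])
gamma = np.array([0.8])
eps1 = rng.normal(size=n)      # ASSUMPTION: normal draws, for illustration only
eps2 = rng.normal(size=n)


def choose(x1, x2, z, beta, gamma, eps1, eps2):
    """Return 1 where alternative 1 has the higher utility, else 0."""
    u1 = x1 @ beta + z @ gamma + eps1
    u2 = x2 @ beta + z @ gamma + eps2
    return (u1 > u2).astype(int)


base = choose(x1, x2, z, beta, gamma, eps1, eps2)

# 1. The choice-invariant term z'gamma drops out of the comparison.
no_z = choose(x1, x2, z, beta, np.zeros(1), eps1, eps2)

# 2. Adding the same constant phi to both disturbance means changes nothing.
phi = 3.7
shifted = choose(x1, x2, z, beta, gamma, eps1 + phi, eps2 + phi)

# 3. Scaling both utilities by a positive sigma changes nothing.
sigma = 2.0
scaled = choose(x1, x2, z, sigma * beta, sigma * gamma,
                sigma * eps1, sigma * eps2)

print("z dropped:   ", np.array_equal(base, no_z))
print("means shifted:", np.array_equal(base, shifted))
print("scaled:       ", np.array_equal(base, scaled))
```

Each comparison should report that the simulated choices are unchanged, which is the sense in which the level of the means, the coefficient on a common $z_i$, and the scale $\sigma$ cannot be recovered from observed choices alone.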
Combining all of these, we obtain a conventional form of the model for the
choice between two alternatives:
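$$
\begin{aligned}
U_i &= \mu + (x_i)'\beta + z_i'\gamma + \varepsilon_i, \quad E[\varepsilon_i \mid X_i, z_i] = 0, \quad \mathrm{Var}[\varepsilon_i \mid X_i, z_i] = 1. \\
d_{i1} &= 1 \text{ if } U_i > 0 \text{ and } d_{i1} = 0 \text{ otherwise}, \\
d_{i2} &= 1 - d_{i1}.
\end{aligned}
$$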
In a more familiar arrangement, we would have:
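$$
\begin{aligned}
d_i^* &= x_i'\beta + z_i'\gamma + \varepsilon_i \\
d_i &= 1 \text{ if } d_i^* > 0, \text{ and } d_i = 0 \text{ otherwise},
\end{aligned}
\tag{11.2}
$$
where $d_i = 1$ indicates that choice 1 is selected and where the correspondence to
the components of the more detailed model is direct.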
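
To make the data-generating process in (11.2) concrete, the sketch below simulates it. Several ingredients are assumptions introduced here rather than taken from the text: the disturbance is drawn from a standard normal distribution (the model so far requires only zero mean and unit variance), the constant $\mu$ is taken to be absorbed into $x_i$ as a column of ones, and the function name `simulate_binary_choice` and all parameter values are hypothetical.

```python
import numpy as np


def simulate_binary_choice(x, z, beta, gamma, rng):
    """Draw (d_star, d) from d*_i = x_i'beta + z_i'gamma + eps_i, d_i = 1[d*_i > 0].

    ASSUMPTION: eps_i is standard normal. The text only imposes zero mean
    and unit variance; normality is used here purely for illustration.
    """
    eps = rng.standard_normal(x.shape[0])
    d_star = x @ beta + z @ gamma + eps   # latent index
    d = (d_star > 0).astype(int)          # observed binary outcome
    return d_star, d


rng = np.random.default_rng(42)
n = 5000

# Hypothetical regressors; the column of ones stands in for the constant mu.
x = np.column_stack([np.ones(n), rng.normal(size=n)])
z = rng.normal(size=(n, 1))
beta = np.array([0.2, 1.0])
gamma = np.array([-0.5])

d_star, d1 = simulate_binary_choice(x, z, beta, gamma, rng)
d2 = 1 - d1   # indicator for the second alternative

print("share choosing alternative 1:", d1.mean())
```

Only `d1` (equivalently `d2`) would be available to the econometrician; the latent `d_star` is returned here solely so that the link between the index and the observed outcome can be inspected.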
11.3.1 Regression models
The preceding sections describe an underlying theoretical platform for a binary
choice, based on a model of random utility. In order to translate it into an
econometric model, we will add the assumptions behind the stochastic component of
the specification, $\varepsilon_i$. To this point, the specification is semiparametric. We have
not assumed anything specific about the underlying distribution, only that $\varepsilon_i$
represents the random (from the point of view of the econometrician) element in the
utility function of individual $i$. The restrictions imposed (zero mean, unit variance)
are normalizations related to the identification issue and are not intended to be
substantive restrictions on behavior. (Indeed, the unit variance assumption turns
out to be unnecessary for some treatments. We will return to this below.)