484 Discrete Choice Modeling
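The log-likelihood function for the observed data is:
\[
\begin{aligned}
\ln L &= \sum_{i=1}^{n} \ln \mathrm{Prob}(d_i \mid x_i, z_i) \\
&= \sum_{d_i=1} \ln \mathrm{Prob}(d_i = 1 \mid x_i, z_i) + \sum_{d_i=0} \ln \mathrm{Prob}(d_i = 0 \mid x_i, z_i) \\
&= \sum_{i=1}^{n} \ln F[(2d_i - 1)(x_i'\beta + z_i'\gamma)].
\end{aligned}
\]
Estimation by maximizing the log-likelihood is straightforward for this model. The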
gradient of the log-likelihood is:
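\[
\frac{\partial \ln L}{\partial \begin{pmatrix} \beta \\ \gamma \end{pmatrix}}
= \sum_{i=1}^{n} (2d_i - 1)\,
\frac{F'[(2d_i - 1)(x_i'\beta + z_i'\gamma)]}{F[(2d_i - 1)(x_i'\beta + z_i'\gamma)]}
\begin{pmatrix} x_i \\ z_i \end{pmatrix}
= \sum_{i=1}^{n} g_i = g.
\]
The maximum likelihood estimators of the parameters are found by equating $g$ to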
zero, an optimization problem that requires an iterative solution.^6 For convenience
in what follows, we will define:
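\[
q_i = (2d_i - 1), \quad
w_i = \begin{pmatrix} x_i \\ z_i \end{pmatrix}, \quad
\theta = \begin{pmatrix} \beta \\ \gamma \end{pmatrix}, \quad
t_i = q_i w_i'\theta, \quad
F_i = F(t_i), \quad
F_i' = dF_i/dt_i = f_i.
\]
(Thus, $F_i$ is the cumulative distribution function (c.d.f.) and $f_i$ is the density for the
assumed distribution.) It follows that:
\[
g_i = q_i \frac{F_i'(t_i)}{F_i(t_i)}\, w_i = q_i \frac{f_i}{F_i}\, w_i.
\]
Statistical inference about the parameters is made using one of the three
conventional estimators of the asymptotic covariance matrix: the Berndt, Hall, Hall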
and Hausman (BHHH) (1974) estimator, based on the outer products of the first
derivatives:
\[
V_{BHHH} = \left[ \sum_{i=1}^{n} g_i g_i' \right]^{-1},
\]
the actual Hessian:
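\[
V_H = \left[ -\sum_{i=1}^{n} \frac{\partial^2 \ln L_i}{\partial\theta\,\partial\theta'} \right]^{-1}
= \left[ -\sum_{i=1}^{n} \frac{F_i F_i'' - (F_i')^2}{F_i^2}\, w_i w_i' \right]^{-1}
\]
(where $\ln L_i = \ln F_i$ is the contribution of observation $i$ to $\ln L$), or the expected Hessian, which can be shown to equal: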
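\[
V_{EH} = \left[ -\sum_{i=1}^{n} E_{d_i}\!\left( \frac{\partial^2 \ln L_i}{\partial\theta\,\partial\theta'} \right) \right]^{-1}
= \left[ \sum_{i=1}^{n} \frac{f(w_i'\theta)\, f(-w_i'\theta)}{F_i (1 - F_i)}\, w_i w_i' \right]^{-1}.
\]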
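The three estimators above can be checked numerically. The following is a minimal sketch, not a reference implementation. It assumes a probit specification (so $F = \Phi$, $f = \phi$, $F'' = -t\phi$), simulated data with made-up coefficients, and Newton's method as one possible iterative solver. The text itself only states that an iterative solution is required. All function names (`likelihood_pieces`, `fit`, `simulate`, `inverse`) are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math
import random

# Probit is assumed here purely for illustration: F = Phi, f = phi, f' = -t * phi.
def probit_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def probit_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def probit_dpdf(t):
    return -t * probit_pdf(t)

def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(A)
    M = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[k:] for row in M]

def likelihood_pieces(theta, W, d, F=probit_cdf, f=probit_pdf, fp=probit_dpdf):
    """Return lnL, g, and the sums that are inverted to give V_BHHH, V_H and V_EH."""
    k = len(theta)
    lnL, g = 0.0, [0.0] * k
    opg = [[0.0] * k for _ in range(k)]     # sum of g_i g_i'
    neg_H = [[0.0] * k for _ in range(k)]   # minus the actual Hessian
    neg_EH = [[0.0] * k for _ in range(k)]  # minus the expected Hessian
    for w, di in zip(W, d):
        q = 2 * di - 1
        a = sum(wj * tj for wj, tj in zip(w, theta))  # w_i' theta
        Fi, fi, fpi = F(q * a), f(q * a), fp(q * a)   # evaluated at t_i = q_i w_i' theta
        lnL += math.log(Fi)
        s = q * fi / Fi                         # g_i = s * w_i
        h = -(Fi * fpi - fi * fi) / (Fi * Fi)   # weight on w_i w_i' in minus the Hessian
        e = f(a) * f(-a) / (Fi * (1.0 - Fi))    # weight on w_i w_i' in minus E[Hessian]
        for r in range(k):
            g[r] += s * w[r]
            for c in range(k):
                opg[r][c] += s * s * w[r] * w[c]
                neg_H[r][c] += h * w[r] * w[c]
                neg_EH[r][c] += e * w[r] * w[c]
    return lnL, g, opg, neg_H, neg_EH

def simulate(n=500, theta=(0.2, 0.8, -0.5), seed=0):
    """Simulated data (hypothetical): w_i = (1, x_i, z_i), d_i = 1 if w_i'theta + e_i > 0."""
    rng = random.Random(seed)
    W, d = [], []
    for _ in range(n):
        w = [1.0, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        latent = sum(wj * tj for wj, tj in zip(w, theta)) + rng.gauss(0.0, 1.0)
        W.append(w)
        d.append(1 if latent > 0 else 0)
    return W, d

def fit(W, d, iterations=25):
    """Newton's method started at theta = 0: theta <- theta + (-H)^{-1} g."""
    theta = [0.0] * len(W[0])
    for _ in range(iterations):
        _, g, _, neg_H, _ = likelihood_pieces(theta, W, d)
        V = inverse(neg_H)
        theta = [t + sum(V[r][c] * g[c] for c in range(len(g)))
                 for r, t in enumerate(theta)]
    return theta

W, d = simulate()
theta_hat = fit(W, d)
lnL, g, opg, neg_H, neg_EH = likelihood_pieces(theta_hat, W, d)
V_BHHH, V_H, V_EH = inverse(opg), inverse(neg_H), inverse(neg_EH)

print("theta_hat:", [round(t, 4) for t in theta_hat])
for name, V in (("BHHH", V_BHHH), ("Hessian", V_H), ("expected Hessian", V_EH)):
    print(name, "std. errors:", [round(math.sqrt(V[j][j]), 4) for j in range(len(V))])
```

At the maximum the gradient `g` should be numerically zero, and the three sets of standard errors should be of similar magnitude when the model is correctly specified. Passing the logistic c.d.f. and its first two derivatives as `F`, `f` and `fp` makes `neg_H` and `neg_EH` coincide, because the actual and expected Hessians are identical for the logit model.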
It has become common, even de rigueur, to compute a “robust” covariance matrix
for the MLE using $V_H \times V_{BHHH}^{-1} \times V_H$, under the assumption that the MLE is robust
to failures of the specification of the model. In fact, there is no obvious failure of