508 Discrete Choice Modeling
the two models. The first two rows of partial effects in Table 11.3 compare the
partial effects computed at the means of the variables, shown in the first row, to
the average partial effects, computed by averaging the individual partial effects,
shown in the second row. As might be expected, the differences between them are
inconsequential.
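The two calculations compared above can be sketched as follows for a probit model. This is a minimal illustration on simulated data, not the computation behind Table 11.3: the coefficient values, the simulated regressors, and the function names are hypothetical, and all regressors are treated as continuous.

```python
import math
import random

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def partial_effects_at_means(X, beta):
    """Probit partial effects evaluated at the sample means: phi(xbar'b) * b."""
    n, k = len(X), len(beta)
    xbar = [sum(row[j] for row in X) / n for j in range(k)]
    scale = norm_pdf(sum(b * x for b, x in zip(beta, xbar)))
    return [scale * b for b in beta]

def average_partial_effects(X, beta):
    """Average of the individual partial effects: mean_i[phi(x_i'b)] * b."""
    n = len(X)
    scale = sum(
        norm_pdf(sum(b * x for b, x in zip(beta, row))) for row in X
    ) / n
    return [scale * b for b in beta]

# Hypothetical setup: a constant and two continuous regressors.
random.seed(0)
beta = [0.3, -0.2, 0.1]
X = [[1.0, random.gauss(0, 1), random.gauss(0, 1)] for _ in range(5000)]

pea = partial_effects_at_means(X, beta)
ape = average_partial_effects(X, beta)

# Report the slopes only; the entry for the constant is not a partial effect.
for name, a, b in zip(["x1", "x2"], pea[1:], ape[1:]):
    print(f"{name}: at means = {a:.4f}, averaged = {b:.4f}")
```

The two sets of values differ only through the scale factor, the density at the mean index versus the mean of the densities, so they tend to be close when the index does not vary much across observations.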
The log-likelihood for the probit model is slightly larger than for the logit;
however, the two cannot be compared on this basis because the models are non-nested.
The Vuong statistic, based on $v_i = \ln L_i(\text{logit}) - \ln L_i(\text{probit})$, equals $-7.44$, which
favors the probit model. The aggregated predictions of the pooled logit model are
shown in the following table, using the usual prediction rule, $P^* = 0.5$.
                Predicted
Actual          0          1
  0           378      9,757
  1           394     16,797
Thus, we obtain correct prediction of $(378 + 16{,}797)/27{,}326 = 62.9\%$ of the
observations. In spite of this apparently good model performance, the pseudo-$R^2$
is only $1 - (-17673.10)/(-18019.55) = 0.01923$. This suggests a disconnection
between these two measures of model performance. As a final check on the model
itself, we tested the null hypothesis that the five coefficients other than the
constant term are zero in the probit specification. The likelihood ratio test is based on
the statistic:
\[ \lambda_{LR} = 2\,[-17670.94 - 27326\,(0.37089 \ln 0.37089 + 0.62911 \ln 0.62911)] = 697.22. \]
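The arithmetic behind the prediction rate, the pseudo-$R^2$, and the likelihood ratio statistic can be checked directly from the figures reported in the text. The short sketch below uses only those reported values; the variable names are ours, and small discrepancies in the last digits are to be expected because the sample proportion is rounded to five digits.

```python
import math

n = 27326                      # number of observations

# Share of correct predictions from the 2x2 table (diagonal cells).
hit_rate = (378 + 16797) / n

# Pseudo-R^2 for the logit model: 1 - lnL / lnL0.
lnL_logit = -17673.10
lnL0_reported = -18019.55
pseudo_r2 = 1.0 - lnL_logit / lnL0_reported

# Constant-only log-likelihood rebuilt from the sample proportion of ones:
# lnL0 = n * [(1 - P) ln(1 - P) + P ln P].
p1 = 0.62911
lnL0 = n * ((1.0 - p1) * math.log(1.0 - p1) + p1 * math.log(p1))

# Likelihood ratio statistic for the probit model.
lnL_probit = -17670.94
lr = 2.0 * (lnL_probit - lnL0)

print(f"hit rate  = {hit_rate:.4f}")
print(f"pseudo-R2 = {pseudo_r2:.5f}")
print(f"lnL0      = {lnL0:.2f}")
print(f"LR        = {lr:.2f}")
```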
The Wald statistic based on the full model is $\lambda_{WALD} = 686.991$. The LM statistic is
computed as:
\[ \lambda_{LM} = g_0' X (G_0' G_0)^{-1} X' g_0, \]
where $g_0$ is the derivative of the log-likelihood when the model contains only a
constant term. This is equal to $q_{it}\phi(q_{it}\beta_0)/\Phi(q_{it}\beta_0)$, where $\beta_0 = \Phi^{-1}(0.62911) =
0.32949$. Then the $i$th row of $G$ is $g_{it,0}$ times the corresponding row of $X$. The value
of the LM statistic is 715.97. The 5% critical value from the chi-squared distribution
with 5 degrees of freedom is 11.07 so, in all three cases, the null hypothesis that
the slopes are zero is soundly rejected.
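The LM computation described above can be sketched for a simplified case with a constant and a single regressor, so that the matrix inverse is a 2x2 calculation. This is an illustration on simulated data, not the computation behind the reported value of 715.97. The coding $q = 2y - 1$, the function name `lm_statistic`, and the simulated design are assumptions introduced here.

```python
import random
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution

def lm_statistic(y, x):
    """LM statistic for H0: slope = 0 in a probit with a constant and one
    regressor, using lambda = g'X (G'G)^{-1} X'g evaluated at the
    constant-only estimates."""
    n = len(y)
    beta0 = N01.inv_cdf(sum(y) / n)    # restricted MLE of the constant
    s0 = s1 = 0.0                      # X'g, the score vector
    b00 = b01 = b11 = 0.0              # G'G, the outer product of the scores
    for yi, xi in zip(y, x):
        q = 2 * yi - 1                 # assumed coding: +1 if y = 1, -1 if y = 0
        g = q * N01.pdf(q * beta0) / N01.cdf(q * beta0)
        s0 += g
        s1 += g * xi
        b00 += g * g
        b01 += g * g * xi
        b11 += g * g * xi * xi
    det = b00 * b11 - b01 * b01
    # Quadratic form s' (G'G)^{-1} s with the 2x2 inverse written out.
    return (b11 * s0 * s0 - 2.0 * b01 * s0 * s1 + b00 * s1 * s1) / det

# Simulated data in which the slope is truly nonzero (hypothetical design).
random.seed(1)
x = [random.gauss(0, 1) for _ in range(2000)]
y = [1 if 0.3 + 0.5 * xi + random.gauss(0, 1) > 0 else 0 for xi in x]

lm = lm_statistic(y, x)
print(f"LM = {lm:.2f}")  # compare with 3.84, the 5% critical value for 1 d.f.
```

Only the restricted (constant-only) estimate is needed, which is the practical appeal of the LM test relative to the likelihood ratio and Wald tests.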
The second set of probit estimates was computed using the Gibbs sampler and a
noninformative prior. We used only 500 replications, and discarded the first 100 for
the burn-in. The similarity to the maximum likelihood estimates is what one would
expect given the large sample size. We note, however, that notwithstanding the
striking similarity of the Gibbs sampler estimates to the MLE, this is not an efficient method
of estimating the parameters of a probit model. The estimator requires generation