Palgrave Handbook of Econometrics: Applied Econometrics

William Greene

the assumptions of the model (distribution, omitted variables, heteroskedasticity,
and correlation across observations) for which the MLE remains consistent, so the
virtue of the “corrected” covariance matrix is questionable (see Freedman, 2006).
For the two distributions considered here, the derivatives are relatively simple.
For the logistic:


$$F(t) = \Lambda(t), \quad f(t) = F'(t) = \Lambda(t)[1 - \Lambda(t)], \quad F''(t) = F'(t)[1 - 2\Lambda(t)].$$

For the normal distribution (probit model), the counterparts are:


$$F(t) = \Phi(t), \quad f(t) = F'(t) = \phi(t), \quad F''(t) = -t\,\phi(t).$$

In both cases, f(t) = f(−t) and F(−t) = 1 − F(t). For estimation and inference
purposes, a further convenient result is, for the logistic distribution:


$$-\frac{F(t)F''(t) - [F'(t)]^2}{F(t)^2} = \Lambda(t)[1 - \Lambda(t)] > 0 \quad \text{for all } t,$$

while for the normal distribution:


$$-\frac{F(t)F''(t) - [F'(t)]^2}{F(t)^2} = t\,\frac{\phi(t)}{\Phi(t)} + \left[\frac{\phi(t)}{\Phi(t)}\right]^2 > 0 \quad \text{for all } t.$$^7

The implication is that both the second derivatives matrix and the expected
second derivatives matrix are negative definite for all values of the parameters and
data. Optimization using Newton’s method or the method of scoring will always
converge to the unique maximum of the log-likelihood function, so long as the
weighting matrix (V_BHHH, V_H or V_EH) is not singular.^8
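
The two curvature expressions for the logistic and normal distributions can be checked numerically. The following is a minimal sketch, not taken from the original text, that compares each closed form with a central finite-difference approximation of −d² ln F(t)/dt². The function names, the grid of t values and the step size `h` are illustrative assumptions.

```python
import math


def logistic_cdf(t):
    """Logistic cdf, Lambda(t)."""
    return 1.0 / (1.0 + math.exp(-t))


def normal_pdf(t):
    """Standard normal density, phi(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t):
    """Standard normal cdf, Phi(t), via the complementary error function."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def neg_curvature_logistic(t):
    """Closed form for -d^2 ln Lambda(t)/dt^2 = Lambda(t)[1 - Lambda(t)]."""
    lam = logistic_cdf(t)
    return lam * (1.0 - lam)


def neg_curvature_normal(t):
    """Closed form for -d^2 ln Phi(t)/dt^2 = t*phi/Phi + (phi/Phi)^2."""
    ratio = normal_pdf(t) / normal_cdf(t)
    return t * ratio + ratio * ratio


def neg_curvature_numeric(cdf, t, h=1e-3):
    """Central finite-difference approximation of -d^2 ln F(t)/dt^2."""
    second_diff = math.log(cdf(t + h)) - 2.0 * math.log(cdf(t)) + math.log(cdf(t - h))
    return -second_diff / (h * h)


def check_concavity(grid):
    """Return (smallest closed-form value, largest closed-form vs numeric gap)."""
    smallest, largest_gap = float("inf"), 0.0
    for t in grid:
        for closed, cdf in ((neg_curvature_logistic, logistic_cdf),
                            (neg_curvature_normal, normal_cdf)):
            value = closed(t)
            smallest = min(smallest, value)
            largest_gap = max(largest_gap, abs(value - neg_curvature_numeric(cdf, t)))
    return smallest, largest_gap


grid = [k / 2.0 for k in range(-8, 9)]  # t = -4.0, -3.5, ..., 4.0 (arbitrary choice)
smallest, largest_gap = check_concavity(grid)
print("smallest value of -d2 ln F/dt2 on the grid:", smallest)
print("largest closed-form vs finite-difference gap:", largest_gap)
```

A strictly positive smallest value on the grid is consistent with ln F(t) being concave for both distributions. A grid check of this kind illustrates the result but does not prove it.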


11.3.2.2 Residuals and predictions


Two additional useful results are obtained from the necessary conditions for
maximizing the log-likelihood function. First, the component of the score function that
corresponds to the constant term is:


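$$\sum_{i=1}^{n} q_i \, \frac{F'(q_i \mathbf{w}_i^{\prime}\boldsymbol{\theta})}{F(q_i \mathbf{w}_i^{\prime}\boldsymbol{\theta})} = 0.$$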

The terms in this sum are the generalized residuals of the model. As do the ordinary
residuals in the regression model, the generalized residuals sum to zero at the MLE.
These terms have been used for specification testing in this model (see Chesher
and Irish, 1987). For the logit model, it can be shown that the result above implies
that:
$$\frac{1}{n}\sum_{i=1}^{n} d_i = \frac{1}{n}\sum_{i=1}^{n} F(\mathbf{w}_i^{\prime}\boldsymbol{\theta}),$$

when F is evaluated at the MLEs of the parameters. The implication is that the
average of the predicted probabilities from the logit model will equal the proportion
of the observations that are equal to one. A similar (albeit inexact) outcome will be
seen in empirical results for the probit model. The theoretical result has not been
shown analytically.
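
Both results can be illustrated on a small example. The sketch below is not from the original text. It assumes q_i = 2d_i − 1, the usual coding in this literature, which is not restated in this section. Under that coding, the logit generalized residual q_i F′(q_i t)/F(q_i t) reduces to d_i − Λ(t), which is why the logit averages match exactly. The ten observations, the function names and the choice of a constant plus one regressor are all hypothetical. The model is fitted by Newton's method from a zero starting value, using the derivatives given for each distribution.

```python
import math


def logit_parts(s):
    """Return (F, F'/F, d^2 ln F/ds^2) for the logistic distribution at s."""
    lam = 1.0 / (1.0 + math.exp(-s))
    return lam, 1.0 - lam, -lam * (1.0 - lam)


def probit_parts(s):
    """Return (F, F'/F, d^2 ln F/ds^2) for the standard normal distribution at s."""
    cdf = 0.5 * math.erfc(-s / math.sqrt(2.0))
    ratio = math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi) / cdf
    return cdf, ratio, -(s * ratio + ratio * ratio)


def fit_newton(parts, x, d, tol=1e-10, max_iter=50):
    """Maximize sum_i ln F(q_i (a + b x_i)) over (a, b) by Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, di in zip(x, d):
            q = 2 * di - 1                      # assumed coding: q_i = 2 d_i - 1
            _, ratio, curv = parts(q * (a + b * xi))
            g0 += q * ratio                     # score, constant term
            g1 += q * ratio * xi                # score, slope
            h00 += curv                         # Hessian entries (q^2 = 1)
            h01 += curv * xi
            h11 += curv * xi * xi
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det    # first element of H^{-1} g
        step_b = (h00 * g1 - h01 * g0) / det    # second element of H^{-1} g
        a, b = a - step_a, b - step_b           # theta <- theta - H^{-1} g
        if max(abs(step_a), abs(step_b)) < tol:
            break
    return a, b


def summarize(parts, x, d):
    """Return (sum of generalized residuals, mean prediction, sample share of ones)."""
    a, b = fit_newton(parts, x, d)
    resid_sum = pred_sum = 0.0
    for xi, di in zip(x, d):
        q = 2 * di - 1
        resid_sum += q * parts(q * (a + b * xi))[1]
        pred_sum += parts(a + b * xi)[0]
    return resid_sum, pred_sum / len(d), sum(d) / len(d)


# Hypothetical data, chosen so that the two outcomes are not perfectly separated.
X = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
D = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

for name, parts in (("logit", logit_parts), ("probit", probit_parts)):
    resid_sum, mean_pred, share = summarize(parts, X, D)
    print(name, "sum of generalized residuals:", resid_sum)
    print(name, "mean prediction minus share of ones:", mean_pred - share)
```

In this sketch the generalized residuals sum to zero, up to rounding, for both models. The mean prediction reproduces the sample share exactly only for the logit; for the probit the two should be close but need not coincide.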
