William Greene 497
correlated with the included variables, $x_{it}$. Since the model is nonlinear, the least
squares estimator is unusable. The log-likelihood is:
$$\ln L = \sum_{i=1}^{n} \sum_{t=1}^{T_i} \ln F\left[q_{it}\left(\alpha_i + x_{it}'\beta + z_i'\gamma\right)\right].$$
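The log-likelihood above can be evaluated directly once the data are stacked by observation. The following is a minimal sketch, not the author's implementation. It assumes the usual sign device $q_{it} = 2d_{it} - 1$ (not defined in this passage) and a symmetric distribution $F$, with the logistic CDF as the default. The names `fe_log_likelihood`, `logistic_cdf` and `group` are hypothetical. `group` holds the individual index of each row, so unbalanced panels with differing $T_i$ are handled.

```python
import numpy as np

def logistic_cdf(v):
    """Logistic CDF; any symmetric CDF (e.g. the normal) could be passed instead."""
    return 1.0 / (1.0 + np.exp(-v))

def fe_log_likelihood(alpha, beta, gamma, d, x, z, group, cdf=logistic_cdf):
    """Evaluate ln L = sum_i sum_t ln F[q_it (alpha_i + x_it'beta + z_i'gamma)].

    alpha : (n,)        individual intercepts
    beta  : (K,)        coefficients on time-varying regressors
    gamma : (M,)        coefficients on time-invariant regressors
    d     : (N_obs,)    binary outcomes, stacked over i and t
    x     : (N_obs, K)  time-varying regressors
    z     : (n, M)      time-invariant regressors, one row per individual
    group : (N_obs,)    integer index i of the individual for each row
    """
    q = 2.0 * d - 1.0                        # assumed coding: q_it = 2 d_it - 1
    index = alpha[group] + x @ beta + z[group] @ gamma
    return np.sum(np.log(cdf(q * index)))
```

Because `alpha` has one element per individual, a direct optimizer would have to search over $n + K + M$ parameters, which is the practical difficulty raised in the text.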
In principle, direct (brute force) maximization of the function with respect to
$(\alpha_1, \ldots, \alpha_n, \beta, \gamma)$ can be used to obtain estimates of the parameters and of their
asymptotic standard errors. However, several issues arise.
- The number of individual intercept parameters may be excessive. In our application, e.g., there are 7,293 families. Direct maximization of the log-likelihood function for this many parameters is likely to be difficult. This purely practical issue does have a straightforward solution and is, in fact, not an obstacle to estimation (see Greene, 2001, 2008a, Ch. 23).
- As in the case of the linear model, it is not possible to estimate the parameters that apply to time invariant variables, $z_i$. In the linear case, the transformation to group mean deviations turns these variables into columns of zeros. A similar problem arises in this nonlinear model.
- Groups of observations in which the outcome variable, $d_{it}$, is always one or always zero for $t = 1, \ldots, T_i$ must be dropped from the sample.
- The full MLE for this model is inconsistent, a consequence of the incidental
parameters problem (see Neyman and Scott, 1948; Lancaster, 2000). The problem
arises because the number of $\alpha_i$ parameters in the model rises with $n$. With
small $T$ or $T_i$ this produces a bias in the estimator of $\beta$ that does not diminish
with increases in $n$. The best known case, that of the logit model with $T = 2$,
was documented by Andersen (1970), Hsiao (1986) and Abrevaya (1997),
who showed analytically that, with $T = 2$, the MLE of $\theta$ for the binary logit
model in the presence of the fixed effects will converge to $2\theta$. Results for other
distributions and other values of $T$ have not been obtained analytically; the
available evidence comes from Monte Carlo studies. Table 11.1, extracted from Greene (2001, 2004a,
2004b), demonstrates the effect in the probit and logit models and in the ordered probit model
discussed in section 11.5. (The conditional estimator is discussed below.) The
model contains a continuous variable, $x_{it1}$, and a dummy variable, $x_{it2}$. The
Table 11.1 Means of empirical sampling distributions, N = 1,000 individuals, based on
200 replications. Table entries are β₁, β₂.
              T = 2         T = 3         T = 5         T = 8         T = 10        T = 20
              β₁, β₂        β₁, β₂        β₁, β₂        β₁, β₂        β₁, β₂        β₁, β₂
Logit         2.020, 2.027  1.698, 1.668  1.379, 1.323  1.217, 1.156  1.161, 1.135  1.069, 1.062
Logit-Cᵃ      0.994, 1.048  1.003, 0.999  0.996, 1.017  1.005, 0.988  1.002, 0.999  1.000, 1.004
Probit        2.083, 1.938  1.821, 1.777  1.589, 1.407  1.328, 1.243  1.247, 1.169  1.108, 1.068
Ord. probit   2.328, 2.605  1.592, 1.806  1.305, 1.415  1.166, 1.220  1.131, 1.158  1.058, 1.068
ᵃ Estimates obtained using the conditional likelihood function; fixed effects not estimated.
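The $T = 2$ logit case cited above can be checked numerically with a small simulation. The sketch below is illustrative only and does not reproduce the design behind Table 11.1. It assumes a single regressor, $\alpha_i$ and $x_{it}$ drawn as independent standard normals, and a true coefficient of 1. All function names are hypothetical. The unconditional MLE is computed by concentrating out each $\alpha_i$ (solved by bisection) and taking Newton steps on the coefficient. The conditional estimator is coded for $T = 2$ only, where it reduces to a logit of the direction of the switch on the change in $x$.

```python
import numpy as np

def sigmoid(v):
    """Logistic CDF."""
    return 1.0 / (1.0 + np.exp(-v))

def simulate_panel(n, T, beta, rng):
    """Draw a balanced logit panel (assumed design: alpha_i, x_it ~ N(0, 1))."""
    alpha = rng.normal(size=(n, 1))
    x = rng.normal(size=(n, T))
    d = (rng.uniform(size=(n, T)) < sigmoid(alpha + beta * x)).astype(float)
    # Groups whose outcome is always zero or always one must be dropped.
    total = d.sum(axis=1)
    keep = (total > 0) & (total < T)
    return d[keep], x[keep]

def solve_alpha(d, x, beta, n_bisect=60):
    """Solve sum_t [d_it - F(alpha_i + beta x_it)] = 0 for every i by bisection."""
    c = beta * x
    hi = np.abs(c).max(axis=1, keepdims=True) + 20.0
    lo = -hi
    target = d.sum(axis=1, keepdims=True)
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        too_high = sigmoid(mid + c).sum(axis=1, keepdims=True) > target
        hi = np.where(too_high, mid, hi)
        lo = np.where(too_high, lo, mid)
    return 0.5 * (lo + hi)

def fe_logit_mle(d, x, n_iter=100, tol=1e-10):
    """Unconditional MLE of beta via Newton steps on the concentrated likelihood."""
    beta = 0.0
    for _ in range(n_iter):
        p = sigmoid(solve_alpha(d, x, beta) + beta * x)
        w = p * (1.0 - p)
        # Deviations of x from its weighted group mean give the concentrated Hessian.
        x_dev = x - (w * x).sum(axis=1, keepdims=True) / w.sum(axis=1, keepdims=True)
        step = ((d - p) * x).sum() / (w * x_dev**2).sum()
        beta += step
        if abs(step) < tol:
            break
    return beta

def conditional_logit_t2(d, x, n_iter=100, tol=1e-10):
    """Conditional MLE for T = 2: logit of the switch direction on the change in x."""
    v = (d[:, 1] - d[:, 0]) * (x[:, 1] - x[:, 0])
    beta = 0.0
    for _ in range(n_iter):
        p = sigmoid(beta * v)
        step = ((1.0 - p) * v).sum() / (p * (1.0 - p) * v**2).sum()
        beta += step
        if abs(step) < tol:
            break
    return beta

def run_experiment(n=1000, beta=1.0, reps=50, seed=0):
    """Return (mean unconditional MLE, mean conditional estimate) over reps samples."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(reps):
        d, x = simulate_panel(n, 2, beta, rng)
        draws.append((fe_logit_mle(d, x), conditional_logit_t2(d, x)))
    return np.mean(draws, axis=0)

if __name__ == "__main__":
    mle_mean, cond_mean = run_experiment()
    print(f"mean unconditional MLE: {mle_mean:.3f}, mean conditional: {cond_mean:.3f}")
```

Under these assumptions the unconditional estimate in each sample should equal twice the conditional estimate up to numerical tolerance, which is the qualitative pattern of the Logit and Logit-C entries in the $T = 2$ column. Bisection is used for the intercepts because an unguarded Newton step on a logistic equation can overshoot when the fitted probabilities are close to zero or one.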