762 Microeconometrics: Methods and Developments
multinomial data (where $u$ is now a vector) and normal mixtures for linear and
nonlinear panel data. Although the unobserved heterogeneity is often interpreted as a
random intercept, this can be generalized to random slopes (a random coefficients
model). An alternative is finite mixture models, used particularly in duration and
count data analysis.
Panel data offer the opportunity to permit $u$ to be dependent on $x$. In that case
$u_{it}$ is decomposed into a time-varying component that is independent of $x_{it}$ and a
time-invariant component that may be correlated with $x_{it}$. Fixed effects estimators
for these models have been discussed in section 14.5.3. It is important to note that
in nonlinear models these methods identify $\beta$ but not the ASF, so that the APEs are
only estimated up to scale.
Panel data also offer the possibility of distinguishing between persistence in
behavior over time due to unobserved heterogeneity and persistence in behavior
over time due to true state dependence. For example, rather than the static
linear model $y_{it} = x_{it}'\beta + u_i + \varepsilon_{it}$, where correlation of $u_i$ with $x_{it}$ causes
problems, a more appropriate model may be a dynamic model $y_{it} = \rho y_{i,t-1} + x_{it}'\beta + \varepsilon_{it}$,
where there is now no complication of unobserved heterogeneity. These models
have quite different structural interpretations with quite different policy
consequences. For example, high persistence of unemployment given regressors may be
due to stigma attached to being unemployed (state dependence) or may be due to
unobserved low ability (unobserved heterogeneity).
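
The difficulty of telling the two sources of persistence apart can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings that are not taken from the text: no regressors, standard normal errors, $\rho = 0.5$, and a pooled OLS regression of $y_{it}$ on $y_{i,t-1}$ (the helper `pooled_ar1_slope` is a hypothetical name). One data set has only a time-invariant individual effect, the other has only true state dependence, yet the pooled lag coefficient comes out at roughly the same value in both.

```python
import numpy as np


def pooled_ar1_slope(y):
    """Pooled OLS slope from regressing y_it on y_{i,t-1} with an intercept."""
    y_lag = y[:, :-1].ravel()   # y_{i,t-1}, stacked over i and t
    y_cur = y[:, 1:].ravel()    # y_{it}, aligned with y_lag
    X = np.column_stack([np.ones_like(y_lag), y_lag])
    coef, *_ = np.linalg.lstsq(X, y_cur, rcond=None)
    return coef[1]


rng = np.random.default_rng(0)
n, T = 5000, 6  # assumed panel dimensions

# Case A: unobserved heterogeneity only, y_it = u_i + eps_it (no state dependence).
u = rng.normal(size=(n, 1))
y_het = u + rng.normal(size=(n, T))

# Case B: true state dependence only, y_it = rho * y_{i,t-1} + eps_it.
rho = 0.5  # assumed value
y_sd = np.empty((n, T))
y_sd[:, 0] = rng.normal(size=n) / np.sqrt(1 - rho**2)  # draw from stationary distribution
for t in range(1, T):
    y_sd[:, t] = rho * y_sd[:, t - 1] + rng.normal(size=n)

slope_het = pooled_ar1_slope(y_het)  # population value Var(u) / (Var(u) + Var(eps)) = 0.5
slope_sd = pooled_ar1_slope(y_sd)    # population value rho = 0.5

print(f"heterogeneity only:    pooled lag coefficient = {slope_het:.3f}")
print(f"state dependence only: pooled lag coefficient = {slope_sd:.3f}")
```

In this stylized setup the naive lag regression cannot distinguish the two explanations, which is why the structural interpretation has to come from the model rather than from the raw persistence in the data.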
The treatment effects literature allows for unobserved heterogeneity. By assuming
that selection is on observables it is possible to estimate ATET($x$), which is the
APE for the treatment variable.
Wooldridge (2005, 2008) proposes the use of proxy variables to identify the ASF
and APE. For simplicity, consider the linear model $y = x'\beta + u$. If $E[u|x] = 0$, then
$m(x) = E[y|x] = x'\beta$, so unobserved heterogeneity causes no problem. Now consider
an omitted variables situation where $u = z'\gamma + \varepsilon$ with $E[\varepsilon|x] = 0$ but $E[z|x] \neq 0$.
The ASF is $m(x) = x'\beta + E[z]'\gamma$, whereas the conditional mean is $E[y|x] = x'\beta + E[z|x]'\gamma$.
These terms differ unless $E[z|x] = E[z]$, the case where the unobserved heterogeneity
is independent of $x$. A weaker assumption than independence is to assume that
there is a proxy variable $w$ for $z$ with the properties that (i) $x$ and $z$ are independent
conditional on $w$, so $E[z|x,w] = E[z|w]$, and (ii) $E[y|x,z,w] = E[y|x,z]$, so that
$w$ is redundant in the original model. Then $E[y|x,w] = x'\beta + E[z|w]'\gamma$, which can
be identified by regression of $y$ on $x$ and $w$. Taking the expected value with respect
to $w$ then gives the desired ASF. Wooldridge (2005) generalizes this approach to
nonlinear models and argues that, even though failure to control for unobserved
heterogeneity may lead to inconsistent parameter estimates, it is still possible in
some cases to consistently estimate the ASF and APE.
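
A minimal simulation sketch of the proxy-variable idea in the linear case follows. All numerical values, the scalar setup, and the linear form $E[z|w] = 1 + 1.5w$ are assumptions made purely for illustration; they do not come from Wooldridge's papers. The omitted variable $z$ depends on the regressor $x$ only through an observed proxy $w$, so regressing $y$ on $x$ alone is biased for $\beta$, while regressing $y$ on $x$ and $w$ and then averaging the fitted values over the sample distribution of $w$ recovers the ASF $m(x) = x\beta + E[z]\gamma$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta, gamma = 1.0, 2.0  # assumed true parameters

w = rng.normal(size=n)                    # observed proxy
x = 0.8 * w + rng.normal(size=n)          # regressor, correlated with w
z = 1.0 + 1.5 * w + rng.normal(size=n)    # unobserved; related to x only through w
eps = rng.normal(size=n)
y = beta * x + gamma * z + eps

# Short regression of y on (1, x): omits z, so the slope is biased for beta.
X_short = np.column_stack([np.ones(n), x])
b_short = np.linalg.lstsq(X_short, y, rcond=None)[0][1]

# Proxy regression of y on (1, x, w): estimates E[y | x, w].
X_proxy = np.column_stack([np.ones(n), x, w])
a_hat, b_proxy, c_hat = np.linalg.lstsq(X_proxy, y, rcond=None)[0]


def asf_hat(x0):
    """Average the fitted E[y | x0, w_i] over the sample distribution of w."""
    return np.mean(a_hat + b_proxy * x0 + c_hat * w)


# With these assumed values the true ASF is m(x0) = beta * x0 + gamma * E[z] = x0 + 2.
print(f"slope, y on x only:  {b_short:.3f}")
print(f"slope, y on x and w: {b_proxy:.3f}")
print(f"estimated ASF at x0 = 1: {asf_hat(1.0):.3f}")
```

Because $E[z|w]$ is linear here, ordinary least squares is enough; with a nonlinear $E[z|w]$ the regression on $w$ would need a more flexible specification.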
There is also a growing literature on heterogeneity in nonparametric models: see,
for example, Matzkin (2008). A simple approach is to start with the conditional
cumulative distribution function (c.d.f.) $F(y|x)$, which can be nonparametrically
estimated. Define $u = F(y|x)$; then $u$ is uniformly distributed on $(0, 1)$ and hence
uncorrelated with $x$. Inverting yields $y = F^{-1}(u|x) = G(x,u)$. This provides a
decomposition into observables $x$ and unobservables $u$ that are independent of $x$,