734 Microeconometrics: Methods and Developments
$E[y_i - \mu] = 0$ leads to the estimator $\hat{\mu} = \bar{y}$. OLS, ML, and just-identified IV estimators
can be interpreted as examples of MM estimators.
In the overidentified case, in which the dimension $r$ of $h_i$ is greater than $q$, there
are more moment conditions than parameters. Hansen (1982) proposed the GMM
estimator $\hat{\theta}_{\mathrm{GMM}}$ that minimizes the quadratic form
$$
Q(\theta) = \Bigl[\frac{1}{N}\sum_i h(w_i,\theta)\Bigr]' W_N \Bigl[\frac{1}{N}\sum_i h(w_i,\theta)\Bigr], \qquad (14.2)
$$
where $W_N$ is an $r \times r$ symmetric full-rank weighting matrix that is usually data-dependent. The resulting estimator sets a $q \times r$ linear combination of $\frac{1}{N}\sum_i h(w_i,\theta)$ equal to $\mathbf{0}$. Under appropriate assumptions, including that (14.1) holds at $\theta = \theta_0$, $\hat{\theta}_{\mathrm{GMM}}$ is asymptotically normally distributed with mean $\theta_0$ and estimated asymptotic variance matrix of "sandwich form":
$$
\hat{V}[\hat{\theta}_{\mathrm{GMM}}] = \frac{1}{N}\bigl(\hat{G}' W_N \hat{G}\bigr)^{-1} \hat{G}' W_N \hat{S} W_N \hat{G} \bigl(\hat{G}' W_N \hat{G}\bigr)^{-1}, \qquad (14.3)
$$
where $\hat{G} = N^{-1}\sum_i \partial h_i/\partial\theta' \big|_{\hat{\theta}}$ and $\hat{S}$ is a consistent estimate of $S_0 = \operatorname{plim} \frac{1}{N}\sum_i \sum_j h(w_i,\theta_0)h(w_j,\theta_0)'$. Given independence over $i$, $\hat{S}$ simplifies to $\hat{S} = \frac{1}{N}\sum_i h(w_i,\hat{\theta})h(w_i,\hat{\theta})'$, while for clustered observations adaptations similar to those given in section 14.4.1 are used.
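As a concrete illustration of (14.2) and (14.3), the following NumPy sketch (the function and variable names are invented for illustration; `H` stacks the evaluated moments $h(w_i,\theta)'$ as rows) evaluates the GMM objective and the sandwich variance:

```python
import numpy as np

def gmm_objective(H, W):
    """Q(theta) = [N^{-1} sum_i h_i]' W [N^{-1} sum_i h_i],
    where row i of H is h(w_i, theta)'."""
    hbar = H.mean(axis=0)          # r-vector: (1/N) sum_i h_i
    return hbar @ W @ hbar

def sandwich_variance(G, W, S, N):
    """Estimated asymptotic variance (14.3):
    N^{-1} (G'WG)^{-1} G'W S W G (G'WG)^{-1}."""
    bread = np.linalg.inv(G.T @ W @ G)
    meat = G.T @ W @ S @ W @ G
    return bread @ meat @ bread / N
```

In the just-identified case ($r = q$, so $\hat{G}$ is square and invertible) the sandwich collapses to $N^{-1}\hat{G}^{-1}\hat{S}\hat{G}'^{-1}$, so the choice of $W_N$ drops out, consistent with the discussion of MM estimators above.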
A leading overidentified example is IV estimation. The condition that instruments $z_i$ are uncorrelated with the error term $u_i = y_i - x_i'\beta$ in a linear regression model implies that $E[z_i(y_i - x_i'\beta)] = \mathbf{0}$. In the just-identified case the MM estimator solves $\sum_i z_i(y_i - x_i'\beta) = \mathbf{0}$, which yields the IV estimator. In the overidentified case the GMM estimator minimizes $\bigl[\sum_i z_i(y_i - x_i'\beta)\bigr]' W_N \bigl[\sum_i z_i(y_i - x_i'\beta)\bigr]$. The two-stage least squares (2SLS) estimator is the special case $W_N = \bigl[N^{-1}\sum_i z_i z_i'\bigr]^{-1}$.
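For the linear IV moments the GMM minimization has the closed form $\hat{\beta} = (X'Z W_N Z'X)^{-1} X'Z W_N Z'y$, and substituting $W_N = [N^{-1}Z'Z]^{-1}$ gives 2SLS. A minimal NumPy sketch, with simulated data and illustrative names:

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Closed-form linear GMM: beta = (X'Z W Z'X)^{-1} X'Z W Z'y."""
    A = X.T @ Z @ W @ Z.T @ X
    b = X.T @ Z @ W @ Z.T @ y
    return np.linalg.solve(A, b)

def tsls(y, X, Z):
    """2SLS: the GMM special case W_N = [N^{-1} Z'Z]^{-1}."""
    N = len(y)
    W = np.linalg.inv(Z.T @ Z / N)
    return linear_gmm(y, X, Z, W)

# Illustrative simulation: one endogenous regressor, two instruments,
# so the model is overidentified; true beta = 2.
rng = np.random.default_rng(42)
N = 5000
Z = rng.normal(size=(N, 2))                    # instruments
e = rng.normal(size=N)                         # structural error
x = Z @ np.array([1.0, 0.5]) + 0.8 * e + rng.normal(size=N)  # endogenous
y = 2.0 * x + e
beta_hat = tsls(y, x[:, None], Z)              # close to 2 in large samples
```

When $Z = X$ (exogenous regressors) this reduces mechanically to OLS, in line with the text's remark that OLS is an MM estimator.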
Estimators for dynamic panel data models, such as that of Arellano and Bond
(1991), are also overidentified GMM estimators that are in common use.
The GMM estimator reduces to the MM estimator, regardless of the choice of $W_N$, for just-identified models. For overidentified models the most efficient GMM estimator based on the moment conditions (14.1), called the optimum GMM (OGMM) or two-step GMM estimator $\hat{\theta}_{\mathrm{OGMM}}$, sets $W_N = \hat{S}^{-1}$, where $\hat{S}$ is a consistent estimate of $S_0$. Given independence over $i$, $\hat{S} = \frac{1}{N}\sum_i h(w_i,\tilde{\theta})h(w_i,\tilde{\theta})'$, where $\tilde{\theta}$ is a first-step GMM estimator based on an initial choice of $W_N$. Then the OGMM estimator has estimated asymptotic variance
$$
\hat{V}[\hat{\theta}_{\mathrm{OGMM}}] = N^{-1}\bigl(\hat{G}'\hat{S}^{-1}\hat{G}\bigr)^{-1}.
$$
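The two-step procedure can be coded directly for the linear IV moments $h_i = z_i(y_i - x_i'\beta)$: a first step with the 2SLS weighting matrix, then $\hat{S}$ from the first-step residuals, then a second step with $W_N = \hat{S}^{-1}$. A NumPy sketch assuming independence over $i$, with illustrative names and simulated data:

```python
import numpy as np

def two_step_gmm(y, X, Z):
    """Two-step optimum GMM for linear IV moments h_i = z_i (y_i - x_i' beta)."""
    N = len(y)
    # Step 1: first-step estimate using W_N = [N^{-1} Z'Z]^{-1} (2SLS).
    W1 = np.linalg.inv(Z.T @ Z / N)
    beta1 = np.linalg.solve(X.T @ Z @ W1 @ Z.T @ X, X.T @ Z @ W1 @ Z.T @ y)
    # S-hat = N^{-1} sum_i h_i h_i', with h_i evaluated at the first step.
    u = y - X @ beta1
    H = Z * u[:, None]                 # row i is h_i' = u_i z_i'
    S = H.T @ H / N
    # Step 2: optimum weighting matrix W_N = S^{-1}.
    W2 = np.linalg.inv(S)
    beta2 = np.linalg.solve(X.T @ Z @ W2 @ Z.T @ X, X.T @ Z @ W2 @ Z.T @ y)
    # Estimated asymptotic variance N^{-1} (G' S^{-1} G)^{-1}, G = -N^{-1} Z'X.
    G = -Z.T @ X / N
    V = np.linalg.inv(G.T @ W2 @ G) / N
    return beta2, V

# Illustrative simulation: true beta = 2, one endogenous regressor, two instruments.
rng = np.random.default_rng(7)
N = 5000
Z = rng.normal(size=(N, 2))
e = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5]) + 0.8 * e + rng.normal(size=N)
y = 2.0 * x + e
beta2, V = two_step_gmm(y, x[:, None], Z)
```

Recomputing $\hat{S}$ at $\hat{\theta}_{\mathrm{OGMM}}$ and iterating is also common; the small-sample caveat discussed next applies precisely to this $W_N = \hat{S}^{-1}$ choice.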
Chamberlain (1987) showed that the OGMM estimator is the fully efficient estimator based on condition (14.1). In practice, however, the optimal GMM estimator is found to suffer from small-sample bias (see Altonji and Segal, 1996), and other, simpler choices of $W_N$ may be better. This has spawned an active literature, including Windmeijer (2005) and the empirical likelihood methods given in section 14.3.2.