746 Microeconometrics: Methods and Developments
although it does require $C \to \infty$. It is essential to control for clustering, as failure
to do so can lead to greatly underestimated standard errors (see Moulton, 1990).
With few clusters the asymptotic normal distribution can perform poorly. Small
sample corrections include using the $T_{C-1}$ distribution, using a cluster bootstrap
with asymptotic refinement (Cameron, Gelbach and Miller, 2008), and using an
alternative estimator that, under some assumptions, is exactly $T_{C-2}$ distributed
as $N_c \to \infty$ (Donald and Lang, 2007). Wooldridge (2003) provides a survey;
Wooldridge (2006) updates it and gives newer results. Cameron, Gelbach and
Miller (2006) propose an extension of (14.24) to multi-way clustering.
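To illustrate why ignoring clustering understates standard errors, the following is a minimal sketch for the simplest case of OLS with a single regressor and no intercept. It assumes the usual sandwich form, in which the scores $x_i \hat{u}_i$ are summed within each cluster before being squared; the function name, the toy data, and the omission of any finite-sample correction are choices made here for illustration and are not taken from the text.

```python
import math

def ols_one_regressor(y, x, cluster):
    """OLS of y on one regressor x (no intercept) with two standard errors."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]

    # "Meat" assuming independent observations: square each score x_i * u_i.
    meat_indep = sum((xi * ui) ** 2 for xi, ui in zip(x, resid))

    # Cluster-robust "meat": sum the scores within each cluster, then square.
    scores = {}
    for xi, ui, c in zip(x, resid, cluster):
        scores[c] = scores.get(c, 0.0) + xi * ui
    meat_cluster = sum(s * s for s in scores.values())

    return {
        "beta": beta,
        "se_indep": math.sqrt(meat_indep) / sxx,
        "se_cluster": math.sqrt(meat_cluster) / sxx,
        "n_clusters": len(scores),
    }

# Hypothetical data: four clusters of three observations each, with
# outcomes identical within a cluster (perfect within-cluster correlation).
y = [1, 1, 1, 3, 3, 3, 5, 5, 5, 7, 7, 7]
x = [1.0] * len(y)  # x = 1 makes beta the sample mean
cluster = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
fit = ols_one_regressor(y, x, cluster)
```

In this artificial sample the cluster-robust standard error is larger than the independence-based one by a factor of $\sqrt{3}$, the square root of the cluster size, because the residuals are perfectly correlated within clusters. With only four clusters, inference would in practice rely on one of the small-sample corrections mentioned above, such as $T_{C-1}$ critical values.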
Cluster-robust variance matrix estimators have also been proposed for models
where $h_i(\theta)$ is spatially correlated. Driscoll and Kraay (1998) do so for panel data
where the time dimension is large. Conley (1999) presents a quite general estimator.
Clustering or spatial dependence may affect the conditional mean as well as the
conditional correlations, in which case standard estimators
become inconsistent. For clustered data one can use a model with cluster-specific
fixed effects, analogous to fixed effects for panel data. For spatial data recent
references include Anselin (2001) and Lee (2004).
The theory for robust inference is well-established and, in the independent
observations case at least, is well incorporated into microeconometrics practice.
In particular, for LS problems it is standard to estimate by OLS and then use robust
standard errors, even though there may be efficiency loss compared to doing feasible
generalized least squares (GLS). Note, however, that one can still employ feasible
GLS but then compute robust standard errors that guard against misspecification
of the model for the error variance matrix.
14.4.2 Hypothesis tests
For hypotheses on parameters of the form:
$H_0: c(\theta) = 0$
$H_a: c(\theta) \neq 0$,
the classical tests in the likelihood framework are the Wald, Lagrange multiplier
(LM), and likelihood ratio tests. For a correctly specified likelihood function these
tests are first-order asymptotically equivalent under the null hypothesis and under
local alternatives, so choice between them is one of convenience.
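As a concrete case, the three statistics can be written in closed form for an i.i.d. Bernoulli sample with $H_0: p = p_0$. The sketch below is an illustration under that assumed model, which is not an example used in the text; the function name and the sample numbers are hypothetical.

```python
import math

def bernoulli_tests(successes, n, p0):
    """Wald, LM and LR statistics for H0: p = p0 in an i.i.d. Bernoulli sample.

    Requires 0 < successes < n so that the unrestricted MLE is interior.
    """
    p_hat = successes / n  # unrestricted MLE

    def loglik(p):
        return successes * math.log(p) + (n - successes) * math.log(1 - p)

    # Wald: variance evaluated at the unrestricted estimate p_hat.
    wald = n * (p_hat - p0) ** 2 / (p_hat * (1 - p_hat))
    # LM (score): variance evaluated at the restricted value p0.
    lm = n * (p_hat - p0) ** 2 / (p0 * (1 - p0))
    # LR: twice the gap between unrestricted and restricted log-likelihoods.
    lr = 2 * (loglik(p_hat) - loglik(p0))
    return wald, lm, lr

# Hypothetical sample: 60 successes in 100 trials, testing p = 0.5.
wald, lm, lr = bernoulli_tests(successes=60, n=100, p0=0.5)
```

For these numbers the statistics are roughly 4.17 (Wald), 4.00 (LM), and 4.03 (LR). Each is compared with the $\chi^2(1)$ critical value, 3.84 at the 5% level, so they agree on the decision here while differing slightly in value.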
More recent work has focused on generalization to the non-likelihood framework,
and on finite-sample properties of the tests.
The Wald test has become the most popular of these three tests, as it generalizes
easily to non-likelihood models and is most easily robustified as detailed in
section 14.4.1. But it does have the limitation of lack of invariance to
parameterization. For example, a test of $H_0: \theta_1/\theta_2 = 1$ will lead in finite samples to a Wald
test statistic that differs from that for the equivalent hypothesis $H_0: \theta_1 - \theta_2 = 0$.
A bootstrap with asymptotic refinement (see section 14.4.4) should reduce this
lack of invariance.
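The lack of invariance can be seen numerically. The sketch below computes the Wald statistic for a single restriction by the delta method, $W = c(\hat{\theta})^2 / (\hat{R} \hat{V} \hat{R}')$ with $\hat{R} = \partial c / \partial \theta'$ evaluated at $\hat{\theta}$, for both ways of writing the hypothesis. The estimates and variance matrix are made-up numbers chosen for illustration, and the helper name is hypothetical.

```python
def wald_scalar(c_value, gradient, vcov):
    """Wald statistic for one restriction c(theta) = 0 via the delta method."""
    k = len(gradient)
    # Quadratic form R V R' for a single restriction.
    var_c = sum(gradient[j] * vcov[j][m] * gradient[m]
                for j in range(k) for m in range(k))
    return c_value ** 2 / var_c

# Hypothetical estimates and estimated variance matrix.
theta = [2.0, 1.0]
vcov = [[0.20, 0.00],
        [0.00, 0.05]]
t1, t2 = theta

# H0: theta1 - theta2 = 0, with gradient (1, -1).
wald_diff = wald_scalar(t1 - t2, [1.0, -1.0], vcov)

# H0: theta1 / theta2 - 1 = 0, with gradient (1/theta2, -theta1/theta2^2).
wald_ratio = wald_scalar(t1 / t2 - 1.0, [1.0 / t2, -t1 / t2 ** 2], vcov)
```

With these numbers the difference form gives a statistic of 4.0 and the ratio form gives 2.5. The two values fall on opposite sides of the 5% $\chi^2(1)$ critical value of 3.84, so the two algebraically equivalent hypotheses lead to different test decisions.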
The LM or score test is less commonly used, in part because it is usually implemented
by a convenient auxiliary regression that has poor finite-sample properties.