Palgrave Handbook of Econometrics: Applied Econometrics

Luc Anselin and Nancy Lozano-Gracia

accounted for. In addition, it is not always obvious how the sample of observations
relates to a population (or super-population), which has repercussions for the type
of asymptotics that can be applied.


26.4.1.1 Spatial observations


The theoretical framework outlined in section 26.2 implies that the proper esti-
mation of the model parameters should be based on observations for individual
transactions. In practice, this is not always possible and many studies instead rely
on spatially aggregated data for units such as block groups, census tracts, and even
counties (e.g., Brasington and Hite, 2005; Capozza et al., 2005; Chay and Green-
stone, 2005; Huang et al., 2006). This leads to the problem of ecological inference,
also known in geography as the modifiable areal unit problem, or in the statistical
literature as the change of support problem (Gotway and Young, 2002). As shown in
Anselin (2002), the parameters of spatial models estimated at an aggregate level (in
particular the spatial autoregressive coefficient) do not correspond to those at the
individual level. Consequently, estimates of hedonic specifications based on such
aggregate units have only a tenuous basis in micro-theory and rely on a notion of
representative agents (representative housing units) that may be highly unrealistic.
The crucial aspect determining the extent of the problem is the intra-unit hetero-
geneity. If housing units, their characteristics, or the profiles of the household
units that occupy them, vary considerably within a spatial unit, then an aggregate
analysis based on a mean or median characteristic will not be very meaningful.
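
One rough way to gauge intra-unit heterogeneity is to ask how much of the total variation in a housing characteristic lies within the spatial units rather than between them. The sketch below is an illustration only and is not taken from the chapter: the function name `within_unit_share`, the floor areas, and the tract labels are all hypothetical.

```python
import numpy as np

def within_unit_share(values, unit_ids):
    """Share of the total sum of squares in `values` that lies within units."""
    values = np.asarray(values, dtype=float)
    unit_ids = np.asarray(unit_ids)
    total_ss = np.sum((values - values.mean()) ** 2)
    within_ss = 0.0
    for unit in np.unique(unit_ids):
        v = values[unit_ids == unit]
        # Squared deviations of each house from its own unit mean.
        within_ss += np.sum((v - v.mean()) ** 2)
    return within_ss / total_ss

# Hypothetical floor areas (square metres) for six houses in two tracts.
area = [80, 200, 140, 90, 210, 150]
tract = ["A", "A", "A", "B", "B", "B"]
share = within_unit_share(area, tract)
print(round(share, 3))
```

In this made-up sample nearly all of the variation is within tracts, so the two tract means (140 and 150) say little about any individual house. A share close to zero would instead suggest that the unit mean is a reasonable summary.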
A second issue related to the change of support problem occurs when observa-
tions on some housing characteristics are not available for each individual unit.
For example, in many instances, data on socioeconomic variables related to the
households, such as income and education, cannot be obtained at the micro-level,
but instead are proxied by spatial aggregates, such as the median income or
percentage of high school graduates at the census tract level. All individual observations in
the same census tract thus share the same value for these explanatory variables.
At the very least, this leads to heteroskedastic error terms, but it may also result in
more serious specification problems, as pointed out in Moulton (1990).
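
The problem associated with Moulton (1990) is usually described as follows: when a regressor is constant within a group and the errors share a group-level component, conventional OLS standard errors tend to be too small. The simulation below is a minimal sketch under assumed values. The sample design, the data-generating process, and all variable names are hypothetical, and the cluster-robust variance is the basic sandwich form without small-sample corrections.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tracts, n_per_tract = 50, 20          # hypothetical sample design
tract = np.repeat(np.arange(n_tracts), n_per_tract)

# Tract-level regressor (e.g., median income), shared by all houses in a tract.
income = rng.normal(size=n_tracts)[tract]
# Error = shock shared within the tract + house-specific noise.
error = rng.normal(size=n_tracts)[tract] + rng.normal(size=tract.size)
price = 1.0 + 0.5 * income + error      # assumed data-generating process

X = np.column_stack([np.ones(tract.size), income])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ price
resid = price - X @ beta

# Conventional OLS variance: assumes independent, homoskedastic errors.
sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
ols_se = np.sqrt(np.diag(sigma2 * XtX_inv))

# Cluster-robust sandwich variance: allows correlation within each tract.
meat = np.zeros((2, 2))
for g in range(n_tracts):
    score = X[tract == g].T @ resid[tract == g]
    meat += np.outer(score, score)
cluster_se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("OLS SE on income:    ", ols_se[1])
print("Cluster SE on income:", cluster_se[1])
```

With a shared tract-level shock of this size, the clustered standard error on the tract-level regressor should come out well above the conventional one, which is the pattern the text warns about.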
In other instances, the change of support problem manifests itself in a mismatch
between the location and scale at which observations are collected for specific
explanatory variables and the location of the housing units. A common example
is the use of interpolated values for environmental variables related to air qual-
ity (e.g., ozone), which are typically collected at a small number of monitoring
stations. In Anselin and Le Gallo (2006), the effect of applying different inter-
polation methods on the resulting estimates of MWTP for air quality is assessed.
In a comparison of Thiessen polygons, inverse distance weighting, kriging and
splines, the geostatistical kriging method yielded the best results in terms of model
fit. More importantly, the differences in both the coefficient estimates and the
calculations of MWTP were significant between the various interpolation methods,
suggesting that greater attention to this aspect of the data is warranted.
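
To make the comparison concrete, the two simplest of these interpolators can be sketched in a few lines. This is an illustration only, not the procedure used in Anselin and Le Gallo (2006): the station coordinates, readings, house locations, and the distance power of 2 are all hypothetical, and kriging and splines are omitted.

```python
import numpy as np

def distances(houses, stations):
    """Euclidean distance from every house (rows) to every station (columns)."""
    return np.linalg.norm(houses[:, None, :] - stations[None, :, :], axis=2)

def thiessen(stations, values, houses):
    """Thiessen polygons: each house takes the value of its nearest station."""
    return values[distances(houses, stations).argmin(axis=1)]

def idw(stations, values, houses, power=2.0):
    """Inverse distance weighting: weighted mean with weights 1 / d**power."""
    d = np.maximum(distances(houses, stations), 1e-12)  # avoid division by zero
    w = 1.0 / d ** power
    return (w @ values) / w.sum(axis=1)

# Hypothetical monitoring stations (x, y) and ozone readings.
stations = np.array([[0.0, 0.0], [10.0, 0.0]])
ozone = np.array([10.0, 20.0])
# Hypothetical house locations.
houses = np.array([[2.0, 0.0], [6.0, 0.0]])

thiessen_vals = thiessen(stations, ozone, houses)
idw_vals = idw(stations, ozone, houses)
print(thiessen_vals)
print(idw_vals)
```

Even in this toy case the two methods assign different ozone values to the same houses (10 and 20 under Thiessen polygons, roughly 10.6 and 16.9 under inverse distance weighting), so any hedonic coefficient estimated on the interpolated variable would depend on the method chosen.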


26.4.1.2 Spatial sampling


The statistical foundations for the analysis of spatial hedonic models derive from
two very different paradigms, related to the way in which the sampling of
