Palgrave Handbook of Econometrics: Applied Econometrics

A. Colin Cameron

apply only to conditional means (and not the entire distribution). The assumption
implies that:


$$\alpha_{ATET}(x) = E[y_{1i} \mid X_i = x, d_i = 1] - E[y_{0i} \mid X_i = x, d_i = 0], \tag{14.35}$$

where the second term conditions on $d_i = 0$, rather than $d_i = 1$ as in the original
definition (14.33).
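The step behind (14.35) can be written out explicitly. The following is a sketch that assumes, as the text implies, that definition (14.33) has both terms conditioned on $d_i = 1$, and that the assumption in question is conditional mean independence of $y_{0i}$ from $d_i$ given $X_i$:

```latex
\begin{align*}
\alpha_{ATET}(x)
  &= E[y_{1i} \mid X_i = x, d_i = 1] - E[y_{0i} \mid X_i = x, d_i = 1]
     && \text{(assumed form of (14.33))} \\
  &= E[y_{1i} \mid X_i = x, d_i = 1] - E[y_{0i} \mid X_i = x, d_i = 0]
     && \text{since } E[y_{0i} \mid X_i = x, d_i = 1] = E[y_{0i} \mid X_i = x, d_i = 0].
\end{align*}
```

The unobservable counterfactual mean for the treated is thereby replaced by a mean that can be estimated from the untreated sample.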
The matching approach for estimating treatment effects is based on (14.35) and
compares sample averages of $y_1$ and $y_0$ for individuals with the same level of $x$.
This permits treatment effects to be heterogeneous and provides nonparametric
estimates of their average. In practice, however, such estimates become noisy or
impossible as $x$ will take many values if it is continuous or high dimensional. One
can instead use nonparametric methods, such as kernel weighting, that permit use
of individuals with similar but not exactly the same level of $x$. But more common is
to match on the probability of treatment conditional on $x$, known as the propensity
score:
$$p(x_i) = \Pr[d_i = 1 \mid x_i], \tag{14.36}$$


since Rosenbaum and Rubin (1983) showed that the conditional independence
assumption carries over to conditioning on the propensity score (that is,
$y_{0i}, y_{1i} \perp d_i \mid p(x_i)$). For example, nearest-neighbor propensity score matching uses:


$$\hat{\alpha}_{ATET} = N_1^{-1} \sum_{i:\, d_i = 1} (y_{1i} - y_{0j}),$$

where $N_1 = \sum_{i=1}^{N} d_i$ and $y_{0j}$ is the outcome for the nearest neighbor, the untreated
observation with propensity score closest to that for $y_{1i}$. Other propensity score
matching methods include kernel and stratification methods that average over
several outcomes with similar propensity scores. By estimating the propensity score
using a flexible model, such as a semiparametric binary model or a logit model with
interactions, it becomes more plausible that selection is on observables only. The
propensity scores must have suitable common support over treatment and controls
in order for matching to be feasible. For ATET it must be that $p(x_i) < 1$, that is, for
any value of the regressors it is possible to not receive treatment, while for ATE the
requirement is that $0 < p(x_i) < 1$. Note that if treatment effects are heterogeneous
and matching is valid, the estimates obtained are very problem specific and not
necessarily generalizable to other settings.
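The nearest-neighbor estimator described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `nn_match_atet` and the toy data are hypothetical, the propensity scores are taken as already estimated, and matching is assumed to be with replacement using a single neighbor, with ties going to the first untreated observation found.

```python
def nn_match_atet(y, d, pscore):
    """Nearest-neighbor propensity-score matching estimate of the ATET.

    y      : observed outcomes
    d      : treatment indicators (1 = treated, 0 = untreated)
    pscore : propensity scores p(x_i), assumed to be estimated beforehand
    """
    treated = [i for i, di in enumerate(d) if di == 1]
    controls = [j for j, dj in enumerate(d) if dj == 0]
    if not treated or not controls:
        raise ValueError("need at least one treated and one untreated observation")

    total = 0.0
    for i in treated:
        # Nearest neighbor: the untreated observation whose propensity score
        # is closest to that of treated observation i (with replacement).
        j = min(controls, key=lambda k: abs(pscore[k] - pscore[i]))
        total += y[i] - y[j]

    # Average the matched differences over the N_1 treated observations.
    return total / len(treated)


# Toy data, made up purely for illustration.
y = [5.0, 6.0, 3.0, 4.0, 2.0]
d = [1, 1, 0, 0, 0]
pscore = [0.8, 0.6, 0.75, 0.55, 0.2]

atet_hat = nn_match_atet(y, d, pscore)
print(atet_hat)
```

In the toy data the first treated observation is matched to the control with score 0.75 and the second to the control with score 0.55. The control with score 0.2 is never used, which illustrates why common support matters: controls far from any treated score contribute nothing to the estimate.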
An alternative method is to specify and estimate a more restrictive regression
model for the outcome. An obvious model is:


$$y_i = \alpha d_i + x_i'\beta + u_i, \tag{14.37}$$

which imposes the constraint that the treatment effect $\alpha$ is homogeneous. OLS
estimation of (14.37) yields a consistent estimate of the treatment effect $\alpha$, assuming
conditional independence and that (14.37) is correctly specified. This is called the
control function approach, as the regressors $x$ here include regressors that control
for selection into treatment (that is, explain $d$) as well as regressors that directly
