Robert V. Hogg, Joseph W. McKean, Allen T. Craig

10.9. Robust Concepts

where $\theta$ is a location parameter (functional) and $\varepsilon_i$ has cdf $F(t)$ and pdf $f(t)$. Let
$F_X(t)$ and $f_X(t)$ denote the cdf and pdf of $X$, respectively. Then $F_X(t) = F(t - \theta)$
and $f_X(t) = f(t - \theta)$.
To illustrate the robust concepts, we use the location estimators discussed in
Sections 10.1–10.3: the sample mean, the sample median, and the Hodges–Lehmann
estimator. It is convenient to define these estimators in terms of their estimating
equations. The estimating equation of the sample mean is given by

\[
\sum_{i=1}^{n} (X_i - \theta) = 0; \tag{10.9.2}
\]

i.e., the solution to this equation is $\hat{\theta} = \overline{X}$. The estimating equation for the sample
median is given in expression (10.2.34), which, for convenience, we repeat:

\[
\sum_{i=1}^{n} \operatorname{sgn}(X_i - \theta) = 0. \tag{10.9.3}
\]

Recall from Section 10.2 that the sample median minimizes the $L_1$-norm. So in
this section, we denote it as $\hat{\theta}_{L_1} = \operatorname{med} X_i$. Finally, the estimating equation for the
Hodges–Lehmann estimator is given by expression (10.4.27). For this section, we
denote the solution to this equation by

\[
\hat{\theta}_{HL} = \operatorname{med}_{i \le j} \left\{ \frac{X_i + X_j}{2} \right\}. \tag{10.9.4}
\]
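As a concrete illustration of the three estimators, the following Python sketch computes each one on a small made-up sample and checks the estimating equations (10.9.2) and (10.9.3) numerically. It assumes NumPy is available; the function name `hodges_lehmann` and the data are illustrative choices, not part of the text.

```python
import numpy as np


def hodges_lehmann(x):
    """Median of the Walsh averages (x_i + x_j)/2 over all pairs i <= j."""
    x = np.asarray(x, dtype=float)
    # Index pairs (i, j) with i <= j; the diagonal i == j is included.
    i, j = np.triu_indices(x.size)
    return float(np.median((x[i] + x[j]) / 2.0))


# Hypothetical sample with one large observation.
sample = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

mean_hat = float(np.mean(sample))      # solves sum(x_i - theta) = 0
median_hat = float(np.median(sample))  # solves sum(sgn(x_i - theta)) = 0
hl_hat = hodges_lehmann(sample)        # median of the Walsh averages

print("mean:", mean_hat)
print("median:", median_hat)
print("Hodges-Lehmann:", hl_hat)

# Numerical check of the two estimating equations at their solutions.
print("sum(x - mean):", np.sum(sample - mean_hat))
print("sum(sgn(x - median)):", np.sum(np.sign(sample - median_hat)))
```

Forming every pair with `np.triu_indices` takes on the order of $n^2$ memory, which is fine for small samples but would need a different approach for large $n$.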


Suppose, in general, that we have a random sample $X_1, X_2, \ldots, X_n$, which
follows the location model (10.9.1) with location parameter $\theta$. Let $\hat{\theta}$ be an estimator
of $\theta$. Ideally, $\hat{\theta}$ is not unduly influenced by an outlier in the sample, that is, a
point that is at a distance from the other points in the sample. For a realization of
the sample, this sensitivity to outliers is easy to measure. We simply add an outlier
to the data set and observe the change in the estimator.
More formally, let $\mathbf{x}_n = (x_1, x_2, \ldots, x_n)$ be a realization of the sample, let $x$ be
the additional point, and denote the augmented sample by $\mathbf{x}'_{n+1} = (\mathbf{x}'_n, x)$. Then a
simple measure is the rate of change in the estimate due to $x$ relative to the mass
of $x$, $1/(n+1)$; i.e.,

\[
S(x; \hat{\theta}) = \frac{\hat{\theta}(\mathbf{x}_{n+1}) - \hat{\theta}(\mathbf{x}_n)}{1/(n+1)}. \tag{10.9.5}
\]


This is called the sensitivity curve of the estimate $\hat{\theta}$.
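The definition in (10.9.5) translates almost directly into code. The sketch below is one possible way to evaluate it, assuming NumPy and an estimator passed in as a function of the data array; the name `sensitivity_curve` and the sample values are hypothetical.

```python
import numpy as np


def sensitivity_curve(estimator, sample, x):
    """Evaluate S(x; theta_hat) from (10.9.5) for a single added point x.

    estimator: callable mapping a 1-D array to a scalar estimate.
    sample:    the realized sample x_1, ..., x_n.
    x:         the additional point appended to the sample.
    """
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    augmented = np.append(sample, x)  # the augmented sample of size n + 1
    change = estimator(augmented) - estimator(sample)
    return float(change / (1.0 / (n + 1)))


# Hypothetical realization of the sample.
data = [1.0, 2.0, 3.0, 4.0, 5.0]

# Evaluate the curve on a grid of added points for two estimators.
for x in (-100.0, 0.0, 10.0, 1000.0):
    s_mean = sensitivity_curve(np.mean, data, x)
    s_median = sensitivity_curve(np.median, data, x)
    print(f"x = {x:8.1f}   S(mean) = {s_mean:10.2f}   S(median) = {s_median:6.2f}")
```

Passing the estimator as a callable keeps the function independent of any particular estimator, so the same routine can be reused for the sample mean, the sample median, or any other location estimator.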
As examples, consider the sample mean and median. For the sample mean, it is
easy to see that


\[
S(x; \overline{X}) = \frac{\overline{x}_{n+1} - \overline{x}_n}{1/(n+1)} = x - \overline{x}_n. \tag{10.9.6}
\]

Hence the relative change in the sample mean is a linear function of $x$. Thus, if
$x$ is large, then the change in the sample mean is also large. Actually, the change is
unbounded in $x$. Thus the sample mean is quite sensitive to the size of the outlier.
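
The closed form in (10.9.6) can be checked directly from the definition of the sample mean. Writing $\overline{x}_n$ for the mean of the original $n$ points and $\overline{x}_{n+1}$ for the mean after adding $x$, a short derivation is:

\[
\overline{x}_{n+1} = \frac{n\,\overline{x}_n + x}{n+1}
\quad\Longrightarrow\quad
\overline{x}_{n+1} - \overline{x}_n = \frac{n\,\overline{x}_n + x - (n+1)\,\overline{x}_n}{n+1} = \frac{x - \overline{x}_n}{n+1},
\]

so dividing by the mass $1/(n+1)$ gives $S(x; \overline{X}) = x - \overline{x}_n$.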
