Robert_V._Hogg,_Joseph_W._McKean,_Allen_T._Craig

304 Some Elementary Statistical Inferences

Informative discussions of such procedures can be found in Efron and Tibshirani
(1993) and Davison and Hinkley (1997).
To motivate the procedure, suppose for the moment that

$$\hat{\theta} \text{ has a } N(\theta, \sigma_{\hat{\theta}}^2) \text{ distribution.} \tag{4.9.1}$$

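Then as in Section 4.2, a $(1-\alpha)100\%$ confidence interval for $\theta$ is
$(\hat{\theta}_L, \hat{\theta}_U)$, where

$$\hat{\theta}_L = \hat{\theta} - z^{(1-\alpha/2)}\sigma_{\hat{\theta}} \quad\text{and}\quad \hat{\theta}_U = \hat{\theta} - z^{(\alpha/2)}\sigma_{\hat{\theta}}, \tag{4.9.2}$$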

and $z^{(\gamma)}$ denotes the $\gamma 100$th percentile of a standard normal random
variable; i.e., $z^{(\gamma)} = \Phi^{-1}(\gamma)$, where $\Phi$ is the cdf of a $N(0,1)$
random variable (see also Exercise 4.9.5). We have gone to a superscript notation
here to avoid confusion with the usual subscript notation on critical values.
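Now suppose that $\hat{\theta}$ and $\sigma_{\hat{\theta}}$ are realizations from the sample and
$\hat{\theta}_L$ and $\hat{\theta}_U$ are calculated as in (4.9.2). Next suppose that $\hat{\theta}^*$
is a random variable with a $N(\hat{\theta}, \sigma_{\hat{\theta}}^2)$ distribution. Then, by (4.9.2),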


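$$P(\hat{\theta}^* \le \hat{\theta}_L) = P\left(\frac{\hat{\theta}^* - \hat{\theta}}{\sigma_{\hat{\theta}}} \le -z^{(1-\alpha/2)}\right) = \alpha/2. \tag{4.9.3}$$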

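Likewise, $P(\hat{\theta}^* \le \hat{\theta}_U) = 1 - (\alpha/2)$. Therefore, $\hat{\theta}_L$ and
$\hat{\theta}_U$ are the $\frac{\alpha}{2}100$th and $(1 - \frac{\alpha}{2})100$th percentiles of the
distribution of $\hat{\theta}^*$. That is, the percentiles of the $N(\hat{\theta}, \sigma_{\hat{\theta}}^2)$
distribution form the $(1-\alpha)100\%$ confidence interval for $\theta$.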
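As a numerical check of (4.9.2) and (4.9.3), the following Python sketch compares the endpoints from (4.9.2) with the corresponding percentiles of a normal distribution centered at the realized estimate. The values of `theta_hat`, `sigma`, and `alpha` are illustrative assumptions, not taken from the text.

```python
from statistics import NormalDist

# Illustrative values (assumptions, not from the text).
theta_hat = 10.0   # realized point estimate
sigma = 2.0        # realized standard error of theta_hat
alpha = 0.05

std_normal = NormalDist()  # N(0, 1)

def z(gamma):
    """gamma*100th percentile of N(0, 1), i.e. Phi^{-1}(gamma)."""
    return std_normal.inv_cdf(gamma)

# Endpoints from (4.9.2).
theta_L = theta_hat - z(1 - alpha / 2) * sigma
theta_U = theta_hat - z(alpha / 2) * sigma

# Percentiles of the N(theta_hat, sigma^2) distribution of theta_hat*.
star = NormalDist(mu=theta_hat, sigma=sigma)
lower_pct = star.inv_cdf(alpha / 2)
upper_pct = star.inv_cdf(1 - alpha / 2)

print(theta_L, lower_pct)  # the two values agree up to rounding
print(theta_U, upper_pct)  # likewise
```

Because the standard normal is symmetric, the lower endpoint subtracts the upper percentile and vice versa, which is why the two pairs coincide.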
We want our final procedure to be quite general, so the normality assumption
(4.9.1) is definitely not desired and, in Remark 4.9.1, we do show that this
assumption is not necessary. So, in general, let $H(t)$ denote the cdf of $\hat{\theta}$.
In practice, though, we do not know the function $H(t)$. Hence the above
confidence interval defined by statement (4.9.3) cannot be obtained. But suppose we
could take an infinite number of samples $\mathbf{X}_1, \mathbf{X}_2, \ldots$; obtain
$\hat{\theta}^* = \hat{\theta}(\mathbf{X}^*)$ for each sample $\mathbf{X}^*$; and then form the
histogram of these estimates $\hat{\theta}^*$. The percentiles of this histogram would be
the confidence interval defined by expression (4.9.3). Since we only have one
sample, this is impossible. It is, however, the idea behind bootstrap procedures.
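The thought experiment above can be mimicked on a computer when the sampling distribution is known, which is never the case in practice. The sketch below is only an illustration under stated assumptions: the model (exponential with mean 1), the estimator (the sample mean), the sample size `n`, and the finite number of samples `n_samples` standing in for "an infinite number" are all hypothetical choices, not part of the text.

```python
import random
from statistics import mean, quantiles

random.seed(0)  # fixed seed so the sketch is reproducible

n = 20            # size of each sample (assumed)
n_samples = 5000  # finite stand-in for "an infinite number of samples"
alpha = 0.10

def draw_sample(size):
    """Hypothetical known model: exponential with mean 1."""
    return [random.expovariate(1.0) for _ in range(size)]

# One estimate theta_hat* per sample; here the estimator is the sample mean.
estimates = [mean(draw_sample(n)) for _ in range(n_samples)]

# Percentiles of the "histogram" of the estimates.
cuts = quantiles(estimates, n=100)  # cuts[k - 1] is the k-th percentile
lower = cuts[round(100 * alpha / 2) - 1]        # (alpha/2)100th percentile
upper = cuts[round(100 * (1 - alpha / 2)) - 1]  # (1 - alpha/2)100th percentile
print(lower, upper)
```

With only one observed sample and an unknown model, `draw_sample` is not available, which is the gap that resampling from the observed data is meant to fill.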
Bootstrap procedures simply resample from the empirical distribution defined
by the one sample. The sampling is done at random and with replacement and
the resamples are all of size $n$, the size of the original sample. That is, suppose
$\mathbf{x}' = (x_1, x_2, \ldots, x_n)$ denotes the realization of the sample. Let $\hat{F}_n$
denote the empirical distribution function of the sample. Recall that $\hat{F}_n$ is a
discrete cdf that puts mass $n^{-1}$ at each point $x_i$ and that $\hat{F}_n(x)$ is an
estimator of $F(x)$. Then a bootstrap sample is a random sample, say
$\mathbf{x}^{*\prime} = (x_1^*, x_2^*, \ldots, x_n^*)$, drawn from $\hat{F}_n$.
For example, it follows from the definition of expectation that

$$E(x_i^*) = \sum_{i=1}^{n} x_i \frac{1}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}. \tag{4.9.4}$$
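The resampling scheme and the identity (4.9.4) can be sketched in Python as follows. The data vector `x` is a hypothetical realized sample chosen only for illustration. The sketch computes the expectation under the empirical distribution exactly, draws one bootstrap sample with `random.choices` (sampling at random with replacement), and then checks the expectation by Monte Carlo.

```python
import random
from fractions import Fraction

# Hypothetical realized sample (assumed values for illustration).
x = [3, 7, 8, 12, 15]
n = len(x)

# Exact expectation under the empirical distribution: each x_i has mass 1/n,
# as in (4.9.4).
expected = sum(Fraction(xi, n) for xi in x)
x_bar = Fraction(sum(x), n)
print(expected, x_bar)  # both equal the sample mean

# One bootstrap sample: n draws from x, at random and with replacement.
random.seed(1)
x_star = random.choices(x, k=n)
print(x_star)

# Monte Carlo check: the average of many draws from the empirical
# distribution should be close to the sample mean.
draws = random.choices(x, k=100_000)
mc_mean = sum(draws) / len(draws)
print(mc_mean)
```

Exact rational arithmetic is used for the first check so that the equality in (4.9.4) holds without floating-point rounding.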