Robert V. Hogg, Joseph W. McKean, Allen T. Craig

620 Nonparametric and Robust Statistics

error distributions. Or if the distribution of the errors is thought to be quite close
to a normal distribution, then the normal scores would be a proper choice. Suppose
we use a technique that bases the score selection on the data. These techniques are
called adaptive procedures. Such a procedure could attempt to estimate the score
function; see, for example, Naranjo and McKean (1997). However, large data sets
are often needed for these. There are other adaptive procedures that attempt to
select a score from a finite class of scores based on some criteria. In this section, we
look at an adaptive testing procedure that retains the distribution-free property.
Frequently, an investigator is tempted to evaluate several test statistics associated
with a single hypothesis and then use the one statistic that best supports his
or her position, usually rejection. Obviously, this type of procedure changes the
actual significance level of the test from the nominal $\alpha$ that is used. However, there
is a way in which the investigator can first look at the data and then select a test
statistic without changing this significance level. For illustration, suppose there are
three possible test statistics, $W_1$, $W_2$, and $W_3$, of the hypothesis $H_0$ with respective
critical regions $C_1$, $C_2$, and $C_3$ such that $P(W_i \in C_i; H_0) = \alpha$, $i = 1, 2, 3$. Moreover,
suppose that a statistic $Q$, based upon the same data, selects one and only one of
the statistics $W_1$, $W_2$, $W_3$, and that the selected statistic is then used to test $H_0$.
For example, we choose to use the test statistic $W_i$ if $Q \in D_i$, $i = 1, 2, 3$, where the
events defined by $D_1$, $D_2$, and $D_3$ are mutually exclusive and exhaustive. Now if $Q$ and
each $W_i$ are independent when $H_0$ is true, then the probability of rejection, using the
entire procedure (selecting and testing), is, under $H_0$,


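\[
\begin{aligned}
& P_{H_0}(Q \in D_1, W_1 \in C_1) + P_{H_0}(Q \in D_2, W_2 \in C_2) + P_{H_0}(Q \in D_3, W_3 \in C_3) \\
&\quad = P_{H_0}(Q \in D_1) P_{H_0}(W_1 \in C_1) + P_{H_0}(Q \in D_2) P_{H_0}(W_2 \in C_2) \\
&\qquad + P_{H_0}(Q \in D_3) P_{H_0}(W_3 \in C_3) \\
&\quad = \alpha \left[ P_{H_0}(Q \in D_1) + P_{H_0}(Q \in D_2) + P_{H_0}(Q \in D_3) \right] = \alpha.
\end{aligned}
\]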

That is, the procedure of selecting $W_i$ using an independent statistic $Q$ and then
constructing a test of significance level $\alpha$ with the statistic $W_i$ has overall significance
level $\alpha$.
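
The argument above can be checked numerically. The following is a minimal Monte Carlo sketch under simplifying assumptions that are not part of the text: each test $W_i$ is represented by a p-value that is Uniform(0, 1) under $H_0$ (so that rejecting when the p-value falls below $\alpha$ has level $\alpha$), the selector $Q$ is simulated as an independent random index, and the selection probabilities `select_probs` are arbitrary illustrative values. For contrast, the sketch also estimates the rejection rate of the "use whichever statistic rejects" strategy, where the three p-values are additionally assumed to be mutually independent; real test statistics computed from the same data are usually correlated, so the inflation seen in practice would typically differ.

```python
import random


def simulate(alpha=0.05, select_probs=(0.5, 0.3, 0.2), n_rep=100_000, seed=0):
    """Estimate two rejection rates under H0 by simulation.

    Returns (adaptive_rate, cherry_picked_rate).
    All inputs are illustrative assumptions, not values from the text.
    """
    rng = random.Random(seed)
    indices = range(len(select_probs))
    adaptive = 0       # rejections when an independent Q picks the test
    cherry_picked = 0  # rejections when the most favorable test is used

    for _ in range(n_rep):
        # Under H0, each p-value is Uniform(0, 1), so P(W_i in C_i) = alpha.
        pvals = [rng.random() for _ in indices]

        # The selector is drawn independently of the p-values:
        # P(Q in D_i) = select_probs[i].
        i = rng.choices(indices, weights=select_probs)[0]
        adaptive += pvals[i] < alpha

        # Looking at all three tests and keeping the smallest p-value.
        cherry_picked += min(pvals) < alpha

    return adaptive / n_rep, cherry_picked / n_rep


if __name__ == "__main__":
    adaptive_rate, cherry_rate = simulate()
    print(f"independent selector: {adaptive_rate:.4f}")
    print(f"best of three tests:  {cherry_rate:.4f}")
```

Under these assumptions the first rate should be close to $\alpha = 0.05$, in line with the calculation above, while the second should be close to $1 - (1 - \alpha)^3 \approx 0.143$, which illustrates how choosing the most favorable statistic after the fact changes the actual significance level.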


Of course, the important element in this procedure is the ability to
find a selector $Q$ that is independent of each test statistic $W_i$. This can frequently be
done by using the fact that complete sufficient statistics for the parameters, given by
$H_0$, are independent of every statistic whose distribution is free of those parameters.
For illustration, if independent random samples of sizes $n_1$ and $n_2$ arise from two
normal distributions with respective means $\mu_1$ and $\mu_2$ and common variance $\sigma^2$,
then the complete sufficient statistics $\overline{X}$, $\overline{Y}$, and


\[
V = \sum_{i=1}^{n_1} (X_i - \overline{X})^2 + \sum_{i=1}^{n_2} (Y_i - \overline{Y})^2
\]

for $\mu_1$, $\mu_2$, and $\sigma^2$ are independent of every statistic whose distribution is free of