506 Chapter 11: Goodness of Fit Tests and Categorical Data Analysis
Similarly, the maximum value of $F(x) - F_e(x)$ is also nonnegative and occurs immediately before one of the jump points $y_{(j)}$; and so

$$\max_x \{F(x) - F_e(x)\} = \max_{j=1,\ldots,n} \left\{ F(y_{(j)}) - \frac{j-1}{n} \right\} \qquad (11.6.2)$$
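From Equations 11.6.1 and 11.6.2, we see that

$$
\begin{aligned}
D &= \max_x |F_e(x) - F(x)| \\
  &= \max\left\{ \max_x \{F_e(x) - F(x)\},\ \max_x \{F(x) - F_e(x)\} \right\} \\
  &= \max\left\{ \frac{j}{n} - F(y_{(j)}),\ F(y_{(j)}) - \frac{j-1}{n},\ j = 1, \ldots, n \right\}
\end{aligned}
\qquad (11.6.3)
$$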
Equation 11.6.3 can be used to compute the value of $D$.
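As an illustration of how Equation 11.6.3 might be evaluated, the following is a minimal Python sketch. The function name `ks_statistic`, the sample values, and the choice of the uniform (0, 1) distribution function as the hypothesized $F$ are assumptions made for this example only; they do not come from the text.

```python
def ks_statistic(sample, cdf):
    """Return D computed from Equation 11.6.3.

    sample: the observed values (any order).
    cdf:    the hypothesized distribution function F, as a callable.
    """
    if not sample:
        raise ValueError("sample must contain at least one observation")
    y = sorted(sample)          # ordered values y_(1) <= ... <= y_(n)
    n = len(y)
    d = 0.0
    for j, yj in enumerate(y, start=1):
        f = cdf(yj)
        # j/n - F(y_(j)) is the gap at the jump point,
        # F(y_(j)) - (j-1)/n is the gap immediately before it.
        d = max(d, j / n - f, f - (j - 1) / n)
    return d


def uniform_cdf(x):
    """Distribution function of the uniform (0, 1) distribution."""
    return min(max(x, 0.0), 1.0)


# Hypothetical data, chosen only to make the arithmetic easy to follow.
data = [0.7, 0.1, 0.4]
d = ks_statistic(data, uniform_cdf)
```

For these three hypothetical values the largest term is $3/3 - F(0.7) = 0.3$, so `d` is 0.3 up to floating-point rounding.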
Suppose now that the $Y_j$ are observed and their values are such that $D = d$. Since a large value of $D$ would appear to be inconsistent with the null hypothesis that $F$ is the underlying distribution, it follows that the p-value for this data set is given by

$$p\text{-value} = P_F\{D \ge d\}$$

where we have written $P_F$ to make explicit that this probability is to be computed under the assumption that $H_0$ is correct (and so $F$ is the underlying distribution).
The above p-value can be approximated by a simulation that is made easier by the following proposition, which shows that $P_F\{D \ge d\}$ does not depend on the underlying distribution $F$. This result enables us to estimate the p-value by doing the simulation with any continuous distribution $F$ we choose, and in particular with the uniform (0, 1) distribution.
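The simulation described above can be sketched as follows, assuming the uniform (0, 1) distribution is used, so that $F(x) = x$ on $(0, 1)$. This is one possible implementation, not a prescribed one: the names `ks_statistic_uniform` and `simulate_p_value`, the default number of runs, and the `seed` parameter are all assumptions introduced for illustration.

```python
import random


def ks_statistic_uniform(u):
    """D from Equation 11.6.3 when F is the uniform (0, 1) distribution."""
    u = sorted(u)
    n = len(u)
    return max(
        max(j / n - uj, uj - (j - 1) / n)
        for j, uj in enumerate(u, start=1)
    )


def simulate_p_value(d, n, num_runs=10_000, seed=None):
    """Estimate P{D >= d} for samples of size n by simulation.

    Each run draws n uniform (0, 1) values, computes D, and records
    whether it is at least d; the estimate is the fraction of such runs.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_runs):
        u = [rng.random() for _ in range(n)]
        if ks_statistic_uniform(u) >= d:
            hits += 1
    return hits / num_runs


# Hypothetical usage: observed d = 0.3 from a sample of size n = 3.
p_estimate = simulate_p_value(0.3, 3, num_runs=2000, seed=1)
```

The estimate is a sample proportion, so its accuracy improves as `num_runs` grows. Fixing `seed` only makes a run reproducible.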
PROPOSITION 11.6.1
$P_F\{D \ge d\}$ is the same for any continuous distribution $F$.
Proof
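$$
\begin{aligned}
P_F\{D \ge d\}
  &= P_F\left\{ \max_x \left| \frac{\#\, i : Y_i \le x}{n} - F(x) \right| \ge d \right\} \\
  &= P_F\left\{ \max_x \left| \frac{\#\, i : F(Y_i) \le F(x)}{n} - F(x) \right| \ge d \right\} \\
  &= P\left\{ \max_x \left| \frac{\#\, i : U_i \le F(x)}{n} - F(x) \right| \ge d \right\}
\end{aligned}
$$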