506 Chapter 11: Goodness of Fit Tests and Categorical Data Analysis
Similarly, the maximum value of $F(x) - F_e(x)$ is also nonnegative and occurs immediately before one of the jump points $y_{(j)}$; and so

$$
\max_{x} \{F(x) - F_e(x)\} = \max_{j=1,\ldots,n} \left\{ F(y_{(j)}) - \frac{j-1}{n} \right\} \tag{11.6.2}
$$

From Equations 11.6.1 and 11.6.2, we see that
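$$
\begin{aligned}
D &= \max_{x} |F_e(x) - F(x)| \\
  &= \max\left\{ \max_{x}\{F_e(x) - F(x)\},\ \max_{x}\{F(x) - F_e(x)\} \right\} \\
  &= \max\left\{ \frac{j}{n} - F(y_{(j)}),\ F(y_{(j)}) - \frac{j-1}{n},\ j = 1, \ldots, n \right\}
\end{aligned}
\tag{11.6.3}
$$

Equation 11.6.3 can be used to compute the value of $D$.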
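As a rough illustration of how Equation 11.6.3 might be evaluated, the sketch below sorts the data to obtain the jump points $y_{(j)}$ and takes the largest of the $2n$ candidate values. The function names `ks_statistic` and `uniform_cdf` and the three data values are assumptions made for this example only; they do not come from the text.

```python
def ks_statistic(data, cdf):
    """Compute D = max_j { j/n - F(y_(j)), F(y_(j)) - (j-1)/n } (Equation 11.6.3).

    `data` is the observed sample and `cdf` is the hypothesized
    distribution function F under the null hypothesis.
    """
    ys = sorted(data)              # ordered values y_(1) <= ... <= y_(n)
    n = len(ys)
    d = 0.0
    for j, y in enumerate(ys, start=1):
        f = cdf(y)
        # F_e jumps from (j-1)/n to j/n at y_(j); check both sides of the jump.
        d = max(d, j / n - f, f - (j - 1) / n)
    return d


def uniform_cdf(x):
    """Distribution function of the uniform (0, 1) distribution."""
    return min(max(x, 0.0), 1.0)


# Hypothetical data, chosen only so the arithmetic is easy to follow.
sample = [0.7, 0.1, 0.4]
d_obs = ks_statistic(sample, uniform_cdf)
print(d_obs)
```

For these three values the candidate differences are $1/3 - 0.1$, $0.1 - 0$, $2/3 - 0.4$, $0.4 - 1/3$, $1 - 0.7$, and $0.7 - 2/3$, the largest of which is $1 - 0.7 = 0.3$.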
Suppose now that the $Y_j$ are observed and their values are such that $D = d$. Since a large value of $D$ would appear to be inconsistent with the null hypothesis that $F$ is the underlying distribution, it follows that the $p$-value for this data set is given by

$$
p\text{-value} = P_F\{D \ge d\}
$$

where we have written $P_F$ to make explicit that this probability is to be computed under the assumption that $H_0$ is correct (and so $F$ is the underlying distribution).
The above $p$-value can be approximated by a simulation that is made easier by the following proposition, which shows that $P_F\{D \ge d\}$ does not depend on the underlying distribution $F$. This result enables us to estimate the $p$-value by doing the simulation with any continuous distribution $F$ we choose [thus allowing us to use the uniform (0, 1) distribution].
PROPOSITION 11.6.1
$P_F\{D \ge d\}$ is the same for any continuous distribution $F$.
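Because the proposition lets us take $F$ to be the uniform (0, 1) distribution, the $p$-value $P_F\{D \ge d\}$ can be estimated along the following lines: repeatedly generate $n$ uniform random numbers, compute $D$ for each set with $F(x) = x$, and record the fraction of runs in which $D \ge d$. This is only a minimal sketch; the names `ks_statistic_uniform` and `simulated_p_value`, the default of 10,000 runs, and the fixed seed are assumptions for illustration and are not prescribed by the text.

```python
import random


def ks_statistic_uniform(us):
    """D for a sample tested against the uniform (0, 1) distribution, F(x) = x."""
    us = sorted(us)
    n = len(us)
    return max(
        max(j / n - u, u - (j - 1) / n)
        for j, u in enumerate(us, start=1)
    )


def simulated_p_value(d, n, runs=10_000, seed=0):
    """Estimate P{D >= d} for sample size n by simulating uniform (0, 1) data."""
    rng = random.Random(seed)      # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(runs):
        us = [rng.random() for _ in range(n)]
        if ks_statistic_uniform(us) >= d:
            hits += 1
    return hits / runs             # proportion of runs with D at least d


# Hypothetical usage: observed d = 0.3 from a sample of size n = 10.
p_hat = simulated_p_value(0.3, 10)
print(p_hat)
```

The estimate is a sample proportion, so its accuracy improves as the number of runs grows; with $r$ runs its standard deviation is at most $1/(2\sqrt{r})$.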
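Proof

$$
\begin{aligned}
P_F\{D \ge d\} &= P_F\left\{ \max_{x} \left| \frac{\#\{i : Y_i \le x\}}{n} - F(x) \right| \ge d \right\} \\
&= P_F\left\{ \max_{x} \left| \frac{\#\{i : F(Y_i) \le F(x)\}}{n} - F(x) \right| \ge d \right\} \\
&= P\left\{ \max_{x} \left| \frac{\#\{i : U_i \le F(x)\}}{n} - F(x) \right| \ge d \right\}
\end{aligned}
$$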