Introduction to Probability and Statistics for Engineers and Scientists

Chapter 11: Goodness of Fit Tests and Categorical Data Analysis

Similarly, the maximum value of $F(x) - F_e(x)$ is also nonnegative and occurs immediately
before one of the jump points $y_{(j)}$; and so

$$
\max_x \{F(x) - F_e(x)\} = \max_{j=1,\ldots,n} \left\{ F(y_{(j)}) - \frac{j-1}{n} \right\} \tag{11.6.2}
$$

From Equations 11.6.1 and 11.6.2, we see that

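$$
\begin{aligned}
D &= \max_x |F_e(x) - F(x)| \\
&= \max\left\{ \max_x \{F_e(x) - F(x)\},\ \max_x \{F(x) - F_e(x)\} \right\} \\
&= \max\left\{ \frac{j}{n} - F(y_{(j)}),\ F(y_{(j)}) - \frac{j-1}{n},\ j = 1, \ldots, n \right\}
\end{aligned} \tag{11.6.3}
$$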

Equation 11.6.3 can be used to compute the value of $D$.
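As a minimal sketch of how Equation 11.6.3 might be evaluated in practice, the Python function below sorts the sample, then takes the largest of the two candidate gaps at each ordered value. The function name `ks_statistic`, the sample values, and the choice of an exponential distribution with mean 100 as the hypothesized $F$ are illustrative assumptions, not part of the text.

```python
import math


def ks_statistic(data, cdf):
    """Return D computed as in Equation 11.6.3.

    data: iterable of observed values (assumed, for illustration).
    cdf:  callable giving the hypothesized distribution function F.
    """
    y = sorted(data)              # ordered values y_(1) <= ... <= y_(n)
    n = len(y)
    d = 0.0
    for j, yj in enumerate(y, start=1):
        f = cdf(yj)
        # Gap just at the jump point (j/n - F) and just before it (F - (j-1)/n).
        d = max(d, j / n - f, f - (j - 1) / n)
    return d


# Illustrative data and hypothesized distribution (both assumptions).
sample = [66, 72, 81, 94, 112, 116, 124, 140, 145, 155]


def exp_cdf(x):
    """Exponential distribution function with mean 100."""
    return 1.0 - math.exp(-x / 100.0)


if __name__ == "__main__":
    print(ks_statistic(sample, exp_cdf))
```

Only the $n$ ordered data points need to be examined, because the empirical distribution function is constant between jump points while $F$ is nondecreasing.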
Suppose now that the $Y_j$ are observed and their values are such that $D = d$. Since a
large value of $D$ would appear to be inconsistent with the null hypothesis that $F$ is the
underlying distribution, it follows that the $p$-value for this data set is given by

$$
p\text{-value} = P_F\{D \ge d\}
$$

where we have written $P_F$ to make explicit that this probability is to be computed under
the assumption that $H_0$ is correct (and so $F$ is the underlying distribution).
The above $p$-value can be approximated by a simulation that is made easier by the
following proposition, which shows that $P_F\{D \ge d\}$ does not depend on the underlying
distribution $F$. This result enables us to estimate the $p$-value by doing the simulation
with any continuous distribution $F$ we choose [thus allowing us to use the uniform (0, 1)
distribution].
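The simulation described above could be sketched as follows, assuming the uniform (0, 1) distribution so that $F(x) = x$ on the unit interval. Each run draws $n$ uniform values, computes $D$ for them, and the estimated $p$-value is the fraction of runs in which $D \ge d$. The function names, the default number of runs, and the `seed` parameter are assumptions introduced for illustration.

```python
import random


def ks_statistic_uniform(u):
    """D for a sample tested against the uniform (0, 1) distribution, F(x) = x."""
    u = sorted(u)
    n = len(u)
    return max(
        max(j / n - uj, uj - (j - 1) / n)
        for j, uj in enumerate(u, start=1)
    )


def simulate_p_value(d, n, runs=10_000, seed=None):
    """Estimate P{D >= d} for sample size n by Monte Carlo simulation.

    runs and seed are illustrative knobs; more runs give a tighter estimate.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        u = [rng.random() for _ in range(n)]   # n uniform (0, 1) values
        if ks_statistic_uniform(u) >= d:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # Hypothetical observed statistic d for a sample of size 10.
    print(simulate_p_value(d=0.48, n=10, runs=10_000, seed=0))
```

The estimate is itself random, so repeated calls with different seeds will vary slightly; increasing `runs` reduces that variation.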

PROPOSITION 11.6.1
$P_F\{D \ge d\}$ is the same for any continuous distribution $F$.

Proof

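$$
\begin{aligned}
P_F\{D \ge d\} &= P_F\left\{ \max_x \left| \frac{\#\{i : Y_i \le x\}}{n} - F(x) \right| \ge d \right\} \\
&= P_F\left\{ \max_x \left| \frac{\#\{i : F(Y_i) \le F(x)\}}{n} - F(x) \right| \ge d \right\} \\
&= P\left\{ \max_x \left| \frac{\#\{i : U_i \le F(x)\}}{n} - F(x) \right| \ge d \right\}
\end{aligned}
$$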
