Pattern Recognition and Machine Learning

(Jeff_L) #1
11.1. Basic Sampling Algorithms 529

Figure 11.4 In the rejection sampling method,
samples are drawn from a sim-
ple distributionq(z)and rejected
if they fall in the grey area be-
tween the unnormalized distribu-
tionep(z)and the scaled distribu-
tionkq(z). The resulting samples
are distributed according top(z),
which is the normalized version of
ep(z). z 0 z

u 0

kq(z 0 ) kq(z)

p ̃(z)

We next introduce a constantkwhose value is chosen such thatkq(z) ̃p(z)for
all values ofz. The functionkq(z)is called the comparison function and is illus-
trated for a univariate distribution in Figure 11.4. Each step of the rejection sampler
involves generating two random numbers. First, we generate a numberz 0 from the
distributionq(z). Next, we generate a numberu 0 from the uniform distribution over
[0,kq(z 0 )]. This pair of random numbers has uniform distribution under the curve
of the functionkq(z). Finally, ifu 0 > ̃p(z 0 )then the sample is rejected, otherwise
u 0 is retained. Thus the pair is rejected if it lies in the grey shaded region in Fig-
ure 11.4. The remaining pairs then have uniform distribution under the curve of ̃p(z),
Exercise 11.6 and hence the correspondingzvalues are distributed according top(z), as desired.
The original values ofzare generated from the distributionq(z), and these sam-
ples are then accepted with probability ̃p(z)/kq(z), and so the probability that a
sample will be accepted is given by


p(accept) =


{ ̃p(z)/kq(z)}q(z)dz

=

1

k


̃p(z)dz. (11.14)

Thus the fraction of points that are rejected by this method depends on the ratio of
the area under the unnormalized distribution ̃p(z)to the area under the curvekq(z).
We therefore see that the constantkshould be as small as possible subject to the
limitation thatkq(z)must be nowhere less than ̃p(z).
As an illustration of the use of rejection sampling, consider the task of sampling
from the gamma distribution

Gam(z|a, b)=

baza−^1 exp(−bz)
Γ(a)

(11.15)

which, fora> 1 , has a bell-shaped form, as shown in Figure 11.5. A suitable
proposal distribution is therefore the Cauchy (11.8) because this too is bell-shaped
and because we can use the transformation method, discussed earlier, to sample from
it. We need to generalize the Cauchy slightly to ensure that it nowhere has a smaller
value than the gamma distribution. This can be achieved by transforming a uniform
random variableyusingz=btany+c, which gives random numbers distributed
Exercise 11.7 according to.

Free download pdf