Pattern Recognition and Machine Learning

514 10. APPROXIMATE INFERENCE

ep

laplace vb

Posterior mean

FLOPS

Error

104 106

10 −5

100

ep

vb

laplace

Evidence

FLOPS

Error

104 106

10 −204

10 −202

10 −200

Figure 10.17 Comparison of expectation propagation, variational inference, and the Laplace approximation on
the clutter problem. The left-hand plot shows the error in the predicted posterior mean versus the number of
floating point operations, and the right-hand plot shows the corresponding results for the model evidence.

We shall focus on the case in which the approximating distribution is fully fac-
torized, and we shall show that in this case expectation propagation reduces to loopy
belief propagation (Minka, 2001a). To start with, we show this in the context of a
simple example, and then we shall explore the general case.
First of all, recall from (10.17) that if we minimize the Kullback-Leibler diver-
genceKL(p‖q)with respect to a factorized distributionq, then the optimal solution
for each factor is simply the corresponding marginal ofp.
Now consider the factor graph shown on the left in Figure 10.18, which was
Section 8.4.4 introduced earlier in the context of the sum-product algorithm. The joint distribution
is given by
p(x)=fa(x 1 ,x 2 )fb(x 2 ,x 3 )fc(x 2 ,x 4 ). (10.225)
We seek an approximationq(x)that has the same factorization, so that

q(x)∝ ̃fa(x 1 ,x 2 ) ̃fb(x 2 ,x 3 )f ̃c(x 2 ,x 4 ). (10.226)

Note that normalization constants have been omitted, and these can be re-instated at the end by local normalization, as is generally done in belief propagation. Now sup- pose we restrict attention to approximations in which the factors themselves factorize with respect to the individual variables so that

q(x)∝ ̃fa 1 (x 1 ) ̃fa 2 (x 2 ) ̃fb 2 (x 2 ) ̃fb 3 (x 3 ) ̃fc 2 (x 2 ) ̃fc 4 (x 4 ) (10.227)

which corresponds to the factor graph shown on the right in Figure 10.18. Because the individual factors are factorized, the overall distributionq(x)is itself fully factorized. Now we apply the EP algorithm using the fully factorized approximation. Sup- pose that we have initialized all of the factors and that we choose to refine factor

Pattern Recognition and Machine Learning

514 10. APPROXIMATE INFERENCE

Get our desktop app

Company

Features

Documentation

Resources