Pattern Recognition and Machine Learning

10.7. Expectation Propagation

sides of (10.199) by q^{\j}(θ) and integrating to give

K = ∫ f̃_j(θ) q^{\j}(θ) dθ        (10.200)

where we have used the fact that q^{new}(θ) is normalized. The value of K can therefore
be found by matching zeroth-order moments

∫ f̃_j(θ) q^{\j}(θ) dθ = ∫ f_j(θ) q^{\j}(θ) dθ.        (10.201)

Combining this with (10.197), we then see that K = Z_j and so can be found by
evaluating the integral in (10.197).

In practice, several passes are made through the set of factors, revising each
factor in turn. The posterior distribution p(θ|D) is then approximated using (10.191),
and the model evidence p(D) can be approximated by using (10.190) with the factors
f_i(θ) replaced by their approximations f̃_i(θ).
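The identity K = Z_j can be checked numerically on a toy example. A minimal sketch, assuming (purely for illustration) that both the exact factor f_j and the cavity distribution q^{\j} are one-dimensional Gaussians, so that the zeroth moment also has a closed form to compare against:

```python
import numpy as np

# Dense grid for numerical integration over theta
theta = np.linspace(-20.0, 20.0, 40001)
dx = theta[1] - theta[0]

def gauss(t, mean, var):
    """Normalized Gaussian density N(t | mean, var)."""
    return np.exp(-0.5 * (t - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

f_j = gauss(theta, 1.0, 0.5)    # exact factor f_j(theta)   (made-up example)
q_cav = gauss(theta, 0.0, 1.0)  # cavity q^{\j}(theta)      (made-up example)

# Zeroth moment Z_j = \int f_j(theta) q^{\j}(theta) dtheta, evaluated numerically;
# by (10.200)-(10.201) this equals the normalizer K of the refined factor.
Z_j = np.sum(f_j * q_cav) * dx

# For two Gaussians the integral has the closed form N(m1 | m2, v1 + v2).
closed = gauss(1.0, 0.0, 0.5 + 1.0)
```

Here `Z_j` and `closed` agree to numerical precision, confirming that K can be read off from the integral in (10.197) without any extra computation.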


Expectation Propagation

We are given a joint distribution over observed data D and stochastic variables
θ in the form of a product of factors

p(D, θ) = ∏_i f_i(θ)        (10.202)

and we wish to approximate the posterior distribution p(θ|D) by a distribution
of the form

q(θ) = (1/Z) ∏_i f̃_i(θ).        (10.203)

We also wish to approximate the model evidence p(D).
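When each approximating factor f̃_i is chosen from an exponential family, the product in (10.203) stays inside that family and can be formed by adding natural parameters. A minimal sketch for unnormalized Gaussian sites, where each f̃_i is stored as a (precision, precision×mean) pair; the site values below are made up:

```python
import numpy as np

# Hypothetical Gaussian sites f~_i, stored in natural parameters:
# tau_i = 1/v_i (precision), nu_i = m_i/v_i (precision times mean).
tau = np.array([0.5, 1.0, 0.25])
nu = np.array([0.2, -0.5, 0.1])

# The product over i in (10.203) is again Gaussian: natural parameters add.
tau_q = tau.sum()        # precision of q(theta)
nu_q = nu.sum()          # precision*mean of q(theta)

mean_q = nu_q / tau_q    # moment parameters of q(theta)
var_q = 1.0 / tau_q
```

Multiplying or dividing factors then reduces to adding or subtracting these parameter vectors, which is what makes the removal step in the algorithm below cheap.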


  1. Initialize all of the approximating factors f̃_i(θ).

  2. Initialize the posterior approximation by setting

q(θ) ∝ ∏_i f̃_i(θ).        (10.204)


  3. Until convergence:
    (a) Choose a factor f̃_j(θ) to refine.
    (b) Remove f̃_j(θ) from the posterior by division

q^{\j}(θ) = q(θ) / f̃_j(θ).        (10.205)
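The loop above can be sketched end to end for a one-dimensional problem. This is an illustrative implementation only: the likelihood factors are a made-up two-component Gaussian mixture (in the spirit of a clutter model), the sites f̃_i are Gaussians stored as natural parameters so that the division in (10.205) is a subtraction, and the moment matching is done by numerical integration on a grid rather than in closed form:

```python
import numpy as np

GRID = np.linspace(-15.0, 15.0, 4001)   # integration grid for theta
DX = GRID[1] - GRID[0]

def gauss(t, mean, var):
    return np.exp(-0.5 * (t - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def ep_sketch(x, w=0.5, prior_var=100.0, n_sweeps=20):
    """EP for mixture factors f_i(theta) = (1-w) N(x_i|theta,1) + w N(x_i|0,10)
    with prior N(theta|0, prior_var).  Sites are Gaussian, held as natural
    parameters (tau = precision, nu = precision*mean), initialized to zero so
    that q(theta) starts equal to the prior (the prior factor is kept exact)."""
    n = len(x)
    tau_site = np.zeros(n)
    nu_site = np.zeros(n)
    tau_q, nu_q = 1.0 / prior_var, 0.0
    for _ in range(n_sweeps):
        tau_old, nu_old = tau_q, nu_q
        for j in range(n):
            # (b) remove site j by division (10.205): subtract natural params
            tau_cav = tau_q - tau_site[j]
            nu_cav = nu_q - nu_site[j]
            if tau_cav <= 0:            # skip updates that break positivity
                continue
            m_cav, v_cav = nu_cav / tau_cav, 1.0 / tau_cav
            # form the tilted distribution f_j(theta) q^{\j}(theta) on the grid
            f_j = (1 - w) * gauss(x[j], GRID, 1.0) + w * gauss(x[j], 0.0, 10.0)
            tilted = f_j * gauss(GRID, m_cav, v_cav)
            Z = np.sum(tilted) * DX                       # zeroth moment (Z_j)
            m_new = np.sum(GRID * tilted) * DX / Z        # first moment
            v_new = np.sum(GRID ** 2 * tilted) * DX / Z - m_new ** 2
            # set q to the moment-matched Gaussian and update site j so that
            # q(theta) = f~_j(theta) q^{\j}(theta) still factorizes
            tau_q, nu_q = 1.0 / v_new, m_new / v_new
            tau_site[j] = tau_q - tau_cav
            nu_site[j] = nu_q - nu_cav
        if abs(tau_q - tau_old) + abs(nu_q - nu_old) < 1e-10:
            break
    return nu_q / tau_q, 1.0 / tau_q    # posterior mean and variance

mean, var = ep_sketch(np.array([1.8, 2.2, -0.5, 2.0]))
```

In a real implementation the tilted moments would typically be computed in closed form for the chosen factor family; the grid integration here just keeps the sketch self-contained.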