Pattern Recognition and Machine Learning

10.7. Expectation Propagation

[Figure: the left panel shows the factor graph with variables $x_1, x_2, x_3, x_4$ and factors $f_a, f_b, f_c$; the right panel shows the factorized approximation with single-variable factors $\tilde{f}_{a1}, \tilde{f}_{a2}, \tilde{f}_{b2}, \tilde{f}_{b3}, \tilde{f}_{c2}, \tilde{f}_{c4}$.]

Figure 10.18 On the left is a simple factor graph, taken from Figure 8.51 and reproduced here for convenience. On the right is the corresponding factorized approximation.


$\tilde{f}_b(x_2, x_3) = \tilde{f}_{b2}(x_2)\,\tilde{f}_{b3}(x_3)$. We first remove this factor from the approximating distribution to give

$$q^{\backslash b}(\mathbf{x}) = \tilde{f}_{a1}(x_1)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2)\,\tilde{f}_{c4}(x_4) \tag{10.228}$$

and we then multiply this by the exact factor $f_b(x_2, x_3)$ to give

$$\hat{p}(\mathbf{x}) = q^{\backslash b}(\mathbf{x})\,f_b(x_2, x_3) = \tilde{f}_{a1}(x_1)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2)\,\tilde{f}_{c4}(x_4)\,f_b(x_2, x_3). \tag{10.229}$$
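For intuition, with small discrete variables the unnormalized $\hat{p}(\mathbf{x})$ can be assembled explicitly as a table. A minimal sketch, in which the binary factor values are arbitrary illustrative choices (not taken from the text):

```python
import numpy as np

# Arbitrary illustrative factors over binary variables (assumed values).
fa1 = np.array([0.5, 0.5])     # f~a1(x1)
fa2 = np.array([0.7, 0.3])     # f~a2(x2)
fc2 = np.array([0.6, 0.4])     # f~c2(x2)
fc4 = np.array([0.8, 0.2])     # f~c4(x4)
fb  = np.array([[0.9, 0.1],
                [0.2, 0.8]])   # exact factor fb(x2, x3)

# q\b(x): product of the remaining approximate factors, eq. (10.228).
# Einsum indices: a = x1, b = x2, d = x4 (q\b does not involve x3).
q_minus_b = np.einsum('a,b,b,d->abd', fa1, fa2, fc2, fc4)

# p^(x) = q\b(x) fb(x2, x3), eq. (10.229): a table over (x1, x2, x3, x4).
p_hat = np.einsum('abd,bc->abcd', q_minus_b, fb)
```

Tabulating the joint like this is only feasible for tiny discrete models, of course; the point of the factorized update that follows is precisely to avoid it.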

We now find $q^{\text{new}}(\mathbf{x})$ by minimizing the Kullback-Leibler divergence $\mathrm{KL}(\hat{p} \,\|\, q^{\text{new}})$. The result, as noted above, is that $q^{\text{new}}(\mathbf{x})$ comprises the product of factors, one for each variable $x_i$, in which each factor is given by the corresponding marginal of $\hat{p}(\mathbf{x})$. These four marginals are given by

$$\hat{p}(x_1) \propto \tilde{f}_{a1}(x_1) \tag{10.230}$$

$$\hat{p}(x_2) \propto \tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2) \sum_{x_3} f_b(x_2, x_3) \tag{10.231}$$

$$\hat{p}(x_3) \propto \sum_{x_2} \left\{ f_b(x_2, x_3)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2) \right\} \tag{10.232}$$

$$\hat{p}(x_4) \propto \tilde{f}_{c4}(x_4) \tag{10.233}$$

and $q^{\text{new}}(\mathbf{x})$ is obtained by multiplying these marginals together. We see that the only factors in $q(\mathbf{x})$ that change when we update $\tilde{f}_b(x_2, x_3)$ are those that involve the variables in $f_b$, namely $x_2$ and $x_3$. To obtain the refined factor $\tilde{f}_b(x_2, x_3) = \tilde{f}_{b2}(x_2)\,\tilde{f}_{b3}(x_3)$ we simply divide $q^{\text{new}}(\mathbf{x})$ by $q^{\backslash b}(\mathbf{x})$, which gives

$$\tilde{f}_{b2}(x_2) \propto \sum_{x_3} f_b(x_2, x_3) \tag{10.234}$$

$$\tilde{f}_{b3}(x_3) \propto \sum_{x_2} \left\{ f_b(x_2, x_3)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2) \right\}. \tag{10.235}$$
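As a numerical check of this update, the sketch below (with arbitrary illustrative binary factors, not values from the text) computes the marginals (10.231) and (10.232) and then divides out $q^{\backslash b}$: the single-variable $x_2$ factors $\tilde{f}_{a2}$ and $\tilde{f}_{c2}$ cancel, recovering (10.234) and (10.235) directly.

```python
import numpy as np

# Arbitrary illustrative binary factors (assumed values, not from the text).
fa2 = np.array([0.7, 0.3])     # f~a2(x2)
fc2 = np.array([0.6, 0.4])     # f~c2(x2)
fb  = np.array([[0.9, 0.1],
                [0.2, 0.8]])   # exact factor fb(x2, x3)

# Unnormalized marginals of p^ over x2 and x3, eqs. (10.231)-(10.232).
p2 = fa2 * fc2 * fb.sum(axis=1)                  # sum over x3
p3 = (fb * (fa2 * fc2)[:, None]).sum(axis=0)     # sum over x2

# Refined factors: divide q_new by q\b.  The x2 factors fa2, fc2 cancel,
# and q\b contains no x3 factor at all, giving eqs. (10.234)-(10.235).
fb2_tilde = p2 / (fa2 * fc2)   # proportional to sum_{x3} fb(x2, x3)
fb3_tilde = p3

print(np.allclose(fb2_tilde, fb.sum(axis=1)))    # True: matches (10.234)
```

Note that $\tilde{f}_{b3}$ retains the neighbouring $x_2$ factors inside the sum, which is how EP lets the context of the rest of the approximation influence each refined factor.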
