Pattern Recognition and Machine Learning

10.7. Expectation Propagation

[Figure: the left panel shows the factor graph with variables $x_1, x_2, x_3, x_4$ and factors $f_a, f_b, f_c$; the right panel shows the factorized approximation with single-variable factors $\tilde{f}_{a1}, \tilde{f}_{a2}, \tilde{f}_{b2}, \tilde{f}_{b3}, \tilde{f}_{c2}, \tilde{f}_{c4}$.]

Figure 10.18 On the left is a simple factor graph, taken from Figure 8.51 and reproduced here for convenience. On the right is the corresponding factorized approximation.


$\tilde{f}_b(x_2, x_3) = \tilde{f}_{b2}(x_2)\,\tilde{f}_{b3}(x_3)$. We first remove this factor from the approximating distribution to give

$$q^{\backslash b}(\mathbf{x}) = \tilde{f}_{a1}(x_1)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2)\,\tilde{f}_{c4}(x_4) \tag{10.228}$$

and we then multiply this by the exact factor $f_b(x_2, x_3)$ to give

$$\hat{p}(\mathbf{x}) = q^{\backslash b}(\mathbf{x})\,f_b(x_2, x_3) = \tilde{f}_{a1}(x_1)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2)\,\tilde{f}_{c4}(x_4)\,f_b(x_2, x_3). \tag{10.229}$$
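For intuition, with small discrete variables the unnormalized $\hat{p}(\mathbf{x})$ can be assembled explicitly as a table. A minimal sketch, in which the binary factor values are arbitrary illustrative choices (not taken from the text):

```python
import numpy as np

# Arbitrary illustrative factors over binary variables (assumed values).
fa1 = np.array([0.5, 0.5])     # f~a1(x1)
fa2 = np.array([0.7, 0.3])     # f~a2(x2)
fc2 = np.array([0.6, 0.4])     # f~c2(x2)
fc4 = np.array([0.8, 0.2])     # f~c4(x4)
fb  = np.array([[0.9, 0.1],
                [0.2, 0.8]])   # exact factor fb(x2, x3)

# q\b(x): product of the remaining approximate factors, eq. (10.228).
# Einsum indices: a = x1, b = x2, d = x4 (q\b does not involve x3).
q_minus_b = np.einsum('a,b,b,d->abd', fa1, fa2, fc2, fc4)

# p^(x) = q\b(x) fb(x2, x3), eq. (10.229): a table over (x1, x2, x3, x4).
p_hat = np.einsum('abd,bc->abcd', q_minus_b, fb)
```

Tabulating the joint like this is only feasible for tiny discrete models, of course; the point of the factorized update that follows is precisely to avoid it.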

We now find $q^{\text{new}}(\mathbf{x})$ by minimizing the Kullback-Leibler divergence $\mathrm{KL}(\hat{p} \,\|\, q^{\text{new}})$. The result, as noted above, is that $q^{\text{new}}(\mathbf{x})$ comprises the product of factors, one for each variable $x_i$, in which each factor is given by the corresponding marginal of $\hat{p}(\mathbf{x})$. These four marginals are given by

$$\hat{p}(x_1) \propto \tilde{f}_{a1}(x_1) \tag{10.230}$$

$$\hat{p}(x_2) \propto \tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2) \sum_{x_3} f_b(x_2, x_3) \tag{10.231}$$

$$\hat{p}(x_3) \propto \sum_{x_2} \left\{ f_b(x_2, x_3)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2) \right\} \tag{10.232}$$

$$\hat{p}(x_4) \propto \tilde{f}_{c4}(x_4) \tag{10.233}$$

and $q^{\text{new}}(\mathbf{x})$ is obtained by multiplying these marginals together. We see that the only factors in $q(\mathbf{x})$ that change when we update $\tilde{f}_b(x_2, x_3)$ are those that involve the variables in $f_b$, namely $x_2$ and $x_3$. To obtain the refined factor $\tilde{f}_b(x_2, x_3) = \tilde{f}_{b2}(x_2)\,\tilde{f}_{b3}(x_3)$ we simply divide $q^{\text{new}}(\mathbf{x})$ by $q^{\backslash b}(\mathbf{x})$, which gives

$$\tilde{f}_{b2}(x_2) \propto \sum_{x_3} f_b(x_2, x_3) \tag{10.234}$$

$$\tilde{f}_{b3}(x_3) \propto \sum_{x_2} \left\{ f_b(x_2, x_3)\,\tilde{f}_{a2}(x_2)\,\tilde{f}_{c2}(x_2) \right\}. \tag{10.235}$$
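As a numerical check of this update, the sketch below (with arbitrary illustrative binary factors, not values from the text) computes the marginals (10.231) and (10.232) and then divides out $q^{\backslash b}$: the single-variable $x_2$ factors $\tilde{f}_{a2}$ and $\tilde{f}_{c2}$ cancel, recovering (10.234) and (10.235) directly.

```python
import numpy as np

# Arbitrary illustrative binary factors (assumed values, not from the text).
fa2 = np.array([0.7, 0.3])     # f~a2(x2)
fc2 = np.array([0.6, 0.4])     # f~c2(x2)
fb  = np.array([[0.9, 0.1],
                [0.2, 0.8]])   # exact factor fb(x2, x3)

# Unnormalized marginals of p^ over x2 and x3, eqs. (10.231)-(10.232).
p2 = fa2 * fc2 * fb.sum(axis=1)                  # sum over x3
p3 = (fb * (fa2 * fc2)[:, None]).sum(axis=0)     # sum over x2

# Refined factors: divide q_new by q\b.  The x2 factors fa2, fc2 cancel,
# and q\b contains no x3 factor at all, giving eqs. (10.234)-(10.235).
fb2_tilde = p2 / (fa2 * fc2)   # proportional to sum_{x3} fb(x2, x3)
fb3_tilde = p3

print(np.allclose(fb2_tilde, fb.sum(axis=1)))    # True: matches (10.234)
```

Note that $\tilde{f}_{b3}$ retains the neighbouring $x_2$ factors inside the sum, which is how EP lets the context of the rest of the approximation influence each refined factor.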
