Pattern Recognition and Machine Learning

(Jeff_L) #1
516 10. APPROXIMATE INFERENCE

Section 8.4.4 These are precisely the messages obtained using belief propagation in which mes-
sages from variable nodes to factor nodes have been folded into the messages from
factor nodes to variable nodes. In particular, ̃fb 2 (x 2 )corresponds to the message
μfb→x 2 (x 2 )sent by factor nodefbto variable nodex 2 and is given by (8.81). Simi-
larly, if we substitute (8.78) into (8.79), we obtain (10.235) in which ̃fa 2 (x 2 )corre-
sponds toμfa→x 2 (x 2 )and ̃fc 2 (x 2 )corresponds toμfc→x 2 (x 2 ), giving the message
̃fb 3 (x 3 )which corresponds toμfb→x 3 (x 3 ).
This result differs slightly from standard belief propagation in that messages are
passed in both directions at the same time. We can easily modify the EP procedure
to give the standard form of the sum-product algorithm by updating just one of the
factors at a time, for instance if we refine only ̃fb 3 (x 3 ), then ̃fb 2 (x 2 )is unchanged
by definition, while the refined version of ̃fb 3 (x 3 )is again given by (10.235). If
we are refining only one term at a time, then we can choose the order in which the
refinements are done as we wish. In particular, for a tree-structured graph we can
follow a two-pass update scheme, corresponding to the standard belief propagation
schedule, which will result in exact inference of the variable and factor marginals.
The initialization of the approximation factors in this case is unimportant.
Now let us consider a general factor graph corresponding to the distribution


p(θ)=


i

fi(θi) (10.236)

whereθirepresents the subset of variables associated with factorfi. We approximate
this using a fully factorized distribution of the form

q(θ)∝


i


k

̃fik(θk) (10.237)

whereθkcorresponds to an individual variable node. Suppose that we wish to refine
the particular term ̃fjl(θl)keeping all other terms fixed. We first remove the term
̃fj(θj)fromq(θ)to give

q\j(θ)∝


i =j


k

̃fik(θk) (10.238)

and then multiply by the exact factorfj(θj). To determine the refined term ̃fjl(θl),
we need only consider the functional dependence onθl, and so we simply find the
corresponding marginal of
q\j(θ)fj(θj). (10.239)
Up to a multiplicative constant, this involves taking the marginal offj(θj)multiplied
by any terms fromq\j(θ)that are functions of any of the variables inθj. Terms that
correspond to other factors ̃fi(θi)fori = jwill cancel between numerator and
denominator when we subsequently divide byq\j(θ). We therefore obtain

f ̃jl(θl)∝


θm=l∈θj

fj(θj)


k


m =l

̃fkm(θm). (10.240)
Free download pdf