Pattern Recognition and Machine Learning

414 8. GRAPHICAL MODELS

Figure 8.53 A lattice, or trellis, diagram show-
ing explicitly theKpossible states (one per row
of the diagram) for each of the variablesxnin the
chain model. In this illustrationK=3. The ar-
row shows the direction of message passing in the
max-product algorithm. For every statekof each
variablexn(corresponding to columnnof the dia-
gram) the functionφ(xn)defines a unique state at
the previous variable, indicated by the black lines.
The two paths through the lattice correspond to
configurations that give the global maximum of the
joint probability distribution, and either of these
can be found by tracing back along the black lines
in the opposite direction to the arrow.

k=1

k=2

k=3 n− 2 n− 1 nn+1

corresponding to the graph shown in Figure 8.38. Suppose we take nodexNto be the root node. Then in the first phase, we propagate messages from the leaf nodex 1 to the root node using

μxn→fn,n+1(xn)=μfn− 1 ,n→xn(xn) μfn− 1 ,n→xn(xn) = max xn− 1

[ lnfn− 1 ,n(xn− 1 ,xn)+μxn− 1 →fn− 1 ,n(xn)

]

which follow from applying (8.94) and (8.93) to this particular graph. The initial message sent from the leaf node is simply

μx 1 →f 1 , 2 (x 1 )=0. (8.99)

The most probable value forxNis then given by

xmaxN =argmax xN

[ μfN− 1 ,N→xN(xN)

]

. (8.100)

Now we need to determine the states of the previous variables that correspond to the same maximizing configuration. This can be done by keeping track of which values of the variables gave rise to the maximum state of each variable, in other words by storing quantities given by

φ(xn) = arg max xn− 1

[ lnfn− 1 ,n(xn− 1 ,xn)+μxn− 1 →fn− 1 ,n(xn)

]

. (8.101)

To understand better what is happening, it is helpful to represent the chain of variables in terms of alatticeortrellisdiagram as shown in Figure 8.53. Note that this is not a probabilistic graphical model because the nodes represent individual states of variables, while each variable corresponds to a column of such states in the diagram. For each state of a given variable, there is a unique state of the previous variable that maximizes the probability (ties are broken either systematically or at random), corresponding to the functionφ(xn)given by (8.101), and this is indicated

Pattern Recognition and Machine Learning

414 8. GRAPHICAL MODELS

Get our desktop app

Company

Features

Documentation

Resources