Finally, we note that there is an alternative formulation of the forward-backward algorithm (Jordan, 2007) in which the backward pass is defined by a recursion based on the quantities $\gamma(\mathbf{z}_n) = \widehat{\alpha}(\mathbf{z}_n)\widehat{\beta}(\mathbf{z}_n)$ instead of using $\widehat{\beta}(\mathbf{z}_n)$. This $\alpha$–$\gamma$ recursion requires that the forward pass be completed first so that all the quantities $\widehat{\alpha}(\mathbf{z}_n)$ are available for the backward pass, whereas the forward and backward passes of the $\alpha$–$\beta$ algorithm can be done independently. Although these two algorithms have comparable computational cost, the $\alpha$–$\beta$ version is the most commonly encountered one in the case of hidden Markov models, whereas for linear dynamical systems (Section 13.3) a recursion analogous to the $\alpha$–$\gamma$ form is more usual.


13.2.5 The Viterbi algorithm


In many applications of hidden Markov models, the latent variables have some
meaningful interpretation, and so it is often of interest to find the most probable
sequence of hidden states for a given observation sequence. For instance, in speech
recognition, we might wish to find the most probable phoneme sequence for a given
series of acoustic observations. Because the graph for the hidden Markov model is
a directed tree, this problem can be solved exactly using the max-sum algorithm.
We recall from our discussion in Section 8.4.5 that the problem of finding the most
probable sequence of latent states is not the same as that of finding the set of states
that are individually the most probable. The latter problem can be solved by first
running the forward-backward (sum-product) algorithm to find the latent variable
marginals $\gamma(\mathbf{z}_n)$ and then maximizing each of these individually (Duda et al., 2001).
However, the set of such states will not, in general, correspond to the most probable
sequence of states. In fact, this set of states might even represent a sequence having
zero probability, if it so happens that two successive states, which in isolation are
individually the most probable, are such that the transition matrix element connecting
them is zero.
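The following minimal numerical sketch makes this concrete. It tabulates the latent joint distribution of a two-step Markov chain (equivalently, an HMM whose emissions we ignore); the particular values of the initial distribution and transition matrix are hypothetical, chosen so that the per-step argmax of the marginals selects a transition of probability zero.

```python
import numpy as np

# Hypothetical two-step chain over states {0, 1, 2}; values chosen for illustration.
pi = np.array([0.4, 0.3, 0.3])        # initial distribution p(z1)
A = np.array([[0.0, 0.5, 0.5],        # transitions p(z2 | z1); note A[0, 0] = 0,
              [2/3, 1/3, 0.0],        # so the chain can never occupy state 0
              [2/3, 0.0, 1/3]])       # at two successive steps

joint = pi[:, None] * A               # p(z1, z2) = p(z1) p(z2 | z1)

z1_star = joint.sum(axis=1).argmax()  # argmax of marginal p(z1)  -> state 0
z2_star = joint.sum(axis=0).argmax()  # argmax of marginal p(z2)  -> state 0
print(joint[z1_star, z2_star])        # 0.0: the "individually best" states
                                      # form an impossible sequence (0, 0)

best = np.unravel_index(joint.argmax(), joint.shape)
print(best, joint[best])              # e.g. (0, 1) with probability 0.2
```

Here each marginal individually favours state 0, yet the transition $0 \to 0$ is forbidden, so the jointly most probable sequence can only be found by maximizing over whole paths.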
In practice, we are usually interested in finding the most probable sequence of states, and this can be solved efficiently using the max-sum algorithm, which in the context of hidden Markov models is known as the Viterbi algorithm (Viterbi, 1967).
Note that the max-sum algorithm works with log probabilities and so there is no
need to use re-scaled variables as was done with the forward-backward algorithm.
Figure 13.16 shows a fragment of the hidden Markov model expanded as a lattice
diagram. As we have already noted, the number of possible paths through the lattice
grows exponentially with the length of the chain. The Viterbi algorithm searches this
space of paths efficiently to find the most probable path with a computational cost
that grows only linearly with the length of the chain.
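To see the scale of this saving (with purely illustrative numbers): if there are $K$ latent states and the chain has length $N$, the lattice contains $K^N$ distinct paths, whereas the Viterbi recursion performs $O(K^2)$ operations per step, giving

$$\underbrace{K^N}_{\text{paths through the lattice}} \qquad \text{versus} \qquad \underbrace{O(NK^2)}_{\text{cost of the Viterbi recursion}},$$

so that for, say, $K = 10$ and $N = 100$, an exhaustive search over $10^{100}$ paths is replaced by on the order of $10^4$ operations.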
As with the sum-product algorithm, we first represent the hidden Markov model
as a factor graph, as shown in Figure 13.15. Again, we treat the variable node $\mathbf{z}_N$
as the root, and pass messages to the root starting with the leaf nodes. Using the
results (8.93) and (8.94), we see that the messages passed in the max-sum algorithm
are given by

$$\mu_{\mathbf{z}_n \to f_{n+1}}(\mathbf{z}_n) = \mu_{f_n \to \mathbf{z}_n}(\mathbf{z}_n) \tag{13.66}$$

$$\mu_{f_{n+1} \to \mathbf{z}_{n+1}}(\mathbf{z}_{n+1}) = \max_{\mathbf{z}_n} \left\{ \ln f_{n+1}(\mathbf{z}_n, \mathbf{z}_{n+1}) + \mu_{\mathbf{z}_n \to f_{n+1}}(\mathbf{z}_n) \right\}. \tag{13.67}$$
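The recursion (13.66)–(13.67) translates directly into code. Below is a minimal sketch in log space, assuming a discrete HMM parameterized by an initial distribution, a transition matrix, and an emission matrix; the names pi, A, B, x and the function viterbi itself are illustrative, not notation from the text.

```python
import numpy as np

def viterbi(pi, A, B, x):
    """Most probable latent sequence under a discrete HMM (max-sum in log space).

    All names are illustrative:
    pi : (K,)   initial distribution p(z_1)
    A  : (K, K) transitions, A[i, j] = p(z_{n+1} = j | z_n = i)
    B  : (K, M) emissions,   B[k, m] = p(x_n = m | z_n = k)
    x  : (N,)   integer-coded observation sequence
    """
    N, K = len(x), len(pi)
    with np.errstate(divide="ignore"):       # log 0 -> -inf, handled by the max
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

    omega = np.empty((N, K))                 # omega[n, k]: log prob of the best
    back = np.empty((N, K), dtype=int)       # path ending in state k at step n
    omega[0] = log_pi + log_B[:, x[0]]

    for n in range(1, N):
        # scores[i, j] = omega[n-1, i] + ln p(z_n = j | z_{n-1} = i);
        # maximizing over i realizes the max over z_n in (13.67)
        scores = omega[n - 1][:, None] + log_A
        back[n] = scores.argmax(axis=0)      # best predecessor of each state j
        omega[n] = scores.max(axis=0) + log_B[:, x[n]]

    # backtrack from the most probable final state
    path = [int(omega[-1].argmax())]
    for n in range(N - 1, 0, -1):
        path.append(int(back[n, path[-1]]))
    return path[::-1], float(omega[-1].max())
```

Because everything is computed with log probabilities, no rescaling of the kind needed for the forward-backward algorithm is required: zero-probability transitions simply become $-\infty$ and are never selected by the max, and the cost is $O(NK^2)$ as noted above.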