Figure 13.12  Illustration of the forward recursion (13.36) for evaluation of the α variables. In this fragment of the lattice, we see that the quantity α(z_{n,1}) is obtained by taking the elements α(z_{n−1,j}) of α(z_{n−1}) at step n−1 and summing them up with weights given by A_{j1}, corresponding to the values of p(z_n | z_{n−1}), and then multiplying by the data contribution p(x_n | z_{n,1}).
[Lattice fragment: rows k = 1, 2, 3; columns n−1 and n; the edges into the node (n, k = 1) are labelled A_{11}, A_{21}, A_{31}, and that node carries the emission factor p(x_n | z_{n,1}).]

It is worth taking a moment to study this recursion relation in some detail. Note that there are K terms in the summation, and the right-hand side has to be evaluated for each of the K values of z_n, so each step of the α recursion has a computational cost that scales like O(K^2). The forward recursion equation for α(z_n) is illustrated using a lattice diagram in Figure 13.12.
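For reference, the recursion (13.36) referred to here is stated earlier in the section; in the notation of this fragment it reads

\[
\alpha(\mathbf{z}_n) = p(\mathbf{x}_n \mid \mathbf{z}_n) \sum_{\mathbf{z}_{n-1}} \alpha(\mathbf{z}_{n-1})\, p(\mathbf{z}_n \mid \mathbf{z}_{n-1})
\]

so that, in components, α(z_{n,k}) equals p(x_n | φ_k) times the weighted sum over j of α(z_{n−1,j}) A_{jk}, which is exactly the pattern drawn in Figure 13.12 for the case k = 1.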
In order to start this recursion, we need an initial condition that is given by

\[
\alpha(\mathbf{z}_1) = p(\mathbf{x}_1, \mathbf{z}_1) = p(\mathbf{z}_1)\, p(\mathbf{x}_1 \mid \mathbf{z}_1) = \prod_{k=1}^{K} \left\{ \pi_k\, p(\mathbf{x}_1 \mid \boldsymbol{\phi}_k) \right\}^{z_{1k}}
\tag{13.37}
\]

which tells us that α(z_{1k}), for k = 1, ..., K, takes the value π_k p(x_1 | φ_k). Starting at the first node of the chain, we can then work along the chain and evaluate α(z_n) for every latent node. Because each step of the recursion involves multiplying by a K × K matrix, the overall cost of evaluating these quantities for the whole chain is O(K^2 N).
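As a concrete illustration, here is a minimal sketch of this forward pass in Python/NumPy. The names pi, A, and emissions are assumptions introduced for the example rather than anything defined in the text, and a practical implementation would also rescale the α values at each step to avoid numerical underflow, an issue taken up later in the chapter.

```python
import numpy as np

def forward_alpha(pi, A, emissions):
    """Naive forward (alpha) recursion for a K-state hidden Markov model.

    pi        : (K,)   initial state probabilities pi_k
    A         : (K, K) transition matrix with A[j, k] = p(z_n = k | z_{n-1} = j)
    emissions : (N, K) array whose row n-1 holds p(x_n | phi_k) for k = 1, ..., K

    Returns an (N, K) array whose row n-1 holds alpha(z_{n,k}) = p(x_1, ..., x_n, z_n = k).
    """
    N, K = emissions.shape
    alpha = np.zeros((N, K))
    # Initial condition (13.37): alpha(z_{1k}) = pi_k p(x_1 | phi_k)
    alpha[0] = pi * emissions[0]
    # Each step is a K x K matrix-vector product followed by an elementwise
    # multiplication, so the cost is O(K^2) per step and O(K^2 N) overall.
    for n in range(1, N):
        alpha[n] = emissions[n] * (alpha[n - 1] @ A)
    return alpha
```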
We can similarly find a recursion relation for the quantities β(z_n) by making use of the conditional independence properties (13.27) and (13.28), giving

\begin{align*}
\beta(\mathbf{z}_n) &= p(\mathbf{x}_{n+1},\ldots,\mathbf{x}_N \mid \mathbf{z}_n) \\
&= \sum_{\mathbf{z}_{n+1}} p(\mathbf{x}_{n+1},\ldots,\mathbf{x}_N, \mathbf{z}_{n+1} \mid \mathbf{z}_n) \\
&= \sum_{\mathbf{z}_{n+1}} p(\mathbf{x}_{n+1},\ldots,\mathbf{x}_N \mid \mathbf{z}_n, \mathbf{z}_{n+1})\, p(\mathbf{z}_{n+1} \mid \mathbf{z}_n) \\
&= \sum_{\mathbf{z}_{n+1}} p(\mathbf{x}_{n+1},\ldots,\mathbf{x}_N \mid \mathbf{z}_{n+1})\, p(\mathbf{z}_{n+1} \mid \mathbf{z}_n) \\
&= \sum_{\mathbf{z}_{n+1}} p(\mathbf{x}_{n+2},\ldots,\mathbf{x}_N \mid \mathbf{z}_{n+1})\, p(\mathbf{x}_{n+1} \mid \mathbf{z}_{n+1})\, p(\mathbf{z}_{n+1} \mid \mathbf{z}_n).
\end{align*}
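The corresponding backward pass can be sketched in the same assumed notation as the forward-pass code above. The starting value β(z_N) = 1 for every state, which lies just beyond this fragment, is the standard initialisation obtained from the definition of β.

```python
def backward_beta(A, emissions):
    """Naive backward (beta) recursion, using the relation derived above:
    beta(z_n) = sum_{z_{n+1}} beta(z_{n+1}) p(x_{n+1} | z_{n+1}) p(z_{n+1} | z_n).

    A         : (K, K) transition matrix, as in forward_alpha
    emissions : (N, K) array whose row n-1 holds p(x_n | phi_k)

    Returns an (N, K) array whose row n-1 holds beta(z_{n,k}) = p(x_{n+1}, ..., x_N | z_n = k).
    """
    N, K = emissions.shape
    beta = np.ones((N, K))          # beta(z_N) = 1 for every state starts the recursion
    for n in range(N - 2, -1, -1):  # sweep backwards along the chain
        beta[n] = A @ (emissions[n + 1] * beta[n + 1])
    return beta
```

Elementwise products α(z_n)β(z_n), suitably normalised, then give the posterior marginals γ(z_n) introduced earlier in the section.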