Pattern Recognition and Machine Learning

Exercises

using modified forms of (13.18) and (13.19) given by

$$
\pi_k = \frac{\sum_{r=1}^{R} \gamma\bigl(z_{1k}^{(r)}\bigr)}{\sum_{r=1}^{R} \sum_{j=1}^{K} \gamma\bigl(z_{1j}^{(r)}\bigr)}
\tag{13.124}
$$

$$
A_{jk} = \frac{\sum_{r=1}^{R} \sum_{n=2}^{N} \xi\bigl(z_{n-1,j}^{(r)},\, z_{n,k}^{(r)}\bigr)}{\sum_{r=1}^{R} \sum_{l=1}^{K} \sum_{n=2}^{N} \xi\bigl(z_{n-1,j}^{(r)},\, z_{n,l}^{(r)}\bigr)}
\tag{13.125}
$$

where, for notational convenience, we have assumed that the sequences are of the
same length (the generalization to sequences of different lengths is straightforward).
Similarly, show that the M-step equation for re-estimation of the means of Gaussian
emission models is given by

$$
\boldsymbol{\mu}_k = \frac{\sum_{r=1}^{R} \sum_{n=1}^{N} \gamma\bigl(z_{nk}^{(r)}\bigr)\, \mathbf{x}_n^{(r)}}{\sum_{r=1}^{R} \sum_{n=1}^{N} \gamma\bigl(z_{nk}^{(r)}\bigr)}.
\tag{13.126}
$$

Note that the M-step equations for other emission model parameters and distributions
take an analogous form.
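
To make the pooling over sequences in (13.124)–(13.126) concrete, here is a minimal sketch of the corresponding M-step updates. It assumes the E-step quantities have already been computed for each of the R sequences; the function and variable names (and the use of NumPy) are illustrative assumptions, not part of the text.

```python
import numpy as np

def m_step_multiple_sequences(gammas, xis, xs):
    """Sketch of the M-step updates (13.124)-(13.126) pooled over R sequences.

    gammas : list of R arrays, gammas[r][n, k] = gamma(z_{nk}^{(r)})      shape (N, K)
    xis    : list of R arrays, xis[r][m, j, k] = xi(z_{n-1,j}, z_{n,k})   shape (N-1, K, K)
    xs     : list of R arrays, xs[r][n] = x_n^{(r)}                       shape (N, D)
    """
    # (13.124): initial-state distribution from the first-step responsibilities
    first = sum(g[0] for g in gammas)            # sum_r gamma(z_{1k}^{(r)}), shape (K,)
    pi = first / first.sum()

    # (13.125): transition matrix, summing xi over sequences and time steps
    num = sum(x.sum(axis=0) for x in xis)        # shape (K, K), indexed [j, k]
    A = num / num.sum(axis=1, keepdims=True)     # normalize over k for each j

    # (13.126): Gaussian means as responsibility-weighted averages of the data
    weighted = sum(g.T @ x for g, x in zip(gammas, xs))   # shape (K, D)
    counts = sum(g.sum(axis=0) for g in gammas)           # shape (K,)
    mu = weighted / counts[:, None]
    return pi, A, mu
```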

13.13 ( ) www Use the definition (8.64) of the messages passed from a factor node
to a variable node in a factor graph, together with the expression (13.6) for the joint
distribution in a hidden Markov model, to show that the definition (13.50) of the
alpha message is the same as the definition (13.34).


13.14 ( ) Use the definition (8.67) of the messages passed from a factor node to a
variable node in a factor graph, together with the expression (13.6) for the joint
distribution in a hidden Markov model, to show that the definition (13.52) of the
beta message is the same as the definition (13.35).
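
Exercises 13.13 and 13.14 concern showing that the sum-product messages on the HMM factor graph reproduce the usual alpha and beta quantities. As a numerical companion only (not part of either exercise), the following sketch computes the unscaled alpha and beta values by the standard forward and backward recursions for a discrete-emission HMM; the parameter names and layout are assumptions.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Minimal sketch of the unscaled alpha/beta recursions for an HMM.

    pi  : (K,)   initial state distribution
    A   : (K, K) transition matrix, A[j, k] = p(z_n = k | z_{n-1} = j)
    B   : (K, M) emission matrix,   B[k, m] = p(x_n = m | z_n = k)
    obs : length-N sequence of observed symbols in {0, ..., M-1}

    Note: the unscaled recursions underflow for long sequences; the rescaled
    variables of Section 13.2.4 address this.
    """
    N, K = len(obs), len(pi)
    alpha = np.zeros((N, K))
    beta = np.zeros((N, K))

    # alpha(z_n) = p(x_1, ..., x_n, z_n): forward recursion
    alpha[0] = pi * B[:, obs[0]]
    for n in range(1, N):
        alpha[n] = B[:, obs[n]] * (alpha[n - 1] @ A)

    # beta(z_n) = p(x_{n+1}, ..., x_N | z_n): backward recursion
    beta[-1] = 1.0
    for n in range(N - 2, -1, -1):
        beta[n] = A @ (B[:, obs[n + 1]] * beta[n + 1])

    return alpha, beta
```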


13.15 ( ) Use the expressions (13.33) and (13.43) for the marginals in a hidden Markov
model to derive the corresponding results (13.64) and (13.65) expressed in terms of
re-scaled variables.


13.16 ( ) In this exercise, we derive the forward message passing equation for the
Viterbi algorithm directly from the expression (13.6) for the joint distribution. This
involves maximizing over all of the hidden variables $\mathbf{z}_1, \ldots, \mathbf{z}_N$. By taking the logarithm and then exchanging maximizations and summations, derive the recursion
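
As a hedged illustration of the log-space forward recursion that this exercise works toward (a sketch under the same assumed HMM parameterization as above, not the book's derivation), the Viterbi forward pass can be written as follows:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Sketch of the Viterbi forward recursion in log space, with backtracking.

    omega[n, k] = max over z_1..z_{n-1} of ln p(x_1, ..., x_n, z_1, ..., z_{n-1}, z_n = k)
    """
    N, K = len(obs), len(pi)
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

    omega = np.zeros((N, K))
    backptr = np.zeros((N, K), dtype=int)

    omega[0] = log_pi + log_B[:, obs[0]]
    for n in range(1, N):
        # omega(z_n) = ln p(x_n|z_n) + max_{z_{n-1}} { ln p(z_n|z_{n-1}) + omega(z_{n-1}) }
        scores = omega[n - 1][:, None] + log_A      # (K, K): previous state j -> current state k
        backptr[n] = scores.argmax(axis=0)
        omega[n] = log_B[:, obs[n]] + scores.max(axis=0)

    # backtrack the most probable state sequence
    path = [int(omega[-1].argmax())]
    for n in range(N - 1, 0, -1):
        path.append(int(backptr[n, path[-1]]))
    return path[::-1]
```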
