
3. M step. Re-estimate the parameters using the current responsibilities

   \mu_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk})\, x_n \qquad (9.24)

   \Sigma_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \left( x_n - \mu_k^{\text{new}} \right) \left( x_n - \mu_k^{\text{new}} \right)^{\mathrm{T}} \qquad (9.25)

   \pi_k^{\text{new}} = \frac{N_k}{N} \qquad (9.26)

where

   N_k = \sum_{n=1}^{N} \gamma(z_{nk}). \qquad (9.27)


4. Evaluate the log likelihood

   \ln p(X \mid \mu, \Sigma, \pi) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\} \qquad (9.28)

and check for convergence of either the parameters or the log likelihood. If
the convergence criterion is not satisfied return to step 2. (A code sketch of
the M step and this likelihood evaluation follows the list.)
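As a concrete illustration of steps 3 and 4, here is a minimal NumPy sketch of the M-step updates (9.24)-(9.27) and the log likelihood (9.28). It is not from the book: the function names m_step and log_likelihood, and the use of scipy.stats.multivariate_normal for the Gaussian densities, are our own illustrative choices.

# A minimal sketch (not from the book): one M step, eqs (9.24)-(9.27),
# followed by the log likelihood (9.28). Names are illustrative.
import numpy as np
from scipy.stats import multivariate_normal

def m_step(X, gamma):
    """Re-estimate (mu, Sigma, pi) from the responsibilities gamma(z_nk).

    X     : (N, D) array whose rows are the data points x_n
    gamma : (N, K) array of responsibilities from the E step
    """
    N, D = X.shape
    K = gamma.shape[1]
    Nk = gamma.sum(axis=0)                    # (9.27)  N_k = sum_n gamma(z_nk)
    mu = (gamma.T @ X) / Nk[:, None]          # (9.24)  responsibility-weighted means
    Sigma = np.empty((K, D, D))
    for k in range(K):
        Xc = X - mu[k]                        # centre on the *new* mean mu_k
        Sigma[k] = (gamma[:, k, None] * Xc).T @ Xc / Nk[k]   # (9.25)
    pi = Nk / N                               # (9.26)  mixing coefficients
    return mu, Sigma, pi

def log_likelihood(X, mu, Sigma, pi):
    """Evaluate (9.28): sum_n ln { sum_k pi_k N(x_n | mu_k, Sigma_k) }."""
    dens = np.column_stack([multivariate_normal.pdf(X, mean=mu[k], cov=Sigma[k])
                            for k in range(len(pi))])   # (N, K) Gaussian densities
    return np.sum(np.log(dens @ pi))

In practice one would evaluate (9.28) via a log-sum-exp of the component log densities for numerical stability; the direct form above simply mirrors the equation as written.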

9.3 An Alternative View of EM


In this section, we present a complementary view of the EM algorithm that recognizes the key role played by latent variables. We discuss this approach first of all in an abstract setting, and then for illustration we consider once again the case of Gaussian mixtures.

The goal of the EM algorithm is to find maximum likelihood solutions for models having latent variables. We denote the set of all observed data by X, in which the nth row represents x_n^T, and similarly we denote the set of all latent variables by Z, with a corresponding row z_n^T. The set of all model parameters is denoted by θ, and so the log likelihood function is given by

   \ln p(X \mid \theta) = \ln \left\{ \sum_{Z} p(X, Z \mid \theta) \right\}. \qquad (9.29)
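To see how (9.29) connects to what came before, it is worth spelling out (a step we add here for illustration, anticipating the Gaussian mixture case mentioned above) how it specializes to the mixture model: each z_n is a 1-of-K binary vector, the joint distribution factorizes over data points, and the sum over Z decomposes into independent sums over each z_n,

   p(X, Z \mid \theta) = \prod_{n=1}^{N} \prod_{k=1}^{K} \big[ \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \big]^{z_{nk}}

   \sum_{Z} p(X, Z \mid \theta) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)

so that taking the logarithm of the right-hand side recovers exactly (9.28).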


Note that our discussion will apply equally well to continuous latent variables simply by replacing the sum over Z with an integral.
A key observation is that the summation over the latent variables appears inside the logarithm. Even if the joint distribution p(X, Z | θ) belongs to the exponential family, the marginal distribution p(X | θ) typically does not.