Pattern Recognition and Machine Learning

578 12. CONTINUOUS LATENT VARIABLES

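are assumed independent, the complete-data log likelihood function takes the form

$$\ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\mu}, \mathbf{W}, \sigma^2) = \sum_{n=1}^{N} \left\{ \ln p(\mathbf{x}_n \mid \mathbf{z}_n) + \ln p(\mathbf{z}_n) \right\} \tag{12.52}$$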

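where the $n$th row of the matrix $\mathbf{Z}$ is given by $\mathbf{z}_n$. We already know that the exact maximum likelihood solution for $\boldsymbol{\mu}$ is given by the sample mean $\bar{\mathbf{x}}$ defined by (12.1), and it is convenient to substitute for $\boldsymbol{\mu}$ at this stage. Making use of the expressions (12.31) and (12.32) for the latent and conditional distributions, respectively, and taking the expectation with respect to the posterior distribution over the latent variables, we obtain

$$\mathbb{E}[\ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\mu}, \mathbf{W}, \sigma^2)] = -\sum_{n=1}^{N} \left\{ \frac{D}{2} \ln(2\pi\sigma^2) + \frac{M}{2} \ln(2\pi) + \frac{1}{2} \operatorname{Tr}\left(\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}]\right) + \frac{1}{2\sigma^2} \|\mathbf{x}_n - \boldsymbol{\mu}\|^2 - \frac{1}{\sigma^2} \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \mathbf{W}^{\mathrm{T}} (\mathbf{x}_n - \boldsymbol{\mu}) + \frac{1}{2\sigma^2} \operatorname{Tr}\left(\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] \mathbf{W}^{\mathrm{T}} \mathbf{W}\right) \right\} \tag{12.53}$$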

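Note that this depends on the posterior distribution only through the sufficient statistics of the Gaussian. Thus in the E step, we use the old parameter values to evaluate

$$\mathbb{E}[\mathbf{z}_n] = \mathbf{M}^{-1} \mathbf{W}^{\mathrm{T}} (\mathbf{x}_n - \bar{\mathbf{x}}) \tag{12.54}$$

$$\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] = \sigma^2 \mathbf{M}^{-1} + \mathbb{E}[\mathbf{z}_n] \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \tag{12.55}$$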

Exercise 12.15

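which follow directly from the posterior distribution (12.42) together with the standard result $\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] = \operatorname{cov}[\mathbf{z}_n] + \mathbb{E}[\mathbf{z}_n] \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}}$. Here $\mathbf{M}$ is defined by (12.41).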
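The E-step quantities (12.54) and (12.55) can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `e_step` and its argument layout are hypothetical, and the code assumes $\mathbf{M} = \mathbf{W}^{\mathrm{T}}\mathbf{W} + \sigma^2 \mathbf{I}$, the usual form of (12.41), which lies outside this passage.

```python
import numpy as np

def e_step(X, W, sigma2):
    """Posterior statistics (12.54) and (12.55) for every data point.

    X: (N, D) data matrix, W: (D, M) matrix, sigma2: noise variance.
    Returns Ez with shape (N, M) and Ezz with shape (N, M, M).
    """
    latent_dim = W.shape[1]
    x_bar = X.mean(axis=0)                      # sample mean, substituted for mu
    # Assumption: M = W^T W + sigma^2 I (the usual form of (12.41)).
    M = W.T @ W + sigma2 * np.eye(latent_dim)
    M_inv = np.linalg.inv(M)
    Xc = X - x_bar
    # Row n is E[z_n]^T = (x_n - x_bar)^T W M^{-1}; M^{-1} is symmetric.
    Ez = Xc @ W @ M_inv
    # E[z_n z_n^T] = sigma^2 M^{-1} + E[z_n] E[z_n]^T, one outer product per row.
    Ezz = sigma2 * M_inv + Ez[:, :, None] * Ez[:, None, :]
    return Ez, Ezz
```

Only an $M \times M$ matrix is inverted here, so the cost of the inverse depends on the latent dimensionality rather than on $D$.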

In the M step, we maximize with respect to $\mathbf{W}$ and $\sigma^2$, keeping the posterior statistics fixed. Maximization with respect to $\sigma^2$ is straightforward. For the maximization with respect to $\mathbf{W}$ we make use of (C.24), and obtain the M-step equations

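$$\mathbf{W}_{\text{new}} = \left[ \sum_{n=1}^{N} (\mathbf{x}_n - \bar{\mathbf{x}}) \, \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \right] \left[ \sum_{n=1}^{N} \mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] \right]^{-1} \tag{12.56}$$

$$\sigma^2_{\text{new}} = \frac{1}{ND} \sum_{n=1}^{N} \left\{ \|\mathbf{x}_n - \bar{\mathbf{x}}\|^2 - 2\, \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \mathbf{W}_{\text{new}}^{\mathrm{T}} (\mathbf{x}_n - \bar{\mathbf{x}}) + \operatorname{Tr}\left( \mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] \mathbf{W}_{\text{new}}^{\mathrm{T}} \mathbf{W}_{\text{new}} \right) \right\} \tag{12.57}$$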
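As a rough sketch of how the updates (12.54) to (12.57) fit together, the loop below alternates the two steps in NumPy. The function name `ppca_em`, the random initialization, the starting value $\sigma^2 = 1$ and the fixed iteration count are all illustrative assumptions rather than anything prescribed by the text, and $\mathbf{M}$ is again assumed to be $\mathbf{W}^{\mathrm{T}}\mathbf{W} + \sigma^2 \mathbf{I}$.

```python
import numpy as np

def ppca_em(X, latent_dim, n_iter=200, seed=0):
    """EM for probabilistic PCA; returns (W, sigma2) after n_iter sweeps."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Xc = X - X.mean(axis=0)                     # mu is fixed at the sample mean
    W = rng.normal(size=(D, latent_dim))        # assumed random initialization
    sigma2 = 1.0                                # assumed starting noise variance
    for _ in range(n_iter):
        # E step, (12.54) and (12.55), with M = W^T W + sigma^2 I assumed.
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(latent_dim))
        Ez = Xc @ W @ M_inv                     # row n holds E[z_n]^T
        sum_Ezz = N * sigma2 * M_inv + Ez.T @ Ez  # sum over n of E[z_n z_n^T]
        # M step for W, (12.56).
        W_new = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)
        # M step for sigma^2, (12.57), using the updated W.
        sigma2 = (np.sum(Xc ** 2)
                  - 2.0 * np.sum(Ez * (Xc @ W_new))
                  + np.trace(sum_Ezz @ W_new.T @ W_new)) / (N * D)
        W = W_new
    return W, sigma2
```

The sums over $n$ are accumulated as matrix products, so the $D \times D$ sample covariance matrix is never formed explicitly. A practical implementation would normally replace the fixed iteration count with a convergence test on the parameters or the log likelihood.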

The EM algorithm for probabilistic PCA proceeds by initializing the parameters and then alternately computing the sufficient statistics of the latent space posterior distribution using (12.54) and (12.55) in the E step and revising the parameter values using (12.56) and (12.57) in the M step.

One of the benefits of the EM algorithm for PCA is computational efficiency for large-scale applications (Roweis, 1998). Unlike conventional PCA based on an
