Pattern Recognition and Machine Learning

578 12. CONTINUOUS LATENT VARIABLES

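are assumed independent, the complete-data log likelihood function takes the form

$$
\ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\mu}, \mathbf{W}, \sigma^2) = \sum_{n=1}^{N} \left\{ \ln p(\mathbf{x}_n \mid \mathbf{z}_n) + \ln p(\mathbf{z}_n) \right\} \tag{12.52}
$$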

where the $n$th row of the matrix $\mathbf{Z}$ is given by $\mathbf{z}_n$. We already know that the exact maximum likelihood solution for $\boldsymbol{\mu}$ is given by the sample mean $\bar{\mathbf{x}}$ defined by (12.1), and it is convenient to substitute for $\boldsymbol{\mu}$ at this stage. Making use of the expressions (12.31) and (12.32) for the latent and conditional distributions, respectively, and taking the expectation with respect to the posterior distribution over the latent variables, we obtain

$$
\mathbb{E}[\ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\mu}, \mathbf{W}, \sigma^2)] = -\sum_{n=1}^{N} \left\{ \frac{D}{2} \ln(2\pi\sigma^2) + \frac{M}{2} \ln(2\pi) + \frac{1}{2} \mathrm{Tr}\left(\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}]\right) + \frac{1}{2\sigma^2} \|\mathbf{x}_n - \boldsymbol{\mu}\|^2 - \frac{1}{\sigma^2} \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \mathbf{W}^{\mathrm{T}} (\mathbf{x}_n - \boldsymbol{\mu}) + \frac{1}{2\sigma^2} \mathrm{Tr}\left(\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] \mathbf{W}^{\mathrm{T}} \mathbf{W}\right) \right\}. \tag{12.53}
$$

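Note that this depends on the posterior distribution only through the sufficient statistics of the Gaussian. Thus in the E step, we use the old parameter values to evaluate

$$
\mathbb{E}[\mathbf{z}_n] = \mathbf{M}^{-1} \mathbf{W}^{\mathrm{T}} (\mathbf{x}_n - \bar{\mathbf{x}}) \tag{12.54}
$$

$$
\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] = \sigma^2 \mathbf{M}^{-1} + \mathbb{E}[\mathbf{z}_n] \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \tag{12.55}
$$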

Exercise 12.15


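which follow directly from the posterior distribution (12.42) together with the standard result $\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] = \mathrm{cov}[\mathbf{z}_n] + \mathbb{E}[\mathbf{z}_n] \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}}$. Here $\mathbf{M}$ is defined by (12.41).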
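
The E-step evaluations (12.54) and (12.55) can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `e_step` and the array layout (data points stored as rows) are our own choices, and we assume that (12.41), which is not reproduced on this page, has the form $\mathbf{M} = \mathbf{W}^{\mathrm{T}}\mathbf{W} + \sigma^2 \mathbf{I}$.

```python
import numpy as np

def e_step(X, W, sigma2):
    """Posterior sufficient statistics for probabilistic PCA.

    X: (N, D) data matrix, W: (D, M) matrix, sigma2: scalar noise variance.
    Returns Ez with shape (N, M) and Ezz with shape (N, M, M).
    """
    n_latent = W.shape[1]
    Xc = X - X.mean(axis=0)                       # rows are x_n - x_bar
    # Assumed form of (12.41): M = W^T W + sigma^2 I
    M = W.T @ W + sigma2 * np.eye(n_latent)
    Minv = np.linalg.inv(M)
    # (12.54) in row form: E[z_n]^T = (x_n - x_bar)^T W M^{-1}, since M is symmetric
    Ez = Xc @ W @ Minv
    # (12.55): sigma^2 M^{-1} + E[z_n] E[z_n]^T, one outer product per data point
    Ezz = sigma2 * Minv + Ez[:, :, None] * Ez[:, None, :]
    return Ez, Ezz
```

Note that $\mathbf{M}$ is only $M \times M$, so the inverse is cheap whenever the latent dimensionality is small compared with $D$.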

In the M step, we maximize with respect to $\mathbf{W}$ and $\sigma^2$, keeping the posterior statistics fixed. Maximization with respect to $\sigma^2$ is straightforward. For the maximization with respect to $\mathbf{W}$ we make use of (C.24), and obtain the M-step equations

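$$
\mathbf{W}_{\mathrm{new}} = \left[ \sum_{n=1}^{N} (\mathbf{x}_n - \bar{\mathbf{x}}) \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \right] \left[ \sum_{n=1}^{N} \mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] \right]^{-1} \tag{12.56}
$$

$$
\sigma^2_{\mathrm{new}} = \frac{1}{ND} \sum_{n=1}^{N} \left\{ \|\mathbf{x}_n - \bar{\mathbf{x}}\|^2 - 2 \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \mathbf{W}_{\mathrm{new}}^{\mathrm{T}} (\mathbf{x}_n - \bar{\mathbf{x}}) + \mathrm{Tr}\left( \mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] \mathbf{W}_{\mathrm{new}}^{\mathrm{T}} \mathbf{W}_{\mathrm{new}} \right) \right\}. \tag{12.57}
$$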
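
As an illustration, the M-step updates (12.56) and (12.57) might be coded as below. The function name `m_step` and its arguments are hypothetical; `Ez` (shape N by M) and `Ezz` (shape N by M by M) are assumed to hold the posterior statistics $\mathbb{E}[\mathbf{z}_n]$ and $\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}]$ from the E step, one entry per data point.

```python
import numpy as np

def m_step(X, Ez, Ezz):
    """Re-estimate W and sigma^2 with the posterior statistics held fixed."""
    N, D = X.shape
    Xc = X - X.mean(axis=0)                 # rows are x_n - x_bar
    S_xz = Xc.T @ Ez                        # sum_n (x_n - x_bar) E[z_n]^T, shape (D, M)
    S_zz = Ezz.sum(axis=0)                  # sum_n E[z_n z_n^T], shape (M, M)
    # (12.56): W_new = S_xz S_zz^{-1}; solve a linear system instead of inverting
    # (valid because S_zz is symmetric)
    W_new = np.linalg.solve(S_zz, S_xz.T).T
    # (12.57), summed over n term by term
    sq_norm = np.sum(Xc ** 2)                       # sum_n ||x_n - x_bar||^2
    cross = 2.0 * np.sum((Ez @ W_new.T) * Xc)       # 2 sum_n E[z_n]^T W_new^T (x_n - x_bar)
    trace = np.sum(S_zz * (W_new.T @ W_new))        # Tr(S_zz W_new^T W_new), both symmetric
    sigma2_new = (sq_norm - cross + trace) / (N * D)
    return W_new, sigma2_new
```

Because the trace is linear, the sum over $n$ in the last term of (12.57) can be taken inside, so only the accumulated matrix `S_zz` is needed.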

The EM algorithm for probabilistic PCA proceeds by initializing the parameters and then alternately computing the sufficient statistics of the latent space posterior distribution using (12.54) and (12.55) in the E step and revising the parameter values using (12.56) and (12.57) in the M step.
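
The alternation just described can be sketched as a short self-contained loop. Several details here are assumptions of ours rather than prescriptions of the text: the name `ppca_em`, the random initialization of $\mathbf{W}$ with $\sigma^2 = 1$, the fixed iteration count in place of a convergence test, and the form $\mathbf{M} = \mathbf{W}^{\mathrm{T}}\mathbf{W} + \sigma^2\mathbf{I}$ taken for (12.41).

```python
import numpy as np

def ppca_em(X, n_latent, n_iter=100, seed=0):
    """EM for probabilistic PCA; returns (x_bar, W, sigma2)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    x_bar = X.mean(axis=0)                  # exact ML solution for mu
    Xc = X - x_bar
    # Assumed initialization: random W and unit noise variance
    W = rng.normal(size=(D, n_latent))
    sigma2 = 1.0
    for _ in range(n_iter):
        # E step with the old parameters, (12.54) and (12.55)
        Minv = np.linalg.inv(W.T @ W + sigma2 * np.eye(n_latent))
        Ez = Xc @ W @ Minv                          # rows are E[z_n]^T
        S_zz = N * sigma2 * Minv + Ez.T @ Ez        # sum_n E[z_n z_n^T]
        # M step, (12.56) then (12.57) using the new W
        W = np.linalg.solve(S_zz, Ez.T @ Xc).T
        sigma2 = (np.sum(Xc ** 2)
                  - 2.0 * np.sum((Ez @ W.T) * Xc)
                  + np.sum(S_zz * (W.T @ W))) / (N * D)
    return x_bar, W, sigma2
```

A call such as `ppca_em(X, n_latent=2)` on an array `X` of shape (N, D) returns the sample mean together with the fitted $\mathbf{W}$ and $\sigma^2$. In practice one would usually monitor the log likelihood or the change in parameters instead of running a fixed number of iterations.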
One of the benefits of the EM algorithm for PCA is computational efficiency for large-scale applications (Roweis, 1998). Unlike conventional PCA based on an