12.2. Probabilistic PCA

dimensionality. If we restrict the covariance matrix to be diagonal, then it has only D independent parameters, and so the number of parameters now grows linearly with dimensionality. However, it now treats the variables as if they were independent and hence can no longer express any correlations between them. Probabilistic PCA provides an elegant compromise in which the M most significant correlations can be captured while still ensuring that the total number of parameters grows only linearly with D. We can see this by evaluating the number of degrees of freedom in the PPCA model as follows. The covariance matrix C depends on the parameters W, which has size D × M, and σ², giving a total parameter count of DM + 1. However, we have seen that there is some redundancy in this parameterization associated with rotations of the coordinate system in the latent space. The orthogonal matrix R that expresses these rotations has size M × M. In the first column of this matrix there are M − 1 independent parameters, because the column vector must be normalized to unit length. In the second column there are M − 2 independent parameters, because the column must be normalized and must also be orthogonal to the previous column, and so on. Summing this arithmetic series, we see that R has a total of M(M − 1)/2 independent parameters. Thus the number of degrees of freedom in the covariance matrix C is given by

DM + 1 − M(M − 1)/2.        (12.51)
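
As a quick numerical illustration of (12.51), with arbitrarily chosen values D = 100 and M = 5 (our own example, not from the text), the probabilistic PCA covariance has

100 × 5 + 1 − 5 × 4/2 = 491

degrees of freedom, compared with D(D + 1)/2 = 5050 for a full covariance matrix and D = 100 for a diagonal one.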


The number of independent parameters in this model therefore grows only linearly with D, for fixed M. If we take M = D − 1, then we recover the standard result for a full covariance Gaussian (Exercise 12.14). In this case, the variance along D − 1 linearly independent directions is controlled by the columns of W, and the variance along the remaining direction is given by σ². If M = 0, the model is equivalent to the isotropic covariance case.
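
As a small self-contained check (our own illustration, not from the text), the following NumPy sketch constructs the probabilistic PCA marginal covariance C = W Wᵀ + σ²I and evaluates (12.51); setting M = 0 reproduces the isotropic covariance, as noted above.

```python
import numpy as np

def ppca_covariance(W, sigma2):
    """Marginal covariance C = W W^T + sigma^2 I of probabilistic PCA."""
    D = W.shape[0]
    return W @ W.T + sigma2 * np.eye(D)

def ppca_dof(D, M):
    """Degrees of freedom of C, equation (12.51)."""
    return D * M + 1 - M * (M - 1) // 2

rng = np.random.default_rng(0)
D, M, sigma2 = 5, 2, 0.3

C = ppca_covariance(rng.standard_normal((D, M)), sigma2)
print(ppca_dof(D, M))                         # 5*2 + 1 - 1 = 10

# With M = 0 (no columns in W), C reduces to the isotropic case sigma^2 I.
C0 = ppca_covariance(np.zeros((D, 0)), sigma2)
print(np.allclose(C0, sigma2 * np.eye(D)))    # True
```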

12.2.2 EM algorithm for PCA


As we have seen, the probabilistic PCA model can be expressed in terms of a marginalization over a continuous latent space z in which, for each data point x_n, there is a corresponding latent variable z_n. We can therefore make use of the EM algorithm (Section 9.4) to find maximum likelihood estimates of the model parameters. This may seem rather pointless because we have already obtained an exact closed-form solution for the maximum likelihood parameter values. However, in spaces of high dimensionality, there may be computational advantages in using an iterative EM procedure rather than working directly with the sample covariance matrix. This EM procedure can also be extended to the factor analysis model (Section 12.2.4), for which there is no closed-form solution. Finally, it allows missing data to be handled in a principled way.
We can derive the EM algorithm for probabilistic PCA by following the general framework for EM. Thus we write down the complete-data log likelihood and take its expectation with respect to the posterior distribution of the latent variables, evaluated using the 'old' parameter values. Maximization of this expected complete-data log likelihood then yields the 'new' parameter values. Because the data points are observed independently, the complete-data log likelihood decomposes into a sum of contributions, one for each pair (x_n, z_n).
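
To preview where this derivation leads, the following NumPy sketch iterates the two steps just described. It is a minimal illustration, assuming the standard probabilistic PCA updates of Tipping and Bishop (derived later in this section); the function and variable names are our own.

```python
import numpy as np

def ppca_em(X, M, n_iter=100, seed=0):
    """EM for probabilistic PCA on an N x D data matrix X.

    The mean is fixed at the sample mean; returns W (D x M) and sigma^2.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Xc = X - X.mean(axis=0)                    # centered data points
    W = rng.standard_normal((D, M))            # initial 'old' parameters
    sigma2 = 1.0

    for _ in range(n_iter):
        # E step: posterior moments of the latent variables z_n under
        # the 'old' parameters, using the M x M matrix W^T W + sigma^2 I.
        Minv = np.linalg.inv(W.T @ W + sigma2 * np.eye(M))
        Ez = Xc @ W @ Minv                     # rows are E[z_n]
        Ezz = N * sigma2 * Minv + Ez.T @ Ez    # sum_n E[z_n z_n^T]

        # M step: maximize the expected complete-data log likelihood.
        W = Xc.T @ Ez @ np.linalg.inv(Ezz)     # 'new' W
        sigma2 = (np.sum(Xc**2)
                  - 2.0 * np.sum((Xc @ W) * Ez)
                  + np.trace(Ezz @ W.T @ W)) / (N * D)
    return W, sigma2

# Example: recover a 2-dimensional subspace from noisy 10-dimensional data.
rng = np.random.default_rng(1)
latent = rng.standard_normal((500, 2))
X = latent @ rng.standard_normal((2, 10)) + 0.1 * rng.standard_normal((500, 10))
W_fit, sigma2_fit = ppca_em(X, M=2)
```

Note that each iteration involves only O(NDM) matrix products, which is the origin of the computational advantage mentioned above when D is large, since it avoids forming the D × D sample covariance matrix.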