- We can derive an EM algorithm for PCA that is computationally efficient in
  situations where only a few leading eigenvectors are required and that avoids
  having to evaluate the data covariance matrix as an intermediate step
  (Section 12.2.2).
- The combination of a probabilistic model and EM allows us to deal with
  missing values in the data set.
- Mixtures of probabilistic PCA models can be formulated in a principled way
  and trained using the EM algorithm.
- Probabilistic PCA forms the basis for a Bayesian treatment of PCA in which
  the dimensionality of the principal subspace can be found automatically from
  the data (Section 12.2.3).
- The existence of a likelihood function allows direct comparison with other
  probabilistic density models. By contrast, conventional PCA will assign a low
  reconstruction cost to data points that are close to the principal subspace
  even if they lie arbitrarily far from the training data.
- Probabilistic PCA can be used to model class-conditional densities and hence
  be applied to classification problems.
- The probabilistic PCA model can be run generatively to provide samples from
  the distribution.
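The last point can be illustrated with a minimal sketch of ancestral sampling:
draw the latent variable from its zero-mean unit-covariance Gaussian prior,
then draw the observed variable from the Gaussian conditional with mean
Wz + mu and covariance sigma^2 I, the two distributions that this section goes
on to define in (12.31) and (12.32). The function name `sample_ppca`, the use
of NumPy, and the parameter values are assumptions made purely for
illustration; they do not come from the text.

```python
import numpy as np


def sample_ppca(W, mu, sigma2, n_samples, rng=None):
    """Ancestral sampling: z ~ N(0, I_M), then x | z ~ N(W z + mu, sigma2 * I_D)."""
    rng = np.random.default_rng(rng)
    D, M = W.shape
    # Latent variables from the zero-mean, unit-covariance Gaussian prior.
    Z = rng.standard_normal((n_samples, M))
    # Isotropic Gaussian noise with variance sigma2 in every data dimension.
    noise = np.sqrt(sigma2) * rng.standard_normal((n_samples, D))
    # Each row is one sample x = W z + mu + noise.
    X = Z @ W.T + mu + noise
    return X, Z


# Hypothetical parameter values, chosen only for illustration (D = 3, M = 1).
W = np.array([[2.0], [1.0], [0.0]])
mu = np.array([1.0, -1.0, 0.5])
sigma2 = 0.1

X, Z = sample_ppca(W, mu, sigma2, n_samples=50_000, rng=0)
print(X.mean(axis=0))           # sample mean, close to mu
print(np.cov(X, rowvar=False))  # sample covariance, close to W W^T + sigma2 * I
```

Because x is a linear function of z plus independent isotropic noise, the
samples have mean mu and covariance W W^T + sigma^2 I, which is what the two
printed quantities approximate.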
This formulation of PCA as a probabilistic model was proposed independently by
Tipping and Bishop (1997, 1999b) and by Roweis (1998). As we shall see later,
it is closely related to factor analysis (Basilevsky, 1994).
Probabilistic PCA is a simple example of the linear-Gaussian framework
(Section 8.1.4), in which all of the marginal and conditional distributions are
Gaussian. We can formulate probabilistic PCA by first introducing an explicit
latent variable $\mathbf{z}$ corresponding to the principal-component subspace.
Next we define a Gaussian prior distribution $p(\mathbf{z})$ over the latent
variable, together with a Gaussian conditional distribution
$p(\mathbf{x} \mid \mathbf{z})$ for the observed variable $\mathbf{x}$
conditioned on the value of the latent variable. Specifically, the prior
distribution over $\mathbf{z}$ is given by a zero-mean unit-covariance Gaussian
$$p(\mathbf{z}) = \mathcal{N}(\mathbf{z} \mid \mathbf{0}, \mathbf{I}). \tag{12.31}$$
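Similarly, the conditional distribution of the observed variable $\mathbf{x}$,
conditioned on the value of the latent variable $\mathbf{z}$, is again
Gaussian, of the form
$$p(\mathbf{x} \mid \mathbf{z}) = \mathcal{N}(\mathbf{x} \mid \mathbf{W}\mathbf{z} + \boldsymbol{\mu}, \sigma^2 \mathbf{I}) \tag{12.32}$$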
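in which the mean of $\mathbf{x}$ is a general linear function of $\mathbf{z}$
governed by the $D \times M$ matrix $\mathbf{W}$ and the $D$-dimensional vector
$\boldsymbol{\mu}$. Note that, because the covariance $\sigma^2 \mathbf{I}$ is
diagonal, this distribution factorizes with respect to the elements of
$\mathbf{x}$; in other words, this is an example of the naive Bayes model
(Section 8.2.2). As we shall see shortly, the columns of $\mathbf{W}$ span a
linear subspace within the data space that corresponds to the principal
subspace. The other parameter in this model is the scalar $\sigma^2$ governing
the variance of the conditional distribution. Note that there is no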