- We can derive an EM algorithm for PCA that is computationally efficient in situations where only a few leading eigenvectors are required and that avoids having to evaluate the data covariance matrix as an intermediate step (Section 12.2.2).
- The combination of a probabilistic model and EM allows us to deal with missing values in the data set.
- Mixtures of probabilistic PCA models can be formulated in a principled way and trained using the EM algorithm.
- Probabilistic PCA forms the basis for a Bayesian treatment of PCA in which the dimensionality of the principal subspace can be found automatically from the data (Section 12.2.3).
- The existence of a likelihood function allows direct comparison with other probabilistic density models. By contrast, conventional PCA will assign a low reconstruction cost to data points that are close to the principal subspace even if they lie arbitrarily far from the training data.
- Probabilistic PCA can be used to model class-conditional densities and hence be applied to classification problems.
- The probabilistic PCA model can be run generatively to provide samples from the distribution.
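
As an illustration of the last advantage in the list, the following is a minimal sketch of how samples might be drawn from the model, assuming the Gaussian prior (12.31) and Gaussian conditional (12.32) defined later in this section. The function name `sample_ppca`, its arguments, and the parameter values in the usage lines are hypothetical choices made for this example and do not come from the text. Sampling proceeds ancestrally: first draw the latent variable from its prior, then draw the observed variable from the conditional distribution given that latent value.

```python
import numpy as np


def sample_ppca(W, mu, sigma2, n_samples, rng=None):
    """Draw samples from a probabilistic PCA model by ancestral sampling.

    W      : (D, M) array, linear map from latent space to data space
    mu     : (D,) array, mean vector in data space
    sigma2 : positive float, isotropic noise variance
    Returns X of shape (n_samples, D) and Z of shape (n_samples, M).
    """
    rng = np.random.default_rng() if rng is None else rng
    D, M = W.shape

    # Step 1: latent variables z ~ N(0, I), one row per sample (eq. 12.31).
    Z = rng.standard_normal((n_samples, M))

    # Step 2: x | z ~ N(W z + mu, sigma2 * I) (eq. 12.32), obtained by
    # adding zero-mean isotropic Gaussian noise to the mean W z + mu.
    noise = np.sqrt(sigma2) * rng.standard_normal((n_samples, D))
    X = Z @ W.T + mu + noise
    return X, Z


# Usage with arbitrary illustrative parameters (D = 3, M = 2).
rng = np.random.default_rng(0)
W = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
mu = np.array([1.0, -2.0, 0.5])
X, Z = sample_ppca(W, mu, sigma2=0.25, n_samples=1000, rng=rng)
```

Each row of `X` is one sample in the $D$-dimensional data space and the matching row of `Z` is the latent vector that generated it. Passing a seeded generator, as in the usage lines, makes the draws reproducible.
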

This formulation of PCA as a probabilistic model was proposed independently by Tipping and Bishop (1997, 1999b) and by Roweis (1998). As we shall see later, it is closely related to factor analysis (Basilevsky, 1994).
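
Probabilistic PCA is a simple example of the linear-Gaussian framework (Section 8.1.4), in which all of the marginal and conditional distributions are Gaussian. We can formulate probabilistic PCA by first introducing an explicit latent variable $\mathbf{z}$ corresponding to the principal-component subspace. Next we define a Gaussian prior distribution $p(\mathbf{z})$ over the latent variable, together with a Gaussian conditional distribution $p(\mathbf{x} \mid \mathbf{z})$ for the observed variable $\mathbf{x}$ conditioned on the value of the latent variable. Specifically, the prior distribution over $\mathbf{z}$ is given by a zero-mean unit-covariance Gaussian

$$p(\mathbf{z}) = \mathcal{N}(\mathbf{z} \mid \mathbf{0}, \mathbf{I}). \tag{12.31}$$

Similarly, the conditional distribution of the observed variable $\mathbf{x}$, conditioned on the value of the latent variable $\mathbf{z}$, is again Gaussian, of the form

$$p(\mathbf{x} \mid \mathbf{z}) = \mathcal{N}(\mathbf{x} \mid \mathbf{W}\mathbf{z} + \boldsymbol{\mu}, \sigma^2 \mathbf{I}) \tag{12.32}$$
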
in which the mean of $\mathbf{x}$ is a general linear function of $\mathbf{z}$ governed by the $D \times M$ matrix $\mathbf{W}$ and the $D$-dimensional vector $\boldsymbol{\mu}$. Note that this factorizes with respect to the elements of $\mathbf{x}$; in other words, this is an example of the naive Bayes model (Section 8.2.2). As we shall see shortly, the columns of $\mathbf{W}$ span a linear subspace within the data space that corresponds to the principal subspace. The other parameter in this model is the scalar $\sigma^2$ governing the variance of the conditional distribution. Note that there is no