Pattern Recognition and Machine Learning

(Jeff_L) #1
Section 12.2.2

Section 12.2.3

Section 8.1.4


12.2. Probabilistic PCA


  • We can derive an EM algorithm for PCA that is computationally efficient in
    situations where only a few leading eigenvectors are required and that avoids
    having to evaluate the data covariance matrix as an intermediate step.

  • The combination of a probabilistic model and EM allows us to deal with
    missing values in the data set.

  • Mixtures of probabilistic PCA models can be formulated in a principled way
    and trained using the EM algorithm.

  • Probabilistic PCA forms the basis for a Bayesian treatment of PCA in which
    the dimensionality of the principal subspace can be found automatically from
    the data.

  • The existence of a likelihood function allows direct comparison with other
    probabilistic density models. By contrast, conventional PCA will assign a low
    reconstruction cost to data points that are close to the principal subspace even
    if they lie arbitrarily far from the training data.

  • Probabilistic PCA can be used to model class-conditional densities and hence
    be applied to classification problems.

  • The probabilistic PCA model can be run generatively to provide samples from
    the distribution.
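
The remark about reconstruction cost can be checked numerically. The following is a minimal sketch, not taken from the text: it fits conventional PCA to synthetic two-dimensional data and compares the reconstruction cost of a point that lies on the principal subspace but far from the data with that of a point near the data but off the subspace. The helper names `fit_pca` and `reconstruction_cost`, the synthetic data, and the two test points are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, M):
    """Return the data mean and the top-M principal directions (columns of U)."""
    mean = X.mean(axis=0)
    # SVD of the centred data gives the principal directions directly.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:M].T

def reconstruction_cost(x, mean, U):
    """Squared distance between x and its projection onto the principal subspace."""
    centred = x - mean
    projected = U @ (U.T @ centred)
    return float(np.sum((centred - projected) ** 2))

rng = np.random.default_rng(0)
# Hypothetical data: 200 points scattered tightly around the line x2 = x1.
t = rng.normal(size=200)
X = np.column_stack([t, t]) + 0.05 * rng.normal(size=(200, 2))

mean, U = fit_pca(X, M=1)

# A point on the principal subspace, but very far from every training point.
far_point = mean + 1000.0 * U[:, 0]
# A point close to the training data, but off the principal subspace.
near_point = mean + np.array([0.5, -0.5])

far_cost = reconstruction_cost(far_point, mean, U)
near_cost = reconstruction_cost(near_point, mean, U)
far_distance = float(np.min(np.linalg.norm(X - far_point, axis=1)))

print(f"far point:  cost = {far_cost:.3e}, distance to nearest datum = {far_distance:.1f}")
print(f"near point: cost = {near_cost:.3e}")
```

Under these assumptions the distant point receives an essentially zero reconstruction cost, while the nearby off-subspace point is penalized. A density model would instead be expected to assign low probability to the distant point.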


This formulation of PCA as a probabilistic model was proposed independently by
Tipping and Bishop (1997, 1999b) and by Roweis (1998). As we shall see later, it is
closely related to factor analysis (Basilevsky, 1994).

Probabilistic PCA is a simple example of the linear-Gaussian framework, in
which all of the marginal and conditional distributions are Gaussian. We can
formulate probabilistic PCA by first introducing an explicit latent variable z
corresponding to the principal-component subspace. Next we define a Gaussian prior
distribution p(z) over the latent variable, together with a Gaussian conditional
distribution p(x|z) for the observed variable x conditioned on the value of the
latent variable. Specifically, the prior distribution over z is given by a
zero-mean unit-covariance Gaussian

\[ p(\mathbf{z}) = \mathcal{N}(\mathbf{z} \mid \mathbf{0}, \mathbf{I}). \qquad (12.31) \]

Similarly, the conditional distribution of the observed variable x, conditioned on the
value of the latent variable z, is again Gaussian, of the form

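\[ p(\mathbf{x} \mid \mathbf{z}) = \mathcal{N}(\mathbf{x} \mid \mathbf{W}\mathbf{z} + \boldsymbol{\mu}, \sigma^2 \mathbf{I}) \qquad (12.32) \]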


Section 8.2.2


in which the mean of x is a general linear function of z governed by the D × M
matrix W and the D-dimensional vector μ. Note that this factorizes with respect to
the elements of x; in other words, this is an example of the naive Bayes model. As
we shall see shortly, the columns of W span a linear subspace within the data space
that corresponds to the principal subspace. The other parameter in this model is the
scalar σ² governing the variance of the conditional distribution. Note that there is no
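
The generative process defined by (12.31) and (12.32) can be sketched as ancestral sampling: draw z from the prior, then draw x from the conditional given that z. The code below is a minimal illustration, not a reference implementation. The function names `sample_ppca` and `log_conditional` and the numerical values of W, μ and σ² are hypothetical, chosen only to make the example runnable. The second function writes log p(x|z) as a sum over the elements of x, which is the factorization noted above.

```python
import numpy as np

def sample_ppca(W, mu, sigma2, n_samples, rng):
    """Draw (z, x) pairs by ancestral sampling from (12.31) and (12.32)."""
    D, M = W.shape
    # Prior (12.31): z ~ N(0, I) in the M-dimensional latent space.
    z = rng.standard_normal((n_samples, M))
    # Conditional (12.32): x | z ~ N(W z + mu, sigma2 * I).
    noise = np.sqrt(sigma2) * rng.standard_normal((n_samples, D))
    x = z @ W.T + mu + noise
    return z, x

def log_conditional(x, z, W, mu, sigma2):
    """log p(x | z) as a sum of D independent univariate Gaussian terms."""
    mean = W @ z + mu
    per_element = (-0.5 * np.log(2.0 * np.pi * sigma2)
                   - 0.5 * (x - mean) ** 2 / sigma2)
    return float(per_element.sum())

# Hypothetical parameter values chosen only for illustration (D = 3, M = 2).
W = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.0, -1.0]])
mu = np.array([2.0, -1.0, 0.5])
sigma2 = 0.25

rng = np.random.default_rng(0)
z, x = sample_ppca(W, mu, sigma2, n_samples=20000, rng=rng)

# Given z, the residual x - (W z + mu) should have covariance close to sigma2 * I.
residual = x - (z @ W.T + mu)
residual_cov = np.cov(residual, rowvar=False)
print(np.round(residual_cov, 3))
```

With these assumed values, the printed residual covariance should be close to 0.25 times the 3 × 3 identity matrix. This reflects the isotropic noise term σ²I in (12.32).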