Pattern Recognition and Machine Learning

12.2. Probabilistic PCA

Because this integration is intractable, we make use of the Laplace
approximation (Section 4.4). If we assume that the posterior distribution is
sharply peaked, as will occur for sufficiently large data sets, then the
re-estimation equations obtained by maximizing the marginal likelihood with
respect to $\alpha_i$ (Section 3.5.3) take the simple form

$$\alpha_i^{\mathrm{new}} = \frac{D}{\mathbf{w}_i^{\mathrm{T}} \mathbf{w}_i} \qquad (12.62)$$

which follows from (3.98), noting that the dimensionality of $\mathbf{w}_i$ is
$D$. These re-estimations are interleaved with the EM algorithm updates for
determining $\mathbf{W}$ and $\sigma^2$. The E-step equations are again given by
(12.54) and (12.55). Similarly, the M-step equation for $\sigma^2$ is again
given by (12.57). The only change is to the M-step equation for $\mathbf{W}$,
which is modified to give

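$$\mathbf{W}_{\mathrm{new}} = \left[ \sum_{n=1}^{N} (\mathbf{x}_n - \bar{\mathbf{x}})\, \mathbb{E}[\mathbf{z}_n]^{\mathrm{T}} \right] \left[ \sum_{n=1}^{N} \mathbb{E}[\mathbf{z}_n \mathbf{z}_n^{\mathrm{T}}] + \sigma^2 \mathbf{A} \right]^{-1} \qquad (12.63)$$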

where $\mathbf{A} = \mathrm{diag}(\alpha_i)$. The value of $\boldsymbol{\mu}$ is given by the sample mean, as before.
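
The interleaved updates described above can be sketched in NumPy. This is a minimal illustration rather than a reference implementation. The E-step and the $\sigma^2$ update are written in the standard probabilistic PCA form, which this excerpt cites only by number ((12.54), (12.55), (12.57)), so their exact form here is an assumption. The function name `bayesian_pca_em`, the random initialization, the iteration count, and the numerical guards `1e-300` and `alpha_cap` are likewise hypothetical choices made for the example.

```python
import numpy as np


def bayesian_pca_em(X, M, n_iter=500, seed=0, alpha_cap=1e12):
    """EM for probabilistic PCA with re-estimated precisions alpha_i (a sketch).

    X is an (N, D) data matrix; M is the latent dimensionality.
    Returns W (D, M), sigma2, alpha (M,), and the sample mean x_bar (D,).
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    x_bar = X.mean(axis=0)          # mu is fixed to the sample mean
    Xc = X - x_bar
    W = rng.normal(size=(D, M))     # arbitrary initialization (assumption)
    sigma2 = 1.0
    alpha = np.ones(M)

    for _ in range(n_iter):
        # E-step, assumed standard probabilistic PCA form:
        #   E[z_n]       = M^{-1} W^T (x_n - x_bar),  M = W^T W + sigma2 I
        #   E[z_n z_n^T] = sigma2 M^{-1} + E[z_n] E[z_n]^T
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(M))
        Ez = Xc @ W @ M_inv                       # row n holds E[z_n]
        sum_Ezz = N * sigma2 * M_inv + Ez.T @ Ez  # sum over n of E[z_n z_n^T]

        # M-step for W with the extra sigma2 * A term, A = diag(alpha_i).
        S = sum_Ezz + sigma2 * np.diag(alpha)     # symmetric (M, M)
        W = np.linalg.solve(S, Ez.T @ Xc).T       # equals (Xc^T Ez) S^{-1}

        # M-step for sigma2, assumed standard probabilistic PCA form.
        sigma2 = (
            np.sum(Xc ** 2)
            - 2.0 * np.sum(Ez * (Xc @ W))
            + np.trace(sum_Ezz @ W.T @ W)
        ) / (N * D)

        # Re-estimate alpha_i = D / (w_i^T w_i). The floor and the cap are
        # numerical safeguards added for this sketch.
        col_norm2 = np.maximum(np.sum(W ** 2, axis=0), 1e-300)
        alpha = np.minimum(D / col_norm2, alpha_cap)

    return W, sigma2, alpha, x_bar
```

Calling `bayesian_pca_em(X, M=3)` on an `(N, D)` array returns the fitted `W`, `sigma2`, and `alpha`. Columns of `W` whose `alpha` grows large are effectively switched off; how many iterations that takes depends on the data and on the initialization.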

If we choose $M = D - 1$ then, if all $\alpha_i$ values are finite, the model
represents a full-covariance Gaussian, while if all the $\alpha_i$ go to
infinity the model is equivalent to an isotropic Gaussian, and so the model can
encompass all permissible values for the effective dimensionality of the
principal subspace. It is also possible to consider smaller values of $M$,
which will save on computational cost but which will limit the maximum
dimensionality of the subspace. A comparison of the results of this algorithm
with standard probabilistic PCA is shown in Figure 12.14.
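
The two limiting cases can be checked by a short parameter count. The sketch below assumes the marginal covariance of probabilistic PCA, $\mathbf{C} = \mathbf{W}\mathbf{W}^{\mathrm{T}} + \sigma^2\mathbf{I}$, and the standard count of its independent parameters, $DM + 1 - M(M-1)/2$. Neither is restated in this excerpt, so both are assumptions carried over from the probabilistic PCA model.

```latex
% Marginal covariance of the model (assumed form)
\mathbf{C} = \mathbf{W}\mathbf{W}^{\mathrm{T}} + \sigma^2 \mathbf{I}

% All \alpha_i \to \infty: by (12.62) every column satisfies w_i^T w_i \to 0
\mathbf{W} \to \mathbf{0}
  \quad\Longrightarrow\quad
  \mathbf{C} \to \sigma^2 \mathbf{I}
  \qquad \text{(isotropic Gaussian)}

% M = D - 1 with all \alpha_i finite: count the free parameters of C
DM + 1 - \frac{M(M-1)}{2} \,\bigg|_{M = D-1}
  = D(D-1) + 1 - \frac{(D-1)(D-2)}{2}
  = \frac{D(D+1)}{2}
  \qquad \text{(full-covariance Gaussian)}
```

The final count, $D(D+1)/2$, equals the number of independent elements of a general symmetric $D \times D$ covariance matrix.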

Bayesian PCA provides an opportunity to illustrate the Gibbs sampling algorithm
discussed in Section 11.3. Figure 12.15 shows an example of the samples from
the hyperparameters $\ln \alpha_i$ for a data set in $D = 4$ dimensions in which
the dimensionality of the latent space is $M = 3$ but in which the data set is
generated from a probabilistic PCA model having one direction of high variance,
with the remaining directions comprising low variance noise. This result shows
clearly the presence of three distinct modes in the posterior distribution. At
each step of the iteration, one of the hyperparameters has a small value and
the remaining two have large values, so that two of the three latent variables
are suppressed. During the course of the Gibbs sampling, the solution makes
sharp transitions between the three modes.

The model described here involves a prior only over the matrix $\mathbf{W}$. A
fully Bayesian treatment of PCA, including priors over $\boldsymbol{\mu}$,
$\sigma^2$, and $\boldsymbol{\alpha}$, and solved using variational methods, is
described in Bishop (1999b). For a discussion of various Bayesian approaches to
determining the appropriate dimensionality for a PCA model, see Minka (2001c).

12.2.4 Factor analysis


Factor analysis is a linear-Gaussian latent variable model that is closely
related to probabilistic PCA. Its definition differs from that of probabilistic
PCA only in that the conditional distribution of the observed variable
$\mathbf{x}$ given the latent variable $\mathbf{z}$ is