Pattern Recognition and Machine Learning

(Jeff_L) #1
584 12. CONTINUOUS LATENT VARIABLES



Figure 12.14 'Hinton' diagrams of the matrix $\mathbf{W}$ in which each element of the matrix is depicted as a square (white for positive and black for negative values) whose area is proportional to the magnitude of that element. The synthetic data set comprises 300 data points in $D = 10$ dimensions sampled from a Gaussian distribution having standard deviation 1.0 in 3 directions and standard deviation 0.5 in the remaining 7 directions, giving $M = 3$ directions with larger variance than the remaining 7. The left-hand plot shows the result from maximum likelihood probabilistic PCA, and the right-hand plot shows the corresponding result from Bayesian PCA. We see how the Bayesian model is able to discover the appropriate dimensionality by suppressing the 6 surplus degrees of freedom.




taken to have a diagonal rather than an isotropic covariance so that

$$p(\mathbf{x}|\mathbf{z}) = \mathcal{N}(\mathbf{x}|\mathbf{W}\mathbf{z} + \boldsymbol{\mu}, \boldsymbol{\Psi}) \tag{12.64}$$


where $\boldsymbol{\Psi}$ is a $D \times D$ diagonal matrix. Note that the factor analysis model, in common with probabilistic PCA, assumes that the observed variables $x_1, \ldots, x_D$ are independent, given the latent variable $\mathbf{z}$. In essence, the factor analysis model is explaining the observed covariance structure of the data by representing the independent variance associated with each coordinate in the matrix $\boldsymbol{\Psi}$ and capturing the covariance between variables in the matrix $\mathbf{W}$. In the factor analysis literature, the columns of $\mathbf{W}$, which capture the correlations between observed variables, are called factor loadings, and the diagonal elements of $\boldsymbol{\Psi}$, which represent the independent noise variances for each of the variables, are called uniquenesses.
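
The roles of the factor loadings and the uniquenesses can be illustrated with a small simulation. The following is a minimal sketch, not taken from the text: it assumes a zero-mean, unit-covariance Gaussian prior over $\mathbf{z}$ (as in probabilistic PCA), under which the marginal covariance of $\mathbf{x}$ is $\mathbf{W}\mathbf{W}^{\mathrm{T}} + \boldsymbol{\Psi}$. The function name `sample_factor_analysis` and all numerical values of `W` and `psi` are hypothetical choices made for illustration.

```python
import numpy as np

def sample_factor_analysis(W, mu, psi, n, rng):
    """Draw n samples x = W z + mu + eps (illustrative helper).

    Assumes z ~ N(0, I) and eps ~ N(0, diag(psi)); psi holds the uniquenesses.
    """
    D, M = W.shape
    z = rng.standard_normal((n, M))                   # latent variables
    eps = rng.standard_normal((n, D)) * np.sqrt(psi)  # independent noise per coordinate
    return z @ W.T + mu + eps

rng = np.random.default_rng(0)

# Hypothetical factor loadings (D = 5, M = 2) and uniquenesses.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0],
              [-0.5, 0.5],
              [0.3, -0.2]])
mu = np.zeros(5)
psi = np.array([0.1, 0.2, 0.3, 0.4, 0.5])

X = sample_factor_analysis(W, mu, psi, n=200_000, rng=rng)

C_model = W @ W.T + np.diag(psi)      # covariance implied by the model
C_sample = np.cov(X, rowvar=False)    # covariance estimated from the samples
max_abs_err = np.max(np.abs(C_sample - C_model))
print(max_abs_err)
```

In this sketch the off-diagonal structure of the sample covariance comes entirely from `W`, while `psi` only adds to the diagonal, which is the division of labour described above.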

The origins of factor analysis are as old as those of PCA, and discussions of factor analysis can be found in the books by Everitt (1984), Bartholomew (1987), and Basilevsky (1994). Links between factor analysis and PCA were investigated by Lawley (1953) and Anderson (1963) who showed that at stationary points of the likelihood function, for a factor analysis model with $\boldsymbol{\Psi} = \sigma^2 \mathbf{I}$, the columns of $\mathbf{W}$ are scaled eigenvectors of the sample covariance matrix, and $\sigma^2$ is the average of the discarded eigenvalues. Later, Tipping and Bishop (1999b) showed that the maximum of the log likelihood function occurs when the eigenvectors comprising $\mathbf{W}$ are chosen to be the principal eigenvectors.
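
The isotropic case $\boldsymbol{\Psi} = \sigma^2 \mathbf{I}$ can be sketched numerically. The code below is an illustration under stated assumptions rather than a definitive implementation: it uses the standard probabilistic PCA scaling $\mathbf{W} = \mathbf{U}_M (\mathbf{L}_M - \sigma^2 \mathbf{I})^{1/2}$ with the arbitrary rotation fixed to the identity, and it generates synthetic data that only loosely imitate the data set of Figure 12.14 (axis-aligned directions are an assumption). The function name `ppca_ml` is hypothetical.

```python
import numpy as np

def ppca_ml(X, M):
    """Closed-form maximum likelihood fit for Psi = sigma^2 I (illustrative helper).

    Returns W (up to an arbitrary rotation, here taken as the identity) and sigma^2.
    """
    S = np.cov(X, rowvar=False, bias=True)     # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[M:].mean()                # average of the discarded eigenvalues
    # Principal eigenvectors, each scaled by sqrt(lambda_i - sigma^2).
    W = eigvecs[:, :M] * np.sqrt(eigvals[:M] - sigma2)
    return W, sigma2

rng = np.random.default_rng(0)

# 300 points in D = 10 dimensions: std 1.0 in 3 directions, 0.5 in the other 7.
stds = np.array([1.0] * 3 + [0.5] * 7)
X = rng.standard_normal((300, 10)) * stds

W, sigma2 = ppca_ml(X, M=3)
print(W.shape, sigma2)
```

Because the discarded eigenvalues are replaced by their average, the fitted covariance $\mathbf{W}\mathbf{W}^{\mathrm{T}} + \sigma^2 \mathbf{I}$ has the same trace as the sample covariance, which gives a quick sanity check on the sketch.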


Making use of (2.115), we see that the marginal distribution for the observed