12.2. Probabilistic PCA

Again, we shall assume that the eigenvectors have been arranged in order of decreasing values of the corresponding eigenvalues, so that the $M$ principal eigenvectors are $\mathbf{u}_1, \ldots, \mathbf{u}_M$. In this case, the columns of $\mathbf{W}$ define the principal subspace of standard PCA. The corresponding maximum likelihood solution for $\sigma^2$ is then given by

$$\sigma^2_{\mathrm{ML}} = \frac{1}{D-M} \sum_{i=M+1}^{D} \lambda_i \tag{12.46}$$

so that $\sigma^2_{\mathrm{ML}}$ is the average variance associated with the discarded dimensions.

(Section 12.2.2)
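
The estimate in (12.46) is simple enough to compute directly. The following is a minimal sketch, assuming the eigenvalues of the data covariance matrix are already available; the function name `sigma2_ml` and the example eigenvalues are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigma2_ml(eigenvalues, M):
    """Average of the D - M smallest eigenvalues, following (12.46)."""
    # Sort into decreasing order so that the first M entries are the retained ones.
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    D = lam.size
    if not 0 <= M < D:
        raise ValueError("M must satisfy 0 <= M < D")
    return lam[M:].sum() / (D - M)

# Hypothetical eigenvalues for a D = 5 dimensional data set (illustrative only).
lam = [4.0, 2.0, 0.5, 0.3, 0.1]
print(sigma2_ml(lam, M=2))  # mean of the discarded eigenvalues 0.5, 0.3, 0.1
```

The input is sorted inside the function, so the eigenvalues may be passed in any order.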

Because $\mathbf{R}$ is orthogonal, it can be interpreted as a rotation matrix in the $M \times M$ latent space. If we substitute the solution for $\mathbf{W}$ into the expression for $\mathbf{C}$, and make use of the orthogonality property $\mathbf{R}\mathbf{R}^{\mathrm{T}} = \mathbf{I}$, we see that $\mathbf{C}$ is independent of $\mathbf{R}$. This simply says that the predictive density is unchanged by rotations in the latent space, as discussed earlier. For the particular case of $\mathbf{R} = \mathbf{I}$, we see that the columns of $\mathbf{W}$ are the principal component eigenvectors scaled by the square roots of the variance parameters $\lambda_i - \sigma^2$. The interpretation of these scaling factors is clear once we recognize that for a convolution of independent Gaussian distributions (in this case the latent space distribution and the noise model) the variances are additive. Thus the variance $\lambda_i$ in the direction of an eigenvector $\mathbf{u}_i$ is composed of the sum of a contribution $\lambda_i - \sigma^2$ from the projection of the unit-variance latent space distribution into data space through the corresponding column of $\mathbf{W}$, plus an isotropic contribution of variance $\sigma^2$ which is added in all directions by the noise model.
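
The independence of $\mathbf{C}$ from $\mathbf{R}$ can be verified in a few lines. The sketch below assumes that the model covariance has the form $\mathbf{C} = \mathbf{W}\mathbf{W}^{\mathrm{T}} + \sigma^2\mathbf{I}$ and that the maximum likelihood solution is $\mathbf{W} = \mathbf{U}_M(\mathbf{L}_M - \sigma^2\mathbf{I})^{1/2}\mathbf{R}$, where $\mathbf{U}_M$ has the $M$ principal eigenvectors as its columns and $\mathbf{L}_M$ is the diagonal matrix of the corresponding eigenvalues. Both expressions are defined outside this passage, so they are taken here as assumptions.

```latex
\begin{align}
\mathbf{C}
  &= \mathbf{W}\mathbf{W}^{\mathrm{T}} + \sigma^2\mathbf{I} \\
  &= \mathbf{U}_M(\mathbf{L}_M - \sigma^2\mathbf{I})^{1/2}\,
     \mathbf{R}\mathbf{R}^{\mathrm{T}}\,
     (\mathbf{L}_M - \sigma^2\mathbf{I})^{1/2}\mathbf{U}_M^{\mathrm{T}}
     + \sigma^2\mathbf{I} \\
  &= \mathbf{U}_M(\mathbf{L}_M - \sigma^2\mathbf{I})\mathbf{U}_M^{\mathrm{T}}
     + \sigma^2\mathbf{I}
   = \sum_{i=1}^{M} (\lambda_i - \sigma^2)\,\mathbf{u}_i\mathbf{u}_i^{\mathrm{T}}
     + \sigma^2\mathbf{I}.
\end{align}
```

The final expression contains no $\mathbf{R}$, and its two terms are the additive contributions described above: $\lambda_i - \sigma^2$ along each retained eigenvector $\mathbf{u}_i$, plus $\sigma^2$ in every direction.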
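
It is worth taking a moment to study the form of the covariance matrix given by (12.36). Consider the variance of the predictive distribution along some direction specified by the unit vector $\mathbf{v}$, where $\mathbf{v}^{\mathrm{T}}\mathbf{v} = 1$, which is given by $\mathbf{v}^{\mathrm{T}}\mathbf{C}\mathbf{v}$. First suppose that $\mathbf{v}$ is orthogonal to the principal subspace, in other words it is given by some linear combination of the discarded eigenvectors. Then $\mathbf{v}^{\mathrm{T}}\mathbf{U} = 0$ and hence $\mathbf{v}^{\mathrm{T}}\mathbf{C}\mathbf{v} = \sigma^2$. Thus the model predicts a noise variance orthogonal to the principal subspace, which, from (12.46), is just the average of the discarded eigenvalues. Now suppose that $\mathbf{v} = \mathbf{u}_i$ where $\mathbf{u}_i$ is one of the retained eigenvectors defining the principal subspace. Then $\mathbf{v}^{\mathrm{T}}\mathbf{C}\mathbf{v} = (\lambda_i - \sigma^2) + \sigma^2 = \lambda_i$. In other words, this model correctly captures the variance of the data along the principal axes, and approximates the variance in all remaining directions with a single average value $\sigma^2$.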
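
These two cases can be checked numerically. The following is a minimal sketch, assuming the model covariance $\mathbf{C} = \mathbf{W}\mathbf{W}^{\mathrm{T}} + \sigma^2\mathbf{I}$ and a maximum likelihood $\mathbf{W}$ with $\mathbf{R} = \mathbf{I}$ whose columns are $\sqrt{\lambda_i - \sigma^2}\,\mathbf{u}_i$. The eigenvalues, the random orthonormal basis, and the helper `directional_variance` are hypothetical and used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 5, 2

# Hypothetical eigen-decomposition of a data covariance matrix:
# orthonormal eigenvectors as the columns of U, eigenvalues in decreasing order.
U, _ = np.linalg.qr(rng.standard_normal((D, D)))
lam = np.array([4.0, 2.0, 0.5, 0.3, 0.1])

sigma2 = lam[M:].mean()                    # (12.46): average discarded eigenvalue
W = U[:, :M] * np.sqrt(lam[:M] - sigma2)   # assumed ML solution with R = I
C = W @ W.T + sigma2 * np.eye(D)           # assumed form of the model covariance

def directional_variance(C, v):
    """Variance v^T C v along the direction v (normalized to unit length)."""
    v = v / np.linalg.norm(v)
    return v @ C @ v

retained = [directional_variance(C, U[:, i]) for i in range(M)]
discarded = [directional_variance(C, U[:, i]) for i in range(M, D)]

print("retained :", retained)    # should reproduce lam[:M]
print("discarded:", discarded)   # should all equal sigma2
```

Along the retained eigenvectors the model variance matches the data eigenvalues, while every discarded direction receives the same value `sigma2`, whatever its own eigenvalue.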

One way to construct the maximum likelihood density model would simply be to find the eigenvectors and eigenvalues of the data covariance matrix and then to evaluate $\mathbf{W}$ and $\sigma^2$ using the results given above. In this case, we would choose $\mathbf{R} = \mathbf{I}$ for convenience. However, if the maximum likelihood solution is found by numerical optimization of the likelihood function, for instance using an algorithm such as conjugate gradients (Fletcher, 1987; Nocedal and Wright, 1999; Bishop and Nabney, 2008) or through the EM algorithm, then the resulting value of $\mathbf{R}$ is essentially arbitrary. This implies that the columns of $\mathbf{W}$ need not be orthogonal. If an orthogonal basis is required, the matrix $\mathbf{W}$ can be post-processed appropriately (Golub and Van Loan, 1996). Alternatively, the EM algorithm can be modified in such a way as to yield orthonormal principal directions, sorted in descending order of the corresponding eigenvalues, directly (Ahn and Oh, 2003).
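
One possible form of such post-processing is a singular value decomposition of $\mathbf{W}$. This is a sketch of one option, not necessarily the procedure the cited references have in mind. It assumes that the fitted matrix has the form $\mathbf{W} = \mathbf{U}_M(\mathbf{L}_M - \sigma^2\mathbf{I})^{1/2}\mathbf{R}$ with an unknown orthogonal $\mathbf{R}$, in which case the left singular vectors of $\mathbf{W}$ are the principal directions (up to sign) and the squared singular values are $\lambda_i - \sigma^2$. The function name `orthonormalize_ppca` and the synthetic data are hypothetical.

```python
import numpy as np

def orthonormalize_ppca(W):
    """Return orthonormal directions spanning the columns of W, plus singular values.

    numpy returns the singular values in descending order, so the directions
    come out sorted by decreasing variance contribution.
    """
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    return U, s

# Build a synthetic W with an arbitrary rotation R (illustrative values only).
rng = np.random.default_rng(1)
D, M = 5, 2
U_true, _ = np.linalg.qr(rng.standard_normal((D, D)))   # orthonormal eigenvectors
lam = np.array([4.0, 2.0, 0.5, 0.3, 0.1])                # decreasing eigenvalues
sigma2 = lam[M:].mean()
R, _ = np.linalg.qr(rng.standard_normal((M, M)))         # arbitrary orthogonal matrix
W = (U_true[:, :M] * np.sqrt(lam[:M] - sigma2)) @ R      # columns generally not orthogonal

U_rec, s = orthonormalize_ppca(W)
lam_rec = s**2 + sigma2                                  # recovered retained eigenvalues

print("W^T W diagonal?   ", np.allclose(W.T @ W, np.diag(np.diag(W.T @ W))))
print("recovered lambdas:", lam_rec)
```

The recovered directions agree with the true eigenvectors only up to a sign flip per column, which is the usual ambiguity of eigenvectors.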