Pattern Recognition and Machine Learning


570 12. CONTINUOUS LATENT VARIABLES


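dimensional centred data matrix, whose $n$th row is given by $(\mathbf{x}_n - \bar{\mathbf{x}})^{\mathrm{T}}$. The covariance matrix (12.3) can then be written as $\mathbf{S} = N^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{X}$, and the corresponding eigenvector equation becomes

$$\frac{1}{N}\mathbf{X}^{\mathrm{T}}\mathbf{X}\mathbf{u}_i = \lambda_i\mathbf{u}_i. \tag{12.26}$$

Now pre-multiply both sides by $\mathbf{X}$ to give

$$\frac{1}{N}\mathbf{X}\mathbf{X}^{\mathrm{T}}(\mathbf{X}\mathbf{u}_i) = \lambda_i(\mathbf{X}\mathbf{u}_i). \tag{12.27}$$

If we now define $\mathbf{v}_i = \mathbf{X}\mathbf{u}_i$, we obtain

$$\frac{1}{N}\mathbf{X}\mathbf{X}^{\mathrm{T}}\mathbf{v}_i = \lambda_i\mathbf{v}_i \tag{12.28}$$

which is an eigenvector equation for the $N \times N$ matrix $N^{-1}\mathbf{X}\mathbf{X}^{\mathrm{T}}$. We see that this has the same $N-1$ eigenvalues as the original covariance matrix (which itself has an additional $D - N + 1$ eigenvalues of value zero). Thus we can solve the eigenvector problem in spaces of lower dimensionality with computational cost $O(N^3)$ instead of $O(D^3)$. In order to determine the eigenvectors, we multiply both sides of (12.28) by $\mathbf{X}^{\mathrm{T}}$ to give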

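$$\left(\frac{1}{N}\mathbf{X}^{\mathrm{T}}\mathbf{X}\right)(\mathbf{X}^{\mathrm{T}}\mathbf{v}_i) = \lambda_i(\mathbf{X}^{\mathrm{T}}\mathbf{v}_i) \tag{12.29}$$

from which we see that $(\mathbf{X}^{\mathrm{T}}\mathbf{v}_i)$ is an eigenvector of $\mathbf{S}$ with eigenvalue $\lambda_i$. Note, however, that these eigenvectors need not be normalized. To determine the appropriate normalization, we re-scale $\mathbf{u}_i \propto \mathbf{X}^{\mathrm{T}}\mathbf{v}_i$ by a constant such that $\|\mathbf{u}_i\| = 1$, which, assuming $\mathbf{v}_i$ has been normalized to unit length, gives

$$\mathbf{u}_i = \frac{1}{(N\lambda_i)^{1/2}}\mathbf{X}^{\mathrm{T}}\mathbf{v}_i. \tag{12.30}$$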
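The constant in (12.30) is stated without derivation. A short check, sketched here under the text's assumption that $\mathbf{v}_i$ has unit length and the additional assumption that $\lambda_i > 0$, uses (12.28) to evaluate the squared length of $\mathbf{X}^{\mathrm{T}}\mathbf{v}_i$:

```latex
\begin{align*}
\|\mathbf{X}^{\mathrm{T}}\mathbf{v}_i\|^2
  &= \mathbf{v}_i^{\mathrm{T}}\mathbf{X}\mathbf{X}^{\mathrm{T}}\mathbf{v}_i \\
  &= \mathbf{v}_i^{\mathrm{T}}(N\lambda_i\mathbf{v}_i) && \text{by (12.28)} \\
  &= N\lambda_i && \text{since } \|\mathbf{v}_i\| = 1 .
\end{align*}
```

Dividing $\mathbf{X}^{\mathrm{T}}\mathbf{v}_i$ by $(N\lambda_i)^{1/2}$ therefore yields a vector of unit length. Directions with $\lambda_i = 0$ cannot be recovered this way, since $\mathbf{X}^{\mathrm{T}}\mathbf{v}_i$ is then the zero vector.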

In summary, to apply this approach we first evaluate $\mathbf{X}\mathbf{X}^{\mathrm{T}}$, then find its eigenvectors and eigenvalues, and then compute the eigenvectors in the original data space using (12.30).
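The procedure summarized above can be sketched in NumPy. This is a minimal illustration, not code from the text: the function name `pca_small_n`, the random test data, and the choice of `np.linalg.eigh` as the symmetric eigensolver are assumptions made for the example, and it presumes that the requested eigenvalues are strictly positive.

```python
import numpy as np


def pca_small_n(X_raw, num_components):
    """PCA via the N x N matrix (1/N) X X^T, intended for N < D.

    X_raw: array of shape (N, D), one data point per row.
    Returns (eigenvalues, U), where the columns of U are unit-norm
    eigenvectors of S = (1/N) X^T X. (Hypothetical helper for illustration.)
    """
    N = X_raw.shape[0]
    X = X_raw - X_raw.mean(axis=0)      # centred data matrix
    K = (X @ X.T) / N                   # N x N matrix appearing in (12.28)
    lam, V = np.linalg.eigh(K)          # ascending eigenvalues, unit-norm columns v_i
    order = np.argsort(lam)[::-1][:num_components]
    lam, V = lam[order], V[:, order]    # keep the largest eigenvalues
    U = (X.T @ V) / np.sqrt(N * lam)    # map back to data space, equation (12.30)
    return lam, U


rng = np.random.default_rng(0)          # arbitrary seed, illustration only
X_raw = rng.normal(size=(5, 50))        # N = 5 points in D = 50 dimensions
lam, U = pca_small_n(X_raw, num_components=3)

# Check against the direct D x D eigenvector equation (12.26).
Xc = X_raw - X_raw.mean(axis=0)
S = (Xc.T @ Xc) / X_raw.shape[0]
print(np.allclose(S @ U, U * lam))                   # S u_i = lambda_i u_i
print(np.allclose(np.linalg.norm(U, axis=0), 1.0))   # each u_i has unit length
```

Here the eigendecomposition is performed on a 5 x 5 matrix rather than a 50 x 50 one, which is the saving in cost from $O(D^3)$ to $O(N^3)$ described in the text. Only three components are requested because the centred data matrix has at most $N - 1 = 4$ nonzero eigenvalues.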

12.2. Probabilistic PCA


The formulation of PCA discussed in the previous section was based on a linear
projection of the data onto a subspace of lower dimensionality than the original data
space. We now show that PCA can also be expressed as the maximum likelihood
solution of a probabilistic latent variable model. This reformulation of PCA, known
as probabilistic PCA, brings several advantages compared with conventional PCA:


  • Probabilistic PCA represents a constrained form of the Gaussian distribution
    in which the number of free parameters can be restricted while still allowing
    the model to capture the dominant correlations in a data set.
