Pattern Recognition and Machine Learning

568 12. CONTINUOUS LATENT VARIABLES

Figure 12.6 Illustration of the effects of linear pre-processing applied to the Old Faithful data set. The plot on the left shows the original data. The centre plot shows the result of standardizing the individual variables to zero mean and unit variance. Also shown are the principal axes of this normalized data set, plotted over the range ±λ_i^{1/2}. The plot on the right shows the result of whitening of the data to give it zero mean and unit covariance.
where L is a D × D diagonal matrix with elements λ_i, and U is a D × D orthogonal matrix with columns given by u_i. Then we define, for each data point x_n, a transformed value given by

y_n = L^{-1/2} U^T (x_n − x̄)    (12.24)


where x̄ is the sample mean defined by (12.1). Clearly, the set {y_n} has zero mean, and its covariance is given by the identity matrix because

(1/N) Σ_{n=1}^N y_n y_n^T = (1/N) Σ_{n=1}^N L^{-1/2} U^T (x_n − x̄)(x_n − x̄)^T U L^{-1/2}
= L^{-1/2} U^T S U L^{-1/2} = L^{-1/2} L L^{-1/2} = I.    (12.25)


This operation is known as whitening or sphereing the data and is illustrated for the Old Faithful data set in Figure 12.6.
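The whitening transform of (12.24) and its unit-covariance property (12.25) can be checked numerically. The following sketch (not from the book; the synthetic correlated data merely stand in for the Old Faithful measurements) builds L and U from an eigendecomposition of the sample covariance and verifies that the transformed set has zero mean and identity covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data with correlated components (stand-in for Old Faithful)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])

xbar = X.mean(axis=0)                      # sample mean x̄, as in (12.1)
S = (X - xbar).T @ (X - xbar) / len(X)     # sample covariance matrix S
lam, U = np.linalg.eigh(S)                 # eigenvalues λ_i and orthogonal U

# Whitened values y_n = L^{-1/2} U^T (x_n - x̄), equation (12.24);
# dividing each column by sqrt(λ_i) applies the diagonal factor L^{-1/2}
Y = (X - xbar) @ U / np.sqrt(lam)

# Covariance of {y_n} is the identity, equation (12.25)
print(np.allclose(Y.T @ Y / len(Y), np.eye(2)))  # True
```

`np.linalg.eigh` is used rather than `eig` because S is symmetric, which guarantees real eigenvalues and an orthogonal U.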
It is interesting to compare PCA with the Fisher linear discriminant which was discussed in Section 4.1.4. Both methods can be viewed as techniques for linear dimensionality reduction. However, PCA is unsupervised and depends only on the values x_n whereas the Fisher linear discriminant also uses class-label information. This difference is highlighted by the example in Figure 12.7.
Another common application of principal component analysis is to data visualization. Here each data point is projected onto a two-dimensional (M = 2) principal subspace, so that a data point x_n is plotted at Cartesian coordinates given by x_n^T u_1 and x_n^T u_2, where u_1 and u_2 are the eigenvectors corresponding to the largest and second largest eigenvalues. An example of such a plot, for the oil flow data set, is shown in Figure 12.8.
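The two-dimensional projection described above can be sketched as follows (again not from the book; synthetic 12-dimensional data stand in for the oil flow data set, and the eigenvector names `u1`, `u2` follow the notation in the text):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic D = 12 data with unequal variances (stand-in for the oil flow data)
X = rng.normal(size=(200, 12)) * np.linspace(1.0, 4.0, 12)

xbar = X.mean(axis=0)
S = (X - xbar).T @ (X - xbar) / len(X)   # sample covariance
lam, U = np.linalg.eigh(S)               # eigenvalues in ascending order

# u1, u2: eigenvectors with the largest and second-largest eigenvalues
u1, u2 = U[:, -1], U[:, -2]

# Cartesian coordinates x_n^T u1 and x_n^T u2 for each data point
coords = np.column_stack([X @ u1, X @ u2])
print(coords.shape)  # (200, 2)
```

Plotting `coords` with any scatter-plot routine then reproduces the kind of visualization shown for the oil flow data in Figure 12.8; by construction the first coordinate carries at least as much variance as the second.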