568 12. CONTINUOUS LATENT VARIABLES
Figure 12.6 Illustration of the effects of linear pre-processing applied to the Old Faithful data set. The plot on the left shows the original data. The centre plot shows the result of standardizing the individual variables to zero mean and unit variance. Also shown are the principal axes of this normalized data set, plotted over the range $\pm\lambda_i^{1/2}$. The plot on the right shows the result of whitening of the data to give it zero mean and unit covariance.
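where $\mathbf{L}$ is a $D \times D$ diagonal matrix with elements $\lambda_i$, and $\mathbf{U}$ is a $D \times D$ orthogonal matrix with columns given by $\mathbf{u}_i$. Then we define, for each data point $\mathbf{x}_n$, a transformed value given by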
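$$
\mathbf{y}_n = \mathbf{L}^{-1/2}\mathbf{U}^{\mathrm{T}}(\mathbf{x}_n - \bar{\mathbf{x}}) \qquad (12.24)
$$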
where $\bar{\mathbf{x}}$ is the sample mean defined by (12.1). Clearly, the set $\{\mathbf{y}_n\}$ has zero mean, and its covariance is given by the identity matrix because
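$$
\frac{1}{N}\sum_{n=1}^{N}\mathbf{y}_n\mathbf{y}_n^{\mathrm{T}}
= \frac{1}{N}\sum_{n=1}^{N}\mathbf{L}^{-1/2}\mathbf{U}^{\mathrm{T}}(\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})^{\mathrm{T}}\mathbf{U}\mathbf{L}^{-1/2}
= \mathbf{L}^{-1/2}\mathbf{U}^{\mathrm{T}}\mathbf{S}\mathbf{U}\mathbf{L}^{-1/2}
= \mathbf{L}^{-1/2}\mathbf{L}\mathbf{L}^{-1/2} = \mathbf{I}. \qquad (12.25)
$$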
This operation is known as whitening or sphering the data and is illustrated for the Old Faithful data set (Appendix A) in Figure 12.6.
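The whitening transformation can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `whiten` and the synthetic data are assumptions made here (the data are a stand-in, not the Old Faithful measurements), and the code assumes that all eigenvalues of the covariance matrix are strictly positive.

```python
import numpy as np

def whiten(X):
    """Whiten the rows of X: y_n = L^{-1/2} U^T (x_n - x_bar).

    Assumes the sample covariance has strictly positive eigenvalues.
    """
    x_bar = X.mean(axis=0)                 # sample mean
    Xc = X - x_bar                         # centred data
    S = Xc.T @ Xc / X.shape[0]             # covariance with 1/N normalization
    lam, U = np.linalg.eigh(S)             # S U = U L, columns of U are u_i
    # Row n of (Xc @ U) is (U^T (x_n - x_bar))^T; dividing column i by
    # sqrt(lambda_i) applies L^{-1/2}.
    return Xc @ U / np.sqrt(lam)

# Synthetic correlated 2-D data (hypothetical stand-in for a real data set).
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0], [1.5, 0.5]])
X = rng.normal(size=(500, 2)) @ A.T + np.array([3.0, 70.0])

Y = whiten(X)
mean_Y = Y.mean(axis=0)                    # should be close to zero
cov_Y = Y.T @ Y / Y.shape[0]               # should be close to the identity
```

Up to floating-point error, `mean_Y` is the zero vector and `cov_Y` is the identity matrix, which is the numerical counterpart of the derivation above.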
It is interesting to compare PCA with the Fisher linear discriminant, which was discussed in Section 4.1.4. Both methods can be viewed as techniques for linear dimensionality reduction. However, PCA is unsupervised and depends only on the values $\mathbf{x}_n$, whereas the Fisher linear discriminant also uses class-label information. This difference is highlighted by the example in Figure 12.7.
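The contrast can be seen numerically on a small synthetic example. The sketch below is an illustration under stated assumptions, not a reproduction of the figure: the two-class data are invented here, and the Fisher direction is computed with the standard two-class form $\mathbf{w} \propto \mathbf{S}_{\mathrm{W}}^{-1}(\mathbf{m}_2 - \mathbf{m}_1)$, where $\mathbf{S}_{\mathrm{W}}$ is the within-class scatter matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two synthetic classes (assumed data): both are elongated along the first
# axis and separated only along the second axis.
cov = np.array([[9.0, 0.0], [0.0, 0.25]])
X1 = rng.multivariate_normal([0.0, 1.0], cov, size=300)
X2 = rng.multivariate_normal([0.0, -1.0], cov, size=300)
X = np.vstack([X1, X2])

# PCA: unsupervised, uses only the pooled data values.
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / X.shape[0]
lam, U = np.linalg.eigh(S)        # eigenvalues in ascending order
u_pca = U[:, -1]                  # direction of largest variance

# Fisher linear discriminant: also uses the class labels.
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
w = np.linalg.solve(S_W, m2 - m1)
w_fisher = w / np.linalg.norm(w)  # unit-length discriminant direction
```

For data of this kind, `u_pca` lies almost along the first axis, where the variance is largest but the classes overlap completely, while `w_fisher` lies almost along the second axis, where the classes are separated.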
Another common application of principal component analysis is to data visualization. Here each data point is projected onto a two-dimensional ($M = 2$) principal subspace, so that a data point $\mathbf{x}_n$ is plotted at Cartesian coordinates given by $\mathbf{x}_n^{\mathrm{T}}\mathbf{u}_1$ and $\mathbf{x}_n^{\mathrm{T}}\mathbf{u}_2$, where $\mathbf{u}_1$ and $\mathbf{u}_2$ are the eigenvectors corresponding to the largest and second largest eigenvalues. An example of such a plot, for the oil flow data set (Appendix A), is shown in Figure 12.8.
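A minimal sketch of this projection is given below. The function name `project_2d` and the synthetic five-dimensional data are assumptions introduced for illustration (they stand in for a real data set such as the oil flow data). The coordinates are computed as $\mathbf{x}_n^{\mathrm{T}}\mathbf{u}_1$ and $\mathbf{x}_n^{\mathrm{T}}\mathbf{u}_2$ exactly as in the text, so the mean is not subtracted before projecting; subtracting it would only shift the plot.

```python
import numpy as np

def project_2d(X):
    """Return the N x 2 coordinates (x_n^T u_1, x_n^T u_2) and the D x 2 basis."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / X.shape[0]            # covariance with 1/N normalization
    lam, U = np.linalg.eigh(S)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:2]     # indices of the two largest eigenvalues
    U2 = U[:, order]                      # columns are u_1 and u_2
    return X @ U2, U2

# Synthetic 5-D data with very different variances per axis (assumed data).
rng = np.random.default_rng(2)
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.1])
X = rng.normal(size=(200, 5)) * scales

Z, U2 = project_2d(X)
# Z[:, 0] and Z[:, 1] can be passed directly to a scatter plot.
```

The variance of the first coordinate is at least as large as that of the second, since the two coordinates correspond to the largest and second largest eigenvalues respectively.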