Pattern Recognition and Machine Learning

12.3. Kernel PCA

So far we have assumed that the projected data set given by $\phi(\mathbf{x}_n)$ has zero mean, which in general will not be the case. We cannot simply compute and then subtract off the mean, since we wish to avoid working directly in feature space, and so again we formulate the algorithm purely in terms of the kernel function. The projected data points after centralizing, denoted $\tilde{\phi}(\mathbf{x}_n)$, are given by

$$\tilde{\phi}(\mathbf{x}_n) = \phi(\mathbf{x}_n) - \frac{1}{N}\sum_{l=1}^{N}\phi(\mathbf{x}_l) \tag{12.83}$$

and the corresponding elements of the Gram matrix are given by

$$\begin{aligned}
\widetilde{K}_{nm} &= \tilde{\phi}(\mathbf{x}_n)^{\mathrm{T}}\tilde{\phi}(\mathbf{x}_m) \\
&= \phi(\mathbf{x}_n)^{\mathrm{T}}\phi(\mathbf{x}_m) - \frac{1}{N}\sum_{l=1}^{N}\phi(\mathbf{x}_n)^{\mathrm{T}}\phi(\mathbf{x}_l) \\
&\quad{} - \frac{1}{N}\sum_{l=1}^{N}\phi(\mathbf{x}_l)^{\mathrm{T}}\phi(\mathbf{x}_m) + \frac{1}{N^2}\sum_{j=1}^{N}\sum_{l=1}^{N}\phi(\mathbf{x}_j)^{\mathrm{T}}\phi(\mathbf{x}_l) \\
&= k(\mathbf{x}_n,\mathbf{x}_m) - \frac{1}{N}\sum_{l=1}^{N}k(\mathbf{x}_l,\mathbf{x}_m) \\
&\quad{} - \frac{1}{N}\sum_{l=1}^{N}k(\mathbf{x}_n,\mathbf{x}_l) + \frac{1}{N^2}\sum_{j=1}^{N}\sum_{l=1}^{N}k(\mathbf{x}_j,\mathbf{x}_l).
\end{aligned} \tag{12.84}$$

This can be expressed in matrix notation as

$$\widetilde{\mathbf{K}} = \mathbf{K} - \mathbf{1}_N\mathbf{K} - \mathbf{K}\mathbf{1}_N + \mathbf{1}_N\mathbf{K}\mathbf{1}_N \tag{12.85}$$

Exercise 12.27


where $\mathbf{1}_N$ denotes the $N \times N$ matrix in which every element takes the value $1/N$.
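To make (12.85) concrete, here is a minimal NumPy sketch of the centering step. The function name center_gram_matrix, and the use of NumPy itself, are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def center_gram_matrix(K):
    """Apply (12.85): K~ = K - 1_N K - K 1_N + 1_N K 1_N,
    where 1_N is the N x N matrix whose every element is 1/N."""
    N = K.shape[0]
    one_N = np.full((N, N), 1.0 / N)
    return K - one_N @ K - K @ one_N + one_N @ K @ one_N
```

Equivalently, the four terms subtract the column means and row means of $\mathbf{K}$ and add back the grand mean, which is how the centering is often implemented in practice.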


Thus we can evaluate $\widetilde{\mathbf{K}}$ using only the kernel function and then use $\widetilde{\mathbf{K}}$ to determine the eigenvalues and eigenvectors. Note that the standard PCA algorithm is recovered as a special case if we use a linear kernel $k(\mathbf{x},\mathbf{x}') = \mathbf{x}^{\mathrm{T}}\mathbf{x}'$. Figure 12.17 shows an example of kernel PCA applied to a synthetic data set (Schölkopf et al., 1998). Here a 'Gaussian' kernel of the form

$$k(\mathbf{x},\mathbf{x}') = \exp\left(-\|\mathbf{x}-\mathbf{x}'\|^2/0.1\right) \tag{12.86}$$


is applied to the data set. The lines correspond to contours along which the projection onto the corresponding principal component, defined by

$$\phi(\mathbf{x})^{\mathrm{T}}\mathbf{v}_i = \sum_{n=1}^{N} a_{in}\,k(\mathbf{x},\mathbf{x}_n) \tag{12.87}$$

is constant.
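The following sketch puts the whole procedure together: it evaluates the Gram matrix under a Gaussian kernel of the form (12.86), centers it via (12.85), solves the eigenproblem for the coefficients $a_{in}$, and evaluates the projection (12.87) at the training points. The names gaussian_kernel and kernel_pca_projections, the value gamma = 10.0 (corresponding to the $1/0.1$ scaling in (12.86)), and the synthetic data are all assumptions made for illustration; note also that, with centered features, the projection strictly uses centred kernel values, which the rows of $\widetilde{\mathbf{K}}$ supply for the training points.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=10.0):
    """Gaussian kernel k(x, x') = exp(-gamma ||x - x'||^2);
    gamma = 10.0 matches the 1/0.1 scaling of (12.86)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kernel_pca_projections(X, n_components):
    """Project the training points onto the leading kernel principal
    components: center the Gram matrix via (12.85), solve the
    eigenproblem for the coefficients a_in, and apply (12.87)."""
    N = X.shape[0]
    K = gaussian_kernel(X, X)
    one_N = np.full((N, N), 1.0 / N)
    K_tilde = K - one_N @ K - K @ one_N + one_N @ K @ one_N  # (12.85)
    # np.linalg.eigh returns eigenvalues in ascending order,
    # so reorder to take the largest n_components.
    eigvals, eigvecs = np.linalg.eigh(K_tilde)
    idx = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]
    # Scale each a_i so that the feature-space eigenvector v_i has
    # unit norm, i.e. a_i^T K~ a_i = 1 (assumes positive eigenvalues).
    A = eigvecs / np.sqrt(eigvals)
    # Rows of K_tilde hold the centred kernel values for the training
    # points, so this evaluates the sum in (12.87) at each of them.
    return K_tilde @ A

# Illustrative usage on a synthetic two-dimensional data set.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
Y = kernel_pca_projections(X, n_components=8)
print(Y.shape)  # (100, 8)
```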