Pattern Recognition and Machine Learning

12. Continuous Latent Variables


$\lambda = g_{\mathbf{f}}(\mathbf{x})$ because it depends on the particular curve $\mathbf{f}(\lambda)$. For a continuous data density $p(\mathbf{x})$, a principal curve is defined as one for which every point on the curve is the mean of all those points in data space that project to it, so that

$$\mathbb{E}\left[\mathbf{x} \mid g_{\mathbf{f}}(\mathbf{x}) = \lambda\right] = \mathbf{f}(\lambda). \qquad (12.92)$$


For a given continuous density, there can be many principal curves. In practice, we are interested in finite data sets, and we also wish to restrict attention to smooth curves. Hastie and Stuetzle (1989) propose a two-stage iterative procedure for finding such principal curves, somewhat reminiscent of the EM algorithm for PCA. The curve is initialized using the first principal component, and then the algorithm alternates between a data projection step and a curve re-estimation step. In the projection step, each data point is assigned to a value of $\lambda$ corresponding to the closest point on the curve. Then in the re-estimation step, each point on the curve is given by a weighted average of those points that project to nearby points on the curve, with points closest on the curve given the greatest weight. In the case where the subspace is constrained to be linear, the procedure converges to the first principal component and is equivalent to the power method for finding the largest eigenvector of the covariance matrix. Principal curves can be generalized to multidimensional manifolds called principal surfaces, although these have found limited use due to the difficulty of data smoothing in higher dimensions, even for two-dimensional manifolds.
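To make the two-stage procedure concrete, here is a minimal sketch in Python/NumPy. It rests on some illustrative assumptions not fixed by the text: the curve is discretized into K points, the local averaging uses a Gaussian kernel of bandwidth h, and the arc-length re-parametrization step of the full Hastie-Stuetzle algorithm is omitted for brevity.

```python
# A minimal sketch of the Hastie-Stuetzle two-stage procedure.
# K, h, and the Gaussian kernel are illustrative choices.
import numpy as np

def principal_curve(X, K=50, h=0.1, n_iters=20):
    # Initialize the curve along the first principal component.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    lam_grid = np.linspace(-1.0, 1.0, K)
    scale = Xc @ Vt[0]                      # projections onto the first PC
    f = X.mean(axis=0) + np.outer(lam_grid * scale.std() * 2, Vt[0])

    for _ in range(n_iters):
        # Projection step: assign each data point the lambda of its
        # closest point on the (discretized) curve.
        d2 = ((X[:, None, :] - f[None, :, :]) ** 2).sum(axis=2)
        lam = lam_grid[d2.argmin(axis=1)]

        # Re-estimation step: each curve point becomes a weighted average
        # of the data points projecting nearby, with points closest in
        # lambda given the greatest weight.
        w = np.exp(-0.5 * ((lam[None, :] - lam_grid[:, None]) / h) ** 2)
        f = (w @ X) / w.sum(axis=1, keepdims=True)
    return lam_grid, f
```

Larger values of h give smoother curves, mirroring the restriction to smooth curves mentioned above.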
PCA is often used to project a data set onto a lower-dimensional space, for example two-dimensional, for the purposes of visualization. Another linear technique with a similar aim is multidimensional scaling, or MDS (Cox and Cox, 2000). It finds a low-dimensional projection of the data such as to preserve, as closely as possible, the pairwise distances between data points, and involves finding the eigenvectors of the distance matrix. In the case where the distances are Euclidean, it gives equivalent results to PCA. The MDS concept can be extended to a wide variety of data types specified in terms of a similarity matrix, giving nonmetric MDS.
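As an illustration of the eigenvector computation involved, the following is a minimal sketch of classical (metric) MDS. The standard formulation double-centers the squared-distance matrix before the eigendecomposition; the function name classical_mds is just a placeholder.

```python
# A minimal sketch of classical (metric) MDS, assuming a matrix D of
# pairwise distances; the double-centering construction is standard.
import numpy as np

def classical_mds(D, n_components=2):
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    L = np.sqrt(np.maximum(eigvals[idx], 0.0))
    return eigvecs[:, idx] * L                 # embedding coordinates
```

When D holds Euclidean distances, the recovered coordinates agree with the PCA projection up to rotation and sign, consistent with the equivalence noted above.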
Two other nonprobabilistic methods for dimensionality reduction and data visualization are worthy of mention. Locally linear embedding, or LLE (Roweis and Saul, 2000), first computes the set of coefficients that best reconstructs each data point from its neighbours. These coefficients are arranged to be invariant to rotations, translations, and scalings of that data point and its neighbours, and hence they characterize the local geometrical properties of the neighbourhood. LLE then maps the high-dimensional data points down to a lower-dimensional space while preserving these neighbourhood coefficients. If the local neighbourhood for a particular data point can be considered linear, then the transformation can be achieved using a combination of translation, rotation, and scaling, such as to preserve the angles formed between the data points and their neighbours. Because the weights are invariant to these transformations, we expect the same weight values to reconstruct the data points in the low-dimensional space as in the high-dimensional data space. In spite of the nonlinearity, the optimization for LLE does not exhibit local minima.
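Here is a minimal sketch of the two LLE stages described above, assuming a k-nearest-neighbour definition of the neighbourhood and a small regularizer on the local Gram matrices; both are illustrative choices rather than details given in the text.

```python
# A minimal sketch of LLE: local reconstruction weights, then an
# embedding that preserves them.
import numpy as np

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    N = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.zeros((N, N))
    for i in range(N):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]   # skip the point itself
        Z = X[nbrs] - X[i]                            # local coordinates
        G = Z @ Z.T
        G += reg * np.trace(G) * np.eye(n_neighbors)  # regularize for stability
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()                      # weights sum to one

    # Embedding: bottom eigenvectors of (I - W)^T (I - W), discarding the
    # constant eigenvector whose eigenvalue is (numerically) zero.
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:n_components + 1]
```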
In isometric feature mapping, or isomap (Tenenbaum et al., 2000), the goal is to project the data to a lower-dimensional space using MDS, but where the dissimilarities are defined in terms of the geodesic distances measured along the manifold.
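A minimal sketch of this idea, reusing the classical_mds function from the MDS sketch above: geodesic distances are approximated by shortest paths through a k-nearest-neighbour graph. Floyd-Warshall is used here purely for brevity, and the neighbourhood graph is assumed to be connected.

```python
# A minimal isomap sketch: shortest-path distances on a k-NN graph,
# followed by classical MDS (classical_mds defined earlier).
import numpy as np

def isomap(X, n_neighbors=10, n_components=2):
    N = X.shape[0]
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Keep only edges to each point's k nearest neighbours.
    G = np.full((N, N), np.inf)
    for i in range(N):
        nbrs = np.argsort(d[i])[1:n_neighbors + 1]
        G[i, nbrs] = d[i, nbrs]
    G = np.minimum(G, G.T)                 # symmetrize the graph
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall shortest paths approximate the geodesic distances.
    for k in range(N):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    return classical_mds(G, n_components)  # MDS on geodesic distances
```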