Pattern Recognition and Machine Learning

12. CONTINUOUS LATENT VARIABLES

Figure 12.1 A synthetic data set obtained by taking one of the off-line digit images and
creating multiple copies in each of which the digit has undergone a random displacement
and rotation within some larger image field. The resulting images each have
100 × 100 = 10,000 pixels.

that the manifold will be nonlinear because, for instance, if we translate the digit
past a particular pixel, that pixel value will go from zero (white) to one (black) and
back to zero again, which is clearly a nonlinear function of the digit position. In
this example, the translation and rotation parameters are latent variables because we
observe only the image vectors and are not told which values of the translation or
rotation variables were used to create them.
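
The zero-to-one-to-zero behaviour described above can be checked on a toy example. The following is a minimal sketch, not the procedure used to build the data set in Figure 12.1: it assumes a one-dimensional "image" containing a bar of black pixels, and the names `render_bar`, `watched_pixel`, and `shifts` are hypothetical.

```python
import numpy as np

def render_bar(shift, width=3, size=20):
    """Return a 1-D image of zeros (white) with a bar of ones (black) starting at `shift`."""
    image = np.zeros(size)
    image[shift:shift + width] = 1.0
    return image

# Track a single pixel while the bar is translated across the image.
watched_pixel = 10
shifts = np.arange(0, 17)
values = np.array([render_bar(s)[watched_pixel] for s in shifts])

# A linear function of the shift would have zero second differences everywhere.
second_differences = np.diff(values, n=2)
is_nonlinear = bool(np.any(second_differences != 0))

print(values)
print("nonlinear in the shift:", is_nonlinear)
```

The watched pixel is white before the bar arrives, black while the bar covers it, and white again afterwards, so no straight line in the shift parameter can reproduce its value.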
For real digit image data, there will be a further degree of freedom arising from
scaling. Moreover there will be multiple additional degrees of freedom associated
with more complex deformations due to the variability in an individual's writing
as well as the differences in writing styles between individuals. Nevertheless, the
number of such degrees of freedom will be small compared to the dimensionality of
the data set.
Another example is provided by the oil flow data set (Appendix A), in which (for a given
geometrical configuration of the gas, water, and oil phases) there are only two degrees
of freedom of variability corresponding to the fraction of oil in the pipe and the
fraction of water (the fraction of gas then being determined). Although the data space
comprises 12 measurements, a data set of points will lie close to a two-dimensional
manifold embedded within this space. In this case, the manifold comprises several
distinct segments corresponding to different flow regimes, each such segment being
a (noisy) continuous two-dimensional manifold. If our goal is data compression, or
density modelling, then there can be benefits in exploiting this manifold structure.
In practice, the data points will not be confined precisely to a smooth
low-dimensional manifold, and we can interpret the departures of data points from the
manifold as 'noise'. This leads naturally to a generative view of such models in
which we first select a point within the manifold according to some latent variable
distribution and then generate an observed data point by adding noise, drawn from
some conditional distribution of the data variables given the latent variables.
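
The two-step generative view can be sketched in a few lines of code. This is an illustration under stated assumptions rather than a model taken from the text: the latent variable is taken to be a one-dimensional standard Gaussian, the manifold is a hypothetical curve in three dimensions defined by `embed`, and the noise is isotropic Gaussian with an arbitrarily chosen standard deviation `noise_std`.

```python
import numpy as np

def embed(z):
    """Hypothetical smooth map from a 1-D latent value to a curve in 3-D data space."""
    return np.stack([np.cos(z), np.sin(z), 0.5 * z], axis=-1)

def sample(n, noise_std=0.05, seed=0):
    """Draw n observations: latent draw, point on the manifold, then additive noise."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)                # step 1: sample the latent variable
    on_manifold = embed(z)                    # corresponding point within the manifold
    noise = noise_std * rng.standard_normal(on_manifold.shape)
    x = on_manifold + noise                   # step 2: add noise to get the observation
    return z, on_manifold, x

z, on_manifold, x = sample(500)
residual = x - on_manifold                    # departure of each data point from the manifold
print(x.shape, residual.std())
```

Only `x` would be available in a real data set; `z` and `on_manifold` are returned here purely so that the departures from the manifold can be inspected.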
The simplest continuous latent variable model assumes Gaussian distributions
for both the latent and observed variables and makes use of a linear-Gaussian
dependence (Section 8.1.4) of the observed variables on the state of the latent
variables. This leads to a probabilistic formulation of the well-known technique of
principal component analysis (PCA), as well as to a related model called factor analysis.
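
Written out, a linear-Gaussian latent variable model of this kind takes the following form. The symbols are assumptions introduced here for illustration, since this passage does not fix any notation: z is the latent vector, x the observed vector, W a matrix of linear weights, μ an offset, and Σ the noise covariance.

```latex
\begin{align}
p(\mathbf{z}) &= \mathcal{N}(\mathbf{z} \mid \mathbf{0}, \mathbf{I}) \\
p(\mathbf{x} \mid \mathbf{z}) &= \mathcal{N}(\mathbf{x} \mid \mathbf{W}\mathbf{z} + \boldsymbol{\mu}, \boldsymbol{\Sigma}) \\
p(\mathbf{x}) &= \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \mathbf{W}\mathbf{W}^{\mathrm{T}} + \boldsymbol{\Sigma})
\end{align}
```

The third line follows from the first two because a linear function of a Gaussian variable plus independent Gaussian noise is again Gaussian. Choosing an isotropic noise covariance, Σ = σ²I, gives the probabilistic form of PCA, while a general diagonal Σ gives factor analysis.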
In this chapter we will begin with a standard, nonprobabilistic treatment of PCA
(Section 12.1), and then we show how PCA arises naturally as the maximum likelihood
solution to
