Pattern Recognition and Machine Learning

(Jeff_L) #1

Chapter 5


ChapterJJ


12.4.NonlinearLatentVariableModels 597

fold. Forinstance,if twopointslieona circle,thenthegeodesicisthearc-length
distancemeasuredaroundthecircumferenceofthecirclenotthestraightlinedis-
tancemeasuredalongthechordconnectingthem. Thealgorithmfirstdefinesthe
neighbourhoodforeachdatapoint,eitherbyfindingtheKnearestneighboursorby
findingallpointswithina sphereofradiusE. Agraphisthenconstructedbylink-
ingallneighbouringpointsandlabellingthemwiththeirEuclideandistance. The
geodesicdistancebetweenanypairofpointsisthenapproximatedbythesumof
thearclengthsalongtheshortestpathconnectingthem(whichitselfis foundusing
standardalgorithms).Finally,metricMDSis appliedtothegeodesicdistancematrix
tofindthelow-dimensionalprojection.
Ourfocus inthischapterhasbeenon modelsforwhichthe observedvari-
ablesarecontinuous. Wecanalsoconsidermodelshavingcontinuouslatentvari-
ablestogetherwithdiscreteobservedvariables,givingrisetolatenttraitmodels
(Bartholomew,1987). Inthiscase,themarginalizationoverthecontinuouslatent
variables,evenfora linearrelationshipbetweenlatentandobservedvariables, can-
notbeperformedanalytically,andsomoresophisticatedtechniquesarerequired.
Tipping(1999)usesvariationalinferenceina modelwitha two-dimensionallatent
space,allowinga binarydatasettobevisualizedanalogouslytotheuseofPCAto
visualizecontinuousdata. NotethatthismodelisthedualoftheBayesianlogistic
regressionproblemdiscussedinSection4.5. Inthecaseoflogisticregressionwe
haveN observationsofthefeaturevector<l>nwhichareparameterizedbya single
parametervectorw,whereasinthelatentspacevisualizationmodelthereisa single
latentspacevariablex(analogousto<1» andNcopiesofthelatentvariableWn.A
generalizationofprobabilisticlatentvariablemodelstogeneralexponentialfamily
distributionsisdescribedinCollinsetal.(2002).
Wehavealreadynotedthatanarbitrarydistributioncanbeformedbytakinga
Gaussianrandomvariableandtransformingit througha suitablenonlinearity. This
isexploitedina generallatentvariablemodelcalledadensitynetwork(MacKay,
1995;MacKayandGibbs,1999)inwhichthenonlinearfunctionisgovernedbya

multilayeredneuralnetwork.Ifthenetworkhasenoughhiddenunits,it canapprox-


imatea givennonlinearfunctiontoanydesiredaccuracy.Thedownsideofhaving
sucha flexiblemodelis thatthemarginalizationoverthelatentvariables,requiredin
ordertoobtainthelikelihoodfunction,isnolongeranalyticallytractable. Instead,
thelikelihoodisapproximatedusingMonteCarlotechniquesbydrawingsamples
fromtheGaussianprior.Themarginalizationoverthelatentvariablesthenbecomes
a simplesumwithonetermforeachsample. However,becausea largenumber
ofsamplepointsmayberequiredinordertogiveanaccuraterepresentationofthe
marginal,thisprocedurecanbecomputationallycostly.

Ifweconsidermorerestrictedformsforthenonlinearfunction,andmakeanap-


propriatechoiceofthelatentvariabledistribution,thenwecanconstructa latentvari-
ablemodelthatisbothnonlinearandefficienttotrain.Thegenerativetopographic
mapping,orGTM(BishopetaI., 1996;BishopetaI.,1997a;BishopetaI.,1998b)
usesa latentdistributionthatis definedbya finiteregulargridofdeltafunctionsover
the(typicallytwo-dimensional)latentspace.Marginalizationoverthelatentspace
thensimplyinvolvessummingoverthecontributionsfromeachofthegridlocations.
Free download pdf