Pattern Recognition and Machine Learning


574    12. CONTINUOUS LATENT VARIABLES


Figure 12.10  The probabilistic PCA model for a data set of N observations of x can be expressed as a directed graph in which each observation x_n is associated with a value z_n of the latent variable.

[Directed graph omitted: latent variable z_n and observation x_n inside a plate over N, with parameter W.]
12.2.1 Maximum likelihood PCA


We next consider the determination of the model parameters using maximum likelihood. Given a data set X = {x_n} of observed data points, the probabilistic PCA model can be expressed as a directed graph, as shown in Figure 12.10. The corresponding log likelihood function is given, from (12.35), by

\ln p(\mathbf{X}|\boldsymbol{\mu},\mathbf{W},\sigma^2) = \sum_{n=1}^{N} \ln p(\mathbf{x}_n|\mathbf{W},\boldsymbol{\mu},\sigma^2)
  = -\frac{ND}{2}\ln(2\pi) - \frac{N}{2}\ln|\mathbf{C}| - \frac{1}{2}\sum_{n=1}^{N}(\mathbf{x}_n-\boldsymbol{\mu})^{\mathrm{T}}\mathbf{C}^{-1}(\mathbf{x}_n-\boldsymbol{\mu}).   (12.43)

Setting the derivative of the log likelihood with respect to μ equal to zero gives the expected result μ = x̄ where x̄ is the data mean defined by (12.1). Back-substituting we can then write the log likelihood function in the form

\ln p(\mathbf{X}|\mathbf{W},\boldsymbol{\mu},\sigma^2) = -\frac{N}{2}\left\{ D\ln(2\pi) + \ln|\mathbf{C}| + \mathrm{Tr}\left(\mathbf{C}^{-1}\mathbf{S}\right) \right\}   (12.44)

where S is the data covariance matrix defined by (12.3). Because the log likelihood is a quadratic function of μ, this solution represents the unique maximum, as can be confirmed by computing second derivatives.
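As a concrete check of (12.44), the log likelihood can be evaluated numerically once μ is fixed at the sample mean. The following minimal sketch (the function name `ppca_log_likelihood` and the use of NumPy are assumptions, not part of the text) uses the model covariance C = WWᵀ + σ²I defined earlier in the chapter:

```python
import numpy as np

def ppca_log_likelihood(X, W, sigma2):
    """Log likelihood (12.44) of probabilistic PCA, with mu fixed at its
    maximum-likelihood value, the sample mean (hypothetical helper)."""
    N, D = X.shape
    C = W @ W.T + sigma2 * np.eye(D)        # model covariance C = W W^T + sigma^2 I
    S = np.cov(X, rowvar=False, bias=True)  # data covariance S with 1/N normalisation
    _, logdetC = np.linalg.slogdet(C)       # numerically stable ln|C|
    return -0.5 * N * (D * np.log(2 * np.pi) + logdetC
                       + np.trace(np.linalg.solve(C, S)))
```

Because Tr(C⁻¹S) equals the average of the quadratic terms in (12.43) when μ = x̄, this matches a direct evaluation of the per-point sum.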
Maximization with respect to W and σ² is more complex but nonetheless has an exact closed-form solution. It was shown by Tipping and Bishop (1999b) that all of the stationary points of the log likelihood function can be written as

\mathbf{W}_{\mathrm{ML}} = \mathbf{U}_M\left(\mathbf{L}_M - \sigma^2\mathbf{I}\right)^{1/2}\mathbf{R}   (12.45)

where U_M is a D × M matrix whose columns are given by any subset (of size M) of the eigenvectors of the data covariance matrix S, the M × M diagonal matrix L_M has elements given by the corresponding eigenvalues λ_i, and R is an arbitrary M × M orthogonal matrix.
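The appearance of an arbitrary orthogonal matrix R in (12.45) reflects the fact that the likelihood depends on W only through C = WWᵀ + σ²I, which is unchanged when W is post-multiplied by any orthogonal R. A small numerical illustration (the QR-based construction of a random orthogonal matrix is an assumption made for this demo):

```python
import numpy as np

# C = W W^T + sigma^2 I is invariant under W -> W R for orthogonal R,
# since (W R)(W R)^T = W (R R^T) W^T = W W^T.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 2))
R, _ = np.linalg.qr(rng.normal(size=(2, 2)))  # Q factor is orthogonal
assert np.allclose(W @ W.T, (W @ R) @ (W @ R).T)
```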
Furthermore, Tipping and Bishop (1999b) showed that the maximum of the likelihood function is obtained when the M eigenvectors are chosen to be those whose eigenvalues are the M largest (all other solutions being saddle points). A similar result was conjectured independently by Roweis (1998), although no proof was given.
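The maximizing solution (12.45) can be implemented directly from an eigendecomposition of S. The sketch below (the function name `ppca_w_ml` is hypothetical; σ² is taken as given, since its own maximum-likelihood value is derived later in the text) selects the M largest eigenvalues and sets R = I, which the result above says is as good as any orthogonal choice:

```python
import numpy as np

def ppca_w_ml(S, M, sigma2):
    """Maximum-likelihood W from (12.45) with R = I:
    W_ML = U_M (L_M - sigma^2 I)^{1/2}, using the top-M eigenvectors of S."""
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:M]  # indices of the M largest eigenvalues
    U_M = eigvecs[:, order]                # D x M principal eigenvectors
    L_M = eigvals[order]                   # corresponding eigenvalues
    # requires sigma2 <= smallest retained eigenvalue for a real square root
    return U_M @ np.diag(np.sqrt(L_M - sigma2))
```

With this W, the model covariance C = WWᵀ + σ²I reproduces the top-M eigenvalues of S exactly and replaces the remaining D − M eigenvalues by σ².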