Pattern Recognition and Machine Learning

638 13. SEQUENTIAL DATA

13.3.1 Inference in LDS


We now turn to the problem of finding the marginal distributions for the latent
variables conditional on the observation sequence. For given parameter settings, we
also wish to make predictions of the next latent state $z_n$ and of the next observation
$x_n$ conditioned on the observed data $x_1, \ldots, x_{n-1}$ for use in real-time applications.
These inference problems can be solved efficiently using the sum-product algorithm,
which in the context of the linear dynamical system gives rise to the Kalman filter
and Kalman smoother equations.
It is worth emphasizing that because the linear dynamical system is a linear-
Gaussian model, the joint distribution over all latent and observed variables is simply
a Gaussian, and so in principle we could solve inference problems by using the
standard results derived in previous chapters for the marginals and conditionals of a
multivariate Gaussian. The role of the sum-product algorithm is to provide a more
efficient way to perform such computations.
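To make this point concrete, the following sketch conditions a small joint Gaussian directly, using the standard partitioned-Gaussian results from Chapter 2. All variable names and numerical values here are our own illustrative assumptions, not taken from the text.

```python
import numpy as np

# Illustrative sketch: for a joint Gaussian over (a, b), the conditional
# p(a | b = x_b) follows from the standard partitioned-Gaussian formulas:
#   mean:       mu_a + Sigma_ab Sigma_bb^{-1} (x_b - mu_b)
#   covariance: Sigma_aa - Sigma_ab Sigma_bb^{-1} Sigma_ba

# Joint mean and covariance, partitioned into (a, b). Values are made up.
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

mu_a, mu_b = mu[0], mu[1]
S_aa, S_ab, S_bb = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]

x_b = 2.0  # observed value of b

# Exact conditional mean and variance of a given b = x_b
mu_cond = mu_a + S_ab / S_bb * (x_b - mu_b)
S_cond = S_aa - S_ab / S_bb * S_ab

print(round(mu_cond, 3), round(S_cond, 3))  # 0.8 1.36
```

In principle the same direct conditioning applies to the full joint Gaussian over all latent and observed variables of the linear dynamical system; the recursions below simply organize this computation efficiently.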
Linear dynamical systems have the identical factorization, given by (13.6), to
hidden Markov models, and are again described by the factor graphs in Figures 13.14
and 13.15. Inference algorithms therefore take precisely the same form except that
summations over latent variables are replaced by integrations. We begin by
considering the forward equations in which we treat $z_N$ as the root node, and propagate
messages from the leaf node $h(z_1)$ to the root. From (13.77), the initial message will
be Gaussian, and because each of the factors is Gaussian, all subsequent messages
will also be Gaussian. By convention, we shall propagate messages that are
normalized marginal distributions corresponding to $p(z_n \mid x_1, \ldots, x_n)$, which we denote
by

$$\widehat{\alpha}(z_n) = \mathcal{N}(z_n \mid \mu_n, V_n). \tag{13.84}$$

This is precisely analogous to the propagation of scaled variables $\widehat{\alpha}(z_n)$ given by
(13.59) in the discrete case of the hidden Markov model, and so the recursion
equation now takes the form

$$c_n \widehat{\alpha}(z_n) = p(x_n \mid z_n) \int \widehat{\alpha}(z_{n-1})\, p(z_n \mid z_{n-1})\, \mathrm{d}z_{n-1}. \tag{13.85}$$

Substituting for the conditionals $p(z_n \mid z_{n-1})$ and $p(x_n \mid z_n)$, using (13.75) and (13.76),
respectively, and making use of (13.84), we see that (13.85) becomes

$$c_n \mathcal{N}(z_n \mid \mu_n, V_n) = \mathcal{N}(x_n \mid C z_n, \Sigma) \int \mathcal{N}(z_n \mid A z_{n-1}, \Gamma)\, \mathcal{N}(z_{n-1} \mid \mu_{n-1}, V_{n-1})\, \mathrm{d}z_{n-1}. \tag{13.86}$$

Here we are supposing that $\mu_{n-1}$ and $V_{n-1}$ are known, and by evaluating the
integral in (13.86), we wish to determine values for $\mu_n$ and $V_n$. The integral is easily
evaluated by making use of the result (2.115), from which it follows that

$$\int \mathcal{N}(z_n \mid A z_{n-1}, \Gamma)\, \mathcal{N}(z_{n-1} \mid \mu_{n-1}, V_{n-1})\, \mathrm{d}z_{n-1} = \mathcal{N}(z_n \mid A\mu_{n-1}, P_{n-1}) \tag{13.87}$$

where $P_{n-1} = A V_{n-1} A^{\mathrm{T}} + \Gamma$.
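As a minimal sketch of where this recursion leads, the code below performs one predict/update step: it evaluates the integral as in (13.87) to obtain the predictive Gaussian $\mathcal{N}(z_n \mid A\mu_{n-1}, P_{n-1})$ with $P_{n-1} = A V_{n-1} A^{\mathrm{T}} + \Gamma$, then multiplies by the emission density and renormalizes. The gain-based form of the update is the standard completion-of-the-square result; all matrices and numbers here are illustrative assumptions, not values from the text.

```python
import numpy as np

# One forward step of the Kalman filter recursion implied by (13.86)-(13.87).
# All parameter values below are illustrative.

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # transition matrix A
Gamma = 0.1 * np.eye(2)             # transition noise covariance
C = np.array([[1.0, 0.0]])          # emission matrix C
Sigma = np.array([[0.5]])           # emission noise covariance

mu_prev = np.array([0.0, 1.0])      # mu_{n-1}
V_prev = np.eye(2)                  # V_{n-1}
x_n = np.array([1.5])               # current observation x_n

# Prediction: the integral in (13.86) gives N(z_n | A mu_{n-1}, P_{n-1})
mu_pred = A @ mu_prev               # A mu_{n-1}
P_prev = A @ V_prev @ A.T + Gamma   # P_{n-1} = A V_{n-1} A^T + Gamma

# Update: multiply by N(x_n | C z_n, Sigma) and renormalize, via the
# standard Kalman-gain form of the completed square
K = P_prev @ C.T @ np.linalg.inv(C @ P_prev @ C.T + Sigma)
mu_n = mu_pred + K @ (x_n - C @ mu_pred)
V_n = (np.eye(2) - K @ C) @ P_prev
```

Iterating this step over $n = 1, \ldots, N$ produces the filtered marginals $\widehat{\alpha}(z_n) = \mathcal{N}(z_n \mid \mu_n, V_n)$ of (13.84).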