634 13. SEQUENTIAL DATA
Figure 13.19 A factorial hidden Markov model com-
prising two Markov chains of latent vari-
ables. For continuous observed variables
x, one possible choice of emission model
is a linear-Gaussian density in which the
mean of the Gaussian is a linear combi-
nation of the states of the corresponding
latent variables.
z
(1)
n− 1 z(1)n z
(1)
n+1
z(2)n− 1 z(2)n z(2)n+1
xn− 1 xn xn+1
Markov chains of latent variables, and the distribution of the observed variable at
a given time step is conditional on the states of all of the corresponding latent vari-
ables at that same time step. Figure 13.19 shows the corresponding graphical model.
The motivation for considering factorial HMM can be seen by noting that in order to
represent, say, 10 bits of information at a given time step, a standard HMM would
needK=2^10 = 1024latent states, whereas a factorial HMM could make use of 10
binary latent chains. The primary disadvantage of factorial HMMs, however, lies in
the additional complexity of training them. The M step for the factorial HMM model
is straightforward. However, observation of thexvariables introduces dependencies
between the latent chains, leading to difficulties with the E step. This can be seen
by noting that in Figure 13.19, the variablesz(1)n andz(2)n are connected by a path
which is head-to-head at nodexnand hence they are not d-separated. The exact E
step for this model doesnotcorrespond to running forward and backward recursions
along theMMarkov chains independently. This is confirmed by noting that the key
conditional independence property (13.5) is not satisfied for the individual Markov
chains in the factorial HMM model, as is shown using d-separation in Figure 13.20.
Now suppose that there areMchains of hidden nodes and for simplicity suppose
that all latent variables have the same numberKof states. Then one approach would
be to note that there areKMcombinations of latent variables at a given time step
Figure 13.20 Example of a path, highlighted in green,
which is head-to-head at the observed
nodesxn− 1 andxn+1, and head-to-tail
at the unobserved nodesz(2)n− 1 ,z(2)n and
z(2)n+1. Thus the path is not blocked and
so the conditional independence property
(13.5) does not hold for the individual la-
tent chains of the factorial HMM model.
As a consequence, there is no efficient
exact E step for this model.
z(1)n− 1 z(1)n z(1)n+1
z(2)n− 1 z(2)n z(2)n+1
xn− 1 xn xn+1