Robert V. Hogg, Joseph W. McKean, Allen T. Craig

6.6. The EM Algorithm

mutually independent. The conditional notation will prove useful here. Let g(x|θ)
denote the joint pdf of X. Let h(x,z|θ) denote the joint pdf of the observed and
unobserved items. Let k(z|θ,x) denote the conditional pdf of the missing data given
the observed data. By the definition of a conditional pdf, we have the identity

                    k(z|θ,x) = h(x,z|θ) / g(x|θ).                    (6.6.1)
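For intuition, the identity can be checked numerically in a small discrete setting; the joint pmf below is randomly generated and purely illustrative, not from the text.

```python
import numpy as np

# Illustrative check of the conditional-pdf identity with discrete x and z
# taking values 0, 1, 2; h, g, k follow the text's notation.
rng = np.random.default_rng(0)
h = rng.random((3, 3))
h /= h.sum()                 # h(x, z): normalized to a valid joint pmf

g = h.sum(axis=1)            # g(x): marginal pmf of the observed data
k = h / g[:, None]           # k(z | x) = h(x, z) / g(x)

# Each conditional k(. | x) is a genuine pmf, and k * g recovers the joint.
assert np.allclose(k.sum(axis=1), 1.0)
assert np.allclose(k * g[:, None], h)
```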


The observed likelihood function is L(θ|x) = g(x|θ). The complete likelihood
function is defined by

                    Lc(θ|x,z) = h(x,z|θ).                    (6.6.2)

Our goal is to maximize the likelihood function L(θ|x) by using the complete
likelihood Lc(θ|x,z) in this process.
Using (6.6.1), we derive the following basic identity for an arbitrary but fixed
θ0 ∈ Ω; the first equality holds because k(z|θ0,x), as a pdf in z, integrates to 1:

    log L(θ|x) = ∫ log L(θ|x) k(z|θ0,x) dz
               = ∫ log g(x|θ) k(z|θ0,x) dz
               = ∫ [log h(x,z|θ) − log k(z|θ,x)] k(z|θ0,x) dz
               = ∫ log[h(x,z|θ)] k(z|θ0,x) dz − ∫ log[k(z|θ,x)] k(z|θ0,x) dz
               = Eθ0[log Lc(θ|x,Z) | θ0,x] − Eθ0[log k(Z|θ,x) | θ0,x],   (6.6.3)

where the expectations are taken under the conditional pdf k(z|θ0,x). Define the
first term on the right side of (6.6.3) to be the function

        Q(θ|θ0,x) = Eθ0[log Lc(θ|x,Z) | θ0,x].                (6.6.4)
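When the integral defining Q is not available in closed form, it can be approximated by Monte Carlo. The sketch below assumes a setup that is not from the text: the complete data are iid N(θ, 1), x holds the fully observed values, and m further values are missing except that each is known to exceed a censoring point a, so that k(z|θ0,x) is a N(θ0, 1) density truncated to (a, ∞). All numbers are made up for illustration.

```python
import numpy as np

# Hypothetical Monte Carlo evaluation of Q(theta | theta0, x) for a
# right-censored normal sample; theta0 plays the role of the fixed point
# in (6.6.4) and Z collects the m unobserved (censored) values.
rng = np.random.default_rng(2)
x = np.array([0.8, 1.3, -0.2, 2.1])    # fully observed values
a, m = 1.5, 3                          # m values censored at level a

def Q(theta, theta0, ndraw=100_000):
    # Draw Z ~ k(z | theta0, x) by rejection: keep N(theta0, 1) draws above a.
    z = rng.normal(theta0, 1.0, 20 * ndraw)
    z = z[z > a][:ndraw]
    # log Lc(theta | x, Z), up to an additive constant, averaged over the
    # draws; E[sum_j (Z_j - theta)^2] reduces to m * E[(Z - theta)^2].
    return -0.5 * np.sum((x - theta) ** 2) - 0.5 * m * np.mean((z - theta) ** 2)
```

Values of θ near the weighted mean of the observed data and the conditional mean of the censored values give larger Q, which is what the M step exploits.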

The expectation that defines the function Q is called the E step of the EM algorithm.
Recall that we want to maximize log L(θ|x). As discussed below, we need only
maximize Q(θ|θ0,x). This maximization is called the M step of the EM algorithm.
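Both steps can be made concrete with a small sketch. The model, starting values, and data below are illustrative assumptions, not from the text: a two-component normal mixture with unit variances, in which the missing datum Z_i is the component label of observation x_i.

```python
import numpy as np

# Illustrative EM iteration for x ~ p N(mu1, 1) + (1 - p) N(mu2, 1),
# with the component labels Z unobserved.
rng = np.random.default_rng(1)
z_true = rng.random(500) < 0.4
x = np.where(z_true, rng.normal(0.0, 1.0, 500), rng.normal(4.0, 1.0, 500))

def em_step(x, p, mu1, mu2):
    # E step: w_i = k(z_i = 1 | theta0, x_i), the conditional label
    # probability; Q(theta | theta0, x) depends on z only through these w_i.
    d1 = p * np.exp(-0.5 * (x - mu1) ** 2)
    d2 = (1.0 - p) * np.exp(-0.5 * (x - mu2) ** 2)
    w = d1 / (d1 + d2)
    # M step: maximize Q in closed form -> weighted proportion and means.
    return w.mean(), (w * x).sum() / w.sum(), ((1 - w) * x).sum() / (1 - w).sum()

p, mu1, mu2 = 0.5, -1.0, 1.0               # initial estimate theta_hat^(0)
for _ in range(50):                        # alternate E and M steps
    p, mu1, mu2 = em_step(x, p, mu1, mu2)
```

Each pass through `em_step` performs one E step followed by one M step, producing the next estimate in the sequence described below.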
Denote by θ̂(0) an initial estimate of θ, perhaps based on the observed likelihood.
Let θ̂(1) be the argument that maximizes Q(θ|θ̂(0),x). This is the first-step estimate
of θ. Proceeding this way, we obtain a sequence of estimates θ̂(m). We formally
define this algorithm as follows:


Algorithm 6.6.1 (EM Algorithm). Let θ̂(m) denote the estimate on the mth step.
To compute the estimate on the (m+1)st step, do



  1. Expectation Step: Compute


        Q(θ|θ̂(m),x) = Eθ̂(m)[log Lc(θ|x,Z) | θ̂(m),x],        (6.6.5)

where the expectation is taken under the conditional pdf k(z|θ̂(m),x).