Robert V. Hogg, Joseph W. McKean, Allen T. Craig

6.6. The EM Algorithm

An estimate of this expectation is the likelihood of $x_i$ being drawn from distribution $f_2(x)$, which is given by

$$\gamma_i = \frac{\hat{\epsilon}\, f_{2,0}(x_i)}{(1-\hat{\epsilon})\, f_{1,0}(x_i) + \hat{\epsilon}\, f_{2,0}(x_i)}, \tag{6.6.20}$$

where the subscript 0 signifies that the parameters at $\theta_0$ are being used. Expression (6.6.20) is intuitively evident; see McLachlan and Krishnan (1997) for more discussion. Replacing $w_i$ by $\gamma_i$ in expression (6.6.19), the M step of the algorithm is to maximize

$$Q(\theta \mid \theta_0, \mathbf{x}) = \sum_{i=1}^{n} \left[(1-\gamma_i)\log f_1(x_i) + \gamma_i \log f_2(x_i)\right]. \tag{6.6.21}$$
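As a rough illustration, the weights in (6.6.20) and the objective in (6.6.21) might be computed as follows when $f_1$ and $f_2$ are normal densities. This is only a sketch: the function names, the tuple layout `(mu1, sigma2_1, mu2, sigma2_2, eps)` for the parameters, and the choice of Python are assumptions made here and are not part of the text.

```python
import math

def normal_logpdf(x, mu, sigma2):
    """Log density of N(mu, sigma2) at x (sigma2 is the variance)."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (x - mu) ** 2 / (2.0 * sigma2)

def e_step(xs, theta0):
    """Weights gamma_i of (6.6.20), evaluated at the current parameters theta0.

    theta0 = (mu1, sigma2_1, mu2, sigma2_2, eps) is a hypothetical layout.
    """
    mu1, s1, mu2, s2, eps = theta0
    gammas = []
    for x in xs:
        num = eps * math.exp(normal_logpdf(x, mu2, s2))
        den = (1.0 - eps) * math.exp(normal_logpdf(x, mu1, s1)) + num
        gammas.append(num / den)
    return gammas

def q_function(xs, gammas, theta):
    """Q(theta | theta0, x) of (6.6.21); the gammas carry the dependence on theta0."""
    mu1, s1, mu2, s2, _ = theta
    return sum(
        (1.0 - g) * normal_logpdf(x, mu1, s1) + g * normal_logpdf(x, mu2, s2)
        for x, g in zip(xs, gammas)
    )
```

Here `e_step` is called once per iteration with the current parameter values, and `q_function` is then viewed as a function of `theta` alone, with the weights held fixed. For observations far from both means the denominator in `e_step` can underflow to zero, which a more careful implementation would guard against.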

This maximization is easy to obtain by taking partial derivatives of $Q(\theta \mid \theta_0, \mathbf{x})$ with respect to the parameters. For example,

$$\frac{\partial Q}{\partial \mu_1} = \sum_{i=1}^{n} (1-\gamma_i)\left(-\frac{1}{2\sigma_1^2}\right)(-2)(x_i - \mu_1).$$

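Setting this to 0 and solving for $\mu_1$ yields the estimate of $\mu_1$. The estimates of the other mean and the variances can be obtained similarly. These estimates are

$$\hat{\mu}_1 = \frac{\sum_{i=1}^{n} (1-\gamma_i)\, x_i}{\sum_{i=1}^{n} (1-\gamma_i)}, \qquad \hat{\sigma}_1^2 = \frac{\sum_{i=1}^{n} (1-\gamma_i)(x_i - \hat{\mu}_1)^2}{\sum_{i=1}^{n} (1-\gamma_i)},$$

$$\hat{\mu}_2 = \frac{\sum_{i=1}^{n} \gamma_i\, x_i}{\sum_{i=1}^{n} \gamma_i}, \qquad \hat{\sigma}_2^2 = \frac{\sum_{i=1}^{n} \gamma_i (x_i - \hat{\mu}_2)^2}{\sum_{i=1}^{n} \gamma_i}.$$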
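As one instance of how the remaining estimates follow, the derivation for the first variance can be sketched as below. It assumes, as the derivative with respect to $\mu_1$ indicates, that $f_1$ is the $N(\mu_1, \sigma_1^2)$ density, so that $\log f_1(x_i) = -\tfrac{1}{2}\log(2\pi) - \tfrac{1}{2}\log\sigma_1^2 - (x_i-\mu_1)^2/(2\sigma_1^2)$.

$$
\begin{aligned}
\frac{\partial Q}{\partial \sigma_1^2}
  &= \sum_{i=1}^{n} (1-\gamma_i)\left[-\frac{1}{2\sigma_1^2} + \frac{(x_i-\mu_1)^2}{2\sigma_1^4}\right] = 0 \\
\Longrightarrow\quad
  & \sum_{i=1}^{n} (1-\gamma_i)\left[(x_i-\mu_1)^2 - \sigma_1^2\right] = 0 \\
\Longrightarrow\quad
  & \sigma_1^2 = \frac{\sum_{i=1}^{n} (1-\gamma_i)(x_i-\mu_1)^2}{\sum_{i=1}^{n} (1-\gamma_i)}.
\end{aligned}
$$

Substituting $\hat{\mu}_1$ for $\mu_1$ gives the estimate of $\sigma_1^2$; the same steps with $\gamma_i$ in place of $1-\gamma_i$ give the estimates for the second component.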

Since $\gamma_i$ is an estimate of $P[W_i = 1 \mid \theta_0, \mathbf{x}]$, the average $n^{-1}\sum_{i=1}^{n} \gamma_i$ is an estimate of $\epsilon = P[W_i = 1]$. We take this average as our estimate $\hat{\epsilon}$.
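Putting the pieces together, one full iteration of the algorithm for the normal mixture might look like the following sketch. The names `em_update` and `log_likelihood`, the parameter tuple `(mu1, sigma2_1, mu2, sigma2_2, eps)`, and the fixed iteration count used in the usage note are illustrative assumptions; the text itself does not prescribe an implementation or a stopping rule.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x (sigma2 is the variance)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def em_update(xs, theta):
    """One EM iteration; theta = (mu1, sigma2_1, mu2, sigma2_2, eps) is a hypothetical layout."""
    mu1, s1, mu2, s2, eps = theta

    # E step: weights gamma_i from (6.6.20) at the current parameters.
    g = []
    for x in xs:
        num = eps * normal_pdf(x, mu2, s2)
        g.append(num / ((1.0 - eps) * normal_pdf(x, mu1, s1) + num))

    # M step: weighted means and variances that maximize (6.6.21).
    w1 = sum(1.0 - gi for gi in g)
    w2 = sum(g)
    new_mu1 = sum((1.0 - gi) * x for gi, x in zip(g, xs)) / w1
    new_mu2 = sum(gi * x for gi, x in zip(g, xs)) / w2
    new_s1 = sum((1.0 - gi) * (x - new_mu1) ** 2 for gi, x in zip(g, xs)) / w1
    new_s2 = sum(gi * (x - new_mu2) ** 2 for gi, x in zip(g, xs)) / w2

    # Mixing proportion: the average of the weights.
    new_eps = w2 / len(xs)
    return (new_mu1, new_s1, new_mu2, new_s2, new_eps)

def log_likelihood(xs, theta):
    """Observed-data log likelihood of the mixture, useful for monitoring progress."""
    mu1, s1, mu2, s2, eps = theta
    return sum(
        math.log((1.0 - eps) * normal_pdf(x, mu1, s1) + eps * normal_pdf(x, mu2, s2))
        for x in xs
    )
```

In use, one would start from initial values such as `theta = (-1.0, 1.0, 6.0, 1.0, 0.5)` and repeat `theta = em_update(xs, theta)` either a fixed number of times or until `log_likelihood(xs, theta)` stops increasing appreciably. Both the starting values and the stopping rule are choices left to the user.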


EXERCISES

6.6.1. Rao (page 368, 1973) considers a problem in the estimation of linkages in genetics. McLachlan and Krishnan (1997) also discuss this problem and we present their model. For our purposes, it can be described as a multinomial model with the four categories $C_1$, $C_2$, $C_3$, and $C_4$. For a sample of size $n$, let $\mathbf{X} = (X_1, X_2, X_3, X_4)'$ denote the observed frequencies of the four categories. Hence, $n = \sum_{i=1}^{4} X_i$. The probability model is

$$
\begin{array}{cccc}
C_1 & C_2 & C_3 & C_4 \\
\hline
\frac{1}{2} + \frac{1}{4}\theta & \frac{1}{4} - \frac{1}{4}\theta & \frac{1}{4} - \frac{1}{4}\theta & \frac{1}{4}\theta
\end{array}
$$