Robert V. Hogg, Joseph W. McKean, Allen T. Craig

6.6. The EM Algorithm

An estimate of this expectation is the likelihood of $x_i$ being drawn from distribution $f_2(x)$, which is given by

$$
\gamma_i = \frac{\hat{\epsilon}\, f_{2,0}(x_i)}{(1-\hat{\epsilon})\, f_{1,0}(x_i) + \hat{\epsilon}\, f_{2,0}(x_i)}, \tag{6.6.20}
$$

where the subscript 0 signifies that the parameters at $\theta_0$ are being used. Expression (6.6.20) is intuitively evident; see McLachlan and Krishnan (1997) for more discussion. Replacing $w_i$ by $\gamma_i$ in expression (6.6.19), the M step of the algorithm is to maximize

$$
Q(\theta \mid \theta_0, x) = \sum_{i=1}^{n} \left[(1-\gamma_i)\log f_1(x_i) + \gamma_i \log f_2(x_i)\right]. \tag{6.6.21}
$$
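The E step in (6.6.20) and the objective in (6.6.21) can be sketched numerically as follows. This is a minimal illustration, not the authors' code: it assumes $f_1$ and $f_2$ are normal densities (consistent with the partial derivative taken in this section), and the names `normal_pdf`, `e_step`, `q_value`, the argument `eps` for the mixing probability $P[W_i = 1]$, and the data values are all hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x (assumed form of f_1 and f_2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def e_step(xs, mu1, sigma1, mu2, sigma2, eps):
    """Return the gamma_i of (6.6.20), evaluated at the current parameters."""
    gammas = []
    for x in xs:
        num = eps * normal_pdf(x, mu2, sigma2)
        den = (1.0 - eps) * normal_pdf(x, mu1, sigma1) + num
        gammas.append(num / den)
    return gammas

def q_value(xs, gammas, mu1, sigma1, mu2, sigma2):
    """Evaluate Q of (6.6.21) at candidate parameters, holding the gammas fixed."""
    return sum(
        (1.0 - g) * math.log(normal_pdf(x, mu1, sigma1))
        + g * math.log(normal_pdf(x, mu2, sigma2))
        for x, g in zip(xs, gammas)
    )

# Illustrative data only (not from the text): three points near 0, two near 4.
xs = [-0.3, 0.2, 0.5, 3.8, 4.1]
gammas = e_step(xs, mu1=0.0, sigma1=1.0, mu2=4.0, sigma2=1.0, eps=0.5)
```

Each $\gamma_i$ lies between 0 and 1; points near the second component's mean receive values close to 1, and points near the first component's mean receive values close to 0.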

This maximization is easy to obtain by taking partial derivatives of $Q(\theta \mid \theta_0, x)$ with respect to the parameters. For example,

$$
\frac{\partial Q}{\partial \mu_1} = \sum_{i=1}^{n} (1-\gamma_i)\left(-\frac{1}{2\sigma_1^2}\right)(-2)(x_i - \mu_1).
$$

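Setting this to 0 and solving for $\mu_1$ yields the estimate of $\mu_1$. The estimates of the other mean and the variances can be obtained similarly. These estimates are

$$
\begin{aligned}
\hat{\mu}_1 &= \frac{\sum_{i=1}^{n}(1-\gamma_i)x_i}{\sum_{i=1}^{n}(1-\gamma_i)}, &
\hat{\sigma}_1^2 &= \frac{\sum_{i=1}^{n}(1-\gamma_i)(x_i-\hat{\mu}_1)^2}{\sum_{i=1}^{n}(1-\gamma_i)}, \\
\hat{\mu}_2 &= \frac{\sum_{i=1}^{n}\gamma_i x_i}{\sum_{i=1}^{n}\gamma_i}, &
\hat{\sigma}_2^2 &= \frac{\sum_{i=1}^{n}\gamma_i(x_i-\hat{\mu}_2)^2}{\sum_{i=1}^{n}\gamma_i}.
\end{aligned}
$$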
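As a check on the first component's estimates, the intermediate steps can be written out. This is a sketch under the assumption that $f_1$ is the $N(\mu_1, \sigma_1^2)$ density, so that $\log f_1(x_i) = -\tfrac{1}{2}\log(2\pi\sigma_1^2) - (x_i-\mu_1)^2/(2\sigma_1^2)$; the second component is handled the same way with $\gamma_i$ in place of $1-\gamma_i$.

$$
\begin{aligned}
\frac{\partial Q}{\partial \mu_1} &= \frac{1}{\sigma_1^2}\sum_{i=1}^{n}(1-\gamma_i)(x_i-\mu_1) = 0
&&\Longrightarrow\quad \mu_1 \sum_{i=1}^{n}(1-\gamma_i) = \sum_{i=1}^{n}(1-\gamma_i)x_i, \\
\frac{\partial Q}{\partial \sigma_1^2} &= \sum_{i=1}^{n}(1-\gamma_i)\left[-\frac{1}{2\sigma_1^2} + \frac{(x_i-\mu_1)^2}{2\sigma_1^4}\right] = 0
&&\Longrightarrow\quad \sigma_1^2 \sum_{i=1}^{n}(1-\gamma_i) = \sum_{i=1}^{n}(1-\gamma_i)(x_i-\mu_1)^2.
\end{aligned}
$$

Dividing each right-hand equation by $\sum_{i=1}^{n}(1-\gamma_i)$, and substituting $\hat{\mu}_1$ for $\mu_1$ in the second, gives the weighted mean and weighted variance stated above.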

Since $\gamma_i$ is an estimate of $P[W_i = 1 \mid \theta_0, x]$, the average $n^{-1}\sum_{i=1}^{n}\gamma_i$ is an estimate of $\epsilon = P[W_i = 1]$. This average is our estimate $\hat{\epsilon}$.
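Putting the pieces together, one pass of the algorithm computes the $\gamma_i$ at the current parameters, then updates the two means, the two variances, and the mixing probability. The sketch below is one possible arrangement, assuming normal component densities; the function names `em_step` and `em`, the fixed iteration count (used in place of a convergence test), and the data are hypothetical and not taken from the text.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x (assumed form of f_1 and f_2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_step(xs, mu1, var1, mu2, var2, eps):
    """One EM iteration: E step (6.6.20), then the closed-form M-step updates."""
    s1, s2 = math.sqrt(var1), math.sqrt(var2)
    # E step: gamma_i estimates P[W_i = 1 | theta_0, x].
    gammas = []
    for x in xs:
        num = eps * normal_pdf(x, mu2, s2)
        gammas.append(num / ((1.0 - eps) * normal_pdf(x, mu1, s1) + num))
    # M step: weighted means and variances, with weights (1 - gamma_i) and gamma_i.
    w1 = [1.0 - g for g in gammas]
    n1, n2 = sum(w1), sum(gammas)
    new_mu1 = sum(w * x for w, x in zip(w1, xs)) / n1
    new_mu2 = sum(g * x for g, x in zip(gammas, xs)) / n2
    new_var1 = sum(w * (x - new_mu1) ** 2 for w, x in zip(w1, xs)) / n1
    new_var2 = sum(g * (x - new_mu2) ** 2 for g, x in zip(gammas, xs)) / n2
    # The mixing probability is estimated by the average of the gammas.
    new_eps = n2 / len(xs)
    return new_mu1, new_var1, new_mu2, new_var2, new_eps

def em(xs, theta0, n_iter=20):
    """Repeat em_step a fixed number of times (no convergence check)."""
    theta = theta0
    for _ in range(n_iter):
        theta = em_step(xs, *theta)
    return theta

# Illustrative data only: five points around 0 and five around 10.
xs = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0, 10.1, 9.9, 10.2, 9.8]
mu1, var1, mu2, var2, eps = em(xs, theta0=(1.0, 1.0, 9.0, 1.0, 0.5))
```

With two well-separated groups like these, the $\gamma_i$ are essentially 0 or 1, so the updates reduce to the ordinary sample mean and (divisor $n$) variance within each group. A practical version would also guard against a component variance collapsing toward zero, which this sketch does not do.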

EXERCISES

6.6.1. Rao (page 368, 1973) considers a problem in the estimation of linkages in genetics. McLachlan and Krishnan (1997) also discuss this problem and we present their model. For our purposes, it can be described as a multinomial model with the four categories $C_1$, $C_2$, $C_3$, and $C_4$. For a sample of size $n$, let $X = (X_1, X_2, X_3, X_4)'$ denote the observed frequencies of the four categories. Hence, $n = \sum_{i=1}^{4} X_i$. The probability model is

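$$
\begin{array}{cccc}
C_1 & C_2 & C_3 & C_4 \\ \hline
\frac{1}{2} + \frac{1}{4}\theta & \frac{1}{4} - \frac{1}{4}\theta & \frac{1}{4} - \frac{1}{4}\theta &
\end{array}
$$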
