Pattern Recognition and Machine Learning

10.1. Variational Inference 471

Note that the true posterior distribution does not factorize in this way. The optimum
factors $q_\mu(\mu)$ and $q_\tau(\tau)$ can be obtained from the general result (10.9) as follows.
For $q_\mu(\mu)$ we have

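$$
\begin{aligned}
\ln q_\mu(\mu) &= \mathbb{E}_\tau\left[\ln p(\mathcal{D}|\mu,\tau) + \ln p(\mu|\tau)\right] + \text{const} \\
&= -\frac{\mathbb{E}[\tau]}{2}\left\{\lambda_0(\mu-\mu_0)^2 + \sum_{n=1}^{N}(x_n-\mu)^2\right\} + \text{const}.
\end{aligned}
\tag{10.25}
$$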

Completing the square over $\mu$ we see that $q_\mu(\mu)$ is a Gaussian
$\mathcal{N}(\mu|\mu_N,\lambda_N^{-1})$ with mean and precision given by (Exercise 10.7)


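$$
\mu_N = \frac{\lambda_0\mu_0 + N\bar{x}}{\lambda_0 + N} \tag{10.26}
$$

$$
\lambda_N = (\lambda_0 + N)\,\mathbb{E}[\tau] \tag{10.27}
$$

where $\bar{x}$ denotes the sample mean of the observations.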

Note that for $N \to \infty$ this gives the maximum likelihood result in which
$\mu_N = \bar{x}$, the sample mean, and the precision is infinite.
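The completing-the-square step can be written out explicitly. The following is a sketch of the intermediate algebra, starting from the bracketed terms in (10.25) and writing $\bar{x} = \frac{1}{N}\sum_{n} x_n$ for the sample mean; terms independent of $\mu$ are absorbed into the constant.

```latex
\begin{aligned}
\lambda_0(\mu-\mu_0)^2 + \sum_{n=1}^{N}(x_n-\mu)^2
  &= (\lambda_0+N)\,\mu^2 - 2(\lambda_0\mu_0 + N\bar{x})\,\mu + \text{const} \\
  &= (\lambda_0+N)\left(\mu - \frac{\lambda_0\mu_0 + N\bar{x}}{\lambda_0+N}\right)^2 + \text{const}, \\[4pt]
\ln q_\mu(\mu)
  &= -\frac{(\lambda_0+N)\,\mathbb{E}[\tau]}{2}\,(\mu-\mu_N)^2 + \text{const}.
\end{aligned}
```

Comparing the last line with the log of a Gaussian density, $-\frac{\lambda_N}{2}(\mu-\mu_N)^2 + \text{const}$, gives the mean (10.26) and the precision (10.27).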
Similarly, the optimal solution for the factor $q_\tau(\tau)$ is given by

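$$
\begin{aligned}
\ln q_\tau(\tau) &= \mathbb{E}_\mu\left[\ln p(\mathcal{D}|\mu,\tau) + \ln p(\mu|\tau)\right] + \ln p(\tau) + \text{const} \\
&= (a_0-1)\ln\tau - b_0\tau + \frac{N+1}{2}\ln\tau \\
&\quad - \frac{\tau}{2}\,\mathbb{E}_\mu\left[\sum_{n=1}^{N}(x_n-\mu)^2 + \lambda_0(\mu-\mu_0)^2\right] + \text{const}.
\end{aligned}
\tag{10.28}
$$

Here the likelihood contributes $\frac{N}{2}\ln\tau$, and the normalization of the prior
$p(\mu|\tau) = \mathcal{N}(\mu|\mu_0,(\lambda_0\tau)^{-1})$ contributes a further $\frac{1}{2}\ln\tau$.
Hence $q_\tau(\tau)$ is a gamma distribution $\mathrm{Gam}(\tau|a_N,b_N)$ with parameters

$$
a_N = a_0 + \frac{N+1}{2} \tag{10.29}
$$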

$$
b_N = b_0 + \frac{1}{2}\,\mathbb{E}_\mu\left[\sum_{n=1}^{N}(x_n-\mu)^2 + \lambda_0(\mu-\mu_0)^2\right]. \tag{10.30}
$$


Again this exhibits the expected behaviour when $N \to \infty$ (Exercise 10.8).
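The expectation over $\mu$ that appears in the expression for $b_N$ can be reduced to the first two moments of $q_\mu(\mu)$. As a sketch, expanding the squares and using the fact that $q_\mu(\mu)$ is a Gaussian with mean $\mu_N$ and precision $\lambda_N$:

```latex
\begin{aligned}
\mathbb{E}_\mu\left[\sum_{n=1}^{N}(x_n-\mu)^2 + \lambda_0(\mu-\mu_0)^2\right]
  &= \sum_{n=1}^{N} x_n^2 - 2\,\mathbb{E}[\mu]\sum_{n=1}^{N} x_n + N\,\mathbb{E}[\mu^2] \\
  &\quad + \lambda_0\left(\mathbb{E}[\mu^2] - 2\mu_0\,\mathbb{E}[\mu] + \mu_0^2\right), \\[4pt]
\mathbb{E}[\mu] = \mu_N, \qquad
\mathbb{E}[\mu^2] &= \mu_N^2 + \frac{1}{\lambda_N}.
\end{aligned}
```

So $q_\tau(\tau)$ depends on $q_\mu(\mu)$ only through $\mathbb{E}[\mu]$ and $\mathbb{E}[\mu^2]$.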
It should be emphasized that we did not assume these specific functional forms
for the optimal distributions $q_\mu(\mu)$ and $q_\tau(\tau)$. They arose naturally from the structure
of the likelihood function and the corresponding conjugate priors (Section 10.4.1).
Thus we have expressions for the optimal distributions $q_\mu(\mu)$ and $q_\tau(\tau)$, each of
which depends on moments evaluated with respect to the other distribution. One
approach to finding a solution is therefore to make an initial guess for, say, the moment
$\mathbb{E}[\tau]$ and use this to recompute the distribution $q_\mu(\mu)$. Given this revised
distribution we can then extract the required moments $\mathbb{E}[\mu]$ and $\mathbb{E}[\mu^2]$, and use these to
recompute the distribution $q_\tau(\tau)$, and so on. Since the space of hidden variables for
this example is only two-dimensional, we can illustrate the variational
approximation to the posterior distribution by plotting contours of both the true posterior and
the factorized approximation, as illustrated in Figure 10.4.
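The alternating scheme just described can be sketched in a few lines of NumPy. This is an illustrative sketch rather than a reference implementation: the function name `fit_factorized_gaussian`, the default hyperparameters, the initial guess for $\mathbb{E}[\tau]$, and the stopping rule are all assumptions made for the example. It uses $\mathbb{E}[\tau] = a_N/b_N$ for the gamma factor, and $\mathbb{E}[\mu] = \mu_N$, $\mathbb{E}[\mu^2] = \mu_N^2 + 1/\lambda_N$ for the Gaussian factor. The shape parameter is taken as $a_0 + (N+1)/2$, which counts the $\frac{1}{2}\ln\tau$ contributed by the normalizer of $p(\mu|\tau)$ together with the $\frac{N}{2}\ln\tau$ from the likelihood.

```python
import numpy as np


def fit_factorized_gaussian(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0,
                            e_tau=1.0, max_iter=100, tol=1e-10):
    """Alternate the updates for q_mu and q_tau until E[tau] stops changing.

    Hypothetical helper for illustration; the defaults are arbitrary choices.
    Returns (mu_N, lambda_N, a_N, b_N).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    sum_x = x.sum()
    sum_x2 = np.sum(x ** 2)

    # mu_N (10.26) and the gamma shape a_N do not depend on the other factor,
    # so they can be computed once outside the loop.
    mu_n = (lam0 * mu0 + sum_x) / (lam0 + n)
    a_n = a0 + (n + 1) / 2.0

    for _ in range(max_iter):
        # Update q_mu: precision from the current E[tau], eq. (10.27).
        lam_n = (lam0 + n) * e_tau

        # Moments of q_mu needed by the q_tau update.
        e_mu = mu_n
        e_mu2 = mu_n ** 2 + 1.0 / lam_n

        # Update q_tau: eq. (10.30) with the expectation written in moments.
        b_n = b0 + 0.5 * (
            sum_x2 - 2.0 * e_mu * sum_x + n * e_mu2
            + lam0 * (e_mu2 - 2.0 * mu0 * e_mu + mu0 ** 2)
        )

        # Mean of Gam(tau | a_N, b_N), with b_N a rate parameter.
        new_e_tau = a_n / b_n
        converged = abs(new_e_tau - e_tau) < tol
        e_tau = new_e_tau
        if converged:
            break

    # Make the returned precision consistent with the final E[tau].
    lam_n = (lam0 + n) * e_tau
    return mu_n, lam_n, a_n, b_n
```

Calling `fit_factorized_gaussian(data)` on a one-dimensional array returns the parameters of the two factors. Only $\lambda_N$ and $b_N$ change between sweeps, so in this model the iteration amounts to a scalar fixed-point search for $\mathbb{E}[\tau]$.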
