Pattern Recognition and Machine Learning

Exercises

3.20 ( ) www Starting from (3.86) verify all of the steps needed to show that
maximization of the log marginal likelihood function (3.86) with respect to α leads to the
re-estimation equation (3.92).


3.21 ( ) An alternative way to derive the result (3.92) for the optimal value of α in the
evidence framework is to make use of the identity

$$\frac{d}{d\alpha}\ln|\mathbf{A}| = \mathrm{Tr}\left(\mathbf{A}^{-1}\frac{d}{d\alpha}\mathbf{A}\right). \tag{3.117}$$


Prove this identity by considering the eigenvalue expansion of a real, symmetric
matrix A, and making use of the standard results for the determinant and trace of
A expressed in terms of its eigenvalues (Appendix C). Then make use of (3.117) to
derive (3.92) starting from (3.86).
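
As a numerical sanity check of identity (3.117), not a proof, the following sketch compares a central finite difference of ln|A| with Tr(A⁻¹ dA/dα). It assumes a hypothetical 2×2 symmetric family A(α) = A0 + αB. The matrices A0 and B, the helper names, and the step size h are illustrative choices and do not come from the text.

```python
import math

# Hypothetical symmetric matrices chosen so that A(alpha) is positive definite
# for the alpha used below; they are illustrative, not from the text.
A0 = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, 0.2], [0.2, 3.0]]   # B = dA/dalpha


def A(alpha):
    """Return A(alpha) = A0 + alpha * B as a 2x2 nested list."""
    return [[A0[i][j] + alpha * B[i][j] for j in range(2)] for i in range(2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(x, y):
    """Tr(x y) for 2x2 matrices."""
    return sum(x[i][k] * y[k][i] for i in range(2) for k in range(2))


def dlogdet_numeric(alpha, h=1e-5):
    """Left-hand side of (3.117) by central finite differences."""
    return (math.log(det2(A(alpha + h))) - math.log(det2(A(alpha - h)))) / (2 * h)


def trace_identity_rhs(alpha):
    """Right-hand side of (3.117): Tr(A^{-1} dA/dalpha)."""
    return trace_of_product(inv2(A(alpha)), B)


if __name__ == "__main__":
    alpha = 0.7
    print(dlogdet_numeric(alpha), trace_identity_rhs(alpha))
```

The two printed values should agree to within the finite-difference error. In the evidence framework the relevant matrix is A = αI + βΦᵀΦ, for which dA/dα is the identity and the right-hand side reduces to Tr(A⁻¹).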

3.22 ( ) Starting from (3.86) verify all of the steps needed to show that
maximization of the log marginal likelihood function (3.86) with respect to β leads to the
re-estimation equation (3.95).


3.23 ( ) www Show that the marginal probability of the data, in other words the
model evidence, for the model described in Exercise 3.12 is given by


$$p(\mathbf{t}) = \frac{1}{(2\pi)^{N/2}}\,\frac{b_0^{a_0}}{b_N^{a_N}}\,\frac{\Gamma(a_N)}{\Gamma(a_0)}\,\frac{|\mathbf{S}_N|^{1/2}}{|\mathbf{S}_0|^{1/2}} \tag{3.118}$$

by first marginalizing with respect to w and then with respect to β.

3.24 ( ) Repeat the previous exercise but now use Bayes’ theorem in the form


$$p(\mathbf{t}) = \frac{p(\mathbf{t}\mid\mathbf{w},\beta)\,p(\mathbf{w},\beta)}{p(\mathbf{w},\beta\mid\mathbf{t})} \tag{3.119}$$

and then substitute for the prior and posterior distributions and the likelihood
function in order to derive the result (3.118).
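
The agreement between (3.118) and (3.119) can be checked numerically before attempting the algebra. The sketch below makes several assumptions that are not stated on this page: a single scalar weight w (so S_0 and S_N are scalars and the determinants reduce to the values themselves), a normal-gamma prior N(w | m_0, β⁻¹S_0) Gam(β | a_0, b_0), and the standard conjugate posterior updates written in the comments. All function names and the toy data are hypothetical.

```python
import math


def posterior_params(phi, t, m0, s0, a0, b0):
    """Assumed normal-gamma posterior updates for a scalar weight w:
       1/S_N = 1/S_0 + sum(phi_n^2)
       m_N   = S_N * (m_0/S_0 + sum(phi_n * t_n))
       a_N   = a_0 + N/2
       b_N   = b_0 + (m_0^2/S_0 - m_N^2/S_N + sum(t_n^2)) / 2
    """
    n = len(t)
    sn = 1.0 / (1.0 / s0 + sum(p * p for p in phi))
    mn = sn * (m0 / s0 + sum(p * y for p, y in zip(phi, t)))
    an = a0 + n / 2.0
    bn = b0 + 0.5 * (m0 * m0 / s0 - mn * mn / sn + sum(y * y for y in t))
    return mn, sn, an, bn


def log_normal_gamma(w, beta, m, s, a, b):
    """ln[ N(w | m, s/beta) * Gam(beta | a, b) ]."""
    log_normal = 0.5 * math.log(beta / (2 * math.pi * s)) - 0.5 * beta * (w - m) ** 2 / s
    log_gam = a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(beta) - b * beta
    return log_normal + log_gam


def log_likelihood(w, beta, phi, t):
    """ln p(t | w, beta) for Gaussian noise with precision beta."""
    sq = sum((y - w * p) ** 2 for p, y in zip(phi, t))
    return 0.5 * len(t) * math.log(beta / (2 * math.pi)) - 0.5 * beta * sq


def log_evidence_closed_form(phi, t, m0, s0, a0, b0):
    """Logarithm of (3.118) in the scalar case."""
    mn, sn, an, bn = posterior_params(phi, t, m0, s0, a0, b0)
    return (-0.5 * len(t) * math.log(2 * math.pi)
            + a0 * math.log(b0) - an * math.log(bn)
            + math.lgamma(an) - math.lgamma(a0)
            + 0.5 * math.log(sn) - 0.5 * math.log(s0))


def log_evidence_bayes(w, beta, phi, t, m0, s0, a0, b0):
    """Logarithm of (3.119), evaluated at an arbitrary point (w, beta)."""
    mn, sn, an, bn = posterior_params(phi, t, m0, s0, a0, b0)
    return (log_likelihood(w, beta, phi, t)
            + log_normal_gamma(w, beta, m0, s0, a0, b0)
            - log_normal_gamma(w, beta, mn, sn, an, bn))


if __name__ == "__main__":
    phi = [0.5, -1.0, 2.0]   # toy basis-function values (hypothetical)
    t = [0.3, -0.7, 1.9]     # toy targets (hypothetical)
    prior = (0.1, 2.0, 1.5, 0.8)   # m0, s0, a0, b0
    print(log_evidence_closed_form(phi, t, *prior))
    for w, beta in [(0.0, 1.0), (1.3, 0.4), (-2.0, 5.0)]:
        print(log_evidence_bayes(w, beta, phi, t, *prior))
```

Because the left-hand side of (3.119) does not depend on w or β, every evaluation point should give the same value as the closed form, which is a useful check on the algebra of either exercise.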