
Figure 8.5  This shows the same model as in Figure 8.4 but with the deterministic parameters shown explicitly by the smaller solid nodes.

[Figure 8.5: a plate containing the nodes $x_n$ and $t_n$, repeated $N$ times, together with the nodes $\mathbf{w}$, $\alpha$, and $\sigma^2$.]
values, for example the variables $\{t_n\}$ from the training set in the case of polynomial curve fitting. In a graphical model, we will denote such observed variables by shading the corresponding nodes. Thus the graph corresponding to Figure 8.5 in which the variables $\{t_n\}$ are observed is shown in Figure 8.6. Note that the value of $\mathbf{w}$ is not observed, and so $\mathbf{w}$ is an example of a latent variable, also known as a hidden variable. Such variables play a crucial role in many probabilistic models and will form the focus of Chapters 9 and 12.
Having observed the values $\{t_n\}$ we can, if desired, evaluate the posterior distribution of the polynomial coefficients $\mathbf{w}$ as discussed in Section 1.2.5. For the moment, we note that this involves a straightforward application of Bayes' theorem

$$p(\mathbf{w} \mid \mathbf{t}) \propto p(\mathbf{w}) \prod_{n=1}^{N} p(t_n \mid \mathbf{w}) \tag{8.7}$$

where again we have omitted the deterministic parameters in order to keep the notation uncluttered.
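Under the Gaussian choices used in this example (a zero-mean prior $p(\mathbf{w} \mid \alpha) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I})$ and noise variance $\sigma^2$, as in Section 1.2.5), the posterior (8.7) is itself Gaussian and can be evaluated in closed form. The following NumPy sketch illustrates this under those assumptions; the helper names (`poly_features`, `posterior_w`) and the numerical settings in the example are illustrative choices, not taken from the text.

```python
import numpy as np

def poly_features(x, degree):
    """Design matrix whose n-th row is (1, x_n, x_n^2, ..., x_n^degree)."""
    return np.vander(x, degree + 1, increasing=True)

def posterior_w(x, t, degree, alpha, sigma2):
    """Mean and covariance of the Gaussian posterior p(w | t) in (8.7),
    assuming p(w | alpha) = N(w | 0, alpha^{-1} I) and Gaussian noise."""
    Phi = poly_features(x, degree)
    # Posterior precision: alpha * I + (1 / sigma^2) * Phi^T Phi
    S_inv = alpha * np.eye(degree + 1) + (Phi.T @ Phi) / sigma2
    S = np.linalg.inv(S_inv)
    m = (S @ Phi.T @ t) / sigma2   # posterior mean
    return m, S

# Illustrative data: noisy samples of sin(2*pi*x), as in Chapter 1.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=10)
m, S = posterior_w(x, t, degree=3, alpha=5e-3, sigma2=0.1**2)
```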
In general, model parameters such as $\mathbf{w}$ are of little direct interest in themselves, because our ultimate goal is to make predictions for new input values. Suppose we are given a new input value $\hat{x}$ and we wish to find the corresponding probability distribution for $\hat{t}$ conditioned on the observed data. The graphical model that describes this problem is shown in Figure 8.7, and the corresponding joint distribution of all of the random variables in this model, conditioned on the deterministic parameters, is then given by

$$p(\hat{t}, \mathbf{t}, \mathbf{w} \mid \hat{x}, \mathbf{x}, \alpha, \sigma^2) = \left[ \prod_{n=1}^{N} p(t_n \mid x_n, \mathbf{w}, \sigma^2) \right] p(\mathbf{w} \mid \alpha) \, p(\hat{t} \mid \hat{x}, \mathbf{w}, \sigma^2). \tag{8.8}$$
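Because every factor on the right-hand side of (8.8) is Gaussian under the same assumptions, marginalizing $\mathbf{w}$ gives a Gaussian predictive distribution for $\hat{t}$, whose variance combines the observation noise with the remaining posterior uncertainty in $\mathbf{w}$. A minimal sketch reusing the helpers above (again an illustration, not the text's code):

```python
def predictive(x_hat, m, S, degree, sigma2):
    """Mean and variance of the Gaussian p(t_hat | x_hat, t),
    obtained by integrating out w against its posterior N(m, S)."""
    phi = poly_features(np.atleast_1d(x_hat), degree)[0]
    mean = phi @ m                    # posterior-mean prediction
    var = sigma2 + phi @ S @ phi      # noise plus uncertainty in w
    return mean, var

mean, var = predictive(0.5, m, S, degree=3, sigma2=0.1**2)
```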

Figure 8.6  As in Figure 8.5 but with the nodes $\{t_n\}$ shaded to indicate that the corresponding random variables have been set to their observed (training set) values.

[Figure 8.6: a plate containing the nodes $x_n$ and $t_n$ (shaded), repeated $N$ times, together with the nodes $\mathbf{w}$, $\alpha$, and $\sigma^2$.]