34.3 Conjugate pairs, conjugate priors and the exponential family
34.3.1 Sequences of random variables and the Markov chain view
Let \(P = (P_\theta : \theta \in \Theta)\) be a Markov kernel from \((\Theta, \mathcal G)\) to \((\mathcal X, \mathcal H)\). Then let \(P^n = (P^n_\theta : \theta \in \Theta)\) be the unique Markov kernel from \((\Theta, \mathcal G)\) to \((\mathcal X^n, \mathcal H^n)\) defined on rectangles by
\[
P^n_\theta(B_1 \times \cdots \times B_n) = \prod_{t=1}^n P_\theta(B_t)\,.
\]
Given a prior \(Q\) on \((\Theta, \mathcal G)\), let \(\theta \in \Theta\) be a random element and \(X_1, X_2, \dots, X_n\) be a sequence of random elements on some probability space \((\Omega, \mathcal F, \mathbb P)\) such that
(a) \(\mathbb P(\theta \in \cdot) = Q(\cdot)\).
(b) \(\mathbb P(X_1, \dots, X_n \in B_1 \times \cdots \times B_n \mid \theta) = P^n_\theta(B_1 \times \cdots \times B_n)\) almost surely for all \(B_1, \dots, B_n \in \mathcal H\).
In words, given \(\theta\), whose distribution is \(Q\), the random elements \(X_1, \dots, X_n\) are independent and share the common distribution \(P_\theta\). The posterior after \(t\) observations is a Markov kernel \(Q_t\) from \((\mathcal X^t, \mathcal H^t)\) to \((\Theta, \mathcal G)\) such that
\[
\mathbb E[\mathbb I\{\theta \in B\} \mid X_1, \dots, X_t] = Q_t(B \mid X_1, \dots, X_t) \quad \text{almost surely.}
\]
Then the conditional distribution of \(X_{t+1}\) given \(X_1, \dots, X_t\) is
\[
\mathbb P(X_{t+1} \in B \mid X_1, \dots, X_t) = \int_\Theta P_\theta(B)\, Q_t(d\theta \mid X_1, \dots, X_t)\,.
\]
This means the conditional distribution of \(X_{t+1}\) can be written in terms of the model and the posterior. In many cases the posterior has a simple form that makes this view rather convenient. The sequence of measures \(Q_0, Q_1, \dots, Q_n\) can be viewed as a Markov chain on the space of measures on \((\Theta, \mathcal G)\).
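For intuition, here is a minimal numerical sketch of this recursion for a finite parameter set, where the integral over \(\Theta\) becomes a sum. The two-point \(\Theta\), the uniform prior and all helper names are illustrative assumptions, not from the text: the posterior \(Q_t\) is updated by Bayes' rule, and the predictive probability is the \(Q_t\)-average of \(P_\theta\).

```python
# Illustrative sketch (assumed setup, not from the text): finite
# Theta = {0.2, 0.7}, uniform prior Q_0, Bernoulli model P_theta.
theta_grid = [0.2, 0.7]

def bernoulli_lik(theta, x):
    # P_theta({x}) for a Bernoulli observation x in {0, 1}
    return theta if x == 1 else 1.0 - theta

def update(posterior, x):
    # One Bayes step: Q_t -> Q_{t+1} after observing x
    w = [q * bernoulli_lik(th, x) for q, th in zip(posterior, theta_grid)]
    z = sum(w)
    return [wi / z for wi in w]

def predictive_prob_one(posterior):
    # P(X_{t+1} = 1 | X_1,...,X_t) = sum_theta P_theta({1}) Q_t({theta})
    return sum(q * th for q, th in zip(posterior, theta_grid))

posterior = [0.5, 0.5]  # Q_0: uniform prior over theta_grid
for x in [1, 1, 0]:
    posterior = update(posterior, x)
# posterior is now Q_3; predictive_prob_one(posterior) mixes P_theta over Q_3
```

Each pass through `update` maps one measure on \(\Theta\) to the next, which is exactly the Markov chain of posteriors described above.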
Example 34.6. Suppose \(\Theta = [0, 1]\), \(\mathcal G = \mathfrak B([0, 1])\), \(Q = \mathrm{Beta}(\alpha, \beta)\) and \(P_\theta = \mathcal B(\theta)\) is Bernoulli. Then the posterior after \(t\) observations is \(Q_t = \mathrm{Beta}(\alpha + S_t, \beta + t - S_t)\), where \(S_t = \sum_{s=1}^t X_s\). Furthermore,
\[
\mathbb E[X_{t+1} \mid X_1, \dots, X_t] = \int_\Theta \theta\, Q_t(d\theta \mid X_1, \dots, X_t) = \frac{\alpha + S_t}{\alpha + \beta + t}
\]
and hence
\[
\mathbb P(S_{t+1} = S_t + 1 \mid S_t) = \frac{\alpha + S_t}{\alpha + \beta + t}\,, \qquad
\mathbb P(S_{t+1} = S_t \mid S_t) = 1 - \mathbb P(S_{t+1} = S_t + 1 \mid S_t)\,.
\]
So the posterior after \(t\) observations is a Beta distribution depending on \(S_t\), and \(S_1, S_2, \dots, S_n\) follows a Markov chain evolving according to the above display.
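The Beta–Bernoulli update and the transition law of \(S_t\) can be sketched as follows; the function names are illustrative, not from the text.

```python
import random

# Sketch of Example 34.6 (helper names are assumptions):
# prior Beta(alpha, beta), observations X_s ~ Bernoulli(theta).

def posterior_params(alpha, beta, t, s_t):
    # Q_t = Beta(alpha + S_t, beta + t - S_t)
    return alpha + s_t, beta + t - s_t

def success_prob(alpha, beta, t, s_t):
    # P(S_{t+1} = S_t + 1 | S_t) = (alpha + S_t) / (alpha + beta + t),
    # the mean of the Beta posterior Q_t
    return (alpha + s_t) / (alpha + beta + t)

def simulate_chain(alpha, beta, n, rng):
    # Simulate S_1, ..., S_n directly as a Markov chain, without ever
    # sampling theta: each step is Bernoulli with the posterior mean.
    s, path = 0, []
    for t in range(n):
        if rng.random() < success_prob(alpha, beta, t, s):
            s += 1
        path.append(s)
    return path
```

Note that `simulate_chain` never draws \(\theta\): the chain \(S_1, \dots, S_n\) is simulated purely from the posterior predictive probabilities, which is the point of the Markov chain view.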
Example 34.7. Let \((\Theta, \mathcal G) = (\mathbb R, \mathfrak B(\mathbb R))\), \(Q = \mathcal N(\mu, \sigma^2)\) and \(P_\theta = \mathcal N(\theta, 1)\). Then, using the same notation as above, the posterior is almost surely \(Q_t = \mathcal N(\mu_t, \sigma_t^2)\), where
\[
\mu_t = \frac{\mu/\sigma^2 + S_t}{1/\sigma^2 + t}\,, \qquad
\sigma_t^2 = \left(\frac{1}{\sigma^2} + t\right)^{-1}.
\]
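A minimal numerical sketch of this Gaussian conjugate update, under the stated model \(X_s \sim \mathcal N(\theta, 1)\) with prior \(\mathcal N(\mu, \sigma^2)\); the function name is illustrative.

```python
# Sketch of the Gaussian conjugate update (helper name is an assumption):
# prior N(mu, sigma^2), observations X_s ~ N(theta, 1). The posterior after
# t observations is N(mu_t, sigma_t^2) with
#   mu_t      = (mu / sigma^2 + S_t) / (1 / sigma^2 + t)
#   sigma_t^2 = 1 / (1 / sigma^2 + t)

def gaussian_posterior(mu, sigma2, xs):
    # Return (mu_t, sigma_t^2) given the prior parameters and the data xs
    t = len(xs)
    s_t = sum(xs)  # S_t = X_1 + ... + X_t
    precision = 1.0 / sigma2 + t  # posterior precision: prior precision + t
    mu_t = (mu / sigma2 + s_t) / precision
    return mu_t, 1.0 / precision
```

Since only \(t\) and \(S_t\) enter the update, the posterior again depends on the data only through the running sum, just as in the Beta–Bernoulli example.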