Bandit Algorithms

34.2 Bayesian learning and the posterior distribution 398

a measure. By assuming that (Θ,G) is a Borel space this issue can be overcome by using a regular version (Theorem 3.11), a result which we restate here using the present notation.

Theorem 34.1. If (Θ, G) is a Borel space, then there exists a probability kernel
Q : X × G → [0, 1] such that Q(A | X) = P(θ ∈ A | X) simultaneously for all A ∈ G
outside of some P-null set. Furthermore, for any two probability kernels Q, Q′
satisfying this condition, Q(· | x) = Q′(· | x) for all x in some set of P_X-probability
one.

The posterior density
Theorem 34.1 provides weak conditions under which a posterior exists, but does
not suggest a useful way of finding it. In many practical situations the posterior
can be calculated using densities. Given θ ∈ Θ, let p_θ be the Radon-Nikodym
derivative of P_θ with respect to some measure μ and let q(θ) be the Radon-Nikodym
derivative of Q with respect to another measure ν. Provided all terms
are appropriately measurable and nonzero, then

$$
q(\theta \mid x) = \frac{p_\theta(x)\, q(\theta)}{\int_\Theta p_\theta(x)\, q(\theta)\, d\nu(\theta)} \tag{34.2}
$$

is the Radon-Nikodym derivative of Q(· | x) with respect to ν. In other words, for any A ∈ G it holds that Q(A | x) = ∫_A q(θ | x) dν(θ). This corresponds to the
usual manipulation of densities when μ and ν are the Lebesgue measures.
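
As a rough illustration of (34.2), the following sketch evaluates the posterior density when Θ is a finite set and ν is the counting measure, so that the integral in the denominator becomes a sum. The function names, the three-point parameter set and the Bernoulli likelihood are assumptions chosen for the example and do not come from the text.

```python
from typing import Callable, List, Optional, Sequence


def posterior_density(
    x: float,
    thetas: Sequence[float],
    prior: Sequence[float],
    likelihood: Callable[[float, float], float],
    nu_weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Evaluate q(theta | x) from (34.2) when Theta is a finite set.

    prior[i] is q(theta_i), likelihood(theta, x) is p_theta(x), and
    nu_weights[i] is the nu-mass of theta_i (counting measure by default).
    """
    if nu_weights is None:
        nu_weights = [1.0] * len(thetas)
    numerators = [likelihood(t, x) * q for t, q in zip(thetas, prior)]
    # The denominator is the integral of p_theta(x) q(theta) against nu.
    denominator = sum(n * w for n, w in zip(numerators, nu_weights))
    if denominator == 0.0:
        raise ZeroDivisionError("(34.2) is undefined: zero denominator at this x")
    return [n / denominator for n in numerators]


def bernoulli_density(theta: float, x: float) -> float:
    """Density of Bernoulli(theta) w.r.t. counting measure on {0, 1}."""
    return theta if x == 1 else 1.0 - theta


# Hypothetical setup: three candidate means with a uniform prior, observe x = 1.
thetas = [0.2, 0.5, 0.8]
prior = [1 / 3, 1 / 3, 1 / 3]
posterior = posterior_density(1, thetas, prior, bernoulli_density)
print(posterior)  # proportional to thetas, roughly [0.133, 0.333, 0.533]
```

With a uniform prior the posterior is proportional to the likelihood, which is why observing x = 1 shifts mass towards the larger means.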
The reader may wonder why there was so much fuss about the existence of Q(· | x)
in the previous section if its density is given by a simple formula like (34.2).
In other words, why not turn things around and define Q(· | x) via (34.2)? The
crux of the problem is that it is often hard to come up with an appropriate
dominating measure μ, and in general the denominator on the right-hand side of
(34.2) could be zero for some particular value of x. But when we can identify
an appropriate measure μ and the denominators are nonzero, the above formula
can indeed be used as the definition of Q(· | x) (Exercise 34.3).
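
To make the zero-denominator issue concrete, here is a small sketch with a finite Θ and counting measures. The prior, parameter values and function names are hypothetical and chosen only so that one observation has zero marginal density.

```python
def marginal_density(x, thetas, prior, likelihood):
    """Denominator of (34.2) for finite Theta with nu the counting measure."""
    return sum(likelihood(t, x) * q for t, q in zip(thetas, prior))


def bernoulli_density(theta, x):
    """Density of Bernoulli(theta) w.r.t. counting measure on {0, 1}."""
    return theta if x == 1 else 1.0 - theta


# Hypothetical prior that puts all of its mass on theta = 0.
thetas = [0.0, 0.5]
prior = [1.0, 0.0]

for x in (0, 1):
    m = marginal_density(x, thetas, prior, bernoulli_density)
    status = "defined" if m > 0 else "undefined (0/0)"
    print(f"x = {x}: denominator = {m}, posterior density {status}")
```

Under this prior the observation x = 1 has marginal probability zero, so (34.2) reads 0/0 there. Because such observations form a null set, this is compatible with Theorem 34.1, which only pins the posterior down outside a null set.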

The nonuniqueness issue frequentists need to resolve
A minor annoyance when using Bayesian methods as part of a frequentist argument
is that the posterior need not be unique.

Example 34.2. Let Θ = [0, 1], let Q be the uniform measure on (Θ, B(Θ)) and let P_θ = δ_θ be the Dirac measure on [0, 1] at θ. Further, let X : [0, 1] → [0, 1] be the identity: X(x) = x for all x ∈ [0, 1]. The following posterior satisfies the conditions of Theorem 34.1 for any countable set C ⊂ [0, 1] and probability measure μ on ([0, 1], B([0, 1])):

$$
Q(A \mid x) =
\begin{cases}
\delta_x(A), & \text{if } x \notin C; \\
\mu(A), & \text{if } x \in C.
\end{cases}
$$
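
The two versions of the posterior in this example can be compared numerically. The sketch below assumes a particular finite set C, takes μ to be the uniform measure and restricts A to intervals [a, b] with 0 ≤ a ≤ b ≤ 1. All of these choices are illustrative assumptions, not part of the example.

```python
import random

C = {0.25, 0.5, 0.75}  # hypothetical exceptional set (finite, hence countable)


def posterior_standard(a, b, x):
    """Q([a, b] | x) = delta_x([a, b])."""
    return 1.0 if a <= x <= b else 0.0


def posterior_modified(a, b, x):
    """Same kernel, except that mu = Uniform[0, 1] is used when x is in C."""
    if x in C:
        return b - a  # mu([a, b]) for the uniform measure on [0, 1]
    return posterior_standard(a, b, x)


# The two kernels genuinely differ at points of C ...
print(posterior_standard(0.0, 0.5, 0.5), posterior_modified(0.0, 0.5, 0.5))

# ... but theta ~ Q is uniform and X = theta, so X lands in C with probability zero.
rng = random.Random(0)
samples = [rng.random() for _ in range(100_000)]
disagreements = sum(
    posterior_standard(0.0, 0.5, x) != posterior_modified(0.0, 0.5, x)
    for x in samples
)
print(disagreements)
```

In floating-point arithmetic the chance of sampling a point of C is tiny rather than exactly zero, so the simulation only approximates the measure-theoretic statement that the two posteriors agree P_X-almost surely.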

A true Bayesian is unconcerned. If θ is sampled from the prior Q, then the event
