Bandit Algorithms

3.4 Notes

where Θ is the parameter space and P_θ is a measure on some measurable space
(Ω, F). This notation is often more convenient than writing P(θ, ·). In Bayesian
statistics the posterior is a probability kernel from the observation space to
the parameter space and this is often written as P(· | x).
3 There is some disagreement about whether or not a Markov chain on an
uncountable state space should instead be called a Markov process. In this
book we use Markov chain for arbitrary state spaces and discrete time. When
time is continuous (which it never is in this book), there is general agreement
that ‘process’ is more appropriate. For more history on this debate see [Meyn
and Tweedie, 2012, preface].
4 A topological space X is Polish if (a) it is separable and (b) there exists a
metric d on X that induces the topology and makes (X, d) a complete metric
space. All Polish spaces are Borel spaces.
5 In Theorem 3.2 it was assumed that each μ_n was defined on a Borel space.
No such assumption was required for Theorem 3.3, however. One can derive
Theorem 3.2 from Theorem 3.3 by using the existence of regular conditional
probability measures when conditioning on random elements taking values
in a Borel space (see the next note). Topological assumptions often creep
into foundational questions relating to the existence of probability measures
satisfying certain conditions and pathological examples show these assumptions
cannot be removed completely. Luckily, in this book we have no reason to
consider random elements that do not take values in a Borel space.
6 The fact that conditional expectation is only unique almost surely can be
problematic when you want a conditional distribution. Given random elements
X and Y on the same probability space it seems reasonable to hope that
P(X ∈ · | Y) is a probability kernel from the space of Y to that of X. A
version of the conditional distribution that satisfies this is called a regular
version. In general, there is no guarantee that such a regular version exists. The
basic properties of conditional expectation only guarantee that for any fixed
measurable A, P(X ∈ A | Y) is unique up to a set of measure zero. The set of
measure zero can depend on A, which causes problems when there are ‘too
many’ measurable sets in the space of X. Assuming X lives in a Borel space,
the following theorem guarantees the existence of a conditional distribution.

Theorem 3.11 (Regular conditional distributions). Let X and Y be random elements on the same probability space (Ω, F, P) taking values in measurable spaces 𝒳 and 𝒴 respectively and assume that 𝒳 is Borel. Then there exists a probability kernel K from 𝒴 to 𝒳 such that K(· | Y) = P(X ∈ · | Y) P-almost surely. Furthermore, K is unique in the sense that for any kernels K_1 and K_2 satisfying this condition it holds that K_1(· | y) = K_2(· | y) for all y in some set of P_Y-measure one.
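
The uniqueness statement in Theorem 3.11 is easiest to see on finite spaces, where a kernel is just a table of conditional probabilities. The following is a minimal sketch rather than anything taken from the text: the joint distribution, the names joint, make_kernel and fallback, and the choice of value sets are all hypothetical. It builds two kernels that differ only at a value of Y with probability zero and checks that they agree on a set of P_Y-measure one.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); the value "c" of Y has probability zero.
X_VALUES = (0, 1)
Y_VALUES = ("a", "b", "c")
joint = {
    (0, "a"): Fraction(1, 4),
    (1, "a"): Fraction(1, 4),
    (0, "b"): Fraction(1, 8),
    (1, "b"): Fraction(3, 8),
    (0, "c"): Fraction(0),
    (1, "c"): Fraction(0),
}


def marginal_y(joint):
    """Law of Y: P_Y({y}) = sum over x of P(X = x, Y = y)."""
    return {y: sum(joint[(x, y)] for x in X_VALUES) for y in Y_VALUES}


def make_kernel(joint, fallback):
    """Return K with K[y][x] = P(X = x | Y = y).

    Where P_Y({y}) = 0 the conditional probability is not determined,
    so any probability vector will do; `fallback` is used there.
    """
    p_y = marginal_y(joint)
    kernel = {}
    for y in Y_VALUES:
        if p_y[y] > 0:
            kernel[y] = {x: joint[(x, y)] / p_y[y] for x in X_VALUES}
        else:
            kernel[y] = dict(fallback)
    return kernel


# Two versions that differ only in the arbitrary choice on the null set {"c"}.
k1 = make_kernel(joint, {0: Fraction(1), 1: Fraction(0)})
k2 = make_kernel(joint, {0: Fraction(1, 2), 1: Fraction(1, 2)})

p_y = marginal_y(joint)
agreement_set = {y for y in Y_VALUES if k1[y] == k2[y]}
mass_of_agreement = sum(p_y[y] for y in agreement_set)

# Both versions reproduce the joint law: P(X = x, Y = y) = K(x | y) P_Y({y}).
reconstructs = all(
    k[y][x] * p_y[y] == joint[(x, y)]
    for k in (k1, k2)
    for x in X_VALUES
    for y in Y_VALUES
)
```

On a finite space there are only finitely many sets A, so the null sets where versions disagree cause no trouble. The difficulty the theorem addresses arises when there are uncountably many measurable sets and the exceptional null set can depend on A.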

You can also condition on a σ-algebra G ⊂ F, in which case K is a probability kernel from (Ω, G) to 𝒳. The condition that 𝒳 be Borel is sufficient, but not necessary. Some conditions are required, however. An example where no regular
