Bandit Algorithms

This material will be published by Cambridge University Press as Bandit Algorithms by Tor Lattimore and Csaba Szepesvari. This pre-publication version is free to view and download for personal use only. Not for re-distribution, sale, or use in derivative works.©Tor Lattimore and Csaba Szepesvari 2017. The latest version is available athttp://banditalgs.com. Feedback on any aspect is very welcome: [email protected]

3 Stochastic Processes and Markov Chains ( )

The measure-theoretic probability in the previous chapter covers almost all the definitions required. Occasionally, however, infinite sequences of random variables arise and for these a little more machinery is needed. We expect most readers will skip this chapter on the first reading, perhaps referring to it when necessary. Before one can argue about the properties of infinite sequences of random variables it must be demonstrated that such sequences exist under certain constraints on their joint distributions. For example, does there exist an infinite sequence of random variables such that any finite subset are independent and distributed like a standard Gaussian. The first theorem provides conditions under which questions like this can be answered positively. This allows us to write, for example, “let (Xn)∞n=1be an infinite sequence of independent standard Gaussian random variables” and be comfortable knowing there exists a probability space on which these random variables can be defined. To state the theorem we need the concept ofBorel spaces, which we introduce next. Two measurable spaces (X,F) and (Y,G) are said to beisomorphicif there exists a bijective functionf:X →Ysuch thatfisF/G-measurable andf−^1 is G/F-measurable. ABorel spaceis a measurable space (X,F) that is isomorphic to (A,B(A)) withA∈B(R) a Borel measurable subset of the reals. This is not a very strong assumption. For example, (Rn,B(Rn)) is a Borel space, along with all of its measurable subsets.

Theorem3.1.Letμbe a probability measure on a Borel spaceSandλbe the Lebesgue measure on([0,1],B([0,1]). Then there exists a sequence of independent random elementsX 1 ,X 2 ,...on([0,1],B([0,1]),λ)such that the lawλXt=μfor al lt.

We give a sketch of the proof because, although it is not really relevant for the material in this book, it illustrates the general picture and dispels some of the mystic about what is really going on. Exercise 3.1 asks you to provide the missing steps from the proof.

Proof sketch of Theorem 3.1 For simplicity we consider only the case that S= ([0,1],B([0,1])) andμis the Lebesgue measure. For anyx∈[0,1] let F 1 (x),F 2 (x),...be the binary expansion ofx, which is the unique binary-valued

Bandit Algorithms

3 Stochastic Processes and Markov Chains ( )

Get our desktop app

Company

Features

Documentation

Resources