Bandit Algorithms

(Jeff_L) #1
3.2 Markov chains 47

3.2 Markov chains


A Markov chain is an infinite sequence of random elementsX 1 ,X 2 ,...where the
conditional distribution ofXt+1givenX 1 ,...,Xtis the same as the conditional
distribution ofXt+1givenXt. The sequence has the property that given the last
element, the history is irrelevant to ‘predict’ the future. Such random sequences
appear throughout probability theory and have many applications besides. The
theory is too rich to explain in detail, so we give the basics and point towards the
literature for more details at the end. The focus here is mostly on the definition
and existence of Markov chains.
Let (X,F) and (Y,G) be measurable spaces. Aprobability kernelorMarkov
kernelbetween (X,F) and (Y,G) is a functionK:X ×G →[0,1] such that:
(a)K(x,·) is a measure for allx∈X.
(b)K(·,A) isF-measurable for allA∈G.
The idea here is thatKdescribes a stochastic transition. Having arrived atx, a
process’s next state is sampledY∼K(x,·). Occasionally one sees the notation
Kx(A) orK(A|x) rather thanK(x,A).
IfK 1 is a (X,F)→(Y,G) probability kernel andK 2 is a (Y,G)→(Z,H)
probability kernel, then theproduct kernelK 1 ⊗K 2 is the probability kernel
from (X,F)→(Y ×Z,G×H) defined by

(K 1 ⊗K 2 )(x,A) =


Y


Z

IA((y,z))K 2 (y,dz)K 1 (x,dy).

WhenPis a measure on (X,F) andKis a kernel fromXtoY, thenP⊗Kis a
measure on (X ×Y,F ×G) defined by

(P⊗K)(A) =


X


Y

IA((x,y))K(x,dy)dP(x).

There operations can be composed. WhenPis a probability measure onXand
K 1 a kernel fromXtoYandK 2 a kernel fromX ×YtoZ, thenP⊗K 1 ⊗K 2
is a probability measure onX ×Y ×Z. The following provides a counterpart of
Theorem 3.2.
Theorem3.3 (Ionescu-Tulcea).Let(Ωn,Fn)∞n=1be a sequence of measurable
spaces and K 1 be a probability measure on(Ω 1 ,F 1 )and forn≥ 2 letKn
be a probability kernel from

∏n− 1
t=1Ωt toΩn. Then there exists a probability
space(Ω,F,P)and random elements(Xt)∞t=1 withXt : Ω →Ωt such that
PX 1 ,...,Xn=

⊗n
t=1Ktfor alln∈N

+.


Ahomogeneous Markov chainis a sequence of random elements (Xt)∞t=1
taking values instate spaceS= (X,F) and with
P(Xt+1∈·|X 1 ,...,Xt) =P(Xt+1∈·|Xt) =μ(Xt,·) almost surely,
whereμis a probability kernel from (X,F) to (X,F) and we assume that
P(X 1 ∈·) =μ 0 (·) for some measureμ 0 on (X,F).
Free download pdf