Bandit Algorithms

2.7 Notes

is that $Q$ is forcing $P$ (to be nil when it is nil). One can also remember the
direction of $\ll$ by thinking (symbolically) of dividing both sides of $P \ll Q$ by
$Q$ to give $dP/dQ < \infty$, that is, the existence of the Radon-Nikodym derivative
of $P$ with respect to $Q$.
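11 A useful result for Radon-Nikodym derivatives is the chain rule, which states
that if $P \ll Q \ll S$, then $\frac{dP}{dQ}\frac{dQ}{dS} = \frac{dP}{dS}$. The proof of this result follows from our
earlier observation that $\int f\,dQ = \int f \frac{dQ}{dS}\,dS$ for any $Q$-integrable $f$. Indeed, the
chain rule is obtained from this by taking $f = I_A \frac{dP}{dQ}$ with $A \in \mathcal{F}$ and noting
that this is indeed $Q$-integrable and $\int I_A \frac{dP}{dQ}\,dQ = \int I_A\,dP$. Substituting gives
$P(A) = \int I_A \frac{dP}{dQ}\frac{dQ}{dS}\,dS$ for every $A \in \mathcal{F}$, which is the defining property
of $\frac{dP}{dS}$. The chain rule is
often used to reduce the calculation of densities to calculation with known
densities.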
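The chain rule can be checked numerically on a finite space, where every Radon-Nikodym derivative reduces to a pointwise ratio of point masses. The following is a minimal sketch under that assumption; the three-point space, the measures `S`, `Q`, `P` and the helper `density` are illustrative choices, not taken from the text.

```python
from fractions import Fraction

# Hypothetical measures on the three-point space {0, 1, 2} (illustrative only).
S = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]  # reference measure S
Q = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]     # Q << S
P = [Fraction(1, 4), Fraction(3, 4), Fraction(0)]     # P << Q

def density(num, den):
    """Radon-Nikodym derivative d(num)/d(den) on a finite space.

    Where den is zero, absolute continuity forces num to be zero as well,
    and the derivative may take any value there; 0 is chosen by convention.
    """
    if any(n != 0 and d == 0 for n, d in zip(num, den)):
        raise ValueError("num is not absolutely continuous w.r.t. den")
    return [n / d if d != 0 else Fraction(0) for n, d in zip(num, den)]

dP_dQ = density(P, Q)
dQ_dS = density(Q, S)
dP_dS = density(P, S)

# Chain rule, checked pointwise (in general it holds S-almost everywhere).
chain = [a * b for a, b in zip(dP_dQ, dQ_dS)]
print(chain == dP_dS)
```

Exact rational arithmetic is used so that the equality is checked without floating-point tolerance.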
12 The Radon-Nikodym derivative unifies the notions of distribution (for discrete
spaces) and density (for continuous spaces). Let $\Omega$ be discrete (finite or
countable) and let $\rho$ be the counting measure on $(\Omega, 2^\Omega)$, which is defined by
$\rho(A) = |A|$. For any $P$ on $(\Omega, \mathcal{F})$ it is easy to see that $P \ll \rho$ and $\frac{dP}{d\rho}(i) = P(\{i\})$,
which is sometimes called the distribution function of $P$.
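For a concrete picture of the counting-measure case, the sketch below assumes a hypothetical four-point space and recovers $P(A)$ by integrating the density $\frac{dP}{d\rho}$ against $\rho$, which for the counting measure is just a sum over the points of $A$. The names `rho`, `dP_drho` and `measure_via_density` are illustrative.

```python
from fractions import Fraction

# Hypothetical distribution on Omega = {0, 1, 2, 3}.
P = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}

def rho(A):
    """Counting measure: rho(A) = |A|."""
    return len(A)

def dP_drho(i):
    """Density of P with respect to rho, which is the point mass P({i})."""
    return P[i]

def measure_via_density(A):
    """P(A) as the integral of dP/drho over A with respect to rho.

    Each singleton {i} has rho-mass 1, so the integral is a plain sum.
    """
    return sum(dP_drho(i) * rho({i}) for i in A)

A = {1, 3}
direct = sum(P[i] for i in A)  # P(A) computed directly from the point masses
print(measure_via_density(A), direct)
```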
13 The Radon-Nikodym derivative provides one way to define the conditional
expectation. Let $X$ be an integrable random variable on $(\Omega, \mathcal{F}, P)$, let $\mathcal{H} \subset \mathcal{F}$
be a sub-$\sigma$-algebra and let $P|_{\mathcal{H}}$ be the restriction of $P$ to $(\Omega, \mathcal{H})$. Define the measure
$\mu$ on $(\Omega, \mathcal{H})$ by $\mu(A) = \int_A X\,dP|_{\mathcal{H}}$. It is easy to check that $\mu \ll P|_{\mathcal{H}}$ and
that $\mathbb{E}[X \mid \mathcal{H}] = \frac{d\mu}{dP|_{\mathcal{H}}}$ satisfies Eq. (2.9). We note that the proof of the
Radon-Nikodym theorem is nontrivial and that the existence of conditional
expectations is more easily guaranteed via an ‘elementary’ but abstract
argument using functional analysis.
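On a finite space where $\mathcal{H}$ is generated by a partition, the derivative $\frac{d\mu}{dP|_{\mathcal{H}}}$ is constant on each block $B$ and equals $\mu(B)/P(B)$. The sketch below illustrates this under two assumptions that are not in the text: the uniform six-point space, the variable `X` and the partition are made up, and Eq. (2.9) is taken to be the usual defining identity $\int_A \mathbb{E}[X \mid \mathcal{H}]\,dP = \int_A X\,dP$ for all $A \in \mathcal{H}$.

```python
from fractions import Fraction

# Hypothetical finite space Omega = {0, ..., 5} with the uniform measure.
omega = range(6)
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}  # an integrable random variable

# H is taken to be the sigma-algebra generated by this partition of Omega.
partition = [{0, 1}, {2, 3, 4}, {5}]

def mu(A):
    """mu(A) = integral of X over A with respect to P."""
    return sum(X[w] * P[w] for w in A)

def prob(A):
    """P(A) for a subset A of Omega."""
    return sum(P[w] for w in A)

# On each block B the derivative d(mu)/d(P|H) is the constant mu(B) / P(B).
cond_exp = {}
for B in partition:
    for w in B:
        cond_exp[w] = mu(B) / prob(B)

# Check the assumed form of Eq. (2.9) on one set in H: a union of two blocks.
A = partition[0] | partition[2]
print(sum(cond_exp[w] * P[w] for w in A) == mu(A))
```

Because every set in $\mathcal{H}$ is a union of blocks, checking the identity block by block is enough in this finite setting.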
14 The Fubini-Tonelli theorem is a powerful result that allows one to exchange
the order of integrations. This result is needed, for example, for proving
Proposition 2.8 (Exercise 2.18). To state it, we need to introduce product
measures. These work as expected: given two probability spaces, $(\Omega_1, \mathcal{F}_1, P_1)$
and $(\Omega_2, \mathcal{F}_2, P_2)$, the product measure $P$ of $P_1$ and $P_2$ is defined as any
measure on $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \otimes \mathcal{F}_2)$ that satisfies $P(A_1 \times A_2) = P_1(A_1) P_2(A_2)$ for
all $(A_1, A_2) \in \mathcal{F}_1 \times \mathcal{F}_2$ (recall that $\mathcal{F}_1 \otimes \mathcal{F}_2 = \sigma(\mathcal{F}_1 \times \mathcal{F}_2)$ is the product
$\sigma$-algebra of $\mathcal{F}_1$ and $\mathcal{F}_2$). Theorem 2.4 implies that this product measure,
which is often denoted by $P_1 \times P_2$ (or $P_1 \otimes P_2$), is uniquely defined. (Think
about what this product measure has to do with independence.) The Fubini-Tonelli
theorem (often just ‘Fubini’) states the following: let $(\Omega_1, \mathcal{F}_1, P_1)$ and
$(\Omega_2, \mathcal{F}_2, P_2)$ be two probability spaces and consider a random variable $X$ on
the product probability space $(\Omega, \mathcal{F}, P) = (\Omega_1 \times \Omega_2, \mathcal{F}_1 \otimes \mathcal{F}_2, P_1 \times P_2)$. If
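any of the three integrals
$$
\int |X(\omega)|\,dP(\omega), \qquad
\int \left( \int |X(\omega_1, \omega_2)|\,dP_1(\omega_1) \right) dP_2(\omega_2), \qquad
\int \left( \int |X(\omega_1, \omega_2)|\,dP_2(\omega_2) \right) dP_1(\omega_1)
$$
is finite, then
$$
\int X(\omega)\,dP(\omega)
= \int \left( \int X(\omega_1, \omega_2)\,dP_1(\omega_1) \right) dP_2(\omega_2)
= \int \left( \int X(\omega_1, \omega_2)\,dP_2(\omega_2) \right) dP_1(\omega_1).
$$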
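On finite spaces the integrals become finite sums, so the integrability condition holds automatically and the exchange of order can be verified directly. The sketch below builds a product measure from two made-up marginals `P1` and `P2` and compares the joint integral with both iterated integrals; the spaces, the marginals and the random variable `X` are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical marginals on two small finite spaces.
P1 = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
P2 = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

# Product measure on singletons: P({(w1, w2)}) = P1({w1}) * P2({w2}).
P = {(w1, w2): p1 * p2 for w1, p1 in P1.items() for w2, p2 in P2.items()}

def X(w1, w2):
    """An arbitrary random variable on the product space."""
    return (2 if w1 == "a" else -1) * (w2 + 1)

# Integral with respect to the product measure.
joint = sum(X(w1, w2) * p for (w1, w2), p in P.items())

# Iterated integrals: inner over P1 then outer over P2, and the reverse.
inner1_outer2 = sum(
    sum(X(w1, w2) * P1[w1] for w1 in P1) * P2[w2] for w2 in P2
)
inner2_outer1 = sum(
    sum(X(w1, w2) * P2[w2] for w2 in P2) * P1[w1] for w1 in P1
)

print(joint, inner1_outer2, inner2_outer1)
```

Under this product measure the two coordinate projections are independent by construction, which is one way to read the parenthetical remark about independence.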