Bandit Algorithms

2.6 Conditional expectation

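conditional expectation of $X$ given $\mathcal{H}$ is denoted by $\mathbb{E}[X \mid \mathcal{H}]$ and defined to be any $\mathcal{H}$-measurable random variable on $\Omega$ such that for all $H \in \mathcal{H}$,

$$\int_H \mathbb{E}[X \mid \mathcal{H}] \, d\mathbb{P} = \int_H X \, d\mathbb{P}. \tag{2.9}$$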

Given a random variable $Y$, the conditional expectation of $X$ given $Y$ is $\mathbb{E}[X \mid Y] = \mathbb{E}[X \mid \sigma(Y)]$.

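Theorem 2.11. Given any probability space $(\Omega, \mathcal{F}, \mathbb{P})$, a sub-σ-algebra $\mathcal{H}$ of $\mathcal{F}$ and a $\mathbb{P}$-integrable random variable $X : \Omega \to \mathbb{R}$, there exists an $\mathcal{H}$-measurable function $f : \Omega \to \mathbb{R}$ that satisfies (2.9). Further, any two $\mathcal{H}$-measurable functions $f_1, f_2 : \Omega \to \mathbb{R}$ that satisfy (2.9) are equal with probability one: $\mathbb{P}(f_1 = f_2) = 1$.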

When random variables $X$ and $Y$ agree with $\mathbb{P}$-probability one, we say they
are $\mathbb{P}$-almost surely equal, which is often abbreviated to ‘$X = Y$ $\mathbb{P}$-a.s.’ or
‘$X = Y$ a.s.’ when the measure is clear from context. A related useful notion is
the concept of null sets: $U \in \mathcal{F}$ is a null set of $\mathbb{P}$, or a $\mathbb{P}$-null set, if $\mathbb{P}(U) = 0$. Thus,
$X = Y$ $\mathbb{P}$-a.s. if and only if $X$ and $Y$ agree except on a $\mathbb{P}$-null set.
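The uniqueness claim of Theorem 2.11 holds only up to null sets, which a small finite example can make concrete. The sketch below uses a hypothetical four-point space in which two outcomes carry zero probability; the space, the partition and the functions `f1` and `f2` are illustrative assumptions, not taken from the text. Both candidates satisfy (2.9), yet they differ on a null set.

```python
from fractions import Fraction

# Hypothetical four-point space; outcomes 3 and 4 carry no probability mass.
P = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0), 4: Fraction(0)}
X = {w: Fraction(w) for w in P}

# Events of the sub-sigma-algebra generated by the partition {1,2}, {3,4}.
events = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]

def integral(f, H):
    """Integral of f over the event H with respect to P (a finite sum here)."""
    return sum(f[w] * P[w] for w in H)

def satisfies_2_9(f):
    """Check the defining identity (2.9) on every event of the sub-sigma-algebra."""
    return all(integral(f, H) == integral(X, H) for H in events)

# Two candidates, each constant on the atoms, differing only on {3, 4}.
f1 = {1: Fraction(3, 2), 2: Fraction(3, 2), 3: Fraction(0), 4: Fraction(0)}
f2 = {1: Fraction(3, 2), 2: Fraction(3, 2), 3: Fraction(100), 4: Fraction(100)}

disagreement = {w for w in P if f1[w] != f2[w]}
prob_disagree = sum(P[w] for w in disagreement)

print(satisfies_2_9(f1), satisfies_2_9(f2), disagreement, prob_disagree)
```

Here the set where `f1` and `f2` disagree is a P-null set, so the two functions are equal P-a.s. even though they are not equal everywhere.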

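The reader may find it odd that $\mathbb{E}[X \mid Y]$ is a random variable on $\Omega$ rather than on the range of $Y$. Lemma 2.5 and the fact that $\mathbb{E}[X \mid \sigma(Y)]$ is $\sigma(Y)$-measurable show that there exists a measurable function $f : (\mathbb{R}, \mathfrak{B}(\mathbb{R})) \to (\mathbb{R}, \mathfrak{B}(\mathbb{R}))$ such that $\mathbb{E}[X \mid \sigma(Y)](\omega) = (f \circ Y)(\omega)$ (see Fig. 2.4). In this sense, $\mathbb{E}[X \mid Y](\omega)$ depends only on $Y(\omega)$, and occasionally we write $\mathbb{E}[X \mid Y](y)$.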

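Figure 2.4 Factorization of conditional expectation: the map $\mathbb{E}[X \mid Y] : (\Omega, \mathcal{F}) \to (\mathbb{R}, \mathfrak{B}(\mathbb{R}))$ factors as $f \circ Y$, where $Y : (\Omega, \mathcal{F}) \to (\mathbb{R}, \mathfrak{B}(\mathbb{R}))$ and $f : (\mathbb{R}, \mathfrak{B}(\mathbb{R})) \to (\mathbb{R}, \mathfrak{B}(\mathbb{R}))$. When there is no confusion we occasionally write $\mathbb{E}[X \mid Y](y)$ in place of $f(y)$.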
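The factorization through $Y$ can be sketched on a finite space. The setup below is an assumption, since Example 2.9 is not restated in this excerpt: a fair six-sided die with $X(\omega) = \omega$ and $Y(\omega)$ the indicator of $\omega > 3$. The function `f` is defined on the range of `Y`, and composing it with `Y` gives a random variable on $\Omega$.

```python
from fractions import Fraction

# Assumed setup (not restated in this excerpt): a fair six-sided die.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    """Assumed outcome variable: the face shown by the die."""
    return Fraction(w)

def Y(w):
    """Assumed coarse observation: 1 if the face exceeds 3, else 0."""
    return 1 if w > 3 else 0

def f(y):
    """Value of E[X | Y] on the event {Y = y}; requires P(Y = y) > 0."""
    event = [w for w in omega if Y(w) == y]
    mass = sum(P[w] for w in event)
    return sum(X(w) * P[w] for w in event) / mass

# E[X | Y] as a random variable on Omega is the composition f(Y(w)).
cond_exp = {w: f(Y(w)) for w in omega}
print(cond_exp)
```

On a finite space with every value of `Y` having positive probability, `f` is determined uniquely; in general it is determined only up to null sets.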

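Returning to Example 2.9, we see that $\mathbb{E}[X \mid Y] = \mathbb{E}[X \mid \sigma(Y)]$ and $\sigma(Y) = \{\{1, 2, 3\}, \{4, 5, 6\}, \emptyset, \Omega\}$. Write $\mathcal{H} = \sigma(Y)$. The condition that $\mathbb{E}[X \mid \mathcal{H}]$ is $\mathcal{H}$-measurable can only be satisfied if $\mathbb{E}[X \mid \mathcal{H}](\omega)$ is constant on $\{1, 2, 3\}$ and on $\{4, 5, 6\}$. Then (2.9) immediately implies that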

$$\mathbb{E}[X \mid \mathcal{H}](\omega) = \begin{cases} 2, & \text{if } \omega \in \{1, 2, 3\}; \\ 5, & \text{if } \omega \in \{4, 5, 6\}. \end{cases}$$
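The values 2 and 5 can be checked numerically against (2.9). The sketch below assumes, as the values suggest but this excerpt does not restate, that Example 2.9 is a fair six-sided die with $X(\omega) = \omega$. It averages $X$ over each atom of $\sigma(Y)$ and then verifies the integral identity on all four events of $\sigma(Y)$; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed setup (not restated in this excerpt): a fair die with X(w) = w.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w) for w in omega}

# Atoms of sigma(Y) as given in the text.
atoms = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]

def integral(f, H):
    """Integral of f over the event H with respect to P (a finite sum here)."""
    return sum(f[w] * P[w] for w in H)

def conditional_expectation(X, atoms):
    """Candidate that is constant on each atom: the P-weighted average of X."""
    cond = {}
    for A in atoms:
        value = integral(X, A) / sum(P[w] for w in A)
        for w in A:
            cond[w] = value
    return cond

def generated_sigma_algebra(atoms):
    """All unions of atoms, including the empty union."""
    subsets = chain.from_iterable(
        combinations(atoms, r) for r in range(len(atoms) + 1)
    )
    return [frozenset().union(*s) for s in subsets]

cond = conditional_expectation(X, atoms)
sigma_Y = generated_sigma_algebra(atoms)

# Check (2.9) on every event H in sigma(Y).
checks = [integral(cond, H) == integral(X, H) for H in sigma_Y]
print(cond, checks)
```

Exact rational arithmetic via `Fraction` avoids floating-point round-off, so the equality checks in (2.9) are exact rather than approximate.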

While the definition of conditional expectations given above is non-constructive
