Bandit Algorithms

2.6 Conditional expectation 33

Eq. (2.8) does not generalize to continuous random variables because $\mathbb{P}(Y = y)$ in the denominator might be zero for all $y$. For example, let $Y$ be a random variable taking values in $[0,1]$ according to a uniform distribution and $X \in \{0,1\}$ be Bernoulli with bias $Y$. This means that the joint measure on $X$ and $Y$ is $\mathbb{P}(X = 1, Y \in [p,q]) = \int_p^q x \, dx$ for $0 \le p < q \le 1$. Intuitively it seems like $\mathbb{E}[X \mid Y]$ should be equal to $Y$, but how to define it? The mean of a Bernoulli random variable is equal to its bias, so the definition of conditional probability shows that for $0 \le p < q \le 1$,

$$
\mathbb{E}[X \mid Y \in [p,q]] = \mathbb{P}(X = 1 \mid Y \in [p,q]) = \frac{\mathbb{P}(X = 1, Y \in [p,q])}{\mathbb{P}(Y \in [p,q])} = \frac{q^2 - p^2}{2(q - p)} = \frac{p + q}{2} .
$$

This calculation is not well defined when $p = q$ because $\mathbb{P}(Y \in [p,p]) = 0$.
Nevertheless, letting $q = p + \varepsilon$ for $\varepsilon > 0$ and taking the limit as $\varepsilon$ tends to zero
seems like a reasonable way to argue that $\mathbb{P}(X = 1 \mid Y = p) = p$. Unfortunately
this approach does not generalize to abstract spaces because there is no canonical
way of taking limits towards a set of measure zero and different choices lead to
different answers.
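The interval calculation and the limiting argument can be checked numerically. The following is a minimal sketch in Python, assuming the model of this example ($Y$ uniform on $[0,1]$ and $X$ Bernoulli with bias $Y$). The helper names `joint_prob`, `cond_prob` and `simulate` are our own illustrative choices and do not appear elsewhere in the text. The sketch evaluates $\mathbb{P}(X = 1 \mid Y \in [p,q])$ exactly with rational arithmetic, shrinks the interval by taking $q = p + \varepsilon$, and compares the closed form with a Monte Carlo estimate.

```python
from fractions import Fraction
import random


def joint_prob(p, q):
    """P(X = 1, Y in [p, q]) = integral of x dx over [p, q] = (q^2 - p^2) / 2."""
    return (q * q - p * p) / 2


def cond_prob(p, q):
    """P(X = 1 | Y in [p, q]) for 0 <= p < q <= 1; undefined when p == q."""
    if not 0 <= p < q <= 1:
        raise ValueError("need 0 <= p < q <= 1")
    # For uniform Y the denominator is P(Y in [p, q]) = q - p.
    return joint_prob(p, q) / (q - p)


def simulate(p, q, n=200_000, seed=0):
    """Monte Carlo estimate of P(X = 1 | Y in [p, q]), keeping samples with Y in [p, q]."""
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n):
        y = rng.random()        # Y ~ Uniform[0, 1]
        x = rng.random() < y    # X ~ Bernoulli(Y)
        if p <= y <= q:
            total += 1
            hits += x
    return hits / total


p = Fraction(3, 10)
exact = cond_prob(p, Fraction(7, 10))  # equals (p + q) / 2
# Shrink the interval: q = p + eps with eps = 10^-1, ..., 10^-5.
shrinking = [cond_prob(p, p + Fraction(1, 10 ** k)) for k in range(1, 6)]
estimate = simulate(0.3, 0.7)

print(exact, [float(s) for s in shrinking], estimate)
```

Each exact value equals $p + \varepsilon/2$, which tends to $p$ as $\varepsilon$ shrinks, in line with the heuristic argument. The sketch relies on the interval structure of $[0,1]$ and says nothing about abstract spaces, where no such canonical limit is available.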
Instead we use Eq. (2.8) as the starting point for an abstract definition of
conditional expectation as a random variable satisfying two requirements. First,
from Eq. (2.8) we see that $\mathbb{E}[X \mid Y](\omega)$ should only depend on $Y(\omega)$ and so
should be measurable with respect to $\sigma(Y)$. The second requirement is called the
‘averaging property’. For measurable $A \subseteq \mathcal{Y}$, Eq. (2.8) shows that

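$$
\mathbb{E}[\mathbb{I}_{Y^{-1}(A)} \, \mathbb{E}[X \mid Y]] = \sum_{y \in A} \mathbb{P}(Y = y) \, \mathbb{E}[X \mid Y = y] = \sum_{y \in A} \sum_{x \in \mathcal{X}} x \, \mathbb{P}(X = x, Y = y) = \mathbb{E}[\mathbb{I}_{Y^{-1}(A)} X] .
$$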
This can be viewed as putting a set of linear constraints on $\mathbb{E}[X \mid Y]$ with one
constraint for each measurable $A \subseteq \mathcal{Y}$. By treating $\mathbb{E}[X \mid Y]$ as an unknown
$\sigma(Y)$-measurable random variable, we can attempt to solve this linear system. As
it turns out, this can always be done: the linear constraints and the measurability
restriction on $\mathbb{E}[X \mid Y]$ completely determine $\mathbb{E}[X \mid Y]$ except for a set of measure
zero. Notice that both conditions only depend on $\sigma(Y) \subseteq \mathcal{F}$. The abstract
definition of conditional expectation takes these properties as the definition and
replaces the role of $Y$ with a sub-$\sigma$-algebra.
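In the finite discrete case the averaging property can be verified by brute force. The following is a small sketch in Python under assumptions of our own: the joint distribution `joint` is a made-up example, and the helper names `marginal_y`, `cond_exp` and `averaging_sides` are illustrative only. It computes $\mathbb{E}[X \mid Y = y]$ as $\sum_x x \, \mathbb{P}(X = x, Y = y) / \mathbb{P}(Y = y)$ and then checks, for every subset $A$ of the values of $Y$, that both sides of the averaging identity agree.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical joint distribution P(X = x, Y = y), keyed by (x, y); sums to 1.
joint = {
    (0, "a"): Fraction(1, 8), (1, "a"): Fraction(3, 8),
    (0, "b"): Fraction(1, 4), (2, "b"): Fraction(1, 12),
    (1, "c"): Fraction(1, 6),
}


def marginal_y(joint):
    """Return the map y -> P(Y = y)."""
    out = {}
    for (_, y), pr in joint.items():
        out[y] = out.get(y, 0) + pr
    return out


def cond_exp(joint):
    """Return the map y -> E[X | Y = y] = sum_x x P(X = x, Y = y) / P(Y = y)."""
    p_y = marginal_y(joint)
    num = {}
    for (x, y), pr in joint.items():
        num[y] = num.get(y, 0) + x * pr
    return {y: num[y] / p_y[y] for y in p_y}


def averaging_sides(joint, A):
    """Return (E[1{Y in A} E[X | Y]], E[1{Y in A} X]) for a set A of Y-values."""
    g = cond_exp(joint)
    lhs = sum(pr * g[y] for (x, y), pr in joint.items() if y in A)
    rhs = sum(pr * x for (x, y), pr in joint.items() if y in A)
    return lhs, rhs


ys = sorted(marginal_y(joint))
# Every subset A of the values of Y gives one linear constraint.
subsets = [set(c) for r in range(len(ys) + 1) for c in combinations(ys, r)]
checks = [averaging_sides(joint, A) for A in subsets]

print(cond_exp(joint))
print(all(lhs == rhs for lhs, rhs in checks))
```

Exact rational arithmetic is used so that the two sides can be compared with equality rather than a tolerance. On a finite space with every value of $Y$ having positive probability, the constraints pin down $\mathbb{E}[X \mid Y]$ everywhere; the exception on a set of measure zero only becomes relevant in the general setting.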
Definition 2.10 (Conditional expectation). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space
and $X : \Omega \to \mathbb{R}$ be a random variable and $\mathcal{H}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. The
