11.4 Regret analysis 149
The ratio in the product can be rewritten in terms ofPtby
Wt
Wt− 1
=
∑k
j=1
exp(ηSˆt− 1 ,j)
Wt− 1
exp(ηXˆtj) =
∑k
j=1
Ptjexp(ηXˆtj). (11.11)
We need the following facts:
exp(x)≤1 +x+x^2 for allx≤1 and 1 +x≤exp(x) for allx∈R.
Using these two inequalities leads to
Wt
Wt− 1
≤1 +η
∑k
j=1
PtjXˆtj+η^2
∑k
j=1
PtjXˆ^2 tj
≤exp
η
∑k
j=1
PtjXˆtj+η^2
∑k
j=1
PtjXˆtj^2
. (11.12)
Notice that this was only possible becauseXˆtjis defined by Eq. (11.6), which
ensures thatXˆtj ≤1 and would not have been true had we used Eq. (11.3).
Combining Eq. (11.12) and Eq. (11.9),
exp
(
ηSˆni
)
≤kexp
ηSˆn+η^2
∑n
t=1
∑k
j=1
PtjXˆ^2 tj
.
Taking the logarithm of both sides, dividing byη >0 and reordering gives
Sˆni−Sˆn≤log(k)
η
+η
∑n
t=1
∑k
j=1
PtjXˆtj^2. (11.13)
As noted earlier, the expectation of the left-hand side isRni. The first term on
the right-hand side is a constant, which leaves us to bound the expectation of the
second term. Lettingytj= 1−xtjandYt= 1−Xtand expanding the definition