The ratio in the product can be rewritten in terms of $P_t$ by
$$\frac{W_t}{W_{t-1}} = \sum_{j=1}^k \frac{\exp(\eta \hat S_{t-1,j})}{W_{t-1}} \exp(\eta \hat X_{tj}) = \sum_{j=1}^k P_{tj} \exp(\eta \hat X_{tj}). \tag{11.11}$$
We need the following facts:
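$$\exp(x) \le 1 + x + x^2 \text{ for all } x \le 1 \quad \text{and} \quad 1 + x \le \exp(x) \text{ for all } x \in \mathbb{R}.$$
Using these two inequalities leads to
$$\frac{W_t}{W_{t-1}} \le 1 + \eta \sum_{j=1}^k P_{tj} \hat X_{tj} + \eta^2 \sum_{j=1}^k P_{tj} \hat X_{tj}^2 \le \exp\left(\eta \sum_{j=1}^k P_{tj} \hat X_{tj} + \eta^2 \sum_{j=1}^k P_{tj} \hat X_{tj}^2\right). \tag{11.12}$$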
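The two elementary facts and the resulting chain in (11.12) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the random check assumes $\eta \le 1$ together with $\hat X_{tj} \le 1$, so that $\eta \hat X_{tj} \le 1$ and the first fact applies.

```python
import math
import random

def upper_fact_holds(x, tol=1e-12):
    # exp(x) <= 1 + x + x^2, stated in the text for x <= 1.
    return math.exp(x) <= 1 + x + x * x + tol

def lower_fact_holds(x, tol=1e-12):
    # 1 + x <= exp(x), stated in the text for all real x.
    return 1 + x <= math.exp(x) + tol

def one_step_chain(p, x_hat, eta):
    # The three quantities compared in (11.12), from left to right.
    ratio = sum(pj * math.exp(eta * xj) for pj, xj in zip(p, x_hat))
    first = sum(pj * xj for pj, xj in zip(p, x_hat))
    second = sum(pj * xj * xj for pj, xj in zip(p, x_hat))
    quadratic = 1 + eta * first + eta ** 2 * second
    exponential = math.exp(eta * first + eta ** 2 * second)
    return ratio, quadratic, exponential

def random_chain_check(trials=200, k=4, seed=1):
    # Assumption of this sketch: eta <= 1 and x_hat_j <= 1, so eta * x_hat_j <= 1.
    rng = random.Random(seed)
    slack = 1 + 1e-12  # relative tolerance for floating-point rounding
    for _ in range(trials):
        w = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(w)
        p = [v / total for v in w]  # an arbitrary probability vector
        x_hat = [rng.uniform(-20.0, 1.0) for _ in range(k)]
        eta = rng.uniform(0.05, 1.0)
        ratio, quadratic, exponential = one_step_chain(p, x_hat, eta)
        if not (ratio <= quadratic * slack and quadratic <= exponential * slack):
            return False
    return True
```

Calling `upper_fact_holds(2.0)` returns `False`, since $\exp(2) \approx 7.39 > 7$, which illustrates why an upper bound on the argument is needed for the first fact.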
Notice that this was only possible because $\hat X_{tj}$ is defined by Eq. (11.6), which
ensures that $\hat X_{tj} \le 1$, and would not have been true had we used Eq. (11.3).
Combining Eq. (11.12) and Eq. (11.9),
$$\exp\left(\eta \hat S_{ni}\right) \le k \exp\left(\eta \hat S_n + \eta^2 \sum_{t=1}^n \sum_{j=1}^k P_{tj} \hat X_{tj}^2\right).$$
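The combined inequality above holds on every sample path, not only in expectation, so it can be checked by simulation. The sketch below makes several assumptions that are not stated in this passage: it takes Eq. (11.6) to be the loss-based estimator $\hat X_{ti} = 1 - \mathbb{I}\{A_t = i\}(1 - X_t)/P_{ti}$, takes $\hat S_n = \sum_{t=1}^n \sum_{j=1}^k P_{tj} \hat X_{tj}$, uses Bernoulli rewards, and requires $\eta \le 1$. The function name and parameters are hypothetical. The comparison is done on the logarithmic scale to avoid overflow.

```python
import math
import random

def exp3_log_bound(means, n, eta, seed=0):
    # Returns (lhs, rhs), where lhs[i] = eta * S_hat_{ni} and
    # rhs = log(k) + eta * S_hat_n + eta^2 * sum_t sum_j P_tj * X_hat_tj^2.
    rng = random.Random(seed)
    k = len(means)
    s_hat = [0.0] * k  # cumulative estimated rewards per arm
    s_hat_alg = 0.0    # assumed definition: sum_t sum_j P_tj * X_hat_tj
    second = 0.0       # sum_t sum_j P_tj * X_hat_tj^2
    for _ in range(n):
        # Exponential weights, shifted by the maximum for numerical stability.
        m = max(s_hat)
        w = [math.exp(eta * (s - m)) for s in s_hat]
        total = sum(w)
        p = [wj / total for wj in w]
        a = rng.choices(range(k), weights=p)[0]
        x = 1.0 if rng.random() < means[a] else 0.0  # Bernoulli reward (assumption)
        # Loss-based importance-weighted estimator (our reading of Eq. (11.6)).
        x_hat = [1.0] * k
        x_hat[a] = 1.0 - (1.0 - x) / p[a]
        s_hat_alg += sum(pj * xj for pj, xj in zip(p, x_hat))
        second += sum(pj * xj * xj for pj, xj in zip(p, x_hat))
        s_hat = [s + xj for s, xj in zip(s_hat, x_hat)]
    lhs = [eta * s for s in s_hat]
    rhs = math.log(k) + eta * s_hat_alg + eta ** 2 * second
    return lhs, rhs
```

For example, `lhs, rhs = exp3_log_bound([0.3, 0.5, 0.7], n=300, eta=0.1)` should give `max(lhs) <= rhs` up to floating-point rounding, for any seed.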
Taking the logarithm of both sides, dividing by $\eta > 0$ and reordering gives
$$\hat S_{ni} - \hat S_n \le \frac{\log(k)}{\eta} + \eta \sum_{t=1}^n \sum_{j=1}^k P_{tj} \hat X_{tj}^2. \tag{11.13}$$
As noted earlier, the expectation of the left-hand side is $R_{ni}$. The first term on
the right-hand side is a constant, which leaves us to bound the expectation of the
second term. Letting $y_{tj} = 1 - x_{tj}$ and $Y_t = 1 - X_t$ and expanding the definition