12.2 Regret analysis 162Proof LetAti=I{At=i}as before. WritingYt=∑
jAtjytj, we calculateYt−∑kj=1PtjYˆtj=∑kj=1(
1 −
Ptj
Ptj+γ)
Atjytj=γ∑kj=1Atj
Ptj+γytj=γ∑kj=1Yˆtj.ThereforeL ̃n−Lˆn=γ∑k
j=1Lˆnjas required.
Proof of Theorem 12.1 By Eq. (12.3) and Lemma 12.4 we haveRˆn≤log(k)
η+ (L ̃n−Lˆn) + max
i∈[k](Lˆni−Lni) +η
2∑kj=1Lˆnj=
log(k)
η+ max
i∈[k](Lˆni−Lni) +(η
2+γ)∑kj=1Lˆnj.Therefore by Lemma 12.3, with probability at least 1−δit holds thatRˆn≤log(k)
η+
log(k+1
δ)
2 γ+
(
γ+η
2)
∑kj=1Lnj+log(k+1
δ)
2 γ
≤
log(k)
η +(
γ+η
2)
nk+(
γ+η
2 + 1)log(k+1
δ)
2 γ ,
where the second inequality follows sinceLnj≤nfor allj. The result follows by
substituting the definitions ofη∈{η 1 ,η 2 }andγ=η/2.12.2.1 Proof of Lemma 12.2
We start with a technical inequality.Lemma12.5.For any 0 ≤x≤ 2 λit holds thatexp(
x
1 +λ)
≤1 +x.Note that 1 +x≤exp(x). What the lemma shows is that by slightly discounting
the argument of the exponential function, in a bounded neighborhood of zero,
1 +xcan be an upper bound for the resulting function. Or, equivalently, slightly
inflating the linear term in 1 +x, the linear lower bound becomes an upper bound.
Proof of Lemma 12.5 We haveexp(
x
1 +λ)
≤exp(
x
1 +x/ 2)
≤1 +x,where the first inequality is becauseλ7→exp(
x
1+λ)
is decreasing inλ, and the
second is because1+^2 uu≤log(1 + 2u) holds for allu≥0. This latter inequality
can be seen to hold by noting that foru= 0 the two sides are equal, while the
derivative of the left-hand side is smaller than that of the right-hand side at any
u≥0.