37.6 Upper bound for easy games 467By part (d) of Lemma 37.14 we havePtb≥γ/kfor alltandb∈[k]. Furthermore,
|vaj(At,Φt)|≤Vso thatη|Zˆtja|=|ηP ̃tjvaj(At,Φt)/PtAt|≤ηV k/γ= 1, where
the equality follows from the choice ofγ. Assumption(b)is satisfied in a similar
way withηβtja=η^2 V^2
∑
b∈NajP ̃
tj^2 /Ptb≤η^2 k^2 V^2 /γ=ηV ≤1, where in the
last inequality we used the definition ofηand assumed thatn≥log(k/δ). To
make sure that the regret bound holds even for smaller values ofn, we require
CG≥k√
log(ek)so that whenn < k^2 log(k/δ), the regret bound is trivial. For
assumption(c), we haveEt− 1 [Zˆtja^2 ] =Et− 1[(
P ̃tjvaj(At,Φt)
PtAt) 2 ]
≤V^2 P ̃tj^2 Et− 1[
INaj(At)
PtA^2 t]
=V^2
∑
b∈NajP ̃tj^2
Ptb=
βtja
η.
Finally(d)is satisfied by the definition ofvajand the fact thatPt∈ri(Pk− 1 ).
The result of Exercise 12.2 shows that with probability at least 1−(k+ 1)δ,
Rˆnj≤3 log(1/δ)
η+ 5
∑nt=1∑
a∈Nj∩AQtjaβtja+η∑nt=1∑
a∈Nj∩AQtjaZˆtja^2.Step 2: Aggregating the local regret
Using the result from the previous step in combination with a union bound over
j∈Awe have that with probability at least 1−k(k+ 1)δ,
∑j∈ARˆnj≤^3 klog(1/δ)
η+ 5
∑nt=1∑
j∈A∑
a∈Nj∩AQtjaβtja+η∑nt=1∑
j∈A∑
a∈Na∩AQtjaZˆtja^2. (37.18)the second term is bounded using the definition ofβtjain Eq. (37.12).
∑a∈Nj∩AQtjaβtja=ηV^2∑
a∈Nj∩AQtja∑
b∈NajP ̃tj^2
Ptb=ηV^2 P ̃tj∑
a∈Nj∩AQtja∑
b∈NajP ̃tj
Ptb.
Next split the sum that runs overb∈Najinto two, separating duplicates ofj
and the rest:
∑a∈Nj∩AQtja∑
b∈NajP ̃tj
Ptb=
∑
a∈Nj∩AQtja∑
b:`b=`jP ̃tj
Ptb+
∑
a∈Nj∩AQtja∑
b∈Naj:`b 6 =`jP ̃tj
Ptb=
∑
b:`b=`jP ̃tj
Ptb+
∑
a∈Nj∩A∑
b∈Naj:`b 6 =`jQtjaP ̃tj
Ptb≤ 4 k
∑
b:`b=`j1 +
∑
a∈Nj∩A∑
b∈Naj:`b 6 =`j1
≤ 4 k^2 , (37.19)