37.6 Upper bound for easy games 467
By part (d) of Lemma 37.14 we havePtb≥γ/kfor alltandb∈[k]. Furthermore,
|vaj(At,Φt)|≤Vso thatη|Zˆtja|=|ηP ̃tjvaj(At,Φt)/PtAt|≤ηV k/γ= 1, where
the equality follows from the choice ofγ. Assumption(b)is satisfied in a similar
way withηβtja=η^2 V^2
∑
b∈NajP ̃
tj^2 /Ptb≤η^2 k^2 V^2 /γ=ηV ≤1, where in the
last inequality we used the definition ofηand assumed thatn≥log(k/δ). To
make sure that the regret bound holds even for smaller values ofn, we require
CG≥k
√
log(ek)so that whenn < k^2 log(k/δ), the regret bound is trivial. For
assumption(c), we have
Et− 1 [Zˆtja^2 ] =Et− 1
[(
P ̃tjvaj(At,Φt)
PtAt
) 2 ]
≤V^2 P ̃tj^2 Et− 1
[
INaj(At)
PtA^2 t
]
=V^2
∑
b∈Naj
P ̃tj^2
Ptb
=
βtja
η
.
Finally(d)is satisfied by the definition ofvajand the fact thatPt∈ri(Pk− 1 ).
The result of Exercise 12.2 shows that with probability at least 1−(k+ 1)δ,
Rˆnj≤3 log(1/δ)
η
+ 5
∑n
t=1
∑
a∈Nj∩A
Qtjaβtja+η
∑n
t=1
∑
a∈Nj∩A
QtjaZˆtja^2.
Step 2: Aggregating the local regret
Using the result from the previous step in combination with a union bound over
j∈Awe have that with probability at least 1−k(k+ 1)δ,
∑
j∈A
Rˆnj≤^3 klog(1/δ)
η
+ 5
∑n
t=1
∑
j∈A
∑
a∈Nj∩A
Qtjaβtja
+η
∑n
t=1
∑
j∈A
∑
a∈Na∩A
QtjaZˆtja^2. (37.18)
the second term is bounded using the definition ofβtjain Eq. (37.12).
∑
a∈Nj∩A
Qtjaβtja=ηV^2
∑
a∈Nj∩A
Qtja
∑
b∈Naj
P ̃tj^2
Ptb
=ηV^2 P ̃tj
∑
a∈Nj∩A
Qtja
∑
b∈Naj
P ̃tj
Ptb
.
Next split the sum that runs overb∈Najinto two, separating duplicates ofj
and the rest:
∑
a∈Nj∩A
Qtja
∑
b∈Naj
P ̃tj
Ptb
=
∑
a∈Nj∩A
Qtja
∑
b:`b=`j
P ̃tj
Ptb
+
∑
a∈Nj∩A
Qtja
∑
b∈Naj:`b 6 =`j
P ̃tj
Ptb
=
∑
b:`b=`j
P ̃tj
Ptb
+
∑
a∈Nj∩A
∑
b∈Naj:`b 6 =`j
QtjaP ̃tj
Ptb
≤ 4 k
∑
b:`b=`j
1 +
∑
a∈Nj∩A
∑
b∈Naj:`b 6 =`j
1
≤ 4 k^2 , (37.19)