37.7 Proof of the classification theorem 468
where the first equality used that
∑
aQtja= 1, the second to last inequality
follows using parts (c) and (e) of Lemma 37.14, stationarity ofP ̃t, and the last
inequality uses a simple counting argument. Details of the arguments needed to
show the last two inequalities are left to reader in Exercise 37.10. Summing over
all rounds andj∈Ayields
5
∑n
t=1
∑
j∈A
∑
a∈Nj∩A
Qtjaβtja≤ 20 ηnk^2 V^2.
For the last term in Eq. (37.18) we use the definition ofZˆtjaand Parts (c) and
(e) of Lemma 37.14 to show that
η
∑n
t=1
∑
j∈A
∑
a∈Nj∩A
QtjaZˆtja^2 =η
∑n
t=1
∑
j∈A
∑
a∈Nj∩A
QtjaP ̃tj^2 vaj(At,Φt)^2
PtA^2 t
≤ηV^2
∑n
t=1
1
PtAt
∑
j∈A
P ̃tj
∑
a∈Nj∩A
QtjaP ̃tjINaj(At)
PtAt
≤ 4 ηkV^2
∑n
t=1
1
PtAt,
where the last step follows by splitting the sum overainto two based on whether
Atis a duplicate ofjand following an argument similar to the one used to
prove Eq. (37.19). Now, from Part (d) of Lemma 37.14, (γ/k)(1/Pta)≤1 for
alla, and in particular, holds fora=At. Furthermore,Et− 1 [1/PtAt] =kand
Et− 1 [1/PtA^2 t] =
∑
a^1 /Pta≤k
(^2) /γ. By the result in Exercise 5.15 we get that it
holds that with probability at least 1−δ,
∑n
t=1
1
PtAt≤^2 nk+
klog(1/δ)
γ.
Another union bound shows that with probability at least 1−(1 +k(k+ 1))δ,
∑
j∈A
Rˆnj≤^3 klog(1/δ)
η
+ 28ηnV^2 k^2 + 4V klog(1/δ).
The result follows from the definition ofη, Lemma 37.19 and the definition of
the local regretRˆnj.
37.7 Proof of the classification theorem
Almost all the results are now available to prove Theorem 37.9. In Section 37.4
we showed that ifGis globally observable and not locally observable, then
R∗n(G) = Ω(n^2 /^3 ). We also proved that ifGis locally observable and has neighbors,
thenR∗n(G) = Ω(
√
n). This last result is complemented by the policy and analysis
in Sections 37.5 and 37.6 where we showed that for locally observable problems
R∗n(G) =O(
√
nlog(n)). Finally we proved that ifGis not globally observable,