16.5 Exercises 203(e) Show thatα≥ n∆(ν)2
8 Clog(n)and conclude thatRn(π,ν)≥^1
2∆(ν)log(
n∆(ν)^2
8 Clog(n))
.
(f) Generalize the argument to an arbitrary number of arms.16.7(Lower bounds on variance) Letk >1 andE ⊂ ENk be the set of
k-armed Gaussian bandits with mean rewards in [0,1] for all arms. Suppose that
πis a policy such that for allν∈E,
lim sup
n→∞Rn(π,ν)
log(n)≤
∑
i:∆i> 02(1 +p)
∆i.
Prove thatlim sup
n→∞sup
ν∈Elog(V[Rˆn(π,ν)])
(1−p) log(n)≥ 1 ,
whereRˆn(π,ν) =nμ∗(ν)−
∑n
t=1μAt(ν).