23.2 Elimination on the hypercube 264Clearly ifθi= 0, thenRni= 0. And so it suffices to boundRnifor eachiwith
|θi|>0. Suppose that|θi|>0 for someiand the failure eventFigiven in
Lemma 23.3 does not occur. Thenθi∈Cτi+1,tand by definition of the algorithm
Ati= sign(θi) for allt≥τi. ThereforeRni=n|θi|−E[n
∑t=1Atiθi]
=E
[n
∑t=1|θi|(1−Atisign(θi))]
≤ 2 n|θi|P(Fi) +|θi|E[I{Fic}τi] (23.4)Sinceτiis the first roundtwhen 0∈C/ t+1,iit follows that ifFidoes not occur,
thenθ∈Cτi,iand 0∈Cτi,i. Thus the width of the confidence intervalCτi,imust
be at least|θi|and so2 Wτi− 1 = 4√(
1
τi− 1 +1
(τi−1)^2)
log(
n√
2 τi− 1)
≥|θi|,which after rearranging shows for some universal constantC >0 that
I{Fic}(τi−1)≤1 +Clog(n)
θ^2 i.
Combining this result with Eq. (23.4) leads toRni≤ 2 n|θi|P(Fi) +|θi|+Clog(n)
|θi|.
Using Lemma 23.3 to boundP(Fi)and substituting into the decomposition
Eq. (23.3) completes the proof of the first part. The second part is left as a treat
for you (Exercise 23.2).Proof of Lemma 23.3 LetSti=∑
j 6 =iAtjθjandZti=Atiηt+AtiSti. Fort≤τi,θˆti−θi=^1
t∑ts=1Zsi.The next step is to show thatZtiis conditionally
√
2-subgaussian fort≤τi.E[exp(λZti)|Gt− 1 ] =E[E[exp(λZti)|Ft− 1 ]|Gt− 1 ]
=E[exp(λAtiSti)E[exp(λAtiηt)|Ft− 1 ]|Gt− 1 ]≤E[
exp(λAtiSti) exp(
λ^2
2)∣∣
∣∣Gt− 1]
= exp(
λ^2
2)
E[E[exp(λAtiSti)|Gt− 1 ,Sti]|Gt− 1 ]≤exp(
λ^2
2)
E
[
exp(
λ^2 Sti^2
2)∣∣
∣∣Gt− 1]
≤exp(λ^2 ).The first inequality used the fact thatηtis conditionally 1-subgaussian. The second