23.2 Elimination on the hypercube 264
Clearly ifθi= 0, thenRni= 0. And so it suffices to boundRnifor eachiwith
|θi|>0. Suppose that|θi|>0 for someiand the failure eventFigiven in
Lemma 23.3 does not occur. Thenθi∈Cτi+1,tand by definition of the algorithm
Ati= sign(θi) for allt≥τi. Therefore
Rni=n|θi|−E
[n
∑
t=1
Atiθi
]
=E
[n
∑
t=1
|θi|(1−Atisign(θi))
]
≤ 2 n|θi|P(Fi) +|θi|E[I{Fic}τi] (23.4)
Sinceτiis the first roundtwhen 0∈C/ t+1,iit follows that ifFidoes not occur,
thenθ∈Cτi,iand 0∈Cτi,i. Thus the width of the confidence intervalCτi,imust
be at least|θi|and so
2 Wτi− 1 = 4
√(
1
τi− 1 +
1
(τi−1)^2
)
log
(
n
√
2 τi− 1
)
≥|θi|,
which after rearranging shows for some universal constantC >0 that
I{Fic}(τi−1)≤1 +Clog(n)
θ^2 i
.
Combining this result with Eq. (23.4) leads to
Rni≤ 2 n|θi|P(Fi) +|θi|+
Clog(n)
|θi|
.
Using Lemma 23.3 to boundP(Fi)and substituting into the decomposition
Eq. (23.3) completes the proof of the first part. The second part is left as a treat
for you (Exercise 23.2).
Proof of Lemma 23.3 LetSti=
∑
j 6 =iAtjθjandZti=Atiηt+AtiSti. Fort≤τi,
θˆti−θi=^1
t
∑t
s=1
Zsi.
The next step is to show thatZtiis conditionally
√
2-subgaussian fort≤τi.
E[exp(λZti)|Gt− 1 ] =E[E[exp(λZti)|Ft− 1 ]|Gt− 1 ]
=E[exp(λAtiSti)E[exp(λAtiηt)|Ft− 1 ]|Gt− 1 ]
≤E
[
exp(λAtiSti) exp
(
λ^2
2
)∣∣
∣∣Gt− 1
]
= exp
(
λ^2
2
)
E[E[exp(λAtiSti)|Gt− 1 ,Sti]|Gt− 1 ]
≤exp
(
λ^2
2
)
E
[
exp
(
λ^2 Sti^2
2
)∣∣
∣∣Gt− 1
]
≤exp(λ^2 ).
The first inequality used the fact thatηtis conditionally 1-subgaussian. The second