28.5 Notes 319Proof Leta∗= argmina∈A∑n
t=1〈a,yt〉be the optimal action. ThenRn=E[n
∑t=1〈At−ra∗,yt〉]
+
∑nt=1〈ra∗−a∗,yt〉≤Rn(ra∗) + (1−r)n,where the inequality follows from the definition ofAand Cauchy-Schwarz. By
Theorem 28.5 and Theorem 26.12,
Rn(a∗)≤diamF(A)
η+
η
2E
[n
∑t=1‖Yˆt‖^2 ∇F(Zt)− 1]
,
whereZt∈[A ̄t,A ̄t+1] lies on the chord connectingA ̄tandA ̄t+1. The algorithm is
stable in the sense that no matter how the losses are chosen,A ̄t+1 cannot
be too far fromA ̄t. This also means thatZt is close toA ̄t. By definition,
η‖Yˆt‖≤ηd/(1−r) = 1/2. Combining this with Eq. (28.13) shows that
1 −‖Zt‖
1 −‖A ̄t‖
≤ sup
α∈[0,1]1 −‖αA ̄t+ (1−α)A ̄t+1‖
1 −‖A ̄t‖= max{
1 ,
1 −‖A ̄t+1‖
1 −‖A ̄t‖}
≤max{
1 ,
1 +η‖Lˆt− 1 ‖
1 +η‖Lˆt‖}
≤max{
1 ,
1 +η‖Lˆt− 1 ‖
1 /2 +η‖Lˆt− 1 ‖}
≤ 2.
The next step is to find the Hessian ofF, which is
∇^2 F(a) =I
1 −‖a‖+
aa>
‖a‖(1−‖a‖)^2≥
I
1 −‖a‖Therefore∇^2 F(a)−^1 ≤(1−‖a‖)Iand so
E[
‖Yˆt‖^2 ∇F(Zt)− 1]
≤E
[
(1−‖Zt‖)‖Yˆt‖^2]
=d^2 E[
(1−‖Zt‖)Et〈At,yt〉^2
(1−‖A ̄t‖)^2]
≤ 2 d.The diameter is bounded by diamF(A ̄)≤log(1/(1−r)) and hence
Rn≤(1−r)n+^1
ηlog(
1
1 −r)
+ηnd≤
1
ηlog(
1
2 ηd)
+ 3ηnd≤ 3√
2 ndlog(n).28.5 Notes
1 Findingat+1for both mirror descent and follow the regularized leader requires
solving a convex optimization problem. Provided the dimension is not too
large and the action set and potential are reasonably nice, there exist practical
approximation algorithms for this problem. The two-step process described in
Eqs. (28.7) and (28.8) is sometimes an easier way to go. Usually(28.7)can
be solved analytically while(28.8)can be quite expensive. In some important