28.7 Exercises
(b) What happens if you replace mirror descent with follow-the-regularized-leader?
\[
P_{t+1} = \operatorname*{argmin}_{p \in \mathcal{A}} \left( \sum_{s=1}^{t} \langle p, \hat Y_s \rangle + F(p) \right).
\]
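For readers who want to experiment, here is a minimal numerical sketch (not part of the exercise) comparing the two updates when $F$ is the unnormalized negentropy $F(p) = \sum_i p_i \log p_i - p_i$, scaled by $1/\eta$; the function names are illustrative. The sketch illustrates, but of course does not prove, what happens in this case.

```python
import numpy as np

# A minimal sketch (assumption: F is the unnormalized negentropy scaled by
# 1/eta). It compares mirror descent plus normalization with the FTRL
# minimizer over the probability simplex on k actions.

def ftrl_update(loss_sum, eta):
    """argmin_p <p, loss_sum> + F(p)/eta over the simplex: softmax form."""
    w = np.exp(-eta * loss_sum)
    return w / w.sum()

def md_step(p, loss, eta):
    """One mirror descent step: multiplicative update, then the Bregman
    projection onto the simplex, which for negentropy is normalization."""
    w = p * np.exp(-eta * loss)
    return w / w.sum()

rng = np.random.default_rng(0)
k, n, eta = 3, 5, 0.1
losses = rng.uniform(size=(n, k))

p_md = np.full(k, 1.0 / k)
for t in range(n):
    p_md = md_step(p_md, losses[t], eta)
p_ftrl = ftrl_update(losses.sum(axis=0), eta)
print(np.allclose(p_md, p_ftrl))  # compare the two iterates after n rounds
```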
28.8 (Exp3 as mirror descent (ii)) Here you will show that the tools in this chapter not only lead to the same algorithm, but also to the same bounds.
(a) Let $\tilde P_{t+1} = \operatorname*{argmin}_{p \in [0,\infty)^k} \eta \langle p, \hat Y_t \rangle + D_F(p, P_t)$. Show both relations in the following display:
\[
D_F(P_t, \tilde P_{t+1}) = \sum_{i=1}^{k} P_{ti} \left( \exp(-\eta \hat Y_{ti}) - 1 + \eta \hat Y_{ti} \right) \le \frac{\eta^2}{2} \sum_{i=1}^{k} P_{ti} \hat Y_{ti}^2\,.
\]
(b) Show that
\[
\frac{1}{\eta} \mathbb{E}\left[ \sum_{t=1}^{n} D_F(P_t, \tilde P_{t+1}) \right] \le \frac{\eta n k}{2}\,.
\]
(c) Show that $\operatorname{diam}_F(\mathcal{P}_{k-1}) = \log(k)$.
(d) Conclude that for an appropriately tuned $\eta > 0$ the regret of Exp3 satisfies
\[
R_n \le \sqrt{2 n k \log(k)}\,.
\]
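A useful first step for part (a): the first-order optimality condition for the unconstrained minimizer gives $\tilde P_{t+1,i} = P_{ti} \exp(-\eta \hat Y_{ti})$. The following minimal sketch (an illustration under assumed losses in $[0,1]^k$, not the book's pseudocode) runs Exp3 in exactly the mirror descent form of this exercise, with the tuning suggested by part (d).

```python
import numpy as np

# A sketch of Exp3 as mirror descent (illustration only). Assumptions: losses
# y_t in [0,1]^k, importance-weighted estimator Yhat_ti = 1{A_t=i} y_ti / P_ti,
# and the unnormalized negentropy potential, for which the unconstrained
# minimizer in part (a) is Ptilde_i = P_i * exp(-eta * Yhat_i).

def exp3_mirror_descent(losses, eta, rng):
    n, k = losses.shape
    P = np.full(k, 1.0 / k)
    total_loss = 0.0
    for t in range(n):
        A = rng.choice(k, p=P)            # sample action from P_t
        total_loss += losses[t, A]
        Yhat = np.zeros(k)
        Yhat[A] = losses[t, A] / P[A]     # importance-weighted estimate
        Ptilde = P * np.exp(-eta * Yhat)  # unconstrained step (part (a))
        P = Ptilde / Ptilde.sum()         # Bregman projection = normalization
    return total_loss

rng = np.random.default_rng(1)
n, k = 1000, 5
losses = rng.uniform(size=(n, k))
eta = np.sqrt(2 * np.log(k) / (n * k))    # tuning suggested by part (d)
print(exp3_mirror_descent(losses, eta, rng))
```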
28.9 (Mirror descent and changing learning rates) Let $\mathcal{A}$ be closed and convex and $y_1,\dots,y_n \in \mathcal{L} \subseteq \mathbb{R}^d$. Let $F$ be Legendre with domain $D$ and assume that $\mathcal{A} \subseteq \operatorname{cl}(D)$ and that Eq. (28.6) holds. Let $\eta_0, \eta_1, \dots, \eta_n$ be a sequence of learning rates with $\eta_0 = \infty$. Let $a_1 = \operatorname*{argmin}_{a \in \mathcal{A}} F(a)$ and
\begin{align*}
\tilde a_{t+1} &= \operatorname*{argmin}_{a \in D} \, \eta_t \langle a, y_t \rangle + D_F(a, a_t)\,, \\
a_{t+1} &= \operatorname*{argmin}_{a \in \mathcal{A} \cap D} D_F(a, \tilde a_{t+1})\,.
\end{align*}
Show that for all $a \in \mathcal{A}$:
(a)
\[
R_n(a) = \sum_{t=1}^{n} \langle a_t - a, y_t \rangle \le \sum_{t=1}^{n} \frac{D_F(a_t, \tilde a_{t+1})}{\eta_t} + \sum_{t=1}^{n} \frac{D_F(a, a_t) - D_F(a, \tilde a_{t+1})}{\eta_t}\,.
\]
(b)
\[
R_n(a) \le \sum_{t=1}^{n} \frac{D_F(a_t, \tilde a_{t+1})}{\eta_t} + \sum_{t=1}^{n} D_F(a, a_t) \left( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} \right).
\]
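To make the two-step update concrete, here is a minimal sketch under one specific (assumed) instantiation: $F(a) = \frac{1}{2}\|a\|_2^2$ on $D = \mathbb{R}^d$, so that $D_F(a,b) = \frac{1}{2}\|a-b\|_2^2$, with $\mathcal{A}$ the Euclidean unit ball. The interior step is then a gradient step and the Bregman projection is the Euclidean projection; names are illustrative.

```python
import numpy as np

# A sketch of mirror descent with time-varying learning rates, under the
# assumed quadratic potential F(a) = ||a||^2 / 2, so D_F(a, b) = ||a - b||^2 / 2.
# A is taken to be the Euclidean unit ball, for which the Bregman projection
# coincides with the ordinary Euclidean projection.

def project_unit_ball(a):
    norm = np.linalg.norm(a)
    return a if norm <= 1.0 else a / norm

def md_changing_rates(ys, etas):
    d = ys.shape[1]
    a = np.zeros(d)               # argmin of F over the unit ball
    iterates = [a]
    for y, eta in zip(ys, etas):
        a_tilde = a - eta * y     # interior step: argmin eta*<a,y> + D_F(a, a_t)
        a = project_unit_ball(a_tilde)
        iterates.append(a)
    return iterates

rng = np.random.default_rng(2)
n, d = 100, 3
ys = rng.normal(size=(n, d))
etas = 1.0 / np.sqrt(np.arange(1, n + 1))  # a typical decreasing schedule
iterates = md_changing_rates(ys, etas)
```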
28.10 (FTRL and changing potentials) As in the previous exercise, let $\mathcal{A}$ be closed and convex and $y_1,\dots,y_n \in \mathcal{L} \subseteq \mathbb{R}^d$. Let $F_1,\dots,F_n,F_{n+1}$ be a sequence of Legendre functions with $\operatorname{dom}(F_t) = D_t$ and $\mathcal{A} \subseteq \operatorname{cl}(D_t)$ for all $t$. Let
\[
\Phi_t(a) = F_t(a) + \sum_{s=1}^{t-1} \langle a, y_s \rangle \qquad \text{and} \qquad a_t = \operatorname*{argmin}_{a \in \mathcal{A}} \Phi_t(a)\,.
\]
(a) Show that
\[
R_n(a) \le \sum_{t=1}^{n} \bigl( \langle a_t - a_{t+1}, y_t \rangle - D_{F_t}(a_{t+1}, a_t) \bigr) + F_{n+1}(a) - F_1(a_1) + \sum_{t=1}^{n} \bigl( F_t(a_{t+1}) - F_{t+1}(a_{t+1}) \bigr)\,.
\]
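As with the previous exercise, a minimal sketch under one specific (assumed) instantiation may help: take $F_t(a) = \|a\|_2^2/(2\eta_t)$ on the Euclidean unit ball, in which case $a_t$ is the Euclidean projection of $-\eta_t \sum_{s<t} y_s$ onto the ball. A shrinking $\eta_t$ corresponds to a growing potential, which is the regime where the correction terms in part (a) matter.

```python
import numpy as np

# A sketch of FTRL with changing potentials, under the assumed choice
# F_t(a) = ||a||^2 / (2 * eta_t) on the Euclidean unit ball. Then
#   a_t = argmin_{||a|| <= 1}  ||a||^2 / (2 * eta_t) + <a, L_{t-1}>,
# where L_{t-1} = y_1 + ... + y_{t-1}, which equals the Euclidean
# projection of -eta_t * L_{t-1} onto the ball.

def project_unit_ball(a):
    norm = np.linalg.norm(a)
    return a if norm <= 1.0 else a / norm

def ftrl_changing_potentials(ys, etas):
    n, d = ys.shape
    L = np.zeros(d)                  # running sum of observed y_s
    iterates = []
    for t in range(n):
        a_t = project_unit_ball(-etas[t] * L)
        iterates.append(a_t)
        L += ys[t]
    return iterates

rng = np.random.default_rng(3)
n, d = 100, 3
ys = rng.normal(size=(n, d))
etas = 1.0 / np.sqrt(np.arange(1, n + 1))  # shrinking eta = growing potential
iterates = ftrl_changing_potentials(ys, etas)
```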