296 Online Learning
Weighted-Majority
input:number of experts,d ; number of rounds,T
parameter:η=
√
2 log(d)/T
initialize:w ̃(1)= (1,...,1)
fort= 1, 2 ,...
setw(t)=w ̃(t)/ZtwhereZt=
∑
iw ̃
(t)
i
choose expertiat random according toP[i] =w(it)
receive costs of all expertsvt∈[0,1]d
pay cost〈w(t),vt〉
update rule∀i,w ̃i(t+1)= ̃w(it)e−ηvt,i
The following theorem is key for analyzing the regret bound of Weighted-
Majority.
theorem21.11 Assuming thatT >2 log(d), theWeighted-Majorityalgo-
rithm enjoys the bound
∑T
t=1
〈w(t),vt〉−mini∈[d]
∑T
t=1
vt,i ≤
√
2 log(d)T.
Proof We have:
log
Zt+1
Zt
= log
∑
i
w ̃i(t)
Zt
e−ηvt,i= log
∑
i
wi(t)e−ηvt,i.
Using the inequalitye−a≤ 1 −a+a^2 /2, which holds for alla∈(0,1), and the
fact that
∑
iw
(t)
i = 1, we obtain
log
Zt+1
Zt
≤log
∑
i
w(it)
(
1 −ηvt,i+η^2 v^2 t,i/ 2
)
= log
(
1 −
∑
i
w(it)
(
ηvt,i−η^2 v^2 t,i/ 2
)
︸ ︷︷ ︸
def=b
)
.
Next, note thatb∈(0,1). Therefore, taking log of the two sides of the inequality
1 −b≤e−bwe obtain the inequality log(1−b)≤−b, which holds for allb≤1,
and obtain
log
Zt+1
Zt ≤−
∑
i
wi(t)
(
ηvt,i−η^2 vt,i^2 / 2
)
=−η〈w(t),vt〉+η^2
∑
i
wi(t)v^2 t,i/ 2
≤−η〈w(t),vt〉+η^2 / 2.