296 Online Learning
Weighted-Majorityinput:number of experts,d ; number of rounds,T
parameter:η=√
2 log(d)/T
initialize:w ̃(1)= (1,...,1)
fort= 1, 2 ,...
setw(t)=w ̃(t)/ZtwhereZt=∑
iw ̃(t)
i
choose expertiat random according toP[i] =w(it)
receive costs of all expertsvt∈[0,1]d
pay cost〈w(t),vt〉
update rule∀i,w ̃i(t+1)= ̃w(it)e−ηvt,iThe following theorem is key for analyzing the regret bound of Weighted-
Majority.theorem21.11 Assuming thatT >2 log(d), theWeighted-Majorityalgo-
rithm enjoys the bound∑Tt=1〈w(t),vt〉−mini∈[d]∑T
t=1vt,i ≤√
2 log(d)T.Proof We have:logZt+1
Zt= log∑
iw ̃i(t)
Zte−ηvt,i= log∑
iwi(t)e−ηvt,i.Using the inequalitye−a≤ 1 −a+a^2 /2, which holds for alla∈(0,1), and the
fact that∑
iw(t)
i = 1, we obtainlogZt+1
Zt≤log∑
iw(it)(
1 −ηvt,i+η^2 v^2 t,i/ 2)
= log(
1 −
∑
iw(it)(
ηvt,i−η^2 v^2 t,i/ 2)
︸ ︷︷ ︸
def=b)
.
Next, note thatb∈(0,1). Therefore, taking log of the two sides of the inequality
1 −b≤e−bwe obtain the inequality log(1−b)≤−b, which holds for allb≤1,
and obtainlogZt+1
Zt ≤−∑
iwi(t)(
ηvt,i−η^2 vt,i^2 / 2)
=−η〈w(t),vt〉+η^2∑
iwi(t)v^2 t,i/ 2≤−η〈w(t),vt〉+η^2 / 2.