Understanding Machine Learning: From Theory to Algorithms


Boosting


For each $t$, let $f_t = \sum_{p \le t} w_p h_p$, so that the output of AdaBoost is $f_T$. In addition, denote

$$Z_t = \frac{1}{m} \sum_{i=1}^{m} e^{-y_i f_t(x_i)}.$$
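The quantity $Z_t$ is simply the empirical exponential loss of the current ensemble, and it is straightforward to evaluate. The following is a minimal NumPy sketch (not from the book); the arrays `y` and `f_vals` are hypothetical stand-ins for labels in $\{\pm 1\}$ and the ensemble values $f_t(x_i)$.

```python
import numpy as np

def exp_loss(y, f_vals):
    """Z_t = (1/m) * sum_i exp(-y_i * f_t(x_i)), for labels y in {-1, +1}."""
    return np.mean(np.exp(-y * f_vals))

def zero_one_loss(y, f_vals):
    """Empirical 0-1 loss of the classifier sign(f_t)."""
    return np.mean(np.sign(f_vals) != y)

# Hypothetical example: five points, two misclassified by f_t.
y = np.array([1, -1, 1, 1, -1])
f_vals = np.array([0.8, -0.3, -0.2, 1.5, 0.4])

# Pointwise, 1[sign(f(x)) != y] <= exp(-y f(x)), so the 0-1 loss
# is always bounded by Z_t (here 0.4 <= ~0.83).
assert zero_one_loss(y, f_vals) <= exp_loss(y, f_vals)
```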

Note that for any hypothesis we have $\mathbb{1}_{[h(x) \neq y]} \le e^{-y h(x)}$. Therefore, $L_S(f_T) \le Z_T$, so it suffices to show that $Z_T \le e^{-2\gamma^2 T}$. To upper bound $Z_T$ we rewrite it as

$$Z_T = \frac{Z_T}{Z_0} = \frac{Z_T}{Z_{T-1}} \cdot \frac{Z_{T-1}}{Z_{T-2}} \cdots \frac{Z_2}{Z_1} \cdot \frac{Z_1}{Z_0}, \tag{10.2}$$

where we used the fact that $Z_0 = 1$ because $f_0 \equiv 0$. Therefore, it suffices to show that for every round $t$,

$$\frac{Z_{t+1}}{Z_t} \le e^{-2\gamma^2}. \tag{10.3}$$
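Chaining Equation (10.3) through the telescoping product (10.2) gives $Z_T \le \left(e^{-2\gamma^2}\right)^T = e^{-2\gamma^2 T}$, which is exactly the bound we are after. As an illustrative numeric check (the values of $\gamma$ and $T$ are made up, not from the text), an edge of $\gamma = 0.1$ sustained over $T = 500$ rounds already gives

$$Z_T \le e^{-2 (0.1)^2 \cdot 500} = e^{-10} \approx 4.5 \times 10^{-5},$$

so a weak learner only slightly better than random guessing drives the training error down exponentially fast.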


To do so, we first note that, using a simple inductive argument, for all $t$ and $i$,

$$D^{(t+1)}_i = \frac{e^{-y_i f_t(x_i)}}{\sum_{j=1}^{m} e^{-y_j f_t(x_j)}}.$$
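This closed form can be checked numerically against AdaBoost's multiplicative update, under which $D^{(t+1)}_i \propto D^{(t)}_i e^{-y_i w_t h_t(x_i)}$. Below is a small sketch in the same hedged spirit as before; the labels, weak-learner outputs, and weights are arbitrary made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)
m, T = 6, 4
y = rng.choice([-1, 1], size=m)         # labels y_i
H = rng.choice([-1, 1], size=(T, m))    # H[t, i] = h_{t+1}(x_i)
w = rng.uniform(0.1, 1.0, size=T)       # w[t]   = w_{t+1}

# Run the multiplicative distribution update.
D = np.full(m, 1.0 / m)                 # D^{(1)} is uniform (f_0 = 0)
f = np.zeros(m)                         # f_t(x_i), starting from f_0 = 0
for t in range(T):
    f += w[t] * H[t]                    # f_{t+1} = f_t + w_{t+1} h_{t+1}
    D = D * np.exp(-y * w[t] * H[t])
    D /= D.sum()                        # renormalize

# Closed form from the inductive argument.
closed = np.exp(-y * f)
closed /= closed.sum()
assert np.allclose(D, closed)
```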

Hence,

$$\begin{aligned}
\frac{Z_{t+1}}{Z_t} &= \frac{\sum_{i=1}^{m} e^{-y_i f_{t+1}(x_i)}}{\sum_{j=1}^{m} e^{-y_j f_t(x_j)}}
= \frac{\sum_{i=1}^{m} e^{-y_i f_t(x_i)}\, e^{-y_i w_{t+1} h_{t+1}(x_i)}}{\sum_{j=1}^{m} e^{-y_j f_t(x_j)}} \\
&= \sum_{i=1}^{m} D^{(t+1)}_i e^{-y_i w_{t+1} h_{t+1}(x_i)} \\
&= e^{-w_{t+1}} \sum_{i : y_i h_{t+1}(x_i) = 1} D^{(t+1)}_i \;+\; e^{w_{t+1}} \sum_{i : y_i h_{t+1}(x_i) = -1} D^{(t+1)}_i \\
&= e^{-w_{t+1}} (1 - \epsilon_{t+1}) + e^{w_{t+1}} \epsilon_{t+1} \\
&= \frac{1}{\sqrt{1/\epsilon_{t+1} - 1}}\, (1 - \epsilon_{t+1}) + \sqrt{1/\epsilon_{t+1} - 1}\; \epsilon_{t+1} \\
&= \sqrt{\frac{\epsilon_{t+1}}{1 - \epsilon_{t+1}}}\, (1 - \epsilon_{t+1}) + \sqrt{\frac{1 - \epsilon_{t+1}}{\epsilon_{t+1}}}\; \epsilon_{t+1}
\;=\; 2\sqrt{\epsilon_{t+1}(1 - \epsilon_{t+1})},
\end{aligned}$$

where $\epsilon_{t+1}$ is the weighted error of $h_{t+1}$ with respect to $D^{(t+1)}$, and the fifth equality plugs in AdaBoost's weight $w_{t+1} = \frac{1}{2}\log\left(\frac{1}{\epsilon_{t+1}} - 1\right)$, so that $e^{w_{t+1}} = \sqrt{1/\epsilon_{t+1} - 1}$.
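The substitution $e^{w_{t+1}} = \sqrt{1/\epsilon_{t+1} - 1}$ and the resulting identity are easy to confirm numerically; a minimal check (the value of `eps` is arbitrary):

```python
import numpy as np

eps = 0.3                                # any weighted error in (0, 1/2)
w = 0.5 * np.log(1.0 / eps - 1.0)        # AdaBoost's weight choice
lhs = np.exp(-w) * (1 - eps) + np.exp(w) * eps
rhs = 2 * np.sqrt(eps * (1 - eps))
assert np.isclose(lhs, rhs)              # both equal about 0.9165
```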

By our assumption, $\epsilon_{t+1} \le \frac{1}{2} - \gamma$. Since the function $g(a) = a(1-a)$ is monotonically increasing in $[0, 1/2]$, we obtain that

$$2\sqrt{\epsilon_{t+1}(1 - \epsilon_{t+1})} \le 2\sqrt{\left(\frac{1}{2} - \gamma\right)\left(\frac{1}{2} + \gamma\right)} = \sqrt{1 - 4\gamma^2}.$$

Finally, using the inequality $1 - a \le e^{-a}$ with $a = 4\gamma^2$, we get $\sqrt{1 - 4\gamma^2} \le e^{-4\gamma^2/2} = e^{-2\gamma^2}$, which establishes Equation (10.3) and concludes the proof.
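For concreteness, with the illustrative value $\gamma = 0.1$ (made up, not from the text):

$$\sqrt{1 - 4\gamma^2} = \sqrt{0.96} \approx 0.9798 \;\le\; e^{-2\gamma^2} = e^{-0.02} \approx 0.9802,$$

so every round contracts the surrogate loss by a constant factor strictly below one.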