Understanding Machine Learning: From Theory to Algorithms

28.2 The Lower Bound for the Agnostic Case 395

Therefore,

max b∈{± 1 }

P[LDb(A(y))−LDb(hb) =]

= max b∈{± 1 }

∑

y

Pb[y] (^1) [A(y) 6 =b]

≥

1

2

∑

y

P+[y] (^1) [A(y) 6 =+]+

1

2

∑

y

P−[y] (^1) [A(y) 6 =−]

=

1

2

∑

y∈N+

(P+[y] (^1) [A(y) 6 =+]+P−[y] (^1) [A(y) 6 =−]) +

1

2

∑

y∈N−

(P+[y] (^1) [A(y) 6 =+]+P−[y] (^1) [A(y) 6 =−])

≥

1

2

∑

y∈N+

(P−[y] (^1) [A(y) 6 =+]+P−[y] (^1) [A(y) 6 =−]) +

1

2

∑

y∈N−

(P+[y] (^1) [A(y) 6 =+]+P+[y] (^1) [A(y) 6 =−])

=

1

2

∑

y∈N+

P−[y] +

1

2

∑

y∈N−

P+[y].

Next note that

∑

y∈N+P−[y] =

∑

y∈N−P+[y], and both values are the prob-
ability that a Binomial (m,(1−)/2) random variable will have value greater
thanm/2. Using Lemma B.11, this probability is lower bounded by

1 2

(

1 −

√

1 −exp(−m^2 /(1−^2 ))

)

≥^1

2

(

1 −

√

1 −exp(− 2 m^2 )

)

,

where we used the assumption that^2 ≤ 1 /2. It follows that ifm≤ 0 .5 log(1/(4δ))/^2
then there existsbsuch that

P[LDb(A(y))−LDb(hb) =]

≥^1 2

(

1 −

√

1 −

√

4 δ

)

≥δ,

where the last inequality follows by standard algebraic manipulations. This con-
cludes our proof.

28.2.2 Showing Thatm(, 1 /8)≥ 8 d/

We shall now prove that for every < 1 /(8

√

2) we have thatm(,δ)≥^8 d 2. Letρ= 8and note thatρ∈(0, 1 /

√

2). We will construct a family of distri-
butions as follows. First, letC={c 1 ,...,cd}be a set ofdinstances which are
shattered byH. Second, for each vector (b 1 ,...,bd)∈ {± 1 }d, define a distribu-
tionDbsuch that

Db({(x,y)}) =

{ 1

d·

1+ybiρ 2 if∃i:x=ci 0 otherwise.

That is, to sample an example according toDb, we first sample an elementci∈C
uniformly at random, and then set the label to bebiwith probability (1 +ρ)/ 2
or−biwith probability (1−ρ)/2.
It is easy to verify that the Bayes optimal predictor forDbis the hypothesis

Understanding Machine Learning: From Theory to Algorithms

∑

≥

1

2

∑

1

2

∑

=

1

2

∑

1

2

∑

≥

1

2

∑

1

2

∑

=

1

2

∑

1

2

∑

∑

∑

(

1 −

√

)

≥^1

2

(

1 −

√

)

,

(

1 −

√

1 −

√

)

28.2.2 Showing Thatm(, 1 /8)≥ 8 d/

√

√

{ 1

Get our desktop app

Company

Features

Documentation

Resources

28.2.2 Showing Thatm(, 1 /8)≥ 8 d/