Understanding Machine Learning: From Theory to Algorithms


Combining this with Lemma 26.8 we obtain the following bound on the Rademacher
complexity:

R(A) ≤ √(2d log(em/d) / m)

Using Theorem 26.5 we obtain that with probability of at least 1 − δ, for every
h ∈ H we have that

L_D(h) − L_S(h) ≤ √(8d log(em/d) / m) + √(2 log(2/δ) / m)

Repeating the previous argument for minus the zero-one loss and applying the
union bound we obtain that with probability of at least 1 − δ, for every h ∈ H
it holds that

|L_D(h) − L_S(h)| ≤ √(8d log(em/d) / m) + √(2 log(4/δ) / m)
                 ≤ 2 √( (8d log(em/d) + 2 log(4/δ)) / m )
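
The last step merges the two square-root terms using the elementary fact that
√a + √b ≤ 2√(a + b) for all a, b ≥ 0 (squaring both sides gives
a + b + 2√(ab) ≤ 4(a + b)). A quick numerical sanity check of this fact (not
from the book):

```python
import math

# Check sqrt(a) + sqrt(b) <= 2*sqrt(a + b) on a grid of non-negative values.
# This is the elementary inequality used to merge the two square-root terms.
for a in [0.0, 0.3, 1.0, 7.5, 100.0]:
    for b in [0.0, 0.3, 1.0, 7.5, 100.0]:
        assert math.sqrt(a) + math.sqrt(b) <= 2 * math.sqrt(a + b)
```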

To ensure that this is smaller than ε we need

m ≥ (4/ε²) · (8d log(m) + 8d log(e/d) + 2 log(4/δ)).
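
Note that m appears on both sides of this requirement (through log(m)), so it
only defines m implicitly. As an illustration (not from the book; the values
d = 10, ε = 0.1, δ = 0.05 are arbitrary), a satisfying m can be found by
fixed-point iteration, since the right-hand side grows only logarithmically in m:

```python
import math

def rhs(m, d, eps, delta):
    """Right-hand side of the implicit requirement
    m >= (4/eps^2) * (8*d*log(m) + 8*d*log(e/d) + 2*log(4/delta))."""
    return (4 / eps**2) * (8 * d * math.log(m)
                           + 8 * d * math.log(math.e / d)
                           + 2 * math.log(4 / delta))

# Arbitrary example values (not from the book).
d, eps, delta = 10, 0.1, 0.05

# Iterate m <- rhs(m); the logarithmic dependence on m makes this settle quickly.
m = 10.0
for _ in range(50):
    m = max(rhs(m, d, eps, delta), 2.0)

# At the (approximate) fixed point, m satisfies the implicit requirement.
assert m >= rhs(m, d, eps, delta) - 1e-6
```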

Using Lemma A.2, a sufficient condition for the inequality to hold is that

m ≥ 4 · (32d/ε²) · log(64d/ε²) + (8/ε²) · (8d log(e/d) + 2 log(4/δ)).
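
As a numerical sanity check (not from the book; the values d = 10, ε = 0.1,
δ = 0.05 are arbitrary), this explicit bound indeed satisfies the implicit
requirement m ≥ (4/ε²)(8d log(m) + 8d log(e/d) + 2 log(4/δ)):

```python
import math

def implicit_rhs(m, d, eps, delta):
    # Implicit requirement: m must be at least this value.
    return (4 / eps**2) * (8 * d * math.log(m)
                           + 8 * d * math.log(math.e / d)
                           + 2 * math.log(4 / delta))

def explicit_m(d, eps, delta):
    # The explicit sufficient sample size obtained via Lemma A.2.
    return (4 * (32 * d / eps**2) * math.log(64 * d / eps**2)
            + (8 / eps**2) * (8 * d * math.log(math.e / d)
                              + 2 * math.log(4 / delta)))

# Arbitrary example values (not from the book).
d, eps, delta = 10, 0.1, 0.05
m = explicit_m(d, eps, delta)
assert m >= implicit_rhs(m, d, eps, delta)
```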

28.2 The Lower Bound for the Agnostic Case


Here, we prove that there exists a constant C such that any algorithm that
agnostically PAC learns H must have sample complexity

m_H(ε, δ) ≥ C (d + ln(1/δ)) / ε².

We will prove the lower bound in two parts. First, we will show that
m(ε, δ) ≥ 0.5 log(1/(4δ))/ε², and second we will show that for every δ ≤ 1/8
we have that m(ε, δ) ≥ 8d/ε². These two bounds will conclude the proof.
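
To get a feel for the two parts (not from the book; d = 5, ε = 0.1, δ = 0.01
are arbitrary example values), one can evaluate both lower-bound terms; their
maximum is itself a valid lower bound, matching the (d + ln(1/δ))/ε² rate:

```python
import math

def lower_bound_part1(eps, delta):
    # First part: m(eps, delta) >= 0.5 * log(1/(4*delta)) / eps^2.
    return 0.5 * math.log(1 / (4 * delta)) / eps**2

def lower_bound_part2(d, eps):
    # Second part (valid for delta <= 1/8): m(eps, delta) >= 8*d / eps^2.
    return 8 * d / eps**2

# Arbitrary example values (not from the book).
d, eps, delta = 5, 0.1, 0.01
m_lower = max(lower_bound_part1(eps, delta), lower_bound_part2(d, eps))
```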


28.2.1 Showing That m(ε, δ) ≥ 0.5 log(1/(4δ))/ε²


We first show that for any ε < 1/√2 and any δ ∈ (0, 1), we have that
m(ε, δ) ≥ 0.5 log(1/(4δ))/ε². To do so, we show that for
m ≤ 0.5 log(1/(4δ))/ε², H is not learnable.
Choose one example that is shattered by H. That is, let c be an example such
