Understanding Machine Learning: From Theory to Algorithms

28.3 The Upper Bound for the Realizable Case

28.3.1 From ε-Nets to PAC Learnability


Theorem 28.4  Let H be a hypothesis class over X with VCdim(H) = d. Let D be a distribution over X and let c ∈ H be a target hypothesis. Fix ε, δ ∈ (0,1) and let m be as defined in Theorem 28.3. Then, with probability of at least 1 − δ over a choice of m i.i.d. instances from X with labels according to c, any ERM hypothesis has a true error of at most ε.
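As an illustration of the theorem (not from the text), the following sketch simulates the realizable case for threshold classifiers on [0,1], a class with VC dimension 1. The target c, the ERM rule `erm_threshold`, and all parameter choices are illustrative assumptions:

```python
import random

def true_error(h_thresh, c_thresh):
    # Under the uniform distribution on [0,1], thresholds h(x) = 1[x >= t]
    # disagree exactly on the interval between the two thresholds,
    # so the true error is its length.
    return abs(h_thresh - c_thresh)

def erm_threshold(sample, labels):
    # A simple ERM rule for thresholds (illustrative): the smallest
    # positively labeled point is consistent with a realizable sample.
    positives = [x for x, y in zip(sample, labels) if y == 1]
    return min(positives) if positives else 1.0

def trial(m, c_thresh=0.5, rng=random):
    sample = [rng.random() for _ in range(m)]
    labels = [1 if x >= c_thresh else 0 for x in sample]
    h = erm_threshold(sample, labels)
    return true_error(h, c_thresh)

# With m large enough relative to eps, the ERM error should exceed eps
# only rarely, in line with the theorem's high-probability guarantee.
random.seed(0)
eps = 0.1
m = 200
failure_rate = sum(trial(m) > eps for _ in range(1000)) / 1000
print(failure_rate)
```

With these numbers the chance that no sample point falls in the ε-interval above the target threshold is about (1 − ε)^m, which is vanishingly small, so the observed failure rate should be essentially zero.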
Proof  Define the class H_c = {c △ h : h ∈ H}, where c △ h = (h \ c) ∪ (c \ h) denotes the symmetric difference. It is easy to verify that if some A ⊂ X is shattered by H then it is also shattered by H_c, and vice versa. Hence, VCdim(H) = VCdim(H_c). Therefore, using Theorem 28.3 we know that with probability of at least 1 − δ, the sample S is an ε-net for H_c. Note that L_D(h) = D(h △ c). Therefore, for any h ∈ H with L_D(h) ≥ ε we have that |(h △ c) ∩ S| > 0; that is, h disagrees with c on at least one point of S, so its empirical error is positive, whereas c itself has zero empirical error in the realizable setting. Hence h cannot be an ERM hypothesis, which concludes our proof.
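The key identity behind the last step is that the empirical error of h on S equals the fraction of sample points landing in h △ c. A minimal finite-domain check (the concrete sets below are illustrative assumptions, not from the text):

```python
# Verify that the empirical error of h on a sample S equals the
# fraction of sample points in the symmetric difference h triangle c.
def sym_diff(h, c):
    # (h \ c) union (c \ h)
    return (h - c) | (c - h)

c = {0, 1, 2, 3}      # target concept, as a subset of a finite domain
h = {2, 3, 4, 5}      # candidate hypothesis
S = [0, 2, 4, 6, 8]   # a sample of domain points

# Fraction of sample points where h and c disagree (empirical error).
empirical_error = sum((x in h) != (x in c) for x in S) / len(S)
# Fraction of sample points captured by the symmetric difference.
hits = sum(x in sym_diff(h, c) for x in S) / len(S)
print(empirical_error, hits)
```

So if S is an ε-net for H_c, every h with true error at least ε is "hit" by S, forcing its empirical error above zero and ruling it out as an ERM hypothesis.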