Understanding Machine Learning: From Theory to Algorithms


The Runtime of Learning


the rectangle with the minimal training error. This procedure is guaranteed to
find an ERM hypothesis, and the runtime of the procedure is $m^{O(n)}$. It follows
that if $n$ is fixed, the runtime is polynomial in the sample size. This does not
contradict the aforementioned hardness result, since there we argued that unless
P = NP one cannot have an algorithm whose dependence on the dimension $n$ is
polynomial as well.
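To make the $m^{O(n)}$ claim concrete, here is a minimal Python sketch of such an exhaustive ERM search, assuming (as in the standard argument) that it suffices to consider rectangles whose boundaries pass through sample coordinates; the function name and array-based representation are ours, not the text's.

```python
import itertools
import numpy as np

def erm_rectangle(X, y):
    """Exhaustive ERM search over axis-aligned rectangles.

    Candidate boundaries in each of the n dimensions are taken from the
    sample coordinates, giving O(m^2) intervals per dimension and hence
    m^{O(n)} candidate rectangles: polynomial in m for fixed n, but
    exponential in n.
    """
    m, n = X.shape
    per_dim = []
    for d in range(n):
        vals = np.unique(X[:, d])
        per_dim.append([(lo, hi) for lo in vals for hi in vals if lo <= hi])

    best_rect, best_errors = None, m + 1
    for rect in itertools.product(*per_dim):      # m^{O(n)} candidates
        lows = np.array([lo for lo, _ in rect])
        highs = np.array([hi for _, hi in rect])
        preds = np.all((X >= lows) & (X <= highs), axis=1).astype(int)
        errors = int(np.sum(preds != y))
        if errors < best_errors:
            best_rect, best_errors = rect, errors
    return best_rect, best_errors / m             # ERM rectangle and its training error
```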

8.2.3 Boolean Conjunctions


A Boolean conjunction is a mapping from $X = \{0,1\}^n$ to $Y = \{0,1\}$ that can be
expressed as a propositional formula of the form $x_{i_1} \wedge \ldots \wedge x_{i_k} \wedge \neg x_{j_1} \wedge \ldots \wedge \neg x_{j_r}$,
for some indices $i_1, \ldots, i_k, j_1, \ldots, j_r \in [n]$. The function that such a propositional
formula defines is

$$
h(x) =
\begin{cases}
1 & \text{if } x_{i_1} = \cdots = x_{i_k} = 1 \text{ and } x_{j_1} = \cdots = x_{j_r} = 0 \\
0 & \text{otherwise}
\end{cases}
$$
Let $\mathcal{H}_C^n$ be the class of all Boolean conjunctions over $\{0,1\}^n$. The size of $\mathcal{H}_C^n$ is
at most $3^n + 1$ (since in a conjunction formula, each element of $x$ either appears,
or appears with a negation sign, or does not appear at all, and we also have the
all-negative formula). Hence, the sample complexity of learning $\mathcal{H}_C^n$ using the
ERM rule is at most $n \log(3/\delta)/\epsilon$.
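As a quick sanity check with illustrative values of our own choosing (taking $\log$ to be the natural logarithm), $n = 10$, $\epsilon = 0.1$, and $\delta = 0.01$ give

$$
\frac{n \log(3/\delta)}{\epsilon} \;=\; \frac{10 \log(300)}{0.1} \;\approx\; \frac{10 \cdot 5.70}{0.1} \;\approx\; 570,
$$

so on the order of a few hundred examples suffice at this accuracy and confidence level.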

Efficiently Learnable in the Realizable Case


Next, we show that it is possible to solve the ERM problem for $\mathcal{H}_C^n$ in time
polynomial in $n$ and $m$. The idea is to define an ERM conjunction by including
in the hypothesis conjunction all the literals that do not contradict any positively
labeled example. Let $v_1, \ldots, v_{m^+}$ be all the positively labeled instances in the
input sample $S$. We define, by induction on $i \le m^+$, a sequence of hypotheses
(or conjunctions). Let $h_0$ be the conjunction of all possible literals. That is,
$h_0 = x_1 \wedge \neg x_1 \wedge x_2 \wedge \ldots \wedge x_n \wedge \neg x_n$. Note that $h_0$ assigns the label 0 to all the
elements of $X$. We obtain $h_{i+1}$ by deleting from the conjunction $h_i$ all the literals
that are not satisfied by $v_{i+1}$. The algorithm outputs the hypothesis $h_{m^+}$. Note
that $h_{m^+}$ labels positively all the positively labeled examples in $S$. Furthermore,
for every $i \le m^+$, $h_i$ is the most restrictive conjunction that labels $v_1, \ldots, v_i$
positively. Now, since we consider learning in the realizable setup, there exists
a conjunction hypothesis, $f \in \mathcal{H}_C^n$, that is consistent with all the examples in
$S$. Since $h_{m^+}$ is the most restrictive conjunction that labels positively all the
positively labeled members of $S$, any instance labeled 0 by $f$ is also labeled 0 by
$h_{m^+}$. It follows that $h_{m^+}$ has zero training error (w.r.t. $S$), and is therefore a
legal ERM hypothesis. Note that the running time of this algorithm is $O(mn)$.
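The procedure above translates directly into code. Below is a minimal Python sketch under representation choices of our own: a sample is a list of (x, y) pairs with x a tuple in {0,1}^n, and the conjunction is stored as two index sets (positive and negated literals); the function name is ours, not the text's.

```python
def erm_conjunction(S):
    """ERM for Boolean conjunctions in the realizable case, in O(mn) time.

    Start from h_0, the conjunction of all 2n literals, and delete every
    literal that some positively labeled example fails to satisfy.
    """
    n = len(S[0][0])
    pos_lits = set(range(n))   # indices i such that x_i appears in the conjunction
    neg_lits = set(range(n))   # indices i such that ¬x_i appears in the conjunction
    for x, y in S:
        if y == 1:
            for i in range(n):
                if x[i] == 1:
                    neg_lits.discard(i)   # ¬x_i is not satisfied by this positive example
                else:
                    pos_lits.discard(i)   # x_i is not satisfied by this positive example

    def h(x):
        return int(all(x[i] == 1 for i in pos_lits) and
                   all(x[i] == 0 for i in neg_lits))
    return h

# Usage on a tiny realizable sample labeled by x_1 ∧ ¬x_2:
S = [((1, 0, 1), 1), ((1, 0, 0), 1), ((0, 0, 1), 0), ((1, 1, 1), 0)]
h = erm_conjunction(S)
print([h(x) for x, _ in S])   # [1, 1, 0, 0]: zero training error
```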