Understanding Machine Learning: From Theory to Algorithms

404 Multiclass Learnability

lemma29.4 (Natarajan) |H|≤|X|Ndim(H)·k2Ndim(H).

The proof of Natarajan’s lemma shares the same spirit of the proof of Sauer’s lemma and is left as an exercise (see Exercise 3).

29.3 Calculating the Natarajan Dimension

In this section we show how to calculate (or estimate) the Natarajan dimension of several popular classes, some of which were studied in Chapter 17. As these calculations indicate, the Natarajan dimension is often proportional to the number of parameters required to define a hypothesis.

29.3.1 One-versus-All Based Classes

In Chapter 17 we have seen two reductions of multiclass categorization to binary classification: One-versus-All and All-Pairs. In this section we calculate the Natarajan dimension of the One-versus-All method. Recall that in One-versus-All we train, for each label, a binary classifier that distinguishes between that label and the rest of the labels. This naturally sug- gests considering multiclass hypothesis classes of the following form. LetHbin⊂ { 0 , 1 }Xbe a binary hypothesis class. For everyh ̄= (h 1 ,...,hk)∈(Hbin)kdefine T( ̄h) :X →[k] by T( ̄h)(x) = argmax i∈[k]

hi(x).

If there are two labels that maximizehi(x), we choose the smaller one. Also, let

HbinOvA,k={T( ̄h) : ̄h∈(Hbin)k}.

What “should” be the Natarajan dimension ofHbinOvA,k? Intuitively, to specify a hypothesis inHbinwe needd= VCdim(Hbin) parameters. To specify a hypothesis inHOvAbin,k, we need to specifykhypotheses inHbin. Therefore,kdparameters should suffice. The following lemma establishes this intuition.

lemma29.5 Ifd= VCdim(Hbin)then

Ndim(HOvAbin,k)≤ 3 kdlog (kd).

Proof LetC⊂ Xbe a shattered set. By the definition of shattering (for multiclass hypotheses) ∣∣ ∣

(

HOvAbin,k

)

C

∣∣

∣ ≥ 2 |C|.

On the other hand, each hypothesis inHOvAbin ,kis determined by usingkhypothe- ses fromHbin. Therefore, ∣∣ ∣

(

HOvAbin,k

)

C

∣∣

∣ ≤ |(Hbin)C|k.

Understanding Machine Learning: From Theory to Algorithms

29.3 Calculating the Natarajan Dimension

29.3.1 One-versus-All Based Classes

(

)

∣∣

∣ ≥ 2 |C|.

(

)

∣∣

Get our desktop app

Company

Features

Documentation

Resources