Understanding Machine Learning: From Theory to Algorithms


Finally, since the variables $\sigma_1,\ldots,\sigma_m$ are independent we have
\[
\begin{aligned}
\mathbb{E}_\sigma\left[\Big\|\sum_{i=1}^{m}\sigma_i x_i\Big\|_2^2\right]
&= \mathbb{E}_\sigma\left[\sum_{i,j}\sigma_i\sigma_j\,\langle x_i,x_j\rangle\right] \\
&= \sum_{i\neq j}\langle x_i,x_j\rangle\,\mathbb{E}_\sigma[\sigma_i\sigma_j]
 + \sum_{i=1}^{m}\langle x_i,x_i\rangle\,\mathbb{E}_\sigma\big[\sigma_i^2\big] \\
&= \sum_{i=1}^{m}\|x_i\|_2^2 \;\le\; m\,\max_i \|x_i\|_2^2,
\end{aligned}
\]
where the cross terms vanish because $\mathbb{E}_\sigma[\sigma_i\sigma_j] = \mathbb{E}_\sigma[\sigma_i]\,\mathbb{E}_\sigma[\sigma_j] = 0$ for $i \neq j$, while $\sigma_i^2 = 1$.

Combining this with Equation (26.15) and Equation (26.16) we conclude our
proof.
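Before moving on, the identity above is easy to sanity-check numerically. Below is a minimal Monte Carlo sketch in Python/NumPy; the data matrix, dimensions, and trial count are arbitrary illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 5
X = rng.normal(size=(m, n))  # rows are x_1, ..., x_m (arbitrary illustrative data)

# Monte Carlo estimate of E_sigma ||sum_i sigma_i x_i||_2^2
num_trials = 100_000
sigma = rng.choice([-1.0, 1.0], size=(num_trials, m))  # i.i.d. Rademacher signs
sums = sigma @ X                                       # row t is sum_i sigma_{t,i} x_i
estimate = np.mean(np.sum(sums ** 2, axis=1))

row_norms_sq = np.sum(X ** 2, axis=1)
exact = row_norms_sq.sum()        # sum_i ||x_i||_2^2 (the closed form above)
bound = m * row_norms_sq.max()    # m * max_i ||x_i||_2^2

print(f"Monte Carlo: {estimate:.3f}  exact: {exact:.3f}  bound: {bound:.3f}")
```

With enough trials the estimate matches the closed form up to sampling noise, and both sit below the bound $m\max_i\|x_i\|_2^2$.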
Next we bound the Rademacher complexity of $\mathcal{H}_1 \circ S$.

Lemma 26.11 Let $S = (x_1,\ldots,x_m)$ be vectors in $\mathbb{R}^n$. Then,
\[
R(\mathcal{H}_1 \circ S) \;\le\; \max_i \|x_i\|_\infty \sqrt{\frac{2\log(2n)}{m}}.
\]
Proof Using Hölder's inequality we know that for any vectors $w, v$ we have
$\langle w, v\rangle \le \|w\|_1 \|v\|_\infty$. Therefore,

\[
\begin{aligned}
m R(\mathcal{H}_1 \circ S)
&= \mathbb{E}_\sigma\left[\sup_{a \in \mathcal{H}_1 \circ S}\;\sum_{i=1}^{m}\sigma_i a_i\right]
= \mathbb{E}_\sigma\left[\sup_{w:\|w\|_1 \le 1}\;\sum_{i=1}^{m}\sigma_i\langle w, x_i\rangle\right] \\
&= \mathbb{E}_\sigma\left[\sup_{w:\|w\|_1 \le 1}\;\Big\langle w, \sum_{i=1}^{m}\sigma_i x_i\Big\rangle\right]
\le \mathbb{E}_\sigma\left[\Big\|\sum_{i=1}^{m}\sigma_i x_i\Big\|_\infty\right]. \qquad (26.17)
\end{aligned}
\]

For each $j \in [n]$, let $v_j = (x_{1,j},\ldots,x_{m,j}) \in \mathbb{R}^m$. Note that $\|v_j\|_2 \le \sqrt{m}\,\max_i \|x_i\|_\infty$. Let $V = \{v_1,\ldots,v_n,-v_1,\ldots,-v_n\}$. The right-hand side of Equation (26.17) is $m R(V)$. Using Massart's lemma (Lemma 26.8) we have that
\[
R(V) \le \max_i \|x_i\|_\infty \sqrt{2\log(2n)/m},
\]
which concludes our proof.
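This lemma can also be checked empirically. Since the supremum over the $\ell_1$ ball in the proof is attained (Hölder's inequality is tight there), $R(\mathcal{H}_1 \circ S) = \frac{1}{m}\,\mathbb{E}_\sigma\big\|\sum_i \sigma_i x_i\big\|_\infty$, which is straightforward to estimate by sampling signs. A minimal sketch, again with arbitrary illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 50, 20
X = rng.uniform(-1.0, 1.0, size=(m, n))  # rows are x_1, ..., x_m (arbitrary data)

# Estimate R(H_1 o S) = (1/m) E_sigma ||sum_i sigma_i x_i||_inf by sampling.
num_trials = 100_000
sigma = rng.choice([-1.0, 1.0], size=(num_trials, m))  # i.i.d. Rademacher signs
complexity = np.mean(np.max(np.abs(sigma @ X), axis=1)) / m

# Lemma 26.11 bound: max_i ||x_i||_inf * sqrt(2 log(2n) / m)
bound = np.max(np.abs(X)) * np.sqrt(2 * np.log(2 * n) / m)

print(f"estimated R(H_1 o S): {complexity:.4f}  bound: {bound:.4f}")
assert complexity <= bound
```

The gap between the estimate and the bound reflects the slack in Massart's lemma and in the crude bound $\|v_j\|_2 \le \sqrt{m}\,\max_i\|x_i\|_\infty$.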

26.3 Generalization Bounds for SVM


In this section we use Rademacher complexity to derive generalization bounds
for generalized linear predictors with a Euclidean norm constraint. We will show
how this leads to generalization bounds for hard-SVM and soft-SVM.