380 Rademacher Complexities
lemma26.7 LetA be a subset ofRm and letA′ ={∑N
j=1αja(j) :N ∈
N,∀j,a(j)∈A,αj≥ 0 ,‖α‖ 1 = 1}. Then,R(A′) =R(A).
Proof The main idea follows from the fact that for any vectorvwe havesup
α≥ 0 :‖α‖ 1 =1∑N
j=1αjvj = maxj vj.Therefore,mR(A′) =Eσ sup
α≥ 0 :‖α‖ 1 =1sup
a(1),...,a(N)∑mi=1σi∑N
j=1αja(ij)=Eσ sup
α≥ 0 :‖α‖ 1 =1∑N
j=1αjsup
a(j)∑mi=1σia(ij)=Eσsup
a∈A∑mi=1σiai=mR(A),
and we conclude our proof.
The next lemma, due to Massart, states that the Rademacher complexity of
a finite set grows logarithmically with the size of the set.
lemma26.8 (Massart lemma) LetA={a 1 ,...,aN}be a finite set of vectors
inRm. Define ̄a=N^1∑N
i=1ai. Then,R(A) ≤ max
a∈A‖a−a ̄‖√
2 log(N)
m.
Proof Based on Lemma 26.6, we can assume without loss of generality that
̄a= 0. Letλ >0 and letA′={λa 1 ,...,λaN}. We upper bound the Rademacher
complexity as follows:mR(A′) =Eσ[
max
a∈A′〈σ,a〉]
= Eσ[
log(
max
a∈A′e〈σ,a〉)]
≤Eσ[
log(
∑
a∈A′e〈σ,a〉)]
≤log(
Eσ[
∑
a∈A′e〈σ,a〉])
// Jensen’s inequality= log(
∑
a∈A′∏mi=1Eσ
i[eσiai])
,
where the last equality occurs because the Rademacher variables are indepen-
dent. Next, using Lemma A.6 we have that for allai∈R,σE
ieσiai=exp(ai) + exp(−ai)
2≤exp(a^2 i/2),