26.1 The Rademacher Complexity 381and therefore
mR(A′)≤log(
∑
a∈A′∏mi=1exp(
a^2 i
2))
= log(
∑
a∈A′exp(
‖a‖^2 / 2)
)
≤log(
|A′|maxa∈A′exp(
‖a‖^2 / 2))
= log(|A′|) + maxa∈A′(‖a‖^2 /2).SinceR(A) =λ^1 R(A′) we obtain from the equation that
R(A)≤log(|A|) +λ^2 maxa∈A(‖a‖^2 /2)
λm.
Settingλ=
√
2 log(|A|)/maxa∈A‖a‖^2 and rearranging terms we conclude our
proof.
The following lemma shows that composingAwith a Lipschitz function does
not blow up the Rademacher complexity. The proof is due to Kakade and Tewari.
lemma26.9 (Contraction lemma) For eachi∈[m], letφi:R→Rbe aρ-
Lipschitz function, namely for allα,β∈Rwe have|φi(α)−φi(β)| ≤ρ|α−β|.
Fora∈Rmletφ(a)denote the vector(φ 1 (a 1 ),...,φm(ym)). Letφ◦A={φ(a) :
a∈A}. Then,
R(φ◦A)≤ρR(A).Proof For simplicity, we prove the lemma for the caseρ= 1. The caseρ 6 =
1 will follow by definingφ′ =^1 ρφand then using Lemma 26.6. Let Ai =
{(a 1 ,...,ai− 1 ,φi(ai),ai+1,...,am) :a∈A}. Clearly, it suffices to prove that
for any setAand alliwe haveR(Ai)≤R(A). Without loss of generality we will
prove the latter claim fori= 1 and to simplify notation we omit the subscript
fromφ 1. We have
mR(A 1 ) =Eσ[
sup
a∈A 1∑mi=1σiai]
=Eσ[
sup
a∈Aσ 1 φ(a 1 ) +∑mi=2σiai]
=
1
2
σ E
2 ,...,σm[
sup
a∈A(
φ(a 1 ) +∑mi=2σiai)
+ sup
a∈A(
−φ(a 1 ) +∑mi=2σiai)]
=
1
2
σ E
2 ,...,σm[
sup
a,a′∈A(
φ(a 1 )−φ(a′ 1 ) +∑mi=2σiai+∑mi=2σia′i)]
≤
1
2
σ E
2 ,...,σm[
sup
a,a′∈A(
|a 1 −a′ 1 |+∑mi=2σiai+∑mi=2σia′i)]
, (26.12)
where in the last inequality we used the assumption thatφis Lipschitz. Next,
we note that the absolute value on|a 1 −a′ 1 |in the preceding expression can