26.1 The Rademacher Complexity 381
and therefore
mR(A′)≤log
(
∑
a∈A′
∏m
i=1
exp
(
a^2 i
2
))
= log
(
∑
a∈A′
exp
(
‖a‖^2 / 2
)
)
≤log
(
|A′|maxa∈A′exp
(
‖a‖^2 / 2
))
= log(|A′|) + maxa∈A′(‖a‖^2 /2).
SinceR(A) =λ^1 R(A′) we obtain from the equation that
R(A)≤
log(|A|) +λ^2 maxa∈A(‖a‖^2 /2)
λm
.
Settingλ=
√
2 log(|A|)/maxa∈A‖a‖^2 and rearranging terms we conclude our
proof.
The following lemma shows that composingAwith a Lipschitz function does
not blow up the Rademacher complexity. The proof is due to Kakade and Tewari.
lemma26.9 (Contraction lemma) For eachi∈[m], letφi:R→Rbe aρ-
Lipschitz function, namely for allα,β∈Rwe have|φi(α)−φi(β)| ≤ρ|α−β|.
Fora∈Rmletφ(a)denote the vector(φ 1 (a 1 ),...,φm(ym)). Letφ◦A={φ(a) :
a∈A}. Then,
R(φ◦A)≤ρR(A).
Proof For simplicity, we prove the lemma for the caseρ= 1. The caseρ 6 =
1 will follow by definingφ′ =^1 ρφand then using Lemma 26.6. Let Ai =
{(a 1 ,...,ai− 1 ,φi(ai),ai+1,...,am) :a∈A}. Clearly, it suffices to prove that
for any setAand alliwe haveR(Ai)≤R(A). Without loss of generality we will
prove the latter claim fori= 1 and to simplify notation we omit the subscript
fromφ 1. We have
mR(A 1 ) =Eσ
[
sup
a∈A 1
∑m
i=1
σiai
]
=Eσ
[
sup
a∈A
σ 1 φ(a 1 ) +
∑m
i=2
σiai
]
=
1
2
σ E
2 ,...,σm
[
sup
a∈A
(
φ(a 1 ) +
∑m
i=2
σiai
)
+ sup
a∈A
(
−φ(a 1 ) +
∑m
i=2
σiai
)]
=
1
2
σ E
2 ,...,σm
[
sup
a,a′∈A
(
φ(a 1 )−φ(a′ 1 ) +
∑m
i=2
σiai+
∑m
i=2
σia′i
)]
≤
1
2
σ E
2 ,...,σm
[
sup
a,a′∈A
(
|a 1 −a′ 1 |+
∑m
i=2
σiai+
∑m
i=2
σia′i
)]
, (26.12)
where in the last inequality we used the assumption thatφis Lipschitz. Next,
we note that the absolute value on|a 1 −a′ 1 |in the preceding expression can