13.3 Tikhonov Regularization as a Stabilizer 177
corollary 13.6 Assume that the loss function is convex andρ-Lipschitz.
Then, the RLM rule with the regularizerλ‖w‖^2 is on-average-replace-one-stable
with rate^2 ρ
2
λm. It follows (using Theorem 13.2) that
S∼DEm[LD(A(S))−LS(A(S))]≤
2 ρ^2
λm
13.3.2 Smooth and Nonnegative Loss
If the loss isβ-smooth and nonnegative then it is also self-bounded (see Sec-
tion 12.1):
‖∇f(w)‖^2 ≤ 2 βf(w). (13.12)
We further assume thatλ≥^2 mβ, or, in other words, thatβ≤λm/2. By the
smoothness assumption we have that
(A(S(i)),zi)−
(A(S),zi)≤〈∇`(A(S),zi),A(S(i))−A(S)〉+
β
2
‖A(S(i))−A(S)‖^2.
(13.13)
Using the Cauchy-Schwartz inequality and Equation (12.6) we further obtain
that
`(A(S(i)),zi)−`(A(S),zi)
≤‖∇`(A(S),zi)‖‖A(S(i))−A(S)‖+β
2
‖A(S(i))−A(S)‖^2
≤
√
2 β`(A(S),zi)‖A(S(i))−A(S)‖+
β
2
‖A(S(i))−A(S)‖^2.
(13.14)
By a symmetric argument it holds that,
`(A(S),z′)−`(A(S(i)),z′)
≤
√
2 β`(A(S(i)),z′)‖A(S(i))−A(S)‖+
β
2 ‖A(S
(i))−A(S)‖ (^2).
Plugging these inequalities into Equation (13.10) and rearranging terms we ob-
tain that
‖A(S(i))−A(S)‖≤
√
2 β
(λm−β)
(√
`(A(S),zi) +
√
`(A(S(i)),z′)
)
.
Combining the preceding with the assumptionβ≤λm/2 yields
‖A(S(i))−A(S)‖≤
√
8 β
λm
(√
`(A(S),zi) +
√
`(A(S(i)),z′)