12.1 Convexity, Lipschitzness, and Smoothness 161
Intuitively, a Lipschitz function cannot change too fast. Note that iff:R→R
is differentiable, then by the mean value theorem we have
f(w 1 )−f(w 2 ) =f′(u)(w 1 −w 2 ),
whereuis some point betweenw 1 andw 2. It follows that if the derivative off
is everywhere bounded (in absolute value) byρ, then the function isρ-Lipschitz.
Example 12.4
- The functionf(x) =|x|is 1-Lipschitz overR. This follows from the triangle
inequality: For everyx 1 ,x 2 ,
|x 1 |−|x 2 |=|x 1 −x 2 +x 2 |−|x 2 |≤|x 1 −x 2 |+|x 2 |−|x 2 |=|x 1 −x 2 |.
Since this holds for bothx 1 ,x 2 andx 2 ,x 1 , we obtain that||x 1 |−|x 2 || ≤
|x 1 −x 2 |.
- The functionf(x) = log(1 + exp(x)) is 1-Lipschitz overR. To see this, observe
that
|f′(x)|=
∣∣
∣∣ exp(x)
1 + exp(x)
∣∣
∣∣=
∣∣
∣∣^1
exp(−x) + 1
∣∣
∣∣≤ 1.
- The functionf(x) =x^2 is notρ-Lipschitz overRfor anyρ. To see this, take
x 1 = 0 andx 2 = 1 +ρ, then
f(x 2 )−f(x 1 ) = (1 +ρ)^2 > ρ(1 +ρ) =ρ|x 2 −x 1 |.
However, this function isρ-Lipschitz over the setC={x:|x| ≤ρ/ 2 }.
Indeed, for anyx 1 ,x 2 ∈Cwe have
|x^21 −x^22 |=|x 1 +x 2 | |x 1 −x 2 |≤2(ρ/2)|x 1 −x 2 |=ρ|x 1 −x 2 |.
- The linear functionf:Rd→Rdefined byf(w) =〈v,w〉+bwherev∈Rd
is‖v‖-Lipschitz. Indeed, using Cauchy-Schwartz inequality,
|f(w 1 )−f(w 2 )|=|〈v,w 1 −w 2 〉|≤‖v‖‖w 1 −w 2 ‖.
The following claim shows that composition of Lipschitz functions preserves
Lipschitzness.
claim 12.7 Letf(x) =g 1 (g 2 (x)), whereg 1 isρ 1 -Lipschitz andg 2 isρ 2 -
Lipschitz. Then,fis(ρ 1 ρ 2 )-Lipschitz. In particular, ifg 2 is the linear function,
g 2 (x) =〈v,x〉+b, for somev∈Rd,b∈R, thenfis(ρ 1 ‖v‖)-Lipschitz.
Proof
|f(w 1 )−f(w 2 )|=|g 1 (g 2 (w 1 ))−g 1 (g 2 (w 2 ))|
≤ρ 1 ‖g 2 (w 1 )−g 2 (w 2 )‖
≤ρ 1 ρ 2 ‖w 1 −w 2 ‖.