Understanding Machine Learning: From Theory to Algorithms

12.1 Convexity, Lipschitzness, and Smoothness

[Figure: a convex function f. The tangent to f at w, given by f(w) + 〈u − w, ∇f(w)〉, lies below f(u) for every u.]

If f is a scalar differentiable function, there is an easy way to check if it is
convex.


lemma 12.3 Let f : R → R be a scalar twice differentiable function, and let
f′, f′′ be its first and second derivatives, respectively. Then, the following are
equivalent:


1. f is convex

2. f′ is monotonically nondecreasing

3. f′′ is nonnegative


Example 12.1



  • The scalar function f(x) = x^2 is convex. To see this, note that f′(x) = 2x
    and f′′(x) = 2 > 0.

  • The scalar function f(x) = log(1 + exp(x)) is convex. To see this, observe that
    f′(x) = exp(x)/(1 + exp(x)) = 1/(exp(−x) + 1). This is a monotonically increasing function,
    since the exponential function is monotonically increasing.
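The second-derivative test of Lemma 12.3 can also be checked numerically. The sketch below approximates f′′ by a central finite difference and verifies it is nonnegative for the function of the second example; the step size h and the test points are illustrative choices, not part of the text.

```python
import math

def second_derivative(f, x, h=1e-4):
    # Central finite-difference approximation of f''(x).
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

# f(x) = log(1 + exp(x)) from Example 12.1. Its exact second derivative
# is sigma(x) * (1 - sigma(x)) with sigma the logistic function, which
# is nonnegative everywhere, so f is convex by Lemma 12.3.
f = lambda x: math.log(1 + math.exp(x))

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    assert second_derivative(f, x) >= 0
```

The same helper applied to f(x) = x^2 returns approximately 2 at every point, matching the first example.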


The following claim shows that composing a convex scalar function with a
linear function yields a convex function of a vector argument.


claim 12.4 Assume that f : R^d → R can be written as f(w) = g(〈w, x〉 + y),
for some x ∈ R^d, y ∈ R, and g : R → R. Then, convexity of g implies the
convexity of f.


Proof Let w₁, w₂ ∈ R^d and α ∈ [0, 1]. We have


f(αw₁ + (1 − α)w₂) = g(〈αw₁ + (1 − α)w₂, x〉 + y)
= g(α〈w₁, x〉 + (1 − α)〈w₂, x〉 + y)
= g(α(〈w₁, x〉 + y) + (1 − α)(〈w₂, x〉 + y))
≤ αg(〈w₁, x〉 + y) + (1 − α)g(〈w₂, x〉 + y),

where the last inequality follows from the convexity of g.
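Claim 12.4 can be sanity-checked numerically by testing the defining inequality of convexity for f(w) = g(〈w, x〉 + y) at random points. In this sketch the choices g(z) = z², the vector x, the scalar y, and the sampling range are illustrative assumptions.

```python
import random

def convexity_gap(f, w1, w2, alpha):
    # Returns alpha*f(w1) + (1-alpha)*f(w2) - f(alpha*w1 + (1-alpha)*w2).
    # For a convex f this quantity is nonnegative.
    mid = [alpha * a + (1 - alpha) * b for a, b in zip(w1, w2)]
    return alpha * f(w1) + (1 - alpha) * f(w2) - f(mid)

# g is convex (g(z) = z^2); x and y are fixed; f(w) = g(<w, x> + y).
x, y = [1.0, -2.0, 0.5], 0.3
g = lambda z: z * z
f = lambda w: g(sum(wi * xi for wi, xi in zip(w, x)) + y)

random.seed(0)
for _ in range(1000):
    w1 = [random.uniform(-5, 5) for _ in range(3)]
    w2 = [random.uniform(-5, 5) for _ in range(3)]
    alpha = random.random()
    # Small negative tolerance absorbs floating-point rounding.
    assert convexity_gap(f, w1, w2, alpha) >= -1e-9
```

Replacing g with a non-convex function such as g(z) = −z² makes the gap go negative, so the check distinguishes the two cases.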


Example 12.2
