Understanding Machine Learning: From Theory to Algorithms


Convex Learning Problems



  • Given some x ∈ R^d and y ∈ R, let f : R^d → R be defined as f(w) =
    (〈w,x〉 − y)^2. Then f is a composition of the function g(a) = a^2 onto a
    linear function, and hence f is a convex function.

  • Given some x ∈ R^d and y ∈ {±1}, let f : R^d → R be defined as f(w) =
    log(1 + exp(−y〈w,x〉)). Then f is a composition of the function g(a) =
    log(1 + exp(a)) onto a linear function, and hence f is a convex function.

Finally, the following claim shows that the maximum of convex functions is
convex and that a weighted sum of convex functions, with nonnegative weights,
is also convex.
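As a quick numerical sanity check of the two examples above (a sketch, not part of the text; the feature vector x, labels, and sample points w1, w2 are arbitrary illustrative choices), the convexity inequality f(αw1 + (1−α)w2) ≤ αf(w1) + (1−α)f(w2) can be verified for both losses at a few points:

```python
import math

# Arbitrary sample data (hypothetical, for illustration only).
x = [0.5, -1.0]        # fixed feature vector in R^2
y_sq, y_log = 0.7, 1.0  # a real label and a {+1, -1} label

def dot(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))

def squared_loss(w):
    # f(w) = (<w, x> - y)^2
    return (dot(w, x) - y_sq) ** 2

def logistic_loss(w):
    # f(w) = log(1 + exp(-y <w, x>))
    return math.log(1.0 + math.exp(-y_log * dot(w, x)))

def jensen_holds(f, w1, w2, alpha, tol=1e-12):
    # Convexity inequality: f(a*w1 + (1-a)*w2) <= a*f(w1) + (1-a)*f(w2)
    mid = [alpha * a + (1 - alpha) * b for a, b in zip(w1, w2)]
    return f(mid) <= alpha * f(w1) + (1 - alpha) * f(w2) + tol

w1, w2 = [1.0, 2.0], [-3.0, 0.5]
for alpha in (0.0, 0.25, 0.5, 0.9, 1.0):
    assert jensen_holds(squared_loss, w1, w2, alpha)
    assert jensen_holds(logistic_loss, w1, w2, alpha)
print("convexity inequality holds at all sampled points")
```

A few sampled points do not prove convexity, of course; the composition argument in the text is what establishes it for all of R^d.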


claim 12.5 For i = 1,...,r, let f_i : R^d → R be a convex function. The
following functions from R^d to R are also convex.

  • g(x) = max_{i∈[r]} f_i(x)

  • g(x) = ∑_{i=1}^r w_i f_i(x), where for all i, w_i ≥ 0.
Proof The first claim follows by

    g(αu + (1−α)v) = max_i f_i(αu + (1−α)v)
                   ≤ max_i [α f_i(u) + (1−α) f_i(v)]
                   ≤ α max_i f_i(u) + (1−α) max_i f_i(v)
                   = α g(u) + (1−α) g(v).

For the second claim,

    g(αu + (1−α)v) = ∑_i w_i f_i(αu + (1−α)v)
                   ≤ ∑_i w_i [α f_i(u) + (1−α) f_i(v)]
                   = α ∑_i w_i f_i(u) + (1−α) ∑_i w_i f_i(v)
                   = α g(u) + (1−α) g(v).

Example 12.3 The function g(x) = |x| is convex. To see this, note that g(x) =
max{x, −x} and that both the functions f_1(x) = x and f_2(x) = −x are convex.
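Claim 12.5 and Example 12.3 can also be checked numerically. The sketch below (scalar case d = 1; the particular functions f_i, weights, and sample points are arbitrary choices, not from the text) verifies the convexity inequality for a pointwise maximum and a nonnegative weighted sum, including g(x) = |x| = max{x, −x}:

```python
# Hand-picked convex functions on R (arbitrary illustrative choices).
f1 = lambda x: x            # linear, hence convex
f2 = lambda x: -x           # linear, hence convex
f3 = lambda x: x * x        # convex
fs = [f1, f2, f3]
weights = [0.5, 2.0, 0.1]   # nonnegative, as claim 12.5 requires

g_max = lambda x: max(f(x) for f in fs)                       # pointwise max
g_sum = lambda x: sum(w * f(x) for w, f in zip(weights, fs))  # weighted sum
g_abs = lambda x: max(x, -x)                                  # |x|, Example 12.3

def convex_at(g, u, v, alpha, tol=1e-12):
    # Convexity inequality at one triple (u, v, alpha).
    mid = alpha * u + (1 - alpha) * v
    return g(mid) <= alpha * g(u) + (1 - alpha) * g(v) + tol

for u, v in [(-2.0, 3.0), (0.0, 1.5), (-0.3, -4.0)]:
    for alpha in (0.1, 0.5, 0.8):
        assert convex_at(g_max, u, v, alpha)
        assert convex_at(g_sum, u, v, alpha)
        assert convex_at(g_abs, u, v, alpha)

# Sanity check that max{x, -x} really coincides with |x|.
assert all(g_abs(t) == abs(t) for t in (-2.5, 0.0, 4.0))
print("max, weighted sum, and |x| all pass the convexity check")
```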

12.1.2 Lipschitzness


The definition of Lipschitzness below is with respect to the Euclidean norm over
R^d. However, it is possible to define Lipschitzness with respect to any norm.

definition 12.6 (Lipschitzness) Let C ⊂ R^d. A function f : R^d → R^k is
ρ-Lipschitz over C if for every w_1, w_2 ∈ C we have that ‖f(w_1) − f(w_2)‖ ≤
ρ‖w_1 − w_2‖.
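A simple empirical sanity check of this definition (a sketch, not from the text; the interval C = [−5, 5] and the trial count are arbitrary choices): sample pairs w_1, w_2 from C and confirm that the ratio |f(w_1) − f(w_2)| / |w_1 − w_2| never exceeds a candidate ρ. For f(x) = |x|, the reverse triangle inequality gives ρ = 1:

```python
import random

random.seed(0)

def max_observed_ratio(f, lo, hi, trials=10_000):
    # Largest |f(w1) - f(w2)| / |w1 - w2| over random pairs in [lo, hi];
    # this is a lower bound on the true Lipschitz constant over [lo, hi].
    worst = 0.0
    for _ in range(trials):
        w1, w2 = random.uniform(lo, hi), random.uniform(lo, hi)
        if w1 != w2:
            worst = max(worst, abs(f(w1) - f(w2)) / abs(w1 - w2))
    return worst

ratio = max_observed_ratio(abs, -5.0, 5.0)
assert ratio <= 1.0 + 1e-12   # |x| is 1-Lipschitz
print(f"largest observed ratio: {ratio:.4f}")
```

Note the asymmetry: sampling can only certify a lower bound on the Lipschitz constant, while the definition demands the inequality for every pair in C.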