
[Figure 26.4: Illustration of the first-order optimality conditions, showing a convex set with arrows −∇f(x) and −∇f(x∗) at two boundary points. The point at the top is not a minimizer because the hyperplane whose normal is the gradient does not support the convex set, so a nearby better point exists. The point at the right is a minimizer.]

where the last step follows from the useful inequality ax − bx² ≤ a²/(4b), which holds for all x ∈ ℝ and b > 0.
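For completeness, this inequality is just completing the square (a one-line derivation added here):

\[
ax - bx^2 \;=\; \frac{a^2}{4b} - b\left(x - \frac{a}{2b}\right)^2 \;\le\; \frac{a^2}{4b}.
\]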

26.5 Optimization


The first-order optimality condition states that if x ∈ ℝᵈ is a minimizer of a differentiable function f : ℝᵈ → ℝ, then ∇f(x) = 0. One of the things we like about convex functions is that when f is convex the first-order optimality condition is both necessary and sufficient. In particular, if ∇f(x) = 0 for some x ∈ ℝᵈ, then x is a minimizer of f. The first-order optimality condition can also be generalized to constrained minima: if f : ℝᵈ → ℝ is convex and differentiable and A ⊆ ℝᵈ is a nonempty convex set, then

x∗ ∈ argmin_{x ∈ A} f(x)  ⇔  ⟨∇f(x∗), x − x∗⟩ ≥ 0 for all x ∈ A.    (26.2)
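As a concrete sanity check (a sketch added here, not from the text), the following Python snippet verifies (26.2) for the convex function f(x) = (x − 2)² on A = [−1, 1], whose constrained minimizer is x∗ = 1:

    import numpy as np

    # Check (26.2) numerically for f(x) = (x - 2)^2 on A = [-1, 1].
    # The unconstrained minimum x = 2 lies outside A, so the constrained
    # minimizer sits on the boundary at x_star = 1.
    def grad_f(x):
        return 2.0 * (x - 2.0)  # gradient of f(x) = (x - 2)^2

    x_star = 1.0
    A = np.linspace(-1.0, 1.0, 101)  # a grid over the constraint set

    # <grad f(x*), x - x*> >= 0 must hold for every x in A ...
    assert np.all(grad_f(x_star) * (A - x_star) >= 0.0)

    # ... while at a non-minimizer such as x = 0 the condition fails:
    assert np.any(grad_f(0.0) * (A - 0.0) < 0.0)
    print("(26.2) holds at x* = 1 and fails at x = 0")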

The necessity of this condition is easy to understand by geometric reasoning. If ∇f(x∗) = 0 then the condition trivially holds. If ∇f(x∗) ≠ 0, the hyperplane H_{x∗} that passes through x∗ with normal ∇f(x∗) must be a supporting hyperplane of A at x∗, with −∇f(x∗) being the outer normal of A at x∗; otherwise x∗ could be moved by a small amount while staying inside A and improving the value of f. Since A is convex, it thus lies entirely on the side of H_{x∗} that ∇f(x∗) points into. This is exactly equivalent to (26.2). The sufficiency of the condition also follows from this geometric viewpoint, as the reader may verify from the figure.
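Algebraically, sufficiency is immediate from convexity (a one-line argument added here for completeness): since a differentiable convex function lies above its tangent planes, the right-hand side of (26.2) gives

\[
f(x) \;\ge\; f(x^*) + \langle \nabla f(x^*),\, x - x^* \rangle \;\ge\; f(x^*) \qquad \text{for all } x \in A.
\]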
The above statement continues to hold, with a small modification, even when f is not differentiable everywhere. In particular, in this case the equivalence (26.2) holds for any x∗ ∈ dom(∇f), with the modification that on the right-hand side of the equivalence A should be replaced by A ∩ dom(f):

x∗ ∈ argmin_{x ∈ A} f(x)  ⇔  ⟨∇f(x∗), x − x∗⟩ ≥ 0 for all x ∈ A ∩ dom(f).
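As a worked instance of the modified statement (an example added here, not from the text), let f(x) = −√x for x ≥ 0 and f(x) = ∞ otherwise, and let A = [−1, 1], so that A ∩ dom(f) = [0, 1]. Since f is decreasing on [0, 1], the minimizer is x∗ = 1 ∈ dom(∇f) with ∇f(x∗) = −1/2, and indeed

\[
\langle \nabla f(x^*),\, x - x^* \rangle \;=\; \frac{1 - x}{2} \;\ge\; 0 \qquad \text{for all } x \in A \cap \operatorname{dom}(f).
\]

Note that the point x = 0 cannot be tested this way, since 0 ∉ dom(∇f).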