Understanding Machine Learning: From Theory to Algorithms

22.8 Exercises

might find a solution whose k-means objective is at least t · OPT, where OPT is the minimum k-means objective.
2. k-Means Might Not Necessarily Converge to a Local Minimum:
Show that the k-means algorithm might converge to a point which is not
a local minimum. Hint: Suppose that k = 2 and the sample points are
{1, 2, 3, 4} ⊂ ℝ; suppose we initialize k-means with the centers {2, 4};
and suppose we break ties in the definition of C_i by assigning i to be the
smallest value in argmin_j ‖x − μ_j‖.
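The hinted instance can be checked mechanically. Below is a small sketch (my own plain-Python Lloyd iteration, using the same smallest-index tie-breaking rule): the centers {2, 4} are a fixed point of the update, yet the configuration {1.5, 3.5} achieves a strictly smaller objective, and the exercise asks you to argue that arbitrarily small perturbations of {2, 4} already improve it.

```python
def assign(points, centers):
    """C_i = {x : i = argmin_j |x - mu_j|}, breaking ties toward the smallest i."""
    clusters = [[] for _ in centers]
    for x in points:
        dists = [abs(x - m) for m in centers]
        clusters[dists.index(min(dists))].append(x)  # .index returns first argmin
    return clusters

def update(clusters, centers):
    """Move each center to its cluster's mean (keep it if the cluster is empty)."""
    return [sum(C) / len(C) if C else m for C, m in zip(clusters, centers)]

def objective(clusters, centers):
    """The k-means objective: sum of squared distances to assigned centers."""
    return sum((x - m) ** 2 for C, m in zip(clusters, centers) for x in C)

points = [1, 2, 3, 4]
centers = [2.0, 4.0]
clusters = assign(points, centers)        # [[1, 2, 3], [4]]: point 3 ties, goes left
new_centers = update(clusters, centers)   # [2.0, 4.0], so k-means has converged
better = [1.5, 3.5]
bc = assign(points, better)               # [[1, 2], [3, 4]]
print(objective(clusters, new_centers))   # 2.0
print(objective(bc, better))              # 1.0 < 2.0
```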



  3. Given a metric space (X, d), where |X| < ∞, and k ∈ ℕ, we would like to find
    a partition of X into C_1, ..., C_k which minimizes the expression

G_{k-diam}((X,d),(C_1,...,C_k)) = max_{j∈[k]} diam(C_j),

where diam(C_j) = max_{x,x′∈C_j} d(x, x′) (we use the convention diam(C_j) = 0
if |C_j| < 2).
Similarly to the k-means objective, it is NP-hard to minimize the k-diam
objective. Fortunately, we have a very simple approximation algorithm:
Initially, we pick some x ∈ X and set μ_1 = x. Then, the algorithm iteratively
sets

∀j ∈ {2,...,k},  μ_j = argmax_{x∈X} min_{i∈[j−1]} d(x, μ_i).

Finally, we set

∀i ∈ [k],  C_i = {x ∈ X : i = argmin_{j∈[k]} d(x, μ_j)}.
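The procedure above is a farthest-first traversal. A minimal runnable sketch (the helper names `k_diam` and `farthest_first` are mine, assuming a finite point list and a distance function; ties broken toward the smallest index, as in the k-means exercise):

```python
import itertools

def k_diam(d, clusters):
    """G_{k-diam}: the largest cluster diameter (0 for singleton clusters)."""
    return max(
        max((d(a, b) for a, b in itertools.combinations(C, 2)), default=0.0)
        for C in clusters
    )

def farthest_first(X, d, k):
    """Pick mu_1 arbitrarily, then repeatedly pick the point farthest from
    all centers chosen so far; finally assign each point to its nearest center."""
    mus = [X[0]]                                    # mu_1: an arbitrary point
    for _ in range(2, k + 1):
        # mu_j = argmax_{x in X} min_{i in [j-1]} d(x, mu_i)
        mus.append(max(X, key=lambda x: min(d(x, m) for m in mus)))
    clusters = [[] for _ in mus]
    for x in X:
        dists = [d(x, m) for m in mus]
        clusters[dists.index(min(dists))].append(x)  # ties -> smallest index
    return mus, clusters
```

On X = {0, 1, 2, 10, 11, 12} with d(a, b) = |a − b| and k = 2, the centers come out as 0 and 12, and both resulting clusters have diameter 2.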

Prove that the algorithm described is a 2-approximation algorithm. That
is, if we denote its output by Ĉ_1,...,Ĉ_k, and denote the optimal solution by
C*_1,...,C*_k, then,

G_{k-diam}((X,d),(Ĉ_1,...,Ĉ_k)) ≤ 2 · G_{k-diam}((X,d),(C*_1,...,C*_k)).

Hint: Consider the point μ_{k+1} (in other words, the next center we would have
chosen, if we wanted k + 1 clusters). Let r = min_{j∈[k]} d(μ_j, μ_{k+1}). Prove the
following inequalities:

G_{k-diam}((X,d),(Ĉ_1,...,Ĉ_k)) ≤ 2r,
G_{k-diam}((X,d),(C*_1,...,C*_k)) ≥ r.
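For orientation, here is one way the two hinted inequalities can be combined (a sketch under the definitions above, not a full solution):

```latex
\[
  \forall x \in X:\quad
  \min_{i\in[k]} d(x,\mu_i) \;\le\; \min_{i\in[k]} d(\mu_{k+1},\mu_i) \;=\; r,
\]
% since \mu_{k+1} maximizes the left-hand side over x.  Hence each
% x \in \hat{C}_j has d(x,\mu_j) \le r, and the triangle inequality gives
% diam(\hat{C}_j) \le 2r.  For the lower bound, \mu_1,\dots,\mu_{k+1} are
% pairwise at distance \ge r (the farthest-first values are non-increasing
% in j), so by pigeonhole two of these k+1 points share one optimal cluster
% C^*_\ell, forcing diam(C^*_\ell) \ge r.
```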


  4. Recall that a clustering function, F, is called Center-Based Clustering if, for
    some monotonic function f: ℝ_+ → ℝ_+, on every given input (X, d), F(X, d)
    is a clustering that minimizes the objective

G_f((X,d),(C_1,...,C_k)) = min_{μ_1,...,μ_k ∈ X′} ∑_{i=1}^{k} ∑_{x∈C_i} f(d(x, μ_i)),

where X′ is either X or some superset of X.
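To make the objective concrete, here is a small sketch (the helper `G_f` is mine, assuming the finite-domain case X′ = X; with X′ = ℝ and f(t) = t² this family recovers the k-means objective). Because the sum is separable, the minimum over (μ_1,...,μ_k) decomposes: each μ_i only affects its own cluster's cost.

```python
def G_f(d, clusters, f, center_domain):
    """G_f((X,d),(C_1,...,C_k)): minimize sum_i sum_{x in C_i} f(d(x, mu_i))
    over centers mu_i drawn from center_domain (playing the role of X')."""
    return sum(
        min(sum(f(d(x, mu)) for x in C) for mu in center_domain)
        for C in clusters
    )

X = [0.0, 1.0, 5.0, 6.0]
d = lambda a, b: abs(a - b)
clusters = [[0.0, 1.0], [5.0, 6.0]]
# f(t) = t^2, centers restricted to X' = X
cost = G_f(d, clusters, lambda t: t * t, X)
print(cost)  # 2.0: the best centers are 0 (or 1) and 5 (or 6), each cluster costing 1
```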