Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

314 Clustering


Proof To simplify the notation, let us use the shorthandG(C 1 ,...,Ck) for the
k-means objective, namely,

G(C 1 ,...,Ck) =μ min
1 ,...,μk∈Rn

∑k

i=1


x∈Ci

‖x−μi‖^2. (22.2)

It is convenient to defineμ(Ci) =|C^1 i|


x∈Cixand note thatμ(Ci) = argminμ∈Rn


x∈Ci‖x−
μ‖^2. Therefore, we can rewrite thek-means objective as

G(C 1 ,...,Ck) =

∑k

i=1


x∈Ci

‖x−μ(Ci)‖^2. (22.3)

Consider the update at iterationtof thek-means algorithm. LetC 1 (t−1),...,Ck(t−1)
be the previous partition, letμ(it−1)=μ(Ci(t−1)), and letC 1 (t),...,Ck(t)be the
new partition assigned at iterationt. Using the definition of the objective as
given in Equation (22.2) we clearly have that

G(C 1 (t),...,Ck(t))≤

∑k

i=1


x∈Ci(t)

‖x−μ(it−1)‖^2. (22.4)

In addition, the definition of the new partition (C 1 (t),...,Ck(t)) implies that it
minimizes the expression

∑k
i=1


x∈Ci‖x−μ

(t−1)
i ‖^2 over all possible partitions
(C 1 ,...,Ck). Hence,

∑k

i=1


x∈C(it)

‖x−μ(it−1)‖^2 ≤

∑k

i=1


x∈C(it−1)

‖x−μ(it−1)‖^2. (22.5)

Using Equation (22.3) we have that the right-hand side of Equation (22.5) equals
G(C 1 (t−1),...,Ck(t−1)). Combining this with Equation (22.4) and Equation (22.5),
we obtain thatG(C( 1 t),...,Ck(t))≤G(C 1 (t−1),...,Ck(t−1)), which concludes our
proof.

While the preceding lemma tells us that thek-means objective is monotonically
nonincreasing, there is no guarantee on the number of iterations thek-means al-
gorithm needs in order to reach convergence. Furthermore, there is no nontrivial
lower bound on the gap between the value of thek-means objective of the al-
gorithm’s output and the minimum possible value of that objective function. In
fact,k-means might converge to a point which is not even a local minimum (see
Exercise 2). To improve the results ofk-means it is often recommended to repeat
the procedure several times with different randomly chosen initial centroids (e.g.,
we can choose the initial centroids to be random points from the data).
Free download pdf