Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

312 Clustering


parameter. In practice, it is often up to the user of the clustering algorithm to
choose the parameterkthat is most suitable for the given clustering problem.
In the following we describe some of the most common objective functions.


  • Thek-means objective functionis one of the most popular clustering
    objectives. Ink-means the data is partitioned into disjoint setsC 1 ,...,Ck
    where eachCiis represented by a centroidμi. It is assumed that the input
    setXis embedded in some larger metric space (X′,d) (so thatX ⊆ X′)
    and centroids are members ofX′. Thek-means objective function measures
    the squared distance between each point inXto the centroid of its cluster.
    The centroid ofCiis defined to be
    μi(Ci) = argmin
    μ∈X′



x∈Ci

d(x,μ)^2.

Then, thek-means objective is

Gk−means((X,d),(C 1 ,...,Ck)) =

∑k

i=1


x∈Ci

d(x,μi(Ci))^2.

This can also be rewritten as

Gk−means((X,d),(C 1 ,...,Ck)) = min
μ 1 ,...μk∈X′

∑k

i=1


x∈Ci

d(x,μi)^2. (22.1)

Thek-means objective function is relevant, for example, in digital com-
munication tasks, where the members ofXmay be viewed as a collection
of signals that have to be transmitted. WhileXmay be a very large set
of real valued vectors, digital transmission allows transmitting of only a
finite number of bits for each signal. One way to achieve good transmis-
sion under such constraints is to represent each member ofXby a “close”
member of some finite setμ 1 ,...μk, and replace the transmission of any
x∈ Xby transmitting the index of the closestμi. Thek-means objective
can be viewed as a measure of the distortion created by such a transmission
representation scheme.


  • Thek-medoids objective functionis similar to thek-means objective,
    except that it requires the cluster centroids to be members of the input
    set. The objective function is defined by


GK−medoid((X,d),(C 1 ,...,Ck)) =μ min
1 ,...μk∈X

∑k

i=1


x∈Ci

d(x,μi)^2.


  • Thek-median objective functionis quite similar to thek-medoids objec-
    tive, except that the “distortion” between a data point and the centroid
    of its cluster is measured by distance, rather than by the square of the
    distance:


GK−median((X,d),(C 1 ,...,Ck)) =μ min
1 ,...μk∈X

∑k

i=1


x∈Ci

d(x,μi).
Free download pdf