Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1
22.3 Spectral Clustering 315

22.3 Spectral Clustering


Often, a convenient way to represent the relationships between points in a data
setX ={x 1 ,...,xm}is by asimilarity graph; each vertex represents a data
pointxi, and every two vertices are connected by an edge whose weight is their
similarity,Wi,j=s(xi,xj), whereW ∈Rm,m. For example, we can setWi,j=
exp(−d(xi,xj)^2 /σ^2 ), whered(·,·) is a distance function andσis a parameter.
The clustering problem can now be formulated as follows: We want to find a
partition of the graph such that the edges between different groups have low
weights and the edges within a group have high weights.
In the clustering objectives described previously, the focus was on one side
of our intuitive definition of clustering – making sure that points in the same
cluster are similar. We now present objectives that focus on the other requirement


  • points separated into different clusters should be nonsimilar.


22.3.1 Graph Cut


Given a graph represented by a similarity matrixW, the simplest and most
direct way to construct a partition of the graph is to solve the mincut problem,
which chooses a partitionC 1 ,...,Ckthat minimizes the objective

cut(C 1 ,...,Ck) =

∑k

i=1


r∈Ci,s/∈Ci

Wr,s.

Fork= 2, the mincut problem can be solved efficiently. However, in practice it
often does not lead to satisfactory partitions. The problem is that in many cases,
the solution of mincut simply separates one individual vertex from the rest of the
graph. Of course, this is not what we want to achieve in clustering, as clusters
should be reasonably large groups of points.
Several solutions to this problem have been suggested. The simplest solution
is to normalize the cut and define the normalized mincut objective as follows:

RatioCut(C 1 ,...,Ck) =

∑k

i=1

1

|Ci|


r∈Ci,s/∈Ci

Wr,s.

The preceding objective assumes smaller values if the clusters are not too small.
Unfortunately, introducing this balancing makes the problem computationally
hard to solve. Spectral clustering is a way to relax the problem of minimizing
RatioCut.

22.3.2 Graph Laplacian and Relaxed Graph Cuts


The main mathematical object for spectral clustering is the graph Laplacian
matrix. There are several different definitions of graph Laplacian in the literature,
and in the following we describe one particular definition.
Free download pdf