Social Media Mining: An Introduction

(Axel Boer) #1

P1: WQS Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-06 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:15


154 Community Analysis

v 1 v 4

v 3

v 2 v 5 v 7

v 8 v 9

A

C B
v 6

Figure 6.8. Minimum Cut (A) and Two More Balanced Cuts (B and C) in a Graph.

cut is minimized. Figure6.8depicts several cuts in a graph. For example,
MINIMUM cutBhas size 4, andAis the minimum cut.
CUT Based on the well-knownmax-flow min-cuttheorem, the minimum cut
of a graph can be computed efficiently. However, minimum cuts are not
always preferred for community detection. Often, they result in cuts where a
partition is only one node (singleton), and the rest of the graph is in the other.
Typically, communities with balanced sizes are preferred. Figure6.8depicts
an example where the minimum cut (A) creates unbalanced partitions,
whereas, cutCis a more balanced cut.
To solve this problem, variants of minimum cut define an objec-
tive function, minimizing (or maximizing) that during the cut-finding
procedure, results in a more balanced and natural partitioning of the
data. Consider a graphG(V,E). A partitioning of G intokpartitions
is a tuple⋃ P=(P 1 ,P 2 ,P 3 ,...,Pk), such thatPi⊆V,Pi∩Pj=∅and
k
i= 1 Pi=V. Then, the objective function for the ratio cut and normalized
cut are defined as follows:

RATIO CUT
AND
NORMALIZED
CUT

Ratio Cut(P)=

1


k

∑k

i= 1

cut(Pi,P ̄i)
|Pi|

, (6.7)


Normalized Cut(P)=

1


k

∑k

i= 1

cut(Pi,P ̄i)
vol(Pi)

, (6.8)


whereP ̄i=V−Piis the complement cut set,cut(Pi,P ̄i) is the size of
the cut, and volumevol(Pi)=


v∈Pidv. Both objective functions provide
a more balanced community size by normalizing the cut size either by the
number of vertices in the cutset or the volume (total degree).
Both the ratio cut and normalized cut can be formulated in a matrix for-
mat. Let matrixX∈{ 0 , 1 }|V|×kdenote thecommunity membershipmatrix,
where Xi,j=1 if nodei is in community j; otherwise,Xi,j=0. Let
D=diag(d 1 ,d 2 ,...,dn) represent the diagonal degree matrix. Then the
Free download pdf