Social Media Mining: An Introduction

(Axel Boer) #1

P1: WQS Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-06 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:15


152 Community Analysis

Figure 6.7. Examples of 2-Cliques, 2-Clubs, and 2-Clans.

 k-Clubis a more restricted definition; it follows the same definition
ask-cliques with the additional constraint that nodes on the shortest
paths should be part of the subgraph.
 k-Clanis ak-clique where, for all shortest paths within the subgraph,
the distance is less than or equal tok.Allk-clans arek-cliques and
k-clubs, but not vice versa. In other words,

k-Clans=k-Cliques∩k-Clubs.

Figure6.7depicts an example of the three discussed models.

Node Similarity

Node similarity attempts to determine the similarity between two nodesvi
andvj. Similar nodes (or most similar nodes) are assumed to be in the
same community. The question has been addressed in different fields; in
STRUCTURAL particular, the problem ofstructural equivalencein the field of sociology
EQUIVALENCE considers the same problem. In structural equivalence, similarity is based
on the overlap between the neighborhood of the vertices. LetN(vi) and
N(vj) be the neighbors of verticesviandvj, respectively. In this case, a
measure of vertex similarity can be defined as follows:

σ(vi,vj)=|N(vi)∩N(vj)|. (6.2)

For large networks, this value can increase rapidly, because nodes may
share many neighbors. Generally, similarity is attributed to a value that is
bounded and usually in the range [0,1]. For that to happen, various normal-
ization procedures such as the Jaccard similarity or the cosine similarity
can be done:

σJaccard(vi,vj)=

|N(vi)∩N(vj)|
|N(vi)∪N(vj)|

, (6.3)


σCosine(vi,vj)=

|N(vi)∩N(vj)|

|N(vi)||N(vj)|

. (6.4)

Free download pdf