Social Media Mining: An Introduction


CUUS2079-06 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:15


172 Community Analysis

normalize mutual information. We provide the following equation, without
proof, which will help us normalize mutual information,

$$MI \le \min(H(L), H(H)), \tag{6.45}$$

where H(·) is the entropy function,

$$H(L) = -\sum_{l \in L} \frac{n_l}{n} \log \frac{n_l}{n} \tag{6.46}$$


$$H(H) = -\sum_{h \in H} \frac{n_h}{n} \log \frac{n_h}{n}. \tag{6.47}$$
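As an illustration of Equations 6.46 and 6.47, the entropy of a labeling can be computed directly from the group sizes. The following is a minimal sketch, not part of the original text: the function name `entropy` and the list-based input (one label or community identifier per node) are our own assumptions, and the natural logarithm is assumed because the base is not fixed here.

```python
from collections import Counter
from math import log


def entropy(assignments):
    """Entropy of a partition, following the form of Equations 6.46/6.47.

    `assignments` holds one label (or community id) per node; this input
    format is an assumption made for illustration.
    """
    n = len(assignments)
    counts = Counter(assignments)  # group sizes: n_l or n_h
    # Natural logarithm is an assumption; another base only rescales H.
    return -sum((c / n) * log(c / n) for c in counts.values())


if __name__ == "__main__":
    labels = ["a", "a", "b", "b"]
    print(entropy(labels))  # two equally sized groups give log(2)
```

The same function serves for both H(L) and H(H), since the two equations differ only in which partition supplies the group sizes.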


From Equation 6.45, we have MI ≤ H(L) and MI ≤ H(H); because MI is nonnegative, multiplying these two inequalities gives

$$(MI)^2 \le H(H)\,H(L). \tag{6.48}$$

Equivalently,

$$MI \le \sqrt{H(H)}\,\sqrt{H(L)}. \tag{6.49}$$


Equation 6.49 can be used to normalize mutual information. Thus, we
introduce the NMI as

$$NMI = \frac{MI}{\sqrt{H(L)}\,\sqrt{H(H)}}. \tag{6.50}$$


By plugging Equations 6.47, 6.46, and 6.44 into Equation 6.50,

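$$NMI = \frac{\sum_{h \in H} \sum_{l \in L} n_{h,l} \log \frac{n \cdot n_{h,l}}{n_h n_l}}{\sqrt{\left(\sum_{h \in H} n_h \log \frac{n_h}{n}\right)\left(\sum_{l \in L} n_l \log \frac{n_l}{n}\right)}}. \tag{6.51}$$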


An NMI value close to one indicates high similarity between the communities
found and the labels. A value close to zero indicates that the communities
found carry little information about the labels.
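To make Equation 6.51 concrete, the sketch below computes NMI from two parallel lists, one holding the community found for each node and the other its ground-truth label. The function name `nmi`, the list-based input, and the choice to return 0.0 when either partition consists of a single group (where the denominator of Equation 6.51 is zero) are our own assumptions for illustration, not part of the original text.

```python
from collections import Counter
from math import log, sqrt


def nmi(communities, labels):
    """Normalized mutual information in the form of Equation 6.51.

    communities[i] is the community found for node i and labels[i] is
    its ground-truth label (hypothetical input format).
    """
    if len(communities) != len(labels):
        raise ValueError("both lists must have one entry per node")
    n = len(labels)
    n_h = Counter(communities)               # community sizes n_h
    n_l = Counter(labels)                    # label sizes n_l
    n_hl = Counter(zip(communities, labels)) # joint counts n_{h,l}

    # Numerator: sum over observed (h, l) pairs; pairs with n_{h,l} = 0
    # contribute nothing and are simply absent from the counter.
    numerator = sum(
        c * log(n * c / (n_h[h] * n_l[l])) for (h, l), c in n_hl.items()
    )
    sum_h = sum(c * log(c / n) for c in n_h.values())
    sum_l = sum(c * log(c / n) for c in n_l.values())
    denominator = sqrt(sum_h * sum_l)
    if denominator == 0:
        # Assumption: Equation 6.51 is undefined when a partition has a
        # single group; we return 0.0 in that case.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    print(nmi([1, 1, 2, 2], ["x", "x", "y", "y"]))  # identical partitions
    print(nmi([1, 1, 2, 2], ["x", "y", "x", "y"]))  # independent partitions
```

In the first call the communities match the labels exactly, so the value is one up to floating-point rounding; in the second, every community contains each label equally often, so the value is zero. The base of the logarithm cancels between numerator and denominator, so the natural logarithm used here does not affect the result.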

6.3.2 Evaluation without Ground Truth
When no ground truth is available, we can incorporate techniques based on
semantics or clustering quality measures to evaluate community detection
algorithms.