Social Media Mining: An Introduction


CUUS2079-06 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:15


6.3 Community Evaluation

(a) U.S. Constitution (b) Sports
Figure 6.16. Tag Clouds for Two Communities.

Evaluation with Semantics
A simple way of analyzing detected communities is to examine other
attributes of community members (posts, profile information, generated
content, etc.) to see whether the members are coherent. Coherence is
often checked by human subjects. For example, the Amazon Mechanical Turk
platform^4 allows one to define such a task for human workers and to hire
individuals from around the globe to perform it. To help analyze these
communities, one can use word frequencies. A list of frequent keywords is
generated for each community, and human subjects determine whether these
keywords represent a coherent topic. A more focused, single-topic set of
keywords indicates a more coherent community. Tag clouds are one way of
displaying these topics. Figure 6.16 depicts two coherent tag clouds, one
for a community related to the U.S. Constitution and another for sports.
Larger words in these tag clouds represent higher frequency of use.
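
The keyword lists described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the text: the function name top_keywords, the tiny stop-word list, and the sample posts are all hypothetical, and a real study would use a fuller stop-word list and actual member content.

```python
from collections import Counter
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "on"}

def top_keywords(posts_by_community, k=3):
    """Return the k most frequent non-stop-words for each community."""
    result = {}
    for community, posts in posts_by_community.items():
        counts = Counter()
        for post in posts:
            # Lowercase and split into alphabetic tokens.
            words = re.findall(r"[a-z']+", post.lower())
            counts.update(w for w in words if w not in STOP_WORDS)
        result[community] = counts.most_common(k)
    return result

# Made-up posts for two detected communities (illustrative data only).
posts_by_community = {
    "c1": ["The amendment protects freedom of speech",
           "Congress debated the amendment",
           "Freedom and rights in the amendment"],
    "c2": ["The team won the game",
           "Great game for the team",
           "The coach praised the team"],
}

keywords = top_keywords(posts_by_community, k=2)
for community, words in keywords.items():
    print(community, words)
```

The resulting (word, count) pairs are what a human subject would inspect, and the counts are what a tag cloud would map to font size.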

Evaluation Using Clustering Quality Measures
When experts are not available, an alternative is to use clustering quality
measures. This approach is commonly used when two or more community
detection algorithms are available. Each algorithm is run on the target
network, and the quality measure is computed for the communities it
identifies. The algorithm that yields the more desirable quality measure
value is considered the better algorithm. SSE (sum of squared errors) and
inter-cluster distance are examples of such quality measures. For other
measures, refer to Chapter 5.
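
As a rough illustration of comparing two algorithms by a quality measure, the sketch below computes SSE for two candidate community assignments over the same nodes. It assumes each node has a numeric feature vector (here, made-up 2-D points) and that lower SSE is more desirable; the names sse, labels_a, and labels_b are hypothetical and stand in for the outputs of two community detection algorithms.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))

def sse(points, labels):
    """Sum of squared distances from each point to its community centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        c = centroid(members)
        for p in members:
            total += sum((x - cx) ** 2 for x, cx in zip(p, c))
    return total

# Illustrative node features (an assumption; real networks need an embedding
# or attribute vector per node before SSE can be applied).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

labels_a = [0, 0, 1, 1]  # hypothetical output of algorithm A
labels_b = [0, 1, 0, 1]  # hypothetical output of algorithm B

sse_a = sse(points, labels_a)
sse_b = sse(points, labels_b)
better = "A" if sse_a < sse_b else "B"
print(sse_a, sse_b, better)
```

Under this measure the assignment with the smaller SSE groups nearby nodes together more tightly, so its algorithm would be preferred.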
We can also follow this approach for evaluating a single community
detection algorithm; however, we must ensure that the clustering quality
measure used to evaluate community detection is different from the measure