Social Media Mining: An Introduction


CUUS2079-10 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:56


274 Behavior Analytics

[Figure 10.2: plot with vertical axis "probability" (ticks from 0 to 0.025) and horizontal axis ticks from 0 to 50.]
Figure 10.2. Probability of Joining a Community (with Error Bars) as a Function of the
Number of Friends m Already in the Community (from Backstrom et al. [2006]).
community (i.e., class attribute). Figure 10.2 depicts the probability of
joining a community with respect to the number of friends an individual
has who are already members of the community. The probability increases
as more friends are in a community, but a diminishing returns property is
also observed, meaning that when enough friends are inside the community,
more friends have no or only marginal effects on the likelihood of the
individual’s act of joining the community.
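The relationship plotted in Figure 10.2 can be estimated from observations by grouping individuals by how many of their friends are already members and taking the fraction who joined in each group. The following is a minimal sketch under stated assumptions: the function names and the toy observations are hypothetical and are not data from Backstrom et al. [2006].

```python
from collections import defaultdict


def join_probability_by_friend_count(observations):
    """Map each friend count k to the fraction of individuals with k friends who joined."""
    totals = defaultdict(int)
    joins = defaultdict(int)
    for friends_in_community, joined in observations:
        totals[friends_in_community] += 1
        if joined:
            joins[friends_in_community] += 1
    return {k: joins[k] / totals[k] for k in sorted(totals)}


def marginal_gains(prob_by_k):
    """Change in join probability between consecutive observed friend counts."""
    ks = sorted(prob_by_k)
    return [prob_by_k[b] - prob_by_k[a] for a, b in zip(ks, ks[1:])]


# Hypothetical toy data: (number of friends already in the community, joined?)
observations = (
    [(0, False)] * 10
    + [(1, True)] * 2 + [(1, False)] * 8
    + [(2, True)] * 3 + [(2, False)] * 7
    + [(3, True)] * 3 + [(3, False)] * 7
)

probs = join_probability_by_friend_count(observations)
gains = marginal_gains(probs)
print(probs)
print(gains)
```

In this toy data the estimated probability rises with the friend count while each additional friend adds less than the previous one, which is the diminishing returns pattern described above.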
Thus far we have defined only one feature. However, one can go beyond
a single feature. Figure 10.3 lists the comprehensive features that can be
used to analyze community-joining behavior.
As discussed, these features may or may not affect the joining behavior;
thus, a validation procedure is required to understand their effect on the
joining behavior. Which one of these features is more relevant to the joining
behavior? In other words, which feature can help best determine whether
individuals will join or not?
To answer this question, we can use any feature selection algorithm.
Feature selection algorithms determine features that contribute the most to
the prediction of the class attribute. Alternatively, we can use a classification
algorithm, such as decision tree learning, to identify the relationship
between features and the class attribute (i.e., joined = {Yes, No}). The earlier
a feature is selected in the learned tree (i.e., the closer it is to the root of the
tree), the more important it is to the prediction of the class attribute value.
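One common way to decide which feature a decision tree tests first is information gain, as in ID3-style learners; the feature with the highest gain becomes the root. The sketch below ranks features by that criterion. It is an illustration only: the text does not say which splitting criterion is used, and the feature names and rows are hypothetical and are not taken from Figure 10.3.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, feature, target="joined"):
    """Reduction in entropy of the class attribute after splitting on one feature."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def rank_features(rows, features, target="joined"):
    """Sort features from most to least informative about the class attribute."""
    scored = [(f, information_gain(rows, f, target)) for f in features]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data with two made-up binary features.
rows = [
    {"friends_in_community": "Yes", "community_is_large": "Yes", "joined": "Yes"},
    {"friends_in_community": "Yes", "community_is_large": "No", "joined": "Yes"},
    {"friends_in_community": "No", "community_is_large": "Yes", "joined": "No"},
    {"friends_in_community": "No", "community_is_large": "No", "joined": "No"},
]

ranked = rank_features(rows, ["community_is_large", "friends_in_community"])
print(ranked)
```

Here the first entry of `ranked` is the feature an information-gain tree would place at the root, and by the reasoning above it would be read as the most important feature for predicting the class attribute.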
