Social Media Mining: An Introduction


CUUS2079-08 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:22


8.4 Distinguishing Influence and Homophily 239

Algorithm 8.3 Influence Significance Test
Require: G_t, G_{t+1}, X_t, X_{t+1}, number of randomized runs n, α
Return: Significance
  g_0 = GInfluence(t);
  for all 1 ≤ i ≤ n do
    XR^i_{t+1} = randomize_I(X_t, X_{t+1});
    g_i = A(G_t, XR^i_{t+1}) − A(G_t, X_t);
  end for
  if g_0 larger than (1 − α/2)% of values in {g_i}_{i=1}^n then
    return significant;
  else if g_0 smaller than α/2% of values in {g_i}_{i=1}^n then
    return significant;
  else
    return insignificant;
  end if

(influence), or A(G_{t+1}, X_t) − A(G_t, X_t) (homophily), are significant or
not. To detect change significance, we use the influence significance test
and homophily significance test algorithms outlined in Algorithms 8.3
and 8.4, respectively. The influence significance algorithm starts by
computing the influence gain g_0, which is the assortativity difference
observed due to influence. It then forms a random attribute set at time
t + 1 (the null hypothesis), assuming that attributes changed randomly at
t + 1 and not due to influence. This random attribute set XR^i_{t+1} is
formed from X_{t+1} by making sure that the effects of influence in
changing attributes are removed.
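
The steps of Algorithm 8.3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `influence_significance` and `shared_attribute_fraction` are hypothetical, the assortativity measure A is replaced by a simple stand-in (the fraction of edges whose endpoints share an attribute value), the observed gain g_0 is assumed to be A(G_t, X_{t+1}) − A(G_t, X_t) by analogy with the formula for g_i, and the test is read as two-tailed (g_0 is significant when it lies above or below all but an α/2 fraction of the randomized gains). The randomization step is passed in as a function.

```python
import random
from typing import Callable, Dict, Hashable, List, Optional, Set, Tuple

Attrs = Dict[Hashable, Set[str]]         # node -> attribute values
Edges = List[Tuple[Hashable, Hashable]]  # edges of G_t


def shared_attribute_fraction(edges: Edges, attrs: Attrs) -> float:
    """Hypothetical stand-in for A(G, X): share of edges whose endpoints
    have at least one attribute value in common."""
    if not edges:
        return 0.0
    hits = sum(1 for u, v in edges if attrs.get(u, set()) & attrs.get(v, set()))
    return hits / len(edges)


def influence_significance(
    edges_t: Edges,
    x_t: Attrs,
    x_t1: Attrs,
    randomize: Callable[[Attrs, Attrs, random.Random], Attrs],
    n: int,
    alpha: float,
    assortativity: Callable[[Edges, Attrs], float] = shared_attribute_fraction,
    rng: Optional[random.Random] = None,
) -> bool:
    """Return True if the observed influence gain g_0 is significant."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = rng or random.Random()
    base = assortativity(edges_t, x_t)

    # Assumed definition of the observed gain: A(G_t, X_{t+1}) - A(G_t, X_t).
    g0 = assortativity(edges_t, x_t1) - base

    # Gains under the null hypothesis: g_i = A(G_t, XR^i_{t+1}) - A(G_t, X_t).
    gains = [
        assortativity(edges_t, randomize(x_t, x_t1, rng)) - base
        for _ in range(n)
    ]

    # Two-tailed reading: g_0 sits above, or below, at least a (1 - alpha/2)
    # fraction of the randomized gains.
    frac_below = sum(g < g0 for g in gains) / n
    frac_above = sum(g > g0 for g in gains) / n
    threshold = 1 - alpha / 2
    return frac_below >= threshold or frac_above >= threshold
```

Passing the assortativity measure and the randomization procedure as arguments keeps the test itself independent of how either is defined, so a different measure for A can be substituted without touching the comparison logic.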
For instance, assume two users u and v are connected at time t, and
u has the hobby movies at time t while v does not have this hobby listed
at time t. Now assume that u influences v, so that at time t + 1, v adds
movies to her set of hobbies. In other words, movies ∉ X^v_t and
movies ∈ X^v_{t+1}. To remove this influence, we can construct XR^i_{t+1}
by removing movies from the hobbies of v at time t + 1 and adding some
random hobby such as reading, which is ∉ X^u_t and ∉ X^v_t, to the list of
hobbies of v at time t + 1 in XR^i_{t+1}. This guarantees that the randomized
XR^i_{t+1} constructed has no sign of influence. We construct this
randomized set n times; this set is then used to compute influence gains
{g_i}_{i=1}^n.
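
The randomization in the movies example can be sketched as below. This is one possible reading, not the book's definition of randomize_I: the function name `randomize_influence`, the `neighbors_t` adjacency map, and the `vocabulary` of available hobbies are all assumptions introduced for illustration. An attribute counts as possibly influenced when a node gains it at time t + 1 and one of its neighbors already had it at time t; it is then swapped for a random attribute that neither the node nor its neighbors had at time t.

```python
import random
from typing import Dict, Hashable, Iterable, Optional, Set

Attrs = Dict[Hashable, Set[str]]  # node -> attribute values


def randomize_influence(
    x_t: Attrs,
    x_t1: Attrs,
    neighbors_t: Dict[Hashable, Iterable[Hashable]],
    vocabulary: Iterable[str],
    rng: Optional[random.Random] = None,
) -> Attrs:
    """Return a copy of x_t1 with possibly influenced attributes replaced."""
    rng = rng or random.Random()
    all_values = set(vocabulary)
    # Copy every attribute set so the input x_t1 is left untouched.
    randomized = {node: set(values) for node, values in x_t1.items()}

    for v, values_t1 in x_t1.items():
        old = x_t.get(v, set())

        # Everything the neighbors of v had at time t.
        neighbor_attrs: Set[str] = set()
        for u in neighbors_t.get(v, ()):
            neighbor_attrs |= x_t.get(u, set())

        # Newly gained attributes that a neighbor already had.
        influenced = (values_t1 - old) & neighbor_attrs

        for attr in sorted(influenced):
            # Replacement must be new to v and absent from its neighbors at t.
            candidates = sorted(all_values - neighbor_attrs - old - randomized[v])
            randomized[v].discard(attr)
            if candidates:
                randomized[v].add(rng.choice(candidates))
    return randomized


# The example from the text: u has movies at time t, v adds it at t + 1.
x_t = {"u": {"movies"}, "v": set()}
x_t1 = {"u": {"movies"}, "v": {"movies"}}
neighbors_t = {"u": ["v"], "v": ["u"]}
xr = randomize_influence(x_t, x_t1, neighbors_t, ["movies", "reading", "hiking"])
```

In this sketch `xr["v"]` no longer contains movies and instead holds one randomly chosen hobby from the remaining vocabulary, while the attributes of u are unchanged. Calling the function n times with different random states yields the n randomized sets.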
Obviously, the more distant g_0 is from these gains, the more significant
influence is. We can assume that whenever g_0 is smaller than α/2% (or larger