Social Media Mining: An Introduction

(Axel Boer) #1

P1: qVa Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-03 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 16:45


3.4 Similarity 75

vk


vi


vj


σregular
(vk, v
j)

σregular

(vi, vj

)


(a) Original Formulation (b) Relaxed Formulation

σregular(vk ,vl)


vi σregular(vi ,vj) vj


vk vl


Figure 3.15. Regular Equivalence. Solid lines denote edges, and dashed lines denote
similarities between nodes. In regular equivalence, similarity between nodesviandvj
is replaced by similarity between (a) their neighborsvkandvlor between (b) neighbor
vkand nodevj.

Unfortunately, this formulation is self-referential because solving fori
andjrequires solving forkandl, solving forkandlrequires solving for
their neighbors, and so on. So, we relax this formulation and assume that
nodeviis similar to nodevjwhenvjis similar tovi’s neighborsvk. This is
shown in Figure3.15(b). Formally,

σregular(vi,vj)=α


k

Ai,kσRegular(vk,vj). (3.66)

In vector format, we have

σregular=αAσRegular. (3.67)

A node is highly similar to itself. To make sure that our formulation
guarantees this, we can add an identity matrix to this vector format. Adding
an identity matrix will add 1 to all diagonal entries, which represent self-
similaritiesσregular(vi,vi):

σregular=αAσRegular+I. (3.68)

By rearranging, we get

σregular=(I−αA)−^1 , (3.69)

which we can use to find the regular equivalence similarity.
Note the similarity between Equation3.69and that of Katz centrality
(Equation3.21). As with Katz centrality, we must be careful how we choose
αfor convergence. A common practice is to select anαsuch thatα< 1 /λ,
whereλis the largest eigenvalue ofA.
Free download pdf