Social Media Mining: An Introduction

(Axel Boer) #1

P1: WQS Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-04 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 16:54


98 Network Models

Algorithm 4.2Preferential Attachment
Require: GraphG(V 0 ,E 0 ), where|V 0 |=m 0 anddv≥ 1 ∀v∈V 0 , number
of expected connectionsm≤m 0 , time to run the algorithmt
1: return A scale-free network
2: //Initial graph withm 0 nodes with degrees at least
1
3: G(V,E)=G(V 0 ,E 0 );
4: for1totdo
5: V=V∪{vi};// add new nodevi
6: whiledi =mdo
7: Connectvi to a random nodevj∈V,i =j ( i.e.,E=E∪
{e(vi,vj)})
with probabilityP(vj)=∑dj
kdk

.


8: end while
9: end for
10: ReturnG(V,E)

node’s degree, the higher the probability of new nodes getting connected
to it. Unlike random graphs in which we assume friendships are formed
randomly, in the preferential attachment model we assume that individuals
are more likely to befriend gregarious others. The model’s algorithm is
provided in Algorithm4.2.
The algorithm starts with a graph containing a small set of nodesm 0
and then adds new nodes one at a time. Each new node gets to connect to
m≤m 0 other nodes, and each connection to existing nodevidepends on
the degree ofvi(i.e.,P(vi)=∑di
jdj

). Intrinsically, higher degree nodes get
more attention from newly added nodes. Note that the initialm 0 nodes must
have at least degree 1 for probabilityP(vi)=∑di
jdj

to be nonzero.
The model incorporates two ingredients – (1) thegrowthelement and
(2) thepreferential attachmentelement – to achieve a scale-free network.
The growth is realized by adding nodes as time goes by. The preferen-
tial attachment is realized by connecting to nodevibased on its degree
probability,P(vi)=∑di
jdj

. Removing any one of these ingredients gener-
ates networks that are not scale-free (see Exercises). Next, we show that
preferential attachment models are capable of generating networks with a
power-law degree distribution. They are also capable of generating small
average path length, but unfortunately fail at generating the high clustering
coefficients observed in real-world networks.

Free download pdf