Social Media Mining: An Introduction

P1: WQS Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-04 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 16:54

86 Network Models

using notationcorkin the literature. Since we frequently usekto denote degree values, we usecto denote the expected degree of a random graph,

c=(n−1)p, (4.4)

or equivalently,

p=

c n− 1

. (4.5)

Proposition 4.2.The expected number of edges in G(n,p)is

(n 2

)

p.

Proof.Following the same line of argument, because edges are selected independently and we have a maximum of

(n 2

)

edges, the expected number of edges is

(n 2

)

p.

Proposition 4.3. In a graph generated by G(n,p)model, the probability of observing m edges is

P(|E|=m)=

((n 2

)

m

)

pm(1−p)(

n 2 )−m , (4.6)

which is a binomial distribution.

Proof. medges are selected from the

(n 2

)

possible edges. These edges are formed with probabilitypm, and other edges are not formed (to guarantee the existence of onlymedges) with probability (1−p)(

n 2 )−m .

Given these basic propositions, we next analyze how random graphs evolve as we add edges to them.

4.2.1 Evolution of Random Graphs In random graphs, when nodes form connections, after some time a large fraction of nodes get connected (i.e., there is a path between any pair of them). This large fraction forms aconnected component, commonly called GIANT thelargest connected component or thegiant component. We can tune the COMPONENT behavior of the random graph model by selecting the appropriatepvalue. InG(n,p), whenp=0, the size of the largest connected component is 0 (no two pairs are connected), and whenp=1, the size isn(all pairs are connected). Table4.3provides the size of the largest connected component (slcin the figure) for random graphs with 10 nodes and differentpvalues.

Social Media Mining: An Introduction

. (4.5)

)

)

)

)

)

)

Get our desktop app

Company

Features

Documentation

Resources