Social Media Mining: An Introduction

P1: WQS Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-04 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 16:54

84 Network Models

Table 4.2. Average Path Length in Real-World Networks (from [Broder et al., 2000 ;Ugander et al., 2011;Mislove et al., 2007])

Web Facebook Flickr LiveJournal Orkut YouTube 16.12 4.7 5.67 5.88 4.25 5.10

4.1.3 Average Path Length

In real-world networks, any two members of the network are usually connected via short paths. In other words, the average path length is small. This is known as thesmall-worldphenomenon. In the well-knownsmall-world experimentconducted in the 1960s by Stanley Milgram, Milgram conjec- tured that people around the world are connected to one another via a path of at most six individuals (i.e.,the six degrees of separation). Similarly, we observe small average path lengths in social networks. For example, in May 2011, the average path length between individuals in the Facebook graph was 4.7. This average was 4.3 for individuals in the United States at the same time [Ugander et al., 2011]. Table4.2provides the average path length for real-world social networks and the web. These three properties – powerlaw degree distribution, high clustering coefficient, and small average path length are consistently observed in real-world networks. We design models based on simple assumptions on how friendships are formed, hoping that these models generate scale-free networks, with high clustering coefficient and small average path lengths. We start with the simplest network model, the random graph model.

4.2 Random Graphs

We start with the most basicassumptionon how friendships can be formed:

Edges (i.e., friendships) between nodes (i.e., individuals) are formed randomly.

The random graph model follows this basic assumption. In reality friend-

SMALL-WORLD AND SIX DEGREES OF SEPARATION

ships in real-world networks are far from random. By assuming random friendships, we simplify the process of friendship formation in real-world networks, hoping that these random friendships ultimately create networks that exhibit common characteristics observed in real-world networks. Formally, we can assume that for a graph with afixednumber of nodes n, any of the

(n 2

)

edges can be formed independently, with probabilityp. G(n,p) This graph is called arandom graphand we denote it as theG(n,p) model. This model was first proposed independently by Edgar Gilbert [ 1959 ] and

Social Media Mining: An Introduction

)

Get our desktop app

Company

Features

Documentation

Resources