Social Media Mining: An Introduction

(Axel Boer) #1

P1: WQS Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-02 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 16:38


2.1 Graph Basics 17

important role in describing the network being studied. Any distribution
can be described by its members. In our case, these are the degrees of all
nodes in the graph. The degree distributionpd(orp(d), orp(dv=d)gives
the probability that a randomly selected nodevhas degreed. Becausepd
is a probability distribution

∑∞


d= 1 pd=1. In a graph withnnodes,pdis
defined as

pd=

nd
n

, (2.5)


wherendis the number of nodes with degreed. An important, commonly
performed procedure is to plot a histogram of the degree distribution, in
which thex-axis represents the degree (d) and they-axis represents either
(i) the number of nodes having that degree (nd) or (2) the fraction of nodes
having that degree (pd).

Example 2.1.For the graph provided in Figure2.1, the degree distribution
pdfor d={ 1 , 2 , 3 , 4 }is

p 1 =^17 , p 2 =^47 , p 3 =^17 , p 4 =^17. (2.6)

Because we have four nodes have degree 2, and degrees 1, 3, and 4 are
observed once.

Example 2.2.In social networking sites, friendship relationships can be
represented by a large graph. In this graph, nodes represent individuals and
the edges represent friendship relationships. We can compute the degrees
and plot the degree distribution using a graph where thex-axis is the
degree and they-axis is the fraction of nodes with that degree.^1 The degree
distribution plot for Facebook in May 2012 is shown in Figure2.3.A
general trend observable in social networking sites is that there exist many
users with few connections and there exist a handful of users with very
large numbers of friends. This is commonly called the power-law degree
distribution. POWER-LAW
DISTRIBUTION
As previously discussed, any graphGcan be represented as a pair
G(V,E), whereV is the node set andE is the edge set. Since edges
are between nodes, we have

E⊆V×V. (2.7)
Free download pdf