Social Media Mining: An Introduction


CUUS2079-END CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 19:33


Notes


Chapter 1


  1. The data has a power-law distribution and, more often than not, is not
    independent and identically distributed (i.i.d.), as is generally assumed in
    data mining.


Chapter 2


  1. This is similar to plotting the probability mass function for degrees.

  2. Instead of $W$ in weighted networks, $C$ is used to clearly represent capacities.

  3. This edge is often called the weak link.

  4. The proof is omitted here; it follows directly from the minimum-cut/maximum-flow
    theorem, which is not discussed in this chapter.


Chapter 3


  1. This constraint is optional and can be lifted based on the context.

  2. When $\det(I - \alpha A^T) = 0$, it can be rearranged as
    $\det(A^T - \alpha^{-1} I) = 0$, which is basically the characteristic equation.
    This equation first becomes zero when the largest eigenvalue equals $\alpha^{-1}$,
    or equivalently $\alpha = 1/\lambda$.

  3. When $d_j^{out} = 0$, the out-degree is zero, so $\forall i$, $A_{j,i} = 0$; this
    makes the term inside the summation $\frac{0}{0}$. We can fix this problem by
    setting $d_j^{out} = 1$, since the node will not contribute any centrality to any
    other nodes.

  4. Here, we start from $v_1$ and follow the edges. One can start from a different
    node, and the result should remain the same.

  5. HITS stands for hypertext-induced topic search.
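
The fix described in note 3 can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: it assumes the common PageRank form in which node $v_i$ receives $\alpha \sum_j A_{j,i} C(v_j) / d_j^{out} + \beta$, and the function name `pagerank`, the values of `alpha` and `beta`, and the three-node graph are all hypothetical choices made for the example.

```python
def pagerank(adj, alpha=0.85, beta=0.15, iters=100):
    """Iterate c_i = alpha * sum_j A[j][i] * c_j / d_out[j] + beta.

    adj is a square 0/1 adjacency matrix given as a list of lists,
    where adj[j][i] == 1 means there is an edge from node j to node i.
    alpha and beta are assumed values, not taken from the text.
    """
    n = len(adj)
    # The out-degree of node j is the sum of row j.
    d_out = [sum(row) for row in adj]
    # Fix from note 3: a node with no out-edges gets d_out = 1.
    # Its row is all zeros, so each term becomes 0 / 1 = 0 instead of 0 / 0.
    d_out = [d if d != 0 else 1 for d in d_out]

    c = [1.0 / n] * n
    for _ in range(iters):
        c = [
            alpha * sum(adj[j][i] * c[j] / d_out[j] for j in range(n)) + beta
            for i in range(n)
        ]
    return c


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2. Node 2 has no out-edges.
adj = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
scores = pagerank(adj)
```

Without the replacement of zero out-degrees, the division for node 2 would raise a `ZeroDivisionError`; with it, node 2 simply passes no centrality on to other nodes.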


Chapter 4


  1. For a more detailed approach refer to [Clauset et al., 2009].

  2. Note that for $c = 1$, the component size is stable, and in the limit, no growth
    will be observed. The phase transition happens exactly at $c = 1$.

  3. Hint: The proof is similar to the proof provided for the likelihood of observing
    $m$ edges (Proposition 4.3).
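
The phase transition mentioned in note 2 can be observed in a small simulation. The sketch below is illustrative only and rests on stated assumptions: it takes $c$ to be the average degree of a random graph $G(n, p)$ with $p = c/(n-1)$, and the function name `largest_component_fraction`, the graph size, and the seed are hypothetical choices, not values from the text.

```python
import random


def largest_component_fraction(n, c, seed=0):
    """Sample G(n, p) with p = c / (n - 1) and return the share of
    nodes that fall in the largest connected component."""
    rng = random.Random(seed)
    p = c / (n - 1)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Include each possible edge independently with probability p
    # and merge the components of its two endpoints.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                parent[find(u)] = find(v)

    # Count the nodes under each component root.
    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


# Compare average degrees below, at, and above the assumed threshold c = 1.
fractions = {c: largest_component_fraction(1000, c) for c in (0.5, 1.0, 2.0)}
```

For an average degree well below 1, the largest component should cover only a small share of the nodes, whereas well above 1 it should cover most of them; the exact values depend on the sample.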

