Social Media Mining: An Introduction

CUUS2079-END CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 19:33

Notes

Chapter 5

See [Zafarani, Cole, and Liu, 2010] for a repository of network data.

One can use either all unique words in all documents (D) or only a subset of the
more frequent words in the documents for vectorization.
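
The two vocabulary choices can be sketched as follows. This is a minimal illustration, not the book's implementation: the function names `build_vocabulary` and `vectorize`, the `top_k` parameter, whitespace tokenization, and the sample documents are all assumptions made for the example.

```python
from collections import Counter

def build_vocabulary(documents, top_k=None):
    """Return the word list used as vector dimensions.

    top_k=None keeps every unique word; an integer keeps only the
    top_k most frequent words across all documents (assumed cutoff rule).
    """
    counts = Counter(word for doc in documents for word in doc.lower().split())
    if top_k is None:
        return sorted(counts)
    # Rank by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return sorted(word for word, _ in ranked[:top_k])

def vectorize(document, vocabulary):
    """Term-frequency vector of one document over the given vocabulary."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical sample documents.
docs = ["social media mining", "mining social networks", "media data"]
full_vocab = build_vocabulary(docs)            # all unique words
small_vocab = build_vocabulary(docs, top_k=3)  # frequent subset only
full_vector = vectorize(docs[0], full_vocab)
small_vector = vectorize(docs[0], small_vocab)
```

A smaller vocabulary gives shorter vectors at the cost of ignoring rare words.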

Note that in our example, the class attribute can take two values; therefore, the
initial guess of $P(y_i = 1 \mid N(v_i)) = \frac{1}{2} = 0.5$ is reasonable. When a class attribute
takes $n$ values, we can set our initial guess to $P(y_i = 1 \mid N(v_i)) = \frac{1}{n}$.
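
As a small sketch of this initialization, the helper below assigns every class value the same starting probability. The function name `uniform_initial_guess` and its dictionary return type are assumptions chosen for illustration.

```python
def uniform_initial_guess(class_values):
    """Give each class value the same initial probability, 1/n."""
    values = list(class_values)
    if not values:
        raise ValueError("at least one class value is required")
    return {value: 1.0 / len(values) for value in values}

binary_guess = uniform_initial_guess([0, 1])             # two class values
ternary_guess = uniform_initial_guess(["a", "b", "c"])   # n = 3
```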

Chapter 6

For more details refer to [Chung, 1996].

See [Kossinets and Watts, 2006] for details.

Let $X$ be the solution to spectral clustering. Consider an orthogonal matrix
$Q$ (i.e., $QQ^T = I$). Let $Y = XQ$. In spectral clustering, we are maximizing
$\mathrm{Tr}(X^T L X) = \mathrm{Tr}(X^T L X Q Q^T) = \mathrm{Tr}(Q^T X^T L X Q) = \mathrm{Tr}((XQ)^T L (XQ)) =
\mathrm{Tr}(Y^T L Y)$. In other words, $Y$ is another answer to our trace-maximization
problem. This proves that the solution $X$ to spectral clustering is non-unique under
orthogonal transformations $Q$.
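
The invariance can be checked numerically. The sketch below is only an illustration under stated assumptions: the matrix `L` is taken to be the unnormalized Laplacian of a hypothetical four-node path graph, `X` is an arbitrary matrix with orthonormal columns rather than an actual clustering solution, and `Q` is a rotation by an arbitrary angle `theta`.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def objective(L, X):
    """Tr(X^T L X)."""
    return trace(matmul(matmul(transpose(X), L), X))

# Assumed example: Laplacian D - A of the path graph 1-2-3-4.
L = [[ 1, -1,  0,  0],
     [-1,  2, -1,  0],
     [ 0, -1,  2, -1],
     [ 0,  0, -1,  1]]

# An arbitrary 4x2 matrix with orthonormal columns.
X = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]

# A 2x2 rotation matrix is orthogonal, so Q Q^T = I.
theta = 0.7
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

Y = matmul(X, Q)
trace_X = objective(L, X)
trace_Y = objective(L, Y)  # equal to trace_X up to rounding error
```

Any other angle `theta` gives a different `Y` with the same objective value, which is the non-uniqueness the note describes.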

http://www.mturk.com.

Chapter 7

This assumption can be lifted [Kempe et al., 2003].

See [Gruhl et al., 2004] for an application in the blogosphere.

Formally, assuming $P \neq NP$, there is no polynomial-time algorithm for this
problem.

The internal-influence model is similar to the SI model discussed later in the section
on epidemics. For the sake of completeness, we provide solutions to both. Readers
are encouraged to refer to that model in Section 7.4 for further insight.

A generalization of these techniques over networks can be found in [Hethcote et al.,
1981; Hethcote, 2000; Newman, 2010].

Chapter 8

From ADD health data: http://www.cpc.unc.edu/projects/addhealth.

The directed case is left to the reader.

As defined by the Merriam-Webster dictionary.

In the original paper, the authors utilize a weight function instead. Here, for clarity,
we use coefficients for all parameters.

Note that Equation 8.28 is defined recursively, because I(p) depends on
InfluenceFlow and that, in turn, depends on I(p) (Equation 8.27). Therefore, to
estimate I(p), we can use iterative methods where we start with an initial value for
I(p) and compute until convergence.
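
The iterative scheme described above might look like the sketch below. Equations 8.27 and 8.28 are not reproduced in these notes, so the `update` function here is a hypothetical stand-in (a base score plus a damped average of the influence of linking posts), not the book's formula. The names `iterate_to_convergence`, `base`, `inlinks`, and `damping` are likewise assumptions.

```python
def iterate_to_convergence(update, initial, tol=1e-9, max_iter=1000):
    """Repeat values <- update(values) until the largest change is below tol."""
    values = dict(initial)
    for _ in range(max_iter):
        new_values = update(values)
        change = max(abs(new_values[k] - values[k]) for k in values)
        values = new_values
        if change < tol:
            return values
    raise RuntimeError("did not converge within max_iter iterations")

# Hypothetical data: three posts that link to each other in a cycle.
base = {"a": 1.0, "b": 0.5, "c": 0.2}
inlinks = {"a": ["b"], "b": ["c"], "c": ["a"]}
damping = 0.5  # values below 1 make this particular update a contraction

def update(influence):
    """Stand-in recursive definition (NOT Equations 8.27 and 8.28)."""
    return {
        p: base[p] + damping * sum(influence[q] for q in inlinks[p]) / len(inlinks[p])
        for p in influence
    }

# Start from an initial value of 0 for every post and iterate.
influence = iterate_to_convergence(update, {p: 0.0 for p in base})
```

Whether such an iteration converges depends on the actual update rule; the damping factor here guarantees it only for this stand-in.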

Note that we have assumed that homophily is the leading social force in the
network that leads to its assortativity change. This assumption is often strong for
social networks because other social forces act in these networks.
