Social Media Mining: An Introduction


CUUS2079-05 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 19:23


122 Data Mining Essentials

$P(y_3 \mid N(v_3))$ does not need to be computed again because all neighbors
of $v_3$ are labeled (thus, this probability estimation has converged). Similarly,

\begin{align}
P(y_4 \mid N(v_4)) &= \frac{1}{2}(1 + 0.5) = 0.75, \tag{5.30} \\
P(y_6 \mid N(v_6)) &= \frac{1}{2}(0.75 + 0) = 0.38. \tag{5.31}
\end{align}


We need to recompute both $P(y_4 \mid N(v_4))$ and $P(y_6 \mid N(v_6))$ until
convergence. Let $P^{(t)}(y_i \mid N(v_i))$ denote the estimated probability
after $t$ computations. Then,

\begin{align}
P^{(1)}(y_4 \mid N(v_4)) &= \frac{1}{2}(1 + 0.38) = 0.69, \tag{5.32} \\
P^{(1)}(y_6 \mid N(v_6)) &= \frac{1}{2}(0.69 + 0) = 0.35, \tag{5.33} \\
P^{(2)}(y_4 \mid N(v_4)) &= \frac{1}{2}(1 + 0.35) = 0.68, \tag{5.34} \\
P^{(2)}(y_6 \mid N(v_6)) &= \frac{1}{2}(0.68 + 0) = 0.34, \tag{5.35} \\
P^{(3)}(y_4 \mid N(v_4)) &= \frac{1}{2}(1 + 0.34) = 0.67, \tag{5.36} \\
P^{(3)}(y_6 \mid N(v_6)) &= \frac{1}{2}(0.67 + 0) = 0.34, \tag{5.37} \\
P^{(4)}(y_4 \mid N(v_4)) &= \frac{1}{2}(1 + 0.34) = 0.67, \tag{5.38} \\
P^{(4)}(y_6 \mid N(v_6)) &= \frac{1}{2}(0.67 + 0) = 0.34. \tag{5.39}
\end{align}


After four iterations, both probabilities converge. So, from these
probabilities (Equations 5.29, 5.38, and 5.39), we can tell that nodes $v_3$
and $v_4$ will likely have class attribute value 1 and node $v_6$ will likely
have class attribute value 0.
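
The repeated averaging above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: the function name `wvrn_iterate`, the placeholder nodes `"pos"` and `"neg"` (standing in for a labeled neighbor with value 1 and a labeled neighbor with value 0), and the initial estimate of 0.5 for the unlabeled nodes are all hypothetical choices made so that the first sweep reproduces the arithmetic of Equations 5.30 and 5.31.

```python
def wvrn_iterate(neighbors, probs, unlabeled, tol=1e-6, max_iter=100):
    """Re-estimate each unlabeled node as the mean of its neighbors' estimates.

    Nodes are updated one after another, so a node updated later in a sweep
    already sees the new value of a node updated earlier in the same sweep.
    """
    probs = dict(probs)  # work on a copy so the caller's dict is unchanged
    history = []
    for _ in range(max_iter):
        largest_change = 0.0
        for node in unlabeled:
            new_value = sum(probs[n] for n in neighbors[node]) / len(neighbors[node])
            largest_change = max(largest_change, abs(new_value - probs[node]))
            probs[node] = new_value
        history.append({node: probs[node] for node in unlabeled})
        if largest_change < tol:  # stop once no estimate moves noticeably
            break
    return probs, history


# Hypothetical two-node fragment: v4 has one neighbor labeled 1 and v6;
# v6 has v4 and one neighbor labeled 0. Unlabeled nodes start at 0.5 (assumed).
neighbors = {"v4": ["pos", "v6"], "v6": ["v4", "neg"]}
probs = {"pos": 1.0, "neg": 0.0, "v4": 0.5, "v6": 0.5}

final, history = wvrn_iterate(neighbors, probs, unlabeled=["v4", "v6"])
for t, estimates in enumerate(history[:5]):
    print(t, round(estimates["v4"], 4), round(estimates["v6"], 4))
```

Because this sketch keeps full floating-point precision, the estimates approach 2/3 and 1/3. The worked example rounds every intermediate result to two decimals, which is why it reports 0.67 and 0.34; the predicted class attribute values are the same either way.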

5.4.5 Regression
In classification, class attribute values are discrete. In regression, class
attribute values are real numbers. For instance, suppose we wish to predict the
stock market value (class attribute) of a company given information about
the company (features). The stock market value is continuous; therefore,
regression must be used to predict it. The input to the regression method is