Therefore,

$$X^T Y = X^T X W. \tag{5.44}$$

Assuming $X^T X$ is invertible (which requires $X$ to have full column rank), we can multiply both sides by $(X^T X)^{-1}$ to get

$$W = (X^T X)^{-1} X^T Y. \tag{5.45}$$
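To make Equation 5.45 concrete, here is a minimal NumPy sketch on synthetic data (the data sizes, coefficients, and variable names are illustrative assumptions, not from the text). Solving the normal equations with a linear solver is numerically safer than forming the explicit inverse:

```python
import numpy as np

# Illustrative synthetic data (assumed): 100 instances, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
Y = X @ true_w + rng.normal(scale=0.1, size=100)   # noisy linear targets

# Equation 5.45: W = (X^T X)^{-1} X^T Y.
# np.linalg.solve avoids explicitly inverting X^T X, which is more stable.
W = np.linalg.solve(X.T @ X, X.T @ Y)
print(W)   # should be close to true_w
```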

Alternatively, one can compute the singular value decomposition (SVD) of $X = U \Sigma V^T$:

$$\begin{aligned}
W &= (X^T X)^{-1} X^T Y \\
&= (V \Sigma U^T U \Sigma V^T)^{-1} V \Sigma U^T Y \\
&= (V \Sigma^2 V^T)^{-1} V \Sigma U^T Y \\
&= V \Sigma^{-2} V^T V \Sigma U^T Y \\
&= V \Sigma^{-1} U^T Y,
\end{aligned} \tag{5.46}$$

and since $X$ can have zero singular values, $\Sigma^{-1}$ may not exist; in that case we use

$$W = V \Sigma^{+} U^T Y, \tag{5.47}$$

where $\Sigma^{+}$ is the pseudoinverse of $\Sigma$, obtained by replacing each nonzero singular value with its reciprocal.
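A sketch of the SVD route of Equation 5.47, reusing the same illustrative data as above. The cutoff for deciding which singular values count as zero is an assumption (similar in spirit to the default tolerance used by np.linalg.pinv):

```python
import numpy as np

# Reduced SVD: X = U Σ V^T, with s holding the singular values of Σ.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Σ⁺: reciprocal of each nonzero singular value, zero otherwise.
# The tolerance below for "numerically zero" is an assumed heuristic.
tol = max(X.shape) * np.finfo(s.dtype).eps * s.max()
s_plus = np.where(s > tol, 1.0 / s, 0.0)

# Equation 5.47: W = V Σ⁺ U^T Y.
W_svd = Vt.T @ (s_plus * (U.T @ Y))
print(W_svd)   # agrees with np.linalg.pinv(X) @ Y
```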

Logistic Regression
Logistic regression provides a probabilistic view of regression. For simplicity, let us assume that the class attribute can take only the values 0 and 1. Formally, logistic regression finds a probability $p$ such that

$$P(Y = 1 \mid X) = p, \tag{5.48}$$

where $X$ is the vector of features and $Y$ is the class attribute. We can use linear regression to approximate $p$. In other words, we can assume that probability $p$ depends on $X$; that is,

$$p = \beta X, \tag{5.49}$$

where $\beta$ is a vector of coefficients. Unfortunately, $\beta X$ can take unbounded values, because $X$ can take on any value and there are no constraints on how the $\beta$'s are chosen. However, probability $p$ must lie in the range $[0, 1]$. Since $\beta X$ is unbounded, we can perform a transformation $g(\cdot)$ on $p$ such that it also becomes unbounded.
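The standard choice for $g(\cdot)$ in logistic regression is the logit, $g(p) = \ln\big(p/(1-p)\big)$, whose inverse, the logistic (sigmoid) function, maps the unbounded value $\beta X$ back to a probability in $(0, 1)$. A minimal sketch, with illustrative (assumed) coefficients and feature values:

```python
import numpy as np

def sigmoid(z):
    # Inverse of the logit: maps any real z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

beta = np.array([0.8, -1.5, 0.3])   # illustrative coefficients (assumed)
x = np.array([1.0, 0.5, -2.0])      # one illustrative feature vector
p = sigmoid(beta @ x)               # a valid probability, even though beta @ x is unbounded
print(p)
```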