Therefore,

$X^{T}Y = X^{T}XW$. (5.44)

Assuming $X^{T}X$ is invertible (which holds whenever $X$ has full column rank), we can multiply both sides by $(X^{T}X)^{-1}$ to get

$W = (X^{T}X)^{-1}X^{T}Y$. (5.45)
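A minimal sketch (not from the book) of solving Equation (5.45) with NumPy; the data matrix X and target vector Y here are hypothetical, with five instances and two features:

    import numpy as np

    # Hypothetical data: 5 instances, 2 features.
    X = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5],
                  [4.0, 3.0],
                  [5.0, 2.5]])
    Y = np.array([3.1, 2.6, 4.4, 6.9, 7.4])

    # Normal-equation solution W = (X^T X)^{-1} X^T Y, Eq. (5.45).
    # Solving the linear system avoids forming the explicit inverse.
    W = np.linalg.solve(X.T @ X, X.T @ Y)
    print(W)

Solving the system $X^{T}XW = X^{T}Y$ directly, rather than computing $(X^{T}X)^{-1}$ explicitly, is numerically safer and is standard practice.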
Alternatively, one can compute the singular value decomposition (SVD) of $X = U\Sigma V^{T}$. Since $U$ and $V$ have orthonormal columns, $U^{T}U = I$ and $V^{T}V = I$, so

$W = (X^{T}X)^{-1}X^{T}Y$
$= (V\Sigma U^{T}U\Sigma V^{T})^{-1}V\Sigma U^{T}Y$
$= (V\Sigma^{2}V^{T})^{-1}V\Sigma U^{T}Y$
$= V\Sigma^{-2}V^{T}V\Sigma U^{T}Y$
$= V\Sigma^{-1}U^{T}Y$. (5.46)
Since $X$ can have zero singular values, $\Sigma^{-1}$ may not exist; in that case we use

$W = V\Sigma^{+}U^{T}Y$, (5.47)

where $\Sigma^{+}$ is the pseudoinverse of $\Sigma$, obtained by inverting its nonzero singular values and leaving the zero entries as they are.
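As an illustrative aside (again not from the book, reusing the hypothetical X and Y from the previous sketch), the SVD route of Equations (5.46) and (5.47) can be coded by inverting only the nonzero singular values:

    import numpy as np

    X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5],
                  [4.0, 3.0], [5.0, 2.5]])
    Y = np.array([3.1, 2.6, 4.4, 6.9, 7.4])

    # Reduced SVD: X = U diag(s) V^T.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Sigma^+ : invert nonzero singular values, keep zeros as zeros.
    tol = max(X.shape) * np.finfo(float).eps * s.max()
    s_plus = np.where(s > tol, 1.0 / s, 0.0)

    # W = V Sigma^+ U^T Y, Eq. (5.47); well defined even when
    # X^T X is singular, unlike the normal-equation solution.
    W = Vt.T @ (s_plus * (U.T @ Y))
    print(W)

For a full-column-rank X this matches the normal-equation solution; when X is rank deficient it returns the minimum-norm least-squares solution.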
Logistic Regression
Logistic regression provides a probabilistic view of regression. For simplicity, let us assume that the class attribute can take only the values 0 and 1. Formally, logistic regression finds a probability $p$ such that

$P(Y = 1 \,|\, X) = p$, (5.48)

where $X$ is the vector of features and $Y$ is the class attribute. We can use linear regression to approximate $p$; in other words, we can assume that probability $p$ depends on $X$, that is,

$p = \beta X$, (5.49)

where $\beta$ is a vector of coefficients. Unfortunately, $\beta X$ can take unbounded values, because $X$ can take on any value and there are no constraints on how $\beta$'s are chosen. However, probability $p$ must lie in the range $[0, 1]$. Since $\beta X$ is unbounded, we can perform a transformation $g(\cdot)$ on $p$ such that it also