Consider a dataset where attributes are represented using $x_1, x_2, \ldots, x_m$ (also known as regressors) and the class attribute is represented using $Y$ (also known as the dependent variable), where the class attribute is a real number. We want to find the relation between $Y$ and the vector $X = (x_1, x_2, \ldots, x_m)$. We discuss two basic regression techniques: linear regression and logistic regression.

Linear Regression
In linear regression, we assume that the class attribute $Y$ has a linear relation with the regressors (feature set) $X$, up to an additive error term $\epsilon$. In other words,

\[
Y = XW + \epsilon, \qquad (5.40)
\]

where $W$ represents the vector of regression coefficients. The regression problem is solved by estimating $W$ from the training dataset and its labels $Y$ such that the fitting error is minimized. A variety of methods have been introduced to solve the linear regression problem, most of which use least squares or maximum-likelihood estimation. We employ the least squares technique here. Interested readers can refer to the bibliographic notes for more detailed analyses. In the least squares method, we find $W$ using regressors $X$ and labels $Y$ such that the square of the fitting error $\epsilon$ is minimized:


\[
\|\epsilon\|^2 = \|Y - XW\|^2. \qquad (5.41)
\]
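
To make the objective concrete, the squared fitting error of Equation 5.41 can be evaluated directly for any candidate $W$. The following is a minimal sketch in Python with NumPy; the data matrix, labels, and candidate coefficients are made-up values for illustration only.

    import numpy as np

    # Made-up toy data: five examples, two regressors.
    X = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5],
                  [4.0, 3.0],
                  [5.0, 2.5]])
    Y = np.array([3.1, 2.6, 4.4, 7.2, 7.3])

    W = np.array([1.0, 0.5])       # an arbitrary candidate coefficient vector

    residual = Y - X @ W           # the fitting error, epsilon = Y - XW
    error = residual @ residual    # ||Y - XW||^2, as in Equation 5.41
    print(error)

Least squares chooses the $W$ that makes this quantity as small as possible; the derivation below finds that $W$ in closed form.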

To minimize $\|\epsilon\|^2$, we compute its gradient with respect to $W$ and set it to zero to find the optimal $W$:

\[
\frac{\partial \|Y - XW\|^2}{\partial W} = 0. \qquad (5.42)
\]


We know that for any $X$, $\|X\|^2 = X^T X$; therefore,

\[
\begin{aligned}
\frac{\partial \|Y - XW\|^2}{\partial W}
&= \frac{\partial (Y - XW)^T (Y - XW)}{\partial W} \\
&= \frac{\partial (Y^T - W^T X^T)(Y - XW)}{\partial W} \\
&= \frac{\partial \left( Y^T Y - Y^T X W - W^T X^T Y + W^T X^T X W \right)}{\partial W} \\
&= -2 X^T Y + 2 X^T X W = 0. \qquad (5.43)
\end{aligned}
\]
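
Equation 5.43 is the familiar system of normal equations, $X^T X W = X^T Y$; when $X^T X$ is invertible, the least-squares estimate is $W = (X^T X)^{-1} X^T Y$. The sketch below solves this system for the toy data above and uses NumPy's built-in least-squares routine as an independent check (again, the data values are illustrative only).

    import numpy as np

    X = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5],
                  [4.0, 3.0],
                  [5.0, 2.5]])
    Y = np.array([3.1, 2.6, 4.4, 7.2, 7.3])

    # Solve the normal equations (X^T X) W = X^T Y from Equation 5.43.
    W = np.linalg.solve(X.T @ X, X.T @ Y)

    # Independent check: NumPy's least-squares solver should agree.
    W_check, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(W, W_check)

Solving the normal equations directly is fine for small, well-conditioned problems; np.linalg.lstsq instead uses a singular value decomposition, which is more stable numerically, so it serves as a useful sanity check.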
