Social Media Mining: An Introduction

(Axel Boer) #1

P1: Trim: 6.125in×9.25in Top: 0.5in Gutter: 0.75in
CUUS2079-09 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:28


260 Recommendation in Social Media

matrix factorization methods can be used to findUandV, given user-item
rating matrixR. In mathematical terms, in this matrix factorization, we are
findingUandVby solving the following optimization problem:

minU,V

1


2


||R−UTV||^2 F. (9.50)


Users often have only a few ratings for items; therefore, theRmatrix is
very sparse and has many missing values. Since we computeUandVonly
for nonmissing ratings, we can change Equation9.50to

min
U,V

1


2


∑n

i= 1

∑m

j= 1

Iij(Rij−UiTVj)^2 , (9.51)

whereIij∈{ 0 , 1 }andIij=1 when userihas rated itemjand is equal
to 0 otherwise. This ensures that nonrated items do not contribute to the
summations being minimized in Equation9.51. Often, when solving this
optimization problem, the computedUandV can estimate ratings for
thealready rateditems accurately, but fail at predicting ratings for unrated
OVERFITTING items. This is known as theoverfitting problem. The overfitting problem can
be mitigated by allowing bothUandVto only consider important features
required to represent the data. In mathematical terms, this is equivalent to
bothUandVhaving small matrix norms. Thus, we can change Equation
9.51to
1
2

∑n

i= 1

∑m

j= 1

Iij(Rij−UiTVj)^2 +

λ 1
2

||U||^2 F+


λ 2
2

||V||^2 F, (9.52)


whereλ 1 ,λ 2 >0 are predetermined constants that control the effects of
matrix norms. The termsλ 21 ||U||^2 Fandλ 22 ||V||^2 Fare denoted asregulariza-
REGULARIZATIONtionterms. Note that to minimize Equation9.52, we need to minimize all
TERM terms in the equation, including the regularization terms. Thus, whenever
one needs to minimize some other constraint, it can be introduced as a new
additive term in Equation9.52. Equation9.52lacks a term that incorporates
the social network of users. For that, we can add another regularization term,
∑n

i= 1


j∈F(i)

sim(i,j)||Ui−Uj||^2 F, (9.53)

wheresim(i,j) denotes the similarity between useriand j(e.g., cosine
similarity or Pearson correlation between their ratings) andF(i) denotes
the friends ofi. When this term is minimized, it ensures that the taste for
useriis close to that of all his friendsj∈F(i). As we did with previous
Free download pdf