CUUS2079-09 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 17:28
252 Recommendation in Social Media
Using cosine similarity (Pearson correlation could be used instead), the
similarity between Aladdin and each of the other items can be computed:
\[
\mathrm{sim}(\text{Aladdin}, \text{Lion King}) = \frac{0 \times 3 + 4 \times 5 + 2 \times 1 + 2 \times 2}{\sqrt{24}\,\sqrt{39}} = 0.84. \tag{9.21}
\]
\[
\mathrm{sim}(\text{Aladdin}, \text{Mulan}) = \frac{0 \times 3 + 4 \times 0 + 2 \times 4 + 2 \times 0}{\sqrt{24}\,\sqrt{25}} = 0.32. \tag{9.22}
\]
\[
\mathrm{sim}(\text{Aladdin}, \text{Anastasia}) = \frac{0 \times 3 + 4 \times 2 + 2 \times 2 + 2 \times 1}{\sqrt{24}\,\sqrt{18}} = 0.67. \tag{9.23}
\]
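The three similarities above can be checked with a few lines of Python. This is a minimal sketch: the rating vectors are copied from the numerators of Equations 9.21 to 9.23, while the helper name `cosine_similarity` and the `ratings` dictionary are illustrative choices, not part of the text.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rating vectors as they appear in Equations 9.21-9.23
# (one entry per user who rated both items).
ratings = {
    "Aladdin":   [0, 4, 2, 2],
    "Lion King": [3, 5, 1, 2],
    "Mulan":     [3, 0, 4, 0],
    "Anastasia": [3, 2, 2, 1],
}

# Similarity of Aladdin to every other item.
similarities = {
    item: cosine_similarity(ratings["Aladdin"], vector)
    for item, vector in ratings.items()
    if item != "Aladdin"
}

for item, sim in similarities.items():
    print(f"sim(Aladdin, {item}) = {sim:.4f}")
```

To four decimals the values are 0.8498, 0.3266, and 0.6736; the equations report them truncated to two decimals (0.84, 0.32, 0.67).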
Assuming a neighborhood size of 2, Lion King and Anastasia are the two
most similar neighbors of Aladdin. Jane's rating for Aladdin, computed
with item-based collaborative filtering, is then
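\[
\begin{aligned}
r_{\text{Jane},\text{Aladdin}} &= \bar{r}_{\text{Aladdin}}
 + \frac{\mathrm{sim}(\text{Aladdin}, \text{Lion King})\,(r_{\text{Jane},\text{Lion King}} - \bar{r}_{\text{Lion King}})}{\mathrm{sim}(\text{Aladdin}, \text{Lion King}) + \mathrm{sim}(\text{Aladdin}, \text{Anastasia})} \\
&\quad + \frac{\mathrm{sim}(\text{Aladdin}, \text{Anastasia})\,(r_{\text{Jane},\text{Anastasia}} - \bar{r}_{\text{Anastasia}})}{\mathrm{sim}(\text{Aladdin}, \text{Lion King}) + \mathrm{sim}(\text{Aladdin}, \text{Anastasia})} \\
&= 2 + \frac{0.84(3 - 2.8) + 0.67(0 - 1.6)}{0.84 + 0.67} = 1.40.
\end{aligned}
\tag{9.24}
\]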
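The prediction in Equation 9.24 can be reproduced with a short sketch. The function name `predict_item_based` and the tuple layout of `neighbors` are assumptions made for illustration; the numbers are the ones substituted into the equation.

```python
def predict_item_based(item_mean, neighbors):
    """Predict a rating from the target item's mean and its neighbors.

    neighbors is a list of (similarity, user_rating, neighbor_mean)
    tuples, one per neighboring item the user has rated.
    """
    # Similarity-weighted deviations of the user's ratings from each
    # neighbor item's mean rating.
    numerator = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    # Normalize by the sum of the similarities, as in Equation 9.24.
    denominator = sum(sim for sim, _, _ in neighbors)
    return item_mean + numerator / denominator

prediction = predict_item_based(
    item_mean=2.0,  # mean rating of Aladdin
    neighbors=[
        (0.84, 3, 2.8),  # Lion King: similarity, Jane's rating, item mean
        (0.67, 0, 1.6),  # Anastasia: similarity, Jane's rating, item mean
    ],
)
print(f"{prediction:.2f}")
```

Passing the neighbors as a list keeps the same function usable for any neighborhood size, not just 2.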
Model-Based Collaborative Filtering
In memory-based methods (either item-based or user-based), one aims to
predict the missing ratings based on similarities between users or items. In
model-based collaborative filtering, one assumes that an underlying model
governs the way users rate. We aim to learn that model and then use it
to predict the missing ratings. Among the variety of model-based
techniques, we focus on a well-established one that is based on
singular value decomposition (SVD).
SVD is a linear algebra technique that, given a real matrix $X \in \mathbb{R}^{m \times n}$,
$m \ge n$, factorizes it into three matrices,
\[
X = U \Sigma V^{T}, \tag{9.25}
\]
where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices and
$\Sigma \in \mathbb{R}^{m \times n}$ is a diagonal matrix. The product of these three matrices equals
the original matrix, so no information is lost. Hence, the factorization is lossless.
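The lossless property can be checked numerically. The following is a minimal sketch using NumPy's `numpy.linalg.svd`; the matrix `X` is a made-up 5×4 example chosen only for illustration and does not come from the text.

```python
import numpy as np

# Hypothetical 5x4 real matrix (m >= n); values are arbitrary.
X = np.array([
    [3.0, 0.0, 3.0, 3.0],
    [5.0, 4.0, 0.0, 2.0],
    [1.0, 2.0, 4.0, 2.0],
    [3.0, 1.0, 1.0, 0.0],
    [2.0, 2.0, 0.0, 1.0],
])
m, n = X.shape

# full_matrices=True returns U as m x m and V^T as n x n;
# s holds the n singular values in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=True)

# Embed the singular values in an m x n rectangular diagonal matrix.
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)

# Multiplying the three factors recovers X up to floating-point error.
X_rebuilt = U @ Sigma @ Vt
print(np.allclose(X, X_rebuilt))

# U and V are orthogonal: their transposes are their inverses.
print(np.allclose(U.T @ U, np.eye(m)), np.allclose(Vt @ Vt.T, np.eye(n)))
```

Note that NumPy returns $V^{T}$ rather than $V$, and the singular values as a vector rather than as the matrix $\Sigma$, which is why `Sigma` is built explicitly here.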