Modeling User Rating Profiles for Collaborative Filtering
Benjamin M. Marlin
marlin_at_cs.toronto.edu
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
1. Abstract
We present a new latent variable model for
rating-based collaborative filtering called the
User Rating Profile model (URP). URP has
complete generative semantics at the user and
rating profile levels. URP is related to
several models including a multinomial mixture
model, the aspect model, and latent Dirichlet
allocation, but has advantages over each. A
variational Expectation Maximization procedure is
used to fit the URP model. Rating prediction
makes use of a well defined variational inference
procedure. Empirical results on two rating
prediction tasks using the EachMovie and
MovieLens data sets show that URP attains lower
error rates than the multinomial mixture model,
the aspect model, and neighborhood-based
techniques.
2. Introduction
Collaborative Filtering Formulations
The Pure, Non-Sequential, Rating-Based Formulation
Figure 1: The rating database and the active user's ratings. Given a rating prediction method, a recommendation method is easily obtained: predict, then sort.
3. Related Work
Neighborhood Methods
Introduced by Resnick et al. (GroupLens) and Shardanand and Maes (Ringo). All variants can be seen as modifications of the K-nearest neighbor classifier.
Rating Prediction (sketched in the listing below):
1. Compute a similarity measure between the active user and all users in the database.
2. Compute a predicted rating for each item.
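A minimal sketch of this scheme, assuming a dense rating matrix with 0 for missing entries, Pearson correlation as the similarity measure, and a similarity-weighted average as the predictor (all illustrative choices, not taken from the slide):
```python
import numpy as np

def knn_predict(R, active, k=20):
    """Neighborhood-style prediction: similarity first, then a weighted average.

    R      : (n_users, n_items) rating matrix, 0 = unrated.
    active : (n_items,) ratings of the active user, 0 = unrated.
    """
    n_users, n_items = R.shape
    sims = np.zeros(n_users)
    for u in range(n_users):                          # 1. similarity to every database user
        common = (R[u] > 0) & (active > 0)            # co-rated items
        if common.sum() >= 2:
            sims[u] = np.corrcoef(R[u, common], active[common])[0, 1]
    sims = np.nan_to_num(sims)
    top = np.argsort(-sims)[:k]                       # k most similar users
    preds = np.zeros(n_items)
    for y in range(n_items):                          # 2. predicted rating for each item
        raters = top[R[top, y] > 0]
        w = sims[raters]
        if w.sum() > 0:
            preds[y] = np.dot(w, R[raters, y]) / w.sum()
    return preds
```
The predicted ratings can then be sorted to produce recommendations, as in Figure 1.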
Multinomial Mixture Model
A simple mixture model with fast, reliable learning by EM and low prediction time. Simple but correct generative semantics: each rating profile is generated by one of K user types.
Learning: EM updates (E-step and M-step).
Rating Prediction: compute the posterior over user types from the active user's observed ratings, then the predicted rating distribution for each item.
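For reference, a standard formulation of such a mixture over rating profiles; the notation below is assumed rather than copied from the slide:
```latex
% Each rating profile is generated by one of K user types
P(r_1,\dots,r_M) = \sum_{z=1}^{K} P(z) \prod_{y=1}^{M} P(r_y \mid y, z)

% Rating prediction for an active user a with observed ratings r^{obs}_a
P(r \mid y, r^{obs}_a) \propto \sum_{z=1}^{K} P(z)\, P(r \mid y, z)
    \prod_{y' \in \mathrm{obs}(a)} P\big(r^{obs}_{a y'} \,\big|\, y', z\big)
```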
The Aspect Model
Many versions proposed by Hofmann; of main interest are the dyadic and triadic versions, and the new vector version proposed by Marlin. All have incomplete generative semantics: the distribution over attitudes is a separate parameter fit for each training user, so there is no generative process for new users.
Learning (vector): fit by EM. Rating prediction (vector): the active user's attitude distribution must first be fit by folding the user into the model.
Latent Dirichlet Allocation
Proposed by Blei et al. for text modeling. Can be used in a co-occurrence-based CF formulation, but cannot model ratings. A correct generative version of the dyadic aspect model: the user's distribution over types is a random variable with a Dirichlet prior.
Learning: the model is learned using variational EM or Minka's expectation propagation; exact inference is not possible.
Prediction: needs approximate inference; variational methods result in an iterative algorithm.
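To make the contrast concrete, here is a sketch of the two standard formulations in assumed notation: the dyadic aspect model ties a free parameter vector to each training user, while LDA draws the mixing proportions from a Dirichlet prior.
```latex
% Dyadic aspect model: theta_u is a free parameter for every training user,
% so there is no generative process for users outside the training set.
P(y \mid u) = \sum_{z=1}^{K} \theta_{uz}\, P(y \mid z)

% LDA: the mixing proportions are a latent random variable with a Dirichlet prior,
% drawn once per user (document), giving complete generative semantics at that level.
\theta \sim \mathrm{Dirichlet}(\alpha), \qquad
z \sim \mathrm{Multinomial}(\theta), \qquad
y \sim \mathrm{Multinomial}(\beta_z)
```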
Graphical Models
Figure 2: Dyadic Aspect Model. Variables: U (user index), Z (attitude index), Y (item index). Parameters: θ = P(Z | U = u), β = P(Y | Z = z).
Figure 3: Triadic Aspect Model. Variables: U (user index), Z (attitude index), Y (item index), R (rating value). Parameters: θ = P(Z | U = u), β = P(R | Z = z, Y = y).
Figure 4: Vector Aspect Model. Variables: U (user index), Z_y (attitude index), Y (item index), R_y (rating value). Parameters: θ = P(Z | U = u), β = P(R | Z = z, Y = y).
The progression moves from co-occurrence to ratings (dyadic to triadic), and from ratings to rating profiles (triadic to vector).
Co-occurrence to Rating Profile
4. The URP Model
Model Specification
Generative Process: Unlike a simple mixture model, each user has a unique distribution θ over user attitudes. Unlike the aspect model family, there are proper generative semantics on θ. Unlike LDA, URP generates a set of complete user rating profiles.
Description: The latent space description of a user is a Dirichlet random variable θ that encodes a multinomial distribution over user types. Each setting of the multinomial variables Z_y is an index into the K user types, or user attitudes. Each user attitude is represented by a multinomial distribution over ratings for each item, encoded by β. The multinomial variables R_y give the ratings for each item y; possible rating values range from 1 to V.
1. For each user u = 1 to N:
2.   Sample θ ~ Dirichlet(α)
3.   For each item y = 1 to M:
4.     Sample z ~ Multinomial(θ)
5.     Sample r ~ Multinomial(β_{y,z})
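A minimal numpy sketch of this generative process; the sizes and the uniform choices for α and β are illustrative assumptions.
```python
import numpy as np

# Hypothetical sizes: N users, M items, K attitudes, V rating values.
N, M, K, V = 100, 50, 4, 5
rng = np.random.default_rng(0)

alpha = np.ones(K)                              # Dirichlet prior over attitudes (assumed uniform)
beta = rng.dirichlet(np.ones(V), size=(M, K))   # beta[y, z, r] = P(rating r | item y, attitude z)

profiles = np.zeros((N, M), dtype=int)
for u in range(N):                              # 1. for each user u = 1..N
    theta = rng.dirichlet(alpha)                # 2. theta ~ Dirichlet(alpha)
    for y in range(M):                          # 3. for each item y = 1..M
        z = rng.choice(K, p=theta)              # 4. z ~ Multinomial(theta)
        r = rng.choice(V, p=beta[y, z])         # 5. r ~ Multinomial(beta_{y,z})
        profiles[u, y] = r + 1                  # ratings take values 1..V
```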
Learning
Variational Approximation: Exact inference is intractable in URP. We define a fully factorized approximate q-distribution with variational multinomial parameters φ_u and variational Dirichlet parameters γ_u.
Variational Inference: solve the fixed-point equations for φ_u and γ_u for each user.
Parameter Estimation: update the model parameters β and α given the variational parameters, as in variational EM.
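For reference, the fully factorized q-distribution takes the familiar LDA-style form below; the fixed-point updates are the standard LDA updates adapted to observed ratings, given here as a sketch rather than a transcription of the model's exact equations.
```latex
% Fully factorized variational distribution for user u
q(\theta, z \mid \gamma_u, \phi_u)
  = q(\theta \mid \gamma_u) \prod_{y \in \mathrm{obs}(u)} q(z_y \mid \phi_{uy})

% LDA-style fixed-point updates adapted to observed ratings r_{uy} (a sketch)
\phi_{uyk} \propto \beta_{r_{uy} y k}\,
  \exp\!\Big( \Psi(\gamma_{uk}) - \Psi\big(\textstyle\sum_j \gamma_{uj}\big) \Big),
\qquad
\gamma_{uk} = \alpha_k + \sum_{y \in \mathrm{obs}(u)} \phi_{uyk}
```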
Rating Prediction
Once rating distributions are estimated, any
number of prediction techniques can be used. The
prediction technique should match the error
measure used.
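One way to realize this, sketched under the assumption that variational inference has already produced γ for the active user: form a predictive rating distribution from E_q[θ] and β, then choose the rating that minimizes the expected error under the measure of interest (absolute error here, to match NMAE).
```python
import numpy as np

def predict_item(gamma, beta_y, rating_values):
    """Predict one item's rating from an assumed URP-style variational posterior.

    gamma         : (K,) variational Dirichlet parameters for the active user.
    beta_y        : (V, K) array with beta_y[r, k] = P(rating r | this item, attitude k).
    rating_values : (V,) possible rating values, e.g. 1..V.
    """
    theta_mean = gamma / gamma.sum()          # E_q[theta]
    p_r = beta_y @ theta_mean                 # predictive distribution over rating values
    # Pick the prediction that minimizes expected absolute error (matches NMAE).
    rating_values = np.asarray(rating_values, dtype=float)
    expected_abs_err = [np.dot(p_r, np.abs(rating_values - v)) for v in rating_values]
    return rating_values[int(np.argmin(expected_abs_err))]
```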
5. Experimentation
Strong Generalization Experiment: Users are split into a training set and a test set. Ratings for the test users are split into observed and unobserved sets. The model is trained on the training users and tested on the test users. Repeated on 3 random splits of the data.
Weak Generalization Experiment: The available ratings for each user are split into observed and unobserved sets. The model is trained on the observed ratings and tested on the unobserved ratings. Repeated on 3 random splits of the data. (Both protocols are sketched below.)
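A sketch of the two split protocols over per-user rating lists; the held-out count and test fraction are illustrative defaults, not the values used in the experiments.
```python
import random
from collections import defaultdict

def split_ratings(user_ratings, held_out=1, seed=0):
    """Split each user's ratings into observed / unobserved sets.

    user_ratings : dict user -> list of (item, rating).
    held_out     : number of ratings hidden per user (illustrative choice).
    """
    rng = random.Random(seed)
    observed, unobserved = defaultdict(list), defaultdict(list)
    for u, ratings in user_ratings.items():
        ratings = ratings[:]
        rng.shuffle(ratings)
        unobserved[u] = ratings[:held_out]
        observed[u] = ratings[held_out:]
    return observed, unobserved

def strong_split(user_ratings, test_fraction=0.2, seed=0):
    """Strong generalization: split users into training and test sets,
    then split each test user's ratings into observed / unobserved."""
    rng = random.Random(seed)
    users = list(user_ratings)
    rng.shuffle(users)
    n_test = int(len(users) * test_fraction)
    test_users, train_users = users[:n_test], users[n_test:]
    train = {u: user_ratings[u] for u in train_users}
    observed, unobserved = split_ratings({u: user_ratings[u] for u in test_users}, seed=seed)
    return train, observed, unobserved
```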
Error Measure
Normalized Mean Absolute Error: the average over all users of the absolute difference between predicted and actual ratings, normalized by the expectation of the difference between predicted and actual ratings under the empirical rating distribution of the base data set.
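A sketch of the measure, reading the normalizer as the expected absolute difference when predicted and actual ratings are drawn independently from the empirical rating distribution of the base data (an assumption about the exact convention):
```python
import numpy as np

def nmae(per_user_pairs, rating_values, rating_probs):
    """Normalized mean absolute error, as described above (a sketch).

    per_user_pairs : dict user -> list of (predicted, actual) rating pairs.
    rating_values  : array of possible rating values.
    rating_probs   : empirical probability of each rating value in the base data.
    """
    # Average over users of each user's mean absolute error.
    user_mae = [np.mean([abs(p - a) for p, a in pairs])
                for pairs in per_user_pairs.values() if pairs]
    mae = np.mean(user_mae)
    # Expected absolute difference under the empirical rating distribution.
    v = np.asarray(rating_values, dtype=float)
    p = np.asarray(rating_probs, dtype=float)
    expected = np.sum(p[:, None] * p[None, :] * np.abs(v[:, None] - v[None, :]))
    return mae / expected
```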
Data Sets
EachMovie (Compaq Systems Research Center): 72,916 users, 1,628 items, 6 rating values, 2,811,983 ratings, 97.6% sparsity; filtering threshold: 20 ratings.
MovieLens (GroupLens Research Center): 6,040 users, 3,900 items, 5 rating values, 1,000,209 ratings, 95.7% sparsity; filtering threshold: 20 ratings.
Figure 7: Distribution of ratings in the weak and strong filtered data sets compared to the base data sets.
6. Results
Figure 8: MovieLens Weak Generalization Results (NMAE)
Figure 9: MovieLens Strong Generalization Results (NMAE)
URP and the aspect model attain the same
minimum weak generalization error rate, but URP
does so using far fewer model parameters.
Figure 10: EachMovie Weak Generalization Results (NMAE)
Figure 11: EachMovie Strong Generalization Results (NMAE)
On the more difficult EachMovie data set, URP
clearly performs better than the other rating
prediction methods considered.
7. Conclusions and Future Work
Conclusions: We have introduced URP, a new generative model specially designed for pure, non-sequential, rating-based collaborative filtering. URP has consistent generative semantics at both the user level and the rating profile level. Empirical results show that URP outperforms other popular rating prediction methods while using fewer model parameters.
Future Work: Models with more intuitive generative semantics; currently under study is a promising family of product models. Models that integrate additional features, sequential dynamics, or both.
8. References
1. D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, January 2003.
2. John S. Breese, David Heckerman, and Carl Kadie. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence, pages 43-52, July 1998.
3. Thomas Hofmann. Learning What People (Don't) Want. In Proceedings of the European Conference on Machine Learning (ECML), 2001.
5. Thomas Minka and John Lafferty. Expectation-Propagation for the Generative Aspect Model. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, 2002.
6. R. M. Neal and G. E. Hinton. A new view of the EM algorithm that justifies incremental, sparse and other variants. In M. I. Jordan, editor, Learning in Graphical Models, pages 355-368. Kluwer Academic Publishers, 1998.
7. P. Resnick, N. Iacovou, M. Suchak, P. Bergstorm, and J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In Proceedings of the ACM 1994 Conference on Computer Supported Cooperative Work, pages 175-186, Chapel Hill, North Carolina, 1994. ACM.
8. Upendra Shardanand and Patti Maes. Social information filtering: Algorithms for automating "word of mouth". In Proceedings of ACM CHI'95, volume 1, pages 210-217, 1995.