Title: Combining Content-based and Collaborative Filtering
1Combining Content-based and Collaborative
Filtering
Gabriela Polcicová, Pavol Návrat
- Department of Computer Science and Engineering, Slovak University of Technology
- polcicova_at_dcs.elf.stuba.sk
- navrat_at_elf.stuba.sk
2Overview
- Information Filtering and its Types
- Combined Method
- Experiment with Information Filtering Methods
- Conclusions
3Information Filtering (1)
- delivery of relevant information to the people who need it
- Types of Information Filtering
- Content-based - for textual documents
- Collaborative - for communities of users
- Interests
- information about interests - stored in profiles
- expressing opinions on documents - ratings
- Ratings (i, j, rij)
- for user i and item j, rij is the value of the rating
4Information Filtering (2)
Filter (diagram):
- Learning interests, from rated items (user, item, value)
- Estimating the value of rating, for unrated items (user, item)
- Choosing recommendations, output as recommendations (user, item, estimation)
5Content-based Filtering (1)
- Basic idea
- recommending documents based on content and properties of document
- Profile
- consists of keywords with assigned weights
- only documents matching profile are recommended
- Recommendations
- based on objective measurable properties
6Content-based Filtering (2)
Diagram:
- Documents rated by the user (documents, ratings) -> PROFILE (keywords, phrases with weights)
- Documents unrated by the user -> documents matching profile -> recommended documents (documents of interest)
7Collaborative Filtering (1)
- Basic idea
- automating word of mouth
- leveraging opinions of like-minded users while making decisions
- Schema
- collecting users' opinions
- searching for like-minded users
- making recommendations
8Collaborative Filtering (2)
Diagram: the profile of the current user is compared with the profiles of users 1 to 5.
Documents from like-minded users' profiles -> recommended documents
9Collaborative Filtering (3)
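- Similarity measure: Pearson Correlation Coefficient

$$k_{ci} = \frac{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)^2 \sum_{j \in I_{ci}} (r_{ij} - \bar{r}_i)^2}}$$

- Recommendations computation: weighted sum of ratings

$$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci}}{\sum_{i \in U_{cj}} k_{ci}}$$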
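The two formulas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: ratings are held in plain dictionaries, Ici is taken to be the set of items rated by both users, Ucj the set of users who rated item j, each user's mean is computed over all of that user's ratings, and the denominator of the prediction uses absolute values of kci (a common variant that avoids cancellation between positive and negative coefficients). The function names are hypothetical.

```python
from math import sqrt


def mean_rating(ratings):
    """Mean of all ratings of one user (assumption: mean over every rated item)."""
    return sum(ratings.values()) / len(ratings)


def pearson(rc, ri):
    """Coefficient k_ci between the current user c and user i.

    rc, ri: dicts mapping item -> rating. The sums run over items rated by both.
    """
    common = set(rc) & set(ri)
    mc, mi = mean_rating(rc), mean_rating(ri)
    num = sum((rc[j] - mc) * (ri[j] - mi) for j in common)
    den = sqrt(sum((rc[j] - mc) ** 2 for j in common)
               * sum((ri[j] - mi) ** 2 for j in common))
    return num / den if den else 0.0


def predict(rc, others, item):
    """Estimate of r_cj as a weighted sum of other users' deviations from their means."""
    num = den = 0.0
    for ri in others:
        if item not in ri:
            continue  # only users who rated the item contribute
        k = pearson(rc, ri)
        num += (ri[item] - mean_rating(ri)) * k
        den += abs(k)  # assumption: absolute values in the denominator
    return mean_rating(rc) + num / den if den else None


current = {"A": 5, "B": 3, "C": 1}
user1 = {"A": 5, "B": 3, "C": 1, "D": 5, "E": 1}   # agrees with the current user
user2 = {"A": 1, "B": 3, "C": 5, "D": 1, "E": 5}   # opposite taste
estimate = predict(current, [user1, user2], "D")
```

In this toy example user2 has the opposite taste, so a low rating from user2 still pushes the estimate for item D upward, which is the effect of weighting deviations by a negative coefficient.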
10Combining Content-based and Collaborative
Filtering (1)
- Computing estimates for missing ratings by the Content-based Filtering method for each user
- Searching for like-minded users
- computing coefficient $k_{ci}$ between the current and the i-th user (only from ratings)
- computing coefficient $k_{ci}^{CBF}$ between the current and the i-th user (from both ratings and estimates)
- New recommendations computation
- weighted sum of ratings and estimates, with ratings weighted by the coefficients $k_{ci}$ and estimates weighted by the coefficients $k_{ci}^{CBF}$
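The three steps above can be sketched as follows. This is one possible reading of the slide, not a confirmed implementation: it assumes that users who actually rated the target item contribute their rating weighted by the ratings-only coefficient, while users for whom only a content-based estimate exists contribute that estimate weighted by the coefficient computed from ratings plus estimates. User means are taken over real ratings only, and the denominator uses absolute values; both are assumptions. All names (`combined_predict`, `ratings`, `estimates`) are hypothetical.

```python
from math import sqrt


def pearson(a, b, mean_a, mean_b):
    """Pearson coefficient over the items present in both dicts a and b."""
    common = set(a) & set(b)
    num = sum((a[j] - mean_a) * (b[j] - mean_b) for j in common)
    den = sqrt(sum((a[j] - mean_a) ** 2 for j in common)
               * sum((b[j] - mean_b) ** 2 for j in common))
    return num / den if den else 0.0


def combined_predict(c, item, ratings, estimates):
    """Estimate the rating of `item` for user `c`.

    ratings[u][j]:   real ratings.
    estimates[u][j]: content-based estimates for items u has not rated (step 1).
    """
    means = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    # Ratings filled in with estimates; real ratings take precedence.
    filled = {u: {**estimates.get(u, {}), **ratings[u]} for u in ratings}
    num = den = 0.0
    for i in ratings:
        if i == c:
            continue
        if item in ratings[i]:
            # Coefficient from ratings only, applied to a real rating.
            k = pearson(ratings[c], ratings[i], means[c], means[i])
            num += (ratings[i][item] - means[i]) * k
            den += abs(k)
        elif item in estimates.get(i, {}):
            # Coefficient from ratings and estimates, applied to an estimate.
            k_cbf = pearson(filled[c], filled[i], means[c], means[i])
            num += (estimates[i][item] - means[i]) * k_cbf
            den += abs(k_cbf)
    return means[c] + num / den if den else None


ratings = {
    "c": {"A": 5, "B": 1},
    "u1": {"A": 5, "B": 1, "D": 5, "E": 1},
    "u2": {"A": 1, "B": 5},
}
estimates = {"c": {"E": 1}, "u1": {}, "u2": {"D": 1, "E": 5}}
estimate_d = combined_predict("c", "D", ratings, estimates)
```

The point of the second branch is that user u2, who never rated item D, can still influence the estimate through a content-based estimate, which matters most when real ratings are sparse.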
11Datasets for Experiments
- Data
- EachMovie - users' ratings for movies
- www.research.digital.com/SRC/eachmovie/
- IMDB - textual information for CBF (movie descriptions)
- www.imdb.com/
- Datasets
- A - ratings from the period up to Mar 1, 1996 (810 ratings from 71 users)
- B - ratings from the period up to Mar 15, 1996 (2407 ratings from 131 users)
- C - ratings from the period up to Apr 1, 1996 (12290 ratings from 651 users)
12EachMovie Data and Constant Method
13Experiments with Combination of Content-based and
Collaborative Filtering (2)
Experiment design (diagram):
- Divide dataset into training set (90%) and test set (10%)
- Apply filtering methods and evaluate their performance
- Content-based Filtering method: test, training sets -> recommendations
- Collaborative Filtering method: test, training sets -> recommendations
- Combined Filtering method: test, training sets -> recommendations
- Constant method: test set -> recommendations
- Evaluation of methods performance
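The 90/10 division of a dataset could be done as in the sketch below. The slide does not say how ratings were assigned to the two sets, so the random shuffle with a fixed seed is an assumption, as are the function name and the (user, item, value) tuple layout.

```python
import random


def split_ratings(ratings, test_fraction=0.1, seed=0):
    """Split a list of (user, item, value) ratings into training and test sets.

    Assumption: a uniformly random split; the seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 10 users x 10 items with made-up rating values.
data = [(u, j, (u + j) % 6) for u in range(10) for j in range(10)]
train, test = split_ratings(data)
```

Each method is then trained on `train` and asked to estimate the ratings held out in `test`.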
14Metrics
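- Coverage: percentage of items for which the method is able to compute estimates
- Accuracy
- F-measure

$$F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad \mathrm{Precision} = \frac{|R \cap L|}{|R|}, \qquad \mathrm{Recall} = \frac{|R \cap L|}{|L|}$$

R - set of recommended items, L - set of liked items

- NMAE

$$\mathrm{NMAE} = \frac{\sum |\hat{r}_{ij} - r_{ij}|}{n \cdot s}$$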
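A minimal sketch of these metrics follows. The slide does not define n and s; here n is assumed to be the number of test ratings for which an estimate exists and s the width of the rating scale. A missing estimate is represented by `None`, and the function names are hypothetical.

```python
def precision_recall_f(recommended, liked):
    """Precision, recall and F-measure for sets R (recommended) and L (liked)."""
    hits = len(recommended & liked)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(liked) if liked else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


def coverage(actual, predicted):
    """Fraction of test ratings for which the method produced an estimate."""
    covered = sum(1 for key in actual if predicted.get(key) is not None)
    return covered / len(actual)


def nmae(actual, predicted, scale):
    """Mean absolute error divided by the rating scale width (assumed meaning of s)."""
    errors = [abs(predicted[key] - actual[key])
              for key in actual if predicted.get(key) is not None]
    return sum(errors) / (len(errors) * scale) if errors else None


p, r, f = precision_recall_f({"a", "b", "c", "d"}, {"b", "c", "e"})

actual = {("u", "x"): 5, ("u", "y"): 1, ("u", "z"): 3}
predicted = {("u", "x"): 4, ("u", "y"): 3, ("u", "z"): None}
cov = coverage(actual, predicted)
err = nmae(actual, predicted, scale=5)
```

Reporting coverage next to accuracy matters because a method that declines to estimate hard cases can look more accurate than it is.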
15Results of Experiments
16Conclusions
- Combination of content-based and collaborative filtering might help in the initial phase, when few ratings are available
- Future work
- Weighting of coefficients
- Comparing the combined method with additional methods
17Content-based Filtering - Vector Representation
of Documents and Profiles
Document_j: "computer machine learning"
D = (…, computer, …, learning, …, machine, …)
W_j = (0, …, 0, 0.5, 0, …, 0, 0.3, 0, …, 0, 0.2, 0, …, 0) (TF-IDF weights)

$$\mathrm{profile}_i = \sum_{j=1}^{n} r_j \cdot w_{ij}$$

$$\mathrm{Sim}(W, \mathrm{Profile}) = \frac{W \cdot \mathrm{Profile}}{|W| \, |\mathrm{Profile}|}$$
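The profile construction and the similarity on this slide can be illustrated with sparse vectors stored as dictionaries. This is a sketch under assumptions: TF-IDF weights are taken as already computed, the profile weight of a keyword is the sum over rated documents of rating times keyword weight, and the similarity is the dot product normalized by the vector lengths. The document names and the function names are made up for the example.

```python
from math import sqrt


def build_profile(doc_vectors, ratings):
    """Profile weight of each keyword: sum over rated documents of rating * weight."""
    profile = {}
    for doc, rating in ratings.items():
        for term, weight in doc_vectors[doc].items():
            profile[term] = profile.get(term, 0.0) + rating * weight
    return profile


def similarity(w, profile):
    """Normalized dot product of a document vector and a profile vector."""
    dot = sum(x * profile.get(term, 0.0) for term, x in w.items())
    norm_w = sqrt(sum(x * x for x in w.values()))
    norm_p = sqrt(sum(x * x for x in profile.values()))
    return dot / (norm_w * norm_p) if norm_w and norm_p else 0.0


# Assumed, precomputed TF-IDF weights; only non-zero entries are stored.
doc_vectors = {
    "d1": {"computer": 0.5, "learning": 0.3, "machine": 0.2},
    "d2": {"cooking": 1.0},
    "d3": {"computer": 1.0},
}
profile = build_profile(doc_vectors, {"d1": 4})  # the user rated d1 with 4
sim_d1 = similarity(doc_vectors["d1"], profile)
sim_d2 = similarity(doc_vectors["d2"], profile)
sim_d3 = similarity(doc_vectors["d3"], profile)
```

A document that shares no keywords with the profile scores zero, while a document that overlaps on one strongly weighted keyword scores somewhere between zero and one.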
18Collaborative Filtering - Example
A B C D E F G current 1
4 5 1 3 5 1 2 2
1 3
2 5 3 5
1 4 5 4 1
4 2 4 5 2 4 2 5
2
19Combining Content-based and Collaborative
Filtering (2)
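- Similarity measure: Pearson Correlation Coefficient computed from ratings and CBF estimates

$$k_{ci}^{CBF} = \frac{\sum_{j \in I_{ci}} (r_{cj}^{CBF} - \bar{r}_c)(r_{ij}^{CBF} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj}^{CBF} - \bar{r}_c)^2 \sum_{j \in I_{ci}} (r_{ij}^{CBF} - \bar{r}_i)^2}}$$

- Recommendations computation: weighted sum of ratings and estimates

$$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci} + \sum_{i \in U_{cj}} (r_{ij}^{CBF} - \bar{r}_i)\, k_{ci}^{CBF}}{\sum_{i \in U_{cj}} k_{ci} + \sum_{i \in U_{cj}} k_{ci}^{CBF}}$$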
20Experiments with Combination of Content-based and
Collaborative Filtering (1)
- Content-based Filtering Method (CBF)
- documents and profiles: vector representation, weighted keywords (TF-IDF)
- estimation computation: normalized dot product of document and profile vectors
- Collaborative Filtering (CF)
- Pearson correlation coefficient
- weighted sum of ratings
- Combination of CF and CBF
- Pearson correlation coefficients
- weighted sum of ratings and CBF estimations
- Constant Method (rcj = 5)