Title: Recommender Systems and Collaborative Filtering
1Recommender Systems and Collaborative Filtering
- Drawing heavily on online slide decks in this area,
especially those of William W. Cohen (CMU)
2You visit an online bookshop ...
- The shop has 100,000 books.
- On the webpage, they will display 5 book covers,
especially for you.
- Which ones will they display?
3Why?
- The same goes for books, webpages, music, films, clothes,
food, everything ... this is very serious for
e-commerce: big financial uplift if stores get
recommendations right.
- What if the website is not selling you anything
(e.g. research papers, search, interest group
forum)? Why does such a site need to make good
recommendations?
4Basic approaches used for recommendation
- User-based
  - Recommend things that were purchased or viewed by
  users who are similar to you
- Item-based
  - Recommend things that are similar to the items
  that you have viewed/purchased before
5Amazon cold-start recommendation
6Amazon with minimal info about me via a cookie
on this netbook
7Amazon, when I logged in
8User Profiles
- For user-based recommendation, sites need to have
some kind of user profile.
- Similarity with other users is based on distance
measurements over the profile.
- What do you think could be in a user profile?
9Potential contents of user profiles
- Demographic data: age, gender, salary,
profession, country of residence, country of
origin, religion ...
- Site behaviour: purchase history at the site;
viewing history, perhaps including time spent on
certain pages/items; clickstream sequence
10K-Nearest Neighbour based Recommendation
[Figure: users plotted by Age and Salary, with "You" marked]
(Think in terms of many dimensions, not just
these two)
11K-Nearest Neighbour based Recommendation
[Figure: users plotted by Age and Salary, with "You" marked]
Your neighbours recommend things that they
have viewed/purchased
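The neighbour idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user names, the two-dimensional (age, salary) profiles, the item names, and the function `knn_recommend` are all hypothetical, and min-max scaling with Euclidean distance is just one plausible choice of distance measure.

```python
import math
from collections import Counter

# Hypothetical data: (age, salary in thousands) plus items viewed/purchased.
users = {
    "ann":  {"profile": (25, 30), "items": {"book_a", "book_b"}},
    "bob":  {"profile": (27, 32), "items": {"book_b", "book_c"}},
    "cara": {"profile": (60, 90), "items": {"book_d"}},
    "dan":  {"profile": (24, 28), "items": {"book_c", "book_e"}},
}

def knn_recommend(you, owned, users, k=2, top_n=3):
    """Return (k nearest users, items they have that you do not)."""
    names = list(users)
    points = [users[n]["profile"] for n in names] + [you]
    # Min-max scale every dimension so that no single feature dominates.
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]

    def scaled(p):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(p, lows, highs)]

    target = scaled(you)
    by_distance = sorted(
        names, key=lambda n: math.dist(target, scaled(users[n]["profile"])))
    neighbours = by_distance[:k]
    # Each neighbour "votes" for every item they have that you lack.
    votes = Counter(item for n in neighbours
                    for item in users[n]["items"] if item not in owned)
    return neighbours, [item for item, _ in votes.most_common(top_n)]

neighbours, recommendations = knn_recommend((26, 31), {"book_a"}, users)
print(neighbours, recommendations)
```

With more profile dimensions the same code applies unchanged, since the distance is computed over however many values each profile tuple holds.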
12Collaborative Filtering: The main idea
- People who purchased A also purchased B
- Different from nearest-neighbour: this can lead
to recommendations based on the behaviour of users
who are very dissimilar to you
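A minimal sketch of "people who purchased A also purchased B", assuming purchase history is available as a list of baskets (one set of items per user). The basket data and the function names `co_purchase_counts` and `also_purchased` are made up for illustration; real systems typically normalise these raw counts in some way.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, one set of items per user.
baskets = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "D"},
    {"A", "B"},
]

def co_purchase_counts(baskets):
    """Count, for each ordered pair (x, y), how many baskets contain both."""
    counts = Counter()
    for basket in baskets:
        for x, y in combinations(sorted(basket), 2):
            counts[(x, y)] += 1
            counts[(y, x)] += 1
    return counts

def also_purchased(item, baskets, top_n=3):
    """Items most often bought together with `item`, with their counts."""
    counts = co_purchase_counts(baskets)
    related = Counter({y: c for (x, y), c in counts.items() if x == item})
    return related.most_common(top_n)

print(also_purchased("A", baskets))  # [('B', 3), ('C', 2)]
```

Note that no user profile is involved: the users who contribute to the count for A may be nothing like you.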
13Other forms/aspects of collaborative filtering
- Why collaborative? Basically, other people (in
fact, many of them) have gone to the effort of
viewing/filtering things, and chosen the best
few. You get a recommendation of the best few,
without having to spend the effort.
- Widespread examples of CF: Twitter, PageRank,
StumbleUpon, Digg, Facebook (Likes), etc.
14Another look at Google's PageRank (this bit
adapted from slides of William Cohen, CMU)
Inlinks are good (recommendations). Inlinks from
a "good" site are better than inlinks from a
"bad" site, but inlinks from sites with many
outlinks are not as good... "Good" and "bad"
are relative.
[Figure: diagram of example web sites labelled "xxx", "yyyy", "a b c d e f g" and "pdq pdq .."]
15Google's PageRank (Brin & Page,
http://www-db.stanford.edu/~backrub/google.html)
- Imagine a pagehopper that always either
  - follows a random link, or
  - jumps to a random page
- PageRank ranks pages by the amount of time the
pagehopper spends on a page
- or, if there were many pagehoppers, PageRank is
the expected crowd size
[Figure: diagram of example web sites labelled "xxx", "yyyy", "a b c d e f g" and "pdq pdq .."]
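The pagehopper description can be turned into a short power-iteration sketch. Several things here are assumptions rather than anything stated on the slide: the probability of following a link (`damping=0.85`, a commonly quoted value), the fixed number of iterations, the treatment of pages with no outlinks (the hopper jumps to any page), and the tiny four-page link graph.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate the fraction of time a random pagehopper spends per page.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # With probability (1 - damping) the hopper jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            # With probability damping it follows a random outlink of p.
            # Assumption: a page with no outlinks sends the hopper anywhere.
            targets = links.get(p) or pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank

# Hypothetical link graph: "c" has the most inlinks, "d" has none.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph "a" ends up ranked well above "b" even though each has a single inlink, because its inlink comes from the highly ranked "c", which has only one outlink and so passes on all of its weight. The page with no inlinks, "d", gets only the random-jump share.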
16Collaborative Filtering and User Ratings
Many systems ask users to rate items e.g. on a
scale of 1 to 10. These ratings then enable the
system to give more precise/accurate
recommendations, and use a variety of
sophisticated learning/prediction algorithms.
17Collaborative Filtering and User Ratings
E.g. Here are user ratings for some items (? means unrated):

        A  B  C  D  E  F  G  H
You     7  2  1  8  9  9  ?  ?
User1   1  8  8  2  ?  2  8  7
User2   6  3  3  7  6  5  3  1
User3   7  2  1  7  7  ?  3  1

How might a system predict your
rating for items G and H?
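One possible answer, sketched under assumptions: measure how similar each user's ratings are to yours with the Pearson correlation over the items you have both rated, then take a similarity-weighted average of the ratings given by positively correlated users. The function names `pearson` and `predict` are illustrative, and this is only one of many prediction schemes a real system might use.

```python
import math

# Ratings from the table above; unrated items ("?") are simply left out.
ratings = {
    "You":   {"A": 7, "B": 2, "C": 1, "D": 8, "E": 9, "F": 9},
    "User1": {"A": 1, "B": 8, "C": 8, "D": 2, "F": 2, "G": 8, "H": 7},
    "User2": {"A": 6, "B": 3, "C": 3, "D": 7, "E": 6, "F": 5, "G": 3, "H": 1},
    "User3": {"A": 7, "B": 2, "C": 1, "D": 7, "E": 7, "G": 3, "H": 1},
}

def pearson(u, v):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    cov = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    var_u = sum((u[i] - mean_u) ** 2 for i in common)
    var_v = sum((v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)

def predict(target, item, ratings):
    """Similarity-weighted average over positively correlated users."""
    num = den = 0.0
    for user, theirs in ratings.items():
        if user == target or item not in theirs:
            continue
        sim = pearson(ratings[target], theirs)
        if sim > 0:  # ignore users whose tastes run opposite to the target's
            num += sim * theirs[item]
            den += sim
    return num / den if den else None

for item in ("G", "H"):
    print(item, predict("You", item, ratings))
```

On this table User1's ratings are negatively correlated with yours, so only User2 and User3 contribute. Both rated G as 3 and H as 1, so the sketch predicts roughly 3 for G and 1 for H.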
18Collaborative Filtering Works
19BellCore's MovieRecommender (Bell Communications
Research)
- Participants sent email to videos_at_bellcore.com
- System replied with a list of 500 movies to rate
on a 1-10 scale (250 random, 250 popular)
- Only a subset needs to be rated
- New participant P sends in rated movies via email
- System compares ratings for P to ratings of (a
random sample of) previous users
- Most similar users are used to predict scores for
unrated movies
- System returns recommendations in an email
message.
21Start your own business? Bookmark-based
recommendation
22Display the right adverts on your site
23End