1
Introduction to Recommendation Systems

Presented by HongBo Deng
Nov 14, 2006

Based on the Stanford slides by Anand Rajaraman and
Jeffrey D. Ullman
2
Netflix Prize - $1,000,000 Prize
Netflix recently announced the Netflix Prize, in
which they will award 1 million dollars for an
algorithm that can outperform their
recommendation approach, Cinematch, by 10%.
3
Outline

What are recommendation systems?
Three recommendation approaches
Content-based
Collaborative
Hybrid approach
Conclusions
Review of my previous work

4
What are recommendation systems?
Recommendation systems are programs that attempt
to predict items a user may be interested in
Items
Products, web sites, blogs, news items, ...
5
Recommendation Types

Editorial
Simple aggregates
Top 10, Most Popular, Recent Uploads
Tailored to individual users
Amazon, Netflix, ...
Books, CDs, other products at amazon.com
Movies by Netflix, MovieLens

6
Formal Model

C = set of customers
S = set of items, e.g. books, movies
The item space S and the customer space C can
both be very large.
Utility function $u : C \times S \to R$
R = set of ratings
R is a totally ordered set
e.g., 0-5 stars, or a real number in $[0,1]$
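
One way to picture the formal model is as a partial function stored sparsely. The sketch below assumes a plain dictionary keyed by (customer, item) pairs; the names `utility` and `u` and the sample ratings are made up for illustration and do not come from the slides.

```python
# Hypothetical known ratings on a 0-5 star scale.
# Most (customer, item) pairs are simply absent.
utility = {
    ("alice", "book_a"): 4.0,
    ("alice", "movie_b"): 1.0,
    ("bob", "movie_b"): 5.0,
}

def u(customer, item):
    """Return the known rating u(c, s), or None where no rating is known."""
    return utility.get((customer, item))
```

Storing only the known pairs keeps memory proportional to the number of ratings rather than to the full size of C times S.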

7
Utility Matrix
[Table: rows are the users Alice, Bob, Carol, David; columns are the movies King Kong, LOTR, Matrix, Nacho Libre]
8
Recommendation Process

Collect known ratings for the utility matrix
Extrapolate unknown ratings from known ratings
Estimate ratings for the items that have not been
seen by a user
Recommend the items with the highest estimated
ratings to the user
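
The steps above can be sketched end to end in a few lines. The extrapolation rule used here (predict each unseen item's rating as that item's average known rating) is a deliberately naive stand-in chosen for brevity, not a method from the slides; the users, items and ratings are hypothetical.

```python
from collections import defaultdict

# Step 1: hypothetical known ratings, user -> {item: rating}.
known = {
    "alice": {"m1": 5.0, "m2": 3.0},
    "bob": {"m1": 4.0, "m3": 2.0},
    "carol": {"m2": 4.0, "m3": 5.0},
}

def item_means(ratings):
    """Step 2 (naive): average known rating per item."""
    totals, counts = defaultdict(float), defaultdict(int)
    for user_ratings in ratings.values():
        for item, r in user_ratings.items():
            totals[item] += r
            counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}

def recommend(user, ratings, n=1):
    """Step 3: rank the items the user has not seen by estimated rating."""
    means = item_means(ratings)
    unseen = [item for item in means if item not in ratings[user]]
    return sorted(unseen, key=lambda item: means[item], reverse=True)[:n]

top = recommend("alice", known)
```

Any of the content-based or collaborative estimators discussed later could replace `item_means` without changing the overall shape of the process.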

9
Collecting Ratings

Explicit data collection
Ask people to rate items
Doesn't work well in practice: people can't be
bothered
Implicit data collection
Learn ratings from user actions
e.g., purchase implies high rating
What about low ratings?
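
A minimal sketch of implicit data collection, assuming a hypothetical event log of (user, item, action) tuples and an invented mapping from actions to implied ratings. Note that this scheme never produces a low rating, which is exactly the open question raised above: the absence of an action is not evidence of dislike.

```python
# Hypothetical mapping from observed actions to implied ratings (0-5 scale).
IMPLIED_RATING = {"purchase": 5.0, "add_to_cart": 4.0, "view": 3.0}

def implicit_ratings(events):
    """events: iterable of (user, item, action). Keep the strongest signal per pair."""
    ratings = {}
    for user, item, action in events:
        implied = IMPLIED_RATING.get(action)
        if implied is None:
            continue  # unknown action: no evidence either way
        key = (user, item)
        ratings[key] = max(ratings.get(key, 0.0), implied)
    return ratings

log = [("alice", "m1", "view"), ("alice", "m1", "purchase"), ("bob", "m2", "share")]
ratings = implicit_ratings(log)
```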

10
Extrapolating Utilities

Key problem: matrix U is sparse
Most people have not rated most items
Three approaches
Content-based recommendation
Collaborative recommendation
Hybrid recommendation

11
Content-based recommendations

Main idea: recommend items to customer C similar
to previous items rated highly by C
Movie recommendations
recommend movies with same actor(s), director,
genre, ...
Websites, blogs, news
recommend other sites with similar content

12
Plan of action
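[Diagram: the items a user likes are used to build a user profile; the user profile is matched against item profiles to recommend new items. Example features shown: red, circles, triangles]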
13
Item Profiles

For each item, create an item profile
Profile is a set of features
movies: author, title, actor, director, ...
text: set of important words in document
How to pick important words?
Usual heuristic is TF.IDF (Term Frequency times
Inverse Document Frequency)

14
TF.IDF

$f_{ij}$ = frequency of term $t_i$ in document $d_j$
$n_i$ = number of docs that mention term $t_i$
$N$ = total number of docs
TF.IDF score $w_{ij} = TF_{ij} \times IDF_i$
Doc profile = set of words with highest TF.IDF
scores, together with their scores
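
The transcript does not preserve how TF and IDF themselves are defined, so the sketch below assumes two common conventions: TF is the term count divided by the largest term count in the same document, and IDF is log2(N / n_i). Both choices, along with the function names and the toy documents, are assumptions for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of non-empty token lists. Returns one {term: score} dict per doc."""
    n_docs = len(docs)
    # n_i: number of documents that mention term i
    doc_freq = Counter(term for doc in docs for term in set(doc))
    profiles = []
    for doc in docs:
        counts = Counter(doc)              # f_ij
        max_count = max(counts.values())   # assumed TF normalisation
        profiles.append({
            term: (f / max_count) * math.log2(n_docs / doc_freq[term])
            for term, f in counts.items()
        })
    return profiles

def top_terms(profile, k):
    """Doc profile: the k words with the highest TF.IDF scores, with their scores."""
    return sorted(profile.items(), key=lambda kv: kv[1], reverse=True)[:k]

docs = [
    ["movie", "space", "space", "alien"],
    ["movie", "romance", "paris"],
    ["movie", "space", "robot"],
]
profiles = tf_idf(docs)
```

A term such as "movie" that appears in every document gets an IDF of zero under this convention, so it never enters a document profile, which is the point of the heuristic.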

15
User profiles and prediction

User profile possibilities
Weighted average of rated item profiles
Variation: weight by difference from average
rating for item
Traditional heuristic
Given user profile c and item profile s, estimate
$u(c,s) = \cos(c,s) = \frac{c \cdot s}{\|c\| \, \|s\|}$
Need efficient method to find items with high
utility
E.g.
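
As a rough illustration of the weighted-average profile and the cosine heuristic, the sketch below stores profiles as sparse {feature: weight} dictionaries. The feature names, ratings, and the choice to weight by the raw (positive) rating are assumptions made for the example.

```python
import math

def cosine(a, b):
    """cos(a, b) = a.b / (|a| |b|) for sparse vectors stored as {feature: weight}."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def user_profile(rated):
    """rated: list of (item_profile, rating) with positive ratings.

    Returns the rating-weighted average of the item profiles.
    """
    total = sum(r for _, r in rated)
    profile = {}
    for item_profile, r in rated:
        for f, w in item_profile.items():
            profile[f] = profile.get(f, 0.0) + r * w / total
    return profile

# Hypothetical item profiles over made-up features.
scifi = {"space": 1.0, "robot": 1.0}
romance = {"paris": 1.0, "love": 1.0}
c = user_profile([(scifi, 5.0), (romance, 1.0)])
candidate = {"space": 1.0, "alien": 1.0}
score = cosine(c, candidate)
```

Scoring every item this way is a linear scan per user, which is why the slide asks for a more efficient way to find high-utility items.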

16
Model-based approaches

For each user, learn a classifier that classifies
items into rating classes
"liked by user" and "not liked by user"
e.g., Bayesian, regression, SVM
Apply classifier to each item to find
recommendation candidates
Problem: scalability
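
To make the per-user classifier idea concrete, here is a small hand-written Bernoulli naive Bayes with Laplace smoothing, as one possible instance of the "Bayesian" option. The feature list and the rating history are hypothetical, and a real system would more likely use an existing library; this is only a sketch under those assumptions.

```python
import math

def train_naive_bayes(examples, features):
    """examples: list of (set_of_item_features, liked). Laplace-smoothed Bernoulli NB."""
    model = {}
    for label in (True, False):
        rows = [feats for feats, liked in examples if liked == label]
        prior = (len(rows) + 1) / (len(examples) + 2)
        likelihood = {
            f: (sum(f in feats for feats in rows) + 1) / (len(rows) + 2)
            for f in features
        }
        model[label] = (prior, likelihood)
    return model

def predicts_liked(model, item_features):
    """Classify an item as liked (True) or not liked (False) via log-posteriors."""
    scores = {}
    for label, (prior, likelihood) in model.items():
        log_p = math.log(prior)
        for f, p in likelihood.items():
            log_p += math.log(p if f in item_features else 1.0 - p)
        scores[label] = log_p
    return scores[True] > scores[False]

FEATURES = ["scifi", "romance", "comedy"]
# Hypothetical rating history for ONE user; a separate model is trained per user.
history = [
    ({"scifi"}, True),
    ({"scifi", "comedy"}, True),
    ({"romance"}, False),
    ({"romance", "comedy"}, False),
]
model = train_naive_bayes(history, FEATURES)
```

Training and applying one such model per user over every item is where the scalability problem comes from.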

17
Limitations of content-based approach

Finding the appropriate features
e.g., images, movies, music
Overspecialization
Never recommends items outside the user's content
profile
People might have multiple interests
Recommendations for new users
How to build a profile?
A new user, having very few ratings, would not be
able to get accurate recommendations.

18
Collaborative Filtering

Consider user c
Find set D of other users whose ratings are
similar to c's ratings
Estimate the user's ratings based on ratings of users
in D

[Diagram: find the set of other users with similar ratings, then estimate the user's ratings from theirs]
19
Similar users

Let $r_x$ be the vector of user x's ratings
Cosine similarity measure
$\mathrm{sim}(x,y) = \cos(r_x, r_y)$
Pearson correlation coefficient
$S_{xy}$ = items rated by both users x and y
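
Both similarity measures can be sketched over sparse rating vectors. The Pearson formula itself is not preserved in the transcript, so the version below is the standard correlation coefficient with means taken over the co-rated items only; that choice (some variants use each user's overall mean instead) and the sample ratings are assumptions.

```python
import math

def cosine_sim(rx, ry):
    """Cosine of two rating vectors stored as {item: rating}; missing ratings count as 0."""
    dot = sum(r * ry[i] for i, r in rx.items() if i in ry)
    nx = math.sqrt(sum(r * r for r in rx.values()))
    ny = math.sqrt(sum(r * r for r in ry.values()))
    return dot / (nx * ny) if nx and ny else 0.0

def pearson_sim(rx, ry):
    """Pearson correlation over the items rated by both users."""
    s_xy = [i for i in rx if i in ry]
    if len(s_xy) < 2:
        return 0.0
    mean_x = sum(rx[i] for i in s_xy) / len(s_xy)
    mean_y = sum(ry[i] for i in s_xy) / len(s_xy)
    cov = sum((rx[i] - mean_x) * (ry[i] - mean_y) for i in s_xy)
    var_x = sum((rx[i] - mean_x) ** 2 for i in s_xy)
    var_y = sum((ry[i] - mean_y) ** 2 for i in s_xy)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratings.
alice = {"m1": 5.0, "m2": 1.0, "m3": 4.0}
bob = {"m1": 4.0, "m2": 2.0, "m4": 5.0}
carol = {"m1": 1.0, "m2": 5.0, "m3": 2.0}
```

In this toy data the cosine scores for (alice, bob) and (alice, carol) are almost equal, while Pearson separates them sharply because it subtracts each user's mean before comparing.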

20
Rating predictions

Let D be the set of k users that are the most
similar to c and who have rated item s
Possibilities for prediction function (item s)
$r_{cs} = \frac{1}{k} \sum_{d \in D} r_{ds}$
$r_{cs} = \frac{\sum_{d \in D} \mathrm{sim}(c,d) \, r_{ds}}{\sum_{d \in D} \mathrm{sim}(c,d)}$
Other options?
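
The two prediction functions can be sketched as one routine with a flag. The sketch assumes similarities are already available in a nested dictionary `sim` and are non-negative; the names and the sample numbers are hypothetical. When fewer than k raters exist, it averages over however many were found.

```python
def predict(c, s, ratings, sim, k=2, weighted=True):
    """Estimate r_cs from the k users most similar to c who have rated item s."""
    candidates = [d for d in ratings if d != c and s in ratings[d]]
    neighbours = sorted(candidates, key=lambda d: sim[c][d], reverse=True)[:k]
    if not neighbours:
        return None
    if not weighted:
        # Plain average of the neighbours' ratings.
        return sum(ratings[d][s] for d in neighbours) / len(neighbours)
    total_sim = sum(sim[c][d] for d in neighbours)
    if total_sim == 0:
        return None
    # Similarity-weighted average.
    return sum(sim[c][d] * ratings[d][s] for d in neighbours) / total_sim

ratings = {
    "alice": {"m1": 5.0},
    "bob": {"m1": 4.0, "m2": 4.0},
    "carol": {"m1": 5.0, "m2": 2.0},
    "dave": {"m2": 1.0},
}
sim = {"alice": {"bob": 0.5, "carol": 1.0, "dave": 0.1}}
```

Here the weighted estimate for alice on m2 is pulled toward carol's rating, since carol is the more similar neighbour.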

21
Complexity

Expensive step is finding the k most similar
customers
$O(|U|)$
Too expensive to do at runtime
Need to pre-compute
Naïve precomputation takes time $O(N|U|)$
Simple trick gives some speedup
Can use clustering, partitioning as alternatives,
but quality degrades
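
A minimal sketch of the naive precomputation: build, offline, a table of each user's k nearest neighbours so that runtime prediction becomes a lookup. The `similarity` function here is a toy stand-in over invented one-dimensional "taste" scores, not a real rating-based measure.

```python
import heapq

def precompute_neighbours(users, similarity, k):
    """Offline step: for every user, store the k most similar other users.

    Every pair of users is compared, so the number of similarity
    evaluations grows quadratically with the number of users.
    """
    table = {}
    for x in users:
        others = (y for y in users if y != x)
        table[x] = heapq.nlargest(k, others, key=lambda y: similarity(x, y))
    return table

# Hypothetical one-dimensional "taste" scores standing in for full rating vectors.
taste = {"alice": 0.9, "bob": 0.8, "carol": 0.2, "dave": 0.1}

def similarity(x, y):
    return -abs(taste[x] - taste[y])

neighbours = precompute_neighbours(taste, similarity, k=1)
```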

22
Item-Item Collaborative Filtering

So far: user-user collaborative filtering
Another view
For item s, find other similar items
Estimate rating for item based on ratings for
similar items
Can use same similarity metrics and prediction
functions as in user-user model
In practice, it has been observed that item-item
often works better than user-user
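
The item-item view can be sketched by comparing columns of the utility matrix instead of rows. The sketch below assumes cosine similarity between item columns and a similarity-weighted average of the user's own ratings; the data and function names are hypothetical.

```python
import math

def item_vector(item, ratings):
    """Column of the utility matrix: {user: rating} for one item."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine of two sparse vectors; missing entries count as 0."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_item_item(user, item, ratings, k=2):
    """Weighted average of the user's ratings on the k items most similar to `item`."""
    target = item_vector(item, ratings)
    scored = [
        (cosine(target, item_vector(other, ratings)), r)
        for other, r in ratings[user].items() if other != item
    ]
    top = sorted(scored, reverse=True)[:k]
    total = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / total if total else None

ratings = {
    "alice": {"m1": 5.0, "m2": 1.0},
    "bob": {"m1": 4.0, "m2": 2.0, "m3": 5.0},
    "carol": {"m1": 1.0, "m2": 5.0, "m3": 1.0},
}
estimate = predict_item_item("alice", "m3", ratings)
```

Only the user's own ratings enter the final average; other users contribute solely through the item-to-item similarities.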

23
Pros and cons of collaborative filtering

Works for any kind of item
No feature selection needed
New user problem
The same problem as with content-based system
New item problem
Sparsity of rating matrix

24
Hybrid Methods

Implement two separate recommenders and combine
their predictions
Add content-based methods to collaborative
approach
item profiles for new item problem
deal with sparsity-related problems
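
One simple way to combine two separate recommenders is a weighted blend with a fallback. The linear combination and the `weight` parameter are assumptions for illustration; the slides do not say how the predictions are combined.

```python
def hybrid_score(content_score, collab_score, weight=0.5):
    """Blend two recommenders' predictions; fall back to whichever one is available."""
    if collab_score is None:
        return content_score      # e.g. a new item with no ratings yet
    if content_score is None:
        return collab_score
    return weight * content_score + (1.0 - weight) * collab_score
```

The fallback branch is where item profiles help with the new item problem: a content-based score exists even when no user has rated the item.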

25
Evaluating Recommendations

Precision
Accuracy of predictions
Compare predictions with known ratings
Root-mean-square error (RMSE)
Receiver operating characteristic (ROC)
Tradeoff curve between false positives and false
negatives
Recommendation Quality
Top-n measures (e.g., Breese score)
Item-Set Coverage
Number of items/users for which system can make
predictions
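
Two of the measures above are easy to sketch: RMSE against held-out known ratings, and coverage as the fraction of held-out pairs that received any prediction. The held-out data and the convention of using `None` for "no prediction" are assumptions for the example.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over the pairs that have both a prediction and a rating."""
    keys = [k for k in actual if predicted.get(k) is not None]
    if not keys:
        raise ValueError("no overlapping predictions to evaluate")
    return math.sqrt(sum((predicted[k] - actual[k]) ** 2 for k in keys) / len(keys))

def coverage(predicted, actual):
    """Fraction of held-out pairs for which the system produced a prediction."""
    return sum(predicted.get(k) is not None for k in actual) / len(actual)

# Hypothetical held-out ratings and system predictions.
actual = {("alice", "m1"): 4.0, ("bob", "m2"): 2.0, ("carol", "m3"): 5.0}
predicted = {("alice", "m1"): 3.0, ("bob", "m2"): 3.0, ("carol", "m3"): None}
```

Reporting the two together matters: a system can lower its RMSE simply by declining to predict the hard cases, which shows up as reduced coverage.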

26
Conclusions

Content-based
The user will be recommended items similar to the
ones the user preferred in the past
Collaborative
The user will be recommended items that people
with similar tastes and preferences liked in the
past
Hybrid
Combine collaborative and content-based methods

27
Review of my previous work
28
Facial Expression Recognition
Preprocessing procedure
Rotate to line up eye coordinates
Locate and crop face region
Geometric normalization
Histogram Equalization
Gabor Feature Extraction
Normalize
Train Phase
Templates
PCA + LDA transformation matrix
Test Phase
Distance Classifier
29
Image Stitching
Feature Points extraction
Correlation Match
RANSAC to eliminate false match points
Demo
Build the model: perspective model
Image alignment
Image Stitching
30
Any questions or suggestions?