1
Introduction to Recommendation Systems
  • Presented by HongBo Deng
  • Nov 14, 2006

Based on the slides by Anand Rajaraman and
Jeffrey D. Ullman (Stanford)
2
Netflix Prize - $1,000,000 Prize
Netflix recently announced the Netflix Prize, in
which they will award 1 million dollars for an
algorithm that can outperform their
recommendation approach, Cinematch, by 10%.
3
Outline
  • What are recommendation systems?
  • Three recommendation approaches
    • Content-based
    • Collaborative
    • Hybrid approach
  • Conclusions
  • Review of my previous work

4
What are recommendation systems?
Recommendation systems are programs that attempt
to predict items that a user may be interested in.
Items: products, web sites, blogs, news items, ...
5
Recommendation Types
  • Editorial
  • Simple aggregates
    • Top 10, Most Popular, Recent Uploads
  • Tailored to individual users
    • Amazon, Netflix, ...
    • Books, CDs, other products at amazon.com
    • Movies by Netflix, MovieLens

6
Formal Model
  • $C$ = set of customers
  • $S$ = set of items, e.g. books, movies
  • The item space $S$ and the user space $C$ can
    both be very large.
  • Utility function $u : C \times S \to R$
    • $R$ = set of ratings
    • $R$ is a totally ordered set
    • e.g., 0-5 stars, or a real number in $[0,1]$

7
Utility Matrix
[Table: rows are the customers Alice, Bob, Carol, David; columns are the items King Kong, LOTR, Matrix, Nacho Libre; cell values not recoverable]
8
Recommendation Process
  • Collect known ratings for the utility matrix
  • Extrapolate unknown ratings from known ratings
    • Estimate ratings for the items that a user has
      not yet seen
  • Recommend to the user the items with the highest
    estimated ratings
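
The three steps above can be sketched in a few lines of Python. This is only an illustration: the ratings are made-up values, the item and customer names are borrowed from the utility matrix slide, and `item_mean` is a placeholder estimator (an assumption, not a method from the slides) standing in for whichever extrapolation approach is chosen.

```python
from typing import Callable, Dict, List, Tuple

# Sparse utility matrix: ratings[user][item] = known rating.
Ratings = Dict[str, Dict[str, float]]


def item_mean(ratings: Ratings, user: str, item: str) -> float:
    """Placeholder estimator (assumption): mean of the item's known ratings."""
    known = [row[item] for row in ratings.values() if item in row]
    return sum(known) / len(known) if known else 0.0


def recommend(
    ratings: Ratings,
    user: str,
    items: List[str],
    estimate: Callable[[Ratings, str, str], float],
    n: int = 2,
) -> List[Tuple[str, float]]:
    """Estimate ratings for unseen items and return the n highest."""
    seen = ratings.get(user, {})
    scored = [(s, estimate(ratings, user, s)) for s in items if s not in seen]
    # Highest estimate first; ties broken by item name for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]


# Illustrative ratings only.
ratings = {
    "Alice": {"King Kong": 4.0, "Matrix": 5.0},
    "Bob": {"King Kong": 5.0, "LOTR": 3.0},
    "Carol": {"LOTR": 2.0, "Nacho Libre": 4.0},
}
items = ["King Kong", "LOTR", "Matrix", "Nacho Libre"]
top = recommend(ratings, "Alice", items, item_mean)
```

Passing the estimator as a parameter keeps the collect/extrapolate/recommend loop independent of the approach used to fill in the unknown ratings.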

9
Collecting Ratings
  • Explicit data collection
    • Ask people to rate items
    • Doesn't work well in practice: people can't be
      bothered
  • Implicit data collection
    • Learn ratings from user actions
    • e.g., purchase implies high rating
    • What about low ratings?

10
Extrapolating Utilities
  • Key problem: matrix U is sparse
    • most people have not rated most items
  • Three approaches
    • Content-based recommendation
    • Collaborative recommendation
    • Hybrid recommendation

11
Content-based recommendations
  • Main idea: recommend to customer c items similar
    to previous items rated highly by c
  • Movie recommendations
    • recommend movies with the same actor(s),
      director, genre, ...
  • Websites, blogs, news
    • recommend other sites with similar content

12
Plan of action
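[Diagram: from the items a user likes (e.g., red items, circles, triangles), build item profiles and a user profile; match the user profile against item profiles to recommend new items]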
13
Item Profiles
  • For each item, create an item profile
  • Profile is a set of features
    • movies: author, title, actor, director, ...
    • text: set of important words in the document
  • How to pick important words?
    • Usual heuristic is TF.IDF (Term Frequency times
      Inverse Document Frequency)

14
TF.IDF
  • $f_{ij}$ = frequency of term $t_i$ in document $d_j$
  • $n_i$ = number of docs that mention term $t_i$
  • $N$ = total number of docs
  • TF.IDF score: $w_{ij} = TF_{ij} \times IDF_i$
  • Doc profile = set of words with highest TF.IDF
    scores, together with their scores
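
A minimal sketch of building document profiles from TF.IDF scores. The transcript does not preserve the definitions of TF and IDF, so this assumes one common choice: TF_ij = f_ij / max_k f_kj and IDF_i = log2(N / n_i). The function name `tfidf_profiles`, the `top_k` parameter, and the toy documents are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_profiles(docs: Dict[str, List[str]], top_k: int = 3) -> Dict[str, Dict[str, float]]:
    """Return, per document, the top_k terms by TF.IDF with their scores."""
    n_docs = len(docs)  # N: total number of docs
    doc_freq = Counter()  # n_i: number of docs that mention term i
    for words in docs.values():
        doc_freq.update(set(words))

    profiles = {}
    for name, words in docs.items():
        counts = Counter(words)  # f_ij: frequency of term i in doc j
        if not counts:
            profiles[name] = {}
            continue
        max_f = max(counts.values())
        scores = {
            term: (f / max_f) * math.log2(n_docs / doc_freq[term])
            for term, f in counts.items()
        }
        # Highest score first; ties broken alphabetically.
        best = sorted(scores.items(), key=lambda p: (-p[1], p[0]))[:top_k]
        profiles[name] = dict(best)
    return profiles


# Toy documents, for illustration only.
docs = {
    "d1": "the hobbit and the ring".split(),
    "d2": "the matrix and the machines".split(),
    "d3": "the ring of the king".split(),
}
profiles = tfidf_profiles(docs, top_k=2)
```

Under this weighting, a word that appears in every document (here "the") gets IDF 0 and so never enters a profile, which is the point of the heuristic.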

15
User profiles and prediction
  • User profile possibilities
    • Weighted average of rated item profiles
    • Variation: weight by difference from average
      rating for item
  • Traditional heuristic
    • Given user profile $c$ and item profile $s$, estimate
      $u(c,s) = \cos(c,s) = \frac{c \cdot s}{\|c\| \, \|s\|}$
  • Need efficient method to find items with high
    utility
  • E.g.
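
The profile-and-cosine heuristic on this slide can be sketched as follows. This is an illustration under stated assumptions: profiles are sparse feature-to-weight dictionaries, the user profile is the rating-weighted average of the rated items' profiles, and the feature names and ratings are invented. It scores every candidate item by brute force and does not address the efficiency concern raised above.

```python
import math
from typing import Dict

Profile = Dict[str, float]  # feature -> weight


def cosine(c: Profile, s: Profile) -> float:
    """cos(c, s) = c.s / (|c| |s|); 0.0 if either vector is all zeros."""
    dot = sum(w * s.get(f, 0.0) for f, w in c.items())
    norm_c = math.sqrt(sum(w * w for w in c.values()))
    norm_s = math.sqrt(sum(w * w for w in s.values()))
    if norm_c == 0.0 or norm_s == 0.0:
        return 0.0
    return dot / (norm_c * norm_s)


def user_profile(rated: Dict[str, float], item_profiles: Dict[str, Profile]) -> Profile:
    """Rating-weighted average of the profiles of the items the user rated."""
    total = sum(rated.values())
    profile: Profile = {}
    if total == 0:
        return profile
    for item, rating in rated.items():
        for feature, weight in item_profiles[item].items():
            profile[feature] = profile.get(feature, 0.0) + rating * weight / total
    return profile


# Hypothetical item profiles and ratings.
item_profiles = {
    "A": {"action": 1.0, "scifi": 1.0},
    "B": {"comedy": 1.0},
    "C": {"scifi": 1.0},
    "D": {"comedy": 1.0, "romance": 1.0},
}
rated = {"A": 5.0, "B": 1.0}

c = user_profile(rated, item_profiles)
scores = {s: cosine(c, item_profiles[s]) for s in ("C", "D")}
```

Here the user rated the sci-fi item highly, so the unseen sci-fi item C scores well above the romantic comedy D.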

16
Model-based approaches
  • For each user, learn a classifier that classifies
    items into rating classes
    • liked by user and not liked by user
    • e.g., Bayesian, regression, SVM
  • Apply classifier to each item to find
    recommendation candidates
  • Problem: scalability
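
As one possible instance of the per-user classifier idea, here is a small naive Bayes sketch over binary item features with Laplace smoothing. The slides only name "Bayesian" as an option; the specific model, the function names, and the training data below are assumptions made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

Model = Dict[bool, Tuple[float, Dict[str, float]]]


def train_naive_bayes(examples: List[Tuple[Set[str], bool]], features: List[str]) -> Model:
    """examples: (item feature set, liked?) pairs for ONE user."""
    model: Model = {}
    for label in (True, False):
        rows = [fs for fs, liked in examples if liked == label]
        prior = (len(rows) + 1) / (len(examples) + 2)
        # Laplace-smoothed P(feature present | label).
        probs = {
            f: (sum(f in fs for fs in rows) + 1) / (len(rows) + 2)
            for f in features
        }
        model[label] = (prior, probs)
    return model


def log_score(model: Model, label: bool, item_features: Set[str]) -> float:
    prior, probs = model[label]
    total = math.log(prior)
    for f, p in probs.items():
        total += math.log(p if f in item_features else 1.0 - p)
    return total


def predicts_liked(model: Model, item_features: Set[str]) -> bool:
    """Classify an item into 'liked' vs 'not liked' for this user."""
    return log_score(model, True, item_features) > log_score(model, False, item_features)


# Hypothetical training data for a single user.
features = ["action", "comedy", "scifi"]
examples = [
    ({"action", "scifi"}, True),
    ({"scifi"}, True),
    ({"comedy"}, False),
    ({"comedy", "action"}, False),
]
model = train_naive_bayes(examples, features)
```

The scalability problem is visible in the structure: one such model has to be trained per user and applied to every candidate item.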

17
Limitations of content-based approach
  • Finding the appropriate features
    • e.g., images, movies, music
  • Overspecialization
    • Never recommends items outside the user's content
      profile
    • People might have multiple interests
  • Recommendations for new users
    • How to build a profile?
    • A new user with very few ratings cannot get
      accurate recommendations.

18
Collaborative Filtering
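  • Consider user c
  • Find set D of other users whose ratings are
    similar to c's ratings
  • Estimate user c's ratings based on the ratings of
    users in D

[Diagram: user c's ratings are matched to a set D of other users with similar ratings, whose ratings are then used to estimate c's unknown ratings]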
19
Similar users
  • Let $r_x$ be the vector of user x's ratings
  • Cosine similarity measure
    • $sim(x,y) = \cos(r_x, r_y)$
  • Pearson correlation coefficient
    • $S_{xy}$ = items rated by both users x and y
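
Both similarity measures can be sketched over sparse rating dictionaries. The transcript does not preserve the Pearson formula, so this uses the standard definition restricted to the co-rated items S_xy, with each user's mean taken over S_xy (an assumption; some variants use each user's overall mean). The ratings are illustrative.

```python
import math
from typing import Dict

RatingVector = Dict[str, float]  # item -> rating


def cosine_sim(rx: RatingVector, ry: RatingVector) -> float:
    """Cosine of the two rating vectors, treating unrated items as 0."""
    dot = sum(r * ry.get(item, 0.0) for item, r in rx.items())
    norm_x = math.sqrt(sum(r * r for r in rx.values()))
    norm_y = math.sqrt(sum(r * r for r in ry.values()))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


def pearson(rx: RatingVector, ry: RatingVector) -> float:
    """Pearson correlation over S_xy, the items rated by both users."""
    common = sorted(set(rx) & set(ry))
    if len(common) < 2:
        return 0.0
    mean_x = sum(rx[s] for s in common) / len(common)
    mean_y = sum(ry[s] for s in common) / len(common)
    num = sum((rx[s] - mean_x) * (ry[s] - mean_y) for s in common)
    den = math.sqrt(sum((rx[s] - mean_x) ** 2 for s in common)) * math.sqrt(
        sum((ry[s] - mean_y) ** 2 for s in common)
    )
    return num / den if den else 0.0


# Illustrative ratings only.
alice = {"King Kong": 4.0, "LOTR": 5.0, "Matrix": 1.0}
bob = {"King Kong": 5.0, "LOTR": 5.0, "Matrix": 2.0, "Nacho Libre": 4.0}
carol = {"King Kong": 1.0, "LOTR": 2.0, "Matrix": 5.0}
```

With these values, Alice and Bob agree (strongly positive correlation) while Alice and Carol have opposite tastes (negative correlation), a distinction that raw cosine over non-negative ratings cannot express.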

20
Rating predictions
  • Let D be the set of k users that are the most
    similar to c and who have rated item s
  • Possibilities for prediction function (item s)
    • $r_{cs} = \frac{1}{k} \sum_{d \in D} r_{ds}$
    • $r_{cs} = \frac{\sum_{d \in D} sim(c,d) \, r_{ds}}{\sum_{d \in D} sim(c,d)}$
    • Other options?
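
The two prediction functions above, sketched in Python. The similarities are assumed to be precomputed and non-negative (a negative or zero total would make the weighted form ill-defined, so the sketch falls back to the plain mean in that case). Function names and data are hypothetical.

```python
from typing import Dict, List, Tuple


def top_k_neighbors(
    sims: Dict[str, float],
    ratings: Dict[str, Dict[str, float]],
    item: str,
    k: int,
) -> List[Tuple[float, float]]:
    """D: the k users most similar to c who have rated the item.

    Returns (sim(c, d), r_ds) pairs, most similar first.
    """
    candidates = [
        (sims[d], ratings[d][item]) for d in sims if item in ratings.get(d, {})
    ]
    candidates.sort(key=lambda pair: -pair[0])
    return candidates[:k]


def predict_mean(neighbors: List[Tuple[float, float]]) -> float:
    """r_cs = (1/k) * sum of r_ds over d in D."""
    return sum(r for _, r in neighbors) / len(neighbors)


def predict_weighted(neighbors: List[Tuple[float, float]]) -> float:
    """r_cs = sum(sim(c,d) * r_ds) / sum(sim(c,d)) over d in D."""
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim <= 0:
        return predict_mean(neighbors)
    return sum(sim * r for sim, r in neighbors) / total_sim


# Illustrative similarities of other users to user c, and their ratings.
sims = {"Bob": 0.9, "Carol": 0.3, "David": 0.6}
ratings = {
    "Bob": {"Matrix": 5.0},
    "Carol": {"Matrix": 1.0},
    "David": {"Matrix": 4.0},
}
neighbors = top_k_neighbors(sims, ratings, "Matrix", k=2)
```

With k = 2 the neighborhood is Bob and David; the weighted form pulls the estimate toward Bob's rating because he is the more similar user.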

21
Complexity
  • Expensive step is finding the k most similar
    customers
    • $O(|U|)$
  • Too expensive to do at runtime
    • Need to pre-compute
  • Naïve precomputation takes time $O(N|U|)$
    • Simple trick gives some speedup
  • Can use clustering or partitioning as alternatives,
    but quality degrades

22
Item-Item Collaborative Filtering
  • So far: user-user collaborative filtering
  • Another view
    • For item s, find other similar items
    • Estimate rating for item based on ratings for
      similar items
    • Can use same similarity metrics and prediction
      functions as in user-user model
  • In practice, it has been observed that item-item
    often works better than user-user
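
A minimal item-item sketch, assuming cosine similarity between item columns of the utility matrix and a similarity-weighted average of the user's own ratings on the k most similar items. The function names and the ratings are illustrative assumptions.

```python
import math
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # ratings[user][item]


def item_vector(ratings: Ratings, item: str) -> Dict[str, float]:
    """Column of the utility matrix: user -> rating for this item."""
    return {u: row[item] for u, row in ratings.items() if item in row}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_item_item(ratings: Ratings, user: str, item: str, k: int = 2) -> Optional[float]:
    """Estimate the user's rating for item from the user's ratings of similar items."""
    target = item_vector(ratings, item)
    scored = []
    for other, rating in ratings[user].items():
        if other == item:
            continue
        sim = cosine(target, item_vector(ratings, other))
        if sim > 0:
            scored.append((sim, rating))
    scored.sort(key=lambda pair: -pair[0])
    top = scored[:k]
    if not top:
        return None  # no similar item rated by this user
    return sum(sim * r for sim, r in top) / sum(sim for sim, _ in top)


# Illustrative ratings only.
ratings = {
    "Alice": {"King Kong": 4.0, "LOTR": 5.0},
    "Bob": {"King Kong": 5.0, "LOTR": 4.0, "Matrix": 5.0},
    "Carol": {"King Kong": 1.0, "Nacho Libre": 5.0, "Matrix": 2.0},
    "David": {"LOTR": 4.0, "Matrix": 4.0, "Nacho Libre": 1.0},
}
pred = predict_item_item(ratings, "Alice", "Matrix")
```

The estimate for Alice on "Matrix" is a weighted average of her own ratings for "King Kong" and "LOTR", so it necessarily falls between those two ratings.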

23
Pros and cons of collaborative filtering
  • Pro: works for any kind of item
    • No feature selection needed
  • Con: new user problem
    • The same problem as with content-based systems
  • Con: new item problem
  • Con: sparsity of rating matrix

24
Hybrid Methods
  • Implement two separate recommenders and combine
    their predictions
  • Add content-based methods to collaborative
    approach
    • item profiles for new item problem
    • deal with sparsity-related problems
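
One simple way to combine two separate recommenders is a linear blend with a fallback, sketched below. The blend weight `alpha` and the convention that `None` means "this recommender cannot predict" are assumptions for illustration; the slides do not specify how the predictions are combined.

```python
from typing import Optional


def hybrid_predict(
    cf_pred: Optional[float],
    content_pred: Optional[float],
    alpha: float = 0.5,
) -> Optional[float]:
    """Blend collaborative and content-based predictions.

    Falls back to whichever prediction exists, e.g. content-based only
    for a new item that has no ratings yet.
    """
    if cf_pred is None:
        return content_pred
    if content_pred is None:
        return cf_pred
    return alpha * cf_pred + (1.0 - alpha) * content_pred


blended = hybrid_predict(4.0, 2.0, alpha=0.75)
new_item = hybrid_predict(None, 3.0)
```

The fallback branch is what addresses the new item problem: an item with no ratings still gets a prediction from its item profile.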

25
Evaluating Recommendations
  • Precision
    • Accuracy of predictions
    • Compare predictions with known ratings, e.g.
      root-mean-square error (RMSE)
  • Receiver operating characteristic (ROC)
    • Tradeoff curve between false positives and false
      negatives
  • Recommendation quality
    • Top-n measures (e.g., Breese score)
  • Item-set coverage
    • Number of items/users for which the system can
      make predictions
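
Two of these measures are easy to sketch: RMSE against held-out known ratings, and coverage as the fraction of requested (user, item) pairs for which a prediction exists. The data and the use of `None` for "no prediction" are illustrative assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (user, item)


def rmse(predicted: Dict[Pair, float], actual: Dict[Pair, float]) -> float:
    """Root-mean-square error over pairs present in both dictionaries."""
    keys = predicted.keys() & actual.keys()
    if not keys:
        raise ValueError("no overlapping (user, item) pairs to compare")
    return math.sqrt(sum((predicted[k] - actual[k]) ** 2 for k in keys) / len(keys))


def coverage(predicted: Dict[Pair, Optional[float]], requested: List[Pair]) -> float:
    """Fraction of requested pairs for which the system made a prediction."""
    made = sum(1 for pair in requested if predicted.get(pair) is not None)
    return made / len(requested)


# Illustrative held-out ratings and predictions.
actual = {("Alice", "Matrix"): 5.0, ("Bob", "LOTR"): 3.0, ("Carol", "King Kong"): 1.0}
predicted = {("Alice", "Matrix"): 4.0, ("Bob", "LOTR"): 3.0, ("Carol", "King Kong"): 3.0}

error = rmse(predicted, actual)
requested = list(actual) + [("David", "Nacho Libre")]
cov = coverage(predicted, requested)
```

Because the errors are squared before averaging, the single prediction that is off by two stars dominates the RMSE here.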

26
Conclusions
  • Content-based
    • The user is recommended items similar to the
      ones the user preferred in the past
  • Collaborative
    • The user is recommended items that people
      with similar tastes and preferences liked in the
      past
  • Hybrid
    • Combine collaborative and content-based methods

27
Review of my previous work
28
Facial Expression Recognition
Preprocessing procedure
Rotate to line up eye coordinates
Locate and crop face region
Geometric normalization
Histogram equalization
Gabor feature extraction
Normalize
Train phase
Templates
PCA+LDA transformation matrix
Test phase
Distance classifier
29
Image Stitching
Feature point extraction
Correlation matching
RANSAC to eliminate false match points
Demo
Build the model: perspective model
Image alignment
Image stitching
30
Any questions or suggestions?
  • Thank you