Recommendation Systems - PowerPoint PPT Presentation

Provided by: ralphmc

1
Recommendation Systems
  • Jeremy Bonjio
  • Eric McGregor

2
Overview
  • Recommender Systems
  • The Netflix Prize
  • Current Netflix Prize Leaders
  • Collaborative Filtering
  • The Bellkor CF Algorithm
  • Our Semi-Clone Implementation
  • References

3
Recommender Systems
  • Definition
  • Recommender systems (RS) analyze patterns of
    user interest in items to provide personalized
    recommendations of items that will suit a user's
    taste. [1]

4
Benefits of RS
  • Value for Businesses
  • may decrease use of system resources
  • may improve inventory management
  • Value for Customers
  • may decrease time required to make a selection

5
Common Types of RS
  • Content Based Approach
  • create profiles of each product and user
  • requires gathering of external information
  • Collaborative Filtering (CF)
  • relies only on past user behavior
  • e.g. previous transactions or user ratings
  • does not require profiles on items or users
  • analyzes relationships between users and
    similarities among products

6
Online RS
  • Amazon.com's Item Recommender
  • Yahoo! Music's Online Music Recommender
  • Netflix's Cinematch Movie Recommender

7
The Netflix Prize Contest Goal
  • The Netflix Prize seeks to substantially improve
    the accuracy of predictions about how much
    someone is going to love a movie based on their
    (and others' past) movie preferences. [2]

8
Netflix Prize Contest
  • Input
  • a set of user ratings for a set of movies
  • a.k.a. the training data
  • a set of (user-id, movie-id) pairs which are not
    given in the training data
  • a.k.a. the qualifying test
  • Goal
  • to determine for each pair (u, m) in the
    qualifying test, the rating that u actually gave
    for m.

9
Judging the Netflix Prize Results
  • RMSE (root mean squared error) is computed as the
    square root of the averaged squared difference
    between each prediction and the actual rating
  • Predictions must improve on Cinematch's RMSE by 10%
    in order to win the Grand Prize.
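
As a minimal sketch of the judging rule above, the following Python computes RMSE for a list of predictions and checks whether one set of predictions improves on a baseline by 10%. The function names and the sample ratings are illustrative assumptions, not contest data or the contest's scoring code.

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error between two equal-length rating lists."""
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared) / len(squared))

def improvement(candidate_rmse, baseline_rmse):
    """Fractional RMSE reduction relative to a baseline (0.10 means 10%)."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

# Made-up ratings, for illustration only.
actual = [4, 3, 5, 1]
baseline = [3.0, 3.0, 3.0, 3.0]    # a constant predictor stands in for the baseline
candidate = [3.5, 3.0, 4.5, 2.0]

base_err = rmse(baseline, actual)
cand_err = rmse(candidate, actual)
print(base_err, cand_err, improvement(cand_err, base_err) >= 0.10)
```

Because RMSE squares each error before averaging, a few large misses cost more than many small ones.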

10
Training Data Set
  • Consists of more than 100 million ratings (1-5)
  • from over 480 thousand randomly-chosen, anonymous
    customers
  • on nearly 18 thousand movie titles
  • The date of each rating is provided
  • The title of each movie is provided
  • The year of release for each movie is provided
  • No other customer or movie information is provided

11
Qualifying Test Set
  • Contains over 2.8 million customer/movie id pairs
  • with rating dates
  • but with the ratings withheld.
  • These pairs were selected from the most recent
    ratings from a subset of the same customers in
    the training data set, over a subset of the same
    movies.

12
Current Netflix Prize Leader
  • Team Bellkor
  • consists of Bob Bell and Yehuda Koren of AT&T
    Research
  • in the Statistics and the Information
    Visualization departments. [3]
  • Collaborative Filtering Approach [1]

13
Collaborative Filtering
  • Can be viewed abstractly as
  • Missing Value Estimation Problem
  • note: the more missing values, the harder the
    problem
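
To give a rough sense of how many values are missing, the sketch below estimates the fill rate of the user-by-movie rating matrix from the rounded training-set figures (about 100 million ratings, 480 thousand customers, 18 thousand movies). The variable names are illustrative, and the exact counts differ slightly from these rounded values.

```python
# Rounded figures from the training data description; exact counts differ slightly.
num_ratings = 100_000_000
num_users = 480_000
num_movies = 18_000

cells = num_users * num_movies          # every possible (user, movie) pair
observed = num_ratings / cells          # fraction of pairs with a known rating
print(f"observed: {observed:.2%}, missing: {1 - observed:.2%}")
```

Under these rounded figures only about 1% of the matrix is observed, so nearly all entries must be estimated.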

14
Forms of Collaborative Filtering
  • neighborhood approach
  • a.k.a. k-Nearest Neighbors or kNN
  • most common
  • used by Bellkor
  • factorization approach

15
kNN Methods of CF
  • To predict rating of item i by user u
  • Identify set of items (neighbors) that tend to be
    rated similarly to i
  • each neighbor must have been rated by u
  • Assign each neighbor a weight
  • Predict rating $r_{ui} = \sum_{j \in N(i,u)} w_{ij} r_{uj}$
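
The steps above can be sketched in Python as follows. This is a simplified illustration, not the Bellkor method: the function name, the dictionary layout, and the choice to rescale similarities into weights that sum to 1 are all assumptions made for the example.

```python
def predict_rating(user_ratings, similarity_to_i, k=2):
    """Predict r_ui as a weighted sum over the k items most similar to i
    that user u has rated.

    user_ratings:    {item: rating} for user u
    similarity_to_i: {item: similarity between that item and item i}
    """
    # Candidates are items rated by u with a positive similarity to i.
    candidates = [j for j in user_ratings if similarity_to_i.get(j, 0.0) > 0.0]
    neighbors = sorted(candidates, key=lambda j: similarity_to_i[j], reverse=True)[:k]
    if not neighbors:
        return None  # no usable neighbors; a caller would fall back to a default

    # Assumption: weights w_ij are the similarities rescaled to sum to 1.
    total = sum(similarity_to_i[j] for j in neighbors)
    weights = {j: similarity_to_i[j] / total for j in neighbors}
    return sum(weights[j] * user_ratings[j] for j in neighbors)

# Made-up data: user u rated A, B and C; item D is similar to i but unrated by u.
user_u = {"A": 5, "B": 3, "C": 1}
sim_to_i = {"A": 0.75, "B": 0.25, "C": 0.1, "D": 0.9}
print(predict_rating(user_u, sim_to_i, k=2))  # 4.5
```

Item D is ignored despite its high similarity because the user never rated it, which is the "each neighbor must have been rated by u" condition.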

16
Major Components of kNN Method
  • Data Normalization
  • simple solution: adjust for varying mean ratings
    across users (and/or items)
  • more complex: removing effects
  • Neighborhood Selection
  • Pearson Correlation Coefficients
  • cosine similarity
  • Determination of Interpolation Weights for Each
    of k-Neighbors
  • simple solution: use Pearson Correlation
    Coefficient
  • more complex: Bellkor method
  • Calculate predictions
  • for user u and item i, $r_{ui} = \sum_{j \in N(i,u)} w_{ij} r_{uj}$

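The two simplest components listed above, mean-centering per user and Pearson similarity between items, might look like this in Python. The function names and the nested-dictionary layout are assumptions for illustration; a real implementation over 100 million ratings would need a far more compact representation.

```python
import math

def center_by_user(ratings):
    """Subtract each user's mean rating. ratings: {user: {item: rating}}."""
    centered = {}
    for user, items in ratings.items():
        mean = sum(items.values()) / len(items)
        centered[user] = {item: r - mean for item, r in items.items()}
    return centered

def pearson(ratings, item_a, item_b):
    """Pearson correlation of two items over the users who rated both."""
    common = [u for u, items in ratings.items() if item_a in items and item_b in items]
    if len(common) < 2:
        return 0.0  # too little overlap to measure a correlation
    xs = [ratings[u][item_a] for u in common]
    ys = [ratings[u][item_b] for u in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)

# Made-up ratings: A and B move together, A and C move in opposite directions.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 3, "B": 2, "C": 5},
    "u3": {"A": 4, "B": 3},
}
print(pearson(ratings, "A", "B"), pearson(ratings, "A", "C"))
print(center_by_user(ratings)["u3"])
```

Items with the highest correlation to the target item, among those the user has rated, would then form the neighborhood.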
17
Bellkor Data Normalization
  • Complex Normalization
  • removal of systematic effects
  • e.g. systematic tendencies for some users to give
    higher ratings than others
  • removal of characteristic effects
  • e.g. ratings for some movies may fall over time,
    which may not be evident from the ratings alone
  • removal of variable effects
  • e.g. if the number of ratings for a movie is low
    the average rating may not necessarily be a good
    measurement
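
One common way to handle the last case, an unreliable average from too few ratings, is to shrink the item's mean toward the global mean. The slides do not give a formula, so the sketch below is an assumption: the function name, the pseudo-count `alpha`, and the sample numbers are all hypothetical.

```python
def shrunk_mean(item_ratings, global_mean, alpha=25.0):
    """Item mean pulled toward global_mean; fewer ratings means a stronger pull.

    alpha acts like a count of imaginary ratings placed at the global mean.
    """
    n = len(item_ratings)
    return (sum(item_ratings) + alpha * global_mean) / (n + alpha)

global_mean = 3.6            # hypothetical overall average rating
few = [5, 5]                 # two ratings, raw mean 5.0
many = [5] * 500             # 500 ratings, raw mean 5.0
print(shrunk_mean(few, global_mean), shrunk_mean(many, global_mean))
```

Both movies have a raw mean of 5.0, but the one with only two ratings ends up close to the global mean while the one with 500 ratings stays near 5.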

18
Bellkor Determination of Interpolation Weights
  • More advanced than using Pearson Correlation
    Coefficients
  • Accounts for interactions among neighbors
  • Avoids triple counting
  • e.g. Neighbor set containing trilogy
  • Avoids over-fitting
  • e.g. The user has not watched any movies similar
    to the item
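
To illustrate why deriving the weights jointly avoids triple counting, the sketch below fits all neighbor weights at once by ridge-regularized least squares, so three near-identical neighbors share one unit of weight instead of each receiving full weight. This is not the Bellkor procedure itself, only a simplified stand-in: the ridge term, the function names, and the data are assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def joint_weights(target, neighbors, ridge=0.1):
    """Weights w minimizing sum_v (target[v] - sum_j w[j] * neighbors[j][v])^2
    plus ridge * sum_j w[j]^2 (the ridge term is an assumption, added so the
    system stays solvable when neighbors are duplicates of each other)."""
    k = len(neighbors)
    a = [[sum(x * y for x, y in zip(neighbors[j], neighbors[l]))
          + (ridge if j == l else 0.0)
          for l in range(k)] for j in range(k)]
    b = [sum(x * y for x, y in zip(neighbors[j], target)) for j in range(k)]
    return solve(a, b)

# Made-up centered ratings from four users. The three neighbors stand in for a
# trilogy that every user rated exactly like the target item.
target = [1.0, -1.0, 2.0, -2.0]
trilogy = [target[:], target[:], target[:]]
weights = joint_weights(target, trilogy)
print(weights, sum(weights))
```

Each film in the trilogy has a Pearson correlation of 1 with the target, so correlation-based weights would count the same signal three times. The jointly fitted weights come out near one third each and sum to just under 1.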

19
Our Semi-Clone Implementation
  • Normalization
  • simply adjust for varying mean ratings across
    users
  • Neighborhood Selection
  • Pearson Correlation Coefficients
  • Determination of Interpolation Weights
  • clone Bellkor process

20
References
  • [1] R. Bell and Y. Koren, AT&T Research, "Scalable
    Collaborative Filtering with Jointly Derived
    Neighborhood Interpolation Weights"
  • [2] http://www.netflixprize.com
  • [3] http://www.research.att.com/volinsky/netflix