Recommendation Systems - PowerPoint PPT Presentation

Provided by: ralphmc

Transcript and Presenter's Notes

1
Recommendation Systems

Jeremy Bonjio
Eric McGregor

2
Overview

Recommender Systems
The Netflix Prize
Current Netflix Prize Leaders
Collaborative Filtering
The Bellkor CF Algorithm
Our Semi-Clone Implementation
References

3
Recommender Systems

Definition
Recommender systems (RS) analyze patterns of
user interest in items to provide personalized
recommendations of items that will suit a user's
taste [1].

4
Benefits of RS

Value for Businesses
may decrease use of system resources
may improve inventory management
Value for Customers
may decrease time required to make a selection

5
Common Types of RS

Content-Based Approach
creates a profile for each product and user
requires gathering external information
Collaborative Filtering (CF)
relies only on past user behavior
e.g. previous transactions or user ratings
does not require profiles on items or users
analyzes relationships between users and
similarities among products

6
Online RS

Amazon.com's Item Recommender
Yahoo! Music's Online Music Recommender
Netflix's Cinematch Movie Recommender

7
The Netflix Prize Contest Goal

The Netflix Prize seeks to substantially improve
the accuracy of predictions about how much
someone is going to love a movie based on their
(and others' past) movie preferences [2].

8
Netflix Prize Contest

Input
a set of user ratings for a set of movies
a.k.a. the training data
a set of (user-id, movie-id) pairs which are not
given in the training data
a.k.a. the qualifying test
Goal
to determine, for each pair (u, m) in the
qualifying test, the rating that u actually gave
to m.

9
Judging the Netflix Prize Results

RMSE (root mean squared error) is computed as the
square root of the averaged squared difference
between each prediction and the actual rating
To win the Grand Prize, predictions must achieve an
RMSE 10% better (lower) than Cinematch's.
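
The RMSE definition above can be sketched in a few lines. This is a minimal illustration, not the contest's scoring code: the function names and the toy ratings are assumptions made for the example.

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_over_baseline(candidate_rmse, baseline_rmse):
    """Fractional RMSE reduction relative to a baseline (0.10 means 10% better)."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

# Toy data, made up for illustration (not Netflix data).
actual = [4, 3, 5, 1]
predicted = [3.5, 3.0, 4.0, 2.0]
score = rmse(predicted, actual)  # sqrt((0.25 + 0 + 1 + 1) / 4) = 0.75
```

Lower RMSE is better, so a 10% improvement means the candidate's RMSE is at most 0.9 times the baseline's.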

10
Training Data Set

Consists of more than 100 million ratings (1-5)
from over 480 thousand randomly-chosen, anonymous
customers
on nearly 18 thousand movie titles
The date of each rating is provided
The title of each movie is provided
The year of release for each movie is provided
No other customer or movie information is provided

11
Qualifying Test Set

Contains over 2.8 million customer/movie id pairs
with rating dates
but with the ratings withheld.
These pairs were selected from the most recent
ratings from a subset of the same customers in
the training data set, over a subset of the same
movies.

12
Current Netflix Prize Leader

Team Bellkor
consists of Bob Bell and Yehuda Koren of AT&T
Research
in the Statistics and the Information
Visualization departments [3].
Collaborative Filtering Approach [1]

13
Collaborative Filtering

Can be viewed abstractly as
Missing Value Estimation Problem
Note: the more missing values, the harder the
problem
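
To get a feel for how many values are missing, the rough figures quoted on the Training Data Set slide can be plugged into a quick calculation. The counts below are the rounded bounds from that slide ("more than 100 million", "over 480 thousand", "nearly 18 thousand"), so the result is only an approximation.

```python
# Rounded figures from the Training Data Set slide, not exact counts.
num_ratings = 100_000_000
num_users = 480_000
num_movies = 18_000

possible_cells = num_users * num_movies   # every (user, movie) pair
density = num_ratings / possible_cells    # fraction of cells with a rating
missing_fraction = 1.0 - density

print(f"filled: {density:.2%}, missing: {missing_fraction:.2%}")
```

With these rounded inputs, only about 1% of the user/movie matrix is filled in; the rest must be estimated.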

14
Forms of Collaborative Filtering

neighborhood approach
a.k.a. k-Nearest Neighbors or kNN
most common
used by Bellkor
factorization approach

15
kNN Methods of CF

To predict the rating of item i by user u:
Identify a set of items (neighbors) that tend to be
rated similarly to i
each neighbor must have been rated by u
Assign each neighbor a weight
Predict the rating: $r_{ui} = \sum_{j \in N(i,u)} w_{ij} r_{uj}$
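
The prediction step is a weighted sum over the neighbors that user u has rated. A minimal sketch, assuming ratings and weights are held in plain dictionaries (the names and numbers are hypothetical):

```python
def predict_rating(user_ratings, weights):
    """Predict r_ui as the sum over neighbors j of w_ij * r_uj.

    user_ratings: dict item -> rating given by user u (only rated items appear)
    weights: dict neighbor item j -> interpolation weight w_ij for target item i
    Neighbors the user has not rated are skipped, since each neighbor
    must have been rated by u.
    """
    return sum(w * user_ratings[j] for j, w in weights.items() if j in user_ratings)

# Hypothetical ratings and weights, chosen only for illustration.
ratings_by_u = {"movie_a": 4.0, "movie_b": 2.0, "movie_c": 5.0}
weights_for_i = {"movie_a": 0.5, "movie_b": 0.25, "movie_c": 0.25}
predicted = predict_rating(ratings_by_u, weights_for_i)  # 2.0 + 0.5 + 1.25 = 3.75
```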

16
Major Components of kNN Method

Data Normalization
simple solution: adjust for varying mean ratings
across users (and/or items)
more complex: removing effects
Neighborhood Selection
Pearson Correlation Coefficients
cosine similarity
Determination of Interpolation Weights for Each
of the k Neighbors
simple solution: use the Pearson Correlation
Coefficient
more complex: the Bellkor method
Calculate predictions
for user u and item i, $r_{ui} = \sum_{j \in N(i,u)} w_{ij} r_{uj}$
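
The simple variants of the first two components (mean-centering per user, then neighbor selection by Pearson correlation) might look like the sketch below. The nested-dictionary data layout and all function names are assumptions made for this example.

```python
import math

def center_by_user_mean(ratings):
    """Subtract each user's mean rating. ratings: dict user -> {item: rating}."""
    centered = {}
    for user, items in ratings.items():
        mean = sum(items.values()) / len(items)
        centered[user] = {item: r - mean for item, r in items.items()}
    return centered

def pearson(ratings, item_a, item_b):
    """Pearson correlation of two items over the users who rated both."""
    pairs = [(r[item_a], r[item_b]) for r in ratings.values()
             if item_a in r and item_b in r]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def top_k_neighbors(ratings, user, target, k):
    """The k items rated by `user` that correlate most strongly with `target`."""
    candidates = [j for j in ratings[user] if j != target]
    scored = sorted(((pearson(ratings, target, j), j) for j in candidates),
                    reverse=True)
    return [(j, s) for s, j in scored[:k]]

# Toy ratings, made up for illustration; u4 has not rated item "A".
toy = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
    "u4": {"B": 5, "C": 1},
}
neighbors = top_k_neighbors(toy, "u4", "A", k=1)
```

In this toy data, item B is rated much like A while C is rated in the opposite direction, so B is chosen as the single nearest neighbor of A for user u4.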

17
Bellkor Data Normalization

Complex Normalization
removal of systematic effects
e.g. systematic tendencies for some users to give
higher ratings than others
removal of characteristic effects
e.g. ratings for some movies may fall over time,
which may not be evident in the set of ratings
removal of variable effects
e.g. if the number of ratings for a movie is low,
the average rating may not be a reliable
measurement
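
One common way to handle the last point is shrinkage: pull a movie's average toward the global average, more strongly when the movie has few ratings. The sketch below illustrates that idea only. It is not necessarily the exact Bellkor procedure, and `alpha` is a hypothetical tuning constant, not a value from the slides.

```python
def shrunk_mean(item_ratings, global_mean, alpha=25.0):
    """Blend an item's mean rating with the global mean.

    alpha acts like a count of pseudo-ratings at the global mean; it is a
    hypothetical shrinkage strength chosen for this example.
    """
    n = len(item_ratings)
    if n == 0:
        return global_mean
    item_mean = sum(item_ratings) / n
    return (n * item_mean + alpha * global_mean) / (n + alpha)

# Made-up numbers: two 5-star ratings versus a thousand of them.
few = shrunk_mean([5, 5], global_mean=3.6)
many = shrunk_mean([5] * 1000, global_mean=3.6)
```

With only two ratings the estimate stays close to the global mean; with a thousand ratings it moves close to the movie's own average.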

18
Bellkor Determination of Interpolation Weights

More advanced than using Pearson Correlation
Coefficients
Accounts for interactions among neighbors
Avoids triple counting
e.g. Neighbor set containing trilogy
Avoids over-fitting
e.g. when the user has not watched any movies
similar to the item
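
The idea of deriving the weights jointly, so that interactions among neighbors are accounted for, can be illustrated with a small least-squares problem: choose the weights that best reproduce the target item's ratings from the neighbors' ratings across other users. This is a simplified sketch in the spirit of the Bell and Koren paper listed in the references, not their implementation; it assumes every user in the sample rated all neighbors, uses already-normalized ratings, and adds a small hypothetical `ridge` term to keep the system solvable.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x

def joint_weights(neighbor_columns, target_column, ridge=1e-3):
    """Weights w minimizing sum over users v of (r_vi - sum_j w_j r_vj)^2.

    neighbor_columns: one list per neighbor j with ratings r_vj by the same users
    target_column: ratings r_vi those users gave the target item i
    ridge: small hypothetical regularizer added to the diagonal
    """
    k = len(neighbor_columns)
    a = [[sum(x * y for x, y in zip(neighbor_columns[j], neighbor_columns[m]))
          + (ridge if j == m else 0.0) for m in range(k)] for j in range(k)]
    b = [sum(x * y for x, y in zip(neighbor_columns[j], target_column))
         for j in range(k)]
    return solve_linear_system(a, b)

# Toy "trilogy": three neighbors with identical, made-up rating columns.
film = [1.0, -1.0, 2.0, -2.0]
weights = joint_weights([film, film, film], target_column=film)
```

Scored one at a time, each of the three identical neighbors would correlate perfectly with the target and be counted in full. Solved jointly, the three weights share the credit (each close to 1/3), which is the triple-counting behavior the slide refers to.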

19
Our Semi-Clone Implementation

Normalization
simply adjust for varying mean ratings across
users
Neighborhood Selection
Pearson Correlation Coefficients
Determination of Interpolation Weights
clone Bellkor process

20
References

[1] R. Bell and Y. Koren, AT&T Research, Scalable
Collaborative Filtering with Jointly Derived
Neighborhood Interpolation Weights
[2] http://www.netflixprize.com
[3] http://www.research.att.com/volinsky/netflix
