Title: Fast Maximum Margin Matrix Factorization for Collaborative Prediction

Slide 1: Fast Maximum Margin Matrix Factorization for Collaborative Prediction

Slide 2: Collaborative Prediction
- Based on a partially observed rating matrix
- Predict the unobserved entries: will user i like movie j?
[Figure: partially observed users x movies rating matrix]
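As a minimal sketch of the setting (toy data invented for illustration, not from the slides), the observed ratings can be stored in a matrix with a mask marking which entries are known; the task is to fill in the rest:

```python
import numpy as np

# Toy users x movies rating matrix; 0 marks an unobserved entry.
Y = np.array([
    [5, 0, 3, 0],   # user 0 rated movies 0 and 2
    [0, 2, 0, 4],   # user 1 rated movies 1 and 3
    [1, 0, 0, 5],   # user 2 rated movies 0 and 3
])
observed = Y > 0                          # mask of observed entries
print("observed ratings:", Y[observed])
print("entries to predict:", np.argwhere(~observed))
```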
Slide 3: Problems to Address

- Underlying representation of preferences
  - Norm-constrained matrix factorization (MMMF)
- Discrete, ordered labels
  - Threshold-based ordinal regression
- Scaling up MMMF to large problems
  - Factorized objective, gradient descent
- Ratings may not be missing at random
Slide 4: Linear Factor Model

[Figure: user preference weights combined with movie feature values to produce a rating]
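As an illustration of the linear factor model (all values invented), a user's preference score for a movie is the dot product of that user's preference weights with the movie's feature values:

```python
import numpy as np

user_weights   = np.array([1.2, -0.5, 0.8])   # one user's weights over k features
movie_features = np.array([0.9,  0.1, 1.5])   # one movie's feature values

score = user_weights @ movie_features          # real-valued preference score
print(score)
```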
Slide 5: Ordinal Regression

[Figure: feature vectors and preference weights w1 producing real-valued scores]
Slide 6: Matrix Factorization

[Figure: feature vectors and preference weights multiply to give preference scores, which are thresholded into ratings]
Slide 7: Matrix Factorization

[Figure: preference scores X = UV', thresholded to produce the rating matrix Y]
Slide 8: Ordinal Regression

[Figure: feature vectors and preference weights w1, revisited]
Slide 9: Max-Margin Ordinal Regression

Shashua & Levin, NIPS 2002
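A minimal sketch of the threshold idea behind ordinal regression (the threshold values here are invented): a real-valued preference score is mapped to a discrete rating by counting how many thresholds it exceeds.

```python
import numpy as np

thetas = np.array([-1.5, -0.5, 0.5, 1.5])   # 4 thresholds -> 5 rating levels

def predict_rating(x, thetas):
    # rating = 1 + number of thresholds the score x exceeds
    return 1 + int(np.sum(x > thetas))

print(predict_rating(0.7, thetas))   # -> 4
```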
Slide 10: Absolute Difference

- Shashua & Levin's loss bounds the misclassification error
- For ordinal regression we want to minimize the absolute difference between labels
Slide 11: All-Thresholds Loss

Chu & Keerthi, ICML 2005
Srebro et al., NIPS 2004
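A minimal sketch of the all-thresholds construction, using the standard hinge and invented threshold values (the paper itself uses the smooth hinge of slide 17): every threshold contributes a penalty term, pushing the score to the correct side of all thresholds rather than only the two adjacent to the true label.

```python
import numpy as np

def hinge(z):
    """Standard hinge: penalize whenever the margin z falls below 1."""
    return np.maximum(0.0, 1.0 - z)

def all_thresholds_loss(x, y, thetas):
    """All-thresholds loss for score x and true rating y in {1, ..., R},
    with thresholds theta_1 < ... < theta_{R-1}."""
    loss = 0.0
    for r, theta in enumerate(thetas, start=1):
        if r < y:          # score should lie above thresholds below the true label
            loss += hinge(x - theta)
        else:              # score should lie below thresholds at/above the true label
            loss += hinge(theta - x)
    return loss

thetas = np.array([-1.5, -0.5, 0.5, 1.5])   # invented thresholds, 5 rating levels
print(all_thresholds_loss(0.7, 4, thetas))
```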
Slide 12: All-Thresholds Loss

- Experiments comparing:
  - Least squares regression
  - Multi-class classification
  - Shashua & Levin's Max-Margin OR
  - All-Thresholds OR
- All-Thresholds Ordinal Regression achieves:
  - Lowest misclassification error
  - Lowest absolute difference error

Rennie & Srebro, IJCAI Workshop 2005
Slide 13: Learning Weights and Features
[Figure: observed rating matrix (values 1-5) from which both the weights and the features are learned]
Slide 14: Low Rank Matrix Factorization

[Figure: X ≈ UV', with X constrained to rank k]

- Sum-squared loss, fully observed Y: use the SVD to find the global optimum (see the sketch below)
- Classification error loss, or partially observed Y: non-convex, no explicit solution
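A minimal sketch (random data) of the easy case on this slide: with a fully observed Y and sum-squared loss, the truncated SVD gives the globally optimal rank-k approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((20, 15))
k = 3

# Truncated SVD: global optimum of the rank-k, sum-squared-loss problem.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
X = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.linalg.norm(Y - X))   # residual of the best rank-3 approximation
```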
Slide 15: Norm Constrained Factorization

||X||_tr = min_{U,V : UV' = X} (||U||_Fro^2 + ||V||_Fro^2) / 2

where ||U||_Fro^2 = sum_{i,j} U_ij^2

Fazel et al., 2001
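A sketch (random data) checking this identity numerically: the trace norm of X (sum of its singular values) equals (||U||_Fro^2 + ||V||_Fro^2)/2 at the minimizing factorization, which can be read off the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 6))

trace_norm = np.linalg.svd(X, compute_uv=False).sum()

# One minimizing factorization: U = A sqrt(S), V = B sqrt(S), where X = A S B'.
A, s, Bt = np.linalg.svd(X, full_matrices=False)
U = A @ np.diag(np.sqrt(s))
V = Bt.T @ np.diag(np.sqrt(s))

print(trace_norm, (np.linalg.norm(U)**2 + np.linalg.norm(V)**2) / 2)  # equal
```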
Slide 16: MMMF Objective

Original objective (with the All-Thresholds loss):

min_X ||X||_tr + c * loss(X, Y)

Srebro et al., NIPS 2004
Slide 17: Smooth Hinge

[Figure: the hinge and smooth hinge losses and their gradients]
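A sketch of a quadratically smoothed hinge matching the shape on this slide; the exact piecewise form here is my assumption and should be treated as illustrative. It agrees with the hinge outside (0, 1) and replaces the kink at z = 1 with a quadratic piece, so the gradient is continuous everywhere.

```python
import numpy as np

def smooth_hinge(z):
    """Quadratically smoothed hinge loss."""
    return np.where(z >= 1, 0.0,
           np.where(z <= 0, 0.5 - z, 0.5 * (1.0 - z) ** 2))

def smooth_hinge_grad(z):
    """Derivative of the smooth hinge with respect to z (continuous)."""
    return np.where(z >= 1, 0.0,
           np.where(z <= 0, -1.0, z - 1.0))

z = np.linspace(-2, 2, 9)
print(smooth_hinge(z))
print(smooth_hinge_grad(z))
```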
Slide 18: Collaborative Prediction Results

[Figure: comparison against the URP and Attitude results of Marlin, 2004]
Slide 19: Local Minima?

Factorized objective:

min_{U,V} (||U||_Fro^2 + ||V||_Fro^2)/2 + c * loss(UV', Y)
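A sketch of gradient descent on the factorized objective. For brevity it uses a squared loss on observed entries as a stand-in for the all-thresholds smooth-hinge loss; the regularizer and the update pattern are the point, and all sizes and step counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k, c, lr = 30, 20, 5, 1.0, 0.01

Y = rng.integers(1, 6, size=(n_users, n_movies)).astype(float)
mask = rng.random((n_users, n_movies)) < 0.3          # observed entries
U = 0.1 * rng.standard_normal((n_users, k))
V = 0.1 * rng.standard_normal((n_movies, k))

for step in range(200):
    X = U @ V.T
    R = mask * (X - Y)              # stand-in loss gradient d loss / d X
    grad_U = U + c * (R @ V)        # d/dU of ||U||_Fro^2/2 + c*loss(UV', Y)
    grad_V = V + c * (R.T @ U)      # d/dV of ||V||_Fro^2/2 + c*loss(UV', Y)
    U -= lr * grad_U
    V -= lr * grad_V

obj = 0.5 * (np.sum(U**2) + np.sum(V**2)) \
      + c * 0.5 * np.sum((mask * (U @ V.T - Y))**2)
print(obj)
```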
Slide 20: Local Minima?

[Figure: matrix difference between learned X and target Y]

Data: 100 x 100 MovieLens submatrix, 65% sparse
Slide 21: Summary

- We scaled MMMF to large problems by optimizing the factorized objective
- Empirical tests indicate that local minima issues are rare or absent
- Results on large-scale data show substantial improvements over the state of the art

d'Aspremont & Srebro: large-scale SDP optimization methods; train on 1.5 million binary labels in 20 hours.