Sparsity, Scalability and Distribution in Recommender Systems
1 Sparsity, Scalability and Distribution in Recommender Systems
Doctoral Thesis Proposal
Badrul M. Sarwar
Computer Science and Engineering Dept.
University of Minnesota
Advisor: Professor John Riedl
2 Talk Outline
Introduction to Recommender Systems
Research Challenges
Previous Work
Future Work and Completion Plan
Contributions and Conclusions
3 Information Overload
4 Computerized Solution Techniques
Information Retrieval
Immediate information needs
Information Filtering
Content based filtering
Information filtering agents
Collaborative Filtering (CF)
Recommender systems (RS) - interface
We'll use the terms CF and RS interchangeably
5 Collaborative Filtering
Why another filtering technique?
Problems with content-based filtering
Limitations due to computer processing
Lack of aesthetic sense
Different techniques for different media
CF adds the missing piece into the picture
Human judgements
6 Collaborative Filtering Process
7 CF used successfully in e-commerce
8 Talk Outline
Introduction to Recommender Systems
Research Challenges
Previous Work
Future Work and Completion Plan
Contributions and conclusions
9 Research Challenges
RC1 How can we improve RS quality and performance by using dimensionality reduction techniques?
RC2 How can we design a better interface for RS?
RC3 How can we design distributed RS to make them widely available?
RC4 How can we utilize clustering algorithms to improve scalability in RS?
10 RC1 Motivation and Importance
RS Performance challenge
Meet two important goals
Quality
Best CF is 77% accurate
Scalability
Response time
Storage space
11 RC1 Motivation and Importance (contd.)
Stumbling blocks
High-dimensional data
Computational complexity
Noise and data over-fitting
Sparsity
Reduced number of predictions
Inferior quality
12 RC1 Specific Aims
Select a dimensionality reduction technique
Apply the technique
Evaluate quality
Study performance implications
13 Research Challenges
RC1 How can we improve RS quality and performance by using dimensionality reduction techniques?
RC2 How can we design a better interface for RS?
RC3 How can we design distributed RS to make them widely available?
RC4 How can we utilize clustering algorithms to improve scalability in RS?
14 RC2 Motivation and Importance
Need for explanation interface
End-user point of view
Explanation of recommendations
Algorithmic explanation
Visual explanation
Visual explanation
Visualization amplifies cognition
Benefits
Increases usability and confidence
15 RC2 Specific aims
Identify techniques
Use of dimension reduction results
Implementation
Evaluation
Usability study
Comparison with text-based system
16 Research Challenge 3
How can we improve RS quality and performance by using dimensionality reduction techniques?
How can we design a better interface for RS?
How can we design distributed RSs to make them widely available?
How can we utilize clustering algorithms to improve scalability in RS?
17 RC3 Motivation and Importance
Increasing needs for RS services
Availability challenge
Travelling users
Centralized RS problems
Problems of scale and robustness
Privacy concerns
18 RC3 Specific aims
Taxonomy of RS application space
Design framework
Key design issues
Implementation models
Evaluation criteria
Analysis of different models
19 Research Challenge 4
How can we improve RS quality and performance by using dimensionality reduction techniques?
How can we design a better interface for RS?
How can we design distributed RS to make them widely available?
How can we utilize clustering algorithms to improve scalability in RSs?
20 RC4 Motivation and Importance
Scalability
Sparsity
Benefits of Clustering
Usenet (newsgroup)
Recent studies
Performance implications
21 RC4 Specific aims
Identify clustering algorithms
Soft cluster
Hard cluster
Partition the data set
Apply Galaxy algorithm
Evaluate results
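The partitioning step in these aims can be sketched with a hard clustering of users. This is a minimal illustration only: it assumes k-means as the hard clustering algorithm, a small dense ratings matrix, and the first k rows as initial centers for reproducibility. The names `kmeans` and `ratings` are hypothetical, and the Galaxy algorithm itself is not shown.

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Hard-cluster the rows of `points` into k groups (Lloyd's algorithm)."""
    # Assumption: use the first k rows as initial centers so the run is repeatable.
    centers = points[:k].astype(float)
    for _ in range(iters):
        # Distance from every row to every center, shape (n_rows, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    # Final assignment against the final centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

# Hypothetical user-item ratings: users 0-1 and users 2-3 have opposite tastes.
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [1.0, 2.0, 4.0, 5.0],
])
labels, centers = kmeans(ratings, k=2)
partitions = {j: ratings[labels == j] for j in range(2)}
```

Each entry of `partitions` is a smaller ratings matrix, so a neighborhood search for a given user would only need to scan that user's own partition.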
22 Talk Outline
Introduction to Recommender Systems
Research Challenges
Previous Work
Future Work and Completion Plan
Contributions and conclusions
23 Research Approach
24 Dimension Reduction Experiments
Singular Value Decomposition
Matrix factorization
Dimension reduction
Prediction generation by re-constructing matrix
Result highlights
Quality of prediction improved
We expect to see improved performance
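Prediction generation by reconstructing the matrix can be sketched as follows. This is an illustration under stated assumptions, not the experimental setup from the talk: missing ratings are marked with NaN and filled with the item average, each user is centered on their row mean before factoring, and every item has at least one rating. The name `svd_predict` and the sample matrix are hypothetical.

```python
import numpy as np

def svd_predict(ratings, k):
    """Return a rank-k reconstruction of a user-item matrix (NaN = missing)."""
    filled = ratings.copy()
    # Assumption: replace each missing cell with its item (column) average.
    col_means = np.nanmean(filled, axis=0)
    missing = np.isnan(filled)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    # Assumption: subtract each user's mean rating before factoring.
    row_means = filled.mean(axis=1, keepdims=True)
    centered = filled - row_means
    # Factor into three matrices and keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Add the user means back to obtain predicted ratings.
    return approx + row_means

nan = np.nan
R = np.array([
    [5.0, 4.0, nan, 1.0],
    [4.0, nan, 1.0, 2.0],
    [1.0, 1.0, 5.0, nan],
    [nan, 2.0, 4.0, 5.0],
])
pred = svd_predict(R, k=2)  # pred[i, j] is the predicted rating of user i for item j
```

The choice of k trades detail for noise reduction: a small k discards more of the original matrix, while k equal to the full rank simply returns the filled matrix.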
25 Applying dimension reduction in RS
We applied LSI/SVD based technique
SVD decomposes a matrix into three factors
The reconstructed matrix R_k = U_k · S_k · V_k^T is the closest rank-k matrix to the original matrix R.
26 SVD as prediction generator
27 Results: SVD as prediction generator
28 Visual Interface: Initial Prototype
Used SVD results
Plotted user and items in 2-D feature space
Prototype tested in Spotfire
Problems
Distance is non-Euclidean
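Plotting users and items in a shared 2-D feature space can be sketched from the top two singular vectors. This assumes a small dense ratings matrix and splits the singular values evenly between the two sets of coordinates. Cosine similarity is shown only as one possible non-Euclidean measure, since the slide does not say which measure applies. The names `embed_2d` and `cosine` are hypothetical.

```python
import numpy as np

def embed_2d(ratings):
    """Return 2-D coordinates for users (rows) and items (columns)."""
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    # Assumption: share the two largest singular values equally between
    # users and items so both can be drawn on the same axes.
    scale = np.sqrt(s[:2])
    user_xy = U[:, :2] * scale
    item_xy = Vt[:2, :].T * scale
    return user_xy, item_xy

def cosine(a, b):
    """Cosine similarity between two non-zero coordinate vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 4.0],
])
user_xy, item_xy = embed_2d(R)
similarity = cosine(user_xy[0], user_xy[1])  # users 0 and 1 rate alike
```

With this scaling, the dot product of a user point and an item point equals the rank-2 estimate of that rating, so the angle between points carries more meaning than the straight-line distance a viewer sees on screen.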
29 Design of Visual Interface
Use of LSI/SVD for user-item visualization
30 Distributed RS Work done
Taxonomy of the application space
Based on <Neighborhood and prediction>
Identification of key design issues
Three implementation models proposed
Local profile model
Central profile model
Geographically distributed profile model
31 Talk Outline
Introduction to Recommender Systems
Research Challenges
Previous Work
Future Work and Completion Plan
Contributions and conclusions
32 Future Work: Dimension Reduction
Study performance implications
SVD based prediction
Offline (model building)
Online
Offline part is time-consuming
Incremental SVD
Fold-in
Online is very promising
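The offline/online split and the fold-in idea can be sketched as below. This assumes the standard fold-in projection from the LSI literature, in which a new user's rating vector is projected onto the existing item factors without recomputing the SVD, and it assumes a dense rating vector with any missing values already filled. The function names are hypothetical.

```python
import numpy as np

def build_model(ratings, k):
    """Offline step: factor the ratings matrix once and keep k dimensions."""
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_user(new_ratings, s_k, Vt_k):
    """Online step: project a new user's ratings into the k-dim space."""
    # u_hat = r · V_k · S_k^{-1}; assumes every kept singular value is non-zero.
    return (new_ratings @ Vt_k.T) / s_k

def predict_for_user(user_vec, s_k, Vt_k):
    """Online step: predicted ratings for a user in the reduced space."""
    return (user_vec * s_k) @ Vt_k

R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
    [5.0, 5.0, 2.0, 1.0],
])
U_k, s_k, Vt_k = build_model(R, k=2)        # expensive, done offline
new_user = np.array([5.0, 5.0, 1.0, 1.0])   # hypothetical new user
u_hat = fold_in_user(new_user, s_k, Vt_k)   # cheap, done online
scores = predict_for_user(u_hat, s_k, Vt_k)
```

Fold-in leaves the existing factors untouched, so it is an approximation whose quality can drift as more users are folded in; that is one reason the model would still need periodic offline rebuilding.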
33 Future Work: Distributed RS
Evaluation
Possible approaches
Identify suitable evaluation criteria
Select applications from taxonomy
Analyze using each model (hypothetical)
Analyze each implementation in terms of the evaluation criteria
34 Future Work: Visual Interface
Implement Visual interface
Perform usability studies
Set up a live user experiment
Identify usability questionnaires
Conduct the usability survey
Analyze results
Revise/redesign interface
35 Future Work: Clustering in RS
Identify effective clustering algorithms
For hard and soft clustering (K-means and E-M)
Partition the dataset
Apply galaxy algorithm
Test for quality
Accuracy and coverage
Test for performance
Response time
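The quality tests can be sketched with two simple measures. This assumes mean absolute error (MAE) as the accuracy measure, which the slide does not name, and it represents a prediction the system could not make as `None`. The function names and sample values are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predictions and true ratings.

    Items with no prediction (None) are skipped; coverage counts them.
    """
    pairs = [(p, a) for p, a in zip(predicted, actual) if p is not None]
    if not pairs:
        raise ValueError("no predictions to evaluate")
    return sum(abs(p - a) for p, a in pairs) / len(pairs)

def coverage(predicted):
    """Fraction of requested items for which a prediction was produced."""
    if not predicted:
        raise ValueError("no items requested")
    return sum(p is not None for p in predicted) / len(predicted)

# Hypothetical held-out ratings and the system's predictions for them.
predicted = [4.0, None, 3.5, 2.0]
actual = [5.0, 3.0, 3.5, 1.0]
mae = mean_absolute_error(predicted, actual)  # (1.0 + 0.0 + 1.0) / 3
cov = coverage(predicted)                     # 3 of 4 items predicted
```

Reporting both together matters for clustering experiments, because a smaller partition can lower the number of items that get a prediction even when the error on the remaining items looks good.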
36 Future Work: Completion Plan
37 Contributions
Use of a dimension reduction technique (SVD) as a high-quality prediction generator
Submitted to ICDE 2000
Framework design for distributed RS
Submitted to CIKM 99
Visual interfaces
Clustering to improve scalability
38 That's all folks!
39 Distributed RS: Local Profile Model
[Diagram labels: User, Profile data]
40 Distributed RS: Central Profile Model
[Diagram labels: CPS, RS, Remote RS, User Profile storage]
41 Geographically Distributed RS
[Diagram labels: GDPS 1, GDPS 2, GDPS 3, RS, Remote RS, User, User Profile database]
42 Problems of high dimensional data
A is highly correlated with B
B is highly correlated with C
We can't say that C is also highly correlated with A.
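The point about correlation not carrying over from A and C through B can be checked with a small worked example. The vectors are made up for illustration: B is built as the sum of A and C, so it correlates with each at roughly 0.71, while A and C are uncorrelated with each other. The helper `pearson` is a hypothetical name.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

A = [1, -1, 1, -1]
C = [1, 1, -1, -1]
B = [a + c for a, c in zip(A, C)]  # B shares half its variation with each

r_ab = pearson(A, B)  # about 0.71
r_bc = pearson(B, C)  # about 0.71
r_ac = pearson(A, C)  # 0.0
```

In a neighborhood-based recommender this means that a neighbor of a neighbor is not guaranteed to be a useful neighbor.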