It Takes Variety to Make a World

1
It Takes Variety to Make a World

Diversification in Recommendation Systems
Cong Yu (1), Laks Lakshmanan (2), Sihem Amer-Yahia (1)
(1) Yahoo! Research NYC, (2) University of British Columbia
March 25th, 2009 @ EDBT

2
Recommendation: An Increasingly Important and
Ubiquitous Paradigm on the Web

The Recommendation Paradigm
Suggest content (in most cases, items) to a user
based on her profile and past activities.
Why Recommendation?
Search queries can be generic, e.g., >90% of
Yahoo! Travel queries are general descriptions
like "family trip".
More so for Social Content Sites ...

3
Recommendations on Social Content Sites

Social Content Sites
Sites where users make friends and share content
E.g., del.icio.us, Flickr, etc.
Recommendation is an indispensable information
exploration paradigm on social content sites.
The rich activities and user connections provide
lots of opportunities for generating
recommendations.

4
Challenges in Recommendation

While relevance is important, other factors are
critical too:
Diversity: avoid returning items that are too
similar to each other.
Novelty: avoid returning items that users are
likely to know already.
Serendipity: aim to return less relevant items
that might give users a pleasant surprise.

Result Diversification
From the pool of relevant items, identify a list
of items that are dissimilar to each other and
maintain a high cumulative relevance, i.e.,
strike a good balance between relevance and
diversity.
5
Existing Solutions

Attribute-Based Diversification
Follows three steps:
Obtain the attributes of each relevant item
Define a pair-wise item-to-item distance function
based on those attributes
Perform diversification by either
optimizing an overall score as a weighted
combination of relevance and distance, or
constraining either relevance or distance and
maximizing the other
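
The "weighted combination" option above can be sketched roughly as follows. This is only an illustration under stated assumptions: the weight `lam`, the function names, the toy genre attributes, and the use of mean relevance plus average pair-wise distance are all hypothetical choices, not something the slides specify. The exhaustive search is only practical for tiny candidate pools.

```python
from itertools import combinations

def avg_pairwise_distance(items, dist):
    """Average distance over all unordered pairs; 0.0 for fewer than two items."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(dist(a, b) for a, b in pairs) / len(pairs)

def weighted_score(items, relevance, dist, lam=0.5):
    """lam * mean relevance + (1 - lam) * average pair-wise distance (assumed form)."""
    mean_rel = sum(relevance[i] for i in items) / len(items)
    return lam * mean_rel + (1 - lam) * avg_pairwise_distance(items, dist)

def best_subset(candidates, relevance, dist, k, lam=0.5):
    """Exhaustively pick the k-subset with the highest weighted score."""
    return max(combinations(candidates, k),
               key=lambda s: weighted_score(s, relevance, dist, lam))

# Toy data (hypothetical): attributes are genre sets, distance is 1 - Jaccard.
attrs = {"a": {"comedy"}, "b": {"comedy"}, "c": {"drama"}}
relevance = {"a": 0.9, "b": 0.8, "c": 0.6}

def attr_distance(x, y):
    union = attrs[x] | attrs[y]
    return 1 - len(attrs[x] & attrs[y]) / len(union) if union else 0.0

print(best_subset(["a", "b", "c"], relevance, attr_distance, k=2, lam=1.0))
print(best_subset(["a", "b", "c"], relevance, attr_distance, k=2, lam=0.5))
```

With `lam=1.0` only relevance counts and the two comedies win; with `lam=0.5` the drama replaces one of them. Choosing `lam` well is exactly the parameter-estimation difficulty raised on the next slide.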

6
Problems with Existing Attribute-Based
Diversification

Lack of attributes for objects
E.g., URLs in del.icio.us and photos in Flickr
Overhead for retrieving attributes for certain
recommendation strategies
Difficulties in estimating the correct
parameters/thresholds for diversification
algorithms

Our Solutions: Explanation-Based
Diversification and Dynamic Diversification
Algorithms
7
Main Contributions

Formalized the Notion of Explanation-Based
Diversification
Designed and Implemented Algorithms for:
Scalable Similarity Computation
Explanation Generation
Diversification
Experimentally Evaluated:
The characteristics of diversification
algorithms
The practicality of explanation-based
diversification
The performance overhead of explanation-based
diversification

8
Outline

Motivation
Problem Definition
Algorithms
Similarity Computation
Recommendation Generation with Explanation
Diversification
Experimental Evaluation
Conclusion

9
Recommendation Strategies Overview

Item-Based Strategies
Estimate the rating of an unrated item (i) by the
user (u) based on its similarity to items already
rated and how u rated those items.
Collaborative Filtering Strategies
Estimate the rating of i by u based on how u's
similarity network (either explicit or implicit)
rated i.
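
A minimal sketch of the two strategies, assuming a similarity-weighted average as the estimator (the slides do not give the exact formula). The dictionary layouts, function names, and toy numbers are hypothetical.

```python
def item_based_estimate(user, item, ratings, item_sim):
    """Similarity-weighted average of the user's own ratings on items similar to `item`."""
    num = den = 0.0
    for rated_item, rating in ratings.get(user, {}).items():
        sim = item_sim.get(frozenset((item, rated_item)), 0.0)
        num += sim * rating
        den += sim
    return num / den if den > 0 else 0.0

def collaborative_estimate(user, item, ratings, user_sim):
    """Similarity-weighted average of the ratings given to `item` by similar users."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = user_sim.get(frozenset((user, other)), 0.0)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den > 0 else 0.0

# Toy data (hypothetical): user -> {item: rating}, symmetric similarities.
ratings = {
    "u": {"i1": 5.0, "i2": 1.0},
    "v": {"i3": 4.0},
    "w": {"i3": 2.0},
}
item_sim = {frozenset(("i3", "i1")): 0.75, frozenset(("i3", "i2")): 0.25}
user_sim = {frozenset(("u", "v")): 0.5, frozenset(("u", "w")): 0.5}

print(item_based_estimate("u", "i3", ratings, item_sim))     # 4.0
print(collaborative_estimate("u", "i3", ratings, user_sim))  # 3.0
```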

10
Explanation

Basic Notion
The set of objects because of which a particular
item is recommended to the user
Explanation for Item-Based Strategies
Explanation for Collaborative Filtering Strategies
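
The formulas for the two kinds of explanation did not survive in this transcript. One plausible reading of the basic notion, offered here only as an assumption, is that an item-based explanation is the set of similar items the user already rated, and a collaborative filtering explanation is the set of similar users who rated the item. Function names and data layouts below are hypothetical.

```python
def item_based_explanation(user, item, ratings, item_sim):
    """Items the user has rated that are similar to `item` (similarity > 0)."""
    return {
        rated for rated in ratings.get(user, {})
        if item_sim.get(frozenset((item, rated)), 0.0) > 0
    }

def collaborative_explanation(user, item, ratings, user_sim):
    """Users similar to `user` (similarity > 0) who have rated `item`."""
    return {
        other for other, other_ratings in ratings.items()
        if other != user
        and item in other_ratings
        and user_sim.get(frozenset((user, other)), 0.0) > 0
    }

# Toy data (hypothetical).
ratings = {
    "u": {"i1": 5.0, "i2": 1.0},
    "v": {"i3": 4.0},
    "w": {"i3": 2.0},
}
item_sim = {frozenset(("i3", "i1")): 0.75}
user_sim = {frozenset(("u", "v")): 0.5}

print(item_based_explanation("u", "i3", ratings, item_sim))     # {'i1'}
print(collaborative_explanation("u", "i3", ratings, user_sim))  # {'v'}
```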

11
Explanation-Based Diversity

Pair-wise diversity: the distance between two
recommended items, computed from their explanations
Based on standard similarity measures like Jaccard
similarity and cosine similarity
E.g., distance based on Jaccard similarity
Diversity for the set of recommended items (S):
the average pair-wise distance over all item pairs in S
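
As a rough sketch of these two measures: the pair-wise distance below is 1 minus the Jaccard similarity of two explanation sets, and set diversity is the average distance over all item pairs, as a later slide describes it. The function names and the toy explanation sets are assumptions.

```python
from itertools import combinations

def jaccard_distance(expl_a, expl_b):
    """1 - |A & B| / |A | B|; two empty explanations are treated as identical."""
    union = expl_a | expl_b
    if not union:
        return 0.0
    return 1.0 - len(expl_a & expl_b) / len(union)

def set_diversity(items, explanations):
    """Average pair-wise Jaccard distance between the items' explanations."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard_distance(explanations[a], explanations[b]) for a, b in pairs)
    return total / len(pairs)

# Toy explanation sets (hypothetical): c1 and c2 are recommended for the same reasons.
explanations = {
    "c1": {"x", "y"},
    "c2": {"x", "y"},
    "c3": {"z"},
}

print(set_diversity(["c1", "c2"], explanations))  # 0.0
print(set_diversity(["c1", "c3"], explanations))  # 1.0
```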

12
Benefits of Explanation-Based Diversification

Applicable to items without attributes or whose
attributes are difficult to analyze
Common on social content sites
Explanations are by-products of many
recommendation processes
They can be maintained with little overhead

13
Top-K Recommendation with Diversification: Given
a user u, find a subset S from the set of
candidate items, such that |S| = k and the
overall relevance of items in S and the diversity
of S are balanced.
14
Outline

Motivation
Problem Definition
Algorithms
Similarity Computation (briefly)
Recommendation Generation with Explanation
(briefly)
Diversification
Experimental Evaluation
Conclusion

15
Diversification: Balance between Relevance and
Diversity

Relevance: cumulative relevance of all items in
the result set
Diversity: average distance of all item pairs in
the result set, as described earlier
Ideal Scenario: identify a top-k result set that
maximizes both relevance and diversity
Such a top-k set is often impossible to find

16
Naïve Solution: Maximize Relevance
17
Naïve Solution: Maximize Diversity
18
Smarter Solutions

Eliminate items with scores below a threshold and
choose k items among the remaining with the
maximum diversity
Eliminate item pairs with distance below a
threshold (by removing the item with lower score)
and choose k items with highest scores

Algorithm Swap combines both!
19
Algorithm Swap

Sort candidate items according to their relevance
Start by adding the K most relevant items to the
result set
Go through the rest of the candidates one by one;
swap an item into the result set if the item
Increases the set diversity above a certain
threshold
Does not drop the relevance by more than a certain
threshold
A simple top-2 example
[Figure: candidates c1, c2, c3, c4 with relevance c1 = 0.9, c2 = 0.6, c3 = 0.4, c4 = 0.3, and three snapshots of the result set labeled "initial set", "diversity increase outweighs the relevance drop", and "no change".]
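
The swap procedure on this slide might be sketched as below. Several details are assumptions rather than the authors' method: the diversity test is read as a minimum gain (`min_div_gain`), the relevance test as a maximum per-swap drop (`max_rel_drop`), and the item to evict is the one whose removal gives the largest gain. The relevance values come from the slide; the explanation sets are made up for illustration.

```python
from itertools import combinations

def jaccard_distance(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def diversity(items, expl):
    """Average pair-wise Jaccard distance between explanations."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(expl[x], expl[y]) for x, y in pairs) / len(pairs)

def swap(candidates, relevance, expl, k, min_div_gain=0.1, max_rel_drop=0.3):
    """Start from the k most relevant items, then try to swap in each remaining candidate."""
    ranked = sorted(candidates, key=lambda c: relevance[c], reverse=True)
    result = ranked[:k]
    for cand in ranked[k:]:
        current = diversity(result, expl)
        best = None  # (gain, -drop, trial set) of the best admissible swap
        for idx, victim in enumerate(result):
            trial = result[:idx] + result[idx + 1:] + [cand]
            gain = diversity(trial, expl) - current
            drop = relevance[victim] - relevance[cand]
            if gain >= min_div_gain and drop <= max_rel_drop:
                key = (gain, -drop)
                if best is None or key > best[:2]:
                    best = (gain, -drop, trial)
        if best is not None:
            result = best[2]
    return result

relevance = {"c1": 0.9, "c2": 0.6, "c3": 0.4, "c4": 0.3}  # values from the slide
expl = {"c1": {"a", "b"}, "c2": {"a", "b"}, "c3": {"c"}, "c4": {"c"}}  # hypothetical

print(swap(["c1", "c2", "c3", "c4"], relevance, expl, k=2))  # ['c1', 'c3']
```

With these made-up explanations, c3 replaces c2 because the diversity gain is large and the relevance drop is small, and c4 changes nothing because it adds no diversity over c3.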
20
Challenge: The appropriate thresholds needed to
produce the right top-K recommendations are often
difficult to identify
21
Algorithm Iterative Greedy

Dynamically identify the thresholds
Establish two diversity bounds: an upper bound (UB)
and a lower bound (LB).
At each iteration, scan the candidates:
Items passing the upper bound go to KeepList
Items not passing the lower bound go to DiscardList
At the end of each iteration, the bounds are adjusted
Stop when exactly K items are generated
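
The slides leave the details to the paper, so the following is a deliberately simplified stand-in and not the authors' algorithm. It bisects a single distance bound between a lower and an upper value until a greedy pass over the candidates keeps exactly K items. All names, the single-bound simplification, the iteration cap, and the fallback are assumptions.

```python
def jaccard_distance(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def greedy_pass(ranked, expl, bound):
    """Scan candidates in relevance order; keep an item only if it is at least
    `bound` away from every item kept so far, otherwise discard it."""
    keep, discard = [], []
    for cand in ranked:
        if all(jaccard_distance(expl[cand], expl[kept]) >= bound for kept in keep):
            keep.append(cand)
        else:
            discard.append(cand)
    return keep, discard

def iterative_greedy(candidates, relevance, expl, k, max_iters=30):
    """Adjust the bound by bisection until exactly k items survive a pass."""
    ranked = sorted(candidates, key=lambda c: relevance[c], reverse=True)
    lower, upper = 0.0, 1.0
    keep = ranked[:k]
    for _ in range(max_iters):
        bound = (upper + lower) / 2
        keep, discard = greedy_pass(ranked, expl, bound)
        if len(keep) == k:
            return keep
        if len(keep) > k:
            lower = bound  # too many survivors: demand more diversity
        else:
            upper = bound  # too few survivors: relax the bound
    # Fallback if no bound yields exactly k: pad or truncate by relevance.
    return (keep + [c for c in ranked if c not in keep])[:k]

relevance = {"c1": 0.9, "c2": 0.6, "c3": 0.4, "c4": 0.3}
expl = {  # hypothetical explanation sets
    "c1": {"a", "b", "c", "d"},
    "c2": {"a", "b", "c"},
    "c3": {"a", "b", "e"},
    "c4": {"a", "f"},
}

print(iterative_greedy(["c1", "c2", "c3", "c4"], relevance, expl, k=2))  # ['c1', 'c4']
```

In this toy run the first bound (0.5) keeps three items, so the lower bound is raised; the second bound (0.75) keeps exactly two and the loop stops. No threshold had to be supplied by hand, which is the point of the dynamic approach.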

22
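Algorithm Iterative Greedy, cont'd
[Figure: one iteration of Iterative Greedy. Candidates are tested against a bound B = (UB + LB)/2. Items that pass go to DivList and KeepList; items that do not pass go to SimList and DiscardList. Other labels in the figure: "If DivList + KeepList < K", "UB = B", "LB = B", "Next Iteration".]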
23
Brief Overview of Similarity Computation

Explicit network is not enough
E.g., only 10% of users have at least one friend
in del.icio.us
Similarities between users can be generated based
on their activities
Costly with pair-wise comparison
E.g., 1 million users => 1 trillion comparisons
Only a small fraction of those comparisons result
in similarity above a given threshold
Item-Based Similarity Computation
Organize items based on the number of raters
Start with items with the largest number of
raters
Compare two users only if they share enough rated
items
Details in the paper
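
A small sketch of the idea of comparing only users who share enough rated items, using an item-to-raters index instead of all-pairs comparison. The thresholds `min_shared` and `min_sim`, the choice of Jaccard similarity over rated-item sets, and the function name are assumptions; the paper's actual procedure may differ.

```python
from collections import defaultdict
from itertools import combinations

def similar_user_pairs(ratings, min_shared=2, min_sim=0.3):
    """Return {(u, v): jaccard} for user pairs sharing at least `min_shared` rated items."""
    # Inverted index: item -> set of users who rated it.
    raters = defaultdict(set)
    for user, items in ratings.items():
        for item in items:
            raters[item].add(user)

    # Count co-rated items per user pair, visiting the most-rated items first
    # as the slide suggests. In this sketch the order does not change the result.
    shared = defaultdict(int)
    for item in sorted(raters, key=lambda i: len(raters[i]), reverse=True):
        for u, v in combinations(sorted(raters[item]), 2):
            shared[(u, v)] += 1

    # Only pairs with enough overlap get a full similarity computation.
    result = {}
    for (u, v), count in shared.items():
        if count < min_shared:
            continue
        sim = count / len(ratings[u] | ratings[v])
        if sim >= min_sim:
            result[(u, v)] = sim
    return result

# Toy data (hypothetical): user -> set of rated items.
ratings = {
    "u": {"a", "b", "c"},
    "v": {"a", "b", "d"},
    "w": {"a", "e"},
}

print(similar_user_pairs(ratings))  # {('u', 'v'): 0.5}
```

Pairs that never co-rate an item are never touched, and pairs that share only one item are dropped before any similarity is computed.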

24
Brief Description of Recommendation Generation
with Explanation

Post-Processing Approach
Generate the recommendation result set
Generate candidates
Compute scores for candidates
Sort the candidates
For each item in the result set, fetch its
explanations
Integrated Approach
Maintain the list of similar items or similar
users when the candidates are being generated
Create the explanations for each item while the
scores are being computed
Details in the paper
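
The integrated approach could look roughly like this for an item-based strategy: explanations are collected in the same loop that accumulates the scores, so no second fetch is needed. The nested-dictionary layout of `item_sim`, the weighted-average score, and all names are assumptions made for the sketch.

```python
def recommend_with_explanations(user, ratings, item_sim, k):
    """Return the top-k (item, score, explanation) triples for `user`."""
    rated = ratings[user]
    weighted_sum, sim_total, explanation = {}, {}, {}
    for rated_item, rating in rated.items():
        # Candidates are the unrated items similar to something the user rated.
        for cand, sim in item_sim.get(rated_item, {}).items():
            if cand in rated or sim <= 0:
                continue
            weighted_sum[cand] = weighted_sum.get(cand, 0.0) + sim * rating
            sim_total[cand] = sim_total.get(cand, 0.0) + sim
            # Record the explanation while the score is being computed.
            explanation.setdefault(cand, set()).add(rated_item)
    scores = {c: weighted_sum[c] / sim_total[c] for c in weighted_sum}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [(c, scores[c], explanation[c]) for c in top]

# Toy data (hypothetical): item_sim maps an item to its similar items.
ratings = {"u": {"i1": 5.0, "i2": 1.0}}
item_sim = {
    "i1": {"i3": 0.75, "i4": 0.5},
    "i2": {"i3": 0.25},
}

for item, score, expl in recommend_with_explanations("u", ratings, item_sim, k=2):
    print(item, score, sorted(expl))
```

A post-processing variant would compute `scores` first and then look up the explanation for each of the top-k items in a separate step.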

25
Outline

Motivation
Problem Definition
Algorithms
Similarity Computation (briefly)
Recommendation Generation with Explanation
(briefly)
Diversification
Experimental Evaluation
Conclusion

26
Experimental Data

Real-world data sets
del.icio.us: online bookmark sharing site
Y! Movies: Yahoo!'s online movie sharing site

27
Result Comparison
28
Explanations can Serve as a Good Basis for
Diversification

Using the Yahoo! Movies data set, we compare
diversified results obtained based on explanations
with those obtained based on attributes

29
Overhead of Diversification is Small
30
Outline

Motivation
Problem Definition
Algorithms
Similarity Computation (briefly)
Recommendation Generation with Explanation
(briefly)
Diversification
Experimental Evaluation
Conclusion

31
Conclusion

Recommendation is becoming an indispensable
information exploration paradigm
Explanation-Based Diversification is a practical
alternative to attribute-based diversification
Algorithms Swap and Iterative Greedy strike a
good balance between relevance and diversity
Questions?
