
Title: It Takes Variety to Make a World


1
It Takes Variety to Make a World
  • Diversification in Recommendation Systems
  • Cong Yu (1), Laks Lakshmanan (2), Sihem Amer-Yahia (1)
  • (1) Yahoo! Research NYC, (2) University of British
    Columbia
  • March 25th, 2009 @ EDBT

2
Recommendation: An Increasingly Important and
Ubiquitous Paradigm on the Web
  • The Recommendation Paradigm
  • Suggest content (in most cases, items) to a user
    based on her profile and past activities.
  • Why Recommendation?
  • Search queries can be generic, e.g., >90% of
    Yahoo! Travel queries are general descriptions
    like "family trip".
  • More so for Social Content Sites ...

3
Recommendations on Social Content Sites
  • Social Content Sites
  • Sites where users make friends and share content
  • E.g., del.icio.us, Flickr, etc.
  • Recommendation is an indispensable information
    exploration paradigm on social content sites.
  • The rich activities and user connections provide
    lots of opportunities for generating
    recommendations.

4
Challenges in Recommendation
  • While relevance is important, other factors are
    critical too
  • Diversity: avoid returning items that are too
    similar to each other.
  • Novelty: avoid returning items that users are
    likely to know already.
  • Serendipity: aim to return less relevant items
    that might give users a pleasant surprise.
  • Result Diversification

From the pool of relevant items, identify a list
of items that are dissimilar to each other and
maintain a high cumulative relevance, i.e.,
strike a good balance between relevance and
diversity.

5
Existing Solutions
  • Attribute-Based Diversification
  • Follow Three Steps
  • Obtain the attributes of each relevant item
  • Define a pair-wise item-to-item distance function
    based on those attributes
  • Perform Diversification
  • Optimizing an overall score as a weighted
    combination of relevance and distance
  • Constraining either relevance or distance,
    maximizing the other
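
The weighted-combination option above can be sketched as a greedy selection. This is only an illustration under stated assumptions, not the method of any particular system: the function name, the `weight` parameter, the use of the minimum distance to already selected items, and the toy genre data are all hypothetical.

```python
from typing import Callable, Dict, List


def greedy_weighted_selection(
    relevance: Dict[str, float],
    distance: Callable[[str, str], float],
    k: int,
    weight: float = 0.5,
) -> List[str]:
    """Pick k items, each time taking the item with the best weighted score.

    score = weight * relevance + (1 - weight) * (min distance to the items
    already selected). `weight` is a hypothetical tuning parameter.
    """
    selected: List[str] = []
    remaining = set(relevance)
    while remaining and len(selected) < k:
        def score(item: str) -> float:
            # The first pick has nothing to be distant from: use relevance only.
            if not selected:
                return relevance[item]
            spread = min(distance(item, chosen) for chosen in selected)
            return weight * relevance[item] + (1 - weight) * spread

        # Sorting first makes ties resolve deterministically by item id.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy attribute data, made up for illustration: each item has a set of genres.
genres = {
    "a": {"action", "thriller"},
    "b": {"action", "thriller"},
    "c": {"comedy"},
}
relevance = {"a": 0.9, "b": 0.8, "c": 0.5}


def genre_distance(x: str, y: str) -> float:
    """Jaccard distance between the attribute sets of two items."""
    union = genres[x] | genres[y]
    return 1.0 - len(genres[x] & genres[y]) / len(union) if union else 0.0


print(greedy_weighted_selection(relevance, genre_distance, k=2))
```

With this toy data the default weight selects a and c, while `weight=1.0` (relevance only) selects a and b. Choosing that weight is exactly the kind of parameter tuning the approach depends on.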

6
Problems with Existing Attribute-Based
Diversification
  • Lack of attributes for objects
  • URLs in del.icio.us and photos in Flickr
  • Overhead for retrieving attributes for certain
    recommendation strategies
  • Difficulties in estimating the correct
    parameters/thresholds for diversification
    algorithms

Our Solutions: Explanation-Based Diversification
and Dynamic Diversification Algorithms

7
Main Contributions
  • Formalized the Notion of Explanation-Based
    Diversification
  • Designed and Implemented Algorithms for
  • Scalable Similarity Computation
  • Explanation Generation
  • Diversification
  • Experimentally Evaluated
  • The characteristics of diversification
    algorithms
  • The practicality of explanation-based
    diversification
  • The performance overhead of explanation-based
    diversification

8
Outline
  • Motivation
  • Problem Definition
  • Algorithms
  • Similarity Computation
  • Recommendation Generation with Explanation
  • Diversification
  • Experimental Evaluation
  • Conclusion

9
Recommendation Strategies Overview
  • Item-Based Strategies
  • Estimate the rating of an unrated item (i) by the
    user (u) based on its similarity to items already
    rated and how u rated those items.
  • Collaborative Filtering Strategies
  • Estimate the rating of i by u based on how u's
    similarity network (either explicit or implicit)
    rated i.
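
One common way to write the two estimates is as similarity-weighted sums. The slides do not show the formulas, so the notation below (Items(u), Network(u), ItemSim, UserSim, rating) is an assumption chosen for illustration.

```latex
% Item-based: weight u's own ratings by how similar each rated item is to i.
\mathrm{score}_{\mathrm{item}}(u, i) =
  \sum_{i' \in \mathrm{Items}(u)} \mathrm{ItemSim}(i, i') \cdot \mathrm{rating}(u, i')

% Collaborative filtering: weight the ratings of i given by users in u's
% similarity network by how similar each of those users is to u.
\mathrm{score}_{\mathrm{cf}}(u, i) =
  \sum_{u' \in \mathrm{Network}(u)} \mathrm{UserSim}(u, u') \cdot \mathrm{rating}(u', i)
```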

10
Explanation
  • Basic Notion
  • The set of objects because of which a particular
    item is recommended to the user
  • Explanation for Item-Based Strategies
  • Explanation for Collaborative Filtering Strategies
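
A minimal sketch of how such explanation sets could be collected, assuming similarity tables are available. The data, the table layout and the function names are hypothetical; the slides give only the basic notion.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy data: which items each user has rated.
items_of: Dict[str, Set[str]] = {
    "ann": {"i1", "i2"},
    "bob": {"i2", "i3"},
    "cat": {"i3", "i4"},
}
# Hypothetical similarity tables; a missing pair means similarity 0.
item_sim: Dict[Tuple[str, str], float] = {
    ("i3", "i1"): 0.4,
    ("i3", "i2"): 0.7,
    ("i4", "i2"): 0.2,
}
user_sim: Dict[Tuple[str, str], float] = {
    ("ann", "bob"): 0.6,
    ("ann", "cat"): 0.0,
}


def item_based_explanation(user: str, item: str) -> Set[str]:
    """Items the user already rated that are similar to the recommended item."""
    return {
        rated
        for rated in items_of[user]
        if item_sim.get((item, rated), 0.0) > 0
    }


def cf_explanation(user: str, item: str) -> Set[str]:
    """Users similar to `user` who have rated the recommended item."""
    return {
        other
        for other, rated in items_of.items()
        if other != user
        and user_sim.get((user, other), 0.0) > 0
        and item in rated
    }


print(item_based_explanation("ann", "i3"))  # items behind recommending i3 to ann
print(cf_explanation("ann", "i3"))          # users behind recommending i3 to ann
```

In both cases the explanation is a plain set of identifiers, so two recommended items can be compared without knowing anything about their attributes.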

11
Explanation-Based Diversity
  • Pair-wise diversity distance between two
    recommended items
  • Standard similarity measures like Jaccard
    similarity and cosine similarity
  • E.g. (Distance based on Jaccard similarity)
  • Diversity for the set of recommended items (S)
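
The Jaccard-based distance and the set diversity can be sketched as follows. The set diversity is taken as the average pairwise distance, as the later diversification slide states; the function names, the handling of empty explanations and the toy explanation sets are assumptions.

```python
from itertools import combinations
from typing import Dict, Iterable, Set


def diversity_distance(expl_a: Set[str], expl_b: Set[str]) -> float:
    """1 minus the Jaccard similarity of two explanation sets."""
    union = expl_a | expl_b
    if not union:
        return 0.0  # assumption: two empty explanations count as identical
    return 1.0 - len(expl_a & expl_b) / len(union)


def set_diversity(items: Iterable[str], explanations: Dict[str, Set[str]]) -> float:
    """Average diversity distance over all item pairs in the set."""
    pairs = list(combinations(list(items), 2))
    if not pairs:
        return 0.0  # a set with fewer than two items has no pairs to compare
    total = sum(
        diversity_distance(explanations[a], explanations[b]) for a, b in pairs
    )
    return total / len(pairs)


# Made-up explanation sets: c1 and c2 are recommended for the same reasons.
explanations = {
    "c1": {"x", "y"},
    "c2": {"x", "y"},
    "c3": {"z"},
}
print(set_diversity(["c1", "c2"], explanations))
print(set_diversity(["c1", "c3"], explanations))
```

Items recommended for exactly the same reasons are at distance 0, and items with disjoint explanations are at distance 1.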

12
Benefits of Explanation-Based Diversification
  • Applicable to items without attributes or whose
    attributes are difficult to analyze
  • Common on social content sites
  • Explanations are by-products of many
    recommendation processes
  • They can be maintained with little overhead

13
Top-K Recommendation with Diversification: Given
a user u, find a subset S of the set of
candidate items such that |S| = k and the
overall relevance of items in S and the diversity
of S are balanced.

14
Outline
  • Motivation
  • Problem Definition
  • Algorithms
  • Similarity Computation (briefly)
  • Recommendation Generation with Explanation
    (briefly)
  • Diversification
  • Experimental Evaluation
  • Conclusion

15
Diversification: Balance between Relevance and
Diversity
  • Relevance: cumulative relevance of all items in
    the result set
  • Diversity: average distance of all item pairs in
    the result set as described earlier
  • Ideal Scenario: identify a top-k result set that
    maximizes both relevance and diversity
  • Such a top-k set is often impossible to find

16
Naïve Solution: Maximize Relevance

17
Naïve Solution: Maximize Diversity

18
Smarter Solutions
  • Eliminate items with scores below a threshold and
    choose k items among the remaining with the
    maximum diversity
  • Eliminate item pairs with distance below a
    threshold (by removing the item with lower score)
    and choose k items with highest scores

Algorithm Swap combines both!

19
Algorithm Swap
  • Sort candidate items according to their relevance
  • Start by adding the K most relevant items to the
    result set
  • Go through the rest of the candidates one by one,
    swap an item into the result set if the item
  • Increases the set diversity above a certain
    threshold
  • Does not drop the relevance by more than a
    certain threshold
  • A simple top-2 example

[Figure: a simple top-2 example with candidates c1
to c4 (relevance: c1 = 0.9, c2 = 0.6, c3 = 0.4,
c4 = 0.3). Panel labels: initial set; diversity
increase outweighs the relevance drop; no change.]

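
The Swap steps listed on this slide can be sketched as below. This is a reading of the bullets, not the authors' implementation: the parameter names `min_diversity_gain` and `max_relevance_drop`, the rule for choosing which result item to swap out, and the explanation sets are assumptions. Only the relevance values come from the slide.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def swap(
    relevance: Dict[str, float],
    explanations: Dict[str, Set[str]],
    k: int,
    min_diversity_gain: float = 0.1,   # hypothetical threshold
    max_relevance_drop: float = 0.3,   # hypothetical threshold
) -> List[str]:
    def dist(a: str, b: str) -> float:
        return jaccard_distance(explanations[a], explanations[b])

    def diversity(items: List[str]) -> float:
        # Average pairwise distance of the set.
        pairs = list(combinations(items, 2))
        return sum(dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

    # Sort by relevance and start from the k most relevant items.
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    result = ranked[:k]
    for candidate in ranked[k:]:
        # Assumption: swap out the item that contributes least to diversity,
        # breaking ties in favour of removing the less relevant item.
        victim = min(
            result,
            key=lambda item: (
                sum(dist(item, other) for other in result if other != item),
                relevance[item],
            ),
        )
        trial = [item for item in result if item != victim] + [candidate]
        gain = diversity(trial) - diversity(result)
        drop = relevance[victim] - relevance[candidate]
        if gain > min_diversity_gain and drop <= max_relevance_drop:
            result = trial
    return result


relevance = {"c1": 0.9, "c2": 0.6, "c3": 0.4, "c4": 0.3}  # values from the slide
explanations = {  # made up so that c1/c2 and c3/c4 share their explanations
    "c1": {"u1", "u2"},
    "c2": {"u1", "u2"},
    "c3": {"u3", "u4"},
    "c4": {"u3", "u4"},
}
print(swap(relevance, explanations, k=2))
```

With these invented explanation sets the run starts from c1 and c2, swaps c3 in for c2 because the diversity gain is large and the relevance drop is small, and leaves the set unchanged when c4 is considered. Tightening `max_relevance_drop` to 0.1 blocks both swaps, which shows how sensitive the outcome is to the thresholds.
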
20
Challenge: The appropriate thresholds for producing
the right top-K recommendations are often
difficult to identify.

21
Algorithm Iterative Greedy
  • Dynamically identify the thresholds
  • Establish two diversity bounds, Upper and Lower
    Bounds.
  • At each iteration, scan the candidates
  • items passing upper bound go to KeepList
  • items not passing lower bound go to DiscardList.
  • At the end of each iteration, bounds are adjusted
  • Stop when exactly K items are generated
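
The idea of finding the threshold dynamically can be illustrated with a much simpler stand-in: a bisection over a single diversity bound. This is not the Iterative Greedy algorithm itself (it has no KeepList or DiscardList, and the paper has the actual details); the function names, the assumption that distances lie in [0, 1], and the toy data are all hypothetical.

```python
from typing import Callable, Dict, List, Set


def pick_with_bound(
    ranked: List[str], distance: Callable[[str, str], float], bound: float
) -> List[str]:
    """Scan candidates by relevance; keep an item only if it is at least
    `bound` away from every item kept so far."""
    kept: List[str] = []
    for item in ranked:
        if all(distance(item, other) >= bound for other in kept):
            kept.append(item)
    return kept


def bisect_diversity_bound(
    relevance: Dict[str, float],
    distance: Callable[[str, str], float],
    k: int,
    max_iterations: int = 20,
) -> List[str]:
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    lower, upper = 0.0, 1.0      # assumes distances lie in [0, 1]
    best = ranked[:k]            # a bound of 0 keeps everything, so this is always valid
    for _ in range(max_iterations):
        bound = (upper + lower) / 2
        kept = pick_with_bound(ranked, distance, bound)
        if len(kept) >= k:
            best = kept[:k]
            if len(kept) == k:   # stop when exactly k items are generated
                break
            lower = bound        # too many survivors: demand more diversity
        else:
            upper = bound        # too few survivors: relax the bound
    return best


# Made-up explanation sets; distance is the Jaccard distance between them.
explanations: Dict[str, Set[str]] = {
    "c1": {"u1", "u2"},
    "c2": {"u1", "u2"},
    "c3": {"u3", "u4"},
    "c4": {"u3", "u4"},
}
relevance = {"c1": 0.9, "c2": 0.6, "c3": 0.4, "c4": 0.3}


def distance(a: str, b: str) -> float:
    union = explanations[a] | explanations[b]
    return 1.0 - len(explanations[a] & explanations[b]) / len(union) if union else 0.0


print(bisect_diversity_bound(relevance, distance, k=2))
```

For k = 2 the first midpoint (0.5) already yields exactly two items, c1 and c3. For k = 3 no positive bound admits three items with this data, so the sketch falls back to the three most relevant items.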

22
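Algorithm Iterative Greedy, cont'd
[Figure: one iteration of Iterative Greedy. Labels:
Candidates; UB and LB; pass (DivList, KeepList);
no-pass (SimList, DiscardList); B = (UB + LB)/2;
next iteration with UB = B or LB = B; a condition
comparing the sizes of DivList and KeepList with K.]

23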
Brief Overview of Similarity Computation
  • Explicit network is not enough
  • E.g., only 10% of users have at least one friend
    in del.icio.us
  • Similarities between users can be generated based
    on their activities
  • Costly with pair-wise comparison
  • E.g., 1 million users => 1 trillion comparisons
  • Only a small fraction of those comparisons result
    in similarity above a given threshold
  • Item-Based Similarity Computation
  • Organize items based on the number of raters
  • Start with items with the largest number of
    raters
  • Compare two users only if they share enough rated
    items
  • Details in the paper
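
A minimal sketch of the pruning idea, assuming user similarity is the Jaccard similarity of rated-item sets: an item-to-raters index only ever produces user pairs that share at least one item, so most of the 1 trillion possible pairs are never touched. The function names, the `min_shared` parameter and the data are hypothetical, and the paper's actual algorithm may differ.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set, Tuple


def candidate_user_pairs(
    ratings: Dict[str, Set[str]], min_shared: int
) -> Dict[Tuple[str, str], int]:
    """Count co-rated items per user pair using an item -> raters index."""
    raters: Dict[str, Set[str]] = defaultdict(set)
    for user, items in ratings.items():
        for item in items:
            raters[item].add(user)

    shared: Dict[Tuple[str, str], int] = defaultdict(int)
    # Items are visited from most to fewest raters, following the slide.
    # In this sketch the order does not change the final counts.
    for item in sorted(raters, key=lambda i: len(raters[i]), reverse=True):
        for a, b in combinations(sorted(raters[item]), 2):
            shared[(a, b)] += 1
    # Only pairs that share enough rated items are compared at all.
    return {pair: n for pair, n in shared.items() if n >= min_shared}


def user_similarities(
    ratings: Dict[str, Set[str]], min_shared: int = 2
) -> Dict[Tuple[str, str], float]:
    """Jaccard similarity, computed only for the surviving candidate pairs."""
    return {
        (a, b): n / len(ratings[a] | ratings[b])
        for (a, b), n in candidate_user_pairs(ratings, min_shared).items()
    }


ratings = {  # made-up data: user -> set of rated items
    "ann": {"i1", "i2", "i3"},
    "bob": {"i2", "i3", "i4"},
    "cat": {"i4", "i5"},
}
print(user_similarities(ratings, min_shared=2))
```

Here ann and cat are never compared because they share no item, and bob and cat are dropped because they share only one.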

24
Brief Description of Recommendation Generation
with Explanation
  • Post-Processing Approach
  • Generate the recommendation result set
  • Generate candidates
  • Compute scores for candidates
  • Sort the candidates
  • For each item in the result set, fetch its
    explanations
  • Integrated Approach
  • Maintain the list of similar items or similar
    users when the candidates are being generated
  • Create the explanations for each item while the
    scores are being computed
  • Details in the paper
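
The integrated approach can be sketched for an item-based strategy: the explanation of each candidate is recorded in the same loop that accumulates its score, so no second lookup is needed. The function name, the data layout and the similarity-weighted scoring are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def recommend_with_explanations(
    user: str,
    ratings: Dict[str, Dict[str, float]],   # user -> {item: rating}
    item_sim: Dict[str, Dict[str, float]],  # rated item -> {similar item: similarity}
    k: int,
) -> List[Tuple[str, float, Set[str]]]:
    """Score candidates and collect their explanations in one pass."""
    scores: Dict[str, float] = defaultdict(float)
    explanations: Dict[str, Set[str]] = defaultdict(set)
    for rated, rating in ratings[user].items():
        for candidate, sim in item_sim.get(rated, {}).items():
            if candidate in ratings[user] or sim <= 0:
                continue  # skip items already rated and non-similar items
            scores[candidate] += sim * rating
            explanations[candidate].add(rated)  # recorded while scoring
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [(item, scores[item], explanations[item]) for item in top]


ratings = {"ann": {"i1": 5.0, "i2": 3.0}}  # made-up data
item_sim = {
    "i1": {"i3": 0.5, "i2": 0.9},
    "i2": {"i3": 0.2, "i4": 0.4},
}
for item, score, why in recommend_with_explanations("ann", ratings, item_sim, k=2):
    print(item, round(score, 2), sorted(why))
```

The post-processing approach would instead compute the scores first and then fetch the explanation of each of the k returned items in a separate step.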

25
Outline
  • Motivation
  • Problem Definition
  • Algorithms
  • Similarity Computation (briefly)
  • Recommendation Generation with Explanation
    (briefly)
  • Diversification
  • Experimental Evaluation
  • Conclusion

26
Experimental Data
  • Real world data sets
  • del.icio.us: online bookmark sharing site
  • Y! Movies: Yahoo!'s online movie sharing site

27
Result Comparison

28
Explanations can Serve as a Good Basis for
Diversification
  • Leveraging the Yahoo! Movies data set, we compare
    diversified results obtained based on explanation
    with those obtained based on attributes

29
Overhead of Diversification is Small

30
Outline
  • Motivation
  • Problem Definition
  • Algorithms
  • Similarity Computation (briefly)
  • Recommendation Generation with Explanation
    (briefly)
  • Diversification
  • Experimental Evaluation
  • Conclusion

31
Conclusion
  • Recommendation is becoming an indispensable
    information exploration paradigm
  • Explanation-Based Diversification is a practical
    alternative to attribute-based diversification
  • Algorithms Swap and Iterative Greedy strike a
    good balance between relevance and diversity
  • Questions?