MRI: Meaningful Interpretations of Collaborative Ratings

1
MRI: Meaningful Interpretations of Collaborative Ratings
  • Mahashweta Das, Sihem Amer-Yahia, Cong Yu
  • Gautam Das

37th International Conference on Very Large Data Bases, 2011 @ Seattle
2
Roadmap
  • Introduction
  • Motivation
  • Problem: MRI
  • Sub-problem: DEM
  • Sub-problem: DIM
  • Data Model
  • Algorithms
  • Experiments
  • Quantitative
  • Qualitative
  • Conclusion and Future Work

7
Motivation
  • Examining reviews vs. trusting overall aggregate
    rating
  • IMDB ratings demographic breakdown is not meaningful
    enough

8
MRI Problem
  • Examining reviews vs. trusting overall aggregate
    rating
  • IMDB ratings demographic breakdown is not meaningful
    enough
  • Novel and powerful third option: Meaningful
    Rating Interpretation
  • Explain ratings by leveraging user and item
    attribute information

11
MRI Sub-problem
  • DEM: Meaningful Description Mining
  • Identify groups of reviewers who consistently
    share similar ratings on items

13
MRI Sub-problem
  • DIM: Meaningful Difference Mining
  • Identify groups of reviewers who consistently
    disagree on item ratings

16
Data Model
  • Collaborative rating site: <Set of Items, Set of
    Users, Ratings>
  • Rating tuple: <item attributes,
    user attributes, rating>
  • Group: set of ratings describable by a set of
    attribute values
  • Notion of group based on data cube
  • OLAP literature for mining multidimensional data

ID  Title             Genre  Director          Name  Gender  Location  Rating
1   Titanic           Drama  James Cameron     Amy   Female  New York  8.5
2   Schindler's List  Drama  Steven Spielberg  John  Male    New York  7.0
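
The data model above can be illustrated with a minimal Python sketch. The dictionary keys and the helper names `select_group` and `average_rating` are hypothetical and chosen only for illustration; the two rating tuples are the ones shown in the table.

```python
# Rating tuples from the example table: item attributes, user attributes, rating.
ratings = [
    {"title": "Titanic", "genre": "Drama", "director": "James Cameron",
     "name": "Amy", "gender": "Female", "location": "New York", "rating": 8.5},
    {"title": "Schindler's List", "genre": "Drama", "director": "Steven Spielberg",
     "name": "John", "gender": "Male", "location": "New York", "rating": 7.0},
]


def select_group(tuples, description):
    """Return the rating tuples that match every attribute value in `description`."""
    return [t for t in tuples
            if all(t.get(attr) == value for attr, value in description.items())]


def average_rating(group):
    """Average rating of a group, or None for an empty group."""
    return sum(t["rating"] for t in group) / len(group) if group else None


# A group is the set of ratings describable by a set of attribute values.
new_yorkers = select_group(ratings, {"location": "New York"})
ny_females = select_group(ratings, {"location": "New York", "gender": "Female"})
print(len(new_yorkers), average_rating(new_yorkers))
print(len(ny_females), average_rating(ny_females))
```

Adding an attribute value to a description can only shrink the group, which is what gives the groups their lattice structure.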
18
Data Model
  • Notion of group based on data cube lattice

Each node in the lattice is a data cube/cuboid
Query condition on database
A = Gender, B = Age, C = Location, D = Occupation
Figure: 4-Dimensional Data Cube Lattice
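
As a rough illustration of the lattice in the figure, the sketch below enumerates every cuboid over the four user attributes as a subset of attributes. The names `lattice_levels` and `generalizations` are hypothetical helpers, not taken from the talk.

```python
from itertools import combinations

ATTRIBUTES = ("Gender", "Age", "Location", "Occupation")


def lattice_levels(attributes):
    """Map each level (number of attributes used) to the cuboids at that level."""
    return {
        size: [frozenset(combo) for combo in combinations(attributes, size)]
        for size in range(len(attributes) + 1)
    }


def generalizations(cuboid):
    """Neighbouring cuboids obtained by dropping exactly one attribute."""
    return [cuboid - {attr} for attr in cuboid]


levels = lattice_levels(ATTRIBUTES)
total = sum(len(cuboids) for cuboids in levels.values())
print(total)  # 2**4 cuboids for 4 attributes
for size, cuboids in levels.items():
    print(size, [sorted(c) for c in cuboids])
```

With n attributes the lattice has 2^n nodes, and that is before fixing concrete attribute values.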
19
Data Model
Each node/data cube/cuboid in the lattice is a group
Selection query condition:
A: Gender = Male, B: Age = Young, C: Location = CA, D: Occupation = Student
Figure: Partial Rating Lattice for a
Movie (M = Male, Y = Young, CA = California, S = Student)
21
Data Model
Task: Quickly identify good groups in the
lattice that help users understand ratings
effectively
Figure: Partial Rating Lattice for a
Movie (M = Male, Y = Young, CA = California, S = Student)
23
DEM: Meaningful Description Mining
  • For an input item covering R_I ratings, return a set
    C of cuboids such that the description error is
    minimized, subject to
  • |C| ≤ k
  • coverage ≥ α
  • Description Error: measures how well a cuboid's
    average rating approximates the numerical score of
    each individual rating belonging to it
  • Coverage: measures the percentage of ratings
    covered by the returned cuboids
  • DEM is NP-Hard (proof details in paper)
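
The two measures can be sketched in Python as follows. The exact formulas are in the paper; this sketch assumes that a rating's error is its absolute deviation from the average of each cuboid covering it, averaged over those cuboids and summed over all ratings. The toy data and the function names are hypothetical.

```python
def covers(cuboid, user):
    """True if the user matches every attribute value of the cuboid."""
    return all(user.get(attr) == value for attr, value in cuboid.items())


def coverage(cuboids, ratings):
    """Fraction of ratings covered by at least one cuboid."""
    covered = sum(1 for user, _ in ratings if any(covers(c, user) for c in cuboids))
    return covered / len(ratings)


def description_error(cuboids, ratings):
    """Assumed error: per-rating mean absolute deviation from covering cuboid averages."""
    means = []
    for c in cuboids:
        scores = [score for user, score in ratings if covers(c, user)]
        means.append(sum(scores) / len(scores) if scores else None)
    total = 0.0
    for user, score in ratings:
        deviations = [abs(score - m) for c, m in zip(cuboids, means)
                      if m is not None and covers(c, user)]
        if deviations:
            total += sum(deviations) / len(deviations)
    return total


# Toy ratings for one item: (user attributes, score).
ratings = [
    ({"gender": "M", "occupation": "student"}, 4),
    ({"gender": "M", "occupation": "engineer"}, 2),
    ({"gender": "F", "occupation": "student"}, 5),
    ({"gender": "F", "occupation": "engineer"}, 1),
]
by_gender = [{"gender": "M"}]
by_occupation = [{"occupation": "student"}, {"occupation": "engineer"}]
print(coverage(by_gender, ratings), description_error(by_gender, ratings))
print(coverage(by_occupation, ratings), description_error(by_occupation, ratings))
```

On this toy data the single cuboid and the occupation pair have the same error, but only the occupation pair covers every rating, which is why coverage is imposed as a constraint.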

24
DEM Algorithms
  • Exact Algorithm (E-DEM)
  • Brute force: enumerates all possible combinations
    of cuboids in the lattice to return the exact (i.e.,
    optimal) set as rating descriptions
  • Random Restart Hill Climbing Algorithm
  • Often fails to satisfy the Coverage constraint; a
    large number of restarts is required
  • Need an algorithm that addresses both Coverage
    and Description Error simultaneously
  • Randomized Hill Exploration Algorithm (RHE-DEM)
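
A brute-force search in the spirit of the exact algorithm might look like the sketch below. It is not the authors' implementation: the error formula is an assumed reading of the Description Error definition, cuboids are modelled as frozensets of hypothetical (attribute, value) pairs, and the data is a toy example.

```python
from itertools import combinations


def covers(cuboid, user):
    return all(user.get(attr) == value for attr, value in cuboid)


def coverage(cuboids, ratings):
    return sum(any(covers(c, u) for c in cuboids) for u, _ in ratings) / len(ratings)


def description_error(cuboids, ratings):
    """Assumed error: per-rating mean absolute deviation from covering cuboid averages."""
    total = 0.0
    for user, score in ratings:
        deviations = []
        for c in cuboids:
            if covers(c, user):
                members = [s for u, s in ratings if covers(c, u)]
                deviations.append(abs(score - sum(members) / len(members)))
        if deviations:
            total += sum(deviations) / len(deviations)
    return total


def candidate_cuboids(ratings, attributes):
    """All non-empty attribute-value groups that occur in the data."""
    found = set()
    for user, _ in ratings:
        for size in range(1, len(attributes) + 1):
            for attrs in combinations(attributes, size):
                found.add(frozenset((a, user[a]) for a in attrs))
    return sorted(found, key=sorted)


def exact_dem(ratings, attributes, k, alpha):
    """Try every set of at most k cuboids; keep the lowest-error set meeting coverage."""
    best, best_err = None, float("inf")
    candidates = candidate_cuboids(ratings, attributes)
    for size in range(1, k + 1):
        for combo in combinations(candidates, size):
            if coverage(combo, ratings) >= alpha:
                err = description_error(combo, ratings)
                if err < best_err:
                    best, best_err = combo, err
    return best, best_err


RATINGS = [
    ({"gender": "M", "occupation": "student"}, 4),
    ({"gender": "M", "occupation": "engineer"}, 2),
    ({"gender": "F", "occupation": "student"}, 5),
    ({"gender": "F", "occupation": "engineer"}, 1),
]
best_set, best_err = exact_dem(RATINGS, ("gender", "occupation"), k=2, alpha=0.8)
print([sorted(c) for c in best_set], best_err)
```

The number of candidate sets grows combinatorially with the lattice size, which is the motivation for the heuristic alternatives listed on this slide.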

25
RHE-DEM Algorithm
Satisfy Coverage Minimize Error
C = {(Male, Student), (California, Student)}
Figure: Partial Rating Lattice for a Movie, k = 2,
α = 80% (M = Male, Y = Young, CA = California, S = Student)
26
RHE-DEM Algorithm
Satisfy Coverage Minimize Error
C = {(Male, Student), (California, Student)}
Say C does not satisfy the Coverage constraint
Figure: Partial Rating Lattice for a Movie, k = 2,
α = 80% (M = Male, Y = Young, CA = California, S = Student)
27
RHE-DEM Algorithm
Satisfy Coverage Minimize Error
C = {(Male, Student), (California, Student)}
C = {(Male), (California, Student)}
C = {(Student), (California, Student)}
Figure: Partial Rating Lattice for a Movie, k = 2,
α = 80% (M = Male, Y = Young, CA = California, S = Student)
28
RHE-DEM Algorithm
Satisfy Coverage ✓ Minimize Error
C = {(Male), (California, Student)}
Say C satisfies the Coverage constraint
Figure: Partial Rating Lattice for a Movie, k = 2,
α = 80% (M = Male, Y = Young, CA = California, S = Student)
31
RHE-DEM Algorithm
Satisfy Coverage ✓ Minimize Error ✓
C = {(Male), (Student)}
Figure: Partial Rating Lattice for a Movie, k = 2,
α = 80% (M = Male, Y = Young, CA = California, S = Student)
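
The walkthrough on these slides (first move through the lattice until the coverage constraint holds, then keep moving while the error drops) can be sketched roughly as below. This is a simplified reading, not the authors' RHE-DEM: the neighbour definition, the greedy move selection, the restart scheme, the assumed error formula, and the toy data are all assumptions made for illustration.

```python
import random


def covers(cuboid, user):
    return all(user.get(attr) == value for attr, value in cuboid)


def coverage(cuboids, ratings):
    return sum(any(covers(c, u) for c in cuboids) for u, _ in ratings) / len(ratings)


def description_error(cuboids, ratings):
    """Assumed error: per-rating mean absolute deviation from covering cuboid averages."""
    total = 0.0
    for user, score in ratings:
        deviations = []
        for c in cuboids:
            if covers(c, user):
                members = [s for u, s in ratings if covers(c, u)]
                deviations.append(abs(score - sum(members) / len(members)))
        if deviations:
            total += sum(deviations) / len(deviations)
    return total


def neighbours(cuboids, domain):
    """Sets obtained by moving one cuboid to a lattice parent or child."""
    for i, c in enumerate(cuboids):
        moves = [c - {pair} for pair in c if len(c) > 1]              # drop a value
        used = {attr for attr, _ in c}
        moves += [c | {(a, v)} for a, v in domain if a not in used]   # add a value
        for m in moves:
            if m not in cuboids:
                yield cuboids[:i] + (m,) + cuboids[i + 1:]


def rhe_dem(ratings, k, alpha, restarts=50, seed=0):
    rng = random.Random(seed)
    domain = sorted({(a, v) for user, _ in ratings for a, v in user.items()})
    starts = list(dict.fromkeys(frozenset(user.items()) for user, _ in ratings))
    best, best_err = None, float("inf")
    for _ in range(restarts):
        current = tuple(rng.sample(starts, k))
        # Phase 1: climb on coverage until the constraint is satisfied.
        while coverage(current, ratings) < alpha:
            nxt = max(neighbours(current, domain),
                      key=lambda n: coverage(n, ratings), default=None)
            if nxt is None or coverage(nxt, ratings) <= coverage(current, ratings):
                break
            current = nxt
        if coverage(current, ratings) < alpha:
            continue  # stuck below the coverage threshold; restart
        # Phase 2: accept moves that lower the error and keep coverage satisfied.
        improved = True
        while improved:
            improved = False
            for n in neighbours(current, domain):
                if (coverage(n, ratings) >= alpha
                        and description_error(n, ratings) < description_error(current, ratings)):
                    current, improved = n, True
                    break
        err = description_error(current, ratings)
        if err < best_err:
            best, best_err = current, err
    return best, best_err


RATINGS = [
    ({"gender": "M", "occupation": "student"}, 4),
    ({"gender": "M", "occupation": "engineer"}, 2),
    ({"gender": "F", "occupation": "student"}, 5),
    ({"gender": "F", "occupation": "engineer"}, 1),
]
result, result_err = rhe_dem(RATINGS, k=2, alpha=0.8)
print([sorted(c) for c in result], result_err)
```

Because the search is a local heuristic, a single run may stop at a set that satisfies coverage but is not the lowest-error one; the restarts keep the best result seen.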
32
DIM: Meaningful Difference Mining
  • For an input item covering R_I^+ and R_I^- ratings,
    return a set C of cuboids such that the difference
    balance is minimized, subject to
  • |C| ≤ k
  • coverage of R_I^+ ≥ α and coverage of R_I^- ≥ α
  • Difference Balance: measures whether the positive
    and negative ratings are "mingled together" (high
    balance) or "separated apart" (low balance)
  • Coverage: measures the percentage of + and -
    ratings covered by the returned cuboids
  • DIM is NP-Hard (proof details in paper)
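
One way to make the balance measure concrete is the sketch below. It assumes that balance is the fraction of (positive, negative) rating pairs that fall together in at least one returned cuboid; the exact definition is in the paper. The split into positive and negative raters and all names are hypothetical.

```python
def covers(cuboid, user):
    """True if the user matches every attribute value of the cuboid."""
    return all(user.get(attr) == value for attr, value in cuboid.items())


def balance(cuboids, positive, negative):
    """Assumed balance: share of (positive, negative) pairs that share a cuboid."""
    mingled = sum(
        1
        for p in positive
        for n in negative
        if any(covers(c, p) and covers(c, n) for c in cuboids)
    )
    return mingled / (len(positive) * len(negative))


# Toy data: students rated the item positively, engineers negatively.
positive = [{"gender": "M", "occupation": "student"},
            {"gender": "F", "occupation": "student"}]
negative = [{"gender": "M", "occupation": "engineer"},
            {"gender": "F", "occupation": "engineer"}]

separating = [{"occupation": "student"}, {"occupation": "engineer"}]
mingling = [{"gender": "M"}, {"gender": "F"}]
print(balance(separating, positive, negative))  # positives and negatives apart
print(balance(mingling, positive, negative))    # positives and negatives mixed
```

The occupation cuboids separate the two rating classes completely, so they are the better answer for difference mining on this toy data.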

33
DIM Algorithms
  • Exact Algorithm (E-DIM)
  • Randomized Hill Exploration Algorithm (RHE-DIM)
  • Unlike DEM error, DIM balance computation is
    expensive
  • Quadratic computation: scans all pairs of
    positive and negative ratings for each set of
    cuboids
  • Introduce the concept of Fundamental Regions to
    aid faster balance computation
  • Partition space of all ratings and aggregate
    rating tuples in each region

34
DIM Algorithms: Fundamental Region
C1 = {Male, Student}, C2 = {California, Student}
Balance
Figure: Computing Balance using Fundamental
Regions. Set of k = 2 cuboids having
75 ratings (44+, 31-) and 10 ratings (6+, 4-)
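
The idea of fundamental regions can be sketched as follows: ratings covered by exactly the same subset of cuboids form one region, so only per-region counts of positive and negative ratings are needed. The sketch assumes balance is the fraction of (positive, negative) pairs sharing a cuboid and compares the region-based computation with the pairwise scan; the function names and toy data are hypothetical.

```python
from collections import Counter


def covers(cuboid, user):
    return all(user.get(attr) == value for attr, value in cuboid.items())


def naive_balance(cuboids, positive, negative):
    """Quadratic scan over every (positive, negative) pair."""
    mingled = sum(
        1 for p in positive for n in negative
        if any(covers(c, p) and covers(c, n) for c in cuboids)
    )
    return mingled / (len(positive) * len(negative))


def region_of(cuboids, user):
    """A region is identified by the set of cuboid indices covering the rating."""
    return frozenset(i for i, c in enumerate(cuboids) if covers(c, user))


def region_balance(cuboids, positive, negative):
    """Aggregate counts per region, then combine regions that share a cuboid."""
    pos = Counter(region_of(cuboids, u) for u in positive)
    neg = Counter(region_of(cuboids, u) for u in negative)
    mingled = sum(
        p_count * n_count
        for p_region, p_count in pos.items()
        for n_region, n_count in neg.items()
        if p_region & n_region
    )
    return mingled / (len(positive) * len(negative))


cuboids = [{"gender": "M"}, {"occupation": "student"}]
positive = [{"gender": "M", "occupation": "student"},
            {"gender": "M", "occupation": "student"},
            {"gender": "F", "occupation": "student"}]
negative = [{"gender": "M", "occupation": "engineer"},
            {"gender": "F", "occupation": "engineer"},
            {"gender": "F", "occupation": "student"}]
fast = region_balance(cuboids, positive, negative)
slow = naive_balance(cuboids, positive, negative)
print(fast, slow)
```

With k cuboids there are at most 2^k regions, so once the ratings are aggregated the cost of recomputing balance no longer depends on the number of rating pairs.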
36
Experiments
  • Dataset
  • MovieLens: 100,000 ratings for 1682 movies by 943
    users
  • Each user has 4 attributes: Gender, Age,
    Occupation, Location
  • Binning the movies: order movies according to
    number of ratings and then partition into 6 bins
  • Bin 1: movies with the fewest ratings; Bin 6: movies
    with the most ratings
  • Evaluation
  • Quantitative indicators: Efficiency, Quality, and
    Scalability
  • Qualitative indicator: Mechanical Turk user
    study
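
The binning step could be implemented along the lines below. The slide does not say how the bin boundaries are chosen, so this sketch assumes bins of (nearly) equal size; `bin_movies` and the toy counts are hypothetical.

```python
def bin_movies(rating_counts, n_bins=6):
    """Order movies by number of ratings and split them into n_bins consecutive bins."""
    ordered = sorted(rating_counts, key=lambda movie: (rating_counts[movie], movie))
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)  # spread any remainder over early bins
        bins.append(ordered[start:end])
        start = end
    return bins


# Toy input: movie id -> number of ratings received.
counts = {f"m{i}": i for i in range(1, 13)}
bins = bin_movies(counts)
print(bins[0])   # movies with the fewest ratings
print(bins[-1])  # movies with the most ratings
```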

37
Quantitative Experiments: DEM
39
Qualitative Experiments: User Study
  • Amazon Mechanical Turk study
  • Two sets: one for description mining, one for
    difference mining
  • Each set: 4 randomly chosen movies, 30
    independent single-user tasks
  • Study 1: Users prefer simple aggregate ratings
    over rating interpretations
  • Study 2: Users prefer rating interpretations by
    exact algorithm or heuristic randomized hill
    exploration algorithm

42
Conclusion and Future Work
  • Novel problem of meaningful rating interpretation
    (MRI) in collaborative rating sites
  • Meaningful Description Mining (DEM)
  • Meaningful Difference Mining (DIM)
  • Heuristic algorithmic solutions that generate
    rating interpretations as good as those of the exact
    brute-force algorithms in much less execution time
  • Meaningful interpretations of ratings by
    reviewers of interest
  • Additional constraints such as diversity of
    rating explanations

43
Related Work
  • Data Cubes
  • Gray et al., A relational aggregation operator
    generalizing group-by, cross-tab, and sub-totals,
    ICDE 1996
  • Sathe et al., Intelligent rollups in
    multidimensional OLAP data, VLDB 2001
  • Lakshmanan et al., Quotient cube: how to
    summarize the semantics of a data cube, VLDB 2002
  • Ramakrishnan et al., Exploratory mining in cube
    space, ICDM 2006
  • Wu et al., Promotion analysis in
    multi-dimensional space, VLDB 2009
  • Clustering and Dimensionality Reduction
  • Agrawal et al., Automatic subspace clustering of
    high dimensional data for data mining
    applications, SIGMOD 1998
  • Recommendation Explanation
  • Herlocker et al., Explaining collaborative
    filtering recommendations, CSCW 2000
  • Bilgic et al., Explaining recommendations:
    Satisfaction vs. promotion, IUI 2005

44
Thank You
  • Questions

45
Quantitative Experiments: DIM
46
Quantitative Experiments: DEM, DIM