Matchin: Eliciting User Preferences with an Online Game

1
Matchin: Eliciting User Preferences with an
Online Game
  • Severin Hacker and Luis von Ahn
  • Carnegie Mellon University
  • SIGCHI 2009

2
Matchin
  • A game that asks two randomly chosen partners
    "which of these two images do you think your
    partner prefers?"

3
Some Findings
  • It is possible to extract a global "beauty"
    ranking within a large collection of images.
  • It is possible to extract the person's general
    image preferences.
  • Their model can determine a player's gender with
    high probability.

4
A Taxonomy of Methods
  • Absolute Versus Relative Judgments
  • Total Versus Partial Judgments
  • Random Access Versus Predefined Access
  • "I Like" Versus "Others Like"
  • Direct Versus Indirect

5
Existing Methods
  • Flickr Interestingness
  • Voting
  • Hot or Not

6
The Mechanism
  • Matchin is a two-player game that is played over
    the Internet.
  • Every game takes two minutes.
  • One pair of images usually takes between two and
    five seconds.
  • Matchin uses a collection of 80,000 images from
    Flickr that were gathered in October 2007.

7
The Scoring Function
  • Matchin uses a sigmoid function for scoring
    games.
  • Problem with a constant scoring function:
    players could get many points by quickly picking
    images at random.
  • Problem with an exponential scoring function:
    the rewards sometimes became too high.

8
The Data
  • The game was launched on May 15, 2008.
  • Within only four months, 86,686 games had been
    played by 14,993 players.
  • There have been 3,562,856 individual decisions
    (clicks) on images.
  • An individual decision/record is stored in the
    form
  • <id, game_id, player, better, worse, time,
    waiting_time>

9
Ranking Functions
  • Empirical Winning Rate (EWR)
  • ELO Rating
  • TrueSkill Rating

10
Empirical Winning Rate (EWR)
  • Function
  • Two problems:
  • For images that have a low degree (few
    comparisons), the empirical winning rate might
    be artificially high or low.
  • It does not take the quality of the competing
    image into account.
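
The function itself did not survive in this transcript. As a minimal sketch, assuming the empirical winning rate of an image is simply its number of wins divided by the number of comparisons it appeared in, it could be computed from the stored decisions as follows. The `Decision` type and the image ids are hypothetical and keep only the `better` and `worse` fields of a record.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class Decision(NamedTuple):
    """One click, reduced to the two fields the winning rate needs."""
    better: str  # id of the image the player picked
    worse: str   # id of the image the player passed over


def empirical_winning_rate(decisions: Iterable[Decision]) -> Dict[str, float]:
    """Return wins / comparisons for every image that appears at least once."""
    wins: Counter = Counter()
    games: Counter = Counter()
    for d in decisions:
        wins[d.better] += 1
        games[d.better] += 1
        games[d.worse] += 1
    return {image: wins[image] / games[image] for image in games}


# Made-up toy data: image "d" was compared only once and won.
decisions = [
    Decision("a", "b"),
    Decision("a", "c"),
    Decision("b", "c"),
    Decision("d", "a"),
]
rates = empirical_winning_rate(decisions)
```

In this toy data, image "d" gets a rate of 1.0 from a single comparison while "a" gets 2/3 from three, which illustrates the low-degree problem named on the slide.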

11
ELO Rating (1/2)
  • The ELO rating system was introduced for rating
    chess players.
  • Each chess player's performance in a game is
    modeled as a normally distributed random
    variable.
  • The mean of that random variable should reflect
    the player's true skill and is called the
    player's ELO rating.

12
ELO Rating (2/2)
  • Expected score
  • ELO rating

p (expected score)   dp (rating difference)
0.99                  677
0.9                   366
0.8                   240
0.7                   149
0.6                    72
0.5                     0
0.4                   -72
0.3                  -149
0.2                  -240
0.1                  -366
0.01                 -677

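
The expected-score and rating-update formulas are missing from this transcript. The sketch below uses the widely published logistic form of the Elo system; that the slides use this exact form is an assumption, and so is the step size `k = 32`. The table on this slide may come from a normal-distribution variant, so its values need not match this formula exactly.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(winner: float, loser: float, k: float = 32.0):
    """Return the new (winner, loser) ratings after one comparison.

    The winner's actual score is 1 and the loser's is 0, so both ratings
    move by k * (1 - expected score of the winner), in opposite directions.
    """
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# Two images that start with the same (assumed) rating of 1500.
new_winner, new_loser = update_ratings(1500.0, 1500.0)
```

Unlike the empirical winning rate, this update depends on the rating of the competing image: beating a much higher-rated image moves the rating more than beating a lower-rated one.
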
13
TrueSkill Rating (1/2)
  • Every player's skill s is modeled as a normally
    distributed random variable centered around a
    mean µ and per-player variance σ².
  • A player's particular performance in a game then
    is drawn from a normal distribution with mean s
    and a per-game variance β².

14
TrueSkill Rating (2/2)
  • Update
  • Conservative skill estimate
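
The update and the conservative estimate are not reproduced in this transcript. The sketch below follows the standard published TrueSkill update for a two-player game without draws and the usual conservative estimate µ − 3σ; whether the slides use exactly these forms is an assumption. The defaults µ = 25, σ = 25/3 and β = 25/6 are the commonly cited TrueSkill defaults, not values taken from the slides, and the skill-drift term of the full system is left out.

```python
import math


def _pdf(t: float) -> float:
    """Standard normal density."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


def _cdf(t: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def trueskill_update(winner, loser, beta: float = 25.0 / 6.0):
    """Update (mu, sigma) of winner and loser after one comparison, no draws."""
    (mu_w, sigma_w), (mu_l, sigma_l) = winner, loser
    c2 = 2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2
    c = math.sqrt(c2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # how surprising the outcome was
    w = v * (v + t)         # variance reduction factor, between 0 and 1
    new_winner = (mu_w + sigma_w ** 2 / c * v,
                  sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c2 * w))
    new_loser = (mu_l - sigma_l ** 2 / c * v,
                 sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c2 * w))
    return new_winner, new_loser


def conservative_estimate(mu: float, sigma: float) -> float:
    """Skill value the item exceeds with high probability under the model."""
    return mu - 3.0 * sigma


prior = (25.0, 25.0 / 3.0)
image_a, image_b = trueskill_update(prior, prior)
```

Ranking by the conservative estimate instead of µ keeps images with few comparisons, and therefore a large σ, from reaching the top of the list by chance.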

15
Collaborative Filtering (1/2)
  • In the collaborative filtering setting, they want
    to find out about each individual's preferences
    in order to:
  • recommend images to each user based on his/her
    preferences
  • compare users and images with each other
  • They have developed a new collaborative filtering
    algorithm they call Relative SVD.

16
Collaborative Filtering (2/2)
  • The user feature vectors
  • The image feature vectors
  • The amount by which user i likes image j
  • Data: a set D of triplets (i, j, k)
  • The error for a particular decision
  • The total sum of squared errors (SSE)
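
The formulas behind these bullets are missing from the transcript, so the following is only a sketch under stated assumptions, not the authors' algorithm as published. It assumes that a triplet (i, j, k) means user i preferred image j over image k, that the amount by which user i likes image j is the dot product of their feature vectors, and that the error of a decision is the gap between a target margin of 1 and the predicted difference in liking. The function names, learning rate, epoch count and margin are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sse(triplets, users, images, target=1.0):
    """Total sum of squared errors over all decisions."""
    return sum(
        (target - (dot(users[i], images[j]) - dot(users[i], images[k]))) ** 2
        for i, j, k in triplets
    )


def train(triplets, n_users, n_images, n_features=2,
          lr=0.05, epochs=200, target=1.0, seed=0):
    """Stochastic gradient descent on the SSE of relative judgments."""
    rng = random.Random(seed)
    users = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
             for _ in range(n_users)]
    images = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
              for _ in range(n_images)]
    initial = sse(triplets, users, images, target)
    for _ in range(epochs):
        for i, j, k in triplets:
            u = users[i][:]  # snapshot so all updates use the old values
            diff = [a - b for a, b in zip(images[j], images[k])]
            err = target - dot(u, diff)
            for f in range(n_features):
                users[i][f] += lr * err * diff[f]
                images[j][f] += lr * err * u[f]
                images[k][f] -= lr * err * u[f]
    return users, images, initial


# Made-up data: user 0 prefers image 0, user 1 prefers the opposite.
triplets = [(0, 0, 1), (0, 0, 2), (1, 1, 0), (1, 2, 0)]
users, images, initial_sse = train(triplets, n_users=2, n_images=3)
final_sse = sse(triplets, users, images)
```

Because only differences in liking enter the error, the model never needs an absolute rating for any image, which fits data that consists purely of pairwise choices.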

17
Comparison of the Models

18
Local Minimum
  • Do humans learn while playing the game?
  • They compared the agreement rate of first-time
    players and other players.
  • First-time players: 69.0%
  • More experienced players: 71.8%
  • They have also measured whether people learn
    within a game.
  • First half of the game: 67%
  • Second half of the game: 64%

19
Gender Prediction
  • The conditional entropy
  • The necessary conditional probabilities
    Pr(G = g | X = x) can be computed with Bayes' rule
    given the class conditionals Pr(X = x | G = g).
  • The naïve Bayes classifier will maximize the
    likelihood of the data
  • The total accuracy is 78.3%
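
A minimal sketch of the Bayes' rule step, assuming each question is a binary choice between two images and that answers are independent given the gender (the naïve Bayes assumption). The Laplace smoothing constant `alpha`, the labels "f" and "m", and the toy answers are made up for illustration and do not come from the slides.

```python
import math
from collections import Counter


def train_naive_bayes(samples, alpha=1.0):
    """Estimate Pr(G = g) and Pr(X_q = 1 | G = g) from (gender, answers) pairs."""
    n_questions = len(samples[0][1])
    class_counts = Counter(g for g, _ in samples)
    ones = {g: [0] * n_questions for g in class_counts}
    for g, answers in samples:
        for q, a in enumerate(answers):
            ones[g][q] += a
    priors = {g: c / len(samples) for g, c in class_counts.items()}
    likelihoods = {
        g: [(ones[g][q] + alpha) / (class_counts[g] + 2 * alpha)
            for q in range(n_questions)]
        for g in class_counts
    }
    return priors, likelihoods


def posterior(priors, likelihoods, answers):
    """Pr(G = g | X = x) via Bayes' rule, computed in log space."""
    log_joint = {}
    for g, prior in priors.items():
        lp = math.log(prior)
        for q, a in enumerate(answers):
            p = likelihoods[g][q]
            lp += math.log(p if a else 1.0 - p)
        log_joint[g] = lp
    top = max(log_joint.values())
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return {g: math.exp(v - top) / norm for g, v in log_joint.items()}


# Toy data: 1 means the player picked the first image of the pair.
samples = [("f", (1, 0, 1)), ("f", (1, 1, 1)),
           ("m", (0, 1, 0)), ("m", (0, 0, 0))]
priors, likelihoods = train_naive_bayes(samples)
post = posterior(priors, likelihoods, (1, 0, 1))
```

The predicted gender is the label with the largest posterior probability.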

20
The Top Ranked Images

21
Discussion (1/2)
  • The highest ranked pictures:
  • sunsets, animals, flowers, churches, bridges, and
    famous tourist attractions
  • neither provocative nor offensive
  • The worst pictures:
  • taken indoors and include a person
  • blurry or too dark
  • screenshots or pictures of documents or text

22
Discussion (2/2)
  • There are substantial differences among players
    in judging images, and taking those differences
    into account can greatly help in predicting the
    user's behavior on new images.
  • More experienced players had about the same error
    rate as new players.

23
Conclusion
  • The main contribution of this paper is to provide
    a new method to elicit user preferences.
  • They compared several algorithms for combining
    these relative judgments into a total ordering
    and found that they can correctly predict a
    user's behavior in 70% of the cases.
  • They describe a new algorithm called Relative SVD
    to perform collaborative filtering on pair-wise
    relative judgments.
  • They present a gender test that asks users to
    make some relative judgments and can predict a
    random user's gender in roughly 4 out of 5 cases.