Accurately Interpreting Clickthrough Data as Implicit Feedback

1
Accurately Interpreting Clickthrough Data as
Implicit Feedback
  • Thorsten Joachims, Laura Granka, Bing Pan, Helene
    Hembrooke, Geri Gay
  • Cornell University
  • SIGIR 2005
  • Presented by Rosta Farzan
  • PAWS Group Meeting

2
Problem
  • Adapting retrieval systems requires a large amount
    of data
  • Explicit data: expensive
  • Implicit data: noisy and unreliable
3
Goal
  • Evaluate which types of implicit feedback can
    be reliably extracted from observed user behavior

4
Outline
  • Introduction
  • User Study
  • Analysis
  • Discussion

5
Introduction
  • Designing a study to evaluate the reliability of
    implicit feedback
  • How users interact with the list of ranked
    results from Google search
  • Two types of analysis
      • Analysis of user behavior
          • Using eye-tracking logging
          • Do users scan from top to bottom?
          • How many abstracts do they read before clicking?
          • How does user behavior change if the results are
            manipulated artificially?
      • Analysis of implicit feedback
          • Comparing implicit feedback with explicit
            feedback collected manually

6
User Study
  • Task
      • Five navigational
          • Find related web pages
      • Five informational
          • Find specific information
      • Users read each question in turn and answered
        orally when they found the answer
  • Participants
      • Phase I
          • 34 undergraduates from different majors
          • Data from 29 were used because of eye-tracking issues
      • Phase II
          • 22 participants; data from 16 were used
  • Conditions
      • Phase I
          • Normal - Google's search results with no
            manipulation
      • Phase II
          • Normal - Google's search results with no
            manipulation
          • Swapped - Top two results switched in order
          • Reversed - 10 search results in reversed order

Navigational: Find the homepage of Michael Jordan, the statistician. Find the page displaying route map for Greyhound buses.
Informational: Where is the tallest mountain in New York located? Which actor starred as the main character in the original Time Machine movie?
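
The three presentation conditions listed above (normal, swapped, reversed) can be illustrated with a minimal sketch. The function name `apply_condition` and the string labels are hypothetical, chosen only for this illustration; the slides do not describe how the study's proxy actually rewrote the results page.

```python
from typing import List


def apply_condition(results: List[str], condition: str) -> List[str]:
    """Return the ranking as it would be shown under one study condition.

    Hypothetical helper for illustration; not the study's actual code.
    """
    ranked = list(results)  # copy so the caller's list is not modified
    if condition == "normal":
        # Search results shown with no manipulation.
        return ranked
    if condition == "swapped":
        # Top two results are switched; everything else stays in place.
        if len(ranked) >= 2:
            ranked[0], ranked[1] = ranked[1], ranked[0]
        return ranked
    if condition == "reversed":
        # The first 10 results are shown in reversed order.
        return ranked[:10][::-1] + ranked[10:]
    raise ValueError(f"unknown condition: {condition}")


original = [f"r{i}" for i in range(1, 11)]
print(apply_condition(original, "swapped"))
print(apply_condition(original, "reversed"))
```
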
7
User Study
  • Data Collection
      • Implicit data
          • HTTP proxy server logs all click-stream data
          • Eye-tracking
              • Fixations
      • Explicit data
          • Five judges, each assessing two questions plus 10
            results pages from two other questions
          • Judges order the randomized results by how relevant they
            are
          • Relative decision making
          • Inter-judge agreement
              • Phase I (ordering top 10): 89.5%
              • Phase II (ordering all results): 82.5%
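
The slide does not say how the inter-judge agreement figures were computed. One plausible reading, given that judges produce relative orderings, is a pairwise measure: over all pairs of results where both judges state a strict preference, count the fraction on which they agree. The sketch below works under that assumption; `pairwise_agreement` and the rank dictionaries are hypothetical names for illustration.

```python
from itertools import combinations
from typing import Dict, Optional


def pairwise_agreement(judge_a: Dict[str, int],
                       judge_b: Dict[str, int]) -> Optional[float]:
    """Fraction of result pairs that two judges order the same way.

    Each dict maps a result id to its rank (lower = more relevant).
    Equal ranks express a tie; pairs tied for either judge are skipped.
    Assumed definition for illustration, not taken from the slides.
    """
    shared = sorted(set(judge_a) & set(judge_b))
    agree = 0
    total = 0
    for x, y in combinations(shared, 2):
        diff_a = judge_a[x] - judge_a[y]
        diff_b = judge_b[x] - judge_b[y]
        if diff_a == 0 or diff_b == 0:
            continue  # no strict preference from at least one judge
        total += 1
        if (diff_a < 0) == (diff_b < 0):
            agree += 1
    return agree / total if total else None


a = {"p": 1, "q": 2, "r": 3}
b = {"p": 1, "q": 3, "r": 2}
print(pairwise_agreement(a, b))  # judges agree on 2 of the 3 pairs
```
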

8
Analysis of User Behavior
  • Which links do users view and click?
  • Do users scan links from top to bottom?
  • Which links do users evaluate before clicking?

9
Which Links do Users View and Click?
Users click substantially more often on the first link than on the second
Scrolling
10
Do Users Scan Links from Top to Bottom?
On average, users tend to read from top to bottom
There is a big gap before viewing the third-ranked link
Users first scan the viewable results quite thoroughly before scrolling
11
Which Links do Users Evaluate before Clicking?
Users view substantially more abstracts above the click than below it
12
Analysis of Implicit Feedback
  • How does the relevance of a document to the query
    influence the clicking decision?
  • What do clicks tell us about the relevance of a
    document?

13
Does Relevance Influence User Decision?
  • Using the reversed condition
      • Lower quality of retrieval
  • Users react to the relevance of the presented
    links
      • Users view lower-ranked links more frequently
      • Users scan significantly more abstracts
      • Users click less often on the first-ranked link
      • Users click more often on lower-ranked links

14
Are Clicks Absolute Relevance Judgments?
  • Trust bias
      • The link ranked first receives
        many more clicks
  • Quality bias
      • Comparing clicking behavior in the normal condition
        vs. the reversed condition
      • With lower-quality results, users click on abstracts that
        are on average less relevant

15
Are Clicks Relative Relevance Judgments?
  • Consider not-clicked links as well as clicks as
    feedback signals
  • Example: l1, l2, l3, l4, l5, l6, l7, with clicks on l1, l3, and l5
  • Strategy 1: Click > Skip Above
      • rel(l3) > rel(l2), rel(l5) > rel(l2), rel(l5) >
        rel(l4)
      • Phase I data supports this strategy but Phase II data
        doesn't
  • Strategy 2: Last Click > Skip Above
      • Earlier clicks might be less informed than later
        clicks
      • rel(l5) > rel(l2), rel(l5) > rel(l4)
      • Still not supported by Phase II data
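
The first two strategies can be sketched as small functions that turn observed clicks into relative preferences. This is only an illustration of the rules as stated on the slide: the function names and the `(i, j)` pair encoding are assumptions, and the click order 3, 1, 5 is the one assumed on the Strategy 3 slide.

```python
from typing import List, Set, Tuple

# (i, j) encodes rel(l_i) > rel(l_j); ranks are 1-based.
Preference = Tuple[int, int]


def click_gt_skip_above(clicked: Set[int]) -> List[Preference]:
    """Strategy 1: a clicked link beats every skipped link ranked above it."""
    return [(i, j)
            for i in sorted(clicked)
            for j in range(1, i)
            if j not in clicked]


def last_click_gt_skip_above(click_order: List[int]) -> List[Preference]:
    """Strategy 2: only the last click in time beats skipped links above it."""
    if not click_order:
        return []
    last = click_order[-1]          # rank of the temporally last click
    clicked = set(click_order)
    return [(last, j) for j in range(1, last) if j not in clicked]


print(click_gt_skip_above({1, 3, 5}))       # [(3, 2), (5, 2), (5, 4)]
print(last_click_gt_skip_above([3, 1, 5]))  # [(5, 2), (5, 4)]
```

Both outputs match the preference pairs listed on the slide for clicks on l1, l3, and l5.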

16
Strategies
  • Strategy 3: Click > Earlier Click
      • Clicks later in time are on more relevant
        abstracts
      • Assuming the order of clicks is 3, 1, 5
      • rel(l1) > rel(l3), rel(l5) > rel(l3), rel(l5) > rel(l1)
      • Not supported by data
  • Strategy 4: Last Click > Skip Previous
      • Constraint only between a clicked link and a
        not-clicked link immediately above
      • Result is similar to Strategy 1
  • Strategy 5: Click > No-Click Next
      • Constraint between a clicked link and an
        immediately following not-clicked link
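
Strategies 3 to 5 can be sketched in the same spirit, using the example of seven results with clicks on ranks 1, 3, and 5 made in the order 3, 1, 5. The function names and the `(i, j)` encoding are hypothetical. For Strategy 4 the sketch follows the bullet's description and applies the rule to every clicked link; restricting it to the last click, as the strategy's name might suggest, would be a small change.

```python
from typing import List, Set, Tuple

# (i, j) encodes rel(l_i) > rel(l_j); ranks are 1-based.
Preference = Tuple[int, int]


def click_gt_earlier_click(click_order: List[int]) -> List[Preference]:
    """Strategy 3: a later click beats every link clicked before it."""
    return [(later, earlier)
            for pos, later in enumerate(click_order)
            for earlier in click_order[:pos]]


def click_gt_skip_previous(clicked: Set[int]) -> List[Preference]:
    """Strategy 4 (as described): clicked link vs. skipped link just above."""
    return [(i, i - 1)
            for i in sorted(clicked)
            if i >= 2 and (i - 1) not in clicked]


def click_gt_no_click_next(clicked: Set[int], n_results: int) -> List[Preference]:
    """Strategy 5: clicked link vs. the not-clicked link right after it."""
    return [(i, i + 1)
            for i in sorted(clicked)
            if i + 1 <= n_results and (i + 1) not in clicked]


print(click_gt_earlier_click([3, 1, 5]))     # [(1, 3), (5, 3), (5, 1)]
print(click_gt_skip_previous({1, 3, 5}))     # [(3, 2), (5, 4)]
print(click_gt_no_click_next({1, 3, 5}, 7))  # [(1, 2), (3, 4), (5, 6)]
```
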