Improving Search Results Quality by Customizing Summary Lengths

1
Improving Search Results Quality by Customizing
Summary Lengths
  • Michael Kaisser (University of Edinburgh),
    Marti Hearst (UC Berkeley)
  • and John B. Lowe (Powerset, Inc.)
  • ACL-08: HLT

2
Talk Outline
  • How best to display search results?
  • Experiment 1: Is there a correlation between
    response type and response length?
  • Experiment 2: Can humans predict the best
    response length?
  • Summary and Outlook

3
Motivation
  • Web search result listings today are largely
    standardized: they display a document's surrogate
    (Marchionini et al., 2008)
  • Typically: one header line, two lines of text
    fragments, one line for the URL
  • But: is this the best way to present search
    results? In particular, is this the optimal length
    for every query?

(Source: Yahoo!)
4
Experiment 1: Research Question
  • Do different types of queries require responses
    of different lengths?
  • (And if so, is the preferred response length
    dependent on the expected semantic response
    type?)

5
Experiment 1: Setup
  • Data used:
    • 12,790 queries from Powerset's query database
    • Contains search engine query logs and
      hand-crafted queries
    • Disproportionately large number of natural
      language queries

6
Experiment 1: Setup
  • Disproportionately large number of natural
    language queries.
  • Examples:
    • date of next US election
    • Hip Hop
    • A synonym for material
    • highest volcano
    • What problems do federal regulations cause?
    • I want to make my own candles
    • industrial music

7
Excursus: Mechanical Turk
  • Amazon Web Services API that lets computers
    integrate "artificial artificial intelligence"
  • Requesters can upload Human Intelligence Tasks
    (HITs)
  • Workers work on these HITs and are paid small
    sums of money
  • Examples:
    • Can you see a person in the photo?
    • Is the document relevant to a query?
    • Is the review of this product positive or
      negative?

8
Excursus: Mechanical Turk
  • Amazon Web Services API that lets computers
    integrate "artificial artificial intelligence"
  • Requesters can upload Human Intelligence Tasks
    (HITs)
  • Workers work on these HITs and are paid small
    sums of money
  • → Mechanical Turk can also be seen as a
    platform for online experiments

9
Experiment 1
  • Turkers are asked to classify queries by:
    • Expected response type
    • Best response length
  • Each query is classified by three different
    subjects.

11
Experiment 1: Results
  • Distribution of length categories differs across
    individual expected response categories.
  • Some results are intuitive:
    • Queries for numbers want short results
    • Advice queries want longer results
  • Some results are more surprising:
    • Different length distributions for Person vs.
      Organization
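
The comparison described on this slide can be sketched as a per-category tally of length labels. This is only an illustration: the function name, the category names, and the labels below are hypothetical and are not taken from the experiment's data.

```python
from collections import Counter, defaultdict

def length_distribution(labels):
    """Map each response type to the relative frequency of each length label."""
    counts = defaultdict(Counter)
    for response_type, length in labels:
        counts[response_type][length] += 1
    return {
        rtype: {length: n / sum(c.values()) for length, n in c.items()}
        for rtype, c in counts.items()
    }

# Hypothetical (response type, preferred length) labels, for illustration only.
labels = [
    ("number", "phrase"), ("number", "phrase"),
    ("number", "phrase"), ("number", "sentence"),
    ("advice", "paragraph"), ("advice", "paragraph"),
    ("advice", "paragraph"), ("advice", "article"),
]

dist = length_distribution(labels)
for response_type, shares in dist.items():
    print(response_type, shares)
```

Comparing the rows of such a table is one simple way to see whether length preferences differ between response categories.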

12
Experiment 2: Research Question
  • Can human judges correctly predict the preferred
    result length?

13
Experiment 2: Setup
  • Experiment 1 produced 1099 high-confidence
    queries (where all three turkers agreed on
    semantic category and length)
  • For 170 of these, turkers manually created
    snippets from Wikipedia of different lengths:
    • Phrase
    • Sentence
    • Paragraph
    • Section
    • Article (in this case a link to the article was
      displayed)
  • Note: Categories differ slightly from the first
    experiment
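
The high-confidence criterion (all three turkers agree on both semantic category and length) can be sketched as a simple filter. The function name and data layout are assumptions made for illustration, and the labels attached to the example queries are hypothetical.

```python
def high_confidence(judgments, n_judges=3):
    """Keep queries whose judges all gave the same (category, length) label."""
    kept = {}
    for query, labels in judgments.items():
        if len(labels) == n_judges and len(set(labels)) == 1:
            kept[query] = labels[0]
    return kept

# Hypothetical judgments: one (category, length) pair per turker.
judgments = {
    "highest volcano": [("location", "phrase")] * 3,
    "I want to make my own candles": [
        ("advice", "article"), ("advice", "article"), ("advice", "section"),
    ],
    "Hip Hop": [
        ("general", "article"), ("general", "article"), ("general", "article"),
    ],
}

agreed = high_confidence(judgments)
print(agreed)
```

A query is dropped as soon as one judge disagrees on either part of the label, which is why the second query is filtered out here.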

14
Experiment 2: Setup
Manually created snippets from Wikipedia of
different lengths
15
Experiment 2: Setup
  • Displayed:
    • Instructions
    • Query
    • One response from one length category
    • Rating scale
  • Each HIT was shown to ten turkers.

16
Experiment 2: Setup
Instructions: Below you see a search engine
query and a possible response. We would like you
to give us your opinion about the response. We
are especially interested in the length of the
response. Is it suitable for the query? Is there
too much or not enough information? Please rate
the response on a scale from 0 (very bad
response) to 10 (very good response).
18
Experiment 2: Significance
Significance results of unweighted linear
regression on the data for the second experiment,
which was separated into four groups based on the
predicted preferred length.
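
As a rough illustration of an unweighted (ordinary least squares) linear fit, the sketch below regresses ratings on the rank of the displayed length category for a single group. The rank encoding (0 = Phrase up to 4 = Article) and the ratings are assumptions made for this example, not the experiment's data, and no significance test is computed.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical mean ratings (0 to 10) for a group predicted to prefer short
# responses; x is the assumed rank of the displayed length category.
length_ranks = [0, 1, 2, 3, 4]
ratings = [8, 7, 5, 4, 2]

slope, intercept = ols_fit(length_ranks, ratings)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Under these assumptions, a negative slope would mean that ratings fall as responses get longer, and a positive slope the opposite.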
19
Experiment 2: Details
  • 146 queries
  • 5 length categories per query
  • 10 judgments per query and length category
  • 7,300 judgments
  • 124 judges
  • 16 judges did more than 146 HITs
    • 2 of these 16 were excluded (scammers)
  • $0.01 per judgment
  • $73 paid to judges, plus $73 Amazon fees
  • $146 for Experiment 2 (excluding snippet
    generation)
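
The figures on this slide fit together arithmetically, assuming the ten judgments were collected for each combination of query and length category. The variable names below are illustrative, and the fee amount is copied from the slide rather than derived from a fee schedule.

```python
from decimal import Decimal

queries = 146
length_categories = 5
judgments_per_response = 10          # assumed: per query and length category
pay_per_judgment = Decimal("0.01")
amazon_fees = Decimal("73")          # figure as stated on the slide

total_judgments = queries * length_categories * judgments_per_response
judge_pay = total_judgments * pay_per_judgment
total_cost = judge_pay + amazon_fees

print(total_judgments, judge_pay, total_cost)
```

Decimal is used instead of float so that the per-judgment amount of 0.01 does not introduce rounding error.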

20
Experiment 2: Results
  • Results
  • Human judges can predict the preferred result
    lengths (at least for a subset of especially
    clear queries)

21
Experiment 2: Results
  • Results
  • Human judges can predict the preferred result
    lengths (at least for a subset of especially
    clear queries)
  • → Standard result listings are often too short
    (and sometimes too long)

22
Outlook
  • Can queries be automatically classified according
    to their predicted result length?
  • Initial experiment:
    • Unigram word counts
    • 805 training queries, 286 test queries
    • Three length bins (long, short, other)
    • Weka NaiveBayesMultinomial
  • Initial result:
    • 78% of queries correctly classified
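
The classification setup named on this slide (unigram counts fed to a multinomial Naive Bayes model) can be sketched in a few lines. This is a plain Python illustration, not the Weka implementation used in the talk; the function names, the add-one smoothing, and the tiny training set with its labels are all assumptions.

```python
import math
from collections import Counter, defaultdict

def train(examples, alpha=1.0):
    """Fit multinomial Naive Bayes on (query, label) pairs using unigram counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, alpha

def predict(model, text):
    """Return the label with the highest smoothed log-posterior."""
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training queries with made-up length bins.
training = [
    ("how do i make candles", "long"), ("how to fix a bike", "long"),
    ("highest volcano", "short"), ("date of next election", "short"),
    ("hip hop", "other"), ("industrial music", "other"),
]
model = train(training)
print(predict(model, "how to make soap"), predict(model, "highest mountain"))
```

With only six training queries the predictions are purely illustrative; the experiment on the slide used 805 training and 286 test queries.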

23
  • Thank you!

24
MT Demographics - Age
Survey, data and graphs from Panos Ipeirotis'
blog: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
25
MT Demographics - Gender
Survey, data and graphs from Panos Ipeirotis'
blog: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
26
MT Demographics - Education
Survey, data and graphs from Panos Ipeirotis'
blog: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
27
MT Demographics - Income
Survey, data and graphs from Panos Ipeirotis'
blog: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
28
MT Demographics - Purpose
Survey, data and graphs from Panos Ipeirotis'
blog: http://behind-the-enemy-lines.blogspot.com/2008/03/mechanical-turk-demographics.html
30
Excursus: Mechanical Turk
Example HIT (not ours)