Provided by: DavidH234
Learn more at: https://nlp.stanford.edu
1
What's in store for question-answering?
Prognostications based on corpus analysis of several
hundred million questions
  • John B. Lowe
  • Vice President for Language Engineering and Chief
    Linguist
  • Ask Jeeves, Inc. Emeryville, CA
  • October 7, 2000
  • Joint SIGDAT Conference on Empirical Methods in
    Natural Language Processing and Very Large
    Corpora
  • Hong Kong

2
Overview
  • Take-home messages when considering the Q-A
    task
  • Make sure you understand the question
  • Know what constitutes an answer
  • Robustness, Robustness, Robustness
  • Some anecdotes and a few statistics
  • Query types (keywords, questions, stories, etc.)
  • From both the consumer side of AJ (i.e. ask.com)
    as well as the corporate side
  • Prognostications
  • The best systems will be hybrids
  • Knowledge cliff is as tall as ever, if not
    taller, and it will be some time before it is
    climbed. Set expectations accordingly!

3
Another View of an Overview
  • This presentation contains
  • 12 actual or nearly actual user queries (UQs);
    indeed, all queries cited are real, unless cited
    from the literature or specifically marked
  • 7 Rhetorical questions
  • 5 Summary statistics covering a subcorpus of
    approximately 1B UQs

4
Definitions: What is an answer?
  • Short, coherent, responsive snippet of text
  • Result of a Computation or Deduction
  • The Trace of the process of arriving at a result
  • Longer snippet of text (a Passage)
  • Reference to a document or part of a document
  • Summary or extract from a document
  • Document
  • Set of documents
  • Audio, video, etc.
  • Some combination of the above

Increasing length, complexity
5
Definitions: What is a query?
  • One or more Keywords
  • Keywords with Boolean Operators or additional
    user supplied structure
  • Numeric or other Parametric Values set via UI
  • Phrases (keywords with linguistic coherence)
  • (Grammatical) Sentences with interrogative or
    imperative syntax
  • Short Discourses, usually concluded with a
    question
  • Audio, video, etc.
  • Some combination of the above

Increasing length, complexity
6
Question-Answering vs. IR
  • Classically, question-answering systems provided
    answers in response to questions
  • In contrast, IR systems provided documents in
    response to queries, normally composed of
    keywords
  • Corollaries
  • Providing documents in response to questions is
    not question-answering
  • Providing answers in response to queries is not
    IR
  • However, the world is not black and white. More
    like "black and grey" (Graham Greene)

7
Question-Answering vs. IR
[Diagram labels: Classical Question-Answering; Classical
Information Retrieval; TREC-8 Like Hybrid]
8
Query types arranged on a Difficulty Scale
  • Keywords and Keywords Plus
  • Short, factual (TREC-8-like) Questions
  • Hard Lookup Questions
  • Questions that look hard but aren't
  • Questions that look hard and are
  • Story Problems: Two Flavors
  • These will usually be all jumbled together!

Difficulty
9
An Anecdotal Analysis of the Question-Answering
Task, Based on User Behavior and Expectations,
as Reflected by What They Ask
  • NB
  • Most of these UQs are from ask.com, the
    open-domain consumer-oriented web site.
  • Some UQs from corporate implementations have been
    modified to protect the anonymity of customers

10
Users may not be trainable
  • Users often expect the system to derive or
    otherwise obtain appropriate context
  • At minimum, POS, WSD and other basic linguistic
    and semantic distinctions are expected.
  • Users may attempt to provide such context if they
    feel it is important or unavailable to the system
  • In which case, watch out!
  • Users may evaluate the system to determine how
    best to provide input (and context)
  • Often this is done by experimental input
  • Muddies the user log and challenges adaptive
    approaches

11
Intentions of users are complex
NB: this data is for ask.com!
12
Short, factual questions
  • Where is Greenwich, CT? (Lehnert 1982)
  • 42N, 80W
  • About 90 miles north of New York City
  • Where is the Taj Mahal? (TREC-8)
  • Atlantic City, New Jersey, USA
  • Agra, Uttar Pradesh, India
  • NB: the answer reflects the cultural bias of the
    corpus from which it was obtained
  • What is the meaning of life?
  • For Ask Jeeves, it is (found in) a URL

13
Answers to Hard Lookup Questions
  • Can my gynacologist [sic] tell my parents if I'm
    pregnant?
  • Answers can be very, very short!
  • In this case, however, though the length of the
    answer is a single bit, an authoritative document
    is probably the best response

14
Hard Lookup questions (contd)
  • Can my gynacologist [sic] tell my parents if I'm
    pregnant?
  • Sometimes spelling errors reflect language
    competence (and therefore indirectly age)
  • Sex of asker is (nominally) clear. This bit of
    personalization is of course based on real world
    knowledge
  • Use of "if" rather than "that" prevents
    presupposing the asker is pregnant
  • Utility of the answer is very different depending
    on whether the presupposition is true or not!

15
Another Hard Lookup Question
  • What did Tom Hanks_i say to Private Ryan as he_i
    was dying?
  • Answer is a 3-4 second snippet which occurs at
    about 2:36:00 out of 2:48:00 total duration
  • Soundtrack is complex at this moment; hard to
    pick out even for native speakers, but the
    utterance seems to be
  • "earn this ... earn it"

16
How do we get the answer?
  • Assumptions
  • We have the film and permission to use it.
  • We have time-aligned markup of text and video
  • We have the tools to handle such multimedia
    access
  • All of these technical issues are still a
    challenge
  • But the really tough part is still the
    relationship between the language and the real
    world (i.e. primarily linguistics)
  • Does the markup indicate the states of people --
    alive, dead, or in between (i.e. dying)?
  • More importantly, interpreting the question seems
    to require Mental Spaces (Fauconnier 1985, 1988,
    c).

17
Mental Spaces Required?
In reality, Tom Hanks never said anything to
Private Ryan.
Conclusion: while IR may bring one within
striking distance of the answer, high-level NLU is
potentially required to determine whether you really
got the right one.
18
Having said all that...
  • Purely serendipitously, there is an IR solution
    to this PARTICULAR question (using the
    encyclopedic aspect of the web)
  • Some search engines retrieve discussions about
    this apparently important moment (which in some
    ways is the climax of the movie)

19
Some hard questions are easy
  • Sometimes, Big Differences are not important:
    the following two sentences have quite
    different syntax, but share most of
    their answers in common.
  • But note: both cannot be answered "Oh, not far!"

English:
  What is the distance from Tokyo to Yokohama?
  How far is it from Tokyo to Yokohama?
Japanese (romanized):
  toukyou to yokohama no aida no kyori ha, ikura desu ka?
  toukyou kara yokohama made ha, dono gurai hanarete imasu ka?
20
Some easy questions are hard
  • Small Differences may be important
  • Books by kids
  • Books for kids (these differ only by
    stopwords)
  • Books for under 20
  • Books about kids
  • (Riloff et al. (1994), Pustejovsky, Lexeme,
    NPR 6/2000)
  • In this case the stop words are critical
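The effect described above can be sketched in a few lines of Python. This is only an illustration: the stopword list and the `strip_stopwords` helper are assumptions made for the example, not anything Ask Jeeves is known to have used.

```python
# Hypothetical stopword list, chosen only to illustrate the slide's point.
STOPWORDS = {"by", "for", "about", "under", "the", "a", "of"}


def strip_stopwords(query):
    """Lowercase the query, split on whitespace, and drop stopwords."""
    return [word for word in query.lower().split() if word not in STOPWORDS]


queries = [
    "Books by kids",
    "Books for kids",
    "Books about kids",
    "Books for under 20",
]

for query in queries:
    # After filtering, the first three queries become indistinguishable.
    print(query, "->", strip_stopwords(query))
```

With this list, "Books by kids", "Books for kids" and "Books about kids" all reduce to the same two keywords, so a keyword-only system cannot tell authorship, audience and topic apart.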

21
Story Problem 1 (conventional)
  • After Bobrow 1967 (and Dreyfus 1972, 1992)
  • NB: requires you to "Show your work!" (i.e.
    display trace)

Elizabeth, Brian, Dean and Leslie want to cross
a bridge. They all begin on the same side and
have only 17 minutes to get everyone across to
the other side. It is night and there is only one
flashlight. A max of two people can cross at one
time. Any party who crosses, either 1 or 2
people, must have the flashlight with them. The
flashlight must be walked back and forth; it
cannot be thrown. Each student walks at a
different speed - Elizabeth 1 minute, Brian 2
minutes, Dean 5 minutes, and Leslie 10 minutes.
A pair must walk together at the rate of the
slower student's pace. How can they get everyone
across in 17 minutes?
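For a machine, the computation itself is the easy part once the story has been understood and formalized. As a minimal sketch, assuming the text has already been reduced to a table of walking times (the function name `solve_bridge` and the state encoding are choices made for this example), a shortest-path search finds a schedule and returns its trace:

```python
import heapq
from itertools import combinations

# Walking times in minutes, taken from the story above.
WALK_TIMES = {"Elizabeth": 1, "Brian": 2, "Dean": 5, "Leslie": 10}


def solve_bridge(times):
    """Dijkstra search over (people on near side, flashlight on near side?)."""
    everyone = frozenset(times)
    start = (everyone, True)
    best = {start: 0}
    # Heap entries: (elapsed minutes, tie-breaker, state, trace so far).
    heap = [(0, 0, start, [])]
    counter = 1
    while heap:
        elapsed, _, (near, light_near), trace = heapq.heappop(heap)
        if not near:
            return elapsed, trace  # everyone is across
        if elapsed > best.get((near, light_near), float("inf")):
            continue  # stale heap entry
        # Only people on the flashlight's side can move.
        side = near if light_near else everyone - near
        for size in (1, 2):
            for party in combinations(sorted(side), size):
                cost = max(times[p] for p in party)  # slower walker sets the pace
                new_near = near - set(party) if light_near else near | set(party)
                state = (new_near, not light_near)
                total = elapsed + cost
                if total < best.get(state, float("inf")):
                    best[state] = total
                    arrow = "->" if light_near else "<-"
                    step = f"{arrow} {' + '.join(party)} ({cost} min)"
                    heapq.heappush(heap, (total, counter, state, trace + [step]))
                    counter += 1
    return None


total, trace = solve_bridge(WALK_TIMES)
print(f"Total: {total} minutes")
for step in trace:
    print(step)
```

The search returns a total of 17 minutes together with the sequence of crossings, which is the "show your work" trace. The hard part for a question-answering system is everything before the first line of this code: extracting the constraints from the prose.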
22
Story Problem 2, (a user's lament)
  • Typically, a customer support problem
  • Often, these are not really questions

PLEASE HELP ME! I don't know who to ask. I
want to mail merge a specific category from
address book in email program and all I can
figure out how to do is merge the entire
mailing list. If you can't help me, please tell
me who can. Thank you
23
Story Problem 3, (I've almost got it)
  • Sentence punctuation is poor
  • Identification and tokenization of NEs is a
    challenge

I cant install Age of Empires now that I 've
upgraded to win98 from Win 95 computer says I
have 1GB of hard drive space but the installation
failed after taking 30 minutes with the words not
enough hard drive space. Should I update the
drive to FAT 32 and try again?
24
Story Problem 4, (share my misery)
  • "I have SuperOS 1776 and a Hogwarts Color 999
    printer. I had to reformat my computer and now I
    haven't been able to find a driver to reinstall
    the printer.. I've only found a driver for
    SuperOS 1492 and 1789. I tried it anyway, and big
    surprise, it didn't work."
  • Deep NLP is going to have some trouble here (even
    people do!)
  • Would IR work better?

25
Story Problem 5, (a poem)
  • i want to customize my mouse and keyboard
  • by having mouse in 3-dimensional
  • and having mouse trails
  • i also want to slow down the cursor blink rate
  • am also having trouble ith the left mouse butoon
  • as i am left handed
  • which steps do i take to change these things
  • The e e cummings approach to keyboard entry
  • The point: tremendous stylistic variation exists
    in typography, orthography, conceptualization,
    and so on.

26
Statistical Properties of AJ UQs
Average length of user query (on ask.com; punctuation excluded): 4.8
UQs which contain an unknown token: 30%
Unknown tokens which are errors: 60%
Queries which begin with wh-words: 37%
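Two of these statistics are straightforward to compute from a query log. The sketch below is an illustration only: the tokenization (strip ASCII punctuation, split on whitespace) and the wh-word list, which counts "how" as a wh-word, are assumptions and may differ from the definitions behind the figures above. The sample queries are taken from elsewhere in this presentation.

```python
import string

# Assumed wh-word inventory; "how" is included by convention.
WH_WORDS = {"what", "where", "when", "who", "whom", "whose", "which", "why", "how"}

# Translation table that deletes ASCII punctuation.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def tokenize(query):
    """Remove punctuation, lowercase, and split on whitespace."""
    return query.translate(_STRIP_PUNCT).lower().split()


def query_stats(queries):
    """Return (average length in tokens, share of queries starting with a wh-word)."""
    token_lists = [tokens for tokens in map(tokenize, queries) if tokens]
    avg_len = sum(len(tokens) for tokens in token_lists) / len(token_lists)
    wh_share = sum(tokens[0] in WH_WORDS for tokens in token_lists) / len(token_lists)
    return avg_len, wh_share


sample = [
    "where can i browse lyrics?",
    "am i in love?",
    "cars",
    "how do i make a web page?",
]
avg_len, wh_share = query_stats(sample)
print(f"average length: {avg_len:.2f} tokens, wh-initial: {wh_share:.0%}")
```

Detecting unknown tokens would additionally require a lexicon, which is omitted here.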

27
Building REs from NEs
  • for Britney Spears
  • Br.n.y Sp.r.s
  • How to build these from scratch?
  • But see Brill et al. in this workshop for a
    solution!
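As a rough sketch of the idea, a spelling-tolerant regular expression can be assembled from a hand-picked "skeleton" of letters that most spellings of the name share. The helper `build_name_regex` and the skeleton below are hypothetical; choosing the skeleton automatically is exactly the open question the slide raises, and this example does not address it.

```python
import re


def build_name_regex(skeleton):
    """Build a case-insensitive regex from per-word lists of anchor segments.

    Segments within a word are joined by \\w*, so any run of letters
    (including none) may appear between them.
    """
    words = [r"\w*".join(re.escape(seg) for seg in word) for word in skeleton]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


# Hand-chosen anchors mirroring the slide's pattern for "Britney Spears".
NAME_RE = build_name_regex([["Br", "n", "y"], ["Sp", "r", "s"]])

for query in [
    "where can i find britney spears lyrics",
    "brittany speers pictures",
    "britny spers",
    "brian sparks",
]:
    print(query, "->", bool(NAME_RE.search(query)))
```

The pattern accepts the three misspelled or correct variants and rejects "brian sparks", but a skeleton this loose will also admit false positives on a large log, so it is best treated as a candidate generator rather than a final matcher.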

28
Distribution of query lengths
29
Zipf's Law applies to user queries
  • Rank-frequency distribution of UQs with
    3600 > f > 1100 (for a day or so)
  • 3524 where can i browse lyrics?
  • 2713 where can i find online airfare specials?
  • 2532 is jeeves gay?
  • 2216 how can i find someone?
  • 2190 where can i find a reverse phone directory?
  • 2120 where can i find information on captech?
  • 1980 (filtered)
  • 1934 (filtered)
  • 1852 where can i find erotica from white shadows?
  • 1787 where can i get driving directions between
    cities?
  • 1585 am i in love?
  • 1567 how do i make a web page?
  • 1544 cars
  • 1520 where can i find a reverse email directory?
  • 1485 how do i use the internet to find a job?
  • 1420 (filtered)
  • 1336 where can i find the lyrics to songs by
    eminem?
  • 1323 where can i listen to music online?
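One way to inspect such a distribution is to fit a straight line to log frequency against log rank. The sketch below uses the 18 counts listed above, including the filtered entries. Note the assumption: ranks are numbered from 1 within this excerpt, which is not necessarily each query's true rank in the full log, so the fitted slope describes only this slice and should not be read as the Zipf exponent of the whole corpus.

```python
import math

# Frequencies from the list above, in the order shown (filtered entries included).
COUNTS = [
    3524, 2713, 2532, 2216, 2190, 2120, 1980, 1934, 1852,
    1787, 1585, 1567, 1544, 1520, 1485, 1420, 1336, 1323,
]


def loglog_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Ranks are assumed to start at 1 for the first count given.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(freq) for freq in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


slope = loglog_slope(COUNTS)
print(f"log-log slope over this excerpt: {slope:.2f}")
```

A roughly linear log-log plot with negative slope is the signature of a Zipf-like distribution; a fit over the complete rank-frequency table would be needed to estimate the exponent properly.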

30
Conclusions
31
Sobering Insights (or Nothing New)?
  • The bar has been considerably raised!
  • "Communication is not accomplished by the
    exchange of symbolic expressions. Communication
    is, rather, the successful interpretation by an
    addressee of a speaker's intent in performing a
    linguistic act." (Green 1996)
  • Hybrid approaches will be de rigueur for many
    practical applications: we need to work on
    combining outputs from
  • Search engines
  • QA systems
  • Other inferencing engines (decision tree, CBR,
    etc.)
  • We must make friends with our users (and provide
    cognitively appealing UIs)
  • If you ask the same question, you get the same
    answer! (a distinctly unhuman behavior, e.g.
    "Where?")
  • Knowing when you don't know: understanding
    failure modes of the system and communicating
    this to the user

32
Thank You!