1
Question Answering and Logistics
2
Class Logistics
  • Comments on proposals will be returned next week
    and may be available as early as Monday
  • Look at the website for slides
  • Grades on presentations and discussant questions

3
Next week (2/21)
  • Invited speaker: John Prager, IBM
  • Location: 7th-floor Interschool Lab
  • Class structure:
  • First half: talk
  • Second half: discussion
  • Discussants: David Smith, Narayan
  • Others raise questions
  • This is your chance to find out the details of
    how a system works

4
Using Knowledge-Based Constraints to Improve
Question-Answering Accuracy
  • Most Question-Answering systems use a combination
    of statistical and symbolic techniques: for
    example, almost all use a search component, which
    fetches documents and/or passages using
    statistical matching formulae, and
    answer-selection techniques, which are often more
    linguistically informed. The QA system at IBM
    Research, which has performed well in TREC-QA
    over the years, is no different in those
    respects, but we have at the same time been
    exploring various knowledge-based filtering
    techniques to constrain candidate answers. I will
    describe three such techniques. The first, which
    has been part of our core system since 1999, is
    what we call Predictive Annotation, a form of
    semantic indexing in which the answer type is a
    required term in the search engine query, greatly
    reducing the number of passages that need to be
    considered. QA-by-Dossier asks questions in
    addition to the one posed by the user, and
    enforces real-world constraints between the
    different questions and answers, on the
    assumption that only correct answers will provide
    a consistent model. Finally, Question Inversion
    is a specific form of QA-by-Dossier in which
    initial candidate answers are inserted into a
    reformulated question with a term removed, with
    the expectation that only the correct answer will
    allow the removed term to be recovered. I will
    present experimental results from using these
    techniques, and discuss their pros and cons.
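  • To make the first technique concrete: below is a
    minimal toy sketch (in Python) of the Predictive
    Annotation idea, assuming passages are indexed
    with semantic-class tokens alongside their words.
    The passage data, class tokens such as PERSON$,
    and all function names are invented for this
    illustration; this is not IBM's implementation.

    # Toy sketch of Predictive Annotation: passages are indexed with
    # semantic-class tokens (e.g. "PERSON$") in addition to their words,
    # and the expected answer type becomes a required term in the query.
    from collections import defaultdict

    # Hypothetical "annotated" corpus: each passage lists its words plus
    # the semantic classes of the entities a tagger found in it.
    PASSAGES = {
        1: {"words": {"edison", "invented", "the", "phonograph"},
            "classes": {"PERSON$"}},
        2: {"words": {"the", "phonograph", "was", "patented", "in", "1878"},
            "classes": {"DATE$"}},
    }

    # Inverted index over both words and semantic-class tokens.
    index = defaultdict(set)
    for pid, passage in PASSAGES.items():
        for term in passage["words"] | passage["classes"]:
            index[term].add(pid)

    def retrieve(query_words, answer_type):
        """Return passages matching all query words AND containing an
        entity of the expected answer type (the required term)."""
        candidates = set(PASSAGES)
        for term in query_words | {answer_type}:
            candidates &= index.get(term, set())
        return sorted(candidates)

    # "Who invented the phonograph?" -> expected answer type PERSON$.
    # Requiring PERSON$ filters out passage 2 even though it also
    # mentions the phonograph.
    print(retrieve({"invented", "phonograph"}, "PERSON$"))  # -> [1]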

5
Questions
  • Kristen - Hermjakob
  • The reformulation technique described in
    Hermjakob et al. requires a person in the loop
    to generalize phrase synonyms from
    automatically-extracted patterns. These are
    likely to be high-quality at the cost of being
    low-coverage (420 assertions).
  • Some types of data are naturally suited to
    knowledge bases: for example, dictionaries,
    synonyms, lists of countries, etc. WordNet is an
    example of a highly successful knowledge base
    that is widely used in NLP. Are phrasal
    reformulations suitable for storing in a
    knowledge base? I.e., if we could expand their
    system to contain a million assertions, would
    this hand-crafted database be sufficient/useful?
  • These reformulations require the authors to
    identify anchor patterns that can be
    reformulated; only questions that match these
    patterns can be expanded. What are good resources
    for finding anchor patterns, i.e., for finding
    questions that users are likely to ask (besides
    prior TREC evaluations)?
  • Can you think of ways that we could automatically
    extract patterns like this? Or ways in which the
    human in the loop could do much less work? (A toy
    sketch of pattern-based reformulation follows.)
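  • As a hypothetical illustration of anchor-pattern
    reformulation in the spirit of Hermjakob et al.:
    a hand-written pattern matches a question shape
    and emits assertion-style paraphrases to search
    for. The patterns and templates below are
    invented for this sketch; the paper's actual
    assertions differ.

    # Toy anchor-pattern reformulation: a question is expanded into
    # assertion-like search phrases only if it matches a known pattern.
    import re

    # (anchor pattern, reformulation templates) -- illustrative only.
    REFORMULATIONS = [
        (re.compile(r"^Who invented (?:the )?(.+)\?$", re.I),
         ["{0} was invented by", "the inventor of {0}"]),
        (re.compile(r"^When was (?:the )?(.+) patented\?$", re.I),
         ["{0} was patented in", "a patent for {0} was granted in"]),
    ]

    def reformulate(question):
        """Expand a question into assertion-like search phrases, but
        only if it matches one of the hand-written anchor patterns."""
        for pattern, templates in REFORMULATIONS:
            match = pattern.match(question)
            if match:
                return [t.format(match.group(1)) for t in templates]
        return []  # unmatched questions cannot be expanded

    print(reformulate("Who invented the phonograph?"))
    # -> ['phonograph was invented by', 'the inventor of phonograph']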

6
Kristen - Annotation
  • All of the papers rely heavily on data
    annotation, e.g., marking names and numbers,
    syntactic parsing, dependency parsing, etc.
    However, automatic annotation is never perfect.
    (For example, the best named entity tagger gets
    90-95 F-measure on English newswire, which is a
    very good score, but it means that 1 out of
    every 10-20 named entities is incorrectly
    tagged! See the arithmetic sketch after these
    questions.)
  • Can automatic annotation ever be perfect? What
    implications does this have for higher-order NLP
    processing, such as question answering?
  • What measures do these systems take to address
    these problems?
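  • A back-of-the-envelope check of the 1-in-10-to-20
    claim above, assuming precision and recall are
    roughly equal (so the error rate is about 1 minus
    the F-measure). Illustrative arithmetic only, not
    a property of any particular tagger.

    # If precision ~= recall ~= F, then roughly one entity in every
    # 1/(1 - F) is handled incorrectly.
    for f in (0.90, 0.95):
        error_rate = 1.0 - f
        print(f"F = {f:.2f}: ~1 error per {1 / error_rate:.0f} entities")
    # F = 0.90: ~1 error per 10 entities
    # F = 0.95: ~1 error per 20 entities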

7
Kristen and Madhav - Evaluation
  • TREC has spurred a lot of research into question
    answering over newswire, and created reusable
    data for system comparison.
  • What are some of the downsides to having a
    standardized bake-off with a shared
    corpus/language/question type? If you worked for
    a web search company, would you implement the
    systems we've read about?
  • Is the TREC evaluation methodology suitable?
    Should the answer nuggets be constructed
    independently of system outputs? What would you
    do to evaluate a QA system?

8
Madhav - Statistical vs. Symbolic
  • Moldovan et al. describe a heuristic-based
    answer extraction system, whereas Ittycheriah et
    al. use a statistically driven method. What are
    the drawbacks and advantages of each? Would these
    techniques, given the various heuristics (or
    features) used, adapt to different domains?
    Moldovan et al. claim that their system is
    open-domain.

9
Madhav - Complexity
  • Current QA systems seem to have to perform a
    great deal of processing in order to fulfill the
    task. When systems are as complex as this, and
    are made up of many sub-components, is it at all
    possible (say, as a researcher) to pinpoint the
    "weak link" within a system in order to improve
    its performance? Can one evaluate the
    sub-components independently? Is that necessary?
    (See the ablation sketch below.) For instance,
    IBM used a dependency parser - do you think that
    similar higher-level NLP techniques would be more
    helpful in solving the problem at hand? Or will
    lower-level bag-of-words (IR) techniques be more
    suitable?
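  • One hypothetical way to hunt for the "weak link"
    is component ablation: replace each pipeline
    stage in turn with an oracle that always
    succeeds, and re-measure end-to-end accuracy. The
    stage names and success rates below are invented
    for illustration; a real ablation would
    substitute gold annotations per stage.

    # Toy ablation: a question is answered correctly only if every
    # stage succeeds; each stage succeeds independently with some
    # probability. Oracling a stage sets its success rate to 1.0.
    import random

    random.seed(0)

    STAGES = {"retrieval": 0.85, "answer_typing": 0.90, "extraction": 0.70}

    def run(stages, n=10_000):
        """Estimate end-to-end accuracy over n simulated questions."""
        correct = sum(
            all(random.random() < p for p in stages.values())
            for _ in range(n)
        )
        return correct / n

    print(f"full system: {run(STAGES):.2f}")
    for name in STAGES:
        oracled = dict(STAGES, **{name: 1.0})  # one stage made perfect
        print(f"oracle {name}: {run(oracled):.2f}")
    # The biggest jump over the full system (here, oracling
    # 'extraction') flags the weak link.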