1
Question Answering and Logistics
2
Class Logistics
  • Comments on proposals will be returned next week
    and may be available as early as Monday
  • Look at the website for slides
  • Grades on presentations and discussant questions

3
Next week (2/21)
  • Invited speaker: John Prager, IBM
  • Location: 7th-floor Interschool Lab
  • Class structure:
  • First half: talk
  • Second half: discussion
  • Discussants: David Smith, Narayan
  • Others raise questions
  • This is your chance to find out the details of
    how a system works

4
Using Knowledge-Based Constraints to Improve
Question-Answering Accuracy
  • Most Question-Answering systems use a combination
    of statistical and symbolic techniques: for
    example, almost all use a search component, which
    fetches documents and/or passages using
    statistical matching formulae, and
    answer-selection techniques, which are often more
    linguistically informed. The QA system at IBM
    Research, which has performed well in TREC-QA
    over the years, is no different in those
    respects, but we have at the same time been
    exploring various knowledge-based filtering
    techniques to constrain candidate answers. I will
    describe three such techniques. The first, which
    has been part of our core system since 1999, is
    what we call Predictive Annotation, a form of
    semantic indexing in which the answer type is a
    required term in the search engine query, greatly
    reducing the number of passages that need to be
    considered. QA-by-Dossier asks questions in
    addition to the one posed by the user, and
    enforces real-world constraints between the
    different questions and answers, on the
    assumption that only correct answers will provide
    a consistent model. Finally, Question Inversion
    is a specific form of QA-by-Dossier in which
    initial candidate answers are inserted into a
    reformulated question with a term removed, with
    the expectation that only the correct answer will
    allow the removed term to be recovered. I will
    present experimental results from using these
    techniques, and discuss their pros and cons.
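  • To make the first technique concrete: below is a
    minimal toy sketch (in Python) of the Predictive
    Annotation idea, assuming passages are indexed
    with semantic-class tokens alongside their words.
    The passage data, class tokens such as PERSON$,
    and all function names are invented for this
    illustration; this is not IBM's implementation.

    # Toy sketch of Predictive Annotation: passages are indexed with
    # semantic-class tokens (e.g. "PERSON$") in addition to their words,
    # and the expected answer type becomes a required term in the query.
    from collections import defaultdict

    # Hypothetical "annotated" corpus: each passage lists its words plus
    # the semantic classes of the entities a tagger found in it.
    PASSAGES = {
        1: {"words": {"edison", "invented", "the", "phonograph"},
            "classes": {"PERSON$"}},
        2: {"words": {"the", "phonograph", "was", "patented", "in", "1878"},
            "classes": {"DATE$"}},
    }

    # Inverted index over both words and semantic-class tokens.
    index = defaultdict(set)
    for pid, passage in PASSAGES.items():
        for term in passage["words"] | passage["classes"]:
            index[term].add(pid)

    def retrieve(query_words, answer_type):
        """Return passages matching all query words AND containing an
        entity of the expected answer type (the required term)."""
        candidates = set(PASSAGES)
        for term in query_words | {answer_type}:
            candidates &= index.get(term, set())
        return sorted(candidates)

    # "Who invented the phonograph?" -> expected answer type PERSON$.
    # Requiring PERSON$ filters out passage 2 even though it also
    # mentions the phonograph.
    print(retrieve({"invented", "phonograph"}, "PERSON$"))  # -> [1]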

5
Questions
  • Kristen - Hermjakob
  • The reformulation technique described in
    Hermjakob et al. requires a person in the loop
    to generalize phrase synonyms from
    automatically-extracted patterns. These are
    likely to be high-quality at the cost of being
    low-coverage (420 assertions).
  • Some types of data are naturally suited to
    knowledge bases: for example, dictionaries,
    synonyms, lists of countries, etc. WordNet is an
    example of a highly successful knowledge base
    that is widely used in NLP. Are phrasal
    reformulations suitable for storing in a
    knowledge base? I.e., if we could expand their
    system to contain a million assertions, would
    this hand-crafted database be sufficient/useful?
  • These reformulations require the authors to
    identify anchor patterns that can be
    reformulated; only questions that match these
    patterns can be expanded. What are good resources
    for finding anchor patterns, i.e., for finding
    questions that users are likely to ask (besides
    prior TREC evaluations)?
  • Can you think of ways that we could automatically
    extract patterns like this? Or ways in which the
    human in the loop could do much less work? (A toy
    sketch of pattern-based reformulation follows.)
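  • As a hypothetical illustration of anchor-pattern
    reformulation in the spirit of Hermjakob et al.:
    a hand-written pattern matches a question shape
    and emits assertion-style paraphrases to search
    for. The patterns and templates below are
    invented for this sketch; the paper's actual
    assertions differ.

    # Toy anchor-pattern reformulation: a question is expanded into
    # assertion-like search phrases only if it matches a known pattern.
    import re

    # (anchor pattern, reformulation templates) -- illustrative only.
    REFORMULATIONS = [
        (re.compile(r"^Who invented (?:the )?(.+)\?$", re.I),
         ["{0} was invented by", "the inventor of {0}"]),
        (re.compile(r"^When was (?:the )?(.+) patented\?$", re.I),
         ["{0} was patented in", "a patent for {0} was granted in"]),
    ]

    def reformulate(question):
        """Expand a question into assertion-like search phrases, but
        only if it matches one of the hand-written anchor patterns."""
        for pattern, templates in REFORMULATIONS:
            match = pattern.match(question)
            if match:
                return [t.format(match.group(1)) for t in templates]
        return []  # unmatched questions cannot be expanded

    print(reformulate("Who invented the phonograph?"))
    # -> ['phonograph was invented by', 'the inventor of phonograph']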

6
Kristen - Annotation
  • All of the papers rely heavily on data
    annotation, e.g., marking names and numbers,
    syntactic parsing, dependency parsing, etc.
    However, automatic annotation is never perfect.
    (For example, the best named entity tagger gets
    90-95 F-measure on English newswire, which is a
    very good score, but it means that 1 out of
    every 10-20 named entities is incorrectly
    tagged! See the arithmetic sketch after these
    questions.)
  • Can automatic annotation ever be perfect? What
    implications does this have for higher-order NLP
    processing, such as question answering?
  • What measures do these systems take to address
    these problems?
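  • A back-of-the-envelope check of the 1-in-10-to-20
    claim above, assuming precision and recall are
    roughly equal (so the error rate is about 1 minus
    the F-measure). Illustrative arithmetic only, not
    a property of any particular tagger.

    # If precision ~= recall ~= F, then roughly one entity in every
    # 1/(1 - F) is handled incorrectly.
    for f in (0.90, 0.95):
        error_rate = 1.0 - f
        print(f"F = {f:.2f}: ~1 error per {1 / error_rate:.0f} entities")
    # F = 0.90: ~1 error per 10 entities
    # F = 0.95: ~1 error per 20 entities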

7
Kristen and Madhav - Evaluation
  • TREC has spurred a lot of research into question
    answering over newswire, and created reusable
    data for system comparison.
  • What are some of the downsides to having a
    standardized bake-off with a shared
    corpus/language/question type? If you worked for
    a web search company, would you implement the
    systems we've read about?
  • Is the TREC evaluation methodology suitable?
    Should the answer nuggets be constructed
    independently of system outputs? What would you
    do to evaluate a QA system?

8
Madhav - Statistical vs. Symbolic
  • Moldovan et al. describe a heuristic-based
    answer extraction system, whereas Ittycheriah et
    al. use a statistically driven method. What are
    the drawbacks and advantages of each? Would these
    techniques, given the various heuristics (or
    features) used, adapt to different domains?
    Moldovan et al. claim that their system is
    open-domain.

9
Madhav - Complexity
  • Current QA systems seem to have to perform a
    great deal of processing in order to fulfill the
    task. When systems are as complex as this, and
    are made up of many sub-components, is it at all
    possible (say, as a researcher) to pinpoint the
    "weak link" within a system in order to improve
    its performance? Can one evaluate the
    sub-components independently? Is that necessary?
    (See the ablation sketch below.) For instance,
    IBM used a dependency parser - do you think that
    similar higher-level NLP techniques would be more
    helpful in solving the problem at hand? Or will
    lower-level bag-of-words (IR) techniques be more
    suitable?
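  • One hypothetical way to hunt for the "weak link"
    is component ablation: replace each pipeline
    stage in turn with an oracle that always
    succeeds, and re-measure end-to-end accuracy. The
    stage names and success rates below are invented
    for illustration; a real ablation would
    substitute gold annotations per stage.

    # Toy ablation: a question is answered correctly only if every
    # stage succeeds; each stage succeeds independently with some
    # probability. Oracling a stage sets its success rate to 1.0.
    import random

    random.seed(0)

    STAGES = {"retrieval": 0.85, "answer_typing": 0.90, "extraction": 0.70}

    def run(stages, n=10_000):
        """Estimate end-to-end accuracy over n simulated questions."""
        correct = sum(
            all(random.random() < p for p in stages.values())
            for _ in range(n)
        )
        return correct / n

    print(f"full system: {run(STAGES):.2f}")
    for name in STAGES:
        oracled = dict(STAGES, **{name: 1.0})  # one stage made perfect
        print(f"oracle {name}: {run(oracled):.2f}")
    # The biggest jump over the full system (here, oracling
    # 'extraction') flags the weak link.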