Title: Multi-Perspective Question Answering
1 Multi-Perspective Question Answering
- ARDA NRRC Summer 2002 Workshop
2 Participants
- Janyce Wiebe
- Eric Breck
- Chris Buckley
- Claire Cardie
- Paul Davis
- Bruce Fraser
- Diane Litman
- David Pierce
- Ellen Riloff
- Theresa Wilson
3 Problem
- Finding and organizing opinions in the world
press and other text
4 Our Work will Support
- Finding a range of opinions expressed on a
particular topic, event, issue
- Clustering opinions and their sources
- Attitude (positive, negative, uncertain)
- Basis for opinion (supporting beliefs,
experiences)
- Expressive style (sarcastic, vehement, neutral)
- Building perspective profiles of individuals and
groups over many documents and topics
5
- Describe the collective perspective w.r.t.
issue/object presented in an individual article,
across a set of articles
- Describe the perspective of a particular
writer/individual/government/news service w.r.t.
issue/object in an individual article, across a
set of articles
- Create a perspective profile for agents, groups,
news sources, etc.
6 Task: Annotation
- Manual annotation scheme for linguistic
expressions of opinions
- "It is heresy," said Cao. "The Shouters claim
they are bigger than Jesus."
(writer,Cao)
(writer,Cao,Shouters)
(writer,Cao)
(writer,Cao)
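The source tuples above can be read as chains of attribution: the writer reports what Cao said, and Cao in turn reports what the Shouters claim. The following is a minimal sketch of one way to hold such annotations. The `Annotation` class, the `annotate` helper, the character offsets, and the pairing of each phrase with a tuple are all assumptions made for illustration; the slides do not give this mapping.

```python
from dataclasses import dataclass
from typing import Tuple

TEXT = '"It is heresy," said Cao. "The Shouters claim they are bigger than Jesus."'


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: a text span plus its chain of nested sources."""
    start: int
    end: int
    source: Tuple[str, ...]  # outermost source first, e.g. ("writer", "Cao")

    def span(self, text: str) -> str:
        """Return the annotated substring of the given text."""
        return text[self.start:self.end]

    def nesting_depth(self) -> int:
        """Number of sources in the attribution chain."""
        return len(self.source)


def annotate(text: str, phrase: str, source: Tuple[str, ...]) -> Annotation:
    """Build an annotation over the first occurrence of phrase in text."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), source)


# Assumed pairing of phrases and sources, for illustration only.
annotations = [
    annotate(TEXT, "heresy", ("writer", "Cao")),
    annotate(TEXT, "said", ("writer", "Cao")),
    annotate(TEXT, "claim", ("writer", "Cao", "Shouters")),
]

for a in annotations:
    print(a.span(TEXT), a.source, a.nesting_depth())
```

Storing the source as an ordered tuple keeps the nesting explicit, so a later component can ask for the immediate source (the last element) or the full chain.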
7 Task: Conceptualization
- Various ways perspective is manifested in
language
- Implications for higher-level tasks
8 Task: Automate Manual Annotations
- Machine learning
- Identification of opinionated phrases, sources of
opinions, ...
9 Task: Organizing Perspective Segments
- Unsupervised clustering
- Text features, features from the annotation
scheme, higher-level features
10 Evaluation
- Exploratory manual clustering
- Evaluation of automatic annotations against
manual annotations
- End-user evaluation of how well the system groups
text segments into clusters of similar opinions
about a given topic
- Development of other end-user evaluation tasks
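Comparing automatic annotations against manual ones is commonly scored with precision, recall, and F1. The sketch below assumes exact matching on `(start, end, type)` triples; the slides do not state the matching criterion, and the function name and sample spans are hypothetical.

```python
def precision_recall_f1(manual, automatic):
    """Score automatic annotations against manual ones by exact match.

    Both arguments are iterables of (start, end, type) triples.
    Exact matching is an assumption; partial-overlap scoring is also possible.
    """
    manual_set, auto_set = set(manual), set(automatic)
    matched = len(manual_set & auto_set)
    precision = matched / len(auto_set) if auto_set else 0.0
    recall = matched / len(manual_set) if manual_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical spans for a single document.
manual = [(3, 9, "subjective"), (21, 25, "speech"), (40, 45, "subjective")]
automatic = [(3, 9, "subjective"), (40, 45, "subjective"), (50, 55, "subjective")]

p, r, f = precision_recall_f1(manual, automatic)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```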
11 Outline
- Annotation
- (Conceptualization)
- Architecture
- End-user evaluation
12 Annotation
- Find opinions, evaluations, emotions,
speculations (private states) expressed in
language
Private state: state that is not open to
objective observation or verification.
Quirk, Greenbaum, Leech, Svartvik (1985). A
Comprehensive Grammar of the English Language.
13 Annotation
- Explicit mentions of private states and speech
events
- The United States fears a spill-over from the
anti-terrorist campaign
- Expressive subjective elements
- The part of the US human rights report about
China is full of absurdities and fabrications.
14 Annotation
"The US fears a spill-over," said Xirao-Nima, a
professor of foreign affairs at the Central
University for Nationalities.
15 Annotation
- Whether opinions or other private states are
expressed in speech
- Type of private state (negative evaluation,
positive evaluation, ...)
- Object of positive or negative evaluation
- Strengths of expressive elements and private
states
16 Example
- "It is heresy," said Cao. "The Shouters claim
they are bigger than Jesus."
17 Example
The Foreign Ministry said Thursday that it was
"surprised, to put it mildly" by the U.S. State
Department's criticism of Russia's human rights
record and objected in particular to the "odious"
section on Chechnya.
18 Sample Gate Annotation
19 Accomplishments
- Fairly mature annotation scheme and instructions
- Representation supporting manual annotation using
GATE (Sheffield)
- Annotation corpus
- Significant training of 3 annotators
- Participants understand the annotation scheme
20 Architecture Overview
- Solution architecture includes
- Application Architecture
- supports high-level QA task
- Learning Architecture
- supports development of low- and mid-level system
components via machine learning
- Annotation Architecture
- supports manual annotation
21 Solution Architecture
[Diagram. Block labels, in order of appearance:
Annotation Architecture (Annotation Tool);
Learning Architecture (Learning Algorithms, Trained Taggers);
Application Architecture (Perspective Tagging, Document Retrieval,
Document Clustering, Question, Other Taggers)]
22 Application Architecture
[Diagram. Labels: Multi-perspective Classifiers, Document Clustering,
Documents, Annotation Database, Gate NE, CASS, Feature Generators]
23 Learning Architecture
[Diagram. Labels: Evaluation, Training Data, Weka Learner (two boxes),
Annotation Database, Gate NE, CASS, Feature Generators]
24 Learning Tasks
- Identify subjective phrases
- Identify nested sources
- Discriminate Facts and Views
- Classify Opinion Strength
25 Learning Features
- Name recognition
- Syntactic features
- Lists of words
- Contextual features
- Density
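Two of the feature types listed above, word lists and density, can be illustrated together. This is a rough sketch under stated assumptions: the slides do not define "density", so it is taken here to mean the fraction of tokens that appear in a subjective-word list, and `SUBJECTIVE_WORDS`, `tokenize`, and `density_features` are hypothetical names with made-up list contents.

```python
import re

# Hypothetical word list; the workshop's actual lists are not given in the slides.
SUBJECTIVE_WORDS = {"heresy", "absurdities", "fabrications", "odious", "fears"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (hyphens kept)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def density_features(text):
    """Count word-list hits and compute their density over all tokens."""
    tokens = tokenize(text)
    hits = sum(1 for t in tokens if t in SUBJECTIVE_WORDS)
    return {
        "n_tokens": len(tokens),
        "n_subjective": hits,
        # Assumed definition: share of tokens found in the word list.
        "density": hits / len(tokens) if tokens else 0.0,
    }


sentence = ("The part of the US human rights report about China "
            "is full of absurdities and fabrications.")
features = density_features(sentence)
print(features)
```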
26 Annotation Architecture
[Diagram. Labels: Topic Documents, Gate Annotation Tool,
Human Annotators, Gate XML, MPQA Database]
27 Data Formats
- Gate XML Format
- standoff
- structured
- MPQA Annotation Format
- standoff
- flat
- Machine Learning Formats (e.g., ARFF)
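To make "standoff, flat" and ARFF concrete: a standoff annotation lives outside the text and points into it by offsets, and a flat format puts one annotation per line. The sketch below parses a made-up flat standoff line and writes a small ARFF file body. The line layout (`start,end<TAB>type<TAB>attributes`), the attribute names, and the function names are assumptions for illustration, not the MPQA format itself; the ARFF section uses the standard `@relation`, `@attribute`, and `@data` declarations.

```python
TEXT = "The US fears a spill-over."

# Hypothetical flat standoff line: start,end <TAB> type <TAB> key=value;key=value
FLAT = "7,12\tprivate_state\tsource=writer,US;strength=medium\n"


def parse_flat(lines):
    """Parse flat standoff lines into dictionaries with integer offsets."""
    records = []
    for line in lines.splitlines():
        span, ann_type, attrs = line.split("\t")
        start, end = (int(x) for x in span.split(","))
        records.append({
            "start": start,
            "end": end,
            "type": ann_type,
            "attrs": dict(a.split("=", 1) for a in attrs.split(";")),
        })
    return records


def to_arff(relation, attributes, rows):
    """Serialize rows as ARFF text; attributes is a list of (name, type)."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} {kind}" for name, kind in attributes]
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


records = parse_flat(FLAT)
annotated = TEXT[records[0]["start"]:records[0]["end"]]

arff = to_arff(
    "subjectivity",
    [("density", "numeric"), ("label", "{fact,view}")],
    [(0.125, "view"), (0.0, "fact")],
)
print(annotated)
print(arff)
```

Because the annotations only reference offsets, the source text is never modified, which lets several annotation layers coexist over the same document.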
28 End-User Evaluation: Goal
- Establish framework for evaluating tasks that
would be of direct interest to analyst users
- Do an example evaluation
29 User Task: Topic
- U1: User states topic of interest and interacts
with IR system
- S1: System retrieves set of relevant documents,
and automatically performs perspective annotation
30 Example: Topic
- U1: 2002 election in Zimbabwe
- S1: System returns
- 03.47.06-11142 Mugabe confident of victory in
- 04.33.07-17094 Mugabe victory leaves West in
- 05.22.13-11526 Mugabe says he is wide awake
- 06.21.57-1967 Mugabe predicts victory
- 06.37.20-8125 Major deployment of troops
- 06.47.23-22498 Zambia hails results
31 User Task: Question
- U2: User states particular perspective question
on topic.
- Question should
- Identify source type (e.g., governments,
individuals, writers) of interest.
- Be a yes/no (or pro/con) question for now
32 Example: Question
- Give a range of perspectives by national
governments
- Was the election process fair and free of voter
intimidation?
33 User Task: Question Response
- S2: System clusters documents
- based on question, text, annotations
- goal: group together documents with same answer
and perspective (including expressive content).
- System, for now, does not attempt to label each
group with specific answers.
- Target a small number of clusters (4?)
34 Future User Task: Add Constraint
- U3: User states constraints on clustered
documents or segments.
- geographic, temporal, ideological, political,
religious
- S3: System shows sub-clusters or highlighted
documents
35 Example: User Adds Constraint
- U3: Highlight governments by regions
- S3: System shows docs with African governments'
opinions in red, North American in blue, European
in green, Asian in purple. Multi-color if docs
have more than one source.
36
- User gets impression of how regional opinions are
distributed among clusters
- Example
- Red docs (African) are mostly in one cluster,
- Blue and green (NA and EU) in another
- Purple docs are scattered in both clusters.
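The highlighting step described above amounts to mapping each document's source regions to colors and then counting colors per cluster. A minimal sketch follows, using the region-to-color mapping from the slide; the document IDs, cluster assignments, and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Mapping taken from the slide; region keys are spelled out for clarity.
REGION_COLORS = {
    "Africa": "red",
    "North America": "blue",
    "Europe": "green",
    "Asia": "purple",
}


def doc_colors(source_regions):
    """Return the sorted colors for a document; more than one means multi-color."""
    return sorted({REGION_COLORS[r] for r in source_regions})


def color_distribution(cluster_of, regions_of):
    """Count, per cluster, how many documents carry each color."""
    dist = defaultdict(Counter)
    for doc, cluster in cluster_of.items():
        for color in doc_colors(regions_of[doc]):
            dist[cluster][color] += 1
    return dist


# Hypothetical documents, regions of their opinion sources, and clusters.
regions_of = {
    "d1": ["Africa"],
    "d2": ["Africa"],
    "d3": ["North America", "Europe"],  # two sources, so multi-color
    "d4": ["Asia"],
    "d5": ["Asia"],
}
cluster_of = {"d1": 0, "d2": 0, "d3": 1, "d4": 0, "d5": 1}

dist = color_distribution(cluster_of, regions_of)
for cluster in sorted(dist):
    print(cluster, dict(dist[cluster]))
```

In this toy data the red documents fall in one cluster, blue and green in the other, and purple in both, which mirrors the pattern the slide describes.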
37 Document Collection
- Large collection of 270,000 foreign news
documents from June 2001 to May 2002
- Almost all FBIS documents with a small number of
other relevant docs.
- From MITRE MiTAP system
38 Topics (Goal)
- About 12 topic statements.
- Clause or sentence
- 25-50 known relevant docs per topic to be
manually annotated with perspective annotations
(perhaps some in less depth)
- 1-5 binary questions per topic
- For evaluation, documents manually annotated with
answers to questions (yes, no, or both)
39 Evaluation on Topic/Question
- Artificially construct a 75-document retrieved set
- Include the known (25-50) rel docs
- Add top retrieved docs from SMART
- System automatically annotates set
- System clusters based on annotation.
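The first three bullets above describe a simple set construction: start from the known relevant documents and pad with top-ranked retrieval results until the set holds 75 documents. The sketch below is one way to do that; `build_retrieved_set`, the document IDs, and the ranked list standing in for SMART output are all hypothetical.

```python
def build_retrieved_set(known_relevant, ranked_docs, size=75):
    """Combine known relevant docs with top-ranked docs, without duplicates.

    known_relevant: doc IDs judged relevant in advance.
    ranked_docs: doc IDs in retrieval order (a stand-in for SMART output).
    """
    selected = list(dict.fromkeys(known_relevant))  # dedupe, keep order
    seen = set(selected)
    for doc_id in ranked_docs:
        if len(selected) >= size:
            break
        if doc_id not in seen:
            selected.append(doc_id)
            seen.add(doc_id)
    return selected[:size]


# Hypothetical IDs: 30 known relevant docs, and a ranking that repeats two of them.
known = [f"rel{i}" for i in range(30)]
ranked = ["rel0", "rel5"] + [f"doc{i}" for i in range(100)]

retrieved = build_retrieved_set(known, ranked)
print(len(retrieved), retrieved[-1])
```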
40 Evaluation (cont.)
- Evaluate homogeneity of clusters with respect to
answers (yes, no, both). Compare with
- Base Case 1: Cluster docs into same number of
clusters without any annotations
- Base Case 2: Cluster docs into same number of
clusters based on manual annotations.
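One common way to quantify cluster homogeneity is purity: the fraction of documents that carry the majority answer of their cluster. The slides do not name a specific measure, so purity is an assumption here, as is treating "both" as its own answer label; the document IDs and clusterings are made up.

```python
from collections import Counter


def purity(clusters, answers):
    """Fraction of documents whose answer is the majority answer of their cluster.

    clusters: list of lists of doc IDs.
    answers: dict mapping doc ID to "yes", "no", or "both".
    """
    total = sum(len(c) for c in clusters)
    if total == 0:
        return 0.0
    majority = sum(
        Counter(answers[d] for d in c).most_common(1)[0][1]
        for c in clusters if c
    )
    return majority / total


# Hypothetical manual answers for six documents.
answers = {"d1": "yes", "d2": "yes", "d3": "no",
           "d4": "no", "d5": "both", "d6": "yes"}

system = [["d1", "d2", "d6"], ["d3", "d4", "d5"]]    # clustering using annotations
baseline = [["d1", "d3", "d5"], ["d2", "d4", "d6"]]  # same cluster count, no annotations

system_purity = purity(system, answers)
baseline_purity = purity(baseline, answers)
print(f"system={system_purity:.3f} baseline={baseline_purity:.3f}")
```

Holding the number of clusters fixed across the system and both base cases, as the slide specifies, matters because purity tends to rise as clusters get smaller.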
41 Current Status
- Document collection prepared, indexed
- 8 topics (more coming)
- 16 questions total
- 10-40 rel docs per topic (more coming)
42 Summary
- Annotation
- Conceptualization
- Architecture
- End-user evaluation
43 Example
The Annual Human Rights Report of the US State
Department has been strongly criticized and
condemned by many countries. Though the report
has been made public for 10 days, its contents,
which are inaccurate and lacking good will,
continue to be commented on by the world media.
Many countries in Asia, Europe, Africa, and
Latin America have rejected the content of the US
Human Rights Report, calling it a brazen
distortion of the situation, a wrongful and
illegitimate move, and an interference in the
internal affairs of other countries. Recently,
the Information Office of the Chinese People's
Congress released a report on human rights in the
United States in 2001, criticizing violations of
human rights there. The report quoting data from
the Christian Science Monitor, points out that
the murder rate in the United States is 5.5 per
100,000 people. In the United States, torture and
pressure to confess crime is common. Many people
have been sentenced to death for crime they did
not commit as a result of an unjust legal system.
More than 12 million children are living below
the poverty line. According to the report, one
American woman is beaten every 15 seconds.
Evidence show that human rights violations in the
United States have been ignored for many years.
45 Example
neg-attitude