Title: Action Modeling
TREC-2006 at Maryland: Blog, Enterprise and QA Tracks
Douglas Oard, Tamer Elsayed, Yejun Wu, Pengyi
Zhang, Eileen Abels, Jimmy Lin, and Dagobert
Soergel
College of Information Studies / Computer Science
Department / UMIACS, University of Maryland,
College Park, USA
- Participation Goals
- Building an expert search baseline system
- Applying models of identity to public mailing lists
- Building a reference-resolution infrastructure
Participation Goals
- To explore the effectiveness of single-iteration written clarification dialogs
- To explore different strategies for clarifying user needs in question answering
- To better understand the nature of complex, template-based questions
Candidate List
Methods
Three types of interaction
1
Interaction Questions (Topic 026)
1. What types of smuggled disks are you interested in? Check all that apply.
[ ] VCDs  [ ] CDs  [ ] DVDs  [ ] Other. Please specify
Example Question (Topic 26)
Question: What evidence is there for transport of smuggled VCDs from Hong Kong to China?
Narrative: The analyst is particularly interested in knowing the volume of smuggled VCDs and also the ruses used by smugglers to hide their efforts.
Email Addresses
Full Names
Nicknames
Enriched Candidate Models
Models of Identity
Candidate Scoring
Reference Recognition
2
Ranked List
Importance of Answer Types (Topic 042)
Please rate the importance of the following types of evidence.
1. General claim of effects of aspirin.
[ ] Important  [ ] Somewhat important  [ ] Not needed at all
2. Guideline of how aspirin can be used to treat heart diseases.
[ ] Important  [ ] Somewhat important  [ ] Not needed at all
Topic
Retrieval Engine
- External resources
- CIA World Fact Book
- Google
- WordNet
- Roget's Thesaurus
- Wikipedia
Reference Credit W_f for Email Fields
Field            W_f     Field             W_f
Sender           2.0     Receiver          1.0
Subject          1.0     New text          tf
Quoted sender    1.0     Quoted receiver   0.5
Quoted text      tf
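One way to read the field weights listed above: every email gives a candidate a credit that depends on where the candidate is referenced, with header fields contributing a fixed weight and text fields contributing the number of references found (tf). The following is a minimal sketch under those assumptions. The function names, the input format, the rule that a header field counts once per email, and the choice to sum credits over emails are all hypothetical illustrations, not the system's actual scoring code.

```python
# Fixed weights W_f for header-like fields, as listed for the email fields.
FIELD_WEIGHTS = {
    "sender": 2.0,
    "receiver": 1.0,
    "subject": 1.0,
    "quoted_sender": 1.0,
    "quoted_receiver": 0.5,
}
# Text fields whose credit is the reference count itself (tf).
TF_FIELDS = ("new_text", "quoted_text")


def reference_credit(references):
    """Credit that one email gives to one candidate.

    `references` maps a field name to the number of references to the
    candidate found in that field (hypothetical input format).
    """
    credit = 0.0
    for field, count in references.items():
        if count <= 0:
            continue
        if field in FIELD_WEIGHTS:
            # Assumption: a header field contributes its weight once.
            credit += FIELD_WEIGHTS[field]
        elif field in TF_FIELDS:
            # Text fields contribute the raw reference count (tf).
            credit += float(count)
        else:
            raise KeyError(f"unknown email field: {field}")
    return credit


def score_candidate(emails):
    """Assumption: a candidate's score is the sum of per-email credits."""
    return sum(reference_credit(refs) for refs in emails)
```

For example, a candidate who sent an email and is mentioned three times in its new text would receive 2.0 + 3 = 5.0 credit from that email under this reading.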
Analysis of Interaction Responses
Questions
Duplicate Removal
W3C Mailing Lists
Queries
Email and Thread Index
Interaction Forms Generation
Document Retrieval
Results
3
                                          Retrieval            Support
Query                  Approach           MAP      P_at_10     MAP      P_at_10
Title                  Email              0.195    0.406       0.072    0.182
Title + Narrative      Email              0.350    0.504       0.141    0.298
Title                  Thread             0.218    0.449       0.090    0.198
Title + Narrative      Thread             0.343    0.514       0.139    0.294
Title + Description    Thread             0.315    0.502       0.119    0.278
Avg. of Medians                           0.341    0.508       0.154    0.294
Removing non email-supported              0.365    0.525       0.147    0.311
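For readers unfamiliar with the two measures reported above, the sketch below shows the standard definitions of precision at rank 10 and mean average precision over a set of topics. It is an illustration only: the function names and the toy data in the usage note are made up, and it assumes each ranked list contains no duplicate items.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    # Dividing by k (not by the list length) penalizes short lists.
    return hits / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each relevant item's rank,
    normalized by the total number of relevant items."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(topics):
    """`topics` is a list of (ranked_list, relevant_set) pairs."""
    if not topics:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in topics) / len(topics)
```

With a toy ranking `["d1", "d2", "d3", "d4"]` and relevant set `{"d1", "d3"}`, the average precision is (1/1 + 2/3) / 2, and precision at rank 2 is 0.5.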
Relevance Feedback (Topic 055)
Please indicate the relevance of the following answers.
1. Most of Sierra Leone's diamonds were and still are smuggled into neighboring Liberia for sale, according to several human rights groups and diamond industry experts.
[ ] Relevant  [ ] Somewhat relevant  [ ] Not relevant
Top 20 relevant documents
Ordered Answers
Answer Generation
Refined Answers
Answer Ranking
Unordered Answers
Retrieval Results
Supported Retrieval Results
Results and Analysis
Analysis 1: Interaction Performance by Type of Interaction
Run          F-Score
UMDM1pre     0.316
UMDM1post    0.350 (10.6)
UMDA1pre     0.224
UMDA1post    0.180 (-19.4)
How Much Email Support Over Topics?
Comparison at Topic Relevance
Type Topics Avg. Improvement
1 10 -0.0124 (-4.0)
2 12 0.0300 (8.2)
3 8 0.106 (44.0)
Runs            MAP       Bpref     P_at_10   R-Prec
ParTitDesDef    0.2849    0.3998    0.6200    0.3490
ParTiDesDmt2    0.2845    0.4040    0.6200    0.3501
ParTiDesDmt3    0.2812    0.4034    0.6200    0.3542
ParTiDef        0.2362    0.3580    0.5280    0.3162
PasTiDesDef     0.2733    0.3866    0.5800    0.3516
Sample relevance feedback
Clarification questions
Importance of answer types
Performance Relative to Email Support
Analysis 2: Consistency in Judgment
Analysis 3: Relevant Sentences vs. Answer Nuggets
Judgment (Pre / Post)     Manual         Manual AutoFiller    Automatic
Consistent judgments      427 (87.9%)    995 (90.3%)          452 (90.0%)
  Y / Y                   194            224                  78
  N / N                   233            771                  374
Inconsistent judgments    59 (12.1%)     107 (9.7%)           50 (10.0%)
  Y / N                   37             48                   20
  N / Y                   22             59                   30
Difference                -15            11                   10
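The totals, percentages, and differences in the consistency table above follow from the four pre/post judgment counts in each column. The sketch below recomputes them as a sanity check. The counts are taken from the table; the dictionary keys and the function name are illustrative, and the three columns are simply listed in table order.

```python
# Pre/post judgment counts per column, in the order the table lists them.
# Key "YN" means judged Y before the interaction and N after it.
COLUMNS = [
    {"YY": 194, "NN": 233, "YN": 37, "NY": 22},
    {"YY": 224, "NN": 771, "YN": 48, "NY": 59},
    {"YY": 78, "NN": 374, "YN": 20, "NY": 30},
]


def summarize(counts):
    """Recompute the derived rows of the consistency table."""
    consistent = counts["YY"] + counts["NN"]
    inconsistent = counts["YN"] + counts["NY"]
    total = consistent + inconsistent
    return {
        "consistent": consistent,
        "consistent_pct": round(100.0 * consistent / total, 1),
        "inconsistent": inconsistent,
        "inconsistent_pct": round(100.0 * inconsistent / total, 1),
        # The "Difference" row matches (N then Y) minus (Y then N).
        "difference": counts["NY"] - counts["YN"],
    }


for column in COLUMNS:
    print(summarize(column))
```

Read this way, the "Difference" row is the net number of judgments that flipped from N to Y.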
              Relevant Sentences   Partially Relevant Sentences   Not Relevant Sentences
Nugget        74                   8                              16
Not Nugget    258                  69                             270
All           332                  77                             286
Percentage    22                   10                             6
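The "Percentage" row above appears to be the share of sentences in each relevance category that were also judged to be answer nuggets. A short check under that assumption, using the counts from the table (the variable and function names are illustrative):

```python
# (nugget, not nugget) counts for each sentence relevance category.
SENTENCES = {
    "relevant": (74, 258),
    "partially_relevant": (8, 69),
    "not_relevant": (16, 270),
}


def nugget_share(nugget, not_nugget):
    """Percentage of sentences in a category that are answer nuggets."""
    total = nugget + not_nugget
    return round(100.0 * nugget / total)


shares = {name: nugget_share(*counts) for name, counts in SENTENCES.items()}
print(shares)
```

Under this reading, only about one in five sentences judged relevant corresponds to an answer nugget.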
- Conclusions
- Paragraphs better for both topic and opinion retrieval.
- Title+Description queries beat title only.
- Demoting non-opinionated documents had little effect.
- Future Work
- Parameter tuning for:
- Low frequency words.
- Paragraph detection, passage size.
- Aggregation of opinion scores.
- Threshold of opinion scores.
Conclusion (QA)
- Relevance feedback does not always work for QA
- The error margin of nugget judgments is about 10%
- Relevant sentence ≠ answer nugget
Future Work (QA)
- Examination of possible systematic errors in nugget judgments
- Exploration of the relationship between relevant sentences and answer nuggets
Conclusion (Enterprise)
- Average performance
- Threads help in short queries
- More email support → more accurate
Future Work (Enterprise)
- Improved reference resolution
- Parameter tuning for weighted-field credit
- Learning from reply features