1
The Multiple Language Question Answering Track
at CLEF 2003
Bernardo Magnini, Simone Romagnoli, Alessandro
Vallin Jesús Herrera, Anselmo Peñas, Víctor
Peinado, Felisa Verdejo Maarten de
Rijke ITC-irst, Centro per la Ricerca
Scientifica e Tecnologica, Trento - Italy
magnini,romagnoli,vallin_at_itc.it UNED,
Spanish Distance Learning University, Madrid
Spain jesus.herrera,anselmo,victor,felisa
_at_lsi.uned.es Language and Inference
Technology Group, ILLC, University of Amsterdam -
The Netherlands mdr_at_science.uva.nl
2
Overview of the Question Answering track at CLEF
2003
  • Report on the organization of QA tasks
  • Present and discuss the participants' results
  • Perspectives for future QA campaigns

3
  • QA: find the answer to an open-domain question in a large collection of documents
  • INPUT: questions (instead of keyword-based queries)
  • OUTPUT: answers (instead of documents)
  • QA track at TREC
  • Mostly fact-based questions
  • Question: Who invented the electric light?
  • Answer: Edison
  • Scientific Community
  • NLP and IR
  • AQUAINT program in the USA
  • QA as an applicative scenario

4
Purposes
  • Answers may be found in languages different from
    the language of the question
  • Interest in QA systems for languages other than
    English
  • Force the QA community to design truly multilingual systems
  • Check/improve the portability of the technologies
    implemented in current English QA systems
  • Creation of reusable resources and benchmarks for
    further multilingual QA evaluation

5
  • QA@CLEF WEB SITE ( http://clef-qa.itc.it )
  • CLEF QA MAILING LIST ( clef-qa@itc.it )
  • GUIDELINES FOR THE TRACK (following the model of TREC 2001)

6
(Task diagram: 200 questions are run against the target corpus; systems return either exact answers or 50-byte answer strings.)
7
(No Transcript)
8
9
(Task workflow diagram for the bilingual task: question extraction and translation between Italian and English questions, a QA system run over the English text collection, and human assessment of the English answers. Estimated effort: 1 person-month to extract 200 questions, 2 person-days to translate 200 questions, and 4 person-days to assess one run of 600 answers.)
10
Corpora licensed by CLEF in 2002
  • Dutch: Algemeen Dagblad and NRC Handelsblad (years 1994 and 1995)
  • Italian: La Stampa and SDA press agency (1994)
  • Spanish: EFE press agency (1994)
  • English: Los Angeles Times (1994)

(The Dutch, Italian and Spanish collections were used for the monolingual tasks; the English collection for the bilingual task.)
11
(Question-sharing diagram: ILLC, ITC-irst and UNED shared sets of 300 questions across language pairs (Italian–Spanish, Dutch–Spanish, Italian–Dutch), merged the data through English, and produced the DISEQuA corpus: 150 Dutch/English, 150 Italian/English and 150 Spanish/English questions.)
12
  • 200 fact-based questions for each task
  • queries related to events that occurred in 1994 and/or 1995, i.e. the years covered by the target corpora
  • coverage of different categories of questions: date, location, measure, person, object, organization, other
  • questions were not guaranteed to have an answer in the corpora: 10% of the test sets required the answer string NIL

13
  • 200 fact-based questions for each task
  • queries related to events that occurred in 1994 and/or 1995, i.e. the years covered by the target corpora
  • coverage of different categories of questions (date, location, measure, person, object, organization, other)
  • questions were not guaranteed to have an answer in the corpora: 10% of the test sets required the answer string NIL
  • excluded question types:
  • - definition questions (Who/What is X?)
  • - Yes/No questions
  • - list questions

14
  • Participants were allowed to submit up to three answers per question and up to two runs
  • answers had to be either exact (i.e. contain just the minimal information) or 50-byte-long strings
  • answers had to be supported by a document
  • answers had to be ranked by confidence
  • answers were judged by human assessors, according to four categories:
  • CORRECT (R)
  • UNSUPPORTED (U)
  • INEXACT (X)
  • INCORRECT (W)
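
As a rough illustration of the submission and judging scheme above, here is a minimal sketch of a judged-answer record in Python; the record layout and field names are assumptions made for illustration, not the actual CLEF submission format.

from dataclasses import dataclass

# The four assessment categories used by the human assessors.
JUDGEMENTS = {"R": "correct", "U": "unsupported", "X": "inexact", "W": "incorrect"}

@dataclass
class JudgedAnswer:
    question_id: int   # question this answer responds to
    rank: int          # 1-3, ordered by the system's confidence
    answer: str        # exact answer or 50-byte string ("NIL" if no answer is claimed)
    doc_id: str        # document cited as supporting the answer
    judgement: str     # one of "R", "U", "X", "W"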

15
(No Transcript)
16
The score for each question was the reciprocal of the rank of the first answer found correct; if no correct answer was returned, the score was 0. The total score, or Mean Reciprocal Rank (MRR), was the mean score over all questions.
In STRICT evaluation only correct (R) answers scored points. In LENIENT evaluation unsupported (U) answers were considered correct as well.
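
As a minimal sketch of the scoring rule just described (an illustration of the computation, not the official CLEF scoring script; the input layout is assumed), the MRR under strict and lenient evaluation could be computed as follows.

from typing import Dict, List, Tuple

def mean_reciprocal_rank(judgements: Dict[int, List[Tuple[int, str]]],
                         n_questions: int = 200,
                         lenient: bool = False) -> float:
    """judgements maps a question id to the (rank, judgement) pairs of its answers."""
    accepted = {"R", "U"} if lenient else {"R"}   # lenient also counts unsupported answers
    total = 0.0
    for pairs in judgements.values():
        correct_ranks = [rank for rank, judgement in pairs if judgement in accepted]
        if correct_ranks:
            total += 1.0 / min(correct_ranks)     # reciprocal rank of the first correct answer
    return total / n_questions                    # questions with no correct answer score 0

# Toy example: question 1 is answered correctly at rank 2, question 2 has only an unsupported answer.
run = {1: [(1, "W"), (2, "R")], 2: [(1, "U")]}
print(mean_reciprocal_rank(run, n_questions=2))                 # strict: 0.25
print(mean_reciprocal_rank(run, n_questions=2, lenient=True))   # lenient: 0.75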
17
(No Transcript)
18
Comparison between the number and place of origin of the participants in the past TREC and in this year's CLEF QA tracks
19
Performances at TREC-QA
  • Evaluation metric: Mean Reciprocal Rank (MRR), i.e. the mean over the 500 TREC questions of 1 / (rank of the first correct answer)
  • Best result
  • Average over 67 runs
20
MONOLINGUAL TASKS
21
MONOLINGUAL TASKS
22
CROSS-LANGUAGE TASKS
23
CROSS-LANGUAGE TASKS
24
MONOLINGUAL TASKS
25
CROSS-LANGUAGE TASKS
26
(No Transcript)
27
Two main approaches used in Cross-Language QA systems:
1. Translation of the question into the target language (i.e. the language of the document collection), followed by question processing and answer extraction.
2. Question processing in the source language to retrieve information (such as keywords, question focus, expected answer type, etc.), followed by translation and expansion of the retrieved data, and answer extraction.
28
Two main approaches used in Cross-Language QA systems:
1. Translation of the question into the target language (i.e. the language of the document collection), followed by question processing and answer extraction: CS-CMU, ISI, Limerick, DFKI.
2. Preliminary question processing in the source language to retrieve information (such as keywords, question focus, expected answer type, etc.), followed by translation and expansion of the retrieved data, and answer extraction: ITC-irst, RALI.
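
The two architectures could be sketched as follows; every helper here (translate, analyse_question, extract_answer) is a hypothetical placeholder standing in for machine translation, question analysis and answer extraction components, not code from any of the systems named above.

def translate(text: str, src: str, tgt: str) -> str:
    raise NotImplementedError  # placeholder for a machine-translation component

def analyse_question(question: str, lang: str) -> dict:
    raise NotImplementedError  # placeholder: keywords, question focus, expected answer type, ...

def extract_answer(analysis: dict, collection: str) -> str:
    raise NotImplementedError  # placeholder for retrieval and answer extraction

# Approach 1: translate the question, then work monolingually in the target language.
def approach_1(question: str, src: str, tgt: str, collection: str) -> str:
    translated = translate(question, src, tgt)
    analysis = analyse_question(translated, tgt)
    return extract_answer(analysis, collection)

# Approach 2: analyse the question in the source language, then translate
# and expand the retrieved data (shown here for the keywords only).
def approach_2(question: str, src: str, tgt: str, collection: str) -> str:
    analysis = analyse_question(question, src)
    analysis["keywords"] = [translate(k, src, tgt) for k in analysis.get("keywords", [])]
    return extract_answer(analysis, collection)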
29
  • A pilot evaluation campaign for multiple language Question Answering systems has been carried out.
  • Five European languages were considered: three monolingual tasks and five bilingual tasks against an English collection have been activated.
  • Considering the differences between the tasks, results are comparable with those of QA at TREC.
  • A corpus of 450 questions, each in four languages and with at least one known answer in the respective text collection, has been built.
  • This year's experience was very positive; we intend to continue with QA at CLEF 2004.

30
  • Organization issues
  • Promote larger participation
  • Collaboration with NIST
  • Financial issues
  • Find a sponsor: ELRA, the new CELCT center, ...
  • Tasks (to be discussed)
  • Update to TREC-2003: definition questions, list questions
  • Consider just exact answers: 50-byte answers did not find much favor
  • Introduce new languages in the cross-language task; this is easy to do
  • New steps toward multilinguality: English questions against other-language collections; a small set of full cross-language tasks (e.g. Italian/Spanish)

31
  • Find 200 questions for each language (Dutch,
    Italian, Spanish), based on CLEF-2002 topics,
    with at least one answer in the respective
    corpus.
  • Translate each question into English, and from
    English into the other two languages.
  • Find answers in the corpora of the other
    languages (e.g. a Dutch question was translated
    and processed in the Italian text collection).
  • The result is a corpus of 450 questions, each in
    four languages, with at least one known answer in
    the respective text collection. More details in
    the paper and in the Poster.
  • Questions with at least one answer in all the
    corpora were selected for the final question set.
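
The final selection step could look roughly like this; the record layout (a per-question mapping from language to the answers found in that corpus) is an assumption made for illustration, not the actual DISEQuA format.

LANGUAGES = ("dutch", "italian", "spanish", "english")

def select_final_questions(questions):
    """Keep only the questions with at least one known answer in every corpus."""
    return [q for q in questions
            if all(q["answers"].get(lang) for lang in LANGUAGES)]

# Toy example: only the first question has answers in all four collections.
sample = [
    {"id": 1, "answers": {"dutch": ["a"], "italian": ["a"], "spanish": ["a"], "english": ["a"]}},
    {"id": 2, "answers": {"dutch": [], "italian": ["a"], "spanish": ["a"], "english": ["a"]}},
]
print([q["id"] for q in select_final_questions(sample)])  # -> [1]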