Title: CLEF 2005: Multilingual Retrieval by Combining Multiple Multilingual Ranked Lists
Slide 1: CLEF 2005: Multilingual Retrieval by Combining Multiple Multilingual Ranked Lists
- Luo Si and Jamie Callan
- Language Technologies Institute, School of Computer Science, Carnegie Mellon University
- CLEF 2005
Slide 2: Task Definition
- Multi-8 Two Years On: multilingual information retrieval
- Multi-8 Merging Only: participants merge provided bilingual ranked lists
Slide 3: Task 1: Multilingual Retrieval System
Slide 4: Task 1: Multilingual Retrieval System
- Text preprocessing
  - Stop words
  - Stemming
  - Decompounding
  - Word translation
Slide 5: Task 1: Multilingual Retrieval System
Slide 6: Task 1: Multilingual Retrieval System
- Method 1: multilingual retrieval via query translation, no query feedback; raw-score merge and Okapi system
- Method 2: multilingual retrieval via query translation, with query feedback; raw-score merge and Okapi system
- Method 3: multilingual retrieval via document translation, no query feedback; raw-score merge and Okapi system
- Method 4: multilingual retrieval via document translation, with query feedback; raw-score merge and Okapi system
- Method 5: UniNE system
Slide 7: Task 1: Multilingual Retrieval System
Slide 8: Task 1: Multilingual Retrieval System
- Normalization
  - drs_{k,mj}: the raw document score of the j-th document retrieved from the m-th ranked list for the k-th query; each list's raw scores are normalized before combination
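The slide defines only the notation; the normalization formula itself was shown graphically. A minimal Python sketch, assuming simple min-max normalization within each ranked list (the exact formula used in the system is not reproduced here):

```python
def minmax_normalize(raw_scores):
    """Normalize one ranked list's raw document scores drs_{k,mj} to [0, 1].

    raw_scores: raw scores of the m-th ranked list for the k-th query,
    in rank order.  Min-max normalization is an assumption; the slide
    does not spell out the formula.
    """
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:                      # degenerate list: all scores identical
        return [0.0 for _ in raw_scores]
    return [(s - lo) / (hi - lo) for s in raw_scores]
```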
Slide 9: Task 1: Multilingual Retrieval System
Slide 10: Task 1: Multilingual Retrieval System
- Combine multilingual ranked lists
  - (w_m, r_m): the weight of the vote and the exponential normalization factor for the m-th ranked list
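The slide names the parameters (w_m, r_m) but not the combination formula itself. The sketch below assumes a weighted sum in which each list's normalized score is passed through an exponential scaled by r_m; the function name and the exact functional form are assumptions, not the authors' published formula.

```python
import math

def combine_ranked_lists(lists, weights, exponents):
    """Combine several multilingual ranked lists into one final list.

    lists     : one dict {doc_id: normalized_score} per ranked list
    weights   : w_m, weight of each list's vote
    exponents : r_m, exponential normalization factor of each list

    Assumed combination: each document's final score is the weighted sum,
    over the lists that retrieved it, of exp(r_m * normalized score).
    """
    combined = {}
    for scores, w_m, r_m in zip(lists, weights, exponents):
        for doc_id, s in scores.items():
            combined[doc_id] = combined.get(doc_id, 0.0) + w_m * math.exp(r_m * s)
    # final multilingual list: documents sorted by combined score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```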
Slide 11: Task 1: Experimental Results, Multilingual Retrieval
- Qry/Doc: whether queries or documents were translated
- fb/nofb: with/without pseudo-relevance feedback
- UniNE: UniNE system
Slide 12: Task 1: Experimental Results, Multilingual Retrieval
- MX: combined models
- W1/Trn: equal or learned (trained) combination weights
Slide 13: Task 2: Results Merging for Multilingual Retrieval
- Merge ranked lists of eight different languages (i.e., bilingual or monolingual) into a single final list
- Logistic model of (rank, doc score)?
- Language-specific methods?
- Query-specific and language-specific methods?
Slide 14: Task 2: Results Merging for Multilingual Retrieval
- Learn a query-independent, language-specific merging model
  - Estimates the probability of relevance of document d_{k,ij}
  - Model parameters estimated by maximizing the log-likelihood (MLE)
  - or by maximizing MAP
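As a rough illustration of the query-independent, language-specific model, the sketch below fits a logistic regression of relevance on (rank, doc score) by (regularized) maximum likelihood. The feature choice, the use of scikit-learn, and the helper name are assumptions; the MAP-maximizing variant mentioned above is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_language_merging_model(ranks, scores, relevance):
    """Query-independent, language-specific merging model (illustrative).

    Fits P(rel | rank, score) with a logistic model trained on relevance
    judgments from training queries.  sklearn's default fit is maximum
    likelihood with an L2 penalty, standing in for plain MLE here.
    """
    X = np.column_stack([ranks, scores])   # features: rank and normalized score
    y = np.asarray(relevance)              # 0/1 relevance judgments
    model = LogisticRegression()
    model.fit(X, y)
    return model

# At merge time, documents from all eight languages are ranked together by
# the estimated probabilities of relevance:
#   probs = model.predict_proba(X_new)[:, 1]
```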
Slide 15: Task 2: Results Merging for Multilingual Retrieval
- Learn a query-specific, language-specific merging model
- Calculate comparable scores for the top-ranked documents in each language
  - (1) Combine the scores of the query-translation-based and document-translation-based methods
  - (2) Build language-specific, query-specific logistic models to transform language-specific scores into comparable scores
Slide 16: Task 2: Results Merging for Multilingual Retrieval
- (2) Build language-specific, query-specific logistic models to transform language-specific scores into comparable scores
  - Logistic model parameters are estimated by minimizing the mean squared error between the exact normalized comparable scores and the estimated comparable scores
- Estimate comparable scores for all retrieved documents in each language
- Use the comparable scores to create a merged multilingual result list (see the sketch after this list)
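A minimal sketch of step (2), under stated assumptions: fit a two-parameter logistic transform per language and query by least squares (equivalent to minimizing the mean squared error), then apply it to all retrieved documents in that language. The parameterization and the helper names are illustrative, not the authors' exact model.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(s, a, b):
    """Logistic transform from a language-specific score to a comparable score."""
    return 1.0 / (1.0 + np.exp(-(a * s + b)))

def fit_query_language_model(lang_scores_top, comparable_scores_top):
    """Fit one language-specific, query-specific logistic model.

    lang_scores_top       : language-specific scores of the top-ranked documents
    comparable_scores_top : their exact normalized comparable scores (from the
                            combined query/document translation runs)

    curve_fit chooses (a, b) by least squares, i.e. it minimizes the mean
    squared error between exact and estimated comparable scores.
    """
    (a, b), _ = curve_fit(logistic,
                          np.asarray(lang_scores_top),
                          np.asarray(comparable_scores_top),
                          p0=[1.0, 0.0])
    return a, b

def estimate_comparable_scores(lang_scores_all, a, b):
    """Apply the fitted transform to every retrieved document in that language."""
    return logistic(np.asarray(lang_scores_all), a, b)
```

The estimated comparable scores from all eight languages can then be pooled and sorted to produce the merged multilingual result list.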
Slide 17: Task 2: Experimental Results, Results Merging
- Query-independent, language-specific
- Mean average precision of merged multilingual lists of different methods on the UniNE result lists
- Mean average precision of merged multilingual lists of different methods on the Hummingbird result lists
- Estimating the model parameters by maximizing MAP is more accurate than MLE
Slide 18: Task 2: Experimental Results, Results Merging
- Query-specific, language-specific
- Mean average precision of merged multilingual lists of different methods on the UniNE result lists
- C_X: top X documents from each list, merged by their exact comparable scores
- Top_X_0.5: top X documents from each list downloaded to fit the logistic models that estimate comparable scores; the estimates are combined with the exact scores with equal weight
- In some cases, combining estimated comparable scores with exact comparable scores is more accurate than using the exact comparable scores alone
Slide 19: Task 2: Experimental Results, Results Merging
- Query-specific, language-specific
- Mean average precision of merged multilingual lists of different methods on the Hummingbird result lists
- Outperforms the query-independent, language-specific algorithm