CLEF 2005: Multilingual Retrieval by Combining Multiple Multilingual Ranked Lists



1
CLEF 2005: Multilingual Retrieval by Combining
Multiple Multilingual Ranked Lists
  • Luo Si and Jamie Callan, Language Technologies
    Institute, School of Computer Science, Carnegie
    Mellon University
  • CLEF 2005

2
Task Definition
  • Multi-8 Two Years On: multilingual information
    retrieval across eight languages
  • Multi-8 Merging Only: participants merge
    provided bilingual ranked lists into a single
    multilingual list

3
Task 1: Multilingual Retrieval System
  • Method Overview

4
Task 1: Multilingual Retrieval System
  • Text preprocessing
  • Stop-word removal
  • Stemming
  • Decompounding
  • Word translation (a sketch of the first three
    steps follows below)
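A minimal sketch of such a per-language preprocessing pipeline. The slides do not name specific tools, so NLTK's stop-word lists and SnowballStemmer are assumptions, and compound_parts is a hypothetical toy lexicon for greedy decompounding:

  # Hedged sketch of preprocessing: stop-word removal, decompounding,
  # stemming. NLTK is an assumption; run nltk.download("stopwords") once.
  from nltk.corpus import stopwords
  from nltk.stem.snowball import SnowballStemmer

  stemmer = SnowballStemmer("german")
  stop_set = set(stopwords.words("german"))
  compound_parts = {"welt", "meister", "schaft"}  # toy lexicon (assumption)

  def decompound(word, min_len=4):
      # Greedy longest-match split into known parts; keep the word whole
      # if no complete decomposition exists.
      parts, start = [], 0
      while start < len(word):
          for end in range(len(word), start + min_len - 1, -1):
              if word[start:end] in compound_parts:
                  parts.append(word[start:end])
                  start = end
                  break
          else:
              return [word]
      return parts

  def preprocess(tokens):
      out = []
      for tok in (t.lower() for t in tokens):
          if tok in stop_set:
              continue                        # stop-word removal
          for part in decompound(tok):
              out.append(stemmer.stem(part))  # stemming
      return out

  # e.g. preprocess(["die", "Weltmeisterschaft"]) drops the stop word,
  # splits the compound into its parts, and stems each part.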

5
Task 1: Multilingual Retrieval System
  • Method Overview

6
Task 1: Multilingual Retrieval System
  • Method 1: multilingual retrieval via query
    translation, no query feedback; raw-score merge
    with the Okapi system
  • Method 2: multilingual retrieval via query
    translation, with query feedback; raw-score
    merge with the Okapi system
  • Method 3: multilingual retrieval via document
    translation, no query feedback; raw-score merge
    with the Okapi system
  • Method 4: multilingual retrieval via document
    translation, with query feedback; raw-score
    merge with the Okapi system
  • Method 5: the UniNE system

7
Task 1: Multilingual Retrieval System
  • Method Overview

8
Task 1: Multilingual Retrieval System
  • Normalization
  • Let drs_{k_mj} denote the raw score of the jth
    document retrieved from the mth ranked list for
    the kth query (a reconstruction of the
    normalization follows below)
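The normalization formula itself did not survive the transcript. A standard min-max rescaling, consistent with the definition above and common in results merging, maps each raw score into [0, 1] per list; this exact form is an assumption:

\[
dns_{k\_mj} \;=\; \frac{drs_{k\_mj} - \min_{j'} drs_{k\_mj'}}{\max_{j'} drs_{k\_mj'} - \min_{j'} drs_{k\_mj'}}
\]

where dns_{k_mj} denotes the normalized score of the same document.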

9
Task 1: Multilingual Retrieval System
  • Method Overview

10
Task 1: Multilingual Retrieval System
  • Combine Multilingual Ranked Lists
  • (w_m, r_m) represent the vote weight and the
    exponential normalization factor for the mth
    ranked list (see the sketch below)
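A hedged sketch of this combination step. The exact functional form is not preserved in the transcript, so treating a document's vote as the weighted sum, over lists, of its normalized score raised to r_m is an assumption:

  # Combine M multilingual ranked lists into one final list. Each list maps
  # doc_id -> normalized score dns in [0, 1] (see the normalization slide).
  from collections import defaultdict

  def combine_lists(ranked_lists, weights, exponents):
      combined = defaultdict(float)
      for scores, w_m, r_m in zip(ranked_lists, weights, exponents):
          for doc_id, dns in scores.items():
              combined[doc_id] += w_m * (dns ** r_m)  # weighted vote
      return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

  # Example with two lists, equal weights, unit exponents:
  lists = [{"doc_a": 0.9, "doc_b": 0.4}, {"doc_b": 0.8, "doc_c": 0.3}]
  print(combine_lists(lists, [0.5, 0.5], [1.0, 1.0]))
  # doc_b collects votes from both lists and ranks first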

11
Task 1: Experimental Results, Multilingual
Retrieval
  • Qry/Doc: whether queries or documents were
    translated
  • fb/nofb: with/without pseudo-relevance feedback
  • UniNE: the UniNE system

12
Task 1: Experimental Results, Multilingual
Retrieval
  • MX: combinations of models
  • W1/Trn: equal or learned combination weights

13
Task 2: Results Merging for Multilingual Retrieval
  • Merge ranked lists from eight different
    languages (i.e., bilingual or monolingual runs)
    into a single final list
  • Can a logistic model of (rank, document score)
    make the lists comparable?
  • Should merging methods be language-specific?
  • Or both query-specific and language-specific?

14
Task 2: Results Merging for Multilingual Retrieval
  • Learn a query-independent, language-specific
    merging model
  • Estimates the probability of relevance of
    document d_{k_ij}
  • Model parameters are estimated by either
  • maximizing the log-likelihood (MLE), or
  • maximizing MAP (see the sketch below)
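A minimal sketch of the query-independent, language-specific model, assuming the common parameterization P(rel | d) = 1 / (1 + exp(-(a*rank + b*score + c))) per language, fit here by MLE with plain gradient ascent. The MAP-maximizing variant would instead search the parameter space for the setting that yields the best mean average precision on training queries:

  # Hedged sketch: per-language logistic model of relevance from
  # (rank, score); the (a, b, c) parameterization is an assumption, and
  # ranks/scores are assumed pre-normalized to comparable magnitudes.
  import numpy as np

  def p_rel(rank, score, a, b, c):
      return 1.0 / (1.0 + np.exp(-(a * rank + b * score + c)))

  def fit_mle(rank, score, rel, lr=0.1, iters=5000):
      # Gradient ascent on the log-likelihood of binary relevance labels.
      a = b = c = 0.0
      for _ in range(iters):
          err = rel - p_rel(rank, score, a, b, c)
          a += lr * np.mean(err * rank)
          b += lr * np.mean(err * score)
          c += lr * np.mean(err)
      return a, b, c

  # Toy training data from one language's past queries:
  rank  = np.array([1, 2, 3, 4, 5], dtype=float) / 5.0  # normalized rank
  score = np.array([0.9, 0.7, 0.6, 0.3, 0.1])
  rel   = np.array([1, 1, 0, 0, 0], dtype=float)        # relevance labels
  a, b, c = fit_mle(rank, score, rel)
  # Merging then sorts all languages' documents by p_rel(...).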

15
Task 2: Results Merging for Multilingual Retrieval
  • Learn a query-specific, language-specific
    merging model
  • Calculate comparable scores for the top-ranked
    documents in each language
  • (1) Combine scores of the query-based and
    doc-based translation methods
  • (2) Build language-specific, query-specific
    logistic models to transform language-specific
    scores into comparable scores

16
Task 2: Results Merging for Multilingual Retrieval
  • (2) Build language-specific, query-specific
    logistic models to transform language-specific
    scores into comparable scores
  • Logistic model parameters are estimated to
  • minimize the mean squared error between the
    exact normalized comparable scores and the
    estimated comparable scores
  • Estimate comparable scores for all retrieved
    documents in each language
  • Use the comparable scores to create a merged
    multilingual result list (see the sketch below)
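A hedged sketch of this estimation, assuming scipy's curve_fit for the least-squares fit. The exact comparable scores, available only for the downloaded and translated top documents, supervise a per-query, per-language logistic map from the language-specific score, which is then applied to everything retrieved; all numbers below are hypothetical:

  # Fit a query/language-specific logistic model mapping language-specific
  # scores to comparable scores by minimizing MSE on the top X documents,
  # then extrapolate to all retrieved documents.
  import numpy as np
  from scipy.optimize import curve_fit

  def logistic(x, a, b):
      return 1.0 / (1.0 + np.exp(-(a * x + b)))

  # Exact comparable scores for the top X = 5 docs of one language/query:
  lang_scores = np.array([12.1, 11.4, 10.9, 9.8, 8.2])   # language-specific
  comparable  = np.array([0.91, 0.85, 0.80, 0.62, 0.40])  # exact comparable

  # Rescale by the top-X range so the fit is well conditioned.
  lo, hi = lang_scores.min(), lang_scores.max()
  norm = lambda s: (s - lo) / (hi - lo)
  (a, b), _ = curve_fit(logistic, norm(lang_scores), comparable, p0=[1.0, 0.0])

  # Estimate comparable scores for *all* retrieved documents in this
  # language; the merged multilingual list sorts every language's
  # documents by these estimates.
  all_scores = np.array([12.1, 11.4, 10.9, 9.8, 8.2, 7.5, 6.0, 4.4])
  estimated = logistic(norm(all_scores), a, b)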

17
Task 2: Experimental Results, Results Merging
  • Query-independent, language-specific
  • Mean average precision of merged multilingual
    lists of different methods on UniNE result lists
  • Mean average precision of merged multilingual
    lists of different methods on Hummingbird result
    lists

Estimating parameters by maximizing MAP is more
accurate than MLE
18
Task 2: Experimental Results, Results Merging
  • Query-specific, language-specific
  • Mean average precision of merged multilingual
    lists of different methods on UniNE result lists
  • C_X: top X docs from each list, merged by exact
    comparable scores
  • Top_X_0.5: top X docs from each list downloaded;
    a logistic model estimates comparable scores,
    which are combined with the exact scores at
    equal weight

This means that combining estimated comparable
scores with exact comparable scores can, in some
cases, be more accurate than the exact comparable
scores alone.
19
Task 2: Experimental Results, Results Merging
  • Query-specific, language-specific
  • Mean average precision of merged multilingual
    lists of different methods on Hummingbird result
    lists
  • Outperforms the query-independent,
    language-specific algorithm