1
Expanding Query Terms in Context
  • Chris Staff and Robert Muscat
  • Department of Computer Science and AI
  • University of Malta

2
Aims of this presentation
  • Background
  • The Vocabulary Problem in IR
  • Scenario
  • Using retrieved documents to determine how to
    expand query
  • Approach
  • Evaluation

3
The Vocabulary Problem
  • Furnas et al. (1987) find that any two people
    use the same term to describe the same
    concept/object with a probability of less than 0.2
  • This is a huge problem for IR
  • High probability of finding some documents about
    your term (but watch ambiguous terms!)
  • Low probability of finding all documents about
    your concept (so low coverage)

4
What's Query Expansion?
  • Adding terms to query to improve recall while
    keeping precision high
  • Recall is 1 when all relevant docs are retrieved
  • Precision is 1 when all retrieved docs are
    relevant
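
The two definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `precision_recall` and the document ids are hypothetical, and a real evaluation would use relevance judgements from a test collection.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: doc ids returned by the system
    relevant:  doc ids judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant docs that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 docs retrieved, 3 docs relevant, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
print(p, r)
```

Here half of the retrieved docs are relevant (precision 0.5) and two of the three relevant docs are found (recall 2/3). Adding a synonym to the query would aim to pull in `d5` without pulling in more non-relevant docs.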

5
What's Query Expansion?
  • Attempts to improve recall (adding synonyms)
    usually involve constructed thesaurus (Qiu et al,
    1995, Mandala et al, 1999, Voorhees, 1994)
  • Attempts to improve precision (by adding
    restricting terms) now based around automatic
    relevance feedback (e.g., Mitra et al, 1998)
  • Indiscriminate query expansion can lead to loss
    of precision (Voorhees, 1994) or hurt recall

6
Scenario
  • Two users search for information related to the
    same concept C
  • User queries Q1 and Q2 have no terms in common
  • R1 and R2 are results sets of Q1 and Q2
    respectively
  • Rcommon = R1 ∩ R2

7
Scenario
  • We assume that Rcommon is small and non-empty
    (Furnas, 1985 and Furnas et al, 1987)
  • If Rcommon is large then Q1 and Q2 will both
    retrieve same set of documents
  • Can determine (using WordNet) if any term in Q1
    is the synonym of a term in Q2
  • Some doc Dk in Rcommon probably includes both
    terms (because of way Web IR works)!

8
Scenario
  • If t1 in Q1 and t2 in Q2 are synonyms
  • Can expand either in future queries containing t1
    or t2
  • As long as doc Dk appears in results set (the
    context)
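
The scenario can be sketched as follows, under loud assumptions: `TOY_SYNONYMS` is a hand-made stand-in for a WordNet lookup, and the queries, document ids, and function names are all hypothetical. The sketch only shows the idea of recording a synonym pair against the documents in Rcommon, not the authors' implementation.

```python
# Toy stand-in for a WordNet synonym lookup (hypothetical data).
TOY_SYNONYMS = {
    "car": {"automobile", "auto"},
    "automobile": {"car", "auto"},
    "rental": {"hire"},
    "hire": {"rental"},
}


def synonym_pairs(q1, q2, synonyms=TOY_SYNONYMS):
    """Return sorted (t1, t2) pairs where t1 in Q1 is a synonym of t2 in Q2."""
    return sorted(
        (t1, t2) for t1 in q1 for t2 in q2 if t2 in synonyms.get(t1, set())
    )


def learn_in_context(q1, r1, q2, r2):
    """Attach synonym pairs linking Q1 and Q2 to each doc in Rcommon."""
    r_common = set(r1) & set(r2)  # Rcommon = R1 ∩ R2
    pairs = synonym_pairs(q1, q2)
    if not pairs:
        return {}
    return {doc: pairs for doc in sorted(r_common)}


# Two users describe the same concept with no terms in common.
q1, r1 = ["car", "rental"], ["d3", "d7", "d9"]
q2, r2 = ["automobile", "hire"], ["d7", "d12"]
context = learn_in_context(q1, r1, q2, r2)
print(context)
```

In this toy run only `d7` is in both results sets, so the pairs are recorded in the context of `d7`; a future query containing "car" would be expanded with "automobile" only if `d7` shows up in its results.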

9
Approach
  • Learning synonyms in context
  • Query Expansion

10
Learning Synonyms in Context
  • A document is associated with the bag of words
    ever used to retrieve it
  • A (term, document) pair is associated with a synset
    for the term in the context of the doc
  • Word sense from WordNet also recorded to reduce
    ambiguity
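
One possible way to hold these associations in memory is sketched below. The class `ContextStore`, its method names, and the sample data are assumptions made for illustration; the slides do not say how the data is actually stored.

```python
from collections import defaultdict


class ContextStore:
    """Hypothetical store for what is learned about docs and terms."""

    def __init__(self):
        # doc id -> every query term ever used to retrieve that doc
        self.bag_of_words = defaultdict(set)
        # (term, doc id) -> (word category, word sense, synonyms)
        self.synsets = {}

    def record_retrieval(self, doc_id, query_terms):
        """Add the terms of a query that retrieved doc_id to its bag of words."""
        self.bag_of_words[doc_id].update(query_terms)

    def record_synset(self, term, doc_id, category, sense, synonyms):
        """Store the synset (with category and sense) of term in this doc's context."""
        self.synsets[(term, doc_id)] = (category, sense, frozenset(synonyms))

    def synset_for(self, term, doc_id):
        """Return the stored synset for term in the context of doc_id, or None."""
        return self.synsets.get((term, doc_id))


store = ContextStore()
store.record_retrieval("d7", ["car", "rental"])
store.record_retrieval("d7", ["automobile", "hire"])
store.record_synset("car", "d7", "noun", 1, {"car", "automobile"})
print(store.bag_of_words["d7"], store.synset_for("car", "d7"))
```

Keying synsets on the (term, document) pair rather than on the term alone is what lets the same term carry different senses in different documents.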

11
Query Expansion in Context
  • Submit unexpanded original user query Q to obtain
    results set R
  • For each document Dk in R (k is rank) retrieve
    synsets for terms in Q
  • Same query term in context of different docs in R
    may yield inconsistent synsets
  • Countered using Inverse Document Relevance
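
The lookup step above, and the inconsistency it can produce, might look like the sketch below. The table `synset_table`, the function names, and the "bank" example are hypothetical; synsets are reduced to a (word category, word sense) pair to keep the example short.

```python
def synsets_in_context(query_terms, results, synset_table):
    """For each query term, collect (rank, doc, synset) over the docs in R.

    results:      ranked list of doc ids (rank 1 first)
    synset_table: (term, doc id) -> (word category, word sense)
    """
    found = {term: [] for term in query_terms}
    for rank, doc in enumerate(results, start=1):
        for term in query_terms:
            synset = synset_table.get((term, doc))
            if synset is not None:
                found[term].append((rank, doc, synset))
    return found


def is_inconsistent(entries):
    """True if the docs in R disagree about the synset of a term."""
    return len({synset for _, _, synset in entries}) > 1


# Hypothetical data: "bank" was learned with different senses in d1 and d2.
synset_table = {("bank", "d1"): ("noun", 1), ("bank", "d2"): ("noun", 2)}
found = synsets_in_context(["bank"], ["d1", "d2", "d3"], synset_table)
print(found["bank"], is_inconsistent(found["bank"]))
```

When `is_inconsistent` is true, some weighting of the documents is needed to decide which synset to trust for this query.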

12
Inverse Document Relevance
  • IDR is relative frequency with which doc d is
    retrieved in rank k when term q occurs in the
    query
  • IDR_{q,d} = W_{q,d} / W_d (where W_d is the number of times d
    is retrieved, and W_{q,d} the number of times d is retrieved
    when q occurs in the query)
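
A small worked sketch of the IDR ratio, assuming the counts come from a query log of (query terms, retrieved docs) entries. The log format, function names, and data are hypothetical; the slide's own definition also mentions rank k, which this sketch leaves out because the formula itself does not use it.

```python
from collections import Counter


def idr_counts(log):
    """Count W_d and W_{q,d} from a log of (query_terms, retrieved_docs)."""
    w_d, w_qd = Counter(), Counter()
    for terms, results in log:
        for d in set(results):
            w_d[d] += 1  # d was retrieved once more
            for q in set(terms):
                w_qd[(q, d)] += 1  # ... while q occurred in the query
    return w_d, w_qd


def idr(q, d, w_d, w_qd):
    """IDR_{q,d} = W_{q,d} / W_d, or 0.0 if d was never retrieved."""
    return w_qd[(q, d)] / w_d[d] if w_d[d] else 0.0


# Hypothetical log: d7 is retrieved three times, twice by queries with "car".
log = [
    (["car", "rental"], ["d7", "d3"]),
    (["automobile", "hire"], ["d7", "d12"]),
    (["car", "insurance"], ["d7", "d9"]),
]
w_d, w_qd = idr_counts(log)
print(idr("car", "d7", w_d, w_qd), idr("hire", "d7", w_d, w_qd))
```

With this log, "car" accounts for two of the three retrievals of `d7` (IDR 2/3) and "hire" for one of them (IDR 1/3).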

13
Term Document Relevance
  • We then re-rank documents in R based on their TDR
  • TDR_{q,d,k} = IDR_{q,d} × W_{q,d,k} / W_{d,k}
  • Synsets of top-10 re-ranked documents are merged
    according to word category and sense
  • Synset of the most frequently occurring (word category,
    word sense) pair is used to expand q in the query
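
The re-ranking and merging steps could be sketched as below. Several things here are assumptions, not taken from the slides: W_{d,k} is read as the number of times d is retrieved at rank k, and W_{q,d,k} as the number of those times in which q occurs in the query (by analogy with the IDR counts); the function names, count dictionaries, and "bank" data are hypothetical.

```python
from collections import Counter


def tdr(q, d, k, w_d, w_qd, w_dk, w_qdk):
    """TDR_{q,d,k} = IDR_{q,d} * W_{q,d,k} / W_{d,k} (0.0 if a count is missing)."""
    if not w_d.get(d) or not w_dk.get((d, k)):
        return 0.0
    idr = w_qd.get((q, d), 0) / w_d[d]
    return idr * w_qdk.get((q, d, k), 0) / w_dk[(d, k)]


def expand_term(q, results, scores, synsets, top_n=10):
    """Expand q using the synsets of the top_n docs after re-ranking by TDR.

    results: doc ids in original rank order
    scores:  doc id -> TDR score for q
    synsets: (q, doc id) -> (word category, word sense, set of synonyms)
    """
    reranked = sorted(results, key=lambda d: scores.get(d, 0.0), reverse=True)
    votes, merged = Counter(), {}
    for d in reranked[:top_n]:
        entry = synsets.get((q, d))
        if entry is None:
            continue
        category, sense, words = entry
        votes[(category, sense)] += 1  # one vote per doc for this pair
        merged.setdefault((category, sense), set()).update(words)
    if not votes:
        return {q}  # nothing learned yet: leave the term unexpanded
    best_pair, _ = votes.most_common(1)[0]
    return merged[best_pair] | {q}


score = tdr("car", "d7", 1, {"d7": 3}, {("car", "d7"): 2},
            {("d7", 1): 2}, {("car", "d7", 1): 1})
synsets = {
    ("bank", "d1"): ("noun", 2, {"bank", "riverbank"}),
    ("bank", "d2"): ("noun", 1, {"bank", "depository"}),
    ("bank", "d3"): ("noun", 1, {"bank", "banking company"}),
}
expanded = expand_term("bank", ["d1", "d2", "d3"],
                       {"d1": 0.1, "d2": 0.5, "d3": 0.4}, synsets)
print(score, expanded)
```

In the toy data two of the three docs agree on the (noun, 1) sense, so "bank" is expanded with "depository" and "banking company" and the minority "riverbank" sense is dropped.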

14
Evaluation
  • Need huge query log, ideally, with relevance
    judgements for queries
  • We have the TREC QA collection, but we'll need to
    index it before running the test queries
    through it (using, e.g., SMART)
  • Disadvantage that there might not be enough
    queries
  • User Studies

15
  • Thank you!