Relevance Feedback

1
Relevance Feedback
  • User tells system whether returned/disseminated
    documents are relevant to query/information need
    or not
  • Feedback
  • usually positive
  • sometimes negative
  • always incomplete
  • Hypothesis: relevant docs should be more like
    each other than like non-relevant docs

2
Relevance Feedback: Purpose
  • Augment keyword retrieval: Query Reformulation
  • give user opportunity to refine their query
  • tailored to individual
  • exemplar based: a different type of information
    from the query
  • Iterative, subjective improvement
  • Evaluation!

3
Relevance Feedback: Examples
  • Image Retrieval
  • http://www.cs.bu.edu/groups/ivc/ImageRover/
  • http://nayana.ece.ucsb.edu/imsearch/imsearch.html
  • http://www.mmdb.ece.ucsb.edu/demo/corelacm/

4
Relevance Feedback: Early Usage by Rocchio
  • Modify the original keyword query
  • strengthen terms in relevant docs
  • weaken terms in non-relevant docs
  • modify the original query by weighting based on
    the amount of feedback (see the sketch below)
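A common way to write this update is the standard Rocchio formulation; the
symbols alpha, beta, and gamma are the conventional names for the adjustable
weights and are not taken verbatim from the slides.

```latex
% Rocchio query modification: move the query toward the centroid of the
% relevant documents D_r and away from the non-relevant documents D_n.
% alpha, beta, gamma control how strongly the feedback reweights the query.
\vec{q}_m \;=\; \alpha\,\vec{q}_0
  \;+\; \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j
  \;-\; \frac{\gamma}{|D_n|} \sum_{\vec{d}_j \in D_n} \vec{d}_j
```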

5
Relevance Feedback: Early Results
  • Evaluation
  • how much feedback was needed
  • how did recall/precision change
  • Conclusion
  • recall/precision improved after even one
    iteration, even when up to 20 non-relevant docs
    were returned
  • Promising technique

6
Query Reformulation
  • The user does not know enough about the document
    set to construct an optimal query initially.
  • Querying is an iterative learning process that
    repeats two steps (a sketch of the loop follows
    this list)
  • expand original query with new terms (query
    expansion)
  • assign weights to the query terms (term
    reweighting)
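A minimal sketch of that loop; retrieve, get_feedback, expand_query, and
reweight_terms are hypothetical callables standing in for an actual IR
system's components, not names from the slides.

```python
def feedback_loop(query, retrieve, get_feedback, expand_query,
                  reweight_terms, rounds=3):
    """Iteratively refine a query from relevance judgments (hedged sketch).

    The four callables are hypothetical stand-ins for the surrounding IR
    system: search, collect user judgments, expand, and reweight.
    """
    for _ in range(rounds):
        results = retrieve(query)                      # run the current query
        relevant, nonrelevant = get_feedback(results)  # user's (partial) judgments
        if not relevant and not nonrelevant:
            break                                      # feedback is always incomplete
        query = expand_query(query, relevant)                 # step 1: query expansion
        query = reweight_terms(query, relevant, nonrelevant)  # step 2: term reweighting
    return query
```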

7
Query Reformulation Approaches
  • Relevance feedback based
  • vector model (Rocchio)
  • probabilistic model (Robertson & Sparck Jones,
    Croft)
  • Cluster based
  • Local analysis: derive information from the
    retrieved document set
  • Global analysis: derive information from the
    corpus

8
Vector Based Reformulation
  • Rocchio (1965), with adjustable weights
  • Ide Dec-Hi (1968): counts only the most similar
    non-relevant document (both are contrasted in the
    sketch below)
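A compact sketch of the two vector updates; relevant and nonrelevant are
document term-weight vectors in the same vocabulary order as q0, and the
default alpha/beta/gamma values and the clipping of negative weights to zero
are common conventions rather than constants given on the slides.

```python
import numpy as np

def rocchio(q0, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio: add the centroid of relevant docs, subtract the centroid
    of non-relevant docs, each scaled by an adjustable weight."""
    q = alpha * np.asarray(q0, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    return np.maximum(q, 0.0)      # negative term weights are usually clipped

def ide_dec_hi(q0, relevant, nonrelevant, alpha=1.0, beta=1.0, gamma=1.0):
    """Ide Dec-Hi: sum all relevant docs, but subtract only the single
    highest-ranked (most similar) non-relevant doc."""
    q = alpha * np.asarray(q0, dtype=float)
    if len(relevant):
        q += beta * np.sum(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.asarray(nonrelevant[0], dtype=float)  # assumes rank order
    return np.maximum(q, 0.0)
```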

9
Probabilistic Reformulation
  • Recall the probabilistic ranking formula from
    earlier (one common form is restated below)
  • we still need to estimate the probabilities
  • do so using relevance feedback!
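The formula image did not survive the transcript; a standard statement of the
probabilistic (binary independence) ranking it refers to, reconstructed here
rather than copied from the slide, is:

```latex
% Rank documents by the log-odds that their terms occur in relevant rather
% than non-relevant documents; only terms shared by query and document count.
\mathrm{sim}(d_j, q) \;\propto\; \sum_{k_i \in q \,\cap\, d_j}
  \log \frac{P(k_i \mid R)\,\bigl(1 - P(k_i \mid \bar{R})\bigr)}
            {P(k_i \mid \bar{R})\,\bigl(1 - P(k_i \mid R)\bigr)}
```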

10
Estimating Probabilities by Accumulating
Statistics
  • D_r is the set of relevant docs
  • D_r,i is the set of relevant docs containing term k_i
  • n_i is the number of docs in the corpus containing
    term k_i (these counts feed the estimates below)
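The accumulation formulas themselves are missing from the transcript; with the
definitions above, and writing N for the corpus size (which the slide does not
name), the usual estimates are:

```latex
% Estimate the term probabilities from the relevance judgments seen so far.
% N is the total number of documents in the corpus.
P(k_i \mid R) \;\approx\; \frac{|D_{r,i}|}{|D_r|},
\qquad
P(k_i \mid \bar{R}) \;\approx\; \frac{n_i - |D_{r,i}|}{N - |D_r|}
```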

11
Computing Similarity (Term Reweighting)
  • assumes term independence and binary document
    indexing (the resulting weights are reconstructed
    below)
  • Cons: no term weighting, no query expansion,
    ignores previous weights
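Substituting the estimates from the previous slide into the ranking formula
gives the usual reweighted similarity under binary indexing; this is a
reconstruction of the slide's missing formula, not a verbatim copy.

```latex
% Term reweighting after feedback: each term shared by query and document
% contributes a log-odds ratio built from the counts |D_r|, |D_{r,i}|, n_i, N.
\mathrm{sim}(d_j, q) \;\propto\; \sum_{k_i \in q \,\cap\, d_j}
  \log \frac{\dfrac{|D_{r,i}|}{|D_r| - |D_{r,i}|}}
            {\dfrac{n_i - |D_{r,i}|}{N - |D_r| - n_i + |D_{r,i}|}}
```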

12
Croft Extensions
  • include within-document frequency weights
  • initial search variant

The last term is the normalized within-document
frequency; C and K are adjustable parameters (one
common statement of this variant is given below).
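The formula itself is missing from the transcript; the following is a hedged
reconstruction of Croft's initial-search variant as it is commonly presented,
and it may differ in detail from the original slide.

```latex
% Initial search: idf stands in for the not-yet-known relevance weights, and
% F_{i,j} is the normalized within-document frequency of term k_i in doc d_j.
% C and K are the adjustable parameters mentioned above.
\mathrm{sim}(d_j, q) \;\propto\; \sum_{k_i \in q \,\cap\, d_j}
  (C + \mathrm{idf}_i)\, F_{i,j},
\qquad
F_{i,j} \;=\; K + (1 - K)\,\frac{f_{i,j}}{\max_l f_{l,j}}
```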
13
Query Reformulation: Summary So Far
  • Relevance feedback can produce dramatic
    improvements.
  • However, one must be careful that previously
    judged documents are not counted as part of the
    improvement, and the techniques have limitations.
  • The next round of improvements requires
    clustering.

14
Croft Feedback Searches
  • Use probability updates as in Robertson

15
Assumptions
  • Initial query was a good approximation.
  • Ideal query is approximated by shared terms in
    relevant documents.

16
Assumptions
  • Initial query was a good approximation.
  • polysemy? synonyms?
  • slang? concept drift?
  • Ideal query is approximated by shared terms in
    relevant documents.