1
Similarity Measures for Query Expansion in TopX
  • Caroline Gherbaoui

Universität des Saarlandes, Naturwissenschaftlich-Technische Fak. I, Fachrichtung 6.2 - Informatik
Max-Planck-Institut für Informatik, AG 5 - Datenbanken und Informationssysteme
Prof. Dr. Gerhard Weikum

2
Overview
  • background knowledge
  • similarity measures for the query expansion
  • evaluation of the computed similarity values
  • changes in TopX
  • conclusion

3
Background
  • top-k query processing: provides the k most relevant results
  • query expansion: extends the source query terms with related terms
  • word sense disambiguation: extracts the correct meaning of a query term
  • ontology: a set of terms with their meanings and semantic relations

4
Word Sense Disambiguation
[Figure: the query "java, coffee"; candidate senses of "java": island, coffee, programming language]

5
Query Expansion
COFFEE → drink, espresso
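
The expansion of COFFEE shown on this slide can be sketched in a few lines. This is only an illustration, not the TopX implementation: the toy ontology, its similarity values, the threshold, and the function name `expand_query` are all hypothetical.

```python
# Hypothetical toy ontology: term -> {related term: similarity in [0, 1]}.
# The similarity values are made up for illustration.
TOY_ONTOLOGY = {
    "coffee": {"drink": 0.8, "espresso": 0.7, "java": 0.4},
}


def expand_query(terms, ontology, threshold=0.5):
    """Return (term, weight) pairs: each source term with weight 1.0,
    followed by its related terms whose similarity reaches the threshold."""
    expanded = []
    for term in terms:
        expanded.append((term, 1.0))
        for related, similarity in ontology.get(term, {}).items():
            if similarity >= threshold:
                expanded.append((related, similarity))
    return expanded


print(expand_query(["coffee"], TOY_ONTOLOGY))
```

Raising the threshold keeps fewer, more closely related terms; lowering it widens the query at the cost of precision.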
6
TopX
  • top-k retrieval engine
  • text and XML data
  • word sense disambiguation
  • query expansion
  • ontology

7
TopX WordNet Ontology
  • lexicon for the English language
  • hierarchical relations
  • one relation → one direction
  • 160,000 words
  • 120,000 synsets
  • 210,000 relations

8
TopX YAGO Ontology
  • Wikipedia and WordNet
  • hierarchical and not hierarchical relations
  • one relation → two directions
  • 2,100,000 words
  • 2,200,000 concepts
  • 6,000,000 relations

9
Similarity Measures
  • Dice similarity
  • the already used measure in TopX
  • NAGA similarity
  • applied measure for YAGO
  • Best WordNet similarity
  • measure with best result among WordNet measures

10
Dice Similarity Measure
  • measures the intersection of two regions relative to their combined size
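
The formula on this slide did not survive in the transcript. As a minimal sketch, the standard Dice coefficient on two sets is 2·|A ∩ B| / (|A| + |B|). Treating the "regions" as sets of document ids in which a term occurs is an assumption made here for illustration, as are the function name and the sample data.

```python
def dice_similarity(a: set, b: set) -> float:
    """Standard Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical document ids containing each term.
docs_java = {1, 2, 3, 4}
docs_coffee = {3, 4, 5}
print(dice_similarity(docs_java, docs_coffee))
```

The value is 1.0 for identical non-empty sets and 0.0 for disjoint ones.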

11
NAGA Similarity Measure
  • combination of the confidence of a relation and
    the informativeness of a relation

12
Best WordNet Similarity Measure
  • product of the transfer function of the path
    length and the transfer function of the concept
    depth
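
The slide's formula is not preserved, so the following is only a sketch of what a "product of two transfer functions" can look like. The concrete choices are assumptions: exponential decay for the path length, tanh for the concept depth, and the illustrative parameters `alpha` and `beta`. The measure on the slide may use different functions.

```python
import math


def path_length_transfer(length: float, alpha: float = 0.2) -> float:
    """Assumed transfer function: decays from 1 towards 0 as the path grows."""
    return math.exp(-alpha * length)


def depth_transfer(depth: float, beta: float = 0.6) -> float:
    """Assumed transfer function: grows from 0 towards 1 with concept depth."""
    return math.tanh(beta * depth)


def path_depth_similarity(length, depth, alpha=0.2, beta=0.6):
    """Product of the two transfer functions."""
    return path_length_transfer(length, alpha) * depth_transfer(depth, beta)


print(path_depth_similarity(length=2, depth=5))
```

Under these assumptions, shorter paths and deeper (more specific) concepts both raise the similarity.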

13
Evaluation
14
Evaluation
  • Dice measure → applicable, also on the YAGO ontology
  • NAGA measure → applicable when the forward direction is omitted
  • Best WordNet measure → not applicable, due to the density of YAGO

15
Changes for TopX
  • tuning of some procedures: Dijkstra algorithm, word sense disambiguation, query expansion
  • extension of the configuration file
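
The slides do not say how the Dijkstra algorithm is used in TopX. One plausible use, sketched here under assumptions, is to find for each concept the best-scoring path from a source term, where a path's score is the product of its edge similarities in (0, 1]. The graph, its weights, and the function name are hypothetical.

```python
import heapq


def best_path_similarities(graph, source):
    """Dijkstra variant that maximizes the product of edge similarities.

    graph: node -> {neighbor: edge similarity in (0, 1]}
    Returns node -> best path similarity from source.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # negate scores: heapq is a min-heap
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, 0.0):
            continue  # stale entry, a better path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return best


# Hypothetical weighted ontology fragment.
toy_graph = {
    "coffee": {"drink": 0.8, "espresso": 0.9},
    "espresso": {"drink": 0.95},
    "drink": {"beverage": 0.5},
}
print(best_path_similarities(toy_graph, "coffee"))
```

Because every weight is at most 1, scores never increase along a path, which is what makes the greedy Dijkstra order valid for this product form.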

16
Conclusion
  • larger knowledge base
  • more flexibility
  • increased complexity
  • further measure for the similarity computation → NAGA similarity

17
  • Questions?