Semantic Metrics - PowerPoint PPT Presentation

1
Semantic Metrics
  • Bo Hu, Y. Kalfoglou, H. Alani,
  • D. Dupplaw, P. Lewis, N. Shadbolt
  • School of Electronics and Computer Science
  • University of Southampton, UK

2
Gist
  • Yet another metric?
  • Current approaches
  • Why do we need yet another one?
  • Background
  • Distance measures
  • Description logics
  • Distances between concepts
  • Signaturing concepts
  • How far apart are they?
  • Beyond concept distances
  • Discussion and Conclusions

3
Why do we need it?
  • Ontology authoring and publishing is far easier
    than we once could imagine
  • Gxxgle person ontology 574 returns
  • Swxxgle person 3967 results
  • person is defined in FoaF, Akt-portal,
    umbc-publication, etc. in different ways.
  • Semantic distance is the root for many semantic
    web tasks
  • Ontology mapping/alignment/articulation/merging
  • Ontology search, ranking and reuse
  • Ontology segmentation
  • Ontology evaluation

4
Current approaches
  • Name-based and lexical approaches
  • Consider only names and URIs with help from
    lexicons, e.g. WordNet, UMLS
  • Ignore concept definitions
  • Graph-based approaches
  • Convert ontologies into graphs
  • Ignore the difference among links
  • Feature-based approaches
  • Decompose concepts into features
  • Process of featuring concepts is not formally
    studied
  • Should interim concepts along inheritance paths
    or role chains be considered equally important?
  • Logic-based approaches
  • Rewrite as SAT problem
  • Only gives abstract relations

5
Background
6
Distance functions
  • A distance function satisfies non-negativity,
    symmetry and the triangle inequality
  • Some distance measures
  • Minkowski distance
  • Vector space model
  • Kullback-Leibler divergence
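
Two of the measures listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, vectors are plain lists of numbers, and the distributions passed to the Kullback-Leibler function are assumed to be discrete and already normalised.

```python
import math

def minkowski(x, y, p=2):
    """Minkowski distance of order p between two equal-length vectors.
    p=1 gives the Manhattan distance, p=2 the Euclidean distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) of two discrete distributions.
    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Note that the Kullback-Leibler divergence is not symmetric, so on its own it does not meet the definition of a distance function given on this slide.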

7
Description logics
  • Subset of FOPL describing knowledge with concepts
    and roles
  • father is a man whose child is a human
  • Modern DL systems are built on tableau-based
    algorithms
  • Underlying logic for OWL-Lite, OWL-DL
  • OWL-DL is based on SHOIN
  • SHOIN constructs

8
Semantic distance measure
  • Take full advantage of DLs
  • Unfold concepts into signatures
  • Weight semantics-bearing signatures
  • Compute distance

9
Signaturing concepts 1/3
  • In DLs, concepts are restricted with roles
    through role value (range) and role cardinality
    restrictions
  • E.g.
  • Unfold all the relevant restrictions
  • Beth definability: implicit restrictions can be
    made explicit
  • Acyclic definitions

10
Signaturing concepts 2/3
  • Unfolding rules
  • Tableau construction rules of DL (Baader et al.,
    2003)
  • Disjoint is explicated as complement

11
Signaturing concepts 3/3
  • Examples
  • Signature of book; elements are tuples
    <head, tail>, A-Box assertions
  • Signature of author
  • Implicit restrictions on authors have been
    expanded as part of the Book signature
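
The idea of unfolding a concept into a signature can be sketched as follows. This is a toy illustration, not the unfolding rules from the slides: the `TBOX` dictionary, the `unfold` function and the choice of a role chain as head and a primitive concept as tail are all assumptions made for the example, and definitions are assumed to be acyclic.

```python
# Hypothetical toy TBox: each defined concept maps to a list of parts.
# A part is either a concept name or a (role, filler) restriction.
TBOX = {
    "Book": ["Publication", ("hasAuthor", "Author")],
    "Author": ["Person", ("hasPublication", "Publication")],
}

def unfold(concept, tbox, prefix=()):
    """Return the signature of `concept` as a set of (head, tail) tuples.

    head is the chain of roles leading to the element and tail is a
    primitive concept. Assumes the definitions in `tbox` are acyclic.
    """
    if concept not in tbox:            # primitive concept: nothing to unfold
        return {(prefix, concept)}
    signature = set()
    for part in tbox[concept]:
        if isinstance(part, tuple):    # role restriction: extend the role chain
            role, filler = part
            signature |= unfold(filler, tbox, prefix + (role,))
        else:                          # named concept: unfold it in place
            signature |= unfold(part, tbox, prefix)
    return signature
```

Calling `unfold("Book", TBOX)` replaces the interim concept Author with its own definition, so the restrictions on authors end up inside the Book signature.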

12
Weighting signatures
  • tf-idf
  • Why tf-idf vector space model?
  • Provide a foundation for computing numeric
    similarity
  • Computationally cheap and easy to implement
  • Emphasises differences
  • Pure syntactic approach
  • Add semantic flavour
  • Assign -1 to complement
  • Reduce the weight value based on its contexts
  • Weight of hasPublication property referred to
    in author concept is
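
A plain tf-idf weighting over signatures might look like the sketch below. It covers only the purely syntactic part; the semantic adjustments mentioned above (complements, context-based reduction) are not modelled. The function name and the input layout, a dictionary from concept name to a list of signature elements, are assumptions.

```python
import math
from collections import Counter

def tfidf_weights(signatures):
    """Weight signature elements with plain tf-idf.

    signatures: dict mapping a concept name to a list of hashable elements.
    Returns a dict mapping each concept to {element: weight}.
    """
    n = len(signatures)
    # Document frequency: in how many signatures does each element occur?
    df = Counter()
    for elements in signatures.values():
        df.update(set(elements))
    weights = {}
    for concept, elements in signatures.items():
        tf = Counter(elements)
        total = len(elements)
        weights[concept] = {
            e: (count / total) * math.log(n / df[e]) for e, count in tf.items()
        }
    return weights
```

An element that occurs in every signature gets weight 0, which is how the scheme emphasises differences between concepts.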

13
Computing distances
  • Distance between concepts is reduced to distance
    between signature vectors
  • Compute similarity of signature vectors
  • Compute distance of signatures
  • Find the minimum distance of concepts
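
As a rough sketch of these steps, signature vectors can be compared with cosine similarity and the result turned into a distance. The formulas on the original slide did not survive in this transcript, so the conversion `1 - similarity` is an assumption, as are the function names and the sparse `{element: weight}` representation.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors given as {element: weight}."""
    dot = sum(w * v[e] for e, w in u.items() if e in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0                     # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)

def signature_distance(u, v):
    """Assumed conversion from similarity to distance: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)
```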

14
The distances
  • Symmetric, non-negative
  • Only a relative value
  • Bigger distance value suggests two concepts are
    further apart
  • 0 suggests identical concepts
  • Disadvantages
  • Only a distance value, not able to represent
    abstract relations, e.g. more general than, more
    specific than
  • A mix of semantic and syntactic methods,
    difficult to formalise
  • Cannot handle negative similarity values.
  • Advantages
  • Anonymous concepts, e.g. <owl:Restriction>, are
    made explicit.
  • Non-primitive concepts are recursively unfolded
  • Interim concepts are replaced with their
    signatures.
  • Weights are computed systematically.

15
Extending semantic metric of concepts
16
Distance between concepts from different ontologies
  • Effectively, ontology mapping
  • Inputs from other systems and human experts are
    necessary
  • ?(C, C) needs to be initialised with mappings
    among primitive concepts and properties from
    different ontologies
  • Initial mappings can be defined based on concept
    names
  • String distance based algorithms
  • WordNet, UMLS, etc.
  • Compute distance
  • Compute similarity of signature vectors
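
Name-based initial mappings could be bootstrapped with a string similarity measure. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a string distance algorithm; the function names and the 0.8 threshold are assumptions, and lexicon lookups (WordNet, UMLS) are not covered.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def initial_mappings(names_a, names_b, threshold=0.8):
    """Map each name in names_a to its best match in names_b.

    A pair is kept only if its similarity reaches `threshold`
    (an arbitrary value chosen for this example).
    """
    mappings = {}
    for a in names_a:
        best = max(names_b, key=lambda b: name_similarity(a, b), default=None)
        if best is not None and name_similarity(a, best) >= threshold:
            mappings[a] = best
    return mappings
```

Mappings produced this way are only a starting point; as noted above, input from other systems and human experts is still needed.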

17
Distance between concept and a set of concepts
  • Distance between C and a group of concepts O
  • Accumulate individual similarities
  • ?(O,C) is accumulated as the information gain of
    each concept in O with regard to C.
  • Emulate probability with similarity
  • Yet, probability ≠ similarity
  • p(D|C) = sim(D, C) / Σ_{D'∈O} sim(D', C)
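
The normalisation step that turns similarities into probability-like values can be sketched as below. Only the normalisation is shown; how the information gain is then accumulated is not recoverable from this transcript. The function name and the input layout are assumptions.

```python
def similarity_as_probability(sims):
    """Normalise similarities so that they sum to 1.

    sims: dict mapping each concept D in the group O to sim(D, C).
    Returns a dict mapping D to sim(D, C) divided by the sum of all
    similarities in the group.
    """
    total = sum(sims.values())
    if total <= 0:
        # Negative or all-zero similarities cannot be read as probabilities.
        raise ValueError("similarities must sum to a positive value")
    return {d: s / total for d, s in sims.items()}
```

The guard on the total reflects the limitation already noted in the slides: negative similarity values do not fit this scheme.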

18
Distance between groups of concepts
  • Compute as an aggregation of pair-wise concept
    distances
  • Aggregation method
  • Minkowski ?1,2,
  • Kullback-Leibler
  • A perfect concept C0 is defined as the
    reference point
  • p(Ck) computed as the relative tightness with
    regard to C0
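
One possible reading of the Minkowski aggregation is sketched below. It is an assumption, not the method from the slides: the transcript does not say which pairs are aggregated or how the result is normalised, so this version simply combines every pair-wise distance between the two groups. The function name and the `distance` callback are hypothetical.

```python
from itertools import product

def aggregate_minkowski(group_a, group_b, distance, p=2):
    """Aggregate all pair-wise distances between two groups of concepts
    with a Minkowski-style norm of order p.

    distance: a function taking one concept from each group and
    returning a non-negative number.
    """
    pairs = product(group_a, group_b)
    return sum(distance(a, b) ** p for a, b in pairs) ** (1.0 / p)
```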

19
Discussion and conclusions
  • Possible applications
  • Ontology mapping
  • Ontology segmentation
  • Compute distance between concepts
  • Cut off concept below the threshold
  • Ontology ranking
  • Compute the aggregated distance between two
    ontologies
  • Or compute the aggregated distance between an
    ontology and a group of concepts.
  • Future work
  • Neat solution for un-unfolded universal
    quantification
  • Find a proper p(D|C) replacement
  • Problem with tf-idf
  • How to treat negative concept similarity
  • Syntax-based; a semantics-oriented weighting
    scheme is needed
  • More evaluation is forthcoming

20
Questions?