Ontology Maintenance with an Algebraic Methodology: a Case Study


1
Ontology Maintenance with an Algebraic
Methodology: a Case Study
  • Jan Jannink, Gio Wiederhold
  • Presented by
    Lei Lei

2
Challenges
  • Obstacle: autonomy of diverse knowledge sources
  • Data volatility and volume increase cost
  • Major challenge: establish and maintain the
    application-specific portion of knowledge sources

3
An Algebraic Approach
  • Construct virtual knowledge bases geared to a
    specific application
  • Use composable operators to transform contexts
    into contexts
  • Operators express relevant parts of a source and
    the conditions using rules
  • Rules define a valid context transformation

4
On-line Dictionary: Webster
  • Autonomously maintained source, used to develop a novel
    thesaurus application
  • 120,000 entries, two million words
  • Semi-annual updates
  • Errors and inconsistencies help test robustness

5
Target Application
  • Construct a graph of the definitions to determine
    related terms, and automatically generate
    thesaurus entries

6
Related Work
  • Ontology composition (Wiederhold 1994)
  • Rule-based approach to semantic integration
    (Bergamaschi et al. 1999)
  • Semantic reconciliation (Siegel 1991)
  • Uschold et al. 1998
  • Specification morphisms (Smith 1993)
  • WordNet system (Miller et al. 1990)
  • WHIRL (Cohen 1998)
  • PageRank (Page & Brin 1998)
  • Latent semantic indexing (Deerwester 1990)
  • Hypertext authority (Kleinberg 1998)

7
Outline
  • Algebra Usage Scenario
  • Background
  • Context Creation
  • Ontology Maintenance
  • Future Work
  • Conclusion
  • My Evaluation

8
Typical Algebra Usage Scenario
  • A minimal sufficient set of linkages between
    items in different resources

9
Background
  • Algebraic Operators
  • Canonical unary operators establish and refine a context
    within which the source knowledge meets the
    application requirements

10
Background (Cont.)
  • Semantic Context
  • No global notion of consistency
  • Defined as objects that encapsulate other
    objects
  • Congruity: relevance of source information to the
    application
  • Similarity: equivalent and mergeable
    objects between different sources

11
Rule Language (Cont.)
  • Allow uninterpreted components of an object to
    become attributes of the object
  • Constructors: create new objects
  • Constructors: generate proxy objects
  • Editors, convertors: modify the objects

12
Object Model (Cont.)
  • Subsumes existing models
  • Only objects have an identity to which others can
    refer
  • Corresponds to XML supplemented with object identity
  • Rich enough to model complex relationships

13
Context Creation
  • Summarize Operator (S operator)
  • Transforms source data based on a predicate
  • Object creation: encapsulates and populates objects
  • Data classification: groups source data into
    equivalence classes
  • Syntax (given
    contexts c1, c2 and a matching rule e)

14
Context Creation (Cont.)
  • 1. Predicate e partitions the objects of c1 into n
    equivalence classes
  • 2. c2 consists of n+1 values: e, s1, s2, ..., sn
  • 3. One class is an exception class, holding the objects that do not match e
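
The three steps above can be sketched in a few lines of Python. This is only an illustration of partitioning with an exception class, not the authors' S operator: the function name summarize, the EXCEPTION label, the first_letter rule and the sample entries are all assumptions made up for the example.

```python
from collections import defaultdict

EXCEPTION = "exception"  # hypothetical label for objects that do not match e


def summarize(c1, e):
    """Partition the objects of context c1 using matching rule e.

    e maps an object to a class key, or returns None when the object
    does not match; unmatched objects land in the exception class.
    """
    parts = defaultdict(list)
    for obj in c1:
        key = e(obj)
        parts[EXCEPTION if key is None else key].append(obj)
    return dict(parts)


def first_letter(entry):
    # Hypothetical matching rule: classify an entry by its first letter.
    return entry[0].lower() if entry and entry[0].isalpha() else None


c1 = ["Egoism", "egotist", "Water Buffalo", "-ness", ""]
c2 = summarize(c1, first_letter)
```

Here c2 groups the two entries starting with "e" together, puts "Water Buffalo" in its own class, and collects the two entries that do not match the rule in the exception class.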

15
Example with Webster's Dictionary
  • Automatic Thesaurus Extraction from Dictionary

16
Example (Cont.)
  • Construct a directed graph from the definitions
  • 1. Each head word and definition grouping is a node
  • 2. Each word in a definition node is an arc to the node
    having that head word
  • Definition from the dictionary data for Egoism
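
A minimal sketch of the graph construction, assuming the definitions are already available as a mapping from head word to definition text. The function name build_graph and the toy definitions are invented for illustration; they are not Webster's text, and words that are not head words are simply skipped here.

```python
import re


def build_graph(definitions):
    """Return {head word: set of head words used in its definition}."""
    heads = {head.lower() for head in definitions}
    graph = {head: set() for head in heads}
    for head, text in definitions.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in heads:
                # An arc from this node to the node with that head word.
                graph[head.lower()].add(word)
    return graph


# Toy definitions, made up for the example.
toy = {
    "egoism": "excessive love of self; selfishness",
    "selfishness": "the quality of being selfish; egoism",
    "self": "one's own person",
}
graph = build_graph(toy)
```

In this toy graph, egoism has arcs to self and selfishness, selfishness has an arc back to egoism, and self has no outgoing arcs.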

17
Context Creation (Cont.)
  • Syllable and accent markers in head words
  • Misspelled head words
  • Mis-tagged fields
  • Stemming and irregular verbs (Hopelessness)
  • Common abbreviations in definitions (etc.)
  • Undefined words with common prefixes (un-)
  • Multi-word head words (Water Buffalo)
  • Undefined hyphenated and compound words
  • Set 99% accuracy target in the conversion from data to
    graph structure
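
Two of the problems listed above, markers in head words and undefined words with a common prefix, might be handled with rules like the following. This is a sketch under assumptions: the marker characters, the prefix list, and the names normalize_head and resolve are hypothetical and are not taken from the presentation.

```python
import re

MARKERS = re.compile(r"[*\"`]")  # assumed syllable/accent marker characters
PREFIXES = ("un",)               # assumed list of common prefixes


def normalize_head(raw):
    """Strip marker characters, lowercase, and collapse whitespace."""
    return " ".join(MARKERS.sub("", raw).lower().split())


def resolve(word, heads):
    """Map a definition word to a known head word, or None if unresolved."""
    word = word.lower()
    if word in heads:
        return word
    for prefix in PREFIXES:
        stem = word[len(prefix):]
        if word.startswith(prefix) and stem in heads:
            # Undefined prefixed word: fall back to the defined stem.
            return stem
    return None


heads = {normalize_head(h) for h in ['E"go*ism', "Water  Buffalo", "Kind"]}
```

With these rules, resolve("Unkind", heads) falls back to the head word kind, while a word such as "etc" stays unresolved and would need its own rule.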

18
Constructing the Congruity Expression
  • An object that represents the entire source
  • Subdivided into chunks, each with one head word
    and one definition
  • Express congruity relationship between the
    dictionary and thesaurus application

19
Ontology Maintenance
  • Context Refinement
  • Return the ten longest head words of the
    dictionary
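
The refinement query above is easy to state directly. The sketch below assumes the head words are available as a plain list of strings; the function name longest_head_words, the alphabetical tie-breaking and the sample data are choices made for the example.

```python
import heapq


def longest_head_words(heads, n=10):
    """Return the n longest head words, longest first, ties alphabetical."""
    return heapq.nsmallest(n, set(heads), key=lambda h: (-len(h), h))


sample = ["egoism", "water buffalo", "hopelessness", "self", "etc"]
top = longest_head_words(sample, n=3)
```

A query like this is also a cheap sanity check: unusually long head words are often the result of mis-tagged fields.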

20
Maintaining the Ontology
  • Changes in the source help correct and extend the dictionary
  • Maintain statistics with the S operator when
    extracting the relevant parts of the dictionary
  • Find and note which rules are no longer needed
  • A comparison of the terms reveals new errors
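
One plausible reading of the statistics idea is to count how often each rule fires during extraction, so that rules with a zero count after a source update become candidates for removal. The sketch below assumes rules are (name, predicate, fix) triples; apply_rules and both sample rules are hypothetical.

```python
from collections import Counter


def apply_rules(entries, rules):
    """Apply the first matching rule to each entry and count rule firings.

    rules is an ordered list of (name, predicate, fix) triples.
    """
    fired = Counter({name: 0 for name, _, _ in rules})
    cleaned = []
    for entry in entries:
        for name, matches, fix in rules:
            if matches(entry):
                entry = fix(entry)
                fired[name] += 1
                break
        cleaned.append(entry)
    return cleaned, fired


# Hypothetical rules: one general clean-up, one fix for a single misspelling.
rules = [
    ("strip_markers", lambda e: "*" in e, lambda e: e.replace("*", "")),
    ("fix_egoizm", lambda e: e == "egoizm", lambda e: "egoism"),
]
cleaned, fired = apply_rules(["ego*ism", "self"], rules)
unused = [name for name, count in fired.items() if count == 0]
```

In this run the misspelling rule never fires, so it shows up in unused; that would happen, for instance, once the source has corrected the error itself.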

21
Future Work
  • A web-based interface to display results of the ArcRank
    algorithm, which is based on PageRank
  • (http://www-db.stanford.edu/SKC)
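
The slides do not define ArcRank, so the sketch below shows only plain PageRank by power iteration on a small definition graph, as background for the algorithm ArcRank is said to build on. The damping factor, the iteration count and the toy graph are assumptions; the graph must be non-empty and every successor must itself be a key.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over {node: set of successor nodes}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing arcs is spread evenly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new[w] += damping * rank[v] / len(graph[v])
        rank = new
    return rank


# Toy definition graph, made up for the example.
graph = {
    "egoism": {"self", "selfishness"},
    "selfishness": {"egoism"},
    "self": set(),
}
rank = pagerank(graph)
```

The ranks sum to one, and in this toy graph egoism ends up ranked above self because selfishness passes all of its rank to it.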

22
Conclusion
  • An on-line dictionary is a good test-bed
  • An algebraic approach improves maintainability
  • Congruity simplifies identification and handling
    of changes
  • Use Summarize to define and refine a context that
    prepares the dictionary data for use by the
    thesaurus service

23
My Assessment
  • Strengths
  • Decouples the selection of congruent parts of
    the source data
  • Congruity and similarity measures use an algebra
    rather than a single language
  • Mirrors classes using operators of the algebra
    instead of low-level abstract primitives that are
    difficult to compose
  • Weaknesses
  • Details of ciS(ci) are needed
  • Difficult to grasp the capability of the S
    operator
  • Accuracy and error accumulation problems
  • Ambiguous rule generation

24
Questions?