Title: FCA-MERGE: Bottom-up Merging of Ontologies
1. FCA-MERGE: Bottom-up Merging of Ontologies
- Gerd Stumme, Alexander Maedche
- Presenter: Yihong Ding
2. FCA-Merge method
3. The Framework
(Figure: framework diagram; surviving labels: "uses", "dictionaries/natural language texts", "propose new concepts/relations")
4. FCA-Merge
- Instance extraction (linguistic analysis based) and context generation
- FCA-Merge core algorithm that generates the pruned concept lattice
- Generating the new ontology from the concept lattice
5. Framework
(Figure: framework diagram; surviving labels: "uses", "dictionaries/natural language texts", "propose new concepts/relations")
6. Information Extraction Engine (SMES)
- Linguistic Knowledge Pool
- Lexical database
- 700,000 word forms
- Named entity lexica
- Compound tagging rules
- Finite State Grammars
- Text Chart
- Conceptual System (Ontology): domain-specific semantic knowledge
- Domain Lexicon: domain-specific mapping of words to the conceptual system
Shallow Text Processing (Word Level, Sentence Level)
- Tokenizer
- Lexical Processor
- POS-Tagger
- Named Entity Finder
- Phrase Recognizer
- Clause Recognizer
7. Linguistic Analysis and Context Generation
8. Three Assumptions
- Documents have to be relevant.
- Documents have to cover all concepts.
- Documents have to separate the concepts well
enough.
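The context generation step can be sketched as follows. This is only an illustration under stated assumptions: each ontology is taken to come with a lexicon mapping concept names to trigger words, plain word matching stands in for the linguistic analysis done by SMES, and the merged context is assumed to be built over the same documents with the two concept sets kept apart by an ontology tag. The documents, lexica, and the name `build_context` are hypothetical.

```python
def build_context(documents, lexicon):
    """Return a formal context as {doc_id: set of concepts found in the doc}.

    documents: {doc_id: text}; lexicon: {concept: set of trigger words}.
    Both are toy stand-ins (assumptions), not data from the talk.
    """
    context = {}
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        context[doc_id] = {
            concept for concept, triggers in lexicon.items()
            if words & triggers  # at least one trigger word occurs
        }
    return context


documents = {
    "d1": "the hotel offers a guided hiking tour",
    "d2": "book a room in our hotel",
    "d3": "concert tickets for the music festival",
}
lexicon_1 = {"Hotel": {"hotel"}, "Event": {"concert", "festival"}}
lexicon_2 = {"Accommodation": {"hotel", "room"}, "Concert": {"concert"}}

context_1 = build_context(documents, lexicon_1)
context_2 = build_context(documents, lexicon_2)

# Assumed merge step: same documents, concepts tagged by source ontology.
merged = {
    d: {("O1", c) for c in context_1[d]} | {("O2", c) for c in context_2[d]}
    for d in documents
}
```

The three assumptions show up directly in such a table: irrelevant documents produce empty rows, uncovered concepts produce empty columns, and poorly separated concepts produce identical columns.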
9. FCA-Merge
- Instance extraction (linguistic analysis based) and context generation
- FCA-Merge core algorithm that generates the pruned concept lattice
- Generating the new ontology from the concept lattice
10. Framework
(Figure: framework diagram; surviving labels: "uses", "propose new concepts/relations", "references", "Text Processing Server", "Domain lexicon", "Lexical DB")
11. Formal Concept Analysis
- Arose in the 1980s in Darmstadt as a mathematical theory
- Formalizes the concept of "concept"
- Used for deriving conceptual hierarchies from data tables
- Provides a visualization of the hierarchies by line diagrams
- Used here as a method for conceptual clustering
12. A formal context about National Parks in California
13. Intent B
- Def.: A formal concept is a pair (A, B), where
- A is a set of objects (the extent of the concept),
- B is a set of attributes (the intent of the concept),
- A × B is a maximal rectangle in the binary relation.
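The definition can be checked mechanically. The sketch below uses the standard equivalent formulation: A × B is a maximal rectangle exactly when B is the set of attributes shared by all objects in A and A is the set of objects having all attributes in B. The toy context and the function names are hypothetical and are not the National Parks data from the slides.

```python
def common_attributes(context, objects):
    """Attributes shared by all given objects (the derivation A')."""
    result = set().union(*context.values())
    for obj in objects:
        result &= context[obj]
    return result


def common_objects(context, attributes):
    """Objects that have all given attributes (the derivation B')."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def is_formal_concept(context, extent, intent):
    """(A, B) is a formal concept iff A' == B and B' == A."""
    return (common_attributes(context, extent) == intent
            and common_objects(context, intent) == extent)


# Hypothetical toy context: object -> set of attributes.
context = {
    "park1": {"hiking", "camping"},
    "park2": {"hiking", "camping", "swimming"},
    "park3": {"hiking"},
}

maximal = is_formal_concept(
    context, {"park1", "park2"}, {"hiking", "camping"})
not_maximal = is_formal_concept(
    context, {"park1"}, {"hiking", "camping"})
```

Here `maximal` is true, while `not_maximal` is false because the rectangle can still be extended by `park2`.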
(Figure: National Parks in California; label: Extent A)
14. The blue concept is a subconcept of the yellow
one, since its extent is contained in the extent of
the yellow one.
National Parks in California
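The subconcept relation reduces to a subset test on extents, as in the short sketch below. The two concepts named `blue` and `yellow` are hypothetical examples chosen for illustration; they are not the concepts highlighted in the figure.

```python
def is_subconcept(concept_1, concept_2):
    """(A1, B1) <= (A2, B2) iff extent A1 is contained in extent A2."""
    extent_1, _ = concept_1
    extent_2, _ = concept_2
    return extent_1 <= extent_2


# Concepts as (extent, intent) pairs; contents are made up.
blue = (frozenset({"park2"}),
        frozenset({"hiking", "camping", "swimming"}))
yellow = (frozenset({"park1", "park2"}),
          frozenset({"hiking", "camping"}))

blue_below_yellow = is_subconcept(blue, yellow)
```

For formal concepts, the intents are ordered the other way round: the intent of the more general concept is contained in the intent of the more specific one.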
15. Generating the Pruned Concept Lattice
The ontology concepts are clustered by the
algorithm TITANIC.
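To make the clustering step concrete, the sketch below enumerates all formal concepts of a small merged context by brute force. It is not TITANIC and does no pruning: it closes every subset of attributes, which is exponential and only suitable for toy inputs. The context, the attribute naming scheme ("O1:...", "O2:..."), and the function names are assumptions made for illustration.

```python
from itertools import combinations


def extent_of(context, attributes):
    """Documents that contain all given ontology concepts."""
    return frozenset(d for d, attrs in context.items() if attributes <= attrs)


def intent_of(context, objects):
    """Ontology concepts shared by all given documents."""
    attrs = set().union(*context.values())
    for d in objects:
        attrs &= context[d]
    return frozenset(attrs)


def all_concepts(context):
    """Naive enumeration: close every attribute subset (illustration only)."""
    attributes = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(attributes) + 1):
        for subset in combinations(attributes, size):
            extent = extent_of(context, set(subset))
            concepts.add((extent, intent_of(context, extent)))
    return concepts


# Hypothetical merged context: document -> concepts from both ontologies.
context = {
    "d1": {"O1:Hotel", "O2:Accommodation"},
    "d2": {"O1:Hotel", "O2:Accommodation"},
    "d3": {"O1:Event", "O2:Concert"},
}
concepts = all_concepts(context)
```

In this toy context the enumeration yields four formal concepts: the top concept with all documents, one cluster for the hotel documents, one for the event document, and the bottom concept with an empty extent.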
16. FCA-Merge
- Instance extraction (linguistic analysis based) and context generation
- FCA-Merge core algorithm that generates the pruned concept lattice
- Generating the new ontology from the concept lattice
17. Framework
(Figure: framework diagram; surviving labels: "uses", "propose new concepts/relations", "models", "references", "Text Processing Server", "Domain lexicon", "Lexical DB")
18. Generating the new Ontology from the Concept Lattice
Concepts from the same ontology may also be
merged.
Concepts that alone generate a formal concept
are taken over into the new ontology.
Formal concepts without attributes give rise to
new concepts or relations (or subsumptions).
Concepts generating the same formal concept are
suggested to be merged.
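The merge and take-over rules above can be sketched by grouping ontology concepts by the set of documents in which they occur, since two concepts generate the same formal concept exactly when their extents coincide. This is a simplified illustration: it yields suggestions only, ignores the proposal of new concepts and relations, and leaves the final decision to the knowledge engineer. The context and the name `suggest_merges` are hypothetical.

```python
from collections import defaultdict


def suggest_merges(context):
    """Group ontology concepts by extent; return (merge groups, kept concepts)."""
    extents = defaultdict(set)
    for doc, concepts in context.items():
        for concept in concepts:
            extents[concept].add(doc)

    groups = defaultdict(list)
    for concept, extent in extents.items():
        groups[frozenset(extent)].append(concept)

    # Several concepts with one extent: suggested to be merged.
    merge = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    # A concept alone in its group: taken over into the new ontology.
    keep = sorted(g[0] for g in groups.values() if len(g) == 1)
    return merge, keep


# Hypothetical merged context: document -> concepts from both ontologies.
context = {
    "d1": {"O1:Hotel", "O2:Accommodation"},
    "d2": {"O1:Hotel", "O2:Accommodation"},
    "d3": {"O1:Event", "O2:Concert"},
    "d4": {"O1:Event"},
}
merge, keep = suggest_merges(context)
```

Because the grouping looks only at extents, two concepts from the same source ontology can end up in one merge group, which matches the remark that concepts from the same ontology may also be merged.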
19. Ontology Environment OntoMat
20. FCA-Merge (Summary)
- Appearance of concepts in documents is discovered.
- The concepts are clustered.
- Concepts generating the same cluster are suggested to be merged.
21. System Summary
- FCA-Merge approach is extensional, i.e., it is based on objects which appear in both ontologies.
- Concepts having the same extent are supposed to be merged.
- The idea of FCA-Merge is to create, based on the source ontologies, a concept hierarchy (the concept lattice) containing the original concepts.
- Ontology concepts having the same extent are identified in the concept lattice.
- The knowledge engineer can then create the target ontology interactively, based on the insights gained from the concept lattice.
22. Assessment
- Smart, clean, beautiful, learning-based approach
- Instance-level matching
- Can only handle 1:1 mappings
- But it is possible to extend to 1:n and n:m
- Works for taxonomic relations
- Not sure for non-taxonomic relations
- Requires well-covered, well-separated, and relevant document sets
- Derives the merged ontology manually, relying heavily on domain experts' background knowledge