Title: Ontology-driven Resolution of Semantic Heterogeneities in GDB Conceptual Schemas
1Ontology-driven Resolution of Semantic
Heterogeneities in GDB Conceptual Schemas
- Guillermo Nudelman Hess
- Prof. Dr. Cirano Iochpe
2Introduction
- Geographic Information Systems (GIS) popularization
- Geographic databases (GDB) project
- Complex
- Repeatable
3Architecture to integrate schemas
CSF: Canonic Syntactic File
SCSSF: Canonic Syntactic and Semantic File
4Issues on the semantic integration
- Heterogeneities to be handled (Visser, 1997)
- Naming (explanation)
- Synonym
- Homonym
- Acronym
- Structure
- Attributes (relation)
- Associations (relation)
- Taxonomy (categorization)
- Constructors (paradigm)
5Applying Ontology
- Search and Update algorithm
- Semi-automated
- Associated with similarity matching techniques (Cohen, 1998)
6The ontoGeo Ontology
- Geographic domain
- Basic cartography
- Hydrography
- Relief
- Vegetation
- Transport
- Locality
- Features
- Spatial representation
- Object
- Field
- Temporality
- Classes
- Attributes
- Relationships
- Something on network
7The ontoGeo Ontology
- Construction
- Protégé tool
- RDF/RDFS language
- Knowledge model extended → Synonyms
8Search and Update Algorithm
- Semi-automated
- Associated with similarity matching techniques (Cohen, 1998)
- Parameters
- Acceptance threshold
- Analysis Threshold
- Confidence ratio (delta value)
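One possible reading of how the two thresholds could drive a semi-automated decision is sketched below. The slides do not spell out the decision rule, so everything here is an assumption: the three outcomes, the names `Decision` and `classify_match`, and the idea that scores between the two thresholds are handed to the user. The confidence ratio (delta) is left out because its exact role is not stated. The default values are the ones reported in the Experiment slide.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"      # assumed: match taken automatically
    ASK_USER = "ask_user"  # assumed: match shown to the user for analysis
    REJECT = "reject"      # assumed: no usable match found


def classify_match(similarity: float,
                   acceptance: float = 0.75,
                   analysis: float = 0.4) -> Decision:
    """Map a similarity score to a decision (hypothetical rule, see lead-in)."""
    if similarity >= acceptance:
        return Decision.ACCEPT
    if similarity >= analysis:
        return Decision.ASK_USER
    return Decision.REJECT
```

Under these assumptions a score of 0.74 would still be routed to the user, since it falls just short of the 0.75 acceptance threshold.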
10Similarity Matching
Levenshtein

$$Sim(C_c, C_o) = W_N \cdot SimName(C_c, C_o) + W_A \cdot SimEst(C_c, C_o) + W_H \cdot SimHier(C_c, C_o) + W_R \cdot SimRel(C_c, C_o)$$

$W_N$, $W_A$, $W_H$ and $W_R$ are the similarity weights for each component.

$$SimEst(C_c, C_o) = \frac{\sum_{i=1}^{n} f(C_{c_i}, C_{o_i}) \times W_{at_i}}{N_{at}}$$

$$W_{at} = 1 - \frac{C_a}{C}$$

- $f(C_c, C_o)$: Levenshtein function applied over each attribute
- $W_{at}$: attribute weight
- $C_a$: number of classes having this attribute
- $C$: total number of classes
- $N_{at}$: number of attributes of the conceptual schema
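The name and attribute components can be sketched in Python as follows. This is an illustration, not the authors' implementation. The function names are hypothetical, the attribute weight is taken as Wat = 1 - Ca/C, and normalizing the Levenshtein distance by the length of the longer name is an assumption (it is consistent with the SimName value of 0,17 given for Lago and arroio in the example slides).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sim_name(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer name."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def attribute_weight(classes_with_attr: int, total_classes: int) -> float:
    """Wat = 1 - Ca/C: attributes shared by many classes weigh less."""
    return 1.0 - classes_with_attr / total_classes


def sim_est(pairs, n_attributes: int) -> float:
    """pairs: (f, Wat) for each compared attribute; n_attributes is Nat."""
    return sum(f * w for f, w in pairs) / n_attributes
```

For instance, `sim_est([(1.0, 0.5), (0.2, 0.75)], 3)` uses the attribute scores and weights that appear in the Lago and Lago example.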
11Similarity Matching
$$SimHier(C_c, C_o) = \frac{\sum Hier(C_c, P_c) \cdot Wt(c, p)}{NHier(C_c, P_c)}$$

$$Wt(c, p) = \frac{E}{E(p)} \cdot \frac{d(p) + 1}{d(p)} \cdot \left(IC(c) - IC(p)\right)$$

$$IC(c) = -\log\left(\left(\sum \frac{1}{sup(w)}\right) \cdot \frac{1}{N}\right)$$

- $Wt(c, p)$: weight of each IS-A association
- $E$: density (classes in the hierarchy)
- $E(p)$: parent node density (number of child nodes)
- $d(p)$: depth (in the hierarchy)
- $IC$: information content (amount of information)
- $sup(w)$: number of parent nodes
- $N$: number of nodes in the hierarchy

$$SimRel(C_c, C_o) = \frac{\sum Rel(C_c, C_o)}{Rel(C_c)}$$
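A small Python sketch of the hierarchy component follows, reading the edge weight as Wt(c,p) = (E / E(p)) · ((d(p) + 1) / d(p)) · (IC(c) - IC(p)). All function names and input shapes are hypothetical. The use of the natural logarithm and the set of nodes the IC sum runs over are assumptions, since the slides do not state them.

```python
import math


def information_content(parent_counts, n_nodes: int) -> float:
    """IC(c) = -log((sum of 1/sup(w)) * 1/N).

    parent_counts holds sup(w) for every node w in the sum (assumed input).
    """
    total = sum(1.0 / sup for sup in parent_counts)
    return -math.log(total / n_nodes)


def edge_weight(density: float, parent_density: float, parent_depth: int,
                ic_child: float, ic_parent: float) -> float:
    """Weight of one IS-A link between a class c and its parent p."""
    density_term = density / parent_density
    depth_term = (parent_depth + 1) / parent_depth
    return density_term * depth_term * (ic_child - ic_parent)


def sim_hier(links, n_hier: int) -> float:
    """links: (Hier, Wt) per IS-A association; n_hier is NHier."""
    return sum(hier * wt for hier, wt in links) / n_hier
```

With a single matched hierarchy of weight 0.9, `sim_hier([(1, 0.9)], 1)` gives the 0.9 used in the example slides.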
12Experiment
- Geographic ontology with about 170 classes
- Hydrology subset
- Parameters
- Acceptance threshold: 0.75 (75%)
- Analysis threshold: 0.4 (40%)
- Delta: 0.1 (10%)
13Experimental results
14Conclusions
- Ontology
- Schema unification
- Pattern storage.
- Algorithmic methodology to mediate schema integration
- Similarity Matching
- Balance of similarities
- Different schemas → Different weights.
15Future work
- Categorize the different types of conceptual schemas
- Try other similarity matching methodologies for each class of schema
- Balance of the WN, WA, WH and WR parameters, depending on the input conceptual schema
- Add spatial relationships to the algorithm
- Algorithm optimization.
- Ontology maintenance
16Thank you
- Guillermo Nudelman Hess
- hess_at_inf.ufrgs.br
?
17Example
18Example similarity matching
19Example similarity matching
- Lago (lake) and arroio (stream)
- SimName = 0.17
- SimAt = 0 (arroio does not have attributes)
- SimHier = (1 × 0.9)/1 = 0.9 (only 1 hierarchy)
- SimRel = 0 (no aggregation associations)
- WN = WA = WH = WR = 0.25
- 0.25(0.17) + 0.25(0) + 0.25(0) + 0.25(0.9) = 0.27
- WN = 0.4, WA = 0.3, WH = 0.3, WR = 0
- 0.4(0.17) + 0.3(0) + 0(0) + 0.3(0.9) = 0.34
20Example similarity matching
- Lago and Lago
- SimName = 1
- SimAt = (1(0.5) + 0.2(0.75))/3 = 0.22
- SimHier = (1 × 0.9)/1 = 0.9 (only 1 hierarchy)
- SimRel = 0 (no aggregation associations)
- WN = WA = WH = WR = 0.25
- 0.25(1) + 0.25(0.22) + 0.25(0) + 0.25(0.9) = 0.53
- WN = 0.4, WA = 0.3, WH = 0.3, WR = 0
- 0.4(1) + 0.3(0.22) + 0(0) + 0.3(0.9) = 0.74
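The arithmetic of the two examples can be re-checked with a few lines of Python. The component scores and weights are copied from the slides. The dictionary keys and the function name `combined_similarity` are illustrative choices, not part of the original work.

```python
def combined_similarity(sims: dict, weights: dict) -> float:
    """Weighted sum of the name, attribute, hierarchy and relationship scores."""
    return sum(weights[key] * sims[key] for key in sims)


# Component scores taken from the example slides.
lago_arroio = {"name": 0.17, "attr": 0.0, "hier": 0.9, "rel": 0.0}
lago_lago = {"name": 1.0, "attr": 0.22, "hier": 0.9, "rel": 0.0}

# The two weight settings compared in the slides.
equal_weights = {"name": 0.25, "attr": 0.25, "hier": 0.25, "rel": 0.25}
tuned_weights = {"name": 0.4, "attr": 0.3, "hier": 0.3, "rel": 0.0}

for label, sims in [("Lago/arroio", lago_arroio), ("Lago/Lago", lago_lago)]:
    equal = round(combined_similarity(sims, equal_weights), 2)
    tuned = round(combined_similarity(sims, tuned_weights), 2)
    print(f"{label}: equal weights = {equal}, tuned weights = {tuned}")
```

Rounded to two decimals, the script reproduces the four totals from the slides: 0.27 and 0.34 for Lago and arroio, 0.53 and 0.74 for Lago and Lago.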