Title: Ostensive Automatic Schema Mapping for Taxonomy-based Peer-to-Peer Systems
1. Ostensive Automatic Schema Mapping for Taxonomy-based Peer-to-Peer Systems
- Yannis Tzitzikas and Carlo Meghini
- Istituto di Scienza e Tecnologie dell'Informazione (ISTI), Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy
2. Outline of the presentation
- Context and Problem
- One solution: Ostensive Mapping
  - Basic idea
  - Formal Framework
  - The Method/Protocol
  - Application for Taxonomy-based Sources
- Ostensive Articulation and P2P Systems
- Concluding Remarks and Further Research
3. Context: P2P systems that support semantic retrieval services
[Tzitzikas et al., ER 2003; Tzitzikas et al., CoopIS 2003]
4. Integrated Global Access Requires Semantic Articulation
Who is going to define these mappings?
- Semantic articulation is laborious
- In the P2P paradigm it is even harder, as:
  - No central control or authority
  - Large number of participants
  - The membership of the nodes is dynamic and unpredictable
- The fully heterogeneous conceptual models make uniform global access extremely challenging.
5. Approaches for Constructing Mappings: State of the Art
- Model-driven approaches
  - Based on the similarity of names or structure
  - Exploit dictionaries and user input
- Data-driven approaches
  - Require two databases with common objects
  - They can be automated
- Current state
  - Applicable if the objects are texts
  - They are used to integrate two entire conceptual models
  - They cannot produce mappings between queries.
- The game requires at least 3 players
6. Two autonomous agents
"We come in peace"
"Vettä, kiitos" (Finnish: "Water, please")
No communication is possible
7. The Method
The ostensive method
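8. Formal Framework
We view each source as a function S : Q → P(Obj), where Q is the set of queries and Obj the set of stored objects.
Let QN be the set of names (QN ⊆ Q).
Ideally, every set of objects would have a name whose answer is exactly that set.
However, this is not always possible, as S is not always an onto function (A ≠ P(Obj), where A is the set of answers of S).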
9. Approximate Naming Functions
Let (Q, ≤) be the ordering over queries: q ≤ q' if S(q) ⊆ S(q').
We write q ~ q' if q ≤ q' and q' ≤ q.
≤ is a partial order over Q/~ (the set of equivalence classes induced by ~ over Q).
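The ordering over queries and the reason exact names may not exist can be illustrated with a small sketch. The source below is hypothetical: its query strings and object ids are made up for illustration, and a source is modelled simply as a table from queries to answers.

```python
# Hypothetical source S: each query maps to its answer, a set of object ids.
S = {
    "tomatoes": frozenset({1, 2}),
    "apples": frozenset({3, 4}),
    "fruits": frozenset({3, 4, 5}),
    "apples or fruits": frozenset({3, 4, 5}),
    "foods": frozenset({1, 2, 3, 4, 5}),
}

def leq(q1, q2):
    """q1 <= q2 holds when the answer of q1 is contained in the answer of q2."""
    return S[q1] <= S[q2]

def equivalent(q1, q2):
    """Two queries are equivalent when each one is <= the other."""
    return leq(q1, q2) and leq(q2, q1)

def equivalence_classes():
    """Group the queries that return exactly the same answer."""
    classes = {}
    for query, answer in S.items():
        classes.setdefault(answer, []).append(query)
    return sorted(sorted(group) for group in classes.values())

def has_exact_name(objects):
    """True if some query answers exactly this set of objects."""
    return any(answer == frozenset(objects) for answer in S.values())
```

In this toy source, "fruits" and "apples or fruits" fall into the same equivalence class, while the set {1, 3} has no exact name because no query returns it. That is the situation in which approximate names are needed.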
10. The Method
Consider two sources Si and Sj.
Assume that they store a non-empty set C of common objects.
Objective: find the relationships that are extensionally valid in C.
What kind of mappings do we want?
11. The Method (II)
The naming functions are all that we need.
Suppose that Si wants to articulate a query qi of Qi.
qi should be articulated as follows:
12. The Protocol for articulating a term or query qi
[Figure: protocol diagram, labelled a12]
=> Only two messages have to be exchanged
A source can run this protocol for one, several or all of its terms (or queries)
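A two-message exchange of this kind could look roughly like the sketch below. It is one plausible reading, not the protocol as specified by the authors. The `Peer` class, the message fields, and the choice to send the asking peer's object ids along with the answer are all assumptions. For brevity, names are restricted to single terms.

```python
class Peer:
    """Hypothetical peer: maps each term to the object ids it retrieves."""

    def __init__(self, model):
        self.model = {term: set(objs) for term, objs in model.items()}
        self.objects = set().union(*self.model.values())

    def ask(self, term):
        """Message 1: show the answer of the term, plus the sender's object ids."""
        return {"answer": set(self.model[term]), "domain": set(self.objects)}

    def reply(self, msg):
        """Message 2: name the shown objects using this peer's own terms."""
        common = self.objects & msg["domain"]   # common domain, found at run time
        shown = msg["answer"] & common
        local = {t: objs & common for t, objs in self.model.items()}
        # Lower names: terms whose common-domain extension lies inside the shown set.
        lower = sorted(t for t, objs in local.items() if objs and objs <= shown)
        # Upper names: terms whose common-domain extension covers the shown set.
        upper = sorted(t for t, objs in local.items() if shown and shown <= objs)
        return {"lower": lower, "upper": upper, "common": common}


# Toy data, made up for illustration.
peer_1 = Peer({"vegetables": {1, 2, 6}, "produce": {1, 2, 3, 4, 6}})
peer_2 = Peer({"foods": {1, 2, 3, 4, 5}, "fruits": {3, 4, 5},
               "tomatoes": {1, 2}, "apples": {3, 4}})

answer = peer_2.reply(peer_1.ask("vegetables"))
```

Here `peer_1` learns that, on the objects both peers know, its term "vegetables" contains "tomatoes" and is contained in both "foods" and "tomatoes". The common domain {1, 2, 3, 4} is a by-product of the exchange.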
13. Application to Taxonomy-based Sources
- We view a taxonomy-based source S as a quadruple S = (T, ⪯, I, Q) where
  - T is a finite set of names called terms, e.g. tomatoes, apples
  - ⪯ is a reflexive and transitive binary relation over T called subsumption, e.g. apples ⪯ foods
  - I is a function I : T → P(Obj) called interpretation, where Obj is a finite set of objects, e.g. I(tomatoes) = {1, 2}
  - Q is the set of queries defined by the grammar q ::= t | q ∧ q' | q ∨ q' | ¬q | (q), where t is a term in T.
14. Query Answering in Taxonomy-based Sources
The answer S(q) of a query q is defined as
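A minimal sketch of query answering, assuming the usual set-theoretic reading: a term returns the objects indexed under it or under any narrower term, and conjunction, disjunction and negation become intersection, union and complement with respect to Obj. This reading, the tuple encoding of queries and the toy data are assumptions made for illustration. Only I(tomatoes) = {1, 2} and apples being subsumed by foods come from the slides.

```python
# Hypothetical taxonomy: each term lists its direct broader terms.
PARENTS = {"apples": {"fruits"}, "fruits": {"foods"}, "tomatoes": {"foods"}}

# Hypothetical interpretation I: objects indexed directly under each term.
I = {"tomatoes": {1, 2}, "apples": {3, 4}, "fruits": {5},
     "foods": set(), "red": {1, 3}, "green": {2, 4}}

OBJ = set().union(*I.values())

def narrower(term):
    """All terms subsumed by `term`, including `term` itself."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parents in PARENTS.items():
            if child not in result and parents & result:
                result.add(child)
                changed = True
    return result

def model(term):
    """Objects indexed under `term` or under any narrower term."""
    return set().union(*(I.get(t, set()) for t in narrower(term)))

def answer(q):
    """Evaluate a query: a term, or ("and", q1, q2), ("or", q1, q2), ("not", q1)."""
    if isinstance(q, str):
        return model(q)
    op = q[0]
    if op == "and":
        return answer(q[1]) & answer(q[2])
    if op == "or":
        return answer(q[1]) | answer(q[2])
    if op == "not":
        return OBJ - answer(q[1])
    raise ValueError(f"unknown operator: {op!r}")
```

With this data, `answer(("and", "fruits", "red"))` evaluates the conjunction by intersecting the objects under fruits with those under red.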
15. Naming Functions for Taxonomy-based Sources
QN: all queries that do not contain negation
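One way approximate names could be computed over negation-free queries is sketched below. It assumes that the lower name of a set R is a query with the largest answer contained in R, and the upper name a query with the smallest answer containing R. Those definitions, the helper names and the toy data are assumptions, not taken from the slides. The sketch also assumes every object is indexed under at least one term.

```python
# Hypothetical model: for each term, the objects under it or under a narrower term.
MODEL = {
    "foods": {1, 2, 3, 4, 5},
    "fruits": {3, 4, 5},
    "apples": {3, 4},
    "tomatoes": {1, 2},
    "red": {1, 3},
    "green": {2, 4},
}

def description(obj):
    """Conjunction (a frozenset of terms) of every term whose model contains obj."""
    return frozenset(t for t, objs in MODEL.items() if obj in objs)

def answer_of(conjunction):
    """Answer of a conjunction of terms: the intersection of their models."""
    return set.intersection(*(MODEL[t] for t in conjunction))

def upper_name(objects):
    """Disjunction of the descriptions of all given objects."""
    return {description(o) for o in objects}

def lower_name(objects):
    """Keep only the descriptions whose answer stays inside the given set."""
    return {d for d in upper_name(objects) if answer_of(d) <= set(objects)}

def answer_of_name(name):
    """Answer of a disjunction of conjunctions: the union of their answers."""
    return set().union(*(answer_of(c) for c in name))
```

For R = {3, 5}, the lower name answers {3} and the upper name answers {3, 4, 5}, so R is bracketed from below and above. For R = {1, 2} both names answer exactly {1, 2}, although the resulting query is not the shortest equivalent one.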
16. Examples of Upper and Lower names
[Figure: example taxonomy over the terms foods, fruits, apples, tomatoes, green and red, indexing objects 1 to 5]
17. Example
18. Ostensive Mapping and P2P
What can we do if the domains of two peers are disjoint?
=> Reference Collections
- Before joining the network, each peer has to index a small set of objects X
- It can then run the articulation protocol on this reference collection X.
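The reference-collection idea can be sketched as follows: two peers whose own collections are disjoint both index the same small set X, and relationships between their terms are read off from how they classified X. The collection, the peers' terms and the relationship labels are hypothetical. Single terms stand in for arbitrary queries.

```python
# Hypothetical reference collection X, indexed by both peers before joining.
X = {"x1", "x2", "x3", "x4"}

# How each peer indexed X under its own terms (made-up data).
peer_a = {"vegetables": {"x1", "x2"}, "produce": {"x1", "x2", "x3"}}
peer_b = {"tomatoes": {"x1"}, "foods": {"x1", "x2", "x3", "x4"}}

def relationships(src, dst):
    """Triples (s, relation, d) that hold extensionally on the reference collection."""
    found = []
    for s, s_objs in sorted(src.items()):
        for d, d_objs in sorted(dst.items()):
            if s_objs == d_objs:
                found.append((s, "equivalent", d))
            elif s_objs < d_objs:
                found.append((s, "narrower", d))
            elif s_objs > d_objs:
                found.append((s, "broader", d))
    return found

mappings = relationships(peer_a, peer_b)
```

The relationships found this way are only as reliable as X is representative, so a small reference collection can suggest mappings that a larger common domain would refute.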
19. Ostensive Mapping: Concluding Remarks
- It can be used to articulate both simple terms and queries
- It can be used to articulate only the desired terms/queries of a conceptual model (CM)
- It can be implemented efficiently by a communication protocol
- The common domain of the peers is discovered during the run of the protocol
- It is independent of the nature of the objects (e.g. the objects may be images, audio, videos)
20. Further Research
- Naming functions for other kinds of sources
  - relational databases
  - semi-structured databases
  - Description Logics-based information sources
  - Information Retrieval Systems
- Optimal algorithms for full articulation