Title: COOPERATION STRATEGIES FOR INFORMATION INTEGRATION
1COOPERATION STRATEGIES FOR INFORMATION INTEGRATION
Istituto di Informatica - Università degli studi di Ancona
- Maurizio Panti, Luca Spalazzi, Loris Penserini
- panti,spalazzi,pense_at_inform.unian.it
2Talk Overview
- Motivations and Goals
- Local strategies
- Cooperation strategies
- the choice of partners
- the choice of queries
- the choice of answers
- Discussion
3Motivation
- Information systems are collections of information sources and information consumers
- Distributed
- Heterogeneous
- at physical level
- at logical level
- at conceptual level (names and schemas)
- Dynamic
- changes of information sources or their schemas
- changes of information consumers or their needs
4Goal
- Rewriting a consumer's query into queries to specific information sources
- when we have
- a distributed, heterogeneous, and strongly dynamic information system.
5Related Work
- Usually, query rewriting and information integration systems adopt the
- Mediator Architecture
- TSIMMIS, Squirrel, WHIPS, Carnot, SIMS, Information Manifold, Infomaster
- dynamic sources: systems are overloaded with expensive updating operations
- dynamic consumers: systems do not perform user profiling.
6Information Source
- wrapper
- a description logic as data modelling and query language, e.g. C-Classic
- source query processing is based on rewriting queries using views over the local source (adapted from Beeri, Levy, Rousset)
7Mediator
- mediated schema
- a description logic as data modelling and query language, e.g. C-Classic
- query processing is based on rewriting queries using views over the distributed sources (adapted from Beeri, Levy, Rousset)
8Rewriting query using views
Rewriting: composition of the rewritings of the retrieved concepts
Retrieval: conjunction of the concepts that are maximally contained in the query
Query Q: ∀pub.(ai ? db) ? ∀pub.acm
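The retrieval and composition steps above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes atomic concept names and an explicit toy taxonomy in place of description-logic subsumption, and the names PARENTS, retrieve, and rewrite are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy taxonomy: each concept maps to its direct subsumers.
# A real system would decide containment by description-logic reasoning.
PARENTS: Dict[str, Set[str]] = {
    "pub.acm_tocl": {"pub.acm"},
    "pub.acm_tods": {"pub.acm"},
    "pub.acm": set(),
}


def ancestors(concept: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every concept that strictly subsumes `concept`."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def is_contained(sub: str, sup: str, parents: Dict[str, Set[str]]) -> bool:
    """True if `sub` is contained in (subsumed by) `sup`."""
    return sub == sup or sup in ancestors(sub, parents)


def retrieve(query_concept: str, schema: Set[str],
             parents: Dict[str, Set[str]]) -> List[str]:
    """Schema concepts contained in the query concept, keeping only maximal ones."""
    contained = {c for c in schema if is_contained(c, query_concept, parents)}
    return sorted(
        c for c in contained
        if not any(o != c and is_contained(c, o, parents) for o in contained)
    )


def rewrite(query: List[str], schema: Set[str],
            parents: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Compose the rewriting: each conjunct maps to views of its retrieved concepts."""
    return {
        conjunct: [f"view({c})" for c in retrieve(conjunct, schema, parents)]
        for conjunct in query
    }
```

With a schema containing pub.acm_tocl and pub.acm_tods, the conjunct pub.acm is rewritten into the views of those two concepts; a conjunct with no contained concept maps to an empty list.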
9Query Execution
10Local Failures
- In query rewriting: the mediator is not able to rewrite some or all of the components of the input query.
- In query execution: the mediator is not able to execute some or all of the components of the rewritten query.
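The two kinds of local failure can be told apart with a small check. This is a sketch under stated assumptions: a rewriting is represented as a mapping from each query component to a list of views, a component fails in execution when none of its views returns an answer, and the function names and the execute callback are hypothetical.

```python
from typing import Callable, Dict, List


def find_rewriting_failures(rewriting: Dict[str, List[str]]) -> List[str]:
    """Components of the input query for which no view was found."""
    return [component for component, views in rewriting.items() if not views]


def find_execution_failures(
    rewriting: Dict[str, List[str]],
    execute: Callable[[str], List[str]],
) -> List[str]:
    """Rewritten components whose views all return no answer.

    `execute` is an assumed callback that runs one view and returns its answers.
    """
    failed = []
    for component, views in rewriting.items():
        if views and not any(execute(view) for view in views):
            failed.append(component)
    return failed
```

A mediator could use the two lists to decide which components to pass on to a cooperation strategy.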
11Cooperation Strategies
12Cooperation with Mediators: Asking for Rewriting
Mediator M
Mediator N
13Cooperation with Mediators: Asking for Data
Mediator M
Mediator N
14Cooperation with Sources
Mediator M
Mediator N
15Strategy Comparison
m: number of mediators
s: number of sources
s: number of sources that cooperate with Ni
16Strategy Comparison
17Conclusion
- 1st scenario (s ms)
- mediators can be used for user profiling,
- mediators can be used to solve name heterogeneity and integrate data,
- in order to solve schema heterogeneity, the most efficient and effective strategy for a mediator is to cooperate directly with sources,
- in order to update its schemas, a lazy approach may not be appropriate for a mediator.
18Conclusion
- 2nd scenario (s >> ms)
- mediators can be used for user profiling,
- the most efficient strategy is cooperation with other mediators,
- cooperation with wrappers is useful only when mediators are not able to rewrite a given query,
- in order to update its schemas, a lazy approach is appropriate for a mediator.
20M's Mediated Schema
21Rewriting query using views
Composition of rewriting of retrieved concepts
22Rewriting query using views
23Local Failure in Query Rewriting
Query Q: ∀pub.(ai ? db) ? ∀affiliation.Stanford
24Local Failure in Query Execution
no answer
25Cooperation with Mediators: Asking for Rewriting
retrieved concepts: ∀pub.acm_tocl, ∀pub.acm_tods
rewrite: view(∀pub.acm_tocl), view(∀pub.acm_tods)
26Cooperation with Mediators: Asking for Rewriting
Mediator M
rewrite: view(∀pub.acm_tocl), view(∀pub.acm_tods)
27Cooperation with Mediators: Asking for Data
retrieved concepts: ∀pub.acm_tocl, ∀pub.acm_tods
rewrite: view(∀pub.acm_tocl), view(∀pub.acm_tods)
28Cooperation with Mediators: Asking for Data
rewrite: view(∀pub.acm_tocl), view(∀pub.acm_tods)
29Cooperation with Mediators: Asking for Data
rewrite: view(∀pub.acm_tocl), view(∀pub.acm_tods)
30Cooperation with Sources
retrieved: ∀pub.acm_ccs, ∀pub.acm_jacm, ∀pub.acm_tods
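The three cooperation strategies illustrated in the preceding slides can be contrasted in a short sketch. Everything here is an assumption made for illustration: the Source and Mediator classes, the representation of a mediated schema as a mapping from a query concept to (source, concept) pairs, the choice to update M's schema only when M obtains a rewriting, and the use of a name-prefix test as a stand-in for concept containment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Rewriting = List[Tuple[str, str]]  # (source name, source concept) pairs


@dataclass
class Source:
    name: str
    data: Dict[str, List[str]]  # exported concept -> instances

    def answer(self, concept: str) -> List[str]:
        return list(self.data.get(concept, []))


@dataclass
class Mediator:
    name: str
    schema: Dict[str, Rewriting] = field(default_factory=dict)

    def rewrite(self, concept: str) -> Rewriting:
        return list(self.schema.get(concept, []))

    def execute(self, rewriting: Rewriting, sources: Dict[str, Source]) -> List[str]:
        answers: List[str] = []
        for source_name, source_concept in rewriting:
            answers.extend(sources[source_name].answer(source_concept))
        return answers


def ask_for_rewriting(m: Mediator, n: Mediator, concept: str,
                      sources: Dict[str, Source]) -> List[str]:
    """M obtains the rewriting from N, stores it, and executes it itself."""
    rewriting = n.rewrite(concept)
    if rewriting:
        m.schema[concept] = rewriting  # assumed: M's mediated schema is updated
    return m.execute(rewriting, sources)


def ask_for_data(n: Mediator, concept: str,
                 sources: Dict[str, Source]) -> List[str]:
    """N rewrites and executes the query; M only receives the data."""
    return n.execute(n.rewrite(concept), sources)


def ask_sources(m: Mediator, concept: str,
                sources: Dict[str, Source]) -> List[str]:
    """M asks every source directly for concepts contained in the query concept.

    Containment is approximated by a name-prefix test (toy assumption).
    """
    rewriting = [
        (source.name, exported)
        for source in sources.values()
        for exported in source.data
        if exported.startswith(concept)
    ]
    if rewriting:
        m.schema[concept] = rewriting
    return m.execute(rewriting, sources)
```

In this sketch, asking for data leaves M's schema unchanged, while asking for a rewriting or asking the sources directly lets M answer the same query locally the next time.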
31Redundancy
Cn(M): mediated schema of M after n interactions with N
C(N): mediated schema of N
32Recall
Cn(M): mediated schema of M after n interactions with S1, ..., Sn
n: information need of a consumer (a view of S1, ..., Sn)
33Precision
Cn(M): mediated schema of M after n interactions with S1, ..., Sn
n: information need of a consumer (a view of S1, ..., Sn)
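The formulas for recall and precision did not survive in this transcript. As a rough illustration only, the sketch below applies the usual set-based definitions from information retrieval to sets of concept names; whether the slides use exactly these definitions is an assumption, and the function and parameter names are hypothetical.

```python
from typing import Set


def recall(schema: Set[str], need: Set[str]) -> float:
    """Fraction of the consumer's information need covered by the mediated schema.

    Returns 0.0 when the information need is empty (a convention chosen here).
    """
    if not need:
        return 0.0
    return len(schema & need) / len(need)


def precision(schema: Set[str], need: Set[str]) -> float:
    """Fraction of the mediated schema that is relevant to the information need.

    Returns 0.0 when the schema is empty (a convention chosen here).
    """
    if not schema:
        return 0.0
    return len(schema & need) / len(schema)
```

Under these assumed definitions, a schema that grows through cooperation can raise recall while lowering precision if it also acquires concepts the consumer does not need.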
34N's Mediated Schema
35M's Mediated Schema (updated)
36M's Mediated Schema (updated)
37M's Mediated Schema (updated)
38Cooperation with Mediators
Theorem.
Cn(M): mediated schema of M after n interactions with N
C(N): mediated schema of N
redundancy
39Cooperation with Mediators
Theorem.
Cn(M): mediated schema of M after n interactions with N
n: information need of a consumer (a view of S1, ..., Sn)
recall
precision
40Cooperation with Sources
Theorem.
Cn(M): mediated schema of M after n interactions with S1, ..., Sn
n: information need of a consumer (a view of S1, ..., Sn)
recall
precision