Title: Multidatabase and Distributed Manipulations Part 1
1Multidatabase and Distributed ManipulationsPart 1
2Plan
- Introduction
- Technical problems
- Origin of the concept
- Approach Centralized DB (ANSI-SPARC)
- Approach DDB (top-down)
- Approach Global Schema (bottom-up)
- Reference architectures
- Multidatabase architecture
- Federated architecture
- Autonomy, semantic heterogeneity , common model
3Multidatabase Model
- Single database model ANSI-SPARC
- The real universe is modeled as one DB
- The real universe is modeled as multiple dbs
- Autonomous
- Semantically heterogeneous
- Manipulated through a multibase language
- "Multidatabase interoperability". Litwin, W.
Abdellatif, A. Multidatabase systems An advanced
solution for global information sharing. Hurson,
A., R., Bright, M., W., Pakzad, S., H., (Ed.).
IEEE press, 1993
4Multidatabase Model
Autres BDs sur Internet
Rest.
Cours étudiants
Mes-amis
Bibliothèque
Privé
Paris 9
Employés
Folio
Cine
Teletel
5Problems
- Reference architecture
- Semantic heterogeneity in presence of local
autonomy - Common data model
- Fonctions of MDB langage
- Transactions
- Protocols standards
- Performance
6Reference Architectures
- Multidatabase architecture
- Generalization of the ANSI-SPARC architecture
- Federated dbs architecture
- Generalization of the federated DB architecture
- Others
7ANSI-SPARC Architecture Centralized Integrated
BD (1960-70)
ES - External Schema
CS - Conceptual Schema
PS - Physical or Internal Schema
8Distributed DB
- Origin of the concept (years 1970)
- WAN development ( 20 kb/s)
- Overload of centralized dbs
9Ddb
- Idea distribution of functions other than local
communication ("top-down" approach) - Which ones ?
- Distr. Execution (OS)
- File access
- The DB
- Then, what data model for CS ?
- Network (codasyl) ?
- Relational
10Relation Fragmentation
Hotels (H, Ville, Cat, Chambres)
1 Fragment
(H, Ville)
(H, Cat, Chambres)
11Problems
- GS scalability
- GS utility for a local user
- Query performance (bad case)
- Data migration from ldbs
- IMS, IDMS, Socrate
- Local applications
12Problemes With the "Bottom-up" Approach
ES
ES
- Creation of GS
- Semantic heterogeneity
- Time for the integration / local restructuring
autonomy - In other words scalability of GS
- Updates
- Performance
- Heterogeneos views
GS
CS
CS
CS
PS
PS
PS
GS Approach ("bottom-up")
13MDB Architecture Absence
of (GS)
ES
ES
- User may have data in multiple dabs ANSI-SPARC
compatibles - One may face multiple css
- In general, it will be impossible to create a
GS
GS
CS
CS
CS
PS
PS
PS
GS Approach ("bottom-up")
14MBD Architecture Concept of the MDB Language
- Language for definition and manipulation of
collections of dbs (multidatabases) - Definition of MDB ESs
- Perhaps GSs
- Definition of MDB dependancies
- Semantics, integrity, security, manipulation...
- Formulation of MDB queries (explicitly)
- Referencing DB names
- With MDB joins...
- Find in DB michelin and in DB gaumont all
restaurants '' and cinemas at the same street
15MBD Architecture Multibases
- A multidatabase (MDB) is a collection of dbs
with MDB language - E.G. MSQL
- A collection of dbs without MDB langage is not
an MDB, but just a collection of dbs - As a collection of flat files (tables) without a
DB language , SQL for example, is not a DB
16Potential Multibases
Other DBs sur Internet
Rest.
Cours étudiants
Mes-amis
Bibliothèque
Privé
Paris 9
Employés
Folio
Cine
Teletel
17MDB Architecture Concept of Internal Logical
Sublayer
- The legacy data models can be heterogeneous
- Different SQL dialects
- Relational, hierarchical, network
- OO and object-relational
- It is preferable to have one model at MDB
layer - One needs a sub-layer for translations
- Also local DBA may wish not to show some of the
data at MDB layer - Solution ILS - internal logical schema
- Unknown of ANSI-SPARC
- Called also gateway
18Multibase Architecture Result of All This
(W. Litwin Al, Années 1980)
Req. MDB
ES multibase
19Federated Architecture (Hambiger Mcleod,
Années 1980)
- Every DB should be autonomous
- In general, there will be no GS
- Global integration is against the autonomy
- The dbs used in common should form a federation
of autonomous dbs - Every DB in a federation should be provided
with three schemes - ES export schema
- IS import schema
- PS private schema for all the private data,
of ES and of IS included - There should be some federation dictionary (FD)
20Federated Architecture (Hambiger Mcleod,
Années 1980)
21Comparison
- MDB architecture focuses on the concept of MDB
language - Federated architecture focuses on the concept
of autonomy - No MDB language
- But there is the notion of autonomy also in MDB
arch. - MDB architecture is decentralized
- No equivalent of FD
- Several DSs
- Both architectures are popular
22Comparison MDB lt-gt Fed.
23DB Autonomy(local autonomy)
- Capability to control local DB data by the DBA
- Naming
- Value Types
- Data Structures
- Physical Structures
- Query Execution
- Security
- Priority to local queries
24Multidatabase Autonomy
- Capability to controle multiple DBs by a DBA
- Same aspects as for the local autonomy
- Naming...
- May create a conflict with a DB autonomy
- Priority to local autonomy
B1
B2
B3
25Semantic Heterogeneity
- Differences in representations of the same real
properties - Names André ??Andrew
- Value Types
- Representations
- units of measure cm/s ??pied/h
- precision 1 g ?? 1 Kg
- Data Structures
- One table in 2 NF ??several tables in 3 NF
26Solutions (partial)
- Schemas descriptive
- Protocols descriptive
- Data Dictionaries
- Thesaurus
- Automatic Representation Conversions
- Automatic Unit Conversion
- Implicit Joints
- Higher level models and manip. languages
- IDL (Krishnamourthy, Litwin, Kent, ACM-Sigmod 92
27Common Models
- Ext. Relational
- EDA-SQL
- MSQL (research)
- ODBC Microsoft SQL
- Object-Relational
- SQL 3
- CCS language for inf. retr. DBs
- Numerous gateways towards SQL
- IMS ??SQL
- Codasyl ??SQL
28UniSQL/M
UniSQL/M
IMS
UniSQL
Sybase
Oracle
29Other gateways
UniSQL/M
IMS
UniSQL
Sybase
Oracle
30Yet other gateways
UniSQL/M
IMS
UniSQL
Sybase
Oracle
EDA-SQL
31The future (ODBC, JDBC, UDA)
ODBC x
32Conclusion
- Modern DBMSs are in general MDBMSs
- UniSQL/M, Oracle, Sybase, MsAccess, SQL Server,
InterBase, DB2... - MDB access requires new functions for the
management of - autonomy
- semantic heterogeneity
- physical distribution of data
33Conclusion
- technical solutions are basd on
- new reference architectures
- multidatabase architecture
- federated architecture
- common data models
- relational and object-relational
- Gateways in rapid development
- Any DBMS towards any other using ODBC, JDBC
- Relational or XML wrappers
34Conclusion
- MDB Languages
- MSQL et SQL-x x gt 2
- New transaction models
- Protocoles et Standards
- ODBC, JDBC
- All this is dealt with more in depth
- later on
- in the class books several others
35Exercises Research Problems
- All these already in the text
- Difference between the concepts of a DB, DDB,MDB
and FDB. - What does it mean  reference architecture ,
as ANSI-SPARC for example ? - Differences between the architectures
 top-down  bottom-up , multibase et
federated. - Comment on the actual architecture of federated
DBs in DB2 V. 6 and later (see DB2 Help or
RedBooks at the IBM web site). - Comment on the actual architecture of federated
DBs in SQL Server 2000 (see SQL Server 2000 Help
or the papers at the MS web site). - Design SQL SELECT statements realizing the
fragmentation of the Hotels DB. - Can you propose then how to deal with the
updates ? - Comment on the concept of ILS, of gateway and
of mediator - What multidatabase common model is most used
today ? - Comment the concept of local autonomy (what,
why, how) - Provide examples of various types of semantic
heterogeneity - Prove that usual associability of equijoints
does not hold if units of measure of values to
join may have different precisions - What are consequences for relational DBMSs ?
- Propose an extension to SQL for units of measure
and the corresponding query optimization (PH.D
Thesis).
36TheEND
37(No Transcript)