Schema mappings are logical assertions that - PowerPoint PPT Presentation

1 / 1
About This Presentation
Title:

Schema mappings are logical assertions that

Description:

Hide implementation details, allow for optimizations ... Translate data conforming to a source schema S into data conforming to a target ... – PowerPoint PPT presentation

Number of Views:34
Avg rating:3.0/5.0
Slides: 2
Provided by: IBMU440
Category:

less

Transcript and Presenter's Notes

Title: Schema mappings are logical assertions that


1
SPIDER a Schema Mapping Debugger
Bogdan Alexe Laura Chiticariu Wang-Chiew
Tan University of California, Santa Cruz
Debugging a Data Exchange
Debugging Schema Mappings
Schema Mappings and Data Exchange
  • Schema mappings are logical assertions that
  • describe the correspondence between two
    schemas
  • Higher-level, declarative programming constructs
  • Hide implementation details, allow for
    optimizations
  • Key elements in data exchange and data
    integration systems
  • Data Exchange FKMP03
  • Translate data conforming to a source schema S
    into data conforming to a target schema T so that
    the schema mapping M is satisfied
  • The process of exploring, understanding and
    refining a schema mapping through the use of
    (test) data at the level of schema mappings
  • Approach 1 At the level of the implementation
    (unsatisfactory)
  • Specific to the exchange engine
  • Specific to implementation language. E.g.,
    XQuery, XSLT, etc
  • Commercial tools available Altova MapForce,
    Stylus Studio, etc
  • Approach 2 At the level of the schema mapping
    (desirable)
  • Currently, NO SUPPORT!!!
  • Motivation for debugging at the level of the
    schema mappings Uniformity in specifying and
    debugging
  • Reduce programming effort by allowing a user to
    specify and debug at the level of schema mappings

M
Source schema S
Target Schema T
SPIDER
Source instance I
Target instance J
M
routes
Source schema S
Target Schema T
tuple selection
tuple selection
Source instance I
Target instance J
  • SPIDER is the first prototype
  • debugger for schema mappings

XQuery/SQL/Java
Example
Main Idea Debugging Schema Mappings with Routes
Features
  • Routes illustrate the relationship between source
    and target data with the schema mapping
  • Declarative semantics, based on the logical
    satisfaction of the dependencies
  • Independent of any implementation of the schema
    mapping
  • Concept applies to any mapping-based data
    exchange or data integration system
  • Computing routes for selected source or target
    data
  • Compute all routes
  • Compute one route
  • Compute alternative routes on demand
  • Guided exploration of all routes
  • Standard debugging features
  • Breakpoints on dependencies
  • Watch windows zoom into details about each step
    in the routes
  • Schema-level exploration of routes
  • Facilitates the understanding of schema mappings
    directly at the level of source and target
    schemas
  • Implementation details
  • On top of the Clio data exchange system
  • Supports relational and XML schema mappings

Source schema
Target schema
MANHATTAN CREDIT CardHolders cardNo
² limit ² ssn
² name ²
Dependents accNo ² ssn
² name ²
FARGO FINANCE Accounts ² accNo ² creditLine ²
accHolder Clients ² ssn ² name
D1
fk1
C1
D2
Debugging scenario 1
Debugging scenario 2
Source-to-target dependencies
Accounts
Accounts
D1 foreach s0 in MANHATTAN-CREDIT.CardHolders
exists t0 in FARGO-FINANCE.Accounts, t1 in
FARGO-FINANCE.Clients, where
t0.accHoldert1.ssn with s0.cardNo
t0.accNo and s0.ssn t0.accHolder and s0.name
t1.name
Unknown credit limit?
Unknown account number?
ID2
L2
A2
D2 foreach s0 in MANHATTAN-CREDIT.Dependents
exists t0 in FARGO-FINANCE.Clients with
s0.ssn t0.ssn and s0.name t0.name
A route for the Accounts tuple
Accounts
CardHolders
ID1
L1
123
D1
Target dependencies
Alice
ID1
15K
123
Clients
C1 foreach s0 in FARGO-FINANCE.Clients
exists t0 in FARGO-FINANCE.Accounts with s0.ssn
t0.accHolder
Alice
ID1
123 is not copied to the target as Bobs account
number
15K is not copied over to the target
Source instance I
Target instance J
D1 foreach s0 in MANHATTAN-CREDIT.CardHolders
exists t0 in FARGO-FINANCE.Accounts, t1 in
FARGO-FINANCE.Clients, where
t0.accHoldert1.ssn with s0.cardNo
t0.accNo and s0.ssn t0.accHolder and s0.name
t1.name and s0.limit t0.creditLine
CardHolders
Accounts
D2 foreach s0 in MANHATTAN-CREDIT.Dependents,
s1 in MANHATTAN-CREDIT.CardHold
ers where s0.accNos1.cardNo
exists t0 in FARGO-FINANCE.Clients, t1 in
FARGO-FINANCE.Accounts where t1.accHoldert0.ssn
with s0.ssn t0.ssn and s0.name t0.name
and s1.cardNo t1.accNo and
s1.limit t1.creditLine
Clients
Dependents
Solution for I under the schema mapping
Schema-level exploration of routes
Compute All Routes and Compute One Route
  • Illustrate the schema mapping at the level of the
    source and target schemas

Forest of routes for the Account tuple
Routes obtained from the forest
Accounts
MANHATTAN CREDIT CardHolders cardNo
² limit ² ssn
² name ²
Dependents accNo ² ssn
² name ²
FARGO FINANCE Accounts ² accNo ² creditLine ²
accHolder Clients ² ssn ² name
ID2
L2
A2
Clients
selected schema element
Bob
ID2
CardHolders
Accounts
D1
C1
Bob
ID2
7K
456
ID2
L2
A2
C1
D1
Accounts
fk1
C1
ID2
L3
456
D2
Clients
Bob
ID2
Dependents
Accounts
Clients
D1
D2
D2
C1
Bob
ID2
123
ID2
L2
A2
Bob
ID2
CardHolders
Dependents
Bob
ID2
7K
456
Bob
ID2
123
Towards a full-fledged debugger
  • Compute all routes
  • For each selected target tuple ts, consider every
    possibility for witnessing t. Do not consider the
    same tuple twice.
  • Complete, polynomial time algorithm
  • The route forest is a polynomial representation
    of all routes (possibly exponentially many) for
    the selected tuples
  • Computation can be user-guided, or stopped with
    breakpoints on dependencies
  • Compute one route
  • Non-exhaustive adapted compute all routes to
    stop when one witness is found
  • Inference procedure to deduce all consequences
    of a proven tuple and avoid recomputation of
    branches
  • Complete, polynomial time algorithm
  • Future work
  • Extension to handle nested schema mappings
  • Adapt the target instance with changes in the
    schema mapping
  • Acknowledgements
  • Daniel Pepper, UC Santa Cruz
  • The Clio team in IBM Almaden Research Center
Write a Comment
User Comments (0)
About PowerShow.com