Propagating%20Updates%20Through%20XML%20Views%20Using%20Lineage%20Tracing - PowerPoint PPT Presentation

About This Presentation
Title:

Propagating%20Updates%20Through%20XML%20Views%20Using%20Lineage%20Tracing

Description:

Propagating Updates Through XML Views Using Lineage Tracing Leonidas Fegaras University of Texas at Arlington Updating XML Views Problem: Given a virtual XML view ... – PowerPoint PPT presentation

Number of Views:98
Avg rating:3.0/5.0
Slides: 15
Provided by: Leoni56
Learn more at: https://lambda.uta.edu
Category:

less

Transcript and Presenter's Notes

Title: Propagating%20Updates%20Through%20XML%20Views%20Using%20Lineage%20Tracing


1
Propagating Updates Through XML Views Using
Lineage Tracing
  • Leonidas Fegaras
  • University of Texas at Arlington

2
Updating XML Views
  • Problem Given a virtual XML view over a
    relational database expressed in XQuery,
    translate an XQuery update over the XML view to
    SQL updates over the base tables (embedded into
    pure XQuery)
  • Assumptions
  • We use the XQuery Update Facility (XUF)
  • Snapshot semantics one XQuery with updates a
    single transaction
  • Goals
  • prevent most view side-effects statically
  • ... but detect the rest efficiently at run-time
  • collectively consider all compatible updates in a
    transaction
  • This is far more expressive than considering each
    update in isolation

3
Highlights
  • Prevent many forms of view side-effects by using
    syntactic restrictions and a static analysis
  • Allow the full extent of XQuery to define views
    but restrict the table connections in a view to
    trees
  • Use a conservative static analysis to detect
    exclusive data sources
  • These are the table columns that can not cause
    view side-effects when updated
  • Based on polymorphic type inference and type
    usage analysis
  • Detect the rest of the view side-effects using a
    general (but efficient) run-time analysis
  • Propagate information about the origins of the
    updatable data through views and XQuery updates
  • to be used at the update destinations
  • Methodology lineage tracing
  • The lineage is not just a tuple-id/column-name
  • an XML element has a context a chain of tuples
    used in reaching the element
  • The table updates are generated by considering
    all chains in a transaction

4
Exclusive Data Source (EDS)
  • An exclusive data source (EDS) is a table column
    whose values
  • do not appear in the view output at all
  • ie, they are not updatable
  • or they appear in the view output only once, but
    they are not accessed elsewhere in the view code
  • Exclusive data sources cannot cause view side
    effects when updated
  • Our goal is to prevent/detect side effects on
    non-EDS columns as well

5
How to Detect Exclusive Data Sources?
  • Use polymorphic type inference with a simple
    usage analysis
  • each table column a fresh type variable
  • if the principal type of a view is polymorphic
    over a type variable
  • gt the view is oblivious to
    the table column values
  • in addition, if the type variable appears once in
    the view principal type
  • gt the table column is an
    exclusive data source
  • A proof based on the parametricity theorem
    ('theorem for free')
  • Problems
  • Not all XQuery engines support static validation
  • May be too conservative
  • Limitation the analysis granularity is a table
    column
  • Not good for generic mappings, eg, a view that
    uses a single tag table
  • The tag table may be used multiple times in the
    view (once per tag)
  • Cannot statically deduce disjointness
  • But may use extra annotations in the view to
    specify disjointness

6
Updating non-EDS Columns
  • Database DB
  • Department ( dno, dname, dphone )
  • Employee ( ename, dno, salary )
  • View view
  • for i in DB/Employee
  • let d DB/Departmentdnoi/dno
  • return ltemployeegt i/ename,
    i/salary,

  • d/dname, d/dphone

  • lt/employeegt
  • EDS columns
  • Employee.ename, Employee.dno,
    Employee.salary
  • Update on a non-EDS column
  • replace viewename"Smith"/dname
    with "CS"

View tree
Employee
ename
dno
salary
dno
dname
dphone
Department
7
Example (cont.)
  • Update on an non-EDS column
  • replace viewename"Smith"/dname
    with "CS"
  • If the CS department exists, can we just connect
    Smith to CS?
  • May cause a side-effect on dphone
  • What if the dphone were not visible in the view?
  • What if there is an update
  • replace viewename"Smith"/dphone
    with "x5678"
  • in the same transaction?

ename
dno
salary
Smith
update
EE
x1234
CS
x5678
dno
dname
dphone
dno
dname
dphone
8
Checking for Side-Effects at Run-Time
  • An XQuery replace update generates database
    updates of the form
  • replace ( T.A, value, tids )
  • to replace T.A with a new value
  • tids is the tuple id list of the tuple chain to
    reach the update destination tuple
  • We do not need to start from a root
  • anchor tuple the lowest-level tuple in the
    chain whose foreign key in the chain is EDS
  • If T.A is EDS, then tids contains the tuple id of
    the destination only
  • At the end of each transaction
  • We collect all updates associated with the same
    anchor tuple
  • We construct a shadow tuple chain that contains
    the updated values
  • If there is an existing chain in the database
    with the same visible tuple values
  • ... we update the anchor foreign key to link it
    to the existing chain
  • A visible column is one that appears in the view
    output (from usage analysis)
  • Otherwise, we try to insert
    the tuples of the shadow chain
  • ... which may cause key constraint violation

9
Example
E must be an EDS column
10
More on Updates
  • This translation of a replace update will not
    cause side-effects
  • But is this what the programmer intended to do?
  • replace viewdname"EE"/dname with
    "CS"
  • Too expensive to test at run-time
  • Deletions are handled similarly
  • We generate replace calls to change T.A to null
  • If all visible column values in the shadow chain
    are null, then disconnect the chain from the
    anchor
  • Needs garbage collection
  • Very hard to handle insertions
  • Generate replace-null(T.A,value,tids) to
    replace only if T.A is null
  • Generate insert(T) to insert a new tuple
  • Hard to generate these calls from a general
    XQuery insert

11
Using Lineage Tracing
  • The replace/replace-null/insert calls are
    generated at compile-time
  • but the tuple chain ids (tids) used must be
    propagated at run-time
  • At run-time, every XML atomic value
    (xsanyAtomicType) is annotated with lineage
    attributes
  • tids sequence of tuple ids
  • path sequence of foreign key names T.A
  • Requires minor extensions to an existing XQuery
    engine
  • Constructed when an atomic value is generated
    from a table in a view
  • from the view tree
  • XQuery operations must propagate these values as
    is
  • Newly constructed values have empty lineage

12
Other Contributions
  • A general algorithm for translating XQuery
    updates to embedded SQL update calls
  • Generates pure XQuery with embedded
    replace/replace-null/insert calls
  • It is type-guided (based on the update
    destination type)
  • Limitation XQuery insert translations are not
    complete
  • requires that there is at least one child in the
    parent node to copy parts of its lineage
  • Each XQuery update query is translated to pure
    XQuery with
  • embedded SQL code to retrieve data,
  • calls to replace/replace-null/insert to update
    data
  • A general framework for optimizing XQuery with
    embedded SQL statements
  • Source-to-source rewrite heuristics to
  • promote XQuery predicates to SQL code
  • fuse SQL statements in pairs to form SQL join
    queries
  • An implementation in Haskell
    http//lambda.uta.edu/HXQ/

13
Conclusion
  • Presented a framework for updating virtual XML
    views
  • Characteristics
  • Hybrid of schema-based and data-driven methods
    for detecting view side-effects
  • Allows general XQueries but restricts the table
    connections in the views
  • XQuery updates are translated statically to
    embedded SQL code
  • ... but the resulting SQL updates use information
    that is propagated at run-time (through lineage
    tracing)
  • Current work
  • A source-to-source, type-guided approach to
    incremental maintenance of XML views

14
Related Work
  • L. Wang et al vldb'06 HUX
  • A hybrid of schema-based and data-driven methods
  • Uses data lineage
  • Uses the concept of clean extended source
  • Semi-decidable method to determine
    translatability of updates
  • deduced from relationship cardinalities
  • Barsalou et al sigmod'91 Object view updates
  • View trees from ownership connections
  • Literal translation to base updates within a
    dependency island
  • Braganholo et al vldb'04 Map XML views to
    relational views
  • B. Choi et al icde'07 data-driven with XPath
    views
  • Needs to maintain a reachability matrix
  • Bidirectionalization research in programming
    languages
  • Relational lenses Bohannon et al pods'06
  • Bidirectionalization for free Voigtlander
    popl'09
Write a Comment
User Comments (0)
About PowerShow.com