Title:

NTUA

Slides: 79

Provided by: Alin161

Transcript and Presenter's Notes

1

XML Query Reformulation
Val Tannen
University of Pennsylvania
Joint work with Alin Deutsch, UC San Diego
and in part with Lucian Popa, IBM Almaden

2
Data Exchange Between Businesses Using XML
[Figure: a pharmaceutical company, an insurance company, and a hospital exchange published data.]
3
XML?
<drug>
  <name>aspirin</name>
  <price>4</price>
  <notes>
    <side-effects>upset stomach</side-effects>
    <maker>Bayer</maker>
  </notes>
</drug>
text
4
A Simple Publishing Scenario
virtual data
<study>
  <case>
    <diag>migraine</diag>
    <drug>aspirin</drug>
    <usage>2/day</usage>
  </case>
  <case>
    <diag>allergy</diag>
    <drug>cortisone</drug>
    <usage>3/day</usage>
  </case>
</study>
patient name is hidden
XML query language standard (draft)
published data
proprietary data
prescription:
  usage   drug        name
  2/day   aspirin     John
  3/day   cortisone   Jane
patient:
  name   diagnosis
  John   migraine
  Jane   allergy
How to express the view?
View query which, if executed,
would produce the virtual data
How to compose the client query with the
view, obtaining the reformulation?
5
The General Problem of Query Reformulation
client
query Q(P)
? reformulated query X(S)
schema P
schema S
schema correspondence
Given query Q(P), find query(ies) X(S) returning the
same answer (soundness),
whenever such X(S) exists (completeness)
6
Applications of Query Reformulation

data publishing: public schema (P) / storage schema (S); we just saw it
data integration: global schema (P) / local schema (S)
schema evolution: old schema (P) / new schema (S)
data security: illustrated next
7
An Application: Data Security
client
query E(S) (exposes secret data correlation)
public schema P
proprietary schema S
schema correspondence
Only possible if Completeness Property holds!
8
More Complicated Data Publishing: Mixed And
Redundant Storage (MARS)
initial configuration
9
An Example With Tuning
XML
XML
drug,usage,diagnosis
simple publishing view
identity view
XML
drug,price,notes
drug,usage,name
name,diagnosis
10
Redundancy Enables Multiple Reformulations
client query: find how much each treatment costs
XML
XML
drug,usage,diagnosis
simple publishing view
identity view
cached query
relational view
XML
XML
drug,price,notes
drug,price
drug,usage,name
name,diagnosis
diagnosis,drug
Some reformulations are potentially cheaper to
execute than others. Want to find an optimal
one!
11
Schema Correspondence Expressible in XQuery
The DB administrator must be able to specify the
correspondence.
XML
XML
XQuery
XQuery
XQuery
XQuery
XML
XML
encode
encode
XML
XML
Can use XQuery, fixing any of the common
encodings of relational tables in XML.
12
XQuery?
binding part:
for $d in document/drug,
    $m in $d//maker
tagging template:
return <producedBy>$m/text()</producedBy>
[Figure: the document tree. drug has children name (aspirin), price (4), and notes; notes has children side-effects (upset stomach) and maker (Bayer).]
// (descendant) is the transitive closure of / (child).
The result should contain <producedBy>Bayer</producedBy>
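The two stages of this query can be imitated in a few lines of Python. This is only a sketch using the standard xml.etree.ElementTree module, not an XQuery engine; the names DOC and produced_by are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The sample document from the slides.
DOC = """<drug><name>aspirin</name><price>4</price>
<notes><side-effects>upset stomach</side-effects>
<maker>Bayer</maker></notes></drug>"""

def produced_by(xml_text):
    """Binding part, then tagging template, for the query on the slide."""
    d = ET.fromstring(xml_text)          # document/drug: the top-level element
    results = []
    if d.tag != "drug":
        return results
    # Binding part: m ranges over the maker elements below d.
    for m in d.iter("maker"):
        # Tagging template: wrap the text of m in a new producedBy element.
        out = ET.Element("producedBy")
        out.text = m.text
        results.append(ET.tostring(out, encoding="unicode"))
    return results
```

Calling produced_by(DOC) returns a one-element list holding the producedBy element for Bayer.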
13
Approach: XQuery Reformulation Reduced to
Relational Reformulation
14
XQuery Semantics
Variable binding stage:
for $d in document/drug,
    $m in $d//maker
Tagging stage:
return <producedBy>$m/text()</producedBy>
The XML data model is a tagged tree:
<drug>
  <name>aspirin</name>
  <price>4</price>
  <notes>
    <side-effects>upset stomach</side-effects>
    <maker>Bayer</maker>
  </notes>
</drug>
15
Compiling the Binding Part of XQueries to
Relational Queries
XBind query: the binding part of an XQuery (returns a
relation: tuples of variable bindings).
It compiles to a relational conjunctive query:
P(d,m) :- Root(r), child(r,d), tag(d,drug), desc(d,x), child(x,m), tag(m,maker)
But not all models of this schema correspond to
the intended model: we need GReX!
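To make the compilation concrete, here is a small Python sketch that encodes a document as the relations root, child, desc, and tag, and then evaluates the conjunctive query P(d,m) over them. The integer node ids, the virtual root with id 0, and the function names are assumptions of this sketch, not part of the slides.

```python
import xml.etree.ElementTree as ET
from itertools import count

DOC = """<drug><name>aspirin</name><price>4</price>
<notes><side-effects>upset stomach</side-effects>
<maker>Bayer</maker></notes></drug>"""

def encode(xml_text):
    """Encode a document as relations; node 0 is a virtual document root."""
    ids = count(1)
    root, child, tag = {0}, set(), set()

    def walk(elem, parent):
        n = next(ids)                    # ids are assigned in document order
        child.add((parent, n))
        tag.add((n, elem.tag))
        for c in elem:
            walk(c, n)

    walk(ET.fromstring(xml_text), 0)
    nodes = {0} | {n for _, n in child}
    # desc is the reflexive, transitive closure of child.
    desc = {(n, n) for n in nodes} | set(child)
    changed = True
    while changed:
        new = {(x, z) for (x, y) in desc for (y2, z) in child if y == y2}
        changed = not new <= desc
        desc |= new
    return root, child, desc, tag

def xbind(root, child, desc, tag):
    """P(d,m) :- Root(r), child(r,d), tag(d,drug),
                 desc(d,x), child(x,m), tag(m,maker)"""
    return {(d, m)
            for r in root
            for (r2, d) in child if r2 == r and (d, "drug") in tag
            for (d2, x) in desc if d2 == d
            for (x2, m) in child if x2 == x and (m, "maker") in tag}
```

On the sample document the drug element gets id 1 and the maker element id 6, so xbind(*encode(DOC)) yields the single binding (1, 6).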
16
Sample Constraints from GReX

Relationship between child and descendant navigation:
∀x∀y child(x,y) → desc(x,y)   (desc contains child)
∀x el(x) → desc(x,x)   (desc is reflexive)
∀x∀y∀z desc(x,y) ∧ desc(y,z) → desc(x,z)   (desc is transitive)
Tagged tree structure of XML:
∀r∀x root(r) ∧ desc(x,r) → x = r   (root has no ancestors)
∀x∀y∀z child(x,z) ∧ child(y,z) → x = y   (at most one parent)

These do not capture transitive closure
completely, nor is it possible to do so in
first-order logic. STILL...
17
More Constraints from GReX

(someTag) ∀x el(x) → ∃t tag(x,t)   (every element has a tag)
(oneTag) ∀x∀t1∀t2 tag(x,t1) ∧ tag(x,t2) → t1 = t2   (one tag per element)
(noLoop) ∀x∀y desc(x,y) ∧ desc(y,x) → x = y   (no non-trivial cycles)
(noShare) ∀x∀y∀u∀v child(x,u) ∧ child(x,v) ∧ desc(u,y) ∧ desc(v,y) → u = v   (unique path between elements)
(inLine) ∀x∀y∀u desc(x,u) ∧ desc(y,u) → x = y ∨ desc(x,y) ∨ desc(y,x)   (ancestors of an element are collinear)
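Constraints of this kind can be checked mechanically on a relational instance. The sketch below tests a subset of them (desc contains child, reflexivity, transitivity, oneTag, noLoop, at most one parent). It assumes that el(x) simply means membership in a given node set; the function name and the sample instance are illustrative.

```python
def violations(child, desc, tag, nodes):
    """Return the names of the checked constraints that the instance violates."""
    failed = []
    if not child <= desc:
        failed.append("desc contains child")
    if any((x, x) not in desc for x in nodes):
        failed.append("desc is reflexive")
    if any((x, z) not in desc
           for (x, y) in desc for (y2, z) in desc if y == y2):
        failed.append("desc is transitive")
    if any(t1 != t2 for (x, t1) in tag for (x2, t2) in tag if x == x2):
        failed.append("oneTag")
    if any(x != y and (y, x) in desc for (x, y) in desc):
        failed.append("noLoop")
    if any(x != y for (x, z) in child for (y, z2) in child if z == z2):
        failed.append("oneParent")
    return failed

# A three-node chain 1 -> 2 -> 3 (illustrative data).
NODES = {1, 2, 3}
CHILD = {(1, 2), (2, 3)}
DESC = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}
TAG = {(1, "drug"), (2, "notes"), (3, "maker")}
```

The intended instance passes every check, while dropping the pair (1, 3) from DESC gives an arbitrary relational instance that is reported as violating transitivity.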

18
Which Reformulations Do We Find This Way?
client XQuery
Mappings (?) as XQueries
schema correspondence
GReX built-in constraints capture XML data model
reformulated queries (multiple solutions)
all of them?
19
Restrictions on XQuery

Main restriction: no aggregates (to be investigated).
Leaving out aggregates, most common queries can
be processed.
Minor restrictions:
- no user-defined functions (of course!)
- limited use of negation (or else the problem becomes undecidable)
- limited use of document order (to be investigated)
- no navigation to parent or wildcard child (of unspecified tag) (unintuitive, but we can show that this needs another algorithm, unless NP = Π₂ᵖ)
20
The Reduction is Sound and Complete

For the restricted XQuery fragment, given
- an XBind query B, compiled to a relational query c(B)
- a schema correspondence C given by XQueries, compiled to a set of constraints c(C)

Relative Completeness Theorem: R is a minimal reformulation of B under C
iff c(R) is a minimal reformulation of c(B) under c(C) and GReX.
R can be computed from c(R).
All of them are found by CB.
21
A Glimpse at the Chase: Transforming Queries
Using Constraints
A query Q: find data satisfying condition A.
The chase: repeatedly applying chase steps until
no new conditions can be added.
In general, Q and Q1 are not equivalent, but in
all DBs satisfying the constraint, they are!
Theory of the chase: 20 years old, deep and rich,
due to Beeri, Maier, Mendelzon, Sagiv, Vardi,
Yannakakis and others!
22
How Do We Use the Chase? Capturing Relational
Views With Constraints
Let the schema correspondence be the view V:
retrieve the data satisfying conditions A and B.
Two constraints capture V:
- all data satisfying A and B appears in the result of V
- all data appearing in V satisfies A and B
23
Chase & Backchase
First, chase the query Q (find data satisfying A).
Next, inspect all subqueries (syntactic pieces)
of the chase result Q2.
It turns out that the subquery SQ (over V) is equivalent to Q.
The presence of the constraint A → B allows this reformulation.

24
General CB Algorithm (joint work with Lucian
Popa, IBM Almaden)

(public) schema P , (proprietary) schema S
Let C be a set of constraints (e.g., on P
and/or P ∪ S).

Assume some terminating chasing sequence
Q(P)
25
Two Sets of Experiments

Synthetic queries:
- reformulation time as a function of query complexity
- XML analog of relational star queries, with increasing number of joins
- can very complex queries still be reformulated in a practical amount of time?
Realistic queries from the XML Benchmark Project, http://monetdb.cwi.nl/xml
- The queries: 20 queries designed to exercise interesting features of XQuery
- The schema correspondence: views in both directions; compiles to about 200 constraints!

Much more than in typical relational schemas!
26
Experiments with Synthetic Queries
Number of joins (number of corners in the star)
27
Experiments with Benchmark Queries
Reformulation times must be understood in
conjunction with execution times (e.g., tens of
seconds for Q10)
28
Summary of Contributions

MARS, a system for XQuery reformulation,
- with mixed and redundant storage, under
integrity constraints.
- complex schema correspondence (views in both
directions)
Showed practical relevance of CB method
(feasible and worthwhile)
A completeness result for a significant fragment
of XQuery and a large
class of schema correspondences. The method
remains sound for the full language.
A reduction between minimal reformulation and
query equivalence, and
we gave matching lower bounds showing our
chase-based decision procedure is
asymptotically optimal for the fragment
considered.

29
The End
30
Why XML?

The relational data model is still the dominant
concept in databases.
All data can be coded into tables.
(For that matter, into (Gödel) numbers too!)
Artificial coding makes life harder for query
programmers.
Result: less productivity, more bugs.
XML is much more flexible. It is also
self-describing, i.e., there is no a priori
need for types/schemas (but this is
sometimes a bad idea).
It came from the document community (tagged text)
and was cheered by industry gurus. So we have to
live with it.
(Although one can imagine better data models.)

31
Making It Work

Chase: each chase step is similar to the evaluation
of a recursive Datalog rule on a symbolic database built from
the query, so we borrowed classical query processing techniques:
- compiling constraints to join tree
- joins implemented as hash-joins
- pushing selections into joins

Backchase: size of search space is O(2^u), u = size of universal plan.
We found criteria for pruning this space.

Cost-independent: prune subqueries that
- do not correspond to legal XML queries
- contain redundant descendant navigation steps

Bottom-up exploration of subqueries: first
all performing 1 navigation step, next all
performing 2 navigation steps, etc.
Perform contiguous navigation steps starting from
the root
x child-of y, y child-of z, x descendant-of z

A cost-based pruning strategy parameterized by
costing model:
- finds optimal reformulation for any monotonic cost model
- cost models for XML are still under research
- heuristic cost model: cost is the number of table scans/XML navigation steps performed
- amenable to experimenting with other cost models
32
Benefit of Reformulation For Execution Time
no. of elements in document
Benefit increases with increasing complexity of
query and increasing database size
33
More Results for Benchmark Queries
Delta to finish search
Delta to best reformulation
Time to first reformulation
For redundancy: materialized the XBind query for
each query
(a particular case of Access Support Relation).
Time to find the first reformulation is essentially
the same as in the absence of redundancy. Additional
time is spent only on finding the optimal one.
34
Related Work: Data Integration As Particular Case
of MARS Applications
Global As View (GAV)
Q
XQ o CR
P
(global schema)
CR
S
(local schema)
with Fernandez and Suciu in SIGMOD99
reformulation by composition-with-views
TSIMMIS, SilkRoute, XPeranto
35
Future Work Directions

Short-Term
- tuning of CB implementation for further
speedup
- XML-specific strategies for pruning the
backchase stage
- in particular, finding a good cost model to
perform cost-based pruning
Medium-Term
- Applying CB to Data Security
- Applications to Adaptive Distributed Query
Optimization
Long Term
- a unified framework for integrating data from
various heterogeneous sources, going
beyond classical databases (XML/relational/LDAP,
web forms, web services)

36
Application 3: Schema Evolution (e.g. Caching)
Goal: support existing client applications even
after changing the schema
client
old query Q (O)
old schema O
new schema N
schema correspondence
could be O extended with cached results
37
A Source of Redundancy: Relational Storage of XML
catalog
drug
drug
name
price
notes
price
notes
name
50
aspirin
cortisone
4
38
Containment Under Integrity Constraints

Decision procedure for containment is based on
chasing with constraints from GReX.
Natural extension to XML integrity constraints.
Some results:
Containment of well-behaved XPath/XBind queries
under bounded simple XML integrity constraints
(SXICs) is decidable (used in relative
completeness theorem).
Even modest use of unboundedness makes the
problem undecidable.
Corollary: containment under bounded SXICs and
DTDs is undecidable.
Containment under DTDs only is an open problem,
but we have a PSPACE lower bound.
See proposal for details.

39
LDAP
40
The Very End
41
The Architecture of Our Solution
client XQuery
defined next
Mappings (?) as XQueries rel/XML encodings
schema correspondence
not shown here
reformulated queries (multiple solutions)
42

Problem
XML/MARS XQuery Reformulation
schema correspondence given by views in both
directions
multiple solutions

43
Capturing Relational Views With Constraints
Let the schema correspondence be a view defined
as the relational conjunctive query
V(x,z) :- A(x,y), B(y,z)
Capture the definition with constraints:
(cV) ∀x ∀y ∀z A(x,y) ∧ B(y,z) → V(x,z)
(bV) ∀x ∀z V(x,z) → ∃y A(x,y) ∧ B(y,z)
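Read on concrete relations, (cV) says that the join of A and B is contained in V, and (bV) says the converse. A minimal Python sketch, with relations as sets of pairs and function names chosen for illustration:

```python
def join_view(A, B):
    """The view definition V(x,z) :- A(x,y), B(y,z), evaluated directly."""
    return {(x, z) for (x, y) in A for (y2, z) in B if y == y2}

def satisfies_cV(A, B, V):
    # (cV): every joined pair of A and B appears in V.
    return join_view(A, B) <= V

def satisfies_bV(A, B, V):
    # (bV): every tuple of V has a witness y in A and B.
    return V <= join_view(A, B)
```

An instance satisfies both constraints exactly when V holds the result of the view query, which is why the pair (cV, bV) captures the view.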
44
Partially capturing the XML model

Partially, because some features cannot be fully
captured with constraints:
- descendant is the transitive closure of child, but this is not FO-definable
- neither is the treeness property
Our solution:
- add a set of constraints GReX to approximate the intended models (it turns out that capturing descendant helps in capturing treeness)
- then, define a significant XQuery fragment (we call it well-behaved) that cannot distinguish between intended and approximate models

45
Constraints in GReX (2): the tagged tree
structure of XML

(topRoot) ∀r∀x root(r) ∧ desc(x,r) → x = r   (root has no ancestors)
(oneTag) ∀x∀t1∀t2 tag(x,t1) ∧ tag(x,t2) → t1 = t2   (one tag per element)
(noLoop) ∀x∀y desc(x,y) ∧ desc(y,x) → x = y   (no non-trivial cycles)
(oneParent) ∀x∀y∀z child(x,z) ∧ child(y,z) → x = y   (at most one parent)
(noShare) ∀x∀y∀u∀v child(x,u) ∧ child(x,v) ∧ desc(u,y) ∧ desc(v,y) → u = v   (unique path between elements)
(inLine) ∀x∀y∀u desc(x,u) ∧ desc(y,u) → x = y ∨ desc(x,y) ∨ desc(y,x)   (ancestors of an element are collinear)

46
XQuery Restrictions

What it allows:
- composition of navigation steps
- navigation axes: self, (named) child, descendant, ancestor, idrefs
- qualifiers: path, string = path, and, or, path equality/inequality
- where clause: disjunction, path equality/inequality, existential quantification
What it rules out:
- user-defined functions
- range, before predicates
- aggregates, arbitrary negation, universal quantification, concatenation (,)
- navigation to parent (..) or to child of unspecified name (*)

47
CB Completeness

Let C be a set of constraints (relating public
schema P and proprietary schema S).
C-minimal query: removing any of its relational atoms
produces a non-equivalent query under C.
Q1 is a subquery of Q2: Q1 is isomorphic to a piece of Q2.

Completeness Theorem: Any C-minimal reformulation
of Q is a subquery of U (the universal plan, i.e. the result of chasing Q(P)).
48
A Completeness Result for Our Solution

Given
- a well-behaved XBind query B, compiled to a relational query c(B)
- a schema correspondence M given by well-behaved XQueries (in both directions), compiled to a set of relational constraints c(M)
- bounded XML integrity constraints XIC (a class of XML integrity constraints, see KRDB01), compiled to a set of relational constraints c(XIC)

Relative Completeness Theorem: for any R,
R is an (M ∪ XIC)-minimal reformulation of B
iff c(R) is a (GReX ∪ c(M) ∪ c(XIC))-minimal reformulation of c(B).
R can be computed from c(R).
All of them are found by CB.
Corollary: completeness of the reformulation algorithm for XBind queries.
49
Capturing XML Semantics
client XQuery
Mappings (?) as XQueries
schema correspondence
GReX built-in constraints capture XML data model
reformulated queries (multiple solutions)
50
Summary of Constraints Used in CB Phase

Built-in constraints in GReX
Relational views compile to inclusion
constraints
XQuery views
their XBind queries compile to inclusion
constraints as for relational views
their return clause compiles to several
decorrelated queries, each captured with
constraints
the XML template in the return clause compiles to
several Skolem and copy functions, each compiled
to constraints
Integrity constraints
XML constraints compile to relational constraints
relational schema constraints

51
Are the Restrictions Justified?

Our completeness result holds for well-behaved
XQueries, under bounded
XML integrity constraints.
What about reformulating
XQueries with parent and wildcard child
navigation?
Under other XML integrity constraints?
Even under full-fledged DTDs?
For such extensions, we make a deeper study of
equivalence, which is an even simpler problem in
reformulation.
The equivalence checker is invoked as black-box
algorithm during CB.

52
XBind (includes XPath) Fragments and the Complexity of Equivalence
- path concatenation, attribute values; navigation axes: self, (named) child, descendant; qualifiers: path, string = path, and. Equivalence: PTIME
- join on attribute variables. Equivalence: NP-complete
- any or all (!) of the following: disjunction, ancestor navigation, path equality, wildcard child (*)
- navigation: parent, preceding(following)-sibling
53
Containment for the well-behaved fragment of
XBind/XPath
Theorem: Let B1, B2 be XBind/XPath queries from our
well-behaved fragment, and c(B1), c(B2) their
relational compilation. Then B1 is
equivalent to B2 iff c(B1) is
equivalent to c(B2) under GReX
(decidable in Π₂ᵖ using the chase).
This result about containment is used in the
relative completeness theorem.
54
Extensions of the NP fragment: Π₂ᵖ fragments

any or all (!) of the following make equivalence
Π₂ᵖ-complete
disjunction
unsurprising: conjunctive queries + union
already Π₂ᵖ-complete SY80

ancestor navigation
translate ancestor away introducing union
/a/b/ancestor ? /a/b ? /ab
path equality qualifier
can simulate ancestor
//..//./p/s ? /p/ancestor/s
wildcard child navigation
union introduced by interaction of // and *
//a ≡ /a ∪ /*//a

Not well-behaved, but we have a different
decision procedure
55
Experimental Setup Started From the XML Benchmark

Used the official XML Benchmark Project,
http://monetdb.cwi.nl/xml
The application domain: an online auctioning
application.
The published schema: a DTD given by the XML
Benchmark Project.
Data is partially nicely structured.
The queries: 20 queries designed
to exercise interesting features of XQuery.

56
What We Added to the XML Benchmark Setup
The mixed storage schema:
- relationally: person, item, open auction, closed auction, etc.
- unstructured part: annotations on items
The redundancy: materialized the XBind query for
each query (a particular case of Access Support Relation).
The mappings in both directions:
relations ↔ XML, XML ↔ XML.
It all compiles to about 200 constraints!
Much more than in typical relational schemas!
Had to change the original implementation (SIGMOD00) to scale.
57
Related Work

Publishing systems:
- Schema mapping proprietary relational → published XML: SilkRoute, XPeranto; reformulation by composition-with-views.
- Schema mapping published XML → proprietary relational: STORED, Agora; reformulation by rewriting-with-views.
Information integration:
- TSIMMIS (composition-with-views), Information Manifold (rewriting-with-views)
Containment:
- Miklau and Suciu: smaller fragment of XPath (they too find that * is naughty)
- FLS, CGLV: conjunctive regular path queries
- Amer-Yahia and Srivastava: minimization of tree pattern queries
Containment under integrity constraints:
- XML keys (BDFHT), description logics (CGL)

58
Query Reformulation in Data Publishing
public schema P (virtual data)
schema interface against which
queries are formulated
publishing query (may hide some proprietary
data)
proprietary storage schema S (materialized data)
59
Compiling the Binding Part of XQueries to
Relational Queries
But, over arbitrary DBs with this schema, the
relational translation of Root ? desc ?
desc is not equivalent to that of
Root ? desc
must communicate to the CB that desc table is
transitive
60
The Challenge for Reformulation on MARS

To find the reformulations efficiently, we need
to
reason with schema correspondence
efficiently construct the search space for
reformulations
- must contain all reformulations (for
completeness)
explore search space
- exhaustively (for security applications)
- maybe trading optimality of reformulation for
search speed
(for optimization purposes)

61
Contributions

A novel algorithm for reformulation of relational
queries under relational constraints
Chase & Backchase

Uses this semantics and exploits CB

A declarative semantics for most of XQuery

VLDB99 with Popa and Tannen SIGMOD00 with
Popa, Sahuguet and Tannen

A reformulation algorithm for XQuery
practical (feasible and worthwhile)
complete for most of XQuery
optimal (we show lower bounds for various XQuery
fragments KRDB01, DBPL01)

MARS a system for XQuery reformulation over
Mixed And Redundant Storage
constructs and represents search space
efficiently
cost-based exploration strategy parameterized by
traditional costing module
finds first reformulation fast
Experimental evaluation time to first
reformulation, simple cost

62
Compiling Client XQueries
client XQuery
Mappings (?) as XQueries
schema correspondence
GReX built-in constraints capture XML data model
reformulated queries (multiple solutions)
63
Capturing the Schema Correspondence
client XQuery
Mappings (?) as XQueries
schema correspondence
GReX built-in constraints capture XML data model
reformulated queries (multiple solutions)
64
Major Obstacles in Compiling Schema Mappings to
Constraints

Schema correspondence given by XQueries. As
opposed to relational queries,
XQueries have nested, correlated subqueries in
return clause
XQueries create new elements
XQueries return deep, recursive copies of
input XML trees
(solution not shown)

65
Compiling Nested Subqueries Decorrelation

the query
for $p in doc("foo.xml")//person
return <res>$p/phone/text()</res>

is short for the nested query
for $p in doc("foo.xml")//person
return <res>for $t in $p/phone/text()
            return $t
       </res>

compile XBind parts to two decorrelated
relational queries (shown here in Datalog syntax):
Bouter(p) ← Root(r), desc(r,x), child(x,p), tag(p,person)
Binner(p,t) ← Bouter(p), child(p,n), tag(n,phone), text(n,t)
capture each with two inclusion
constraints, as done in the original CB method
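The effect of decorrelation can be illustrated without the relational encoding. In the Python sketch below, the outer and inner bindings are computed as two separate flat queries and joined afterwards; the sample document standing in for foo.xml and all function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for foo.xml: one person with two phones, one with none.
FOO = ("<people><person><phone>555-1234</phone><phone>555-9999</phone></person>"
       "<person/></people>")

def b_outer(root):
    """Outer bindings: every person element in the document."""
    return list(root.iter("person"))

def b_inner(persons):
    """Inner bindings: (person, phone text) pairs, one flat query for all persons."""
    return [(p, phone.text) for p in persons for phone in p.findall("phone")]

def evaluate(xml_text):
    root = ET.fromstring(xml_text)
    outer = b_outer(root)
    inner = b_inner(outer)
    # One res element per outer binding, even when no inner binding joins with it.
    return ["<res>" + "".join(t for q, t in inner if q is p) + "</res>"
            for p in outer]
```

The person without a phone still produces an (empty) res element, which is why the outer bindings have to be kept as a query of their own.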
66
Capturing Creation of New Elements

for $p in doc("foo.xml")//person
return <res>$p/phone/text()</res>
For each binding of p, a distinct <res>-element
is constructed.

Capture F by the relation G representing its
graph, and the constraints:
∀p∀r1∀r2 G(p,r1) ∧ G(p,r2) → r1 = r2   (r = F(p))
∀p1∀p2∀r G(p1,r) ∧ G(p2,r) → p1 = p2   (F is injective)
∀p∀r G(p,r) → Bouter(p)   (F's domain is included in Bouter)
∀p Bouter(p) → ∃r G(p,r)   (Bouter is included in F's domain)
F is the Skolem function that validates this
constraint
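The four constraints say that G is the graph of an injective function whose domain is exactly Bouter. A short Python check over finite relations, with an illustrative function name and G given as a set of (p, r) pairs:

```python
def captures_skolem_function(G, b_outer):
    """True if G is the graph of an injective function with domain b_outer."""
    # Each p has at most one result element r.
    functional = all(r1 == r2 for (p1, r1) in G for (p2, r2) in G if p1 == p2)
    # Distinct bindings of p get distinct result elements.
    injective = all(p1 == p2 for (p1, r1) in G for (p2, r2) in G if r1 == r2)
    # The domain of G coincides with the outer bindings.
    domain = {p for p, _ in G}
    return functional and injective and domain == set(b_outer)
```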
67
Stratified-Witness Constraints (with L.P.)
Full dependencies: no existential quantifier. The
chase always terminates. Beyond this?
Given a set C of dependencies, define its chase flow graph:
- Nodes correspond to relation components: an R of arity 3 produces 3 nodes.
- Edges are drawn between the ith component of R and the jth component of S iff R appears on the left side and S appears on the right side of the implication of some dependency.
- The edge is labeled ∃ if the corresponding variable in S is existentially quantified.
C is stratified-witness if there is no cycle with an ∃-labeled edge.
Proposition: The chase with
stratified-witness constraints always terminates.
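The test can be sketched in Python as a cycle check on the flow graph. The slide states the edge rule tersely, so this sketch makes an explicit assumption: an unlabeled edge goes from position i of a left-hand atom to position j of a right-hand atom when the same variable occurs at both positions, and an ∃-labeled edge goes to every right-hand position holding an existentially quantified variable. Dependencies are written as (left atoms, right atoms, existential variables); this representation and all names are illustrative.

```python
def flow_graph(deps):
    """Map each edge (source, target) to True if it carries an existential label."""
    edges = {}
    for lhs, rhs, exist in deps:
        for r_name, r_args in lhs:
            for s_name, s_args in rhs:
                for i, u in enumerate(r_args):
                    for j, v in enumerate(s_args):
                        key = ((r_name, i), (s_name, j))
                        if v in exist:
                            edges[key] = True
                        elif u == v:
                            edges.setdefault(key, False)
    return edges

def reachable(edges, start):
    """Nodes reachable from start by following one or more edges."""
    succ = {}
    for (u, v) in edges:
        succ.setdefault(u, set()).add(v)
    seen, stack = set(), [start]
    while stack:
        u = stack.pop()
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_stratified_witness(deps):
    """True if no cycle of the flow graph goes through an existential edge."""
    edges = flow_graph(deps)
    for (u, v), existential in edges.items():
        if existential and (u == v or u in reachable(edges, v)):
            return False
    return True

# cV and bV for the view V(x,z) :- A(x,y), B(y,z).
CV = ([("A", ("x", "y")), ("B", ("y", "z"))], [("V", ("x", "z"))], set())
BV = ([("V", ("x", "z"))], [("A", ("x", "y")), ("B", ("y", "z"))], {"y"})
# R(x,y) -> exists z. R(y,z): each new tuple triggers another chase step.
LOOP = ([("R", ("x", "y"))], [("R", ("y", "z"))], {"z"})
```

Under this reading the pair cV, bV passes the test, while the dependency LOOP, whose chase never stops, is rejected.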
68
(Relational) Conjunctive Queries
Q(x,z) :- R(x,y,z), R(y,x,u), S(z,u)
select r1.A, s.A
from R r1, R r2, S s
where r1.A = r2.B and r1.B = r2.A and r1.C = s.A and r2.C = s.B
notation: r stands for r1, ..., rn
queries: select O(r) from R r where C(r)
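The two notations describe the same query. As a quick check, the Python sketch below runs the select-from-where form through the standard sqlite3 module and evaluates the rule form as a set comprehension. The column names A, B, C follow the slide; the function names and sample data are illustrative.

```python
import sqlite3

def q_sql(R, S):
    """The select-from-where form, run on an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    con.execute("create table R (A, B, C)")
    con.execute("create table S (A, B)")
    con.executemany("insert into R values (?, ?, ?)", R)
    con.executemany("insert into S values (?, ?)", S)
    rows = con.execute(
        "select r1.A, s.A from R r1, R r2, S s "
        "where r1.A = r2.B and r1.B = r2.A and r1.C = s.A and r2.C = s.B"
    ).fetchall()
    con.close()
    return set(rows)

def q_rule(R, S):
    """The rule form Q(x,z) :- R(x,y,z), R(y,x,u), S(z,u)."""
    return {(x, z)
            for (x, y, z) in R
            for (y2, x2, u) in R if y2 == y and x2 == x
            for (z2, u2) in S if z2 == z and u2 == u}

# Illustrative data.
R_DATA = {(1, 2, 3), (2, 1, 4)}
S_DATA = {(3, 4)}
```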
69
(Relational) Dependencies, a.k.a. Integrity
Constraints
∀(r∈R) B(r) → ∃(s∈S) C(r,s)
B and C are conjunctions of equalities,
as in the where clause
example: ∀(r1∈R)(r2∈R) r1.E = r2.E → ∃(s∈R) s.D = r1.D ∧ s.E = r1.E ∧ s.F = r2.F
70
Query Containment and Dependencies
Q1: select O1(r1) from R1 r1 where C1(r1)
Q2: select O2(r2) from R2 r2 where C2(r2)
define cont(Q1,Q2) as
∀(r1∈R1) C1(r1) → ∃(r2∈R2) C2(r2) ∧ O1(r1) = O2(r2)
we have, in each instance: Q1 ⊆ Q2 iff cont(Q1,Q2)
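For conjunctive queries without dependencies, containment over all instances can be decided by the classical canonical-instance test: freeze the variables of Q1 into constants and look for a match of Q2 that produces Q1's output. The Python sketch below uses Datalog-style atoms rather than the select-from-where notation of the slide; the representation (a query is a pair of head variables and body atoms) and the names are assumptions of this sketch.

```python
def homomorphisms(atoms, instance):
    """Yield assignments of variables that map every atom onto some fact."""
    def extend(i, env):
        if i == len(atoms):
            yield dict(env)
            return
        name, args = atoms[i]
        for fact_name, fact_args in instance:
            if fact_name != name or len(fact_args) != len(args):
                continue
            new = dict(env)
            # Bind each variable, rejecting the fact on any conflicting binding.
            if all(new.setdefault(a, v) == v for a, v in zip(args, fact_args)):
                yield from extend(i + 1, new)
    yield from extend(0, {})

def contained_in(q1, q2):
    """True if q1 is contained in q2 on every instance (no dependencies)."""
    head1, body1 = q1
    head2, body2 = q2
    canonical = set(body1)        # variables of q1 frozen as constants
    return any(tuple(h[v] for v in head2) == tuple(head1)
               for h in homomorphisms(body2, canonical))

# Q1(x) :- A(x,y), B(y,z)   and   Q2(x) :- A(x,y)
Q1 = (("x",), [("A", ("x", "y")), ("B", ("y", "z"))])
Q2 = (("x",), [("A", ("x", "y"))])
```

Here Q1 is contained in Q2 but not the other way around, since Q2 places no condition on B.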
71
And Vice Versa
d: ∀(r∈R) B(r) → ∃(s∈S) C(r,s)
front(d): select r from R r where B(r)
back(d): select r from R r, S s where B(r) ∧ C(r,s)
we have, in each instance: d holds iff front(d) ⊆ back(d)
72
Chase Step
d: ∀(r∈R) B(r) → ∃(s∈S) C(r,s)
Q: select O(r) from R r where B(r)
Q': select O(r) from R r, S s where B(r) ∧ C(r,s)
chase step: Q --d--> Q'
basic fact: if Q --d--> Q' then Q ≡_d Q'
the chase step is applicable if Q' is not trivially
equivalent to Q (for example, we cannot chase
Q' with d!)
73
Using the Chase
basic fact: if the chase step of Q with d is
not applicable then Inst(Q) ⊨ d
(Inst(Q) is the canonical instance built from query Q)
Basic Theorem: let D be a set of dependencies and
Q1 → ... → chase_D(Q1) a terminating chase
sequence (no more applicable steps).
Then Q1 ⊆_D Q2 iff chase_D(Q1) ⊆ Q2
74
Reformulation with Views
a view is just a query V: select O(r) from R r where C(r)
Reformulation of query Q(R) with view V: finding
X(R,V) such that Q(R) ≡_V X(R,V)
75
One View, Two Dependencies
V: select O(r) from R r where C(r)
the chase-in dependency cV: ∀(r∈R) C(r) → ∃(x∈V) x = O(r)
the backchase dependency bV: ∀(x∈V) ∃(r∈R) C(r) ∧ x = O(r)
It turns out that if rewritings of Q with V exist
then such a rewriting can be obtained by chasing
Q with cV
76
The Chase and Backchase (CB) Algorithm (joint
work with Lucian Popa, IBM Almaden)
The chase with cV always terminates. The search
space for rewritings of Q with V consists of the
subqueries of chase_cV(Q).
(S is a subquery: there is an injective homomorphism from S to chase_cV(Q))
Keep only subqueries such that S ≡_V chase_cV(Q).
This can be checked by
(back!)chasing with cV, bV (also terminating)
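The whole procedure fits in a short Python sketch for conjunctive queries written as Datalog-style atoms. It is a naive illustration under stated assumptions, not the MARS implementation: dependencies are pairs (left atoms, right atoms) in which right-hand variables absent from the left are existential, the chase is cut off after max_steps rounds, and the backchase enumerates subsets of the chase result by increasing size. All names are illustrative.

```python
from itertools import combinations, count

def matches(atoms, facts, env):
    """Yield extensions of env that map every atom onto some fact."""
    if not atoms:
        yield env
        return
    (name, args), rest = atoms[0], atoms[1:]
    for fact_name, fact_args in facts:
        if fact_name == name and len(fact_args) == len(args):
            new = dict(env)
            if all(new.setdefault(a, v) == v for a, v in zip(args, fact_args)):
                yield from matches(rest, facts, new)

def chase(body, deps, max_steps=100):
    """Add the conclusions of violated dependencies until no step applies."""
    body, fresh = set(body), count()
    for _ in range(max_steps):
        applied = False
        for lhs, rhs in deps:
            for h in list(matches(lhs, body, {})):
                if next(matches(rhs, body, h), None) is not None:
                    continue                  # conclusion already present
                env = dict(h)
                for _, args in rhs:
                    for a in args:
                        if a not in env:      # existential: invent a fresh variable
                            env[a] = f"_n{next(fresh)}"
                body |= {(n, tuple(env[a] for a in args)) for n, args in rhs}
                applied = True
        if not applied:
            return body
    raise RuntimeError("chase did not terminate within max_steps")

def minimal_reformulations(head, body, chase_deps, all_deps):
    """Backchase: minimal subqueries of the chase result equivalent to the query."""
    universal = sorted(chase(body, chase_deps))
    found = []
    for size in range(1, len(universal) + 1):
        for sub in combinations(universal, size):
            if any(set(f) <= set(sub) for f in found):
                continue                      # contains a smaller reformulation
            if not set(head) <= {a for _, args in sub for a in args}:
                continue                      # an output variable is missing
            # sub is a piece of the chase result, so the query is contained in it;
            # the converse holds if the query maps into the chase of sub.
            fixed = {v: v for v in head}
            if next(matches(list(body), chase(sub, all_deps), fixed), None) is not None:
                found.append(sub)
    return found

# Q(x,z) :- A(x,y), B(y,z) with the view V(x,z) :- A(x,y), B(y,z).
Q_HEAD = ("x", "z")
Q_BODY = [("A", ("x", "y")), ("B", ("y", "z"))]
CV = ([("A", ("a", "b")), ("B", ("b", "c"))], [("V", ("a", "c"))])
BV = ([("V", ("a", "c"))], [("A", ("a", "b")), ("B", ("b", "c"))])
```

On this example the chase with cV adds V(x,z) to the query, and the backchase keeps two minimal subqueries: the one that scans only V, and the original join of A and B.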
77
Preliminary Completeness Result for CB (with
L.P.)
Theorem: Any scan-minimal reformulation of Q
with V is a subquery of chase_cV(Q).
scan-minimal: no scan (from
item) can be removed without compromising
equivalence with Q. Fewer scans means faster
execution under most cost models.
78
Additional Integrity Constraints
In general the storage schema contains integrity
constraints that restrict its class of instances
(models). This may extend the set of
reformulation solutions!
Let C be a set of dependencies.
Reformulating query Q(R) with view V under C:
finding X(R,V) such that Q(R) ≡_{V,C} X(R,V).
That's the same as reformulating Q under C ∪ {cV, bV}.
Can we still use the chase?
