Title: Logic as a Query Language
1 Logic as a Query Language: from Frege to XML
Victor Vianu, U.C. San Diego
2 Logic and Databases: a success story
- FO lies at the core of modern database systems
- Relational query languages are based on FO: SQL, QBE
- More powerful query languages (all the way to XML) are based on extensions of FO
- Foundations lie in classical logic: FO (Frege), relational algebra (Tarski)
3 Why is FO so successful as a query language?
- easy to use: syntactic variants SQL, QBE
- efficient implementation via relational algebra: amenable to analysis and simplification
- potential for perfect scaling to large databases: very fast response can be achieved using parallel processing
4 A relational database

frequents            serves
drinker  bar         bar     beer
Joe      Kings       Kings   Bass
Joe      Mollys      Kings   Bud
Sue      Mollys      Mollys  Bass
...                  ...

Logically: a finite first-order structure
5 Find the drinkers who frequent some bar serving Bass
FO: {d: drinker | ∃b: bar (frequents(d,b) ∧ serves(b, Bass))}
QBE:

frequents            serves               answer
drinker  bar         bar     beer         drinker
d        b           b       Bass         d
6 Find the drinkers who frequent some bar serving Bass
(slide annotation: not)
FO: {d: drinker | ∃b: bar (frequents(d,b) ∧ serves(b, Bass))}
QBE:

frequents            serves               answer
drinker  bar         bar     beer         drinker
d        b           b       Bass         d
7 Naïve implementation: nested loops
{d: drinker | ∃b: bar (frequents(d,b) ∧ serves(b, Bass))}
for each drinker
  for each bar
    check the pattern
Number of checks: drinkers × bars
Roughly n²: unacceptable for large databases!
Better approach: relational algebra
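A minimal sketch of the nested-loop evaluation, using the sample rows from the earlier slide. The function name and the explicit check counter are illustrative assumptions, not part of the talk.

```python
# Sample instance from the slides (the "..." rows are omitted).
frequents = {("Joe", "Kings"), ("Joe", "Mollys"), ("Sue", "Mollys")}
serves = {("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")}


def drinkers_nested_loops(frequents, serves, beer="Bass"):
    """Naive evaluation: test the pattern for every (drinker, bar) pair."""
    drinkers = {d for d, _ in frequents}
    bars = {b for _, b in frequents} | {b for b, _ in serves}
    answer, checks = set(), 0
    for d in drinkers:
        for b in bars:
            checks += 1  # one pattern check per pair
            if (d, b) in frequents and (b, beer) in serves:
                answer.add(d)
    return answer, checks


answer, checks = drinkers_nested_loops(frequents, serves)
print(sorted(answer), checks)
```

With 2 drinkers and 2 bars the loop performs 4 checks; the count grows as the product of the two domain sizes.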
8 Relational algebra operations
- union, difference
- selection: σ_{beer=Bass}(serves)

  bar     beer
  Kings   Bass
  Mollys  Bass

- projection: π_bar(serves)

  bar
  Kings
  Mollys
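Selection and projection can be sketched over relations stored as Python sets of tuples. The helper names `select` and `project` and the positional column indexes are assumptions made for illustration; union and difference correspond to the built-in set operators `|` and `-`.

```python
serves = {("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")}


def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def project(relation, *columns):
    """Projection: keep the given column positions; the set removes duplicates."""
    return {tuple(row[i] for i in columns) for row in relation}


bass_rows = select(serves, lambda row: row[1] == "Bass")  # beer = Bass
bars = project(serves, 0)                                 # column 0 is bar

print(sorted(bass_rows))
print(sorted(bars))
```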
9 join ⋈: frequents ⋈ serves

frequents            serves
drinker  bar         bar     beer
Joe      Kings       Kings   Bass
Joe      Mollys      Kings   Bud
Sue      Mollys      Mollys  Bass
...                  ...

frequents ⋈ serves
drinker  bar     beer
Joe      Kings   Bass
Joe      Kings   Bud
Joe      Mollys  Bass
Sue      Mollys  Bass
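The join pairs every frequents row with every serves row that agrees on bar. A minimal sketch, with `natural_join` as a hypothetical helper name:

```python
frequents = {("Joe", "Kings"), ("Joe", "Mollys"), ("Sue", "Mollys")}
serves = {("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")}


def natural_join(frequents, serves):
    """Combine rows of the two relations that share the same bar."""
    return {
        (drinker, bar, beer)
        for (drinker, bar) in frequents
        for (serving_bar, beer) in serves
        if bar == serving_bar
    }


joined = natural_join(frequents, serves)
for row in sorted(joined):
    print(row)
```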
10 Relational algebra queries
Find the drinkers who frequent some bar serving Bass:
π_drinker(σ_{beer=Bass}(frequents ⋈ serves))

frequents ⋈ serves
drinker  bar     beer
Joe      Kings   Bass
Joe      Kings   Bud
Joe      Mollys  Bass
Sue      Mollys  Bass

σ_{beer=Bass}(frequents ⋈ serves)
drinker  bar     beer
Joe      Kings   Bass
Joe      Mollys  Bass
Sue      Mollys  Bass

π_drinker(σ_{beer=Bass}(frequents ⋈ serves))
drinker
Joe
Sue
...

Theorem: Relational algebra and FO are equivalent
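The equivalence can be checked on the sample instance by evaluating the query both ways. This is only a sanity check on one database, not a proof; the variable names are illustrative.

```python
frequents = {("Joe", "Kings"), ("Joe", "Mollys"), ("Sue", "Mollys")}
serves = {("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")}

# FO reading: d is an answer if some bar b has frequents(d,b) and serves(b,Bass).
drinkers = {d for d, _ in frequents}
bars = {b for _, b in frequents} | {b for b, _ in serves}
fo_answer = {
    d
    for d in drinkers
    if any((d, b) in frequents and (b, "Bass") in serves for b in bars)
}

# Algebra reading: join, then select beer = Bass, then project on drinker.
joined = {(d, b, beer) for (d, b) in frequents for (b2, beer) in serves if b == b2}
selected = {row for row in joined if row[2] == "Bass"}
ra_answer = {row[0] for row in selected}

print(sorted(fo_answer), sorted(ra_answer))
```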
11 Journey of a Query
- FO (SQL): ∃z (P(x,z) ∧ Q(z,y))
- Relational Algebra: π_{13}(P ⋈ Q)
- Query Rewriting: π_{14}(P ⋈ S) ⋈ Q ⋈ R
- Query Execution Plan
- Execution
- Physical Level
(figure: query execution plan tree with nodes ⋈, ⋈, π_{14}, Q, R, ⋈, P, S)
12 Rewriting rules for algebra queries
π_drinker(σ_{beer=Bass}(frequents ⋈ serves))
- efficient algorithms for individual operations
- indexes: special directories to data
- cost roughly n log n: much better than n² for large databases!
13 Rewriting rules for algebra queries
π_drinker(σ_{beer=Bass}(frequents ⋈ serves))
rewritten as
π_drinker(frequents ⋈ π_bar(σ_{beer=Bass}(serves)))
- efficient algorithms for individual operations
- indexes: special directories to data
- cost roughly n log n: much better than n² for large databases!
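A sketch of the rewriting, assuming the sample rows from the earlier slides. The rewritten plan filters serves before joining and looks drinkers up through an index on bar. The index here is a Python dictionary (a hash index), swapped in for simplicity; the n log n figure on the slide corresponds to sorted or tree-structured indexes. Function names are hypothetical.

```python
from collections import defaultdict

frequents = {("Joe", "Kings"), ("Joe", "Mollys"), ("Sue", "Mollys")}
serves = {("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")}


def original_plan(frequents, serves):
    """Join everything first, then select beer = Bass, then project on drinker."""
    joined = {(d, b, beer) for (d, b) in frequents for (b2, beer) in serves if b == b2}
    return {d for (d, _, beer) in joined if beer == "Bass"}


def rewritten_plan(frequents, serves):
    """Select and project serves first, then join through an index on bar."""
    bass_bars = {bar for (bar, beer) in serves if beer == "Bass"}
    drinkers_by_bar = defaultdict(set)  # index: bar -> drinkers
    for drinker, bar in frequents:
        drinkers_by_bar[bar].add(drinker)
    return {d for bar in bass_bars for d in drinkers_by_bar.get(bar, ())}


print(sorted(original_plan(frequents, serves)))
print(sorted(rewritten_plan(frequents, serves)))
```

Both plans return the same drinkers; the rewritten one never materializes the Bud rows of the join.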
14 Most spectacular: theoretical potential for perfect scaling!
- perfect scaling: given sufficient resources, performance does not degrade as the database becomes larger
- key: parallel processing
- cost: number of processors polynomial in the size of the database
- role of algebra: operations highlight parallelism
15 Each algebra operation can in principle be implemented very efficiently in parallel
Example: projection π_bar(serves)

serves               π_bar(serves)
bar     beer         bar
Kings   Bass         Kings
Kings   Bud          Mollys
Mollys  Bass

Constant parallel time!
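The idea can be imitated with one task per row: each task only looks at its own tuple, so no task waits on another. This is a sketch under assumptions: Python threads stand in for the processors of the parallel model, and the final `set(...)` removes duplicates sequentially, which the theoretical model would also do in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

serves = [("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")]


def project_bar(row):
    """Work done by one 'processor': keep the bar column of its own row."""
    return row[0]


# One worker per tuple, mirroring "one processor per tuple".
with ThreadPoolExecutor(max_workers=len(serves)) as pool:
    bars = set(pool.map(project_bar, serves))

print(sorted(bars))
```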
16 Another example: join frequents ⋈ serves

frequents            serves
drinker  bar         bar     beer
Joe      Kings       Kings   Bass
Joe      Mollys      Kings   Bud
Sue      Mollys      Mollys  Bass
...

frequents ⋈ serves
drinker  bar     beer
Joe      Kings   Bass
Joe      Kings   Bud
Joe      Mollys  Bass
Sue      Mollys  Bass
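For the join, the same idea uses one task per pair of rows, so the number of tasks is the product of the two relation sizes (polynomial in the database size). A sketch with threads standing in for processors; `match` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

frequents = [("Joe", "Kings"), ("Joe", "Mollys"), ("Sue", "Mollys")]
serves = [("Kings", "Bass"), ("Kings", "Bud"), ("Mollys", "Bass")]


def match(pair):
    """Work done for one pair of rows: emit a joined row if the bars agree."""
    (drinker, bar), (serving_bar, beer) = pair
    return (drinker, bar, beer) if bar == serving_bar else None


pairs = list(product(frequents, serves))  # one task per pair of tuples
with ThreadPoolExecutor(max_workers=len(pairs)) as pool:
    joined = {row for row in pool.map(match, pairs) if row is not None}

print(len(pairs), sorted(joined))
```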
17 Every relational algebra query takes constant parallel time!
π_drinker(σ_{beer=Bass}(frequents ⋈ serves))

Operator tree (constant parallel time):
π_drinker
  σ_{beer=Bass}
    ⋈
      frequents
      serves
18 Summary so far
Keys to the success of FO as a query language:
- ease of use
- efficient implementation via relational algebra
Constant parallel complexity: the full potential of FO as a query language has yet to be realized!
19 Beyond relational databases: the Web and XML
- relations replaced by trees (XML data)
- structure described by schemas (e.g., DTDs)
Again, logic provides the foundations:
- DTDs are equivalent to tree automata (MSO on trees)
- XML queries are essentially tree transducers
- Can use automata and logic to understand semantics and expressiveness, perform static analysis
20 Most XML query languages are extensions of SQL
- implementation based on same paradigm
- uses extensions of relational algebra
- query optimization builds upon relational
techniques
21 XML and DTDs

<dealer>
  <UsedCars> <ad> <model>Honda</model> <year>96</year> </ad> </UsedCars>
  <NewCars> <ad> <model>Acura</model> </ad> </NewCars>
</dealer>

dealer
  UsedCars
    ad
      model: Honda
      year: 96
  NewCars
    ad
      model: Acura
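The correspondence between the XML text and its tree can be reproduced with Python's standard `xml.etree.ElementTree` parser. The `outline` helper is an illustrative assumption; it returns one indented line per element.

```python
import xml.etree.ElementTree as ET

doc = """<dealer>
  <UsedCars><ad><model>Honda</model><year>96</year></ad></UsedCars>
  <NewCars><ad><model>Acura</model></ad></NewCars>
</dealer>"""


def outline(node, depth=0):
    """Return the tree as indented lines: element name, plus text if any."""
    text = (node.text or "").strip()
    lines = ["  " * depth + node.tag + (f": {text}" if text else "")]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(doc)
print("\n".join(outline(root)))
```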
22 Document Type Definition (DTD)
- Σ: alphabet of element names, root ∈ Σ
- set of rules e → r, where e is an element name and r is a regular expression over Σ
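23 Documents satisfying a DTD
- the root node of the tree is labeled root
- for every node labeled e with children labeled e1 ... ek, where e → r is the rule for e: e1 ... ek ∈ r
Set of trees satisfying DTD d: T(d)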
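The satisfaction condition can be sketched as a check of each node's sequence of child labels against the regular expression for its element name. The rules below for the dealer document are hypothetical (the slides do not give them), and encoding a content model as a Python regex over space-terminated names is an assumption of this sketch.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD for the dealer example: element name -> regular expression
# over child element names, each name followed by one space.
DTD_ROOT = "dealer"
DTD_RULES = {
    "dealer": r"UsedCars NewCars ",
    "UsedCars": r"(ad )*",
    "NewCars": r"(ad )*",
    "ad": r"model (year )?",
    "model": r"",
    "year": r"",
}


def satisfies(tree_root, root_name, rules):
    """True if the root label is right and every node's children match its rule."""
    if tree_root.tag != root_name:
        return False
    stack = [tree_root]
    while stack:
        node = stack.pop()
        rule = rules.get(node.tag)
        if rule is None:
            return False  # element name outside the alphabet
        child_word = "".join(child.tag + " " for child in node)
        if re.fullmatch(rule, child_word) is None:
            return False
        stack.extend(node)  # iterate over the children next
    return True


good = ET.fromstring(
    "<dealer><UsedCars><ad><model>Honda</model><year>96</year></ad></UsedCars>"
    "<NewCars><ad><model>Acura</model></ad></NewCars></dealer>"
)
bad = ET.fromstring("<dealer><NewCars/><UsedCars/></dealer>")
print(satisfies(good, DTD_ROOT, DTD_RULES), satisfies(bad, DTD_ROOT, DTD_RULES))
```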
24 Example
A DTD and a tree satisfying it:
  root: section
  section: intro, section, conclusions
(figure: tree rooted at root, with nested section nodes whose children are intro, section, and conc)
25 Specialization

dealer
  UsedCars
    ad
      model
      year
  NewCars
    ad
      model

ad has different structure in different contexts
26 Specialization

dealer
  UsedCars
    ad^used
      model
      year
  NewCars
    ad^new
      model

ad has different structure in different contexts
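One way to picture specialization is to give each rule a type name in addition to the element label it stands for, so that two types can share the label ad. The sketch below is an assumption-laden illustration: the rules are hypothetical, the types `ad_used` and `ad_new` stand for the two specialized ad types on the slide, and it assumes that within a single content model each label corresponds to at most one type, which keeps type assignment deterministic.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical specialized rules: type -> (element label, regex over child types).
RULES = {
    "dealer": ("dealer", r"UsedCars NewCars "),
    "UsedCars": ("UsedCars", r"(ad_used )*"),
    "NewCars": ("NewCars", r"(ad_new )*"),
    "ad_used": ("ad", r"model year "),
    "ad_new": ("ad", r"model "),
    "model": ("model", r""),
    "year": ("year", r""),
}


def has_type(node, type_name, rules):
    """Check that node can be assigned type_name, typing children by context."""
    label, content = rules[type_name]
    if node.tag != label:
        return False
    # Types named in this content model, keyed by the label they specialize.
    type_of_label = {rules[t][0]: t for t in re.findall(r"\w+", content)}
    child_types = []
    for child in node:
        child_type = type_of_label.get(child.tag)
        if child_type is None or not has_type(child, child_type, rules):
            return False
        child_types.append(child_type)
    word = "".join(t + " " for t in child_types)
    return re.fullmatch(content, word) is not None


ok = ET.fromstring(
    "<dealer><UsedCars><ad><model/><year/></ad></UsedCars>"
    "<NewCars><ad><model/></ad></NewCars></dealer>"
)
wrong = ET.fromstring(
    "<dealer><UsedCars><ad><model/></ad></UsedCars>"
    "<NewCars><ad><model/><year/></ad></NewCars></dealer>"
)
print(has_type(ok, "dealer", RULES), has_type(wrong, "dealer", RULES))
```

The second document uses only the labels of the first, but its ads have the wrong shape for their context, so it is rejected.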
27 What sets of trees can be defined?
- Exactly the regular tree languages!
  -- trees accepted by tree automata
  -- trees defined by Monadic Second-Order Logic (MSO)
- XML query languages are essentially tree transducers
- Consequences: can use automata/logic techniques to analyze and manipulate DTDs and XML queries
28 Example: static analysis for robust data integration
(figure: Integrated View with Common DTD; XML Sources with Source DTDs)
29 Example: static analysis for robust data integration
(figure: Integrated View with Common DTD, tree automaton B; tree transducer T; XML Sources with Source DTDs, tree automaton A)
30 Example: static analysis for robust data integration
(figure: Integrated View with Common DTD, tree automaton B; tree transducer T; XML Sources with Source DTDs, tree automaton A)
Need to check: T(A) ⊆ B
Key: T⁻¹(B) definable in MSO
31 Conclusion
- Logic has provided the foundations of databases, from relational databases all the way to XML
- FO lies at the core of relational database systems
- XML and its query languages are founded upon tree automata, tree transducers, and logics on trees
- Implementation uses extensions of relational algebra and builds upon relational database techniques