Title: End of XML
1End of XML
2FLWR (Flower) Expressions
- FOR ...
- LET...
- WHERE...
- RETURN...
3XQuery
- Find book titles by the coauthors of Database
Theory
FOR x IN bib/booktitle/text() Database
Theory/author y IN bib/bookauthor/tex
t() x/text()/title RETURN ltanswergt
y/text() lt/answergt
Result ltanswergt abc lt/ answer gt lt answer gt
def lt/ answer gt lt answer gt ghi lt/ answer gt
The answer willcontain duplicates !
4XQuery
- Same as before, but eliminate duplicates
FOR x IN bib/booktitle/text() Database
Theory/author y IN distinct(bib/booka
uthor/text() x/text()/title) RETURN ltanswergt
y/text() lt/answergt
Result ltanswergt abc lt/ answer gt lt answer gt
def lt/ answer gt lt answer gt ghi lt/ answer gt
distinct a function that eliminates duplicates
5SQL and XQuery Side-by-side
Product(pid, name, maker)Company(cid, name, city)
Find all products made in Seattle
FOR x in /db/Product/row y in
/db/Company/rowWHERE x/maker/text()y/cid/
text() and y/city/text() SeattleRETURN
x/name
SELECT x.nameFROM Product x, Company yWHERE
x.makery.cid and y.citySeattle
SQL
XQuery
6XQuery Nesting
- For each author of a book by Morgan Kaufmann,
list all books she published
FOR a IN distinct(document("bib.xml")
/bib/bookpublisherMorgan
Kaufmann/author) RETURN ltresultgt
a, FOR t IN
/bib/bookauthora/title
RETURN t
lt/resultgt
7XQuery
Result
- ltresultgt
- ltauthorgtJoneslt/authorgt
- lttitlegt abc lt/titlegt
- lttitlegt def lt/titlegt
- lt/resultgt
- ltresultgt
- ltauthorgt Smith lt/authorgt
- lttitlegt ghi lt/titlegt
- lt/resultgt
8XQuery
- FOR x in expr -- binds x to each value in the
list expr - LET x expr -- binds x to the entire list
expr - Useful for common subexpressions and for
aggregations
9XQuery
ltbig_publishersgt FOR p IN
distinct(document("bib.xml")//publisher)
LET b document("bib.xml")/bookpublisher
p WHERE count(b) gt 100 RETURN
p lt/big_publishersgt
count a (aggregate) function that returns the
number of elms
10XQuery
- Find books whose price is larger than average
LET aavg(document("bib.xml")/bib/book/price) FOR
b in document("bib.xml")/bib/book WHERE
b/price gt a RETURN b
Lets try to write this in SQL
11XQuery
- Summary
- FOR-LET-WHERE-RETURN FLWR
FOR/LET Clauses
List of tuples
WHERE Clause
List of tuples
RETURN Clause
Instance of Xquery data model
12FOR v.s. LET
- FOR
- Binds node variables ? iteration
- LET
- Binds collection variables ? one value
13FOR v.s. LET
Returns ltresultgt ltbookgt...lt/bookgtlt/resultgt
ltresultgt ltbookgt...lt/bookgtlt/resultgt ltresultgt
ltbookgt...lt/bookgtlt/resultgt ...
FOR x IN document("bib.xml")/bib/book RETURN
ltresultgt x lt/resultgt
LET x IN document("bib.xml")/bib/book RETURN
ltresultgt x lt/resultgt
Returns ltresultgt ltbookgt...lt/bookgt
ltbookgt...lt/bookgt
ltbookgt...lt/bookgt ... lt/resultgt
14Collections in XQuery
- Ordered and unordered collections
- /bib/book/author an ordered collection
- Distinct(/bib/book/author) an unordered
collection - LET a /bib/book ? a is a collection
- b/author ? a collection (several authors...)
Returns ltresultgt ltauthorgt...lt/authorgt
ltauthorgt...lt/authorgt
ltauthorgt...lt/authorgt
... lt/resultgt
RETURN ltresultgt b/author lt/resultgt
15Collections in XQuery
- What about collections in expressions ?
- b/price ? list of n
prices - b/price 0.7 ? list of n numbers
- b/price b/quantity ? list of n x m numbers
?? - b/price (b/quant1 b/quant2) ? b/price
b/quant1 b/price b/quant2 !!
16Sorting in XQuery
ltpublisher_listgt FOR p IN distinct(document("
bib.xml")//publisher) RETURN ltpublishergt
ltnamegt p/text() lt/namegt ,
FOR b IN document("bib.xml")//bookpublisher
p RETURN ltbookgt
b/title ,
b/price
lt/bookgt SORTBY(price DESCENDING)
lt/publishergt SORTBY(name)
lt/publisher_listgt
17If-Then-Else
FOR h IN //holding RETURN ltholdinggt
h/title,
IF h/_at_type "Journal"
THEN h/editor
ELSE h/author
lt/holdinggt SORTBY (title)
18Existential Quantifiers
FOR b IN //book WHERE SOME p IN b//para
SATISFIES contains(p,
"sailing") AND
contains(p, "windsurfing") RETURN b/title
19Universal Quantifiers
FOR b IN //book WHERE EVERY p IN b//para
SATISFIES contains(p,
"sailing") RETURN b/title
20The Role of XML Data
- XML is designed for data exchange, not to replace
relational or E/R data - Sources of XML data
- Created manually with text editors not really
data - Generated automatically from relational data
(will discuss next) - Text files, replacing older data formats Web
server logs, scientific data (biological,
astronomical) - Stored/processed in native XML engines very few
applications need that today
21XML from/to Relational Data
- XML publishing
- relational data ? XML
- XML storage
- XML ? relational data
22XML Publishing
- Exporting the data is easy we do this already
for HTML - Translating XQuery ? SQL is hard
- XML publishing systems
- Research Experanto (IBM/DB2), SilkRoute (ATT
Labs and UW) - XQuery ? SQL
- Commercial SQL Server, Oracle
- only Xpath ? SQL and with restrictions
23XML Publishing
- How do we choose the output structure ?
- Determined by agreement
- Or dictated by committees
- XML dialects (called applications) DTDs
- XML Data is often nested, irregular, etc
- No normal forms for XML
24XML Storage
- Most often the XML data is small
- E.g. a SOAP message
- Parsed directly into the application (DOM API)
- Sometimes XML data is large
- need to store/process it in a database
- The XML storage problem
- How do we choose the schema of the database ?
25XML Storage
- Two solutions
- Schema derived from DTD
- Storing XML as a graph Edge relation