XML Data Management 8. XQuery - PowerPoint PPT Presentation

About This Presentation
Title:

XML Data Management 8. XQuery

Description:

XML Data Management 8. XQuery Werner Nutt Join Exercise Return all pairs of ingredients such that the ingredients have the same name, but occur with different amounts ... – PowerPoint PPT presentation

Number of Views:91
Avg rating:3.0/5.0
Slides: 59
Provided by: Preferr180
Category:
Tags: xml | bell | data | management | pepper | xquery

less

Transcript and Presenter's Notes

Title: XML Data Management 8. XQuery


1
XML Data Management 8. XQuery
  • Werner Nutt

2
Requirements for an XML Query Language
  • David Maier, W3C XML Query Requirements
  • Closedness output must be XML
  • Composability wherever a set of XML elements is
    required, a subquery is allowed as well
  • Support for key operations
  • selection
  • extraction, projection
  • restructuring
  • combination, join
  • fusion of elements

3
Requirements for an XML Query Language
  • Can benefit from a schema, but should also be
    applicable without
  • Retains the order of nodes
  • Formal semantics
  • structure of results should be derivable from
    query
  • defines equivalence of queries
  • Queries should be representable in XML
  • ? documents can have embedded queries

4
How Does One Design a Query Language?
  • In most query languages, there are two aspects to
    a query
  • Retrieving data (e.g., from where in SQL)
  • Creating output (e.g., select in SQL)
  • Retrieval consists of
  • Pattern matching (e.g., from )
  • Filtering (e.g., where )
  • although these cannot always be clearly
    distinguished

5
XQuery Principles
  • Data Model identical with the XPath data model
  • documents are ordered, labeled trees
  • nodes have identity
  • nodes can have simple or complex types
    (defined in XML Schema)
  • A query result is an ordered list/sequence of
    items
  • (nodes, values, attributes, etc., but not lists)
  • special case the empty list ()

6
XQuery Principles (cntd)
  • XQuery can be used without schemas, but
    can be checked against DTDs and XML schemas
  • XQuery is a functional language
  • no statements
  • evaluation of expressions
  • function definitions
  • modules

7
The Recipes DTD (Reminder)
  • lt!ELEMENT recipes (recipe)gt
  • lt!ELEMENT recipe (title, ingredient,
    preparation, nutrition)gt
  • lt!ELEMENT title (PCDATA)gt
  • lt!ELEMENT ingredient (ingredient,
    preparation?)gt
  • lt!ATTLIST ingredient
  • name CDATA REQUIRED
  • amount CDATA IMPLIED
  • unit CDATA IMPLIEDgt
  • lt!ELEMENT preparation (step)gt
  • lt!ELEMENT step (PCDATA)gt
  • lt!ELEMENT nutrition EMPTYgt
  • lt!ATTLIST nutrition
  • calories CDATA REQUIRED
  • fat CDATA REQUIREDgt

8
A Query over the Recipes Document
lttitlesgt for r in doc("recipes.xml")//recip
e return r/title lt/titlesgt returns
lttitlesgt lttitlegtBeef Parmesan with Garlic
Angel Hair Pastalt/titlegt lttitlegtRicotta
Pielt/titlegt lt/titlesgt


9
Query Features
lttitlesgt for r in doc("recipes.xml")//recipe
return r/title lt/titlesgt
doc(String) returns input document
Sequence of results,one for each variable binding
10
An Equivalent Stylesheet Template
  • ltxsltemplate match"/"gt
  • lttitlesgt
  • ltxslfor-each select"//recipe"gt
  • ltxslcopy-of select"title"/gt
  • lt/xslfor-eachgt
  • lt/titlesgt
  • lt/xsltemplategt

11
Features Summary
  • The result is a new XML document
  • A query consists of parts that are returned as is
  • ... and others that are evaluated (everything in
    ... )
  • Calling the function doc(String) returns
    an
    input document
  • XPath is used to retrieve node sets and values
  • Iteration over node sets for
    binds a variable to all nodes in a node set
  • Variables can be used in XPath expressions
  • return returns a sequence of results,
    one for each binding
    of a variable

12
XPath is a Fragment of XQuery
  • doc("recipes.xml")//recipe1/title
  • returns
  • lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt
  • doc("recipes.xml")//recipeposition()lt3
    /title
  • returns
  • lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt,
  • lttitlegtRicotta Pielt/titlegt,
  • lttitlegtLinguine alla Pescadoralt/titlegt

an element
a list of elements
13
Beware Attributes in XPath
  • doc("recipes.xml")//recipe1/ingredient1
    /_at_name
  • ? attribute name "beef cube steak"
  • string(doc("recipes.xml")//recipe1
    /ingredient1/_at_name)
  • ? "beef cube steak"

an attribute, represented as a constructor for
an attribute node (not in Saxon)
a value of type string
14
Beware Attributes in XPath (cntd.)
  • ltfirst-ingredientgtstring(doc("recipes.xml")//rec
    ipe1 /ingredient1/_at_name)
    lt/first-ingredientgt
  • ? ltfirst-ingredientgtbeef cube steaklt/first-ingred
    ientgt

an element with string content
15
Beware Attributes in XPath (cntd.)
  • ltfirst-ingredientgtdoc("recipes.xml")//recipe1
    /ingredient1/_at_name
  • lt/first-ingredientgt
  • ? ltfirst-ingredient name"beef cube steak"/gt

an element with an attribute
  • Note The XML that we write down is only the
    surface structure of the data model that is
    underlying XQuery

16
Beware Attributes in XPath (cntd.)
  • ltfirst-ingredient
  • oldName"doc("recipes.xml")//recipe1
    /ingredient1/_at_name"gtBeeflt/first-i
    ngredientgt
  • ? ltfirst-ingredient oldName"beef cube steak"gt
  • Beef
  • lt/first-ingredientgt

17
Constructor Syntax
  • For all constituents of documents, there are
    constructors
  • element first-ingredient
  • attribute oldName
  • string(doc("recipes.xml")//recipe1

    /ingredient1/_at_name),
  • "Beef"
  • equivalent to the notation on the
    previous slide

18
Iteration with the For-Clause
  • Syntax for var in xpath-expr
  • Example for r in doc("recipes.xml")//recipe
    return string(r)
  • The expression creates a list of bindings for a
    variable var
  • If var occurs in an expression exp,
    then exp is evaluated for
    each binding
  • For-clauses can be nested

for r in doc("recipes.xml")//recipefor v in
doc("vegetables.xml")//vegetable return ...
19
What Does This Return?
  • for i in (1,2,3)for j in (1,2,3) return
    element concat("x",i j) i j

20
Nested For-clauses Example
  • ltmy-recipesgt
  • for r in doc("recipes.xml")//recipe
  • return
  • ltmy-recipe title"r/title"gt
  • for i in r//ingredient
  • return
  • ltmy-ingredientgt
  • string(i/_at_name)
  • lt/my-ingredientgt
  • lt/my-recipegt
  • lt/my-recipesgt

Returns my-recipes with titles as attributes
and my-ingredients with names as text content
21
The Equivalent Stylesheet Template
  • ltxsltemplate match"/"gt
  • ltmy-recipesgt
  • ltxslfor-each select".//recipe"gt
  • ltmy-recipe title"title"gt
  • ltxslfor-each select"ingredient"gt
  • ltmy-ingredientgt
  • ltxslvalue-of select"_at_name"/gt
  • lt/my-ingredientgt
  • lt/xslfor-eachgt
  • lt/my-recipegt
  • lt/xslfor-eachgt
  • lt/my-recipesgt
  • lt/xsltemplategt

22
The Let Clause
  • Syntax let var xpath-expr
  • binds variable var to a list of nodes,
    with the nodes in document order
  • does not iterate over the list
  • allows one to keep intermediate results for reuse
    (not
    possible in SQL)
  • Example
  • let oorecps doc("recipes.xml")//recipe
    .//ingredient/_at_name"olive oil"

23
Let Clause Example
ltcalory-contentgt let oorecps
doc("recipes.xml")//recipe
.//ingredient/_at_name"olive oil" for r in
oorecps return ltcaloriesgt r/title/text()
" " string(r/nutrition/_at_calories)
lt/caloriesgtlt/calory-contentgt
Note the implicit string concatenation
Calories of recipes with olive oil
24
Let Clause Example (cntd.)
  • The query returns
  • ltcalory-contentgt
  • ltcaloriesgtBeef Parmesan 1167lt/caloriesgt
  • ltcaloriesgtLinguine alla Pescadora
    532lt/caloriesgt
  • lt/calory-contentgt

25
The Where Clause
  • Syntax where ltconditiongt
  • occurs before return clause
  • similar to predicates in XPath
  • comparisons on nodes
  • for node equality
  • ltlt and gtgt for document order
  • Example

for r in doc("recipes.xml")//recipe where
r//ingredient/_at_name"olive oil" return ...
26
Quantifiers
  • Syntax some/every var in ltnode-setgt
    satisfies ltexprgt
  • var is bound to all nodes in ltnode-setgt
  • Test succeeds if ltexprgt is true for some/every
    binding
  • Note if ltnode-setgt is empty, then
    some is false and all is
    true

27
Quantifiers (Example)
  • Recipes that have some compound ingredient
  • Recipes where every top level ingredient is
    non-compound

for r in doc("recipes.xml")//recipe where some
i in r/ingredient satisfies i/ingredient
return r/title
for r in doc("recipes.xml")//recipe where every
i in r/ingredient satisfies
not(i/ingredient) return r/title
28
Element Fusion
  • To every recipe, add the attribute calories!
  • ltresultgt
  • let rs doc("recipes.xml")//recipe
  • for r in rs return
  • ltrecipegt
  • r/nutrition/_at_calories
  • r/title
  • lt/recipegt
  • lt/resultgt

29
Element Fusion (cntd.)
  • The query result
  • ltresultgt
  • ltrecipe calories"1167"gt
  • lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt
  • lt/recipegt
  • ltrecipe calories"349"gtlttitlegtRicotta
    Pielt/titlegtlt/recipegt
  • ltrecipe calories"532"gtlttitlegtLinguine
    Pescadorolt/titlegtlt/recipegt
  • ltrecipe calories"612"gtlttitlegtZuppa
    Ingleselt/titlegtlt/recipegt
  • ltrecipe calories"8892"gt
  • lttitlegtCailles en Sarcophageslt/titlegt
  • lt/recipegt
  • lt/resultgt

30
Fusion with Mixed Syntax
  • We mix constructor and XMLSyntax
  • element result
  • let rs doc("recipes.xml")//recipe
  • for r in rs return
  • ltrecipegt
  • attribute calories r/nutrition/_at_calories
  • r/title
  • lt/recipegt

31
The Same with Constructor Syntax Only
  • element result
  • let rs doc("recipes.xml")//recipe
  • for r in rs return
  • element recipe
  • attribute calories r/nutrition/_at_calories,
  • r/title

32
Join
  • Pair every ingredient with the recipes where it
    is used!
  • let rs doc("recipes.xml")//recipe
  • for i in rs//ingredient
  • for r in rs
  • where r//ingredient/_at_namei/_at_name
  • return
  • ltusedingt
  • i/_at_name
  • r/title
  • lt/usedingt

33
Join (cntd.)
  • The query result
  • ltusedin name"beef cube steak"gt
  • lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt
  • lt/usedingt,
  • ltusedin name"onion, sliced into thin rings"gt
  • lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt
  • lt/usedingt,
  • ltusedin name"green bell pepper, sliced in
    rings"gt
  • lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt
  • lt/usedingt

34
Join Exercise
  • Return all pairs of ingredients such that
  • the ingredients have the same name,
  • but occur with different amounts
  • and return
  • the recipes where each of them is used
  • together with the amount being used in those
    recipes,
  • while returning every pair only once.
  • Could a query for these ingredients be expressed
    in XPath?

35
Document Inversion
  • For every ingredient, return all the recipes
    where it is used!
  • ltresultgt
  • let rs doc("recipes.xml")//recipe
  • for i in rs//ingredient
  • return
  • ltingredientgt
  • i/_at_
  • rs.//ingredient/_at_namei/_at_name/title
  • lt/ingredientgt
  • lt/resultgt

36
Document Inversion (cntd.)
  • The query result
  • ltresultgt ltingredient amount"1"
    name"Alchermes liquor" unit"cup"gt
    lttitlegtZuppa Ingleselt/titlegt lt/ingredientgt
  • ltingredient amount"2" name"olive oil"
    unit"tablespoon"gt lttitlegtBeef Parmesan with
    Garlic Angel Hair Pastalt/titlegt
    lttitlegtLinguine Pescadorolt/titlegt
  • lt/ingredientgt

37
Eliminating Duplicates
  • The function distinct-values(Node Set)
  • extracts the values of a sequence of nodes
  • creates a duplicate free list of values
  • Note the coercion nodes are cast as values!
  • Example
  • let rs doc("recipes.xml")//recipereturn
    distinct-values(rs//ingredient/_at_name)
  • yields
  • xdtuntypedAtomic("beef cube steak"),
  • xdtuntypedAtomic("onion, sliced into thin
    rings"),
  • ...

by the Galaxengine
38
Avoiding Multiple Results in a Join
  • We want that every ingredient is listed only
    once
  • Eliminate duplicates using distinct-values!
  • ltresultgtlet rs doc("recipes.xml")//recipe
    for in in distinct-values(
    rs//ingredient/_at_name) return ltrecipes
    with"in"gt rs.//ingredient/_at_namein/tit
    le lt/recipesgt lt/resultgt

39
Avoiding Multiple Results (cntd.)
  • The query result
  • ltresultgt ltrecipes with"beef cube steak"gt
    lttitlegtBeef Parmesan with Garlic Angel Hair
    Pastalt/titlegt lt/recipesgt
  • ltrecipes with"onion, sliced into thin
    rings"gt lttitlegtBeef Parmesan with Garlic
    Angel Hair Pastalt/titlegt lt/recipesgt...
    ltrecipes with"salt"gt lttitlegtLinguine
    Pescadorolt/titlegt lttitlegtCailles en
    Sarcophageslt/titlegt
  • lt/recipesgt
  • ...

40
The Order By Clause
  • Syntax order by expr ascending
    descending
  • for iname in doc("recipes.xml")//_at_name order
    by iname descending return string(iname)
  • yields
  • "whole peppercorns","whole baby clams","white
    sugar",...

41
The Order By Clause (cntd.)
  • let rs doc("recipes.xml")//_at_namefor r in
    rsorder by r/nutrition/_at_caloriesreturn
    r/title
  • In which order will the titles come?

42
The Order By Clause (cntd.)
  • The interpreter must be told whether the values
  • should be regarded as numbers or as strings

    (alphanumerical sorting is default)
  • for r in rsorder by number(r/nutrition/_at_calor
    ies)return r/title
  • Note
  • The query returns titles ...
  • but the ordering is according to calories, which
    do not appear in the output
  • Also possible in SQL! What if combined with
    distinct-values?

43
FLWOR Expresssions (pronounced flower)
  • We have now seen the main ingredients of XQuery
  • For and Let clauses, which can be mixed
  • a Where clause imposing conditions
  • an Order by clause, which determines the order of
    results
  • a Return clause, which constructs the output.
  • Combining these yields FLWOR expressions.

44
Conditionals
  • if (expr) then expr else expr
  • Example
  • let is doc("recipes.xml")//ingredientfor i
    in isnot(ingredient)let u if
    (not(i/_at_unit)) then attribute unit
    "pieces" else ()
  • creates an attribute unit"pieces" if none
    existsand an empty item list otherwise

45
Conditionals (cntd.)
  • We use the conditional to construct variants of
    ingredients
  • let is doc("recipes.xml")//ingredientfor i
    in isnot(ingredient) let u if
    (not(i/_at_unit)) then attribute
    "unit" "pieces" else
    ()returnltingredientgt i/_at_ ult/ingredientgt

46
Conditionals (cntd.)
  • The query result
  • ltingredient name"beef cube steak" amount"1.5"
    unit"pound"/gt,
  • ...
  • ltingredient name"eggs" amount"12"
    unit"pieces"/gt,

47
Exercises
  • Write queries that produce
  • A list, containing for every recipe the recipe's
    title element and an element with the number of
    calories
  • The same, ordered according to calories
  • The same, alphabetically ordered according to
    title
  • The same, ordered according to the fat content
  • The same, with title as attribute and calories as
    content.
  • A list, containing for every recipe the top level
    ingredients, dropping the lower level ingredients

48
Sample Solution 1
  • A list, containing for every recipe the recipe's
    title element and an element with the number of
    calories
  • ltresultgt
  • for r in doc("recipes.xml")//recipe
  • return
  • (r/title,
  • ltcaloriesgt
  • number(r//_at_calories)
  • lt/caloriesgt)
  • lt/resultgt

The results returned are 2-element lists. The
list constructor is ( . , . )
49
Sample Solution 6
  • ltresultsgt for r in doc("recipes.xml")//recipe
    return ltrecipegt attribute title
    r/title, for i in r/ingredient
    return if (not(i/ingredient)) then
    i else ltingredientgt
    i/_at_ lt/ingredientgt lt/recipegt
    lt/resultsgt

50
Aggregation
  • Aggregation functions count, sum, avg, min, max
  • Example The number of recipes with olive oil
  • let doc doc("recipes.xml)returnltnumbergt
    count(doc//recipe .//ingredient/_at_name
    "olive oil")lt/numbergt

51
Grouping and Aggregation
  • For each recipe, the number of simple
    ingredients
  • for r in doc("recipes.xml")//recipereturnltnumb
    ergt attribute title r/title/text()
    count(r//ingredientnot(ingredient))lt/numbergt

52
Grouping and Aggregation (cntd.)
  • The query result
  • ltnumber title"Beef Parmesan with Garlic Angel
    Hair Pasta"gt
    11lt/numbergt,
  • ltnumber title"Ricotta Pie"gt12lt/numbergt,
  • ltnumber title"Linguine Pescadoro"gt15lt/numbergt,
  • ltnumber title"Zuppa Inglese"gt8lt/numbergt,
  • ltnumber title"Cailles en Sarcophages"gt30lt/numbergt

53
Grouping and Aggregation (cntd.)
  • A list, containing for every ingredient, the
    number of occurrences of that ingredient
  • let d doc("recipes.xml")
  • let is distinct-values(d//ingredient/_at_name)
  • return
  • ltresultgt
  • for i in is
  • order by i
  • return
  • ltingredient name"i"gt
  • count(d//ingredient_at_namei)
  • lt/ingredientgt
  • lt/resultgt

54
Nested Aggregation
  • The recipe with the maximal number of calories!
  • let rs doc("recipes.xml")//recipelet
    maxCal max(rs//_at_calories)for r in
    rswhere r//_at_calories maxCalreturn
    string(r/title)
  • returns
  • "Cailles en Sarcophages"

55
User-defined Functions
  • declare function localfac(n as xsinteger) as
    xsinteger
  • if (n 0)
  • then 1
  • else n localfac(n - 1)
  • localfac(10)

FunctionDeclaration
FunctionCall
56
Example Nested Ingredients
  • declare function localnest(n as xsinteger,
    content as xsstring)
  • as element()
  • if (n 0)
  • then element ingredientcontent
  • else element ingredientlocalnest(n -
    1,content)
  • localnest(3,"Stuff")

57
What Does this Function Return?
  • declare function localdepth(n as node())
  • as xsinteger
  • if (fnempty(n/))
  • then 1
  • else let cdepths
  • for c in n/ return localdepth(c)
  • return fnmax(cdepths) 1

58
Exercise
  • Write a function
  • localelement-copy
  • that
  • takes as input a node ( XML tree)
  • produces as output a copy of the tree, but
    without the attributes
Write a Comment
User Comments (0)
About PowerShow.com