Lecture 5: XML and XQuery

Title: CS206 --- Electronic Commerce Author: Jeff Ullman Last modified by: KSU Created Date: 3/23/2002 8:14:09 PM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 98
Provided by: jeff466
Learn more at: https://www.cs.kent.edu
1
Lecture 5: XML and XQuery
2
Semistructured Data
  • Another data model, based on trees.
  • Motivation: flexible representation of data.
  • Often, data comes from multiple sources with
    differences in notation, meaning, etc.
  • Motivation: sharing of documents among systems
    and databases.

3
Graphs of Semistructured Data
  • Nodes = objects.
  • Labels on arcs (attributes, relationships).
  • Atomic values at leaf nodes (nodes with no arcs
    out).
  • Flexibility: no restriction on
      • Labels out of a node.
      • Number of successors with a given label.

4
Example: Data Graph
[Figure: data graph. The root has two beer arcs and one bar arc. Arc labels: name, manf, prize, year, award, servedAt, addr. Leaf values: Bud, A.B., Gold, 1995, M'lob, Maple, Joe's.]
5
XML
  • XML = Extensible Markup Language.
  • While HTML uses tags for formatting (e.g.,
    "italic"), XML uses tags for semantics (e.g.,
    "this is an address").
  • Key idea: create tag sets for a domain (e.g.,
    genomics), and translate all data into properly
    tagged XML documents.

6
Well-Formed and Valid XML
  • Well-Formed XML allows you to invent your own
    tags.
  • Similar to labels in semistructured data.
  • Valid XML involves a DTD (Document Type
    Definition), a grammar for tags.

7
Well-Formed XML
  • Start the document with a declaration, surrounded
    by <?xml ... ?>.
  • Normal declaration is
  • <?xml version = "1.0" standalone = "yes" ?>
  • "Standalone" = "no DTD provided."
  • Balance of document is a root tag surrounding
    nested tags.

8
Tags
  • Tags, as in HTML, are normally matched pairs, as
    <FOO> ... </FOO>.
  • Tags may be nested arbitrarily.
  • XML tags are case sensitive.

9
Example: Well-Formed XML
  • <?xml version = "1.0" standalone = "yes" ?>
  • <BARS>
  •   <BAR><NAME>Joe's Bar</NAME>
  •     <BEER><NAME>Bud</NAME>
  •       <PRICE>2.50</PRICE></BEER>
  •     <BEER><NAME>Miller</NAME>
  •       <PRICE>3.00</PRICE></BEER>
  •   </BAR>
  •   <BAR> ...
  • </BARS>
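
As a rough illustration of what "well-formed" means in practice, the sketch below feeds two strings to Python's standard-library XML parser. The sample strings and the helper name is_well_formed are assumptions made for this example; the elided second BAR of the slide is left out so the document parses.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on the BARS example (elided BAR omitted).
GOOD = """<?xml version="1.0" standalone="yes"?>
<BARS>
  <BAR><NAME>Joe's Bar</NAME>
    <BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME><PRICE>3.00</PRICE></BEER>
  </BAR>
</BARS>"""

# Tags are case sensitive, so <Name> is not closed by </NAME>.
BAD = "<BARS><BAR><Name>Joe's Bar</NAME></BAR></BARS>"


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

The parser only checks that tags are balanced and properly nested; it does not check the document against any DTD.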

10
XML and Semistructured Data
  • Well-Formed XML with nested tags is exactly the
    same idea as trees of semistructured data.
  • We shall see that XML also enables nontree
    structures, as does the semistructured data model.

11
Example
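  • The <BARS> XML document is a tree:

[Figure: tree rooted at BARS with three BAR children. The first BAR has a NAME leaf (Joe's Bar) and two BEER children, each with a NAME and a PRICE leaf (Bud, 2.50 and Miller, 3.00); the remaining BARs are elided (. . .).]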
12
DTD Structure
  • <!DOCTYPE <root tag> [
  •   <!ELEMENT <name> (<components>)>
  •   . . . more elements . . .
  • ]>

13
DTD Elements
  • The description of an element consists of its
    name (tag), and a parenthesized description of
    any nested tags.
  • Includes order of subtags and their multiplicity.
  • Leaves (text elements) have #PCDATA (Parsed
    Character DATA) in place of nested tags.

14
Example: DTD
  • <!DOCTYPE BARS [
  •   <!ELEMENT BARS (BAR*)>
  •   <!ELEMENT BAR (NAME, BEER+)>
  •   <!ELEMENT NAME (#PCDATA)>
  •   <!ELEMENT BEER (NAME, PRICE)>
  •   <!ELEMENT PRICE (#PCDATA)>
  • ]>

15
Element Descriptions
  • Subtags must appear in order shown.
  • A tag may be followed by a symbol to indicate its
    multiplicity.
  • * = zero or more.
  • + = one or more.
  • ? = zero or one.
  • Symbol | can connect alternative sequences of
    tags.
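
The multiplicity symbols (*, +, ?) and the alternation symbol (|) behave much like regular-expression operators over the sequence of child tags. The sketch below makes that analogy concrete for a BARS-style DTD. It is not a DTD validator: the CONTENT_MODELS table is hand-translated and hypothetical, and the helper names are assumptions for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models (assumption: each child tag becomes "TAG ").
CONTENT_MODELS = {
    "BARS": r"(BAR )*",        # BAR*
    "BAR": r"NAME (BEER )+",   # NAME, BEER+
    "BEER": r"NAME PRICE ",    # NAME, PRICE
}


def children_match(elem: ET.Element) -> bool:
    """Check one element's child-tag sequence against its content model."""
    model = CONTENT_MODELS.get(elem.tag)
    if model is None:          # text-only (#PCDATA) elements: no child tags
        return len(elem) == 0
    sequence = "".join(child.tag + " " for child in elem)
    return re.fullmatch(model, sequence) is not None


def conforms(root: ET.Element) -> bool:
    """Apply the check to every element in the tree."""
    return all(children_match(e) for e in root.iter())


ok = ET.fromstring(
    "<BARS><BAR><NAME>Joe's Bar</NAME>"
    "<BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER></BAR></BARS>"
)
no_beer = ET.fromstring("<BARS><BAR><NAME>Joe's Bar</NAME></BAR></BARS>")

print(conforms(ok))
print(conforms(no_beer))
```

The second document is rejected because BEER+ requires at least one BEER after the NAME.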

16
Example: Element Description
  • A name is an optional title (e.g., "Prof."), a
    first name, and a last name, in that order, or it
    is an IP address:
  • <!ELEMENT NAME (
  •   (TITLE?, FIRST, LAST) | IPADDR
  • )>

17
Use of DTDs
  • Set standalone = "no".
  • Either
  • Include the DTD as a preamble of the XML
    document, or
  • Follow DOCTYPE and the <root tag> by SYSTEM and a
    path to the file where the DTD can be found.

18
Example (a)
  • <?xml version = "1.0" standalone = "no" ?>
  • <!DOCTYPE BARS [
  •   <!ELEMENT BARS (BAR*)>
  •   <!ELEMENT BAR (NAME, BEER+)>
  •   <!ELEMENT NAME (#PCDATA)>
  •   <!ELEMENT BEER (NAME, PRICE)>
  •   <!ELEMENT PRICE (#PCDATA)>
  • ]>
  • <BARS>
  •   <BAR><NAME>Joe's Bar</NAME>
  •     <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER>
  •     <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER>
  •   </BAR>
  •   <BAR> ...
  • </BARS>

19
Example (b)
  • Assume the BARS DTD is in file bar.dtd.
  • <?xml version = "1.0" standalone = "no" ?>
  • <!DOCTYPE BARS SYSTEM "bar.dtd">
  • <BARS>
  •   <BAR><NAME>Joe's Bar</NAME>
  •     <BEER><NAME>Bud</NAME>
  •       <PRICE>2.50</PRICE></BEER>
  •     <BEER><NAME>Miller</NAME>
  •       <PRICE>3.00</PRICE></BEER>
  •   </BAR>
  •   <BAR> ...
  • </BARS>

20
Attributes
  • Opening tags in XML can have attributes.
  • In a DTD,
  • <!ATTLIST E . . . >
  • declares an attribute for element E, along with
    its datatype.

21
Example: Attributes
  • Bars can have an attribute kind, a character
    string describing the bar.
  • <!ELEMENT BAR (NAME, BEER*)>
  • <!ATTLIST BAR kind CDATA #IMPLIED>

22
Example: Attribute Use
  • In a document that allows BAR tags, we might see
  • <BAR kind = "sushi">
  •   <NAME>Akasaka</NAME>
  •   <BEER><NAME>Sapporo</NAME>
  •     <PRICE>5.00</PRICE></BEER>
  •   ...
  • </BAR>

23
IDs and IDREFs
  • Attributes can be pointers from one object to
    another.
  • Compare to HTML's NAME = "foo" and HREF = "#foo".
  • Allows the structure of an XML document to be a
    general graph, rather than just a tree.

24
Creating IDs
  • Give an element E an attribute A of type ID.
  • When using tag <E> in an XML document, give its
    attribute A a unique value.
  • Example:
  • <E A = "xyz">

25
Creating IDREFs
  • To allow objects of type F to refer to another
    object with an ID attribute, give F an attribute
    of type IDREF.
  • Or, let the attribute have type IDREFS, so the F
    object can refer to any number of other objects.

26
Example: IDs and IDREFs
  • Let's redesign our BARS DTD to include both BAR
    and BEER subelements.
  • Both bars and beers will have ID attributes
    called name.
  • Bars have SELLS subobjects, consisting of a
    number (the price of one beer) and an IDREF
    theBeer leading to that beer.
  • Beers have attribute soldBy, which is an IDREFS
    leading to all the bars that sell it.

27
The DTD
  • <!DOCTYPE BARS [
  •   <!ELEMENT BARS (BAR*, BEER*)>
  •   <!ELEMENT BAR (SELLS+)>
  •   <!ATTLIST BAR name ID #REQUIRED>
  •   <!ELEMENT SELLS (#PCDATA)>
  •   <!ATTLIST SELLS theBeer IDREF #REQUIRED>
  •   <!ELEMENT BEER EMPTY>
  •   <!ATTLIST BEER name ID #REQUIRED>
  •   <!ATTLIST BEER soldBy IDREFS #IMPLIED>
  • ]>

28
Example: Document
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <SELLS theBeer = "Bud">2.50</SELLS>
  •     <SELLS theBeer = "Miller">3.00</SELLS>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>
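
One way to see how ID and IDREF attributes turn the tree into a graph is to resolve the references by hand. The sketch below is an assumption-laden illustration: Python's standard parser does not read attribute types from a DTD, so the code simply treats every name attribute as an ID. The Miller BEER element and the helper deref are hypothetical additions so that every reference has a target.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the example document; the Miller BEER element is an
# assumption added so that every IDREF has a target.
DOC = """<BARS>
  <BAR name="JoesBar">
    <SELLS theBeer="Bud">2.50</SELLS>
    <SELLS theBeer="Miller">3.00</SELLS>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>"""

root = ET.fromstring(DOC)

# Index every element that carries the ID attribute "name".
by_id = {e.get("name"): e for e in root.iter() if e.get("name") is not None}


def deref(idrefs: str) -> list:
    """Follow an IDREF or IDREFS value (space-separated IDs) to elements."""
    return [by_id[ref] for ref in idrefs.split()]


# Follow each SELLS element's theBeer pointer to the BEER it names.
for sells in root.find("BAR").findall("SELLS"):
    beer = deref(sells.get("theBeer"))[0]
    print(beer.tag, beer.get("name"), sells.text)

# Follow the first BEER's soldBy pointers back to the bars.
print([bar.get("name") for bar in deref(root.find("BEER").get("soldBy"))])
```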

29
Empty Elements
  • We can do all the work of an element in its
    attributes.
  • Like BEER in previous example.
  • Another example: SELLS elements could have
    attribute price rather than a value that is a
    price.

30
Example: Empty Element
  • In the DTD, declare
  • <!ELEMENT SELLS EMPTY>
  • <!ATTLIST SELLS theBeer IDREF #REQUIRED>
  • <!ATTLIST SELLS price CDATA #REQUIRED>
  • Example use:
  • <SELLS theBeer = "Bud" price = "2.50" />

31
XPath
  • Path Expressions
  • Conditions

32
Paths in XML Documents
  • XPath is a language for describing paths in XML
    documents.
  • Really think of the semistructured data graph and
    its paths.

33
Example: DTD
  • <!DOCTYPE BARS [
  •   <!ELEMENT BARS (BAR*, BEER*)>
  •   <!ELEMENT BAR (PRICE+)>
  •   <!ATTLIST BAR name ID #REQUIRED>
  •   <!ELEMENT PRICE (#PCDATA)>
  •   <!ATTLIST PRICE theBeer IDREF #REQUIRED>
  •   <!ELEMENT BEER EMPTY>
  •   <!ATTLIST BEER name ID #REQUIRED>
  •   <!ATTLIST BEER soldBy IDREFS #IMPLIED>
  • ]>

34
Example: Document
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>

35
Path Descriptors
  • Simple path descriptors are sequences of tags
    separated by slashes (/).
  • If the descriptor begins with /, then the path
    starts at the root and has those tags, in order.
  • If the descriptor begins with //, then the path
    can start anywhere.

36
Value of a Path Descriptor
  • Each path descriptor, applied to a document, has
    a value that is a sequence of elements.
  • An element is an atomic value or a node.
  • A node is matching tags and everything in
    between.
  • I.e., a node of the semistructured graph.

37
Example: /BARS/BAR/PRICE
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>

38
Example: //PRICE
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>
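
The two path descriptors can be tried out with Python's standard library, which supports a limited subset of XPath. This is only a sketch: ElementTree paths are written relative to the root element, so /BARS/BAR/PRICE becomes ./BAR/PRICE and //PRICE becomes .//PRICE. The variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar SuesBar"/>
</BARS>"""

root = ET.fromstring(DOC)  # root is the BARS element

# /BARS/BAR/PRICE: start at the root, then follow BAR, then PRICE.
from_root = root.findall("./BAR/PRICE")

# //PRICE: PRICE elements at any depth.
anywhere = root.findall(".//PRICE")

print([p.text for p in from_root])
print([p.text for p in anywhere])
```

For this document both descriptors select the same two PRICE elements, because every PRICE happens to sit directly under a BAR.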

39
Wild-Card
  • A star (*) in place of a tag represents any one
    tag.
  • Example: /*/*/PRICE represents all price objects
    at the third level of nesting.

40
Example: /BARS/*
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>

41
Attributes
  • In XPath, we refer to attributes by prepending @
    to their name.
  • Attributes of a tag may appear in paths as if
    they were nested within that tag.

42
Example: /BARS/*/@name
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>
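
A minimal sketch of the wildcard and attribute steps, again using Python's limited XPath subset. ElementTree paths cannot end in an attribute step, so the attribute is read from each selected element instead; that workaround is an artifact of this library, not of XPath.

```python
import xml.etree.ElementTree as ET

DOC = """<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar SuesBar"/>
</BARS>"""

root = ET.fromstring(DOC)

# /BARS/*: every child element of the root, whatever its tag.
children = root.findall("./*")

# /BARS/*/@name: read the name attribute of each selected child.
names = [e.get("name") for e in children if e.get("name") is not None]

print([e.tag for e in children])
print(names)
```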

43
Selection Conditions
  • A condition inside [...] may follow a tag.
  • If so, then only paths that have that tag and
    also satisfy the condition are included in the
    result of a path expression.

44
Example: Selection Condition
  • /BARS/BAR[PRICE < 2.75]/PRICE
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>

45
Example: Attribute in Selection
  • /BARS/BAR/PRICE[@theBeer = "Miller"]
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
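
Selection conditions can be approximated in Python as follows. This is a sketch under stated limits: ElementTree supports attribute-equality predicates directly, but not numeric comparisons, so the price test is written as an ordinary Python filter that keeps the PRICE elements whose own value is below 2.75.

```python
import xml.etree.ElementTree as ET

DOC = """<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
</BARS>"""

root = ET.fromstring(DOC)

# Attribute condition, supported directly by ElementTree.
miller = root.findall("./BAR/PRICE[@theBeer='Miller']")

# Numeric condition on each PRICE's own value, filtered in Python.
cheap = [p for p in root.findall("./BAR/PRICE") if float(p.text) < 2.75]

print([p.text for p in miller])
print([p.get("theBeer") for p in cheap])
```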

46
Axes
  • In general, path expressions allow us to start at
    the root and execute steps to find a sequence of
    nodes at each step.
  • At each step, we may follow any one of several
    axes.
  • The default axis is child --- go to all the
    children of the current set of nodes.

47
Example: Axes
  • /BARS/BEER is really shorthand for
    /BARS/child::BEER .
  • @ is really shorthand for the attribute:: axis.
  • Thus, /BARS/BEER[@name = "Bud"] is shorthand for
  • /BARS/BEER[attribute::name = "Bud"]

48
More Axes
  • Some other useful axes are
  • parent:: = parent(s) of the current node(s).
  • descendant-or-self:: = the current node(s) and
    all descendants.
  • Note: // is really shorthand for this axis.
  • ancestor::, ancestor-or-self::, etc.

49
XQuery
  • Values
  • FLWR Expressions
  • Other Expressions

50
XQuery
  • XQuery extends XPath to a query language that has
    power similar to SQL.
  • XQuery is an expression language.
  • Like relational algebra --- any XQuery expression
    can be an argument of any other XQuery
    expression.
  • Unlike RA, with the relation as the sole
    datatype, XQuery has a subtle type system.

51
The XQuery Type System
  • Atomic values: strings, integers, etc.
  • Also, certain constructed values like true(),
    date("2004-09-30").
  • Nodes.
  • Seven kinds.
  • We'll only worry about four, on next slide.

52
Some Node Types
  • Element Nodes are like nodes of semistructured
    data.
  • Described by !ELEMENT declarations in DTDs.
  • Attribute Nodes are attributes, described by
    !ATTLIST declarations in DTDs.
  • Text Nodes = #PCDATA.
  • Document Nodes represent files.

53
Example: Document
  • <BARS>
  •   <BAR name = "JoesBar">
  •     <PRICE theBeer = "Bud">2.50</PRICE>
  •     <PRICE theBeer = "Miller">3.00</PRICE>
  •   </BAR>
  •   <BEER name = "Bud" soldBy = "JoesBar SuesBar" />
  • </BARS>

54
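Example: Nodes
[Figure: node tree for the example document. Element nodes (green): BARS, BAR, BEER, and two PRICE nodes. Attribute nodes (gold): name = "JoesBar", name = "Bud", soldBy, theBeer = "Bud", theBeer = "Miller". Text nodes (purple): 2.50 and 3.00.]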
55
Document Nodes
  • Form: document("<file name>").
  • Establishes a document to which a query applies.
  • Example: document("/usr/ullman/bars.xml")

56
FLWR Expressions
  1. One or more for and/or let clauses.
  2. Then an optional where clause.
  3. A return clause.

57
Semantics of FLWR Expressions
  • Each for creates a loop.
  • let produces only a local definition.
  • At each iteration of the nested loops, if any,
    evaluate the where clause.
  • If the where clause returns TRUE, invoke the
    return clause, and append its value to the output.

58
FOR Clauses
  • for <variable> in <expression>, . . .
  • Variables begin with $.
  • A for-variable takes on each item in the sequence
    denoted by the expression, in turn.
  • Whatever follows this for is executed once for
    each value of the variable.

59
Example: FOR
  • for $beer in document("bars.xml")/BARS/BEER/@name
  • return
  •   <BEERNAME> {$beer} </BEERNAME>
  • $beer ranges over the name attributes of all
    beers in our example document.
  • Result is a list of tagged names, like
    <BEERNAME>Bud</BEERNAME> <BEERNAME>Miller</BEERNAME> . . .

60
LET Clauses
  • let <variable> := <expression>, . . .
  • Value of the variable becomes the sequence of
    items defined by the expression.
  • Note: let does not cause iteration; for does.

61
Example: LET
  • let $d := document("bars.xml")
  • let $beers := $d/BARS/BEER/@name
  • return
  •   <BEERNAMES> {$beers} </BEERNAMES>
  • Returns one element with all the names of the
    beers, like
  • <BEERNAMES>Bud Miller . . .</BEERNAMES>
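
The difference between for and let can be mimicked in Python: a for clause behaves like a loop that runs the return clause once per item, while a let clause binds the whole sequence to a variable once. The sketch below is only an analogy, not an XQuery engine; the two-beer document and the variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<BARS>
  <BEER name="Bud"/>
  <BEER name="Miller"/>
</BARS>"""

root = ET.fromstring(DOC)
names = [b.get("name") for b in root.findall("./BEER")]

# for: the return clause runs once per item, giving one element per name.
for_result = [f"<BEERNAME>{n}</BEERNAME>" for n in names]

# let: the variable holds the whole sequence, so return runs once.
let_result = "<BEERNAMES>" + " ".join(names) + "</BEERNAMES>"

print(for_result)
print(let_result)
```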

62
Following IDREFs
  • XQuery (but not XPath) allows us to use paths
    that follow attributes that are IDREFs.
  • If x denotes a sequence of one or more IDREFs,
    then x=>y denotes all the elements with tag y
    whose IDs are one of these IDREFs.

63
Example
  • Find all the beer elements where the beer is sold
    by Joe's Bar for less than 3.00.
  • Strategy:
  • $beer will for-loop over all beer elements.
  • For each $beer, let $joe be either the Joe's-Bar
    element, if Joe sells the beer, or the empty
    sequence if not.
  • Test whether $joe sells the beer for < 3.00.

64
Example: The Query
  • let $d := document("bars.xml")
  • for $beer in $d/BARS/BEER
  • let $joe := $beer/@soldBy=>BAR[@name="JoesBar"]
  • let $joePrice := $joe/PRICE[@theBeer=$beer/@name]
  • where $joePrice < 3.00
  • return <CHEAPBEER> {$beer} </CHEAPBEER>
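
The strategy of this query can be traced step by step in Python. The sketch below is an analogy rather than an XQuery implementation: IDREFs are resolved through a hand-built dictionary, and the sample document (including the Miller BEER element) is an assumption made so the example is self-contained.

```python
import xml.etree.ElementTree as ET

DOC = """<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>"""

root = ET.fromstring(DOC)
bars_by_id = {b.get("name"): b for b in root.findall("./BAR")}

cheap = []
for beer in root.findall("./BEER"):  # for $beer in ...
    # let $joe: follow soldBy, keep only the bar named JoesBar (maybe empty).
    joe = [bars_by_id[r] for r in beer.get("soldBy", "").split()
           if r in bars_by_id and r == "JoesBar"]
    # let $joePrice: Joe's PRICE elements for this beer.
    joe_price = [p for bar in joe for p in bar.findall("PRICE")
                 if p.get("theBeer") == beer.get("name")]
    # where: true if some price in the sequence is below 3.00.
    if any(float(p.text) < 3.00 for p in joe_price):
        cheap.append(beer.get("name"))

print(cheap)
```

Miller is excluded because its price at Joe's Bar is exactly 3.00, which is not less than 3.00.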

65
Order-By Clauses
  • FLWR is really FLWOR: an order-by clause can
    precede the return.
  • Form: order by <expression>
  • With optional ascending or descending.
  • The expression is evaluated for each output
    element.
  • Determines placement in output sequence.

66
Example: Order-By
  • List all prices for Bud, lowest first.
  • let $d := document("bars.xml")
  • for $p in $d/BARS/BAR/PRICE[@theBeer="Bud"]
  • order by $p
  • return $p
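
A rough Python counterpart of this order-by query is a sort over the selected PRICE elements. The two-bar document below is hypothetical, and the sketch sorts by numeric value, which is an assumption about the intended ordering.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two bars that both sell Bud.
DOC = """<BARS>
  <BAR name="JoesBar"><PRICE theBeer="Bud">2.50</PRICE></BAR>
  <BAR name="SuesBar"><PRICE theBeer="Bud">2.25</PRICE></BAR>
</BARS>"""

root = ET.fromstring(DOC)

# for $p in .../PRICE[@theBeer="Bud"]
bud_prices = root.findall("./BAR/PRICE[@theBeer='Bud']")

# order by $p (ascending by numeric value)
ordered = sorted(bud_prices, key=lambda p: float(p.text))

print([p.text for p in ordered])
```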

67
Predicates
  • Normally, conditions imply existential
    quantification.
  • Example: /BARS/BAR[@name] means "all the bars
    that have a name."
  • Example: /BARS/BAR[@name="JoesBar"]/PRICE =
    /BARS/BAR[@name="SuesBar"]/PRICE means "Joe and
    Sue have at least one price in common."
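
The existential reading of a comparison between two sequences can be sketched as a nested any() in Python. The function name general_eq and the price lists are assumptions for this illustration.

```python
def general_eq(left, right):
    """True if some item on the left equals some item on the right."""
    return any(a == b for a in left for b in right)


joe_prices = [2.50, 3.00]   # hypothetical prices at Joe's Bar
sue_prices = [2.25, 3.00]   # hypothetical prices at Sue's Bar

print(general_eq(joe_prices, sue_prices))
print(general_eq(joe_prices, [2.25]))
print(general_eq(joe_prices, []))
```

Comparing against an empty sequence is always false under this reading, since there is no item to match.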

68
Path Expression Examples
[Figure: object graph for a bibliography document. The root object o1 (Doc), labeled Bib, has two paper arcs and one book arc leading to objects o12, o24, and o29, which are also connected by references arcs. Further arc labels: author, title, year, page, http, publisher, first, last, firstname, lastname. Leaf values include Serge, Abiteboul, Victor, Vianu, 1997, and several numbers.]
Bib/paper = <o12,o29>
Bib/book/publisher = <o51>
Bib/paper/author/lastname = <o71,206>
Note that order of elements matters!
69
FOR vs. LET Example
FOR $x IN document("bib.xml")/bib/book RETURN
<result> $x </result>
Returns: <result> <book>...</book></result>
<result> <book>...</book></result> <result>
<book>...</book></result> ...
LET $x := document("bib.xml")/bib/book RETURN
<result> $x </result>
Returns: <result> <book>...</book>
<book>...</book> <book>...</book>
... </result>
70
XQuery Example 1
  • Find all book titles published after 1995

FOR $x IN document("bib.xml")/bib/book WHERE
$x/year > 1995 RETURN $x/title
Result: <title> abc </title> <title> def
</title> <title> ghi </title>
71
XQuery Example 2
  • For each author of a book by Morgan Kaufmann,
    list all books she published

FOR $a IN distinct(document("bib.xml")
/bib/book[publisher="Morgan
Kaufmann"]/author) RETURN <result>
$a, FOR $t IN
/bib/book[author=$a]/title
RETURN $t </result>
distinct = a function that eliminates duplicates
(after converting inputs to atomic values)
72
Results for Example 2
  • <result>
  •   <author>Jones</author>
  •   <title> abc </title>
  •   <title> def </title>
  • </result>
  • <result>
  •   <author> Smith </author>
  •   <title> ghi </title>
  • </result>

Observe how nested structure of result elements
is determined by the nested structure of the
query.
73
XQuery Example 3
<big_publishers> FOR $p IN
distinct(document("bib.xml")//publisher)
LET $b := document("bib.xml")/book[publisher =
$p] WHERE count($b) > 100 RETURN
$p </big_publishers>
For each publisher p
  • Let the list of books
  • published by p be b

Count the books in b, and return p if count(b) > 100
count = (aggregate) function that returns the
number of elements
74
XQuery Example 4
  • Find books whose price is larger than average

LET $a := avg(document("bib.xml")/bib/book/price) FOR
$b in document("bib.xml")/bib/book WHERE
$b/price > $a RETURN $b
75
Collections in XQuery
  • Ordered and unordered collections
  • /bib/book/author = an ordered collection
  • Distinct(/bib/book/author) = an unordered
    collection
  • Examples
  • LET $a := /bib/book → $a is a collection; statement
    iterates over all books in the collection
  • $b/author → also a collection (several
    authors...)

However
RETURN <result> $b/author </result>
Returns a single collection! <result>
<author>...</author>
<author>...</author>
<author>...</author> ...
</result>
76
Collections in XQuery
  • What about collections in expressions ?
  • $b/price → list of n
    prices
  • $b/price * 0.7 → list of n numbers??
  • $b/price * $b/quantity → list of n x m numbers ??
  • Valid only if the two sequences have at most one
    element
  • Atomization
  • $book1/author eq "Kennedy" - Value Comparison
  • $book1/author = "Kennedy" - General Comparison

77
Sorting in XQuery
<publisher_list>
  FOR $p IN distinct(document("bib.xml")//publisher)
  ORDERBY $p
  RETURN <publisher> <name> $p/text() </name> ,
           FOR $b IN document("bib.xml")//book[publisher = $p]
           ORDERBY $b/price DESCENDING
           RETURN <book>
                    $b/title ,
                    $b/price
                  </book>
         </publisher>
</publisher_list>
78
Conditional Expressions If-Then-Else
FOR $h IN //holding ORDERBY $h/title RETURN
<holding> $h/title,
IF $h/@type = "Journal"
THEN $h/editor
ELSE $h/author
</holding>
79
Existential Quantifiers
FOR $b IN //book WHERE SOME $p IN $b//para
SATISFIES contains($p, "sailing") AND
contains($p, "windsurfing") RETURN $b/title
80
Universal Quantifiers
FOR $b IN //book WHERE EVERY $p IN $b//para
SATISFIES contains($p, "sailing") RETURN
$b/title
81
Other Stuff in XQuery
  • Before and After
  • for dealing with order in the input
  • Filter
  • deletes some edges in the result tree
  • Recursive functions
  • Namespaces
  • References, links
  • Lots more stuff

82
Appendix: XML Schema and XQuery Data Model
83
XML Schema
  • Includes primitive data types (integers, strings,
    dates, etc.)
  • Supports value-based constraints (integers > 100)
  • User-definable structured types
  • Inheritance (extension or restriction)
  • Foreign keys
  • Element-type reference constraints

84
Sample XML Schema
  • <schema version="1.0" xmlns="http://www.w3.org/1999/XMLSchema">
  • <element name="author" type="string" />
  • <element name="date" type="date" />
  • <element name="abstract">
  •   <type>
  •   </type>
  • </element>
  • <element name="paper">
  •   <type>
  •     <attribute name="keywords" type="string"/>
  •     <element ref="author" minOccurs="0" maxOccurs="*" />
  •     <element ref="date" />
  •     <element ref="abstract" minOccurs="0" maxOccurs="1" />
  •     <element ref="body" />
  •   </type>
  • </element>
  • </schema>

85
XML-Query Data Model
  • Describes XML data as a tree
  • Node = DocNode | ElemNode | ValueNode | AttrNode | NSNode | PINode | CommentNode | InfoItemNode | RefNode

http://www.w3.org/TR/query-datamodel/ (2/2001)
86
XML-Query Data Model
  • Element node (simplified definition)
  • elemNode : (QNameValue,
    {AttrNode}, [ElemNode |
    ValueNode]) → ElemNode
  • QNameValue means "a tag name"

Reads: "Give me a tag, a set of attributes, a
list of elements/values, and I will return an
element"
87
XML Query Data Model
  • Example

book1 = elemNode(book, {price2, currency3},
[title4, author5, author6,
author7, year8])
price2 = attrNode(...) /* next */
currency3 = attrNode(...)
title4 = elemNode(title, string9)
<book price = "55" currency = "USD">
<title> Foundations </title> <author>
Abiteboul </author> <author> Hull </author>
<author> Vianu </author> <year> 1995
</year> </book>
89
XQuery Values
  • Item = node or atomic value.
  • Value = ordered sequence of zero or more items.
  • Examples
  • () = empty sequence.
  • ("Hello", "World")
  • ("Hello", <PRICE>2.50</PRICE>, 10)

90
Nesting of Sequences Ignored
  • A value can, in principle, be an item of another
    value.
  • But nested list structures are expanded.
  • Example: ((1,2),(),(3,(4,5))) = (1,2,3,4,5) =
    1,2,3,4,5.
  • Important when values are computed by
    concatenating other values.
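
The flattening rule can be imitated in Python, with nested tuples standing in for nested sequences. This is only an analogy; the function name flatten is an assumption for this example.

```python
def flatten(value):
    """Expand nested tuples into one flat list of items."""
    if isinstance(value, tuple):
        items = []
        for part in value:
            items.extend(flatten(part))  # empty tuples contribute nothing
        return items
    return [value]


print(flatten(((1, 2), (), (3, (4, 5)))))
```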

91
Effective Boolean Values
  • The effective boolean value (EBV) of an
    expression is
  • The actual value if the expression is of type
    boolean.
  • FALSE if the expression evaluates to 0, ""
    (the empty string), or () (the empty sequence).
  • TRUE otherwise.

92
EBV Examples
  1. @name="JoesBar" has EBV TRUE or FALSE, depending
    on whether the name attribute is "JoesBar".
  2. /BARS/BAR[@name="GoldenRail"] has EBV TRUE if
    some bar is named the Golden Rail, and FALSE if
    there is no such bar.

93
Boolean Operators
  • E1 and E2, E1 or E2, not(E), if
    (E1) then E2 else E3 apply to any expressions.
  • Take EBVs of the expressions first.
  • Example: not(3 eq 5 or 0) has value TRUE.
  • Also true() and false() are functions that
    return values TRUE and FALSE.
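
The simplified effective-boolean-value rules from these slides can be written down as a small Python function. This sketch models only the rules as stated here (booleans as themselves; 0, the empty string, and the empty sequence as false; everything else as true), not the full rules of the XQuery specification. The name ebv is an assumption, and the empty tuple stands in for the empty sequence.

```python
def ebv(value):
    """Effective boolean value under the simplified rules of the slides."""
    if isinstance(value, bool):      # already boolean: use the actual value
        return value
    if value == 0 or value == "" or value == ():
        return False                 # 0, empty string, or empty sequence
    return True


# not(3 eq 5 or 0): take EBVs of the operands first, then combine.
example = not (ebv(3 == 5) or ebv(0))

print(example)
print(ebv(()), ebv("Bud"))
```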

94
Quantifier Expressions
  • some $x in E1 satisfies E2
  • Evaluate the sequence E1.
  • Let $x (any variable) be each item in the
    sequence, and evaluate E2.
  • Return TRUE if E2 has EBV TRUE for at least one
    $x.
  • Analogously
  • every $x in E1 satisfies E2
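
The two quantifiers correspond closely to Python's built-in any() and all(). The sketch below uses a hypothetical list of paragraph strings in place of a sequence of para elements.

```python
# Hypothetical paragraph texts standing in for $b//para.
paras = ["sailing and windsurfing", "sailing only"]

# some $p in paras satisfies (contains sailing and contains windsurfing)
some_both = any("sailing" in p and "windsurfing" in p for p in paras)

# every $p in paras satisfies contains($p, "sailing")
every_sailing = all("sailing" in p for p in paras)

# every $p in paras satisfies contains($p, "windsurfing")
every_windsurfing = all("windsurfing" in p for p in paras)

print(some_both, every_sailing, every_windsurfing)

# Over an empty sequence, "some" is false and "every" is true.
print(any(True for _ in []), all(False for _ in []))
```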

95
Document Order
  • Comparison by document order: << and >>.
  • Example: $d/BARS/BEER[@name="Bud"] <<
    $d/BARS/BEER[@name="Miller"] is true iff the Bud
    element appears before the Miller element in the
    document $d.

96
Set Operators
  • union, intersect, except operate on sequences of
    nodes.
  • Meanings analogous to SQL.
  • Result eliminates duplicates.
  • Result appears in document order.
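
A possible Python sketch of these three operators on node sequences follows. Nodes are compared by identity, duplicates are dropped, and results are put back into document order. The helper names (in_doc_order, union, intersect, except_) and the sample document are assumptions for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<BARS><BAR name="JoesBar"/><BAR name="SuesBar"/><BEER name="Bud"/></BARS>'
)

# Position of each node in document order, keyed by object identity.
order = {id(e): i for i, e in enumerate(root.iter())}


def in_doc_order(nodes):
    """Drop duplicate nodes and sort the rest into document order."""
    unique = {id(n): n for n in nodes}
    return sorted(unique.values(), key=lambda n: order[id(n)])


def union(a, b):
    return in_doc_order(a + b)


def intersect(a, b):
    ids = {id(n) for n in b}
    return in_doc_order([n for n in a if id(n) in ids])


def except_(a, b):
    ids = {id(n) for n in b}
    return in_doc_order([n for n in a if id(n) not in ids])


bars = root.findall("./BAR")
everything = root.findall("./*")

print([n.get("name") for n in union(everything[::-1], bars)])
print([n.get("name") for n in intersect(everything, bars)])
print([n.get("name") for n in except_(everything, bars)])
```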

97
Other Operators
  • Use Fortran comparison operators to compare
    atomic values only.
  • eq, ne, gt, ge, lt, le.
  • Arithmetic operators: +, -, *, div, idiv, mod.
  • Apply to any expressions that yield arithmetic or
    date/time values.