COMP3311: DTD, Schema, XQuery - PowerPoint PPT Presentation

Provided by: cseUn
Transcript and Presenter's Notes

1
COMP3311: DTD, Schema, XQuery
  • Week 13

Modified from Prof. Dan Suciu's notes.
2
XML Document Type Definitions
  • part of the original XML specification
  • an XML document may have a DTD
  • terminology for XML:
  • well-formed: if tags are correctly closed
  • valid: if it has a DTD and conforms to it
  • validation is useful in data exchange
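
The well-formed half of this terminology can be tried out with a short Python sketch. It assumes only the standard library's xml.etree.ElementTree, which checks well-formedness but does not validate against a DTD; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML (tags nested and closed correctly)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<person><name>John</name></person>"
bad = "<person><name>John</person>"   # <name> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

Checking validity would additionally need a validating parser that reads the DTD, which this sketch does not attempt.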

3
Very Simple DTD
<!DOCTYPE company [
  <!ELEMENT company ((person|product)*)>
  <!ELEMENT person (ssn, name, office, phone?)>
  <!ELEMENT ssn (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT office (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT product (pid, name, description?)>
  <!ELEMENT pid (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
]>
4
Very Simple DTD
Example of a valid XML document:
<company>
  <person>
    <ssn> 123456789 </ssn>
    <name> John </name>
    <office> B432 </office>
    <phone> 1234 </phone>
  </person>
  <person>
    <ssn> 987654321 </ssn>
    <name> Jim </name>
    <office> B123 </office>
  </person>
  <product> ... </product>
  ...
</company>
5
Content Model
  • Element content: what we can put in an element
    (aka content model)
  • Content model:
  • Complex = a regular expression over other
    elements
  • Text-only = #PCDATA
  • Empty = EMPTY
  • Any = ANY
  • Mixed content = (#PCDATA | A | B | C)*
  • (i.e. very restricted)
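
Since a complex content model is a regular expression over child elements, it can be imitated with an ordinary regex over the sequence of child tag names. The sketch below does this for the person model (ssn, name, office, phone?); the names PERSON_MODEL and matches_person_model and the sample elements are assumptions for illustration, and this is not how a real DTD validator is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Content model (ssn, name, office, phone?) written as a regular expression
# over child tag names, each tag followed by one space.
PERSON_MODEL = re.compile(r"ssn name office (phone )?")

def matches_person_model(person: ET.Element) -> bool:
    """Check the sequence of child tags against the person content model."""
    tags = "".join(child.tag + " " for child in person)
    return PERSON_MODEL.fullmatch(tags) is not None

ok = ET.fromstring(
    "<person><ssn>1</ssn><name>Jim</name><office>B123</office></person>")
wrong_order = ET.fromstring(
    "<person><name>Jim</name><ssn>1</ssn><office>B123</office></person>")

print(matches_person_model(ok))
print(matches_person_model(wrong_order))
```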

6
Attributes in DTDs
<!ELEMENT person (ssn, name, office, phone?)>
<!ATTLIST person age CDATA #REQUIRED>

<person age="25">
  <name> .... </name>
  ...
</person>
7
Attributes in DTDs
<!ELEMENT person (ssn, name, office, phone?)>
<!ATTLIST person age     CDATA  #REQUIRED
                 id      ID     #REQUIRED
                 manager IDREF  #REQUIRED
                 manages IDREFS #REQUIRED>

<person age="25" id="p29432" manager="p48293" manages="p34982 p423234">
  <name> .... </name>
  ...
</person>
8
Attributes in DTDs
  • Types:
  • CDATA = string
  • ID = key
  • IDREF = foreign key
  • IDREFS = foreign keys
    separated by space
  • (Monday | Wednesday | Friday) = enumeration
  • NMTOKEN, NMTOKENS, ENTITY = you
    don't want to know this
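
The key / foreign key reading of ID, IDREF and IDREFS can be sketched as a reference check in Python. ElementTree does not read the DTD, so the sketch simply assumes that id is the ID attribute, manager an IDREF and manages an IDREFS on person; the function name and the sample data are made up.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<company>
  <person id="p1" manager="p2"/>
  <person id="p2" manager="p2" manages="p1 p3"/>
</company>
""")

def dangling_references(root):
    """Return every IDREF/IDREFS value that matches no person id."""
    ids = {p.get("id") for p in root.iter("person")}
    dangling = []
    for p in root.iter("person"):
        # manager holds one reference; manages holds a space-separated list
        refs = [p.get("manager")] + p.get("manages", "").split()
        dangling += [r for r in refs if r not in ids]
    return dangling

print(dangling_references(doc))
```

A validating parser would also reject duplicate ID values, which this sketch does not check.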

9
Attributes in DTDs
  • Kind:
  • #REQUIRED
  • #IMPLIED = optional
  • value = default value
  • #FIXED value = the only value allowed

10
Using DTDs
  • Must include in the XML document
  • Either include the entire DTD:
  • <!DOCTYPE rootElement [ ....... ]>
  • Or include a reference to it:
  • <!DOCTYPE rootElement SYSTEM "http://www.mydtd.org">
  • Or mix the two... (e.g. to override the external
    definition)

11
DTDs as Grammars
<!DOCTYPE paper [
  <!ELEMENT paper (section*)>
  <!ELEMENT section ((title,section*) | text)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>

<paper>
  <section> <text> </text> </section>
  <section> <title> </title>
    <section> ... </section>
    <section> ... </section>
  </section>
</paper>
12
DTDs as Grammars
  • A DTD = a grammar
  • A valid XML document = a parse tree for that
    grammar
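
Reading the paper DTD as a grammar, each ELEMENT declaration becomes one recursive rule. The Python sketch below hand-codes the two rules paper -> section* and section -> (title, section*) | text; the function names and test documents are assumptions for illustration, and text content is not checked.

```python
import xml.etree.ElementTree as ET

def valid_section(sec) -> bool:
    """section -> (title, section*) | text"""
    tags = [c.tag for c in sec]
    if tags == ["text"]:
        return True
    if tags and tags[0] == "title":
        rest = list(sec)[1:]
        return all(c.tag == "section" and valid_section(c) for c in rest)
    return False

def valid_paper(paper) -> bool:
    """paper -> section*"""
    return paper.tag == "paper" and all(
        c.tag == "section" and valid_section(c) for c in paper)

doc = ET.fromstring(
    "<paper>"
    "<section><text/></section>"
    "<section><title/><section><text/></section>"
    "<section><title/></section></section>"
    "</paper>")
bad = ET.fromstring("<paper><section><text/><title/></section></paper>")

print(valid_paper(doc))
print(valid_paper(bad))
```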

13
DTDs as Schemas
  • Not so well suited:
  • impose unwanted constraints on order:
    <!ELEMENT person (name,phone)>
  • references cannot be constrained
  • can be too vague:
  • <!ELEMENT person ((name|phone|email)*)>

14
XML Schemas
  • http://www.w3.org/TR/xmlschema-1/ (10/2000)
  • generalizes DTDs
  • uses XML syntax
  • two documents: structure and datatypes
  • http://www.w3.org/TR/xmlschema-1
  • http://www.w3.org/TR/xmlschema-2
  • XML-Schema is very complex
  • often criticized
  • some alternative proposals

15
XML Schemas
<xsd:element name="paper" type="papertype"/>
<xsd:complexType name="papertype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" minOccurs="0"/>
    <xsd:element name="year"/>
    <xsd:choice>
      <xsd:element name="journal"/>
      <xsd:element name="conference"/>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>

DTD: <!ELEMENT paper (title,author?,year,(journal|conference))>
16
Elements vs. Types in XML Schema
<xsd:element name="person">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="address" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

<xsd:element name="person" type="ttt"/>
<xsd:complexType name="ttt">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="address" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

DTD: <!ELEMENT person (name,address)>
17
  • Types
  • Simple types (integers, strings, ...)
  • Complex types (regular expressions, like in DTDs)
  • Element-type-element alternation
  • Root element has a complex type
  • That type is a regular expression of elements
  • Those elements have their complex types...
  • ...
  • On the leaves we have simple types

18
Local and Global Types in XML Schema
  • Local type:
    <xsd:element name="person">
      [define locally the person's type]
    </xsd:element>
  • Global type:
    <xsd:element name="person" type="ttt"/>
    <xsd:complexType name="ttt">
      [define here the type ttt]
    </xsd:complexType>

Global types can be reused in other elements.
19
Local vs. Global Elements in XML Schema
  • Local element:
    <xsd:complexType name="ttt">
      <xsd:sequence>
        <xsd:element name="address" type="..."/>
        ...
      </xsd:sequence>
    </xsd:complexType>
  • Global element:
    <xsd:element name="address" type="..."/>
    <xsd:complexType name="ttt">
      <xsd:sequence>
        <xsd:element ref="address"/>
        ...
      </xsd:sequence>
    </xsd:complexType>

Global elements: like in DTDs
20
Regular Expressions in XML Schema
  • Recall the element-type-element alternation:
    <xsd:complexType name="....">
      [regular expression on elements]
    </xsd:complexType>
  • Regular expressions:
  • <xsd:sequence> A B C </...>
    = A B C
  • <xsd:choice> A B C </...>
    = A | B | C
  • <xsd:group> A B C </...>
    = (A B C)
  • <xsd:... minOccurs="0" maxOccurs="unbounded">
    .. </...> = (...)*
  • <xsd:... minOccurs="0" maxOccurs="1"> .. </...>
    = (...)?
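
The correspondence between occurrence bounds and regular-expression operators can be written down as a small lookup. This is only a sketch: the function name quantifier is invented, and it relies on the XML Schema defaults of minOccurs = 1 and maxOccurs = 1.

```python
def quantifier(min_occurs=1, max_occurs=1):
    """Map minOccurs/maxOccurs to the matching regular-expression quantifier."""
    bounds = (min_occurs, max_occurs)
    if bounds == (1, 1):
        return ""        # exactly once: no quantifier needed
    if bounds == (0, 1):
        return "?"
    if bounds == (0, "unbounded"):
        return "*"
    if bounds == (1, "unbounded"):
        return "+"
    upper = "" if max_occurs == "unbounded" else str(max_occurs)
    return "{%d,%s}" % (min_occurs, upper)

print(quantifier(0, "unbounded"))
print(quantifier(0, 1))
print(quantifier(2, 5))
```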

21
Local Names in XML-Schema
<xsd:element name="person">
  <xsd:complexType>
    . . . . .
    <xsd:element name="name">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="firstname" type="xsd:string"/>
          <xsd:element name="lastname" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    . . . .
  </xsd:complexType>
</xsd:element>

<xsd:element name="product">
  <xsd:complexType>
    . . . . .
    <xsd:element name="name" type="xsd:string"/>
  </xsd:complexType>
</xsd:element>

name has different meanings in person and in product
22
Subtle Use of Local Names
<xsd:complexType name="oneB">
  <xsd:choice>
    <xsd:element name="B" type="xsd:string"/>
    <xsd:sequence>
      <xsd:element name="A" type="onlyAs"/>
      <xsd:element name="A" type="oneB"/>
    </xsd:sequence>
    <xsd:sequence>
      <xsd:element name="A" type="oneB"/>
      <xsd:element name="A" type="onlyAs"/>
    </xsd:sequence>
  </xsd:choice>
</xsd:complexType>

<xsd:element name="A" type="oneB"/>

<xsd:complexType name="onlyAs">
  <xsd:choice>
    <xsd:sequence>
      <xsd:element name="A" type="onlyAs"/>
      <xsd:element name="A" type="onlyAs"/>
    </xsd:sequence>
    <xsd:element name="A" type="xsd:string"/>
  </xsd:choice>
</xsd:complexType>

Arbitrarily deep binary tree with A elements, and a single B element
23
Summary of XML Schema
  • Formal Expressive Power
  • Can express precisely the regular tree languages
  • Lots of other stuff
  • Some form of inheritance
  • A null value
  • Large collection of data types

24
XQuery
  • Based on Quilt (which is based on XML-QL)
  • http://www.w3.org/TR/xquery/
  • XML Query data model
  • Ordered !

25
FLWR (Flower) Expressions
  • FOR ... LET... FOR... LET...
  • WHERE...
  • RETURN...

26
XQuery
  • Find all book titles published after 1995

FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title

Result:
<title> abc </title>
<title> def </title>
<title> ghi </title>
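
For readers without an XQuery engine at hand, the same FOR / WHERE / RETURN shape can be imitated in Python with a list comprehension over ElementTree nodes. The inline bib data below is invented for illustration (the slides do not show bib.xml), and the sketch returns title strings rather than title elements.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for bib.xml
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><year>1998</year></book>
  <book><title>old</title><year>1990</year></book>
  <book><title>def</title><year>2001</year></book>
</bib>
""")

titles = [
    x.findtext("title")                  # RETURN: the title of each kept book
    for x in bib.findall("book")         # FOR: bind x to each /bib/book in turn
    if int(x.findtext("year")) > 1995    # WHERE: keep books published after 1995
]
print(titles)
```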
27
XQuery
  • For each author of a book by Morgan Kaufmann,
    list all books she published

FOR $a IN distinct(document("bib.xml")
              /bib/book[publisher="Morgan Kaufmann"]/author)
RETURN <result>
         $a,
         FOR $t IN /bib/book[author=$a]/title
         RETURN $t
       </result>

distinct = a function that eliminates duplicates
28
XQuery
  • Result:
    <result>
      <author>Jones</author>
      <title> abc </title>
      <title> def </title>
    </result>
    <result>
      <author> Smith </author>
      <title> ghi </title>
    </result>
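
A rough Python analogue of this nested query: first collect the distinct authors who have a Morgan Kaufmann book, then list every title by each of them. The sample data and variable names are assumptions; the sketch keeps authors in first-seen order, which is just one possible choice for a distinct() result.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for bib.xml
bib = ET.fromstring("""
<bib>
  <book><publisher>Morgan Kaufmann</publisher><author>Jones</author><title>abc</title></book>
  <book><publisher>Other Press</publisher><author>Jones</author><title>def</title></book>
  <book><publisher>Morgan Kaufmann</publisher><author>Smith</author><title>ghi</title></book>
  <book><publisher>Other Press</publisher><author>Lee</author><title>xyz</title></book>
</bib>
""")

# Outer FOR: distinct authors of Morgan Kaufmann books
authors = []
for book in bib.findall("book"):
    if book.findtext("publisher") == "Morgan Kaufmann":
        for a in book.findall("author"):
            if a.text not in authors:      # drop duplicates
                authors.append(a.text)

# Inner FOR: all titles written by each of those authors, any publisher
result = {
    a: [b.findtext("title") for b in bib.findall("book")
        if a in [x.text for x in b.findall("author")]]
    for a in authors
}
print(result)
```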

29
XQuery
  • FOR $x in expr -- binds $x to each value in the
    list expr
  • LET $x = expr -- binds $x to the entire list
    expr
  • Useful for common subexpressions and for
    aggregations

30
XQuery
<big_publishers>
  FOR $p IN distinct(document("bib.xml")//publisher)
  LET $b = document("bib.xml")/book[publisher = $p]
  WHERE count($b) > 100
  RETURN $p
</big_publishers>

count = an (aggregate) function that returns the number of elements
31
XQuery
  • Find books whose price is larger than average

LET $a = avg(document("bib.xml")/bib/book/price)
FOR $b in document("bib.xml")/bib/book
WHERE $b/price > $a
RETURN $b
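
The LET-then-FOR pattern (compute one aggregate, then filter against it) looks like this in a Python sketch. The book data is invented, and prices are assumed to be plain numbers.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Invented stand-in for bib.xml
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><price>10</price></book>
  <book><title>def</title><price>20</price></book>
  <book><title>ghi</title><price>60</price></book>
</bib>
""")

books = bib.findall("book")

# LET: a single value computed over the whole list of prices
a = mean(float(b.findtext("price")) for b in books)

# FOR ... WHERE: iterate over the books and keep those above the average
expensive = [b.findtext("title") for b in books
             if float(b.findtext("price")) > a]
print(a, expensive)
```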
32
XQuery
  • Summary:
  • FOR-LET-WHERE-RETURN = FLWR

FOR/LET Clauses → List of tuples → WHERE Clause → List of tuples → RETURN Clause → Instance of XQuery data model
33
FOR vs. LET
  • FOR
  • Binds node variables → iteration
  • LET
  • Binds collection variables → one value

34
FOR vs. LET
FOR $x IN document("bib.xml")/bib/book
RETURN <result> $x </result>

Returns:
<result> <book>...</book></result>
<result> <book>...</book></result>
<result> <book>...</book></result>
...

LET $x = document("bib.xml")/bib/book
RETURN <result> $x </result>

Returns:
<result>
  <book>...</book>
  <book>...</book>
  <book>...</book>
  ...
</result>
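
The difference is the same as looping over a list versus naming the list once. A minimal Python sketch, using made-up book strings in place of real nodes:

```python
books = ["<book>1</book>", "<book>2</book>", "<book>3</book>"]

# FOR: the variable is bound to each book in turn, so RETURN runs once per book.
for_results = ["<result>" + x + "</result>" for x in books]

# LET: the variable is bound once to the whole list, so RETURN runs once.
x = books
let_result = "<result>" + "".join(x) + "</result>"

print(len(for_results))
print(let_result)
```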
35
Collections in XQuery
  • Ordered and unordered collections:
  • /bib/book/author = an ordered collection
  • distinct(/bib/book/author) = an unordered
    collection
  • LET $a = /bib/book → $a is a collection
  • $b/author → a collection (several authors...)

RETURN <result> $b/author </result>

Returns:
<result>
  <author>...</author>
  <author>...</author>
  <author>...</author>
  ...
</result>
36
Collections in XQuery
  • What about collections in expressions?
  • $b/price → list of n
    prices
  • $b/price * 0.7 → list of n numbers
  • $b/price * $b/quantity → list of n x m numbers ??

37
Sorting in XQuery
<publisher_list>
  FOR $p IN distinct(document("bib.xml")//publisher)
  RETURN <publisher>
           <name> $p/text() </name> ,
           FOR $b IN document("bib.xml")//book[publisher = $p]
           RETURN <book>
                    $b/title ,
                    $b/price
                  </book> SORTBY(price DESCENDING)
         </publisher> SORTBY(name)
</publisher_list>
38
Sorting in XQuery
  • Sorting arguments refer to the name space of the
    RETURN clause, not the FOR clause
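
One way to picture this rule: the sort key is looked up in the elements that RETURN constructs, not in the source nodes bound by FOR. The Python sketch below builds the returned book elements first and then sorts them by their own price child; the data and names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for bib.xml
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><price>30</price><publisher>P</publisher></book>
  <book><title>def</title><price>50</price><publisher>P</publisher></book>
  <book><title>ghi</title><price>10</price><publisher>P</publisher></book>
</bib>
""")

# Build the <book> elements that the RETURN clause would construct
returned = []
for b in bib.findall("book"):
    out = ET.Element("book")
    for tag in ("title", "price"):
        ET.SubElement(out, tag).text = b.findtext(tag)
    returned.append(out)

# SORTBY(price DESCENDING): the key is read from the returned elements
returned.sort(key=lambda e: float(e.findtext("price")), reverse=True)
sorted_titles = [e.findtext("title") for e in returned]
print(sorted_titles)
```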

39
If-Then-Else
FOR $h IN //holding
RETURN <holding>
         $h/title,
         IF $h/@type = "Journal"
         THEN $h/editor
         ELSE $h/author
       </holding> SORTBY (title)
40
Existential Quantifiers
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
      contains($p, "sailing") AND contains($p, "windsurfing")
RETURN $b/title
41
Universal Quantifiers
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
      contains($p, "sailing")
RETURN $b/title
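
SOME and EVERY behave much like Python's any() and all(). The sketch below runs both quantified queries over an invented document; the helper paras and the element names other than book, para and title are assumptions.

```python
import xml.etree.ElementTree as ET

# Invented sample data
lib = ET.fromstring("""
<library>
  <book><title>Sea</title>
    <chapter><para>sailing and windsurfing</para><para>sailing knots</para></chapter>
  </book>
  <book><title>Mix</title>
    <chapter><para>sailing and windsurfing</para><para>cooking</para></chapter>
  </book>
  <book><title>Land</title><para>hiking</para></book>
</library>
""")

def paras(book):
    """Text of every para at any depth below the book (like $b//para)."""
    return [p.text or "" for p in book.iter("para")]

# SOME $p ... SATISFIES: at least one paragraph mentions both words
some = [b.findtext("title") for b in lib.iter("book")
        if any("sailing" in p and "windsurfing" in p for p in paras(b))]

# EVERY $p ... SATISFIES: all paragraphs mention sailing
every = [b.findtext("title") for b in lib.iter("book")
         if all("sailing" in p for p in paras(b))]

print(some)
print(every)
```

As with all() on an empty list, a book with no para elements would satisfy the EVERY condition vacuously.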