Title: Web Data Management
1. Web Data Management
2. In this lecture
- XML Schemas
- Elements vs. Types
- Regular expressions
- Expressive power
- Resources
- W3C Recommendation: http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
3. XML Schemas
- http://www.w3.org/TR/xmlschema-1/ (10/2000)
- generalizes DTDs
- uses XML syntax
- two documents: structures and datatypes
- http://www.w3.org/TR/xmlschema-1
- http://www.w3.org/TR/xmlschema-2
- XML Schema is very complex
- often criticized
- some alternative proposals
4. BookCatalogue.dtd
<!ELEMENT BookCatalogue (Book)*>
<!ELEMENT Book (Title, Author, Date, ISBN, Publisher)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT ISBN (#PCDATA)>
<!ELEMENT Publisher (#PCDATA)>
5. BookCatalogue.xsd (xsd = XML Schema Definition)
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
            targetNamespace="http://www.publishing.org"
            xmlns="http://www.publishing.org"
            elementFormDefault="qualified">
    <xsd:element name="BookCatalogue">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
                <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/>
                <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/>
                <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/>
                <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="Title" type="xsd:string"/>
    <xsd:element name="Author" type="xsd:string"/>
    <xsd:element name="Date" type="xsd:string"/>
    <xsd:element name="ISBN" type="xsd:string"/>
    <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>
(explanations on succeeding pages)
6. BookCatalogue.xsd: the root element (same listing as slide 5)
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
            targetNamespace="http://www.publishing.org"
            xmlns="http://www.publishing.org"
            elementFormDefault="qualified">
    ...
</xsd:schema>
All XML Schemas have "schema" as the root element.
For comparison, BookCatalogue.dtd (slide 4) describes the same document structure.
7. BookCatalogue.xsd: the XMLSchema namespace (same listing as slide 5)
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
The elements that are used to create an XML Schema come from the XMLSchema namespace.
8. XMLSchema Namespace
http://www.w3.org/2000/10/XMLSchema
complexType
element
sequence
schema
boolean
string
integer
9. BookCatalogue.xsd: targetNamespace (same listing as slide 5)
targetNamespace="http://www.publishing.org"
Says that the elements declared in this schema (BookCatalogue, Book, Title, Author, Date, ISBN, Publisher) are to go in this namespace.
10. Publishing Namespace (targetNamespace)
http://www.publishing.org (targetNamespace)
BookCatalogue
Author
Book
Title
ISBN
Publisher
Date
11. BookCatalogue.xsd: the default namespace (same listing as slide 5)
xmlns="http://www.publishing.org"
The default namespace is http://www.publishing.org, which is the targetNamespace!
<xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/>
This is referencing a Book element declaration. The Book in what namespace? Since there is no namespace qualifier, it is referencing the Book element in the default namespace, which is the targetNamespace!
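The lookup described on this slide can be sketched in a few lines of Python. This is only an illustration: resolve_qname and the nsmap dictionary are hypothetical names, and the dictionary simply restates the two namespace declarations on the slide's xsd:schema element, with the empty string standing for the default namespace.

```python
# Hypothetical helper (not part of any XML library): resolve a QName such as
# "Book" or "xsd:string" against the namespace declarations in scope.
XSD_NS = "http://www.w3.org/2000/10/XMLSchema"
PUB_NS = "http://www.publishing.org"


def resolve_qname(qname, nsmap):
    """Return (namespace_uri, local_name); the key '' is the default namespace."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix: the name belongs to the default namespace.
        return nsmap.get(""), qname
    return nsmap[prefix], local


# Declarations found on the <xsd:schema> element of BookCatalogue.xsd.
nsmap = {"xsd": XSD_NS, "": PUB_NS}

book = resolve_qname("Book", nsmap)               # value of ref="Book"
string_type = resolve_qname("xsd:string", nsmap)  # value of type="xsd:string"
print(book)
print(string_type)
```

Because the default namespace equals the targetNamespace, the unprefixed reference lands on the Book element that the schema itself declares.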
12. BookCatalogue.xsd: elementFormDefault (same listing as slide 5)
elementFormDefault="qualified"
This is a directive to instance documents which use this schema: any elements used by the instance document which were declared by this schema must be namespace qualified by the namespace specified by targetNamespace.
13. Referencing a schema in an XML instance document
<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.publishing.org"
               xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
               xsi:schemaLocation="http://www.publishing.org
                                   BookCatalogue.xsd">
    <Book>
        <Title>My Life and Times</Title>
        <Author>Paul McCartney</Author>
        <Date>July, 1998</Date>
        <ISBN>94303-12021-43892</ISBN>
        <Publisher>McMillin Publishing</Publisher>
    </Book>
    ...
</BookCatalogue>
1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the publishing namespace.
2. Second, with schemaLocation, tell the schema-validator that the http://www.publishing.org namespace is defined in BookCatalogue.xsd.
3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the schema instance namespace.
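To see what a namespace-aware parser makes of these three declarations, here is a small sketch using Python's standard xml.etree.ElementTree. It only parses the instance document; it does not validate it. The helper name schema_locations is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

PUB_NS = "http://www.publishing.org"
XSI_NS = "http://www.w3.org/2000/10/XMLSchema-instance"

# The instance document from the slide (a single Book, XML declaration omitted).
INSTANCE = """<BookCatalogue xmlns="http://www.publishing.org"
    xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
    xsi:schemaLocation="http://www.publishing.org BookCatalogue.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>July, 1998</Date>
    <ISBN>94303-12021-43892</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
</BookCatalogue>"""


def schema_locations(root):
    """Map each namespace URI to the schema document named for it."""
    raw = root.get("{%s}schemaLocation" % XSI_NS, "")
    tokens = raw.split()
    # schemaLocation holds whitespace-separated (namespace, location) pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


root = ET.fromstring(INSTANCE)
locations = schema_locations(root)
# ElementTree writes qualified names as "{namespace}local".
child_tags = [el.tag for el in root.iter() if el is not root]
print(root.tag)
print(locations)
```

Every element, although written without a prefix, ends up in the publishing namespace, and the schemaLocation attribute is found under the schema-instance namespace rather than under its bare name.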
14. Referencing a schema in an XML instance document
BookCatalogue.xml
- schemaLocation="A BookCatalogue.xsd"
- uses elements from namespace A
BookCatalogue.xsd
- targetNamespace="A"
- defines elements in namespace A
15. Note: multiple levels of checking
BookCatalogue.xml -> BookCatalogue.xsd -> XMLSchema.xsd (schema-for-schemas)
- Validate that the XML document conforms to the rules described in BookCatalogue.xsd
- Validate that BookCatalogue.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas
16. Default Value for minOccurs and maxOccurs
- The default value for minOccurs is "1"
- The default value for maxOccurs is "1"
<element ref="Title" minOccurs="1" maxOccurs="1"/>
is equivalent to
<element ref="Title"/>
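A minimal sketch of how a tool reading a schema might apply these defaults, using Python's standard ElementTree. The function name occurs is hypothetical, and None is used here as a stand-in for "unbounded".

```python
import xml.etree.ElementTree as ET


def occurs(decl):
    """Return (minOccurs, maxOccurs) for a declaration, applying the defaults.

    maxOccurs="unbounded" is reported as None.
    """
    lo = int(decl.get("minOccurs", "1"))   # default minOccurs is 1
    hi_raw = decl.get("maxOccurs", "1")    # default maxOccurs is 1
    hi = None if hi_raw == "unbounded" else int(hi_raw)
    return lo, hi


explicit = ET.fromstring('<element ref="Title" minOccurs="1" maxOccurs="1"/>')
implicit = ET.fromstring('<element ref="Title"/>')
many = ET.fromstring('<element ref="Book" minOccurs="0" maxOccurs="unbounded"/>')

print(occurs(explicit), occurs(implicit), occurs(many))
```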
17. Qualify XMLSchema, Default targetNamespace
- In the last example, we explicitly qualified all elements from the XML Schema namespace. The targetNamespace was the default namespace.
http://www.publishing.org (default namespace): BookCatalogue, Book, Title, Author, Date, ISBN, Publisher
http://www.w3.org/2000/10/XMLSchema (qualified): schema, element, complexType, sequence, annotation, documentation
18. Default XMLSchema, Qualify targetNamespace
- Alternatively (equivalently), we can design our schema so that XMLSchema is the default namespace.
http://www.publishing.org (qualified): BookCatalogue, Book, Title, Author, Date, ISBN, Publisher
http://www.w3.org/2000/10/XMLSchema (default namespace): schema, element, complexType, sequence, annotation, documentation
19. BookCatalogue.xsd with XMLSchema as the default namespace
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        targetNamespace="http://www.publishing.org"
        xmlns:pub="http://www.publishing.org"
        elementFormDefault="qualified">
    <element name="BookCatalogue">
        <complexType>
            <sequence>
                <element ref="pub:Book" minOccurs="0" maxOccurs="unbounded"/>
            </sequence>
        </complexType>
    </element>
    <element name="Book">
        <complexType>
            <sequence>
                <element ref="pub:Title"/>
                <element ref="pub:Author"/>
                <element ref="pub:Date"/>
                <element ref="pub:ISBN"/>
                <element ref="pub:Publisher"/>
            </sequence>
        </complexType>
    </element>
    <element name="Title" type="string"/>
    <element name="Author" type="string"/>
    <element name="Date" type="string"/>
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</schema>
Here we are referencing a Book element. Where is that Book element defined? In what namespace? The pub: prefix indicates what namespace this element is in. pub: has been set to be the same as the targetNamespace.
20. "pub" References the targetNamespace
http://www.publishing.org (targetNamespace, prefix pub): BookCatalogue, Book, Title, Author, Date, ISBN, Publisher
http://www.w3.org/2000/10/XMLSchema (default namespace): schema, element, complexType, sequence, annotation, documentation
21. Alternate Schema
- In the previous examples we declared an element and then referenced that element declaration with ref. Instead, we can inline the element declarations.
- On the following slide is an alternate (equivalent) way of representing the schema shown previously, using inlined element declarations.
22. Elements Declared Inline
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
            targetNamespace="http://www.publishing.org"
            xmlns="http://www.publishing.org"
            elementFormDefault="qualified">
    <xsd:element name="BookCatalogue">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Book" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:sequence>
                            <xsd:element name="Title" type="xsd:string"/>
                            <xsd:element name="Author" type="xsd:string"/>
                            <xsd:element name="Date" type="xsd:string"/>
                            <xsd:element name="ISBN" type="xsd:string"/>
                            <xsd:element name="Publisher" type="xsd:string"/>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
23. Elements Declared Inline: anonymous types (same listing as slide 22)
Anonymous types (no name): both <xsd:complexType> elements in the listing, the one inside BookCatalogue and the one inside Book, have no name attribute.
24. Named Types
- The following slide shows an alternate
(equivalent) schema which uses a named type.
25. Named Types
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
            targetNamespace="http://www.publishing.org"
            xmlns="http://www.publishing.org"
            elementFormDefault="qualified">
    <xsd:element name="BookCatalogue">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Book" type="CardCatalogueEntry" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <xsd:complexType name="CardCatalogueEntry">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>
Named type: CardCatalogueEntry
26. Please note that
<xsd:element name="A" type="foo"/>
<xsd:complexType name="foo">
    <xsd:sequence>
        <xsd:element name="B" .../>
        <xsd:element name="C" .../>
    </xsd:sequence>
</xsd:complexType>
(Element A references the complexType foo.)
is equivalent to
<xsd:element name="A">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:element name="B" .../>
            <xsd:element name="C" .../>
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>
(Element A has the complexType definition inlined in the element declaration.)
27. type Attribute or complexType Child Element, but not Both!
- An element declaration can have a type attribute, or a complexType child element, but it cannot have both a type attribute and a complexType child element.
Not allowed:
<xsd:element name="A" type="foo">
    <xsd:complexType>
        ...
    </xsd:complexType>
</xsd:element>
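The rule is easy to check mechanically. The sketch below, using Python's standard ElementTree, scans a schema for element declarations that break it. The function name conflicting_declarations and the tiny test schema (with elements A, B, C and type foo) are made up for this example; a real validator would report this as part of its checking against the schema-for-schemas.

```python
import xml.etree.ElementTree as ET

# Namespace used throughout these slides, in ElementTree's "{uri}" notation.
XSD = "{http://www.w3.org/2000/10/XMLSchema}"


def conflicting_declarations(schema_root):
    """Names of element declarations with both type="..." and an inline complexType."""
    bad = []
    for decl in schema_root.iter(XSD + "element"):
        has_type_attr = decl.get("type") is not None
        has_inline_type = decl.find(XSD + "complexType") is not None
        if has_type_attr and has_inline_type:
            bad.append(decl.get("name"))
    return bad


SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
  <xsd:element name="A" type="foo">
    <xsd:complexType/>
  </xsd:element>
  <xsd:element name="B" type="foo"/>
  <xsd:element name="C">
    <xsd:complexType/>
  </xsd:element>
  <xsd:complexType name="foo"/>
</xsd:schema>"""

offenders = conflicting_declarations(ET.fromstring(SCHEMA))
print(offenders)
```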
28. Summary of Declaring Elements (two ways to do it)
1.
<xsd:element name="name" type="type" minOccurs="int" maxOccurs="int"/>
- type: a simple type (e.g., xsd:string) or the name of a complexType
- minOccurs: a nonnegative integer
- maxOccurs: a nonnegative integer or "unbounded"
Note: minOccurs and maxOccurs can only be used in nested (local) element declarations.
2.
<xsd:element name="name" minOccurs="int" maxOccurs="int">
    <xsd:complexType>
        ...
    </xsd:complexType>
</xsd:element>
29. XML Schemas
<xsd:element name="paper" type="papertype"/>
<xsd:complexType name="papertype">
    <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="author" minOccurs="0"/>
        <xsd:element name="year"/>
        <xsd:choice>
            <xsd:element name="journal"/>
            <xsd:element name="conference"/>
        </xsd:choice>
    </xsd:sequence>
</xsd:complexType>
DTD: <!ELEMENT paper (title, author?, year, (journal|conference))>
30. Elements vs. Types in XML Schema
<xsd:element name="person">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="address" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>
<xsd:element name="person" type="ttt"/>
<xsd:complexType name="ttt">
    <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="address" type="xsd:string"/>
    </xsd:sequence>
</xsd:complexType>
DTD: <!ELEMENT person (name,address)>
31. Elements vs. Types in XML Schema
- Types
- Simple types (integers, strings, ...)
- Complex types (regular expressions, like in DTDs)
- Element-type-element alternation
- Root element has a complex type
- That type is a regular expression of elements
- Those elements have their complex types...
- ...
- On the leaves we have simple types
32. Local and Global Types in XML Schema
- Local type:
<xsd:element name="person">
    [define locally the person's type]
</xsd:element>
- Global type:
<xsd:element name="person" type="ttt"/>
<xsd:complexType name="ttt">
    [define here the type ttt]
</xsd:complexType>
Global types: can be reused in other elements
33. Local vs. Global Elements in XML Schema
- Local element:
<xsd:complexType name="ttt">
    <xsd:sequence>
        <xsd:element name="address" type="..."/>
        ...
    </xsd:sequence>
</xsd:complexType>
- Global element:
<xsd:element name="address" type="..."/>
<xsd:complexType name="ttt">
    <xsd:sequence>
        <xsd:element ref="address"/>
        ...
    </xsd:sequence>
</xsd:complexType>
Global elements: like in DTDs
34. Regular Expressions in XML Schema
- Recall the element-type-element alternation:
<xsd:complexType name="....">
    [regular expression on elements]
</xsd:complexType>
- Regular expressions:
- <xsd:sequence> A B C </...>  =  A B C
- <xsd:choice> A B C </...>  =  A | B | C
- <xsd:group> A B C </...>  =  (A B C)
- <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>  =  (...)*
- <xsd:... minOccurs="0" maxOccurs="1"> .. </...>  =  (...)?
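The correspondence in this table can be made concrete with a short Python sketch that turns a content model into an ordinary regular expression over child-element names. Everything here is an illustration under stated assumptions: to_regex, quantify, and accepts are hypothetical helpers, only sequence, choice, and element particles are handled, and the sample model is the papertype content model with author declared minOccurs="0".

```python
import re
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2000/10/XMLSchema}"


def quantify(expr, node):
    """Wrap expr according to the node's minOccurs/maxOccurs (both default to 1)."""
    lo = node.get("minOccurs", "1")
    hi = node.get("maxOccurs", "1")
    if (lo, hi) == ("1", "1"):
        return expr
    if (lo, hi) == ("0", "1"):
        return "(?:%s)?" % expr
    if (lo, hi) == ("0", "unbounded"):
        return "(?:%s)*" % expr
    upper = "" if hi == "unbounded" else hi
    return "(?:%s){%s,%s}" % (expr, lo, upper)


def to_regex(node):
    """Translate a sequence/choice/element particle into a regex string.

    Each element name is followed by one space, so a list of child names
    can be matched as a single string.
    """
    if node.tag == XSD + "element":
        expr = re.escape(node.get("name")) + " "
    elif node.tag == XSD + "sequence":
        expr = "".join(to_regex(child) for child in node)
    elif node.tag == XSD + "choice":
        expr = "(?:%s)" % "|".join(to_regex(child) for child in node)
    else:
        raise ValueError("unsupported particle: " + node.tag)
    return quantify(expr, node)


def accepts(pattern, child_names):
    """True if the sequence of child element names matches the content model."""
    word = "".join(name + " " for name in child_names)
    return re.fullmatch(pattern, word) is not None


MODEL = ET.fromstring("""<xsd:sequence xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
  <xsd:element name="title"/>
  <xsd:element name="author" minOccurs="0"/>
  <xsd:element name="year"/>
  <xsd:choice>
    <xsd:element name="journal"/>
    <xsd:element name="conference"/>
  </xsd:choice>
</xsd:sequence>""")

pattern = to_regex(MODEL)
print(pattern)
```

The resulting pattern has the same shape as the DTD content model (title, author?, year, (journal|conference)), which is the sense in which complex types are regular expressions on elements.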
35. Local Names in XML Schema
<xsd:element name="person">
    <xsd:complexType>
        . . . . .
        <xsd:element name="name">
            <xsd:complexType>
                <xsd:sequence>
                    <xsd:element name="firstname" type="xsd:string"/>
                    <xsd:element name="lastname" type="xsd:string"/>
                </xsd:sequence>
            </xsd:complexType>
        </xsd:element>
        . . . .
    </xsd:complexType>
</xsd:element>
<xsd:element name="product">
    <xsd:complexType>
        . . . . .
        <xsd:element name="name" type="xsd:string"/>
    </xsd:complexType>
</xsd:element>
name has different meanings in person and in product
36. Subtle Use of Local Names
<xsd:complexType name="oneB">
    <xsd:choice>
        <xsd:element name="B" type="xsd:string"/>
        <xsd:sequence>
            <xsd:element name="A" type="onlyAs"/>
            <xsd:element name="A" type="oneB"/>
        </xsd:sequence>
        <xsd:sequence>
            <xsd:element name="A" type="oneB"/>
            <xsd:element name="A" type="onlyAs"/>
        </xsd:sequence>
    </xsd:choice>
</xsd:complexType>
<xsd:element name="A" type="oneB"/>
<xsd:complexType name="onlyAs">
    <xsd:choice>
        <xsd:sequence>
            <xsd:element name="A" type="onlyAs"/>
            <xsd:element name="A" type="onlyAs"/>
        </xsd:sequence>
        <xsd:element name="A" type="xsd:string"/>
    </xsd:choice>
</xsd:complexType>
Arbitrarily deep binary tree with A elements, and a single B element
Note: the final XML Schema 1.0 rules require same-named elements within one content model to have the same type (Element Declarations Consistent), so a validator may reject oneB as written; the example illustrates the underlying tree-grammar idea.
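The two mutually recursive types can be read as two mutually recursive predicates on trees. The Python sketch below mirrors the grammar directly; it is not a schema validator, the function names is_only_as and is_one_b are invented here, and text content is ignored.

```python
import xml.etree.ElementTree as ET


def is_only_as(node):
    """Content matches type onlyAs: (A:onlyAs, A:onlyAs) or a single string-valued A."""
    kids = list(node)
    if len(kids) == 1:
        # Second alternative of the choice: one A leaf holding a string.
        return kids[0].tag == "A" and len(kids[0]) == 0
    if len(kids) == 2:
        return all(k.tag == "A" and is_only_as(k) for k in kids)
    return False


def is_one_b(node):
    """Content matches type oneB: B, or (A:onlyAs, A:oneB), or (A:oneB, A:onlyAs)."""
    kids = list(node)
    if len(kids) == 1:
        return kids[0].tag == "B" and len(kids[0]) == 0
    if len(kids) == 2 and all(k.tag == "A" for k in kids):
        left, right = kids
        return (is_only_as(left) and is_one_b(right)) or (
            is_one_b(left) and is_only_as(right)
        )
    return False


one_b = ET.fromstring("<A><A><A>x</A></A><A><B>y</B></A></A>")
no_b = ET.fromstring("<A><A><A>x</A></A><A><A>y</A></A></A>")
two_b = ET.fromstring("<A><A><B>x</B></A><A><B>y</B></A></A>")

print(is_one_b(one_b), is_one_b(no_b), is_one_b(two_b))
```

Which type an A element has depends on where it sits in the tree, not on its name; that context dependence is what a DTD cannot express.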
37. Attributes in XML Schema
<xsd:element name="paper" type="papertype"/>
<xsd:complexType name="papertype">
    <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
        . . . . . .
    </xsd:sequence>
    <xsd:attribute name="language" type="xsd:NMTOKEN" fixed="English"/>
</xsd:complexType>
- Attributes are associated to the type, not to the element
- Only to complex types; more trouble if we want to add attributes to simple types.
38. Mixed Content, Any Type
<xsd:complexType mixed="true"> . . . .
- Better than in DTDs: can still enforce the type, but now may have text between any elements
<xsd:element name="anything" type="xsd:anyType"/> . . . .
- Means anything is permitted there
39. All Group
<xsd:complexType name="PurchaseOrderType">
    <xsd:all>
        <xsd:element name="shipTo" type="USAddress"/>
        <xsd:element name="billTo" type="USAddress"/>
        <xsd:element ref="comment" minOccurs="0"/>
        <xsd:element name="items" type="Items"/>
    </xsd:all>
    <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
- A restricted form of & in SGML
- Restrictions
- Only at top level
- Has only elements
- Each element occurs at most once
- E.g. comment occurs 0 or 1 times
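A minimal sketch of what an all group accepts, assuming the PurchaseOrderType above: the children may appear in any order, each at most once, and only comment may be missing. The function matches_all_group and the constant names are hypothetical.

```python
def matches_all_group(child_names, required, optional=()):
    """True if child_names is an order-free selection allowed by an all group."""
    seen = list(child_names)
    allowed = set(required) | set(optional)
    no_repeats = len(seen) == len(set(seen))      # each element at most once
    only_known = set(seen) <= allowed             # nothing outside the group
    has_required = set(required) <= set(seen)     # minOccurs="1" members present
    return no_repeats and only_known and has_required


REQUIRED = ("shipTo", "billTo", "items")
OPTIONAL = ("comment",)   # declared with minOccurs="0"

print(matches_all_group(["items", "billTo", "shipTo"], REQUIRED, OPTIONAL))
print(matches_all_group(["shipTo", "billTo"], REQUIRED, OPTIONAL))
```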
40. Derived Types by Extensions
<complexType name="Address">
    <sequence>
        <element name="street" type="string"/>
        <element name="city" type="string"/>
    </sequence>
</complexType>
<complexType name="USAddress">
    <complexContent>
        <extension base="ipo:Address">
            <sequence>
                <element name="state" type="ipo:USState"/>
                <element name="zip" type="positiveInteger"/>
            </sequence>
        </extension>
    </complexContent>
</complexType>
Corresponds to inheritance
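As a loose analogy only, the same derivation can be written with Python dataclasses: the subclass keeps the base fields and appends its own, just as the extended content model is the base sequence followed by the new elements. The class and field names follow the slide; representing state as a plain string is an assumption, since the USState type is not shown.

```python
from dataclasses import dataclass, fields


@dataclass
class Address:
    street: str
    city: str


@dataclass
class USAddress(Address):
    # Fields added by the extension come after the inherited ones.
    state: str   # stands in for ipo:USState, whose definition is not shown
    zip: int     # positiveInteger on the slide


us = USAddress(street="1 Main St", city="Seattle", state="WA", zip=98195)
field_order = [f.name for f in fields(USAddress)]
print(field_order)
```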
41. Derived Types by Restrictions
- may restrict cardinalities, e.g. (0,∞) to (1,1)
- may restrict choices
- other restrictions
<complexContent>
    <restriction base="ipo:Items">
        [rewrite the entire content, with restrictions...]
    </restriction>
</complexContent>
Corresponds to set inclusion
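For the cardinality case, set inclusion reduces to interval containment. A minimal sketch, with is_restriction as a hypothetical helper and math.inf standing in for "unbounded":

```python
import math

UNBOUNDED = math.inf   # stands in for maxOccurs="unbounded"


def is_restriction(derived, base):
    """True if the (min, max) range of derived lies inside that of base."""
    d_lo, d_hi = derived
    b_lo, b_hi = base
    return b_lo <= d_lo and d_hi <= b_hi


# (0, unbounded) restricted to (1, 1), as on the slide.
print(is_restriction((1, 1), (0, UNBOUNDED)))
# The reverse direction widens the range, so it is not a restriction.
print(is_restriction((0, UNBOUNDED), (1, 1)))
```

Every instance allowed by the derived range is also allowed by the base range, which is the set-inclusion reading of restriction.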
42. Simple Types
- string
- token
- byte
- unsignedByte
- integer
- positiveInteger
- int (32-bit; a restriction of integer, which is unbounded)
- unsignedInt
- long
- short
- ...
- time
- dateTime
- duration
- date
- ID
- IDREF
- IDREFS
43. Facets of Simple Types
- Facets: additional properties restricting a simple type
- XML Schema 1.0 defines 12 constraining facets:
- length
- minLength
- maxLength
- pattern
- enumeration
- whiteSpace
- maxInclusive
- maxExclusive
- minInclusive
- minExclusive
- totalDigits
- fractionDigits
44. Facets of Simple Types
- Can further restrict a simple type by changing some facets
- Restriction = subset
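To illustrate "restriction = subset", the sketch below checks a lexical value against a handful of facets. It is a toy, not a datatype library: satisfies is a hypothetical function, only seven facets are covered, the numeric facets assume integer values, and Python's regex syntax is used in place of XML Schema's pattern syntax (the two agree for the simple pattern shown). The MY_INTEGER and ZIP facet sets are example restrictions invented here.

```python
import re

# One checker per facet: (lexical value, facet argument) -> bool.
CHECKS = {
    "length": lambda v, f: len(v) == f,
    "minLength": lambda v, f: len(v) >= f,
    "maxLength": lambda v, f: len(v) <= f,
    "pattern": lambda v, f: re.fullmatch(f, v) is not None,
    "enumeration": lambda v, f: v in f,
    "minInclusive": lambda v, f: int(v) >= f,
    "maxInclusive": lambda v, f: int(v) <= f,
}


def satisfies(value, facets):
    """True if the lexical value passes every facet in the dict."""
    return all(CHECKS[name](value, arg) for name, arg in facets.items())


# Hypothetical restrictions of integer and string.
MY_INTEGER = {"minInclusive": 10000, "maxInclusive": 99999}
ZIP = {"pattern": r"\d{5}", "length": 5}

print(satisfies("20003", MY_INTEGER), satisfies("123", MY_INTEGER))
print(satisfies("98195", ZIP), satisfies("9819", ZIP))
```

Adding a facet can only remove values, so the restricted type's value set is always a subset of the base type's.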
45. Not so Simple Types
- List types
- Union types
- Restriction types
<xsd:simpleType name="listOfMyIntType">
    <xsd:list itemType="myInteger"/>
</xsd:simpleType>
<listOfMyInt>20003 15037 95977 95945</listOfMyInt>
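A list type's value is the whitespace-separated sequence of its items. A minimal Python sketch of reading the listOfMyInt content above, assuming myInteger is an integer type (its definition is not shown) and using the hypothetical helper parse_list:

```python
def parse_list(text, item_parser=int):
    """Split a list-typed value on whitespace and parse each item."""
    return [item_parser(token) for token in text.split()]


values = parse_list("20003 15037 95977 95945")
print(values)
```

This is also why list items themselves cannot contain whitespace: the separator would be ambiguous.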
46. Summary of XML Schema
- Formal Expressive Power
- Close to the regular tree languages (over unranked trees); strictly, a proper subset, since same-named elements within one content model must have the same type
- Lots of other stuff
- Some form of inheritance
- A null value
- Large collection of data types
47. Summary of Schemas
- in semistructured (SS) data
- graph theoretic
- data and schema are decoupled
- used in data processing
- in XML
- from grammar to object-oriented
- schema wired with the data
- emphasis on semantics for exchange