Title: Advanced but ever-so-useful XML Technologies: Schema, XPath, XQuery, XSL (an incomplete introduction)
1Advanced but ever-so-useful XML Technologies: Schema, XPath, XQuery, XSL
An incomplete introduction
THE US NATIONAL VIRTUAL OBSERVATORY
2XML Schema
- What it is
  - A W3C (World Wide Web Consortium) standard for defining and verifying an XML grammar
  - An XML Schema document describes
    - a set of legal XML tags and attributes,
    - what order they go in, and
    - what values are allowed.
  - An XML Schema-aware XML parser can tell you if an XML document follows the rules of the grammar
- Why you might care
  - VO uses XML to encode metadata and service messages
  - XML Schema is used to define metadata encoding and message syntax
  - Ability to read XML Schema will help you understand what metadata is needed by an app and how to encode it
  - VO end users and many developers will never need to know about Schema
  - Can be helpful for debugging XML documents and messages
- What you'll get from this session
  - Rudimentary skills for reading an XML Schema document to discern the specified XML syntax
3Schema uses XML to describe syntax
- http://www.w3.org/TR/xmlschema-0/
- Contains a list of definitions
  - Elements and types
  - Attributes, groups, attribute-groups
- Types
  - Simple types: string, integer, dateTime, etc.
    - <title>The Astronomy Digital Image Library</title>
    - <maxRecords>10000</maxRecords>
    - <date role="completed">2002-09-30T07:45:00</date>
  - Complex types: contain other elements
- Defining types
  - Anonymous type: defined directly inside the definition of an element
  - Global type: top-level definition, can be reused
4Namespaces
- Namespace: a schema's unique identifier
  - Set with the targetNamespace attribute
  - Used by an XML document to indicate which schema it is compliant with
  - URI format (URL or URN)
- Using a schema
  - Instance document: an XML document that follows the grammar defined by the schema
  - The xmlns attribute identifies the default namespace that elements belong to
- Tagging elements with namespace prefixes
  - Prefix defined (anywhere) with xmlns:prefix
    - xmlns:res="http://nvoss.org/Resource"
  - A prefix attached to an element/attribute name denotes that it belongs to the associated schema
    - <res:title></res:title>
- Unqualified elements: a special technique for namespaces
  - Tag the root element (or xsi:type) only
  - Do not use xmlns
  - Can make using multiple schemas easier
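To make the default-namespace and prefix mechanisms concrete, here is a minimal sketch in Python, using only the standard library's xml.etree.ElementTree. Python is not part of the original session, and the element content "ADIL" is invented for illustration. The sketch shows that both spellings resolve to the same namespace-qualified name.

```python
import xml.etree.ElementTree as ET

NS = "http://nvoss.org/Resource"  # namespace URI used on the slide

# The same element, tagged two ways: default namespace vs. explicit prefix.
with_default = ET.fromstring(f'<title xmlns="{NS}">ADIL</title>')
with_prefix = ET.fromstring(f'<res:title xmlns:res="{NS}">ADIL</res:title>')

# ElementTree expands both to the form {namespace-URI}local-name.
print(with_default.tag)
print(with_prefix.tag)
```

The prefix itself (res) carries no meaning; only the URI it is bound to identifies the schema.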
5Validating an instance document
- A validating parser can test whether an instance document is compliant with its schema
- The xsi:schemaLocation attribute tells the parser where to find the Schema document(s)
  - Look-up list made up of namespace-location pairs
  - xsi:schemaLocation="namespace file-or-URL"
- xsi:schemaLocation is just a recommendation
  - Parser may have local copies cached for more efficient parsing
- validate: a tool for validating XML documents against schemas
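The namespace-location look-up list can be read out of an instance document with a few lines of Python. This is an illustrative sketch only: it uses the standard library, the helper name schema_locations is hypothetical, and it does not validate anything. It just shows how the attribute value splits into pairs.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

doc = """<resource xmlns="http://nvoss.org/VOResource"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://nvoss.org/VOResource xmltech-simple.xsd">
  <title>NCSA Astronomy Digital Image Library</title>
</resource>"""

def schema_locations(root):
    """Return {namespace: location} from the root's xsi:schemaLocation."""
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    # Tokens alternate: namespace, location, namespace, location, ...
    return dict(zip(tokens[0::2], tokens[1::2]))

locations = schema_locations(ET.fromstring(doc))
print(locations)
```

Because the attribute is only a recommendation, a parser is free to ignore these locations and use its own cached copies.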
6Simple schema
xmltech-simple.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://nvoss.org/VOResource"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:element name="resource">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string" />
        <xs:element name="referenceURL" type="xs:anyURI" minOccurs="0"/>
        <xs:element name="type" minOccurs="0" maxOccurs="unbounded">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="Archive" />
              <xs:enumeration value="Catalog" />
              <xs:enumeration value="Organisation" />
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="created" type="xs:dateTime" />
    </xs:complexType>
  </xs:element>

</xs:schema>
7Simple schema
xmltech-simple.xsd
targetNamespace
8Simple schema
xmltech-simple.xsd
Globally-defined element
- global: a direct child of xs:schema
- Only global elements can serve as a document's root element
9Simple schema
xmltech-simple.xsd
Anonymous type definition
10Simple schema
xmltech-simple.xsd
Content model
- sequence: a list of elements that must appear in order
- choice: one from a list of elements may appear
- Other models: group, all, any
11Simple schema
xmltech-simple.xsd
Locally-defined elements
- title: any string
- referenceURL: URI format
- type: restricted to a specified list of strings
12Simple schema
xmltech-simple.xsd
Occurrence restrictions
- Default
  - minOccurs="1" maxOccurs="1"
- minOccurs="0": optional
- minOccurs="1": required
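As a rough illustration of what the occurrence restrictions mean, the sketch below counts child elements by hand in Python. It is not a schema validator: the RULES table and the occurrence_errors helper are hypothetical, written out by hand from xmltech-simple.xsd, and the check ignores element order and values.

```python
import xml.etree.ElementTree as ET

NS = "{http://nvoss.org/VOResource}"
# (element name, minOccurs, maxOccurs); None stands for "unbounded"
RULES = [("title", 1, 1), ("referenceURL", 0, 1), ("type", 0, None)]

def occurrence_errors(resource):
    """List the occurrence rules that a <resource> element breaks."""
    errors = []
    for name, lo, hi in RULES:
        n = len(resource.findall(NS + name))
        if n < lo or (hi is not None and n > hi):
            errors.append(f"{name}: found {n}")
    return errors

good = ET.fromstring(
    '<resource xmlns="http://nvoss.org/VOResource">'
    "<title>ADIL</title><type>Archive</type><type>Catalog</type></resource>")
bad = ET.fromstring(
    '<resource xmlns="http://nvoss.org/VOResource">'
    "<referenceURL>http://a/</referenceURL>"
    "<referenceURL>http://b/</referenceURL></resource>")

print(occurrence_errors(good))  # no rules broken
print(occurrence_errors(bad))   # title missing, referenceURL repeated
```

A real validating parser performs these checks (and many more) directly from the schema document.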
13The compliant instance document
xmltech-simple.xml
<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns="http://nvoss.org/VOResource"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://nvoss.org/VOResource
                              xmltech-simple.xsd">
  <title>NCSA Astronomy Digital Image Library</title>
  <referenceURL>http://adil.ncsa.uiuc.edu/</referenceURL>
  <type>Archive</type>
</resource>
14The compliant instance document
xmltech-simple.xml
Default namespace
15The compliant instance document
xmltech-simple.xml
Default namespace
xsi namespace Prefix defined
16The compliant instance document
xmltech-simple.xml
Default namespace
xsi namespace Prefix defined
xsi:schemaLocation
- xsi:schemaLocation says,
  - Load the schema called http://nvoss.org/VOResource
  - from the local file, xmltech-simple.xsd
17The compliant instance document
xmltech-simple.xml
Let's try Exercise 1
18Global (Reusable) Types
xmltech-globaltypes.xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://nvoss.org/VOResource"
           xmlns:res="http://nvoss.org/VOResource"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:simpleType name="Type">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Archive" />
      <xs:enumeration value="Catalog" />
      <xs:enumeration value="Organisation" />
    </xs:restriction>
  </xs:simpleType>

  <xs:complexType name="Resource">
    <xs:sequence>
      <xs:element name="title" type="xs:string" />
      <xs:element name="referenceURL" type="xs:anyURI" minOccurs="0"/>
      <xs:element name="type" type="res:Type"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="created" type="xs:dateTime" />
  </xs:complexType>

  <xs:element name="resource" type="res:Resource" />

</xs:schema>
Define a prefix for the targetNamespace
19Global (Reusable) Types
xmltech-globaltypes.xml
Define a prefix for the targetNamespace
Type defined here
20Global (Reusable) Types
xmltech-globaltypes.xml
Define a prefix for the targetNamespace
Type defined here
and used here
21Global (Reusable) Elements
xmltech-elrefs.xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://nvoss.org/VOResource"
           xmlns:res="http://nvoss.org/VOResource"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <xs:simpleType name="Type">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Archive" />
      <xs:enumeration value="Catalog" />
      <xs:enumeration value="Organisation" />
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="type" type="res:Type"/>

  <xs:complexType name="Resource">
    <xs:sequence>
      <xs:element name="title" type="xs:string" />
      <xs:element name="referenceURL" type="xs:anyURI" minOccurs="0"/>
      <xs:element ref="res:type" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="created" type="xs:dateTime" />
  </xs:complexType>

  <xs:element name="resource" type="res:Resource" />

</xs:schema>
Element defined here
and used here
22Schema Documentation
xmltech-documented.xml
<xs:complexType name="Resource">
  <xs:annotation>
    <xs:documentation>
      Any entity or component of a VO application that is
      describable and identifiable by a IVOA Identifier.
    </xs:documentation>
  </xs:annotation>
  <xs:sequence>
    <xs:element name="title" type="xs:string">
      <xs:annotation>
        <xs:documentation>
          the full name given to the resource
        </xs:documentation>
      </xs:annotation>
    </xs:element>
    <xs:element name="referenceURL" type="xs:anyURI" minOccurs="0">
      <xs:annotation>
        <xs:documentation>
          URL pointing to a human-readable document describing
          this resource.
        </xs:documentation>
      </xs:annotation>
    </xs:element>

- Most schema components can have documentation attached
- Documentation is important for defining metadata schemas
- Carnivore registry extracts documentation directly from schema for display to users
23Derived Types
- Two ways to derive a new type from an existing one
- Extension
  - Applicable only to complex types
  - Adds additional elements or attributes to the content model

<xs:complexType name="Service">
  <xs:complexContent>
    <xs:extension base="res:Resource">
      <xs:sequence>
        <xs:element name="accessURL" type="xs:anyURI" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:element name="service" type="res:Service"/>

<service xmlns="http://nvoss.org/VOResource"
         created="1994-11-01T12:00:00">
  <title>ADIL Query Page</title>
  <referenceURL>http://adil.ncsa.uiuc.edu/help.html</referenceURL>
  <type>Archive</type>
  <accessURL>http://adil.ncsa.uiuc.edu/QueryPage.html</accessURL>
</service>
24Derived Types
- Two ways to derive a new type from an existing one
- Restriction
  - Simple types: restrict the legal values in some way
    - Ex.: integer: restrict range
    - string: restrict to match a pattern
  - Complex types
    - Disallow optional elements, attributes
    - Restricting occurrences
    - Setting default or fixed values where none were previously set
25Extending Schemas I: plugging-in derived entities
- Suppose you want to define an element in terms of a base type but allow any type derived from it to be inserted in its place
  - a form of polymorphism
- Two techniques
  - xsi:type
    - a label in the instance document
  - Substitution groups
    - a label in the schema document
26Extending Schemas I: plugging-in derived entities
- xsi:type technique
- From our example
  - The resource element has the type Resource
  - The Service type is derived from the Resource type
  - Declaring a service element is not necessary

In the schema document:
<xs:element name="resourceList">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="resource" type="res:Resource"
                  maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

In the instance document (long lines are cut off, as on the slide):
<resourceList xmlns="http://nvoss.org/VOReso...
  <resource created="1994-11-01T12:00:00">
    <title>NCSA Astronomy Digital Image Li...
    <referenceURL>http://adil.ncsa.uiuc.ed...
    <type>Archive</type>
  </resource>
  <resource xsi:type="Service" created="1994-11-01T12:00:00">
    <title>ADIL Query Page</title>
    <referenceURL>http://adil.ncsa.uiuc.ed...
    <type>Archive</type>
    <accessURL>http://adil.ncsa.uiuc.edu/Q...
  </resource>
</resourceList>

IVOA VOResource schema uses this technique
27Extending Schemas I: plugging-in derived entities
- Substitution group technique
- From our example
  - The Service type is derived from the Resource type
  - Add a substitutionGroup attribute to the service element definition
    - means we can substitute a service element anywhere a resource is allowed

In the instance document (long lines are cut off, as on the slide):
<resourceList xmlns="http://nvoss.org/VOReso...
  <resource created="1994-11-01T12:00:00">
    <title>NCSA Astronomy Digital Image Li...
    <referenceURL>http://adil.ncsa.uiuc.ed...
    <type>Archive</type>
  </resource>
  <service created="1994-11-01T12:00:00">
    <title>ADIL Query Page</title>
    <referenceURL>http://adil.ncsa.uiuc.ed...
    <type>Archive</type>
    <accessURL>http://adil.ncsa.uiuc.edu/Q...
  </service>
</resourceList>

In the schema document:
<xs:element name="resource" type="res:Resource"/>
<xs:element name="service" type="res:Service"
            substitutionGroup="res:resource"/>
<xs:element name="resourceList">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="res:resource" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

IVOA Space-Time Coordinates schema uses this technique
28Extending Schemas II: placing extensions into a separate namespace
- Separate schema file, separate namespace
- Enables schema evolution in a backward-compatible way.
- Extension file uses xs:import to load the schema being extended
- Instance documents generally must define prefixes for both the original schema namespace and the extension namespace.
- Example: VOResource metadata schemas
  - Core metadata schema: VOResource
  - Extension schemas: VODataService, VORegistry, ConeSearch, SimpleImageAccess
29A word about namespaces in instance documents (FYI)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://nvoss.org/VOResource"
           xmlns:res="http://nvoss.org/VOResource"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

- An instance document must indicate what namespace(s) the elements belong to
  - So that the document can be validated
- 3 ways to indicate this
  - controlled by the elementFormDefault attribute in the Schema file
- elementFormDefault="qualified"
  - Default namespace
    - xmlns tags a whole section
    - Advantages
      - Great for simple ns use
    - Disadvantages
      - Can be confusing for authors when drawing on multiple schemas
  - Namespace prefixes
    - Individually tag elements
    - Advantages
      - Visually explicit about ns membership
      - Can mix in elements from different ns with the same name.
    - Disadvantages
      - Error-prone, ugly when drawing on multiple schemas
- elementFormDefault="unqualified"
  - Unqualified: tag (global) root element
    - Advantages
      - Single prefix tag needed at the top of the document; no other tags needed
      - Good for when using multiple schemas together
      - Least error prone
      - Makes XPaths simpler
    - Disadvantage
      - Can't have 2 elements from different ns w/ same name
    - IVOA VOResource uses this technique
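The "Makes XPaths simpler" point can be seen in a small Python sketch (standard library only; the two tiny documents are invented for illustration and are not taken from the exercises). Under the qualified form every child element carries the namespace, so every path step has to name it; under the unqualified form only the root does.

```python
import xml.etree.ElementTree as ET

URI = "http://nvoss.org/VOResource"

# elementFormDefault="qualified": child elements are in the namespace too.
qualified = ET.fromstring(
    f'<resource xmlns="{URI}"><title>ADIL</title></resource>')

# elementFormDefault="unqualified": only the (global) root element is tagged.
unqualified = ET.fromstring(
    f'<res:resource xmlns:res="{URI}"><title>ADIL</title></res:resource>')

print(qualified.find("title"))                # no match: title is namespaced
print(qualified.findtext(f"{{{URI}}}title"))  # path must carry the namespace
print(unqualified.findtext("title"))          # the plain path works
```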
30A word about namespaces in instance
documents(FYI)
Let's try Exercise 2
31XPath
- What it is
  - A W3C standard syntax for pointing to elements, attributes, and/or their values in an XML file
- Why you might care
  - XPath is used in two other important XML technologies: XQuery and XSL
  - XPath is used in ADQL to query a registry via the standard Registry Interface
- What you'll get from this session
  - Ability to form simple XPath queries
32Can you tell me how to get to Sesame Street?
- An XPath is a set of directions from one point in an XML document to another

<NewYork>
  <Borough name="Brooklyn">
    <light/>
    <light/>
    <light>
      <SesameStreet>
        <weather>sunny</weather>
      </SesameStreet>
    </light>
  </Borough>
  <Borough name="Queens"/>
</NewYork>

/NewYork/Borough[@name="Brooklyn"]/light[3]/SesameStreet
- Begin at the start of the document
33Can you tell me how to get to Sesame Street?
- An XPath is a set of directions from one point in an XML document to another (same NewYork document as on slide 32)

/NewYork/Borough[@name="Brooklyn"]/light[3]/SesameStreet
- Go to NewYork
34Can you tell me how to get to Sesame Street?
- An XPath is a set of directions from one point in an XML document to another (same NewYork document as on slide 32)

/NewYork/Borough[@name="Brooklyn"]/light[3]/SesameStreet
- Find the Borough named Brooklyn
35Can you tell me how to get to Sesame Street?
- An XPath is a set of directions from one point in an XML document to another (same NewYork document as on slide 32)

/NewYork/Borough[@name="Brooklyn"]/light[3]/SesameStreet
- Go to the 3rd light
36Can you tell me how to get to Sesame Street?
- An XPath is a set of directions from one point in an XML document to another (same NewYork document as on slide 32)

/NewYork/Borough[@name="Brooklyn"]/light[3]/SesameStreet
- And there's SesameStreet
37XPath syntax
- XPath fields (between the /'s)
  - Each represents a descent in the XML hierarchy
  - // means drop any number of levels
  - Points to an XML node
    - element name or, if preceded with an @, an attribute name
  - Other useful node-matching symbols
    - * wildcard (any name), . current node, .. parent node
- Context node: starting point
  - If it does not start with a /, an XPath is relative to a context-specific starting point.
  - /NewYork/Borough[@name="Brooklyn"]
    - @name is an XPath relative to /NewYork/Borough
- Predicates: [ ]
  - Read as "where"
  - XPaths inside resolve to the string value inside the element/attribute pointed to
  - Operators: = != < > <= >= and or
  - Many useful functions: contains(string, string), position(), count(), last(), local-name()
  - [3] is short-hand for [position()=3]
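The path syntax can be tried out in Python with the standard library's xml.etree.ElementTree, which supports a limited subset of XPath: relative paths, //, *, attribute predicates and positional predicates, but not functions such as contains(). This is only a sketch under that assumption, and a full XPath engine would accept the absolute form shown on the slides.

```python
import xml.etree.ElementTree as ET

new_york = ET.fromstring("""
<NewYork>
  <Borough name="Brooklyn">
    <light/>
    <light/>
    <light>
      <SesameStreet><weather>sunny</weather></SesameStreet>
    </light>
  </Borough>
  <Borough name="Queens"/>
</NewYork>""")

# The context node is the root element <NewYork>, so the path is written
# relative to it: "." stands in for the leading /NewYork step.
street = new_york.find("./Borough[@name='Brooklyn']/light[3]/SesameStreet")
print(street.tag)

# "//" drops any number of levels; "*" matches any element name.
print(len(new_york.findall(".//weather")))
print([b.get("name") for b in new_york.findall("./*")])
```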
38XPath as a Query
- An XPath returns matched XML nodes
- /NewYork/Borough[@name="Brooklyn"]/light[3]/SesameStreet
- On

<NewYork>
  <Borough name="Brooklyn">
    <light/>
    <light/>
    <light>
      <SesameStreet>
        <weather>sunny</weather>
      </SesameStreet>
    </light>
  </Borough>
  <Borough name="Queens"/>
</NewYork>

Returns:
<SesameStreet>
  <weather>sunny</weather>
</SesameStreet>
39XPath as a Query
- An XPath returns matched XML nodes
- /NewYork/Borough/light
- On the same NewYork document

Returns:
<light/>
<light/>
<light>
  <SesameStreet>
    <weather>sunny</weather>
  </SesameStreet>
</light>
40XPath as a Query
- An XPath returns matched XML nodes
- /NewYork/Borough/light/SesameStreet/weather
- On the same NewYork document

Returns:
<weather>sunny</weather>
41XPath as a Query
- An XPath returns matched XML nodes
- string(/NewYork/Borough/light/SesameStreet/weather)
- On the same NewYork document

Returns:
sunny
42XPath as a Query
- An XPath returns matched XML nodes
- /NewYork/Borough/light/SesameStreet[weather='sunny']
- On the same NewYork document

Returns:
<SesameStreet>
  <weather>sunny</weather>
</SesameStreet>

weather was automatically converted to a string before the comparison operator was applied.
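The query results on these slides can be reproduced approximately in Python. This is a sketch only: the standard library's ElementTree implements a subset of XPath, so findtext() stands in here for the string() function, and paths are written relative to the root element rather than starting with /NewYork.

```python
import xml.etree.ElementTree as ET

new_york = ET.fromstring(
    '<NewYork><Borough name="Brooklyn"><light/><light/>'
    "<light><SesameStreet><weather>sunny</weather></SesameStreet></light>"
    '</Borough><Borough name="Queens"/></NewYork>')

# An ambiguous path returns every matching node.
lights = new_york.findall("./Borough/light")

# A predicate compares the string value of <weather> with 'sunny'.
sunny = new_york.findall("./Borough/light/SesameStreet[weather='sunny']")

# A path that matches nothing returns an empty result.
rainy = new_york.findall("./Borough/light/SesameStreet[weather='rainy']")

# findtext() returns the string value of the first match.
weather = new_york.findtext("./Borough/light/SesameStreet/weather")

print(len(lights), [e.tag for e in sunny], rainy, weather)
```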
43XPath as a Query
- An XPath returns matched XML nodes
  - If the path is ambiguous, all matching nodes are returned
  - If the path does not resolve to an existing node, the empty set is returned
- Predicates, [ ], provide constraints
- In some contexts, an XPath is automatically converted to the string value inside the matched element or attribute.
- Examples: querying a set of VOResource documents
  - /Resource[contains(content/description, 'cluster')]
    - Return all resource elements where the description contains the word cluster
  - /Resource[facility]
    - Return all resources that have a facility element
  - /Resource/capability[@xsi:type='cs:ConeSearch']/interface/accessURL
    - Return the interface URLs of all ConeSearch services
  - /Resource[@xsi:type='vs:DataCollection']/coverage/stc:STCResourceProfile//stc:AllSky
44XPath as a Query
Let's try Exercise 3
45XQuery (a.k.a. XML Query)
- What it is
  - A W3C standard for querying XML documents
  - An analogue to SQL for tables
- Why you might care
  - It is one of the supported query languages in the standard VO Registry Interface.
  - It can handle certain complex registry queries that ADQL cannot
- What you'll get from this session
  - Ability to query XML documents by modifying an existing XQuery
46XQuery an analogy to SQL
- SQL queries tables
  - Result of an SQL statement is a table
  - Columns of the result table are controlled by the SELECT clause
  - Rows are controlled by the WHERE clause
- XQuery queries XML documents
  - Result of an XQuery is an XML document
  - The form of the XML document is set by the return clause
  - The contents of the result are controlled by the for, let, and where clauses
47XQuery syntax think FLWOR
- XQuery supports several types of expressions that can return XML results
  - XPath, Constructors,
  - FLWOR: for let where order by return
- FLWOR clauses
  - for/let clause: selects data from source XML documents
48XQuery syntax think FLWOR
declare namespace vr = "http://www.ivoa.net/xml/VOResource/v0.10";
declare namespace vs = "http://www.ivoa.net/xml/VODataService/v0.5";

for $vr in //Resource[capability/@xsi:type="cs:ConeSearch"]
where contains($vr//description, "quasar")
return
  <conesearch>
    <title>{string($vr/title)}</title>
    <url>{string($vr/capability/interface/accessURL)}</url>
  </conesearch>

- Searching for Cone Search services suitable for getting data about quasars
- To be used with the Carnivore Registry
49XQuery syntax think FLWOR
Declare namespace prefixes to use in query
50XQuery syntax think FLWOR
Declare namespace prefixes to use in query
Loop over all ConeSearch resources
- for clause sets up a loop around matching occurrences
  - XPath both selects the Resource element node to put into the variable and constrains which Resources are included
- let clause (not used here) can also set variables.
  - Used to join across documents, self-joins
  - If $vr is used in the variable definition, the new value would be different in each pass of the loop.
51XQuery syntax think FLWOR
Restrict output to ConeSearch services about quasars
- where clause further restricts the output
- Optional, often don't need it: the same condition can go into the XPath predicate
    for $vr in //vr:Resource[@xsi:type="cs:ConeSearch" and
                             contains(.//vr:description, "quasar")]
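To see why the where clause is often optional, here is a rough Python analogue in which the same restriction is written both ways. The sample records and the helper names is_cone_search and mentions are assumptions made for this sketch, and the substring test only approximates XPath's contains().

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical records, made up for this sketch.
REGISTRY = """
<registry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Resource xsi:type="cs:ConeSearch">
    <title>Quasar Catalog</title>
    <description>Positions of known quasars</description>
  </Resource>
  <Resource xsi:type="cs:ConeSearch">
    <title>Star Catalog</title>
    <description>Bright stars</description>
  </Resource>
</registry>
"""

def is_cone_search(resource):
    return resource.get(XSI_TYPE) == "cs:ConeSearch"

def mentions(resource, word):
    # Roughly contains(.//description, word): test the description text.
    return any(word in (d.text or "") for d in resource.iter("description"))

root = ET.fromstring(REGISTRY)

# Style 1: select in the 'for', then restrict in a separate 'where'.
selected = [r for r in root.iter("Resource") if is_cone_search(r)]
with_where = [r for r in selected if mentions(r, "quasar")]

# Style 2: both conditions folded into a single predicate.
with_predicate = [r for r in root.iter("Resource")
                  if is_cone_search(r) and mentions(r, "quasar")]

titles_where = [r.findtext("title") for r in with_where]
titles_predicate = [r.findtext("title") for r in with_predicate]
print(titles_where == titles_predicate)
```

Both styles pick out the same records; a separate where clause mainly helps readability when the condition is long or involves more than one variable.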
52 XQuery syntax: think FLWOR
Extract and display desired information
- return clause sets the output template
- { } used to denote non-literal output
- Applied to each value of $vr
- XQuery supports several other expression types not shown here
- Conditionals, function definition, etc.
- Many more predefined functions
- XQuery: a full XML processing language
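The idea that the return clause is a template applied once per loop value can be illustrated with a small Python analogue. This is a sketch under assumptions: the registry content, the example.org URL, and the function name build_results are invented, and ElementTree stands in for an XQuery processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry content, made up for this sketch.
REGISTRY = """
<registry>
  <Resource>
    <title>Quasar Catalog</title>
    <description>Positions of known quasars</description>
    <capability>
      <interface><accessURL>http://example.org/qso</accessURL></interface>
    </capability>
  </Resource>
  <Resource>
    <title>Star Catalog</title>
    <description>Bright stars</description>
    <capability>
      <interface><accessURL>http://example.org/stars</accessURL></interface>
    </capability>
  </Resource>
</registry>
"""

def build_results(root):
    """Apply the output template once for each Resource that passes the filter."""
    results = []
    for vr in root.iter("Resource"):                       # for
        if "quasar" not in (vr.findtext("description") or ""):
            continue                                       # where
        out = ET.Element("conesearch")                     # return: literal tags...
        # ...with computed content, the part XQuery writes inside { }.
        ET.SubElement(out, "title").text = vr.findtext("title")
        ET.SubElement(out, "url").text = vr.findtext(
            "capability/interface/accessURL")
        results.append(out)
    return results

for element in build_results(ET.fromstring(REGISTRY)):
    print(ET.tostring(element, encoding="unicode"))
```

The literal element names play the role of the fixed part of the template, while the values pulled from each record correspond to the enclosed expressions.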
53 XSL: Extensible Stylesheet Language
- What it is
- A language for describing the transformation of XML data from one form to another
- XML -> HTML
- XML -> XML
- XML -> plain text
- Why you might care
- XSL can be used to create human-readable renderings of XML data from the VO
- XSL is used by the Java Skynode toolkit for converting ADQL to local SQL
- What you'll get from this session
- Ability to modify XML transformations via simple changes to a stylesheet
54 Extensible Stylesheet Language
- An XML document that describes a transformation
- Contains a list of templates
- Each describes the transformation for one type of node (e.g. an element with a particular name)
- Node is identified by an XPath
- Relative to current context node
- Usually one template for /, the root of the document
- A template can call other templates
- XSL provides a number of programming structures
- Conditionals, looping, user-defined functions, built-in functions, extensibility
- Variables are immutable!
- XSLT: XSL Transformations
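The template model described above can be mimicked in a few lines of Python. This is only an analogy, not how an XSLT engine is implemented: the registry dictionary TEMPLATES, the decorator template, and the function apply_templates are names invented for this sketch, and matching is by element name only rather than by full XPath patterns.

```python
import xml.etree.ElementTree as ET

TEMPLATES = {}  # element name -> function that renders that kind of node

def template(match):
    """Register a function as the template for elements named `match`."""
    def register(func):
        TEMPLATES[match] = func
        return func
    return register

def apply_templates(nodes):
    """Run the matching template on each node and concatenate the output."""
    return "".join(TEMPLATES[node.tag](node)
                   for node in nodes if node.tag in TEMPLATES)

@template("Resource")
def resource_template(node):
    # Paths here are relative to the matched node, and this template
    # hands control to the template for its content children.
    return node.findtext("title", "") + "\n" + apply_templates(node.findall("content"))

@template("content")
def content_template(node):
    return node.findtext("description", "") + "\n"

doc = ET.fromstring(
    "<Resource><title>ADIL</title>"
    "<content><description>An image library</description></content></Resource>"
)
output = apply_templates([doc])
print(output)
```

Each template returns its output instead of updating shared state, which is close in spirit to working with immutable variables.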
55 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns="http://www.ivoa.net/xml/VOResource/v1.0"
                xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
                xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
Resource Description Record
    <xsl:apply-templates select="Resource" />
  </xsl:template>
56 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Define prefixes for all namespaces we'll be using
57 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Our output format will be plain text
- Three output types: xml, html, text
58 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Our root document template sets up the output document and calls the next template
- When raw text appears, the XSLT engine will preserve spacing (like carriage returns) around the text
- Resource is the root of our data document
59 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
<xsl:template match="Resource">
  <xsl:value-of select="substring-after(@xsi:type,':')"/>
  <xsl:text>
</xsl:text>
  <xsl:value-of select="title"/>
  <xsl:text> (</xsl:text>
  <xsl:value-of select="shortName"/>
  <xsl:text>)
IVOA Identifier: </xsl:text>
  <xsl:value-of select="identifier"/>
  <xsl:apply-templates select="content" />
  <xsl:apply-templates select="curation" />
</xsl:template>

<xsl:template match="content">
  <xsl:apply-templates select="description" />
  <xsl:text>
</xsl:text>
Resource template
- When a template runs, it changes the context node to the node matched by the template
- Subsequent XPaths within the template are relative to that context node
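The effect of the context node on relative paths can be tried out with a small Python sketch. The document below, with a title at two levels, is invented purely to make the point and is not meant to reflect the VOResource schema; select_text is a hypothetical helper standing in for evaluating a relative XPath.

```python
import xml.etree.ElementTree as ET

# Invented document: the same element name appears at two levels.
doc = ET.fromstring(
    "<Resource>"
    "<title>Resource title</title>"
    "<related><title>Related title</title></related>"
    "</Resource>"
)

def select_text(context, path):
    """Evaluate a relative path starting from the given context node."""
    return context.findtext(path)

resource = doc                   # context inside a template matching Resource
related = doc.find("related")    # context inside a template matching related

print(select_text(resource, "title"))   # Resource title
print(select_text(related, "title"))    # Related title
```

The path "title" is identical in both calls; only the context node differs, which is why the same select expression can mean different things in different templates.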
60 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
text tags can be used to take explicit control
of spacing
61 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
value-of will print the string values of nodes
- Our XPaths point to elements
- Relative to vr:Resource!
- value-of will convert it to a string
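What "string value" means for an element, and what substring-after does to a prefixed type name, can be approximated in Python. The helper names string_value and substring_after are written for this sketch, and the sample element and the type name vr:Organisation are illustrative inputs rather than content taken from xmltech-adil.xml.

```python
import xml.etree.ElementTree as ET

def string_value(element):
    """XPath string value of an element: all descendant text, in document order."""
    return "".join(element.itertext())

def substring_after(text, marker):
    """Mimic XPath substring-after(): the text following the first marker."""
    if marker == "":
        return text            # XPath returns the whole string for an empty marker
    head, found, tail = text.partition(marker)
    return tail if found else ""

title = ET.fromstring(
    "<title>The <em>Astronomy</em> Digital Image Library</title>")
print(string_value(title))                       # The Astronomy Digital Image Library
print(substring_after("vr:Organisation", ":"))   # Organisation
```

Nested markup is flattened into its text, which is why pointing value-of at an element is enough to print its content.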
62 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Pass control to other templates
63 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Loop over all occurrences of contentLevel
- Normally, apply-templates will automatically loop over multiple occurrences
- Here, we need to insert commas
- for-each also changes the context node
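The reason an explicit loop is needed for the commas can be shown with a Python analogue. This is a guess at the general technique, not a transcription of the stylesheet's for-each, which is not reproduced here; the sample contentLevel values and the function name comma_separated are assumptions.

```python
import xml.etree.ElementTree as ET

# Illustrative content element with several contentLevel children.
content = ET.fromstring(
    "<content>"
    "<contentLevel>University</contentLevel>"
    "<contentLevel>Research</contentLevel>"
    "<contentLevel>Amateur</contentLevel>"
    "</content>"
)

def comma_separated(context, name):
    """Visit each matching child in turn, adding a comma after all but the last."""
    levels = context.findall(name)
    pieces = []
    for position, level in enumerate(levels, start=1):
        # Inside the loop, `level` is the current node, like the context
        # node inside a for-each.
        pieces.append(level.text or "")
        if position != len(levels):   # comparable to testing position() != last()
            pieces.append(", ")
    return "".join(pieces)

print(comma_separated(content, "contentLevel"))   # University, Research, Amateur
```

Knowing the position within the loop is what makes it possible to treat the last item differently, which a plain per-node template does not do on its own.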
64 A tour through a stylesheet: xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text