Title: XML, XMLSchema, RDF and RDFSchema
1. XML, XML-Schema, RDF and RDF-Schema
2. XML: What is XML?
- XML (eXtensible Markup Language) is a framework for defining markup languages
- XML derives from SGML
- XML is a standardization effort by the W3C
- XML conforms to ISO 8879 (the SGML standard)
- XML is more a syntax than a language, i.e. there is no fixed set of markup tags
3. XML: What was XML designed for?
- To separate syntax from semantics in order to provide a common framework for structuring information
- To allow tailor-made markup for any possible application domain
- To support internationalization and platform independence
- To be the future of structured information
4. XML: How does XML differ from HTML?
- New tags and attributes can be defined
- Document structures can be nested to any level of complexity
- An XML document can contain or refer to optional descriptions of its grammar for use by applications that need to perform structure validation
5. XML: How does XML differ from HTML?
- We can structure the same information in HTML and XML

    <?xml version="1.0"?>
    <employees>
      <marketing>
        <employee id="1834">
          <name>Gustav Sielmann</name>
          <email>gsielmann_at_Dot.com</email>
          <tel>43/0662/723942-124</tel>
          <fax>43/0662/723942-800</fax>
        </employee>
      </marketing>
    </employees>

    <html>
      <head>
        <title>Employees</title>
      </head>
      <body>
        <h1>Marketing</h1>
        <h2>Gustav Sielmann (1834)</h2>
        <p>e-mail: gsielmann_at_Dot.com</p>
        <p>Tel. 43/0662/723942-124</p>
        <p>Fax. 43/0662/723942-800</p>
      </body>
    </html>
6. XML: XML in a Nutshell
- XML provides a standardized syntax for markup languages
- XML uses elements and attributes to define a tree structure
- An XML document can have a tree structure of arbitrary complexity
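
To make the tree structure concrete, here is a minimal Python sketch that parses a small employee record with the standard-library `xml.etree.ElementTree` module and lists every element with its nesting depth. The helper name `outline` and the shortened sample document are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened employee record, used here only as sample input.
DOC = """<?xml version="1.0"?>
<employees>
  <marketing>
    <employee id="1834">
      <name>Gustav Sielmann</name>
      <tel>43/0662/723942-124</tel>
    </employee>
  </marketing>
</employees>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for element and its descendants, in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(DOC)
tree_rows = outline(root)
for depth, tag in tree_rows:
    # Indent each tag by its depth to show the nesting.
    print("  " * depth + tag)
```

Elements form the nodes of the tree, while attributes such as `id` hang off a node and are read with `element.get("id")`.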
7. XML: The structure of an XML document
- An XML document is composed of
- Prolog
- Elements
- Attributes
- Entity References
- Comments
- Possibly a DTD (Document Type Definition)
8. XML: The Prolog
- The Prolog is the first structural element in the XML document
- It is usually divided into an XML declaration and (optionally) a DTD
- E.g.

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
9. XML: Elements
- All subsequent elements must be within one document element
- XML elements must contain a start tag and a matching end tag prefixed by a slash, e.g. <YEAR>1976</YEAR>
- Empty elements can be written <YEAR/> instead of using both tags without content
- XML is case-sensitive!
- Element names must begin with an underscore or a letter. Subsequent characters in the element name may include letters, digits, underscores, hyphens and periods
10. XML: Attributes
- XML attributes are attached to elements
- They are a way of associating values with an element without making them part of the actual content
- Attribute names must begin with a letter or an underscore and must not contain any white space

    <employee name="Gustav Sielmann"> gsielman_at_Dot.com </employee>
11. XML: Entity References
- Entity references are used to reference data that is not directly in the structure
- Entity references can be internal or external
- Pre-built entity references are used to represent &, <, >, " and '
- E.g. the string Peter&Tom("Don't cry for me") would be written

    Peter&amp;Tom(&quot;Don&apos;t cry for me&quot;)
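
As a small sketch of the five pre-built entity references, the Python standard library can produce and reverse this encoding. `escape` and `unescape` from `xml.sax.saxutils` handle `&`, `<` and `>` on their own; the two quote characters are supplied through the dictionaries below, whose names are chosen for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() replaces &, < and > by default; the quote characters
# are passed in explicitly as extra replacements.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

raw = 'Peter&Tom("Don\'t cry for me")'
encoded = escape(raw, QUOTES)
decoded = unescape(encoded, UNQUOTES)

print(encoded)  # Peter&amp;Tom(&quot;Don&apos;t cry for me&quot;)
```

Decoding the encoded string gives back the original text, which is what an XML parser does when it reads character data.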
12. XML: Why use Entity References?
- to make maintenance easy and scalable
- for convenience
- to use symbols we could not otherwise use
13. XML: Comments
- Comments are a special set of tags that start with <!-- and end with -->
- All data written between these two tags is ignored by the XML processor
- Comments are usually used to make small notes inside the XML document or to comment out entire sections of XML code

    <!-- I HAVE TO GET GUSTAV'S EMAIL
    <employee name="Gustav Sielmann">
      <email/>
    </employee>
    -->
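
The effect of commenting out a section can be observed with a parser. The sketch below, which uses Python's standard `xml.etree.ElementTree` with its default settings, parses a made-up document in which one employee is commented out and one is not.

```python
import xml.etree.ElementTree as ET

# Sample document: the first employee sits inside a comment.
DOC = """<employees>
  <!-- I HAVE TO GET GUSTAV'S EMAIL
  <employee name="Gustav Sielmann"> <email/> </employee>
  -->
  <employee name="Arnold Rummer"/>
</employees>"""

root = ET.fromstring(DOC)

# Only elements outside the comment appear in the parsed tree.
names = [e.get("name") for e in root.iter("employee")]
print(names)  # ['Arnold Rummer']
```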
14. XML: Well-formedness and Validity
With an XML parser an XML document can be checked for two things:
- Well-formedness, i.e. whether the document obeys the syntactical rules of XML (has a prolog, a document element, all elements closed, etc.)
- Validity, i.e. whether the document obeys the rules in a DTD or in an XML-Schema
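
A well-formedness check can be sketched in a few lines of Python: the standard-library parser raises `ParseError` as soon as a syntactical rule is broken. The function name `is_well_formed` and the sample strings are assumptions for this illustration. Note that this parser does not validate, so checking validity against a DTD or schema would need a separate validating tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False if the parser reports a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = "<YEAR>1976</YEAR>"
unclosed = "<YEAR>1976"                              # missing end tag
wrong_case = "<YEAR>1976</year>"                     # XML is case-sensitive
two_roots = "<YEAR>1976</YEAR><YEAR>1977</YEAR>"     # more than one document element

for sample in (good, unclosed, wrong_case, two_roots):
    print(is_well_formed(sample), sample)
```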
15. XML: Document Type Definitions
- DTDs (Document Type Definitions) contain a list of the elements, tags, attributes and entity references contained in an XML document and describe their relationships to each other
- A DTD specifies a set of rules for the structure of a document, which makes it easy to share data with everyone who conforms to the same encoding standard
16. XML: Why use DTDs?
- to define a grammar for one or several XML documents
- to check XML documents against this grammar
17. XML: DTDs
Document Type Definitions can be
- Internal, i.e. placed in the prolog of an XML document, or
- External, being identified by a URL, thus making it easy to share the same encoding standard with other people
18. XML: Structure of a DTD
- A DTD always starts with <!DOCTYPE and always ends with >
- Directly after the <!DOCTYPE comes the name of the document element, followed by a [
- Then comes a list of all elements and attributes contained in the XML file, including the document element
19. XML: Example of a DTD
XML Structure:

    <employees>
      <marketing>
        <employee id="1834">
          <name>Gustav Sielmann</name>
          <email>gsielmann_at_Dot.com</email>
          <tel>43/0662/723942-124</tel>
          <fax>43/0662/723942-800</fax>
        </employee>
      </marketing>
    </employees>

DTD:

    <!DOCTYPE employees [
      <!ELEMENT employees (marketing)>
      <!ELEMENT marketing (employee)>
      <!ELEMENT employee (name,email,tel,fax?)>
      <!ATTLIST employee id CDATA #IMPLIED>
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT email (#PCDATA)>
      <!ELEMENT tel (#PCDATA)>
      <!ELEMENT fax (#PCDATA)>
    ]>
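
To give a feel for what a validating parser checks, the following Python sketch hand-translates one DTD rule, the content model (name,email,tel,fax?) for employee, into a regular expression and tests the order of child elements against it. This is an illustration only, not DTD validation: Python's standard library has no validating parser, and the names `EMPLOYEE_MODEL` and `employee_matches_model` are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated from <!ELEMENT employee (name,email,tel,fax?)>:
# name, email and tel in that order, followed by an optional fax.
EMPLOYEE_MODEL = re.compile(r"name,email,tel(,fax)?")


def employee_matches_model(employee):
    """Check the order of an employee's child elements against the content model."""
    child_tags = ",".join(child.tag for child in employee)
    return EMPLOYEE_MODEL.fullmatch(child_tags) is not None


with_fax = ET.fromstring(
    "<employee id='1834'><name/><email/><tel/><fax/></employee>")
without_fax = ET.fromstring("<employee><name/><email/><tel/></employee>")
wrong_order = ET.fromstring("<employee><email/><name/><tel/></employee>")

for candidate in (with_fax, without_fax, wrong_order):
    print(employee_matches_model(candidate))
```

The first two candidates satisfy the rule because fax is optional; the third fails because name must come before email.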
20. XML: Companion Standards
- XML Namespaces allow for modular document definition, multiple inheritance and collision avoidance
- XPath, or the XML Path Language, allows navigation of the document tree
- XPointer allows tree components as targets
- The XML Linking Language (XLink) defines linking capability
21. XML: Companion Standards
- The XML Style Language (XSL) defines presentation capability
- XSLT provides for the transformation of documents
- XML Schema allows document grammars, the role played by DTDs, to be defined as XML documents, and allows custom data types to be defined in order to control content values
22. XML: XML Namespaces
- The XML namespaces recommendation defines a way to distinguish between duplicate element type and attribute names
- An XML namespace is a collection of element type and attribute names. The namespace is identified by a unique name, which is a URI
- XML namespaces are declared with an xmlns attribute, which can associate a prefix with the namespace
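
As a sketch of collision avoidance, the Python example below declares two namespaces with xmlns attributes and uses the same local name, title, in both. The namespace URIs, prefixes and element content are made up for this illustration. After parsing with the standard-library `xml.etree.ElementTree`, each tag carries its namespace URI, so the two title elements remain distinct.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are hypothetical and exist only for this example.
DOC = """<doc xmlns:hr="http://example.com/hr"
     xmlns:cat="http://example.com/catalog">
  <hr:title>Marketing Manager</hr:title>
  <cat:title>Employee Handbook</cat:title>
</doc>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to {namespace-URI}local-name.
tags = [child.tag for child in root]
print(tags)

# A prefix map lets searches use short prefixes instead of full URIs.
NS = {"hr": "http://example.com/hr"}
job_title = root.find("hr:title", NS).text
print(job_title)
```

The prefix itself is only a local abbreviation; what identifies the namespace is the URI.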
23. XML: XPath
- XPath is a non-XML language used to identify particular parts of XML documents
- XPath lets you write expressions that refer to, for example, the document's first person element, the seventh child element of the third person element, or the ID attribute of the first person element whose contents are the string "Fred Jones"
- XSLT and XPointer use XPath to identify particular points in an XML document
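
Two of these expressions can be tried out in Python. Note the assumptions: the standard-library `xml.etree.ElementTree` supports only a limited subset of XPath (the text predicate used here needs Python 3.7 or later), and the sample document with its names and IDs is made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the names and IDs are made up.
DOC = """<people>
  <person ID="p1">Ann Smith</person>
  <person ID="p2">Fred Jones</person>
  <person ID="p3">Fred Jones</person>
</people>"""

root = ET.fromstring(DOC)

# The document's first person element (positions start at 1).
first_person = root.find("person[1]")

# The first person element whose contents are the string "Fred Jones",
# followed by a lookup of its ID attribute.
fred = root.find("person[.='Fred Jones']")
fred_id = fred.get("ID")

print(first_person.text, fred_id)
```

In full XPath the attribute could be selected within the expression itself; with this subset the attribute is read from the matched element instead.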
24. XML: XPointer
- XPointer defines an addressing scheme for individual parts of an XML document
- XPointers enable you to target a given element by number, name, type, or relation to other elements in the document
- XPointers use the XPath syntax to identify the parts of the document they point to
25. XML: XML Linking Language
- The XML Linking Language (XLink) allows elements to be inserted into XML documents in order to create and describe links between resources
- XLink provides a framework for creating both basic unidirectional links and more complex linking structures, e.g. linking relationships among more than two resources or links with associated metadata
26. XML: XML Style Language
- XSL (officially the Extensible Stylesheet Language) is a language for expressing stylesheets for XML documents
- XSL describes how to display an XML document of a given type
- XSL defines a set of elements called Formatting Objects and attributes (in part borrowed from CSS2 properties and adding more complex ones)
27. XML: XSLT
- XSLT (XSL Transformations) is a language for transforming XML documents into other XML documents
- XSLT is also designed to be used independently of XSL. However, XSLT is not intended as a completely general-purpose XML transformation language. Rather it is designed primarily for the kinds of transformations that are needed when XSLT is used as part of XSL
- A transformation expressed in XSLT describes rules for transforming a source tree into a result tree
28. XML-Schema: Why use XML-schema?
In order to validate an XML document you use XML-schema to specify
- The allowed structure of an XML document
- The allowed data types contained in one (or several) XML documents
29. XML-Schema: Why use schemas instead of DTDs?
- XML-schema documents are themselves XML documents; they therefore use the same syntax and can themselves be validated with schemas
- XML-schema offers more powerful ways to define custom data types
30. XML-Schema: Why use schemas instead of DTDs?
- XML-schema allows elements with nil content to be defined
- XML-schema allows multiple elements with the same name but different content to be defined
- XML-schema uses XML namespaces
31. XML-Schema: A Simple XML-Schema

    <?xml version="1.0" encoding="UTF-8"?>
    <marketing xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="http://www.Dot.com/mySchema.xsd">
      <employee>Gustav Sielmann</employee>
      <employee>Arnold Rummer</employee>
      <employee>Johann Neumeier</employee>
    </marketing>

    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="marketing">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="employee" type="xsd:string"
                         maxOccurs="unbounded"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
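
Because a schema is itself an XML document, it can be read with any XML parser. The Python sketch below loads the marketing schema with the standard-library `xml.etree.ElementTree` and lists the declared elements with their type and maxOccurs values. It only reads the schema; it does not validate anything against it, and the variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C XML Schema language.
XSD = "http://www.w3.org/2001/XMLSchema"

SCHEMA = f"""<xsd:schema xmlns:xsd="{XSD}">
  <xsd:element name="marketing">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="employee" type="xsd:string" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# Collect every xsd:element declaration as name -> (type, maxOccurs).
declarations = {
    e.get("name"): (e.get("type"), e.get("maxOccurs"))
    for e in schema.iter(f"{{{XSD}}}element")
}
print(declarations)
```

The marketing element has no type attribute because its type is given inline as an anonymous complex type.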
32. XML-Schema: XML-Schema Structure
An XML-schema is composed of
- The Schema Element
- Element Definitions
- Attribute Definitions
- Annotations
- Type Definitions
33. XML-Schema: The Schema Element
- The schema element is the container element in which all elements, attributes and data types used in an XML document are declared
- The schema element refers to the XML-schema definition at the W3C

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      . . . . .
    </xsd:schema>
34. XML-Schema: Defining Elements
- Elements must have a name and a type
- Elements can contain simple, predefined data types
- Elements can be defined in regard to their cardinality
- Elements can refer to other element definitions

    <xsd:element name="employee" type="xsd:string" maxOccurs="unbounded"/>
35. XML-Schema: Defining Attributes
- Like elements, attributes must have a name and a type
- Attributes can use custom data types
- Attributes can be restricted in regard to cardinality or default values
- Attributes can refer to other attribute definitions

    <xsd:attribute name="country" type="xsd:string" fixed="Austria"/>
36. XML-Schema: Annotations
- XML-schema provides several tags for annotating a schema: documentation (intended for human readers), appinfo (intended for applications) and annotation
- documentation and appinfo usually appear as subelements of annotation

    <xsd:annotation>
      <xsd:documentation xml:lang="en">
        here goes the documentation text for the schema
      </xsd:documentation>
    </xsd:annotation>
37. XML-Schema: Data Type Definitions
XML-schema data types are either
- Pre-built Simple Types
- Derived from Simple Types
- Complex Types
38. XML-Schema: Simple Types
- Simple types are elements that contain data
- Simple types may not contain attributes or sub-elements
- New simple types are defined by deriving them from built-in simple types

    <xsd:simpleType name="mySimpleDayOfMonth">
      <xsd:restriction base="xsd:positiveInteger">
        <!--positiveInteger defines the minimum to be 1-->
        <xsd:maxInclusive value="31"/>
      </xsd:restriction>
    </xsd:simpleType>
39. XML-Schema: Some Built-in Simple Types
40. XML-Schema: Complex Types
- Complex types are elements that allow sub-elements and/or attributes
- Complex types are defined by listing the elements and/or attributes nested within
- Complex types are used if one wants to define groups or choices of elements

    <xsd:complexType name="myAdressType">
      <xsd:sequence>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="Email" type="xsd:string"/>
        <xsd:element name="Tel" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
41. XML-Schema: Complex Types
In complex types elements can be combined using the following constructs:
- Group: a collection of elements, usually used to refer to a common group of elements
- Sequence: all the named elements must appear in the sequence listed
- Choice: one and only one of the elements must appear
- All: all the named elements must appear, but in no specific order
42. XML-Schema: Mixed, Empty and Any Content
- Mixed content is used if you want to model elements that include both subelements and character data
- Empty content is used to define elements that must not include any subelements or character data
- Any content (the most basic data type) does not constrain the content in any way
43. XML-Schema: Inheritance
- XML-Schema provides a pseudo-inheritance via type derivations
- In XML-Schema all inheritance has to be defined explicitly
- New types can only be created by extending or restricting existing types
- Types can only be derived from one type; multiple inheritance is not supported
44. XML-Schema: Restricting a Type
- New simple types can be derived by constraining facets of a simple type
- The xsd:restriction element is used to state the base type

    <xsd:simpleType name="AustrianPostalCode">
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="1000"/>
        <xsd:maxInclusive value="9999"/>
      </xsd:restriction>
    </xsd:simpleType>
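
What the two facets mean for a concrete value can be sketched in plain Python. This is a hand-written imitation of the AustrianPostalCode restriction, not a schema validator; the function name and the sample values are assumptions for this illustration.

```python
import re

# Lexical form of an integer: optional sign followed by ASCII digits.
INTEGER = re.compile(r"[+-]?[0-9]+")


def is_austrian_postal_code(text):
    """Mimic the restriction: an integer between 1000 and 9999 inclusive."""
    text = text.strip()
    if INTEGER.fullmatch(text) is None:
        return False  # not an integer at all
    return 1000 <= int(text) <= 9999  # minInclusive and maxInclusive


for sample in ("5020", "999", "10000", "A-5020"):
    print(sample, is_austrian_postal_code(sample))
```

Because the base type is an integer, the check compares numeric values; a value written with a leading zero, such as 05020, is still the integer 5020 and passes.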
45. XML-Schema: Extending a Type
- New complex types can be derived by extending simple types
- The xsd:extension element is used to state the base type

    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:decimal">
          <xsd:attribute name="currency" type="xsd:string"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>