Title: COS 381
Agenda
- Questions?
- Resources
  - Source code for the textbook examples is available in Blackboard
  - Also at http://perleybrook.umfk.maine.edu/SourceCode/
  - HTML and XHTML examples
    - http://perleybrook.umfk.maine.edu/samples/
  - In-class work
    - http://perleybrook.umfk.maine.edu/SourceCode/inclassWork/
- Assignment 3 is not yet corrected
- Assignment 4 is posted
  - Due April 4
- Quiz 2 is not yet corrected
  - One student has yet to take the quiz
- Finish discussion on XML
XML Schemas
- Problems with DTDs
  1. The syntax is different from XML, so a DTD cannot be parsed with an XML parser
  2. It is confusing to deal with two different syntactic forms
  3. DTDs do not allow specification of particular kinds of data
- XML Schema is one of the alternatives to DTDs
- Two purposes
  1. Specify the structure of its instance XML documents
  2. Specify the data type of every element and attribute of its instance XML documents
- DTDs can be converted to schemas: http://www.hitsw.com/xml_utilites/
- Resource: http://www.w3.org/XML/Schema
XML Schemas (continued)
- Schemas are written using a namespace
  - http://www.w3.org/2001/XMLSchema
- Every XML schema has a single root element, schema
  - The schema element must specify the namespace for schemas as its xmlns:xsd attribute
- Every XML schema itself defines a tag set, which must be named
  - targetNamespace = "http://cs.uccs.edu/planeSchema"
  - For this class: http://perleybrook.umfk.maine.edu/ home directory
- If we want to include nested elements, we must set the elementFormDefault attribute to qualified
- The default namespace must also be specified
  - xmlns = "http://cs.uccs.edu/planeSchema"
XML Schemas (continued)
- A complete example of the opening tag of a schema element

  <xsd:schema
      xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
      targetNamespace = "http://cs.uccs.edu/planeSchema"
      xmlns = "http://cs.uccs.edu/planeSchema"
      elementFormDefault = "qualified">

- What each attribute does
  - xmlns:xsd: namespace for the schema itself
  - targetNamespace: namespace where elements defined here will be placed
  - xmlns: default namespace for this document
  - elementFormDefault: specifies that non-top-level elements are in the target namespace
XML Schemas (continued)
- The same schema element, written with the XML Schema namespace as the default namespace and the target namespace bound to the prefix planes

  <schema
      xmlns = "http://www.w3.org/2001/XMLSchema"
      targetNamespace = "http://cs.uccs.edu/planeSchema"
      xmlns:planes = "http://cs.uccs.edu/planeSchema"
      elementFormDefault = "qualified">

- What each attribute does
  - xmlns: namespace for the schema itself (here the default, so schema elements need no prefix)
  - targetNamespace: namespace where elements defined here will be placed
  - xmlns:planes: prefix for the target namespace in this document
  - elementFormDefault: specifies that non-top-level elements are in the target namespace
XML Schemas (continued)
- Defining an instance document
  - The root element must specify the namespaces it uses
    1. The default namespace
    2. The standard namespace for instances (XMLSchema-instance)
    3. The location where the default namespace is defined, using the schemaLocation attribute, which is assigned two values

  <planes
      xmlns = "http://cs.uccs.edu/planeSchema"
      xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation = "http://cs.uccs.edu/planeSchema planes.xsd">

- Data type categories
  1. Simple (strings only, no attributes and no nested elements)
  2. Complex (can have attributes and nested elements)
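
The two values of schemaLocation are a namespace name followed by the file that defines it. As a rough illustration only, the Python sketch below reads that pair from the root element using the standard library's ElementTree. The function name schema_location_pairs and the one-element sample document are assumptions made for this example; nothing here validates the document.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Sample instance document, trimmed to a single make element for brevity.
INSTANCE = """<planes
    xmlns = "http://cs.uccs.edu/planeSchema"
    xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation = "http://cs.uccs.edu/planeSchema planes.xsd">
  <make> Cessna </make>
</planes>"""


def schema_location_pairs(xml_text):
    """Return (namespace, schema file) pairs from the root's xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a qualified name as {namespace-uri}local-name.
    raw = root.get("{%s}schemaLocation" % XSI_NS, "")
    tokens = raw.split()
    # The attribute value is a whitespace-separated list of pairs.
    return list(zip(tokens[0::2], tokens[1::2]))


pairs = schema_location_pairs(INSTANCE)
root_tag = ET.fromstring(INSTANCE).tag

print(pairs)     # [('http://cs.uccs.edu/planeSchema', 'planes.xsd')]
print(root_tag)  # {http://cs.uccs.edu/planeSchema}planes
```

The second line of output shows the effect of the default namespace: the root element's full name includes the namespace, not just the word planes.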
XML Schemas (continued)
- XMLS defines over 40 data types: http://www.w3.org/TR/xmlschema-2/#built-in-datatypes
  - Primitive: string, boolean, float, decimal, ...
  - Derived: byte, integer, positiveInteger, ...
- User-defined (derived) data types specify constraints on an existing type (the base type)
  - Constraints are given in terms of facets (totalDigits, maxInclusive, etc.)
- Both simple and complex types can be either named or anonymous
- DTDs define global elements (context is irrelevant)
- With XMLS, context is essential, and elements can be either
  1. Local, which appears inside an element that is a child of schema, or
  2. Global, which appears as a child of schema
XML Schemas (continued)
- Defining a simple type
  - Use the element tag and set the name and type attributes

    <xsd:element name = "bird" type = "xsd:string" />

  - An instance could have

    <bird> Yellow-bellied sap sucker </bird>

  - Element values can be constant, specified with the fixed attribute

    fixed = "three-toed"

- User-defined types
  - Defined in a simpleType element, using facets specified in the content of a restriction element
  - Facet values are specified with the value attribute
XML Schemas (continued)

  <xsd:simpleType name = "middleName">
    <xsd:restriction base = "xsd:string">
      <xsd:maxLength value = "20" />
    </xsd:restriction>
  </xsd:simpleType>

- Categories of complex types
  1. Element-only elements (elements, no text)
  2. Text-only elements (text, no elements)
  3. Mixed-content elements (both text and elements)
  4. Empty elements
- Element-only elements
  - Defined with the complexType element
  - Use the sequence tag for nested elements that must be in a particular order
  - Use the all tag if the order is not important
XML Schemas (continued)

  <xsd:complexType name = "sports_car">
    <xsd:sequence>
      <xsd:element name = "make" type = "xsd:string" />
      <xsd:element name = "model" type = "xsd:string" />
      <xsd:element name = "engine" type = "xsd:string" />
      <xsd:element name = "year" type = "xsd:string" />
    </xsd:sequence>
  </xsd:complexType>

- Nested elements can include attributes that give the allowed number of occurrences (minOccurs and maxOccurs; maxOccurs can be unbounded)
- We can define nested elements elsewhere

  <xsd:element name = "year">
    <xsd:simpleType>
      <xsd:restriction base = "xsd:decimal">
        <xsd:minInclusive value = "1990" />
        <xsd:maxInclusive value = "2003" />
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
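
To make the meaning of the two facets concrete, here is a small Python sketch that reads minInclusive and maxInclusive from the year declaration and applies them to a candidate value. It is a hand-rolled check written only for illustration, not a schema validator; the function names read_facets and year_is_allowed are assumptions, and the xmlns:xsd attribute is added to the snippet so that it parses on its own.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XSD = "{http://www.w3.org/2001/XMLSchema}"

# The year declaration, with the xsd prefix declared so the snippet stands alone.
YEAR_DECL = """<xsd:element name = "year"
    xmlns:xsd = "http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType>
    <xsd:restriction base = "xsd:decimal">
      <xsd:minInclusive value = "1990" />
      <xsd:maxInclusive value = "2003" />
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>"""


def read_facets(decl_text):
    """Map each facet name to its value attribute, as written in the restriction."""
    decl = ET.fromstring(decl_text)
    restriction = decl.find(XSD + "simpleType/" + XSD + "restriction")
    # Strip the {namespace} part so the keys are plain facet names.
    return {facet.tag[len(XSD):]: facet.get("value") for facet in restriction}


def year_is_allowed(text, facets):
    """Check only the two inclusive-range facets used in this example."""
    value = Decimal(text.strip())
    low = Decimal(facets["minInclusive"])
    high = Decimal(facets["maxInclusive"])
    return low <= value <= high


facets = read_facets(YEAR_DECL)
print(facets)                             # {'minInclusive': '1990', 'maxInclusive': '2003'}
print(year_is_allowed(" 1995 ", facets))  # True
print(year_is_allowed("1977", facets))    # False
```

Both bounds are inclusive, so 1990 and 2003 themselves are accepted.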
XML Schemas (continued)
- The global element can be referenced in the complex type with the ref attribute

  <xsd:element ref = "year" />

- Validating instances of XML schemas
  - Can be done with several different tools
  - One of them is xsv, which is available from http://www.ltg.ed.ac.uk/~ht/xsv-status.html
  - Note: If the schema is incorrect (bad format), xsv reports that it cannot find the schema
- Example files: planes.xsd, Plane.xml
Modified Plane example from text

  <?xml version = "1.0"?>
  <!-- plane.xml
       A simple XML document for illustrating a schema
       The schema is in planes.xsd
  -->
  <?xml-stylesheet type = "text/css" href = "planes.css"?>
  <planes
      xmlns = "http://perleybrook.umfk.maine.edu/examples/planes"
      xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation = "http://perleybrook.umfk.maine.edu/examples/planes planes.xsd">
    <make> Cessna </make>
    <make> Piper </make>
    <make> Beechcraft </make>
  </planes>
Modified Plane example from text (continued)

  <?xml version = "1.0"?>
  <!-- planes.xsd
       A simple schema for plane.xml
  -->
  <xsd:schema
      xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
      targetNamespace = "http://perleybrook.umfk.maine.edu/examples/planes"
      xmlns = "http://perleybrook.umfk.maine.edu/examples/planes"
      elementFormDefault = "qualified">
    <xsd:element name = "planes">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name = "make"
                       type = "xsd:string"
                       minOccurs = "1"
                       maxOccurs = "unbounded" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>
Validating Instances of Schemas
- Various systems for validating instances against schemas
  - Online: http://www.w3.org/2001/03/webdata/xsv
  - XML support libraries include validation: Xerces from Apache, Saxon, Altova XML tools
  - Some IDEs have automatic validation: Altova XMLSpy, Eclipse with Oxygen, Eclipse with XML Buddy Pro
- Certain IDEs will use schemas to provide support for XML file creation
Displaying Raw XML Documents
- There is no presentation information in an XML document
- An XML browser should have a default style sheet for an XML document that does not specify one
  - You get a stylized listing of the XML
Displaying XML Documents with CSS
- A CSS style sheet for an XML document is just a list of its tags and associated styles
- The connection of an XML document and its style sheet is made through an xml-stylesheet processing instruction

  <?xml-stylesheet type = "text/css" href = "mydoc.css"?>

- --> SHOW planescss.xml and planes.css
7.9 Overview of XSLT
- A functional style programming language
- Basic syntax is XML
- There is some similarity to LISP and Scheme
- An XSLT processor takes an XML document as input
and produces output based on the specifications
of an XSLT document
8.9 XSLT Style Sheets
- Resources
  - http://www.w3.org/Style/XSL/
  - http://www.vbxml.com/xsl/tutorials/intro/default.asp
  - http://www.w3.org/TR/xsl/
- An XSLT processor merges an XML document into an XSLT style sheet
  - This merging is a template-driven process
- An XSLT style sheet can specify page layout, page orientation, writing direction, margins, page numbering, etc.
7.9 XSLT Style Sheets
- XSL is a family of specifications for transforming XML documents
  - XSLT specifies how to transform documents
  - XPath specifies how to select parts of a document and compute values
  - XSL-FO specifies a target XML language describing the printed page
- XSLT describes how to transform XML documents into other XML documents such as XHTML
  - XSLT can be used to transform to non-XML documents as well
Figure 7.5 XSLT processing
7.9 XSLT Structure
- An XSLT document contains templates
  - XPath is used to specify patterns of elements to which the templates should apply
  - The content of a template specifies how the matched element should be processed
- The XSLT processor will look for parts of the input document that match a template and apply the content of the template when a match is found
- Two models
  - Template-driven works with highly regular data
  - Data-driven works with more loosely structured data with a recursive structure (like XHTML documents)
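
The match-then-apply idea can be imitated in a few lines of Python. The sketch below is only an analogy, not an XSLT processor: templates are plain functions keyed by element name, and the fallback rule (default_template) is an invention for this example that does not reproduce XSLT's built-in rules. All function names are assumptions.

```python
import xml.etree.ElementTree as ET

DOC = """<plane>
  <year> 1977 </year>
  <make> Cessna </make>
  <model> Skyhawk </model>
  <color> Light blue and white </color>
</plane>"""


def plane_template(elem):
    """Emit literal markup, then process each child of the matched element."""
    items = "".join(apply_templates(child) for child in elem)
    return "<ul>" + items + "</ul>"


def default_template(elem):
    """Fallback for elements with no template of their own (an assumption)."""
    return "<li>%s: %s</li>" % (elem.tag, (elem.text or "").strip())


# Element name -> template, playing the role of match = "plane".
TEMPLATES = {"plane": plane_template}


def apply_templates(elem):
    """Find the template that matches this element and apply it."""
    template = TEMPLATES.get(elem.tag, default_template)
    return template(elem)


html = apply_templates(ET.fromstring(DOC))
print(html)
```

Because plane_template calls apply_templates on its children, the traversal follows the shape of the input, which is the idea behind the data-driven model.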
XSLT Style Sheets (continued)
- The processing instruction we used for connecting a CSS style sheet to an XML document is used to connect an XSLT style sheet to an XML document

  <?xml-stylesheet type = "text/xsl" href = "XSLT style sheet"?>

- An example

  <?xml version = "1.0"?>
  <!-- xslplane.xml -->
  <?xml-stylesheet type = "text/xsl" href = "xslplane.xsl" ?>
  <plane>
    <year> 1977 </year>
    <make> Cessna </make>
    <model> Skyhawk </model>
    <color> Light blue and white </color>
  </plane>
XSLT Style Sheets (continued)
- An XSLT style sheet is an XML document with a single element, stylesheet, which defines namespaces

  <xsl:stylesheet version = "1.0"
      xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"
      xmlns = "http://www.w3.org/1999/xhtml">

- A template that applies to the root of the XML document is written with the match value "/"

  <xsl:template match = "/">

- A template can match any element, just by naming it (in place of /)

  <xsl:template match = "plane">

- XSLT documents include two different kinds of elements: those with content and those for which the content will be merged from the XML document
- Elements with content often represent XHTML elements

  <span style = "font-size: 14"> Happy Easter! </span>
XML Transformations and Style Sheets (continued)
- XSLT elements that represent XHTML elements are simply copied to the merged document
- The XSLT value-of element
  - Has no content
  - Uses a select attribute to specify part of the XML data to be merged into the XSLT document

    <xsl:value-of select = "CAR/ENGINE" />

  - The value of select can be any branch of the document tree
  - --> SHOW xslplane.xml, xslplane.xsl
- The XSLT for-each element
  - Used when an XML document has a sequence of the same elements
  - --> SHOW xslplanes.xml
  - --> SHOW xslplanes.xsl
7.9 Producing Transformation Output
- Elements not belonging to XSLT and other text will be copied to the output when the containing template is applied
- The value-of tag causes the select attribute value to be evaluated and the result is put into the output
  - The value of an element is the text contained in it and in its sub-elements
  - The value of an attribute is its value
- Example: xslplane1.xsl transforms the xslplane.xml file into XHTML for display purposes
  - If the style sheet is in the same directory as the XML file, some browsers will pick up the transformation and apply it
  - This works with Firefox and Internet Explorer but not Opera
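
The rule that an element's value is the text in it and in its sub-elements can be seen with a short Python sketch. This is an analogy using the standard library's ElementTree rather than an XSLT processor; the helper name string_value is an assumption, and the sample document is the plane example with whitespace removed.

```python
import xml.etree.ElementTree as ET

DOC = "<plane><year>1977</year><make>Cessna</make><model>Skyhawk</model></plane>"


def string_value(element):
    """Concatenate all text inside the element, including text in sub-elements."""
    return "".join(element.itertext())


plane = ET.fromstring(DOC)
year_value = string_value(plane.find("year"))
plane_value = string_value(plane)

print(year_value)   # 1977
print(plane_value)  # 1977CessnaSkyhawk
```

Selecting plane itself therefore yields the text of all three children run together, which is why style sheets usually select the individual child elements.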
7.9 Processing Repeated Elements
- File xslplanes.xml contains data about multiple airplanes
- The style sheet xslplanes.xsl uses a for-each element to process each plane element in the source document
- A sort element could be included to sort output
  - The element

    <xsl:sort select = "year" data-type = "number" />

  - Specifies sorting by year
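
For comparison, the same loop-and-sort behavior can be written by hand in Python. This sketch is an analogy only, not XSLT. The sample data is a hypothetical stand-in for xslplanes.xml, whose contents are not shown in these slides, and the function name makes_sorted_by_year is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for xslplanes.xml; the real file is not shown here.
PLANES = """<planes>
  <plane><year> 1977 </year><make> Cessna </make></plane>
  <plane><year> 1960 </year><make> Piper </make></plane>
  <plane><year> 1975 </year><make> Beechcraft </make></plane>
</planes>"""


def makes_sorted_by_year(xml_text):
    """List the make of each plane, ordered numerically by year."""
    root = ET.fromstring(xml_text)
    # Like for-each: gather every plane child of the root.
    planes = root.findall("plane")
    # Like sort with data-type = "number": compare years as numbers, not text.
    planes.sort(key=lambda plane: float(plane.findtext("year")))
    # Like value-of select = "make": take the text of each make element.
    return [plane.findtext("make").strip() for plane in planes]


sorted_makes = makes_sorted_by_year(PLANES)
print(sorted_makes)  # ['Piper', 'Beechcraft', 'Cessna']
```

Comparing as numbers matters when values differ in length: as text, "900" sorts after "1977", while as numbers it sorts before.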
Overview
7.10 XML Processors
- XML processors provide tools in programming languages to read in XML documents, manipulate them, and write them out
7.10 Purposes of XML Processors
- Four purposes
  - Check the basic syntax of the input document
  - Replace entities
  - Insert default values specified by schemas or DTDs
  - If the parser is able and it is requested, validate the input document against the specified schemas or DTDs
- The basic structure of XML is simple and repetitive, so providing library support is reasonable
- Examples
  - Xerces-J from the Apache foundation provides library support for Java
  - Command line utilities are provided for checking well-formedness and validity
- Two different standards/models for processing
  - SAX
  - DOM
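
The first two purposes can be tried out with the XML parser in Python's standard library. The sketch below checks well-formedness and shows entity replacement; it does not validate against a schema or DTD, which the standard library parser used here does not do. The function name is_well_formed is an assumption.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = is_well_formed("<plane><make>Cessna</make></plane>")
bad = is_well_formed("<plane><make>Cessna</plane>")  # make is never closed

# Entity replacement: the parser hands back '&', not '&amp;'.
note_text = ET.fromstring("<note>fish &amp; chips</note>").text

print(good, bad)   # True False
print(note_text)   # fish & chips
```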
7.10 Parsing
- The process of reading in a document and analyzing its structure is called parsing
- The parser provides as output a structured view of the input document
7.10 The SAX Approach
- In the SAX approach, an XML document is read in serially
- As certain conditions, called events, are recognized, event handlers are called
- The program using this approach only sees part of the document at a time
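
A minimal sketch of the event-handler style, using the xml.sax module from Python's standard library. The class name MakeCollector and the sample document are assumptions made for this example. The handler never holds the whole document; it only reacts to start-tag, text, and end-tag events as they arrive.

```python
import xml.sax


class MakeCollector(xml.sax.ContentHandler):
    """Collect the text of every make element as the document streams past."""

    def __init__(self):
        super().__init__()
        self.makes = []
        self._in_make = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Event: an opening tag was recognized.
        if name == "make":
            self._in_make = True
            self._buffer = []

    def characters(self, content):
        # Event: some text was read; it may arrive in several pieces.
        if self._in_make:
            self._buffer.append(content)

    def endElement(self, name):
        # Event: a closing tag was recognized.
        if name == "make":
            self.makes.append("".join(self._buffer).strip())
            self._in_make = False


DOC = b"<planes><make> Cessna </make><make> Piper </make></planes>"
handler = MakeCollector()
xml.sax.parseString(DOC, handler)
print(handler.makes)  # ['Cessna', 'Piper']
```

The buffer is needed because a parser is free to deliver one element's text through more than one characters call.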
7.10 The DOM Approach
- In the DOM approach, the parser produces an in-memory representation of the input document
  - Because of the well-formedness rules of XML, the structure is a tree
- Advantages over SAX
  - Parts of the document can be accessed more than once
  - The document can be restructured
  - Access can be made to any part of the document at any time
- Processing is delayed until the entire document is checked for proper structure and, perhaps, validity
- One major disadvantage is that a very large document may not fit in memory entirely
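
A minimal sketch of the DOM style, using xml.dom.minidom from Python's standard library. The helper name make_names and the sample document are assumptions. The whole tree is in memory, so the same nodes can be read twice and the document can be restructured in between.

```python
from xml.dom import minidom

DOC = "<planes><make>Cessna</make><make>Piper</make><make>Beechcraft</make></planes>"

dom = minidom.parseString(DOC)
root = dom.documentElement


def make_names(node):
    """Read the text of every make element under the given node."""
    return [make.firstChild.data for make in node.getElementsByTagName("make")]


# First access: read the makes in document order.
before = make_names(root)

# Restructure: move the last make element to the front of the tree.
last = root.getElementsByTagName("make")[-1]
root.insertBefore(last, root.firstChild)

# Second access to the same part of the document, after the change.
after = make_names(root)
result_xml = root.toxml()

print(before)  # ['Cessna', 'Piper', 'Beechcraft']
print(after)   # ['Beechcraft', 'Cessna', 'Piper']
print(result_xml)
```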
7.11 Web Services
- Allow interoperation of software components on different systems written in different languages
- Servers that provide software services rather than documents
- Remote Procedure Call
  - DCOM and CORBA provide implementations
  - DCOM is Microsoft specific
  - CORBA is cross-platform
7.11 Web Service Protocols
- Three roles in web services
  - Service providers
  - Service requestors
  - Service registry
- The Web Services Description Language (WSDL) provides a standard way to describe services
- The Universal Description, Discovery and Integration (UDDI) service provides a standard way to provide information about services in response to a query
- SOAP is used to specify requests and responses
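
To give a feel for what a SOAP request looks like, the sketch below builds one with Python's standard library. The envelope namespace is the SOAP 1.1 one; the service namespace, the operation name getPlanesByYear, and its year parameter are all hypothetical and stand in for whatever a real service's WSDL would describe. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/planeService"         # hypothetical service namespace


def build_request(year):
    """Build a SOAP envelope whose body calls a hypothetical getPlanesByYear."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    # The operation and its parameter are made up for illustration.
    call = ET.SubElement(body, "{%s}getPlanesByYear" % SERVICE_NS)
    ET.SubElement(call, "{%s}year" % SERVICE_NS).text = str(year)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request(1977)
print(request_xml)
```

The request is itself an XML document, so a requestor and a provider written in different languages can both produce and read it with their own XML processors.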