Title: Extensible Stylesheet Language Transformations : XSLT
1. Lecture 15: Extensible Stylesheet Language Transformations (XSLT)
2. RECAP
Example (well-formed) XML document (Lec. 13)

<?xml version="1.0" encoding="UTF-8"?>
<patient nhs-no="7503557856">
  <name>
    <first>Joseph</first>
    <middle>Michael</middle>
    <last>Bloggs</last>
    <previous />
    <preferred>Joe</preferred>
  </name>
  <title>Mr</title>
  <address>
    <street>2 Gloucester Road</street>
    <street />
    <street />
    <city>Bristol</city>
    <county>Avon</county>
    <postcode>BS2 4QS</postcode>
  </address>
  <tel>
    <home>0117 9541054</home>
    <mobile>07710 234674</mobile>
  </tel>
  <email>joe.bloggs@email.com</email>
  <fax />
</patient>

Note the repeating street element.
3. RECAP
Example document with validation (Lec. 14)

patient.xsd

<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema elementFormDefault="qualified"
           attributeFormDefault="unqualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="patient">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="nameType" />
        <xs:element name="title" type="titleType" />
        <xs:element name="address" type="addressType" />
        <xs:element name="tel" type="telType" maxOccurs="2" />
        <xs:element name="email" type="emailType" minOccurs="0" />
        <xs:element name="fax" type="xs:string" minOccurs="0" />
      </xs:sequence>
      <xs:attribute name="nhs-no" type="xs:integer" use="required" />
    </xs:complexType>
  </xs:element>
  <xs:complexType name="nameType">
    <xs:sequence>
      <xs:element name="first" type="nameStringType" />
      <xs:element name="middle" type="nameStringType" />
      <xs:element name="last" type="nameStringType" />
      <xs:element name="previous" type="nameStringType" />
      <xs:element name="preferred" type="nameStringType" />
    </xs:sequence>
  </xs:complexType>
  . . .
</xs:schema>

patient.xml

<?xml version="1.0" encoding="UTF-8"?>
<patient nhs-no="7503557856"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="patient.xsd">
  <name>
    <first>Joseph</first>
    <middle>Michael</middle>
    <last>Bloggs</last>
    <previous />
    <preferred>Joe</preferred>
  </name>
  <title>Mr</title>
  <address>
    <street>2 Gloucester Road</street>
    <street />
    <street />
    <city>Bristol</city>
    <county>Avon</county>
    <postcode>BS2 4QS</postcode>
  </address>
  <tel>
    <home>0117 9541054</home>
    <mobile>07710 234674</mobile>
  </tel>
  <email>joe.bloggs@email.com</email>
  <fax />
</patient>

patient.xml is now an instance document in
the vocabulary defined in the schema patient.xsd
4. XSLT
XSLT (Extensible Stylesheet Language Transformations) is an XML application for specifying rules that transform one XML document into another document. It uses template rules in the stylesheet to match patterns in the input document; when a match is found, it writes the template from the rule to the output tree.
[Figure: Basic XSLT processing model]
5. XSLT Document Model (showing parser)
[Figure: XSLT document model, showing the parser and the XPath engine]
6. XSLT Parser Processing Model
Both the source document and the XSLT stylesheet are loaded into the processor's memory. How this happens depends on the implementation. One option is that both are loaded as DOM documents under the control of a program. Another option is that the stylesheet is referenced by a processing instruction in the source XML document; IE 5/6 or Netscape 6 can then load the stylesheet when the XML document is loaded.
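
The second option can be illustrated with a short Python sketch. It assumes the source document carries an xml-stylesheet processing instruction of the form browsers commonly accept; the exact instruction used in patient.xml is not shown in these slides, so the SOURCE string and the helper name stylesheet_href are illustrative assumptions only.

```python
import re
from xml.dom import minidom

# Assumed source document: the xml-stylesheet line is an illustrative guess at
# how patient.xml might point to patient.xslt.
SOURCE = (
    '<?xml-stylesheet type="text/xsl" href="patient.xslt"?>'
    '<patient nhs-no="7503557856"><title>Mr</title></patient>'
)


def stylesheet_href(xml_text):
    """Return the href of the first xml-stylesheet instruction, or None."""
    doc = minidom.parseString(xml_text)
    # Processing instructions in the prolog are children of the document node.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data is free text that looks like attributes.
            pseudo_attrs = dict(re.findall(r'([\w-]+)="([^"]*)"', node.data))
            return pseudo_attrs.get("href")
    return None


print(stylesheet_href(SOURCE))
```

A processor that supports this option would read the href in much the same way and then load the stylesheet it names before transforming the document.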
7. XSLT is a functional 4GL programming language
A function maps one set of things onto another set of things using one or more rules.

Simple function: y = x²

x: 1 2 3 4 5 ...
y: 1 4 9 16 25 ...

Simple XSLT function, or template rule:

When this pattern is found in the input document:

<name>Reuben</name>

the rule

<xsl:template match="//name">
  <p>Hello<xsl:text> </xsl:text><xsl:value-of select="." /></p>
</xsl:template>

writes this to the OUTPUT:

<p>Hello Reuben</p>
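
The analogy between a function and a template rule can be sketched in Python. This is not XSLT; it only mirrors the idea that a rule maps an input node to an output fragment. The function name name_rule is a hypothetical stand-in for the template rule on this slide.

```python
import xml.etree.ElementTree as ET


def square(x):
    """The simple function from the slide: y = x squared."""
    return x * x


def name_rule(name_element):
    """Hypothetical stand-in for the template rule that matches name."""
    p = ET.Element("p")
    p.text = "Hello " + name_element.text
    return p


# Map each x onto its y.
ys = [square(x) for x in [1, 2, 3, 4, 5]]

# Map a matched input node onto an output fragment.
source = ET.fromstring("<name>Reuben</name>")
output = ET.tostring(name_rule(source), encoding="unicode")

print(ys)
print(output)
```

In both cases the rule has no side effects: the same input always yields the same output, which is what makes the functional description fit.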
8. XSLT uses XPath to find nodes in an XML document

Example 1
XSLT rule using an XPath expression: match the <first> element.

<name>
  <first>Joseph</first>
  <middle>Michael</middle>
  <last>Bloggs</last>
  <previous />
  <preferred>Joe</preferred>
</name>

<xsl:template match="//patient/name/first">
  ... do something with content ...
</xsl:template>

Example 2
XSLT rule using XPath expressions (2): get the value of the attribute named nhs-no in the <patient> element.

<patient nhs-no="7503557856"> </patient>

<xsl:template match="//patient">
  <xsl:value-of select="./@nhs-no" />
</xsl:template>
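
The same two lookups can be tried outside XSLT. The sketch below uses Python's xml.etree.ElementTree, which supports only a subset of XPath: element paths work, but attribute values are read with get() rather than with an @ step. The PATIENT string is a trimmed copy of the example document.

```python
import xml.etree.ElementTree as ET

PATIENT = (
    '<patient nhs-no="7503557856">'
    "<name><first>Joseph</first><middle>Michael</middle>"
    "<last>Bloggs</last><previous /><preferred>Joe</preferred></name>"
    "</patient>"
)

root = ET.fromstring(PATIENT)  # root is the patient element

# Example 1: locate the first element below patient/name.
first_name = root.find("./name/first").text

# Example 2: read the nhs-no attribute of the patient element.
nhs_no = root.get("nhs-no")

print(first_name, nhs_no)
```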
9. The tree view of the example XML document

patient
  @nhs-no: 7503557856
  name
    first: Joseph
    middle: Michael
    last: Bloggs
    previous
    preferred: Joe
  title: Mr
  address
    street1: 2 Gloucester Rd
    street2
    street3
    city: Bristol
    county: Avon
    postcode: BS2 4QS
  tel
    home: 01179541054
    mobile: 07710234674
  fax

KEY: plain names are elements, names prefixed with @ are attributes, and values after a colon are content.

XPath is simply a way of finding specific nodes in a document tree, like files in a file hierarchy, e.g. c:\teaching\myfiles\thisdoc.doc.
10. XPath axes (node sets)
XPath has thirteen axes: child, parent, descendant, ancestor, descendant-or-self, ancestor-or-self, following-sibling, preceding-sibling, following, preceding, attribute, namespace, self.
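
A few of these axes can be demonstrated with a Python sketch. ElementTree implements only part of XPath, so the child, descendant and parent axes use its path syntax, while the attribute and sibling axes are emulated by hand; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<patient nhs-no="7503557856">'
    "<name><first>Joseph</first><middle>Michael</middle>"
    "<last>Bloggs</last></name>"
    "<title>Mr</title>"
    "</patient>"
)

# child axis: elements directly below patient
children = [e.tag for e in doc.findall("./*")]

# descendant axis: every element below patient, in document order
descendants = [e.tag for e in doc.findall(".//*")]

# parent axis: step up from first to the element that contains it
parent_of_first = doc.find(".//first/..").tag

# attribute axis (emulated): ElementTree exposes attributes through get()
nhs_no = doc.get("nhs-no")

# sibling axes (emulated): slice the parent's child list around middle
siblings = list(doc.find("name"))
position = [e.tag for e in siblings].index("middle")
preceding_siblings = [e.tag for e in siblings[:position]]
following_siblings = [e.tag for e in siblings[position + 1:]]

print(children, descendants, parent_of_first)
print(nhs_no, preceding_siblings, following_siblings)
```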
11. XSLT push and pull models of document processing
Push model: the source document controls the structure. For example, a cascading style sheet (CSS) applies a style but cannot change the structure of the input document.
Pull model: the stylesheet controls the structure and the source document acts as the data source.
XSLT can apply both the push and the pull model. You can write an XSLT stylesheet to:
- change the order of elements
- do calculations based on the number of elements (using XPath)
- branch depending on an element value
- generate other stylesheets
- write Java or C code, or source code for any other language
- use SVG to generate graphics
- apply formatting object (FO) constructs that tell an FO processor to lay out pages for printing or to write PDF
- do almost everything else supported by other programming languages
Hence it has all the constructs needed to apply our fundamental Jackson concepts of sequence, selection and iteration.
12. XSLT parser processing APIs and the push/pull models

Tree based: these parsers read the source and style documents and build a tree in memory of all nodes. Often some indexing mechanism will be applied. The trees can then be processed in any order, since all nodes are available once the tree has been built. Examples include the Document Object Model (DOM) based APIs, such as the Apache Xerces DOM API. The disadvantage of tree based API models is that, because they often load the entire XML document into memory, they can require an awful lot of memory, sometimes a hundred times as much as the document itself. This makes the model unwieldy for very large documents.

Push model (or producer/consumer model): these parsers control the pace of the application by parsing the producer (the XML source document) and informing the consumer (the application program) when certain events occur. The classic example is the SAX (Simple API for XML) API. The best known SAX implementation is the Apache Xerces SAX API. The advantage is that the whole document need not exist in memory all at once. But jumping to various places in the document is hard and must be handled by the programmer.

Pull model: the API requests events from the producer (the XML source document) rather than waiting for these events to occur. A notable pull model parser is found in the XmlReader class in the .NET Framework. Typically a loop is created that continually reads from the XML document until the end, acting on items as they are seen.

Cursor model: the newest class of XML parsing API. The cursor acts as a lens on the source document but, unlike the push and pull models, the cursor can jump to anywhere in the document. Thus it has all the advantages of the tree based model but without the massive memory overhead. The ObjectXpathNavigator class in the .NET Framework implements a cursor based API.
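
The first three models can be compared side by side in Python. The slides name Xerces and .NET; the sketch below substitutes Python's standard library modules (xml.dom.minidom for the tree model, xml.sax for the push model, xml.dom.pulldom for the pull model), and the function names are illustrative. Each function collects the non-empty street values from the same fragment. The cursor model is not shown.

```python
import xml.sax
from xml.dom import minidom, pulldom

ADDRESS = (
    "<address><street>2 Gloucester Road</street><street/><street/>"
    "<city>Bristol</city></address>"
)


def text_of(node):
    """Concatenate the text children of a DOM element."""
    return "".join(c.data for c in node.childNodes
                   if c.nodeType == c.TEXT_NODE)


def streets_tree(data):
    """Tree model: build the whole document, then visit nodes freely."""
    doc = minidom.parseString(data)
    texts = [text_of(n) for n in doc.getElementsByTagName("street")]
    return [t for t in texts if t]


class StreetHandler(xml.sax.ContentHandler):
    """Push model: the parser calls these methods as events occur."""

    def __init__(self):
        super().__init__()
        self.streets = []
        self._inside = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "street":
            self._inside = True
            self._buffer = []

    def characters(self, content):
        if self._inside:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "street":
            text = "".join(self._buffer)
            if text:
                self.streets.append(text)
            self._inside = False


def streets_push(data):
    handler = StreetHandler()
    xml.sax.parseString(data, handler)
    return handler.streets


def streets_pull(data):
    """Pull model: the application loop asks for the next event."""
    streets = []
    events = pulldom.parseString(data)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "street":
            events.expandNode(node)  # read only this element's subtree
            text = text_of(node)
            if text:
                streets.append(text)
    return streets


print(streets_tree(ADDRESS), streets_push(ADDRESS), streets_pull(ADDRESS))
```

The push version needs its own state (the buffer and the flag) because the parser decides when each callback fires, whereas the pull version keeps control of the loop and expands only the elements it cares about.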
13. Example XSLT stylesheet patient.xslt (1)

<?xml version="1.0" encoding="UTF-8"?>
<patient nhs-no="7503557856">
  <name>
    <first>Joseph</first>
    <middle>Michael</middle>
    <last>Bloggs</last>
    <previous />
    <preferred>Joe</preferred>
  </name>
  <title>Mr</title>
  <address>
    <street>2 Gloucester Road</street>
    <street />
    <street />
    <city>Bristol</city>
    <county>Avon</county>
    <postcode>BS2 4QS</postcode>
  </address>
  <tel>
    <home>0117 9541054</home>
    <mobile>07710 234674</mobile>
  </tel>
  <email>joe.bloggs@email.com</email>
  <fax />
</patient>

The stylesheet patient.xslt is used to generate HTML from this document.
14. Example XSLT stylesheet (2)
Minimum stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>

The XSLT parser (MSXML) that is built into Internet Explorer applies the above stylesheet.
15. XSLT stylesheet fragment from patient.xslt (3)

fragment 1

<xsl:template match="//patient">
  <html>
    <head><title>Patient Record XSL Transformation Example</title></head>
    <body>
      <h3>Patient Record</h3>
      <table border="1" cellpadding="4" cellspacing="0" bordercolor="cccccc" width="400">
        <tr>
          <td width="100">NHS Number</td>
          <td><xsl:value-of select="./@nhs-no" /></td>
        </tr>
        <xsl:apply-templates />
      </table>
    </body>
  </html>
</xsl:template>

What the fragment does, in order:
- start the template
- output a table
- output 1 row with two columns
- write "NHS Number" in the first column
- write the value of the attribute in the second column
- apply all other templates
- end the template

output

<html>
  <head><title>Patient Record XSL Transformation Example</title></head>
  <body>
    <h3>Patient Record</h3>
    <table border="1" cellpadding="4" cellspacing="0" bordercolor="cccccc" width="400">
      <tr><td width="100">NHS Number</td><td>7503557856</td></tr>
      RESULT FROM APPLYING ALL OTHER TEMPLATES
    </table>
  </body>
</html>
16. XSLT stylesheet fragment from patient.xslt (4)

fragment 2: make a selection

<xsl:template match="//fax">
  <tr>
    <td align="left" valign="top">Fax</td>
    <td>
      <xsl:choose>
        <xsl:when test="not(node())">
          <xsl:text>-</xsl:text>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="." />
        </xsl:otherwise>
      </xsl:choose>
    </td>
  </tr>
</xsl:template>

Test if the context node (self) is empty. If it is, write "-"; otherwise output the content string of the context node, represented by ".".
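
The selection logic of fragment 2 can be mirrored in Python as a rough sketch. It is an approximation: ElementTree discards comments and processing instructions by default, so "no text and no child elements" stands in for not(node()). The function name fax_cell and the non-empty fax number are made up for illustration.

```python
import xml.etree.ElementTree as ET


def fax_cell(fax):
    """Return "-" for an empty fax element, otherwise its string value."""
    # Approximates not(node()): no text content and no child elements.
    is_empty = fax.text is None and len(fax) == 0
    if is_empty:
        return "-"
    # Approximates value-of select=".": all text inside the element.
    return "".join(fax.itertext())


empty_result = fax_cell(ET.fromstring("<fax />"))
# Hypothetical number, used only to exercise the otherwise branch.
filled_result = fax_cell(ET.fromstring("<fax>0117 0000000</fax>"))

print(empty_result, filled_result)
```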

fragment 3: output an HTML hyperlink

<xsl:template match="//email">
  <tr>
    <td align="left" valign="top">Email</td>
    <td>
      <a>
        <xsl:attribute name="href">mailto:<xsl:value-of select="." /></xsl:attribute>
        <xsl:value-of select="." />
      </a>
    </td>
  </tr>
</xsl:template>

output

<a href="mailto:joe.bloggs@email.com">joe.bloggs@email.com</a>
17. Example XSLT stylesheet from patient.xslt (5)

fragment 4: do a for-each loop (iteration)

<xsl:for-each select="//street">
  <xsl:if test="node()">
    <xsl:value-of select="." /><br/>
  </xsl:if>
</xsl:for-each>

Cycle through all the street nodes and, if a node is not empty, output its content.
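
The iteration in fragment 4 has a direct Python analogue, sketched below under the same caveat as before: an element counts as non-empty here when it contains text, which approximates test="node()". The function name street_lines is illustrative.

```python
import xml.etree.ElementTree as ET

ADDRESS = ET.fromstring(
    "<address><street>2 Gloucester Road</street><street /><street />"
    "<city>Bristol</city></address>"
)


def street_lines(root):
    """Emit each non-empty street followed by a line break tag."""
    lines = []
    # iter("street") visits every street element in document order,
    # much like select="//street".
    for street in root.iter("street"):
        text = "".join(street.itertext())
        if text:  # skip the empty street elements
            lines.append(text + "<br/>")
    return "".join(lines)


result = street_lines(ADDRESS)
print(result)
```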
The full version of the patient.xslt file can be found with the other example files. patient.xml now points to it (notice how Internet Explorer applies the stylesheet).