Title: Experiences of Document Transformations with XSLT and DOM
1Experiences of Document Transformations with XSLT
and DOM
- Anne Honkaranta, Virpi Lyytikäinen, Pasi Tiitinen, University of Jyväskylä, Finland
- inSGML project
2Content
- Poem Publishers, Inc.
- Poems
- Publishing environment
- Transformations
- Transformation techniques
- Transformations in client-server environment
- Transformations in Poem Publishers, Inc.
- Challenges encountered
- Lessons learned
3Poem Publishers, Inc.
- Fictional company
- Publishes Finnish poems on WWW
- Poems are authored in XML format according to a DTD
- The company offers the poets an authoring environment if so desired
- The poems can form collections
4Poem.dtd
5Publishing environment
- Microsoft IIS server v. 5.0
- JScript, VBScript
- ASP 3.0
- DOM II
- Internet Explorer 5.5 or newer
- CSS Level 2
- MSXML 3.0
6Transformation
- Changing/converting a document's
- format
- structure/information schema
- content organization
- filtering the content
- all the above
- Conversion, filtering, and transformation are
sometimes used as synonyms
7Why do you need transformations?
- Authors need content-oriented DTD
- Different end-user devices
- When managing documents, we need to have them in an optimal format for processing
- --> three-step publication process
- authoring -- processing -- output
8Transformation techniques
9Transformations in client-server environment
(XSLT/DOM)
- Alternatives
- using a PI in the XML source document (c)
- (can be written to the source document on a web server)
- DOM interface and DOM objects for loading the source XML and XSLT (c/s)
- using the DOM interface with a scripting language (VBScript, JScript) or Java
10Transformation chain (an example)
Output: HTML/XHTML doc rendered by CSS
11Example: using PI in source XML
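A minimal sketch of the PI approach, written in Java (one of the languages the slides list for DOM scripting) rather than in the MSXML/ASP environment described earlier. The `xml-stylesheet` processing instruction is standard; the file name `poem.xsl` and the `poem`/`title` elements are hypothetical, since Poem.dtd is not reproduced here. A browser would apply the referenced stylesheet itself; the code only shows how the same PI can be read through DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

final class StylesheetPi {
    // Hypothetical poem document; element names and poem.xsl are assumptions.
    static final String SOURCE =
        "<?xml version=\"1.0\"?>\n"
      + "<?xml-stylesheet type=\"text/xsl\" href=\"poem.xsl\"?>\n"
      + "<poem><title>Kes\u00e4</title></poem>";

    /** Returns the data of the first xml-stylesheet PI in the prolog, or null. */
    static String stylesheetPiData(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // PIs in the prolog are children of the Document node itself.
            for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE
                        && "xml-stylesheet".equals(n.getNodeName())) {
                    return ((ProcessingInstruction) n).getData();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stylesheetPiData(SOURCE));
    }
}
```

Because the PI lives in the source document, it can also be written there on the server before the document is sent to the client.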
12Example: using DOM objects + XSLT
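A minimal sketch of the DOM-objects approach: both the source XML and the XSLT stylesheet are loaded as DOM documents, and the stylesheet is then applied to the source. It uses Java's JAXP API as a stand-in for the MSXML objects used in the project; the poem markup and the stylesheet are hypothetical examples.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DomXsltTransform {
    // Hypothetical poem; the real element names come from Poem.dtd.
    static final String POEM_XML =
        "<poem><title>Winter</title><line>Snow falls.</line></poem>";

    // Hypothetical XML-to-HTML stylesheet.
    static final String POEM_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/poem\">"
      + "<html><body><h1><xsl:value-of select=\"title\"/></h1>"
      + "<xsl:for-each select=\"line\"><p><xsl:value-of select=\".\"/></p></xsl:for-each>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Parses a string into a DOM document. */
    static Document load(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // A stylesheet passed as DOM must be parsed namespace-aware.
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    /** Applies the stylesheet DOM to the source DOM and returns the result text. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new DOMSource(load(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(load(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(POEM_XML, POEM_XSL));
    }
}
```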
13Example: using VBScript + DOM
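A minimal sketch of a script-driven transformation that uses only the DOM interface and no XSLT. The slides name VBScript; Java is used here instead, and the `title` and `line` element names are hypothetical. The code walks the source tree and writes the HTML by hand.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomOnlyTransform {
    /** Escapes the characters that are significant in HTML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds an HTML page from a poem by traversing its DOM tree. */
    static String toHtml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder html = new StringBuilder("<html><body>");
            NodeList titles = doc.getElementsByTagName("title");
            if (titles.getLength() > 0) {
                html.append("<h1>")
                    .append(escape(titles.item(0).getTextContent()))
                    .append("</h1>");
            }
            NodeList lines = doc.getElementsByTagName("line");
            for (int i = 0; i < lines.getLength(); i++) {
                html.append("<p>")
                    .append(escape(lines.item(i).getTextContent()))
                    .append("</p>");
            }
            return html.append("</body></html>").toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(
            "<poem><title>Winter</title><line>Snow &amp; ice</line></poem>"));
    }
}
```

Compared with a stylesheet, the output structure is fixed in program code, so every change to the layout means editing the script.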
14Transformation types tested in Poem Publishers,
Inc.
- XML-to-XML
- XML-to-HTML
- XML-to-XHTML
15Transformation needs tested in Poem Publishers,
Inc.
- Tasks tested
- combining multiple source documents into output view (poem + header/footer, poem list, poem metadata)
- combining multiple source documents into one file (making a poem collection)
- combining XSLT transformation documents for transformation needs (poem + footer)
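The second task above, combining several source documents into one file, can be sketched with DOM alone. This is an illustration in Java, not the project's implementation; the `collection` wrapper element and the poem markup are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class PoemCollection {
    /** Copies the root element of each poem into one new collection document. */
    static String combine(String... poems) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document collection = builder.newDocument();
            Element root = collection.createElement("collection");
            collection.appendChild(root);
            for (String poem : poems) {
                Document source = builder.parse(new InputSource(new StringReader(poem)));
                // Nodes belong to their owner document, so they must be imported.
                root.appendChild(
                    collection.importNode(source.getDocumentElement(), true));
            }
            // An identity transformation serializes the DOM tree to text.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(collection), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(combine(
            "<poem><title>A</title></poem>",
            "<poem><title>B</title></poem>"));
    }
}
```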
16Example: combining XSLT stylesheets
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns="http://www.w3.org/1999/REC-html401">
<xsl:import href="header.xsl"/>
<xsl:output method="html" encoding="ISO-8859-1" />

<?xml version="1.0" encoding="iso-8859-1"?>
<!--
Filename: header.xsl
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns="http://www.w3.org/1999/REC-html401">
<xsl:output method="html" encoding="ISO-8859-1" />
<xsl:template match="/" name="header">
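The two fragments show a main stylesheet importing `header.xsl`, which defines a template named `header`. A minimal, self-contained sketch of how such a combination behaves is given below in Java. The template bodies are hypothetical stand-ins, not the project's stylesheets, and the import is resolved from memory with a `URIResolver` only to keep the example in one file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ImportedHeader {
    static final String XSL_NS =
        "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"";

    // Stand-in for header.xsl: a named template other stylesheets can call.
    static final String HEADER_XSL =
        "<xsl:stylesheet version=\"1.0\" " + XSL_NS + ">"
      + "<xsl:template name=\"header\"><h1>Poem Publishers, Inc.</h1></xsl:template>"
      + "</xsl:stylesheet>";

    // Main stylesheet: xsl:import must come before any other top-level element.
    static final String MAIN_XSL =
        "<xsl:stylesheet version=\"1.0\" " + XSL_NS + ">"
      + "<xsl:import href=\"header.xsl\"/>"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/poem\">"
      + "<html><body><xsl:call-template name=\"header\"/>"
      + "<p><xsl:value-of select=\"title\"/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms a poem with the main stylesheet and its imported header. */
    static String transform(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Serve header.xsl from memory; returning null falls back to default lookup.
            factory.setURIResolver((href, base) ->
                "header.xsl".equals(href)
                    ? new StreamSource(new StringReader(HEADER_XSL), href)
                    : null);
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(MAIN_XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<poem><title>Winter</title></poem>"));
    }
}
```

Templates in the importing stylesheet take precedence over imported ones, so a shared header component can be reused without overriding the page-specific rules.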
17Challenges Encountered
- Problems with
- parsers and versions
- character encodings
- figures and links
- too many tools, scripting languages, and
programs
18Example: Character encodings and parser
INPUT DOC
- has some encoding
- has an encoding declaration
- problem: either of them is wrong
MSXML 3.0
- uses UTF-16
- detects output encoding from PI when appropriate load/save methods are used
- otherwise outputs UTF-16
OUTPUT DOC
- input doc encoding
- maybe character entities
- entities are changed to actual character representations when transformed
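The effect of a mismatch between the actual and the assumed encoding can be illustrated with a short Java sketch. It uses the JDK's serializer, not MSXML 3.0, so it shows the general problem rather than the parser behaviour described on the slide; the poem markup is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class OutputEncoding {
    // \u00e4 is the letter a-umlaut; the escape keeps this source file ASCII-only.
    static final String POEM = "<poem><title>Jyv\u00e4skyl\u00e4</title></poem>";

    /** Serializes the document with an identity transformation in the given encoding. */
    static byte[] serialize(String xml, String encoding) {
        try {
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            serializer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = serialize(POEM, "UTF-8");
        // Read with the encoding that was actually used: the text survives.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
        // Read with a different encoding: each non-ASCII letter is damaged.
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1));
        // The serializer writes the requested encoding into the XML declaration.
        System.out.println(new String(
            serialize(POEM, "ISO-8859-1"), StandardCharsets.ISO_8859_1));
    }
}
```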
19Possibilities
- you can use XSLT stylesheets as components and combine them
- a stylesheet can be seen as a reusable component on the server
- you can also chain transformations
- you can keep your data in content-oriented form and provide multiple output versions by using transformations
- problem: management of DTDs, transformation components and versions
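Chaining, as mentioned in the list above, can be sketched as two transformations where the output tree of the first is the input of the second. The example is in Java with hypothetical stylesheets: an XML-to-XML step that filters out `note` elements (an assumed element name), followed by an XML-to-HTML step.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TransformChain {
    static final String XSL_OPEN =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">";

    // Step 1 (XML-to-XML): copy everything except <note> elements.
    static final String FILTER_XSL = XSL_OPEN
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"note\"/>"
      + "</xsl:stylesheet>";

    // Step 2 (XML-to-HTML): one paragraph per remaining child element.
    static final String HTML_XSL = XSL_OPEN
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/poem\"><html><body>"
      + "<xsl:for-each select=\"*\"><p><xsl:value-of select=\".\"/></p></xsl:for-each>"
      + "</body></html></xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the filter step, then feeds its result tree to the HTML step. */
    static String run(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            DOMResult intermediate = new DOMResult();
            factory.newTransformer(new StreamSource(new StringReader(FILTER_XSL)))
                .transform(new StreamSource(new StringReader(xml)), intermediate);
            StringWriter out = new StringWriter();
            factory.newTransformer(new StreamSource(new StringReader(HTML_XSL)))
                .transform(new DOMSource(intermediate.getNode()), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(
            "<poem><line>Snow falls.</line><note>draft</note></poem>"));
    }
}
```

Keeping each step in its own stylesheet lets the filter be reused in front of other output formats, at the cost of more components to version and manage.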
20Lessons learned
- Use the same character encodings in source documents and transformation scripts
- Offer a content-oriented DTD for your authors: there is probably a need for transformations anyway
- Support level of CSS, XSLT and XML varies in browsers
- Tools are available for building XML publishing environments: allow extra time for dealing with possible problems
- Multiple skills and tools are needed in a publishing environment; XML is not enough!
21More information: inSGML project
http://haades.it.jyu.fi/inSGML/