Inside an XSLT Processor

1
Inside an XSLT Processor
  • Michael Kay, ICL
  • 19 May 2000

2
About me
  • ICL Fellow, systems architect
  • Database background
  • Developer of SAXON
  • Author of XSLT Programmer's Reference, published
    by Wrox Press
  • Recently joined XSL WG as invited expert

3
About this talk
  • The XSLT Processing Model
  • Structure of an XSLT Processor
  • Performance: current limitations and possible
    ways forward
  • Ideas on future development of the language

4
The XSLT Processing Model: first approximation
[Diagram: a Stylesheet and a Source Document feed the Transformation Process, which produces a Result Document.]

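As a rough illustration of this first approximation, the sketch below uses the standard Java JAXP API (javax.xml.transform), not the internals of any particular processor. The stylesheet, the greeting element, and the class name FirstApproximation are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FirstApproximation {
    // Hypothetical stylesheet: rewrites a <greeting> element as a <p> element.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='greeting'><p><xsl:value-of select='.'/></p></xsl:template>"
      + "</xsl:stylesheet>";

    // Stylesheet + source document in, result document out.
    static String transform(String stylesheet, String source) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter result = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(source)),
                              new StreamResult(result));
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints <p>Hello</p>
        System.out.println(transform(STYLESHEET, "<greeting>Hello</greeting>"));
    }
}
```
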
5
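The XSLT Processing Model: in more detail
[Diagram: Parsing turns the Stylesheet into a Stylesheet Tree and the Source Document into a Source Tree. The Transformation Process uses both trees to build a Result Tree. Serialization turns the Result Tree into the Result Document.]
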
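The separate stages can be made visible with JAXP by building the source tree as a DOM, transforming into a DOM result tree, and serializing as a final step. This is a sketch only: JAXP hides the stylesheet tree behind a Templates object, and the list stylesheet and the class name DetailedModel are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DetailedModel {
    // Hypothetical stylesheet: turns <list><item/>...</list> into <ul><li/>...</ul>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/list'><ul><xsl:apply-templates select='item'/></ul></xsl:template>"
      + "<xsl:template match='item'><li><xsl:value-of select='.'/></li></xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String stylesheet, String source) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();

        // Parsing (stylesheet): the stylesheet text becomes a compiled Templates object.
        Templates compiled = factory.newTemplates(new StreamSource(new StringReader(stylesheet)));

        // Parsing (source): the source document becomes a source tree.
        DocumentBuilderFactory builders = DocumentBuilderFactory.newInstance();
        builders.setNamespaceAware(true);
        Document sourceTree = builders.newDocumentBuilder()
            .parse(new InputSource(new StringReader(source)));

        // Transformation: source tree in, result tree out. Nothing is serialized yet.
        DOMResult resultTree = new DOMResult();
        compiled.newTransformer().transform(new DOMSource(sourceTree), resultTree);

        // Serialization: an identity transformer writes the result tree as text.
        Transformer serializer = factory.newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter resultDocument = new StringWriter();
        serializer.transform(new DOMSource(resultTree.getNode()), new StreamResult(resultDocument));
        return resultDocument.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(STYLESHEET, "<list><item>a</item><item>b</item></list>"));
    }
}
```
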
6
An XSLT Template Rule
<xsl:template match="appendix/para1">
  <h4><xsl:number level="single"/>
    <xsl:value-of select="@title"/></h4>
  <p><xsl:apply-templates/></p>
</xsl:template>
Labels: Pattern (the match attribute), Result Element (h4 and p), XPath Expression (the select attribute), Instruction (xsl:number, xsl:value-of, xsl:apply-templates)

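To see the rule fire, the sketch below wraps it in a complete stylesheet and applies it with JAXP. The sample source document (an appendix holding one para1 element with a title attribute) is invented for illustration, as is the class name TemplateRuleDemo.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplateRuleDemo {
    // The template rule from the slide, wrapped in a minimal stylesheet.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='appendix/para1'>"
      + "<h4><xsl:number level='single'/>"
      + "<xsl:value-of select='@title'/></h4>"
      + "<p><xsl:apply-templates/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical source document, chosen so that the pattern matches once.
    static final String SOURCE =
        "<appendix><para1 title='Scope'>Some text</para1></appendix>";

    static String run() throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(SOURCE)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The h4 holds the number produced by xsl:number followed by the title;
        // the p holds whatever the built-in rules emit for the element's children.
        System.out.println(run());
    }
}
```
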
7
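Architecture of an XSLT processor
[Diagram: the Stylesheet passes through an XML Parser and a Tree Builder to the XSLT compiler and XPath compiler, which produce the Compiled Stylesheet. The Source passes through an XML Parser and a Tree Builder to form the Source Tree. The XSLT interpreter and XPath interpreter apply the Compiled Stylesheet to the Source Tree and feed the Output Manager, which writes the Result through an XML serializer, HTML serializer, or Text serializer.]
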
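The choice of serializer can be illustrated with JAXP, where an identity transformer plays the role of the output stage and the method output property selects XML, HTML, or text. This sketch assumes nothing about how a given processor structures its output manager; the class name Serializers and the sample markup are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class Serializers {
    // Serializes the same content with the requested output method:
    // "xml", "html", or "text".
    static String serialize(String xml, String method) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.METHOD, method);
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        identity.setOutputProperty(OutputKeys.INDENT, "no");
        StringWriter out = new StringWriter();
        identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String doc = "<p>one<br/>two</p>";
        // xml keeps the empty-element syntax, html writes the empty br element
        // in HTML style, and text keeps only the character data.
        for (String method : new String[] {"xml", "html", "text"}) {
            System.out.println(method + ": " + serialize(doc, method));
        }
    }
}
```
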
8
At compile time
  • Parse and validate the stylesheet
  • Parse and validate all XPath expressions and
    attribute value templates
  • Build rule base for matching patterns
  • Resolve references to named variables, functions,
    and templates
  • Flatten the import tree
  • Optimize XPath expressions
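
One visible consequence of compile-time validation is that a malformed XPath expression is rejected before any source document is read. The sketch below shows this with JAXP, where newTemplates performs the compilation; the class name CompileTime and both stylesheets are hypothetical.

```java
import java.io.StringReader;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

class CompileTime {
    static final String HEADER =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

    // A stylesheet whose XPath expression is well formed.
    static final String GOOD = HEADER
      + "<xsl:template match='/'><xsl:value-of select='doc/title'/></xsl:template>"
      + "</xsl:stylesheet>";

    // The same stylesheet with an unclosed predicate in the XPath expression.
    static final String BAD = HEADER
      + "<xsl:template match='/'><xsl:value-of select='doc/title['/></xsl:template>"
      + "</xsl:stylesheet>";

    // Returns true if the stylesheet compiles. No source document is involved.
    static boolean compiles(String stylesheet) {
        try {
            TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(stylesheet)));
            return true;
        } catch (TransformerConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("good compiles: " + compiles(GOOD));
        System.out.println("bad compiles:  " + compiles(BAD));
    }
}
```

The processor may also print its own diagnostic for the bad stylesheet to standard error.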

9
Where does the time go?
[Chart: share of processing time spent in Compile Stylesheet, Build Source Tree, Process Templates, and Serialize Output.]

10
Is Performance a Problem?
  • Client side: usually not
    • XSLT processing is generally faster than
      download speed
  • Server side: sometimes
    • CPU usage when handling very high throughput
    • Memory problems when handling very large
      documents
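
One common way to reduce server-side CPU cost under high throughput is to compile each stylesheet once and reuse the compiled form across requests. The sketch below does this with JAXP, whose Templates objects are specified to be safe for concurrent use while each Transformer is used by one thread at a time. The class StylesheetCache, the choice of the stylesheet text as cache key, and the compilation counter are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StylesheetCache {
    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();
    // Counts real compilations, so the effect of the cache can be observed.
    final AtomicInteger compilations = new AtomicInteger();

    // Returns the compiled stylesheet, compiling it only on first use.
    Templates get(String stylesheetText) {
        return cache.computeIfAbsent(stylesheetText, text -> {
            try {
                compilations.incrementAndGet();
                // The factory is not guaranteed thread-safe, so guard its use.
                synchronized (factory) {
                    return factory.newTemplates(new StreamSource(new StringReader(text)));
                }
            } catch (TransformerConfigurationException e) {
                throw new IllegalArgumentException("Stylesheet does not compile", e);
            }
        });
    }

    // Each request gets its own cheap Transformer from the shared Templates.
    String transform(String stylesheetText, String source) throws TransformerException {
        Transformer transformer = get(stylesheetText).newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(source)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String stylesheet =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
          + "</xsl:stylesheet>";
        StylesheetCache cache = new StylesheetCache();
        System.out.println(cache.transform(stylesheet, "<a><item/><item/></a>"));
        System.out.println(cache.transform(stylesheet, "<a><item/></a>"));
        System.out.println("compilations: " + cache.compilations.get());
    }
}
```
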

11
Some performance tips
  • Keep documents small: split them first
  • Process once, at publishing time, or use caching
  • Do several simple transforms in series
  • Avoid complex patterns in template rules
  • Use keys
  • Use external functions
  • Avoid "//item"

12
Performance progress
[Chart: performance milestones marked at 20 sec/Mb, 5 sec/Mb, and 1 sec/Mb, with a "Today" marker.]
  • Simple optimization: stylesheet compilation, Java
    code optimization, lazy evaluation, simple XPath
    optimization, tail recursion
  • Advanced optimization: incremental parsing,
    pipelining, use of schema, pattern matching, full
    XPath optimization, compile to bytecodes

13
Interesting research areas
  • Database integration: transforming a document
    without loading it into memory
  • Applying regular expression theory
  • Execution as a sequence of serial passes
  • Using schema knowledge at compile time
  • Eager node numbering

14
Potential language features
  • Serial transformation language?
  • Multi-pass stylesheets
  • Higher-level "relational" constructs: grouping,
    joins, logical quantifiers
  • Richer data types
  • Assignment statement ????

15
Summary
  • XSLT language is now stable
  • XSLT processor technology is starting to be well
    understood
  • First crop of products are capable of significant
    performance
  • Now the research needs to start on the next phase
    of optimization techniques