Using XQuery and XSLT on NonXML Data - PowerPoint PPT Presentation

1
Using XQuery and XSLT on Non-XML Data
  • XML 2007, 14:45 on 04 Dec 2007
  • Tony Lavinio (alavinio(at)datadirect.com)

2
Tinkertoy, TOGL, LEGO
  • Everything in my childhood always connected
    together nicely.
  • Then I discovered life was not filled with
    interchangeable parts.
  • Let's do something about it.

4
TSaxon: Replaced XML with HTML Parser
java -jar saxon.jar -H html-doc style-doc
  • How simple is that?
  • Just replace the parser!
  • Can we generalize that mechanism?
  • It would be like a universal adapter

5
GUI Tools Demo
  • Open CSV file with notepad.exe
  • Open CSV file in GUI tool as XML
  • Open EDI file with notepad.exe
  • Open EDI file in GUI tool as XML

6
How does it work?

7
The URIResolver Interface

8
How We're Going To Do It
  • Typically, catalogs allow reaching otherwise
    unreachable resources.
  • In this case, we are using them to reach normally
    reachable resources, but also to transform them
    mid-flight.
  • Non-XML → Converter → XML → XQuery/XSLT
  • or
  • XQuery/XSLT → XML → Converter → Non-XML

9
Why We're Going To Do It
  • We Must Transform Data
  • XSLT and XQuery are excellent tools for
    transforming. Use the proper tool for the job.
  • But they work with XML, and much of the world
    isn't XML.
  • Really, they work with the XML data model, and
    XML is just a convenient representation, right?
    So they should be able to work with anything
    XML-ish: anything tree-shaped, or even square.

10
Use the Source, Luke
  • Source
  • StreamSource: simplest to code, just write text.
  • DOMSource: nothing good to say about DOM.
  • SAXSource: offers a push event interface.
  • StAXSource: offers a pull interface, harder to
    implement.
  • (StAX and possibly SAX can provide string
    pooling, which can also help performance.)
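
As a rough illustration of the list above, the sketch below feeds the same document to a JAXP identity transformer once as a StreamSource and once as a SAXSource. The class and method names (SketchSourceKinds, serialize, and so on) are made up for this example and are not part of the talk's code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper class, for illustration only.
class SketchSourceKinds {
    // Copies any Source to a string using the JAXP identity transformer.
    static String serialize(Source source) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // StreamSource: the processor parses the characters itself.
    static Source asStreamSource(String xml) {
        return new StreamSource(new StringReader(xml));
    }

    // SAXSource: an XMLReader pushes events at the processor.
    static Source asSaxSource(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            return new SAXSource(reader, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Either Source can be handed to the same transformer, which is what lets a converter hide behind whichever flavor is easiest to write.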

11
Comma-Separated Value file sample
(What is this fascination we have with lists of
books in our demos?) (And, no, Walden Two is not
the sequel to Walden, despite what modern
movie-goers might think.)
12
CSV Demo from Command Line
  • java -cp bin;saxon9.jar net.sf.saxon.Query
    -r com.ddtek.xml2007.CSVResolver
    -s x-csv:file:///c:/XML_2007/books.txt -u
    "{<html><body>{.}</body></html>}"
  • java -cp bin;saxon9.jar net.sf.saxon.Transform
    -r com.ddtek.xml2007.CSVResolver
    -s x-csv:file:///c:/XML_2007/books.txt -u table.xsl

13
table.xsl
<?xml version="1.0" encoding="ASCII"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="html" encoding="ASCII"/>
  <xsl:template match="/">
    <html>
      <body>
        <xsl:copy-of select="."/>
      </body>
    </html>
  </xsl:template>
</xsl:transform>

14
EDI X12 file sample
  • ISA0000ZZISACUST9
    0892541100600706071458U00401820
    0P>'GSFAGSCUST95137624388200706071458820X
    004010'ST9970001'AK1AG38'AK9A111'SE4000
    1'GE1820'IEA1820'

15
EDI Demo from Command Line
  • java -cp bin;saxon9.jar net.sf.saxon.Query
    -r com.ddtek.xml2007.MultiResolver
    -s x-edi:file:///c:/XML_2007/997.x12 -u
    "{for $i in (/X12/GS/GS08, <x>-</x>, /X12/ST/ST01) return $i/text()}"
    !omit-xml-declaration=yes
  • java -cp bin;saxon9.jar net.sf.saxon.Transform
    -r com.ddtek.xml2007.MultiResolver
    -s x-edi:file:///c:/XML_2007/997.x12 -u edi.xsl

16
edi.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="X12/GS/GS08"/>
    <xsl:text>-</xsl:text>
    <xsl:value-of select="X12/ST/ST01"/>
  </xsl:template>
</xsl:stylesheet>

17
How the CSV Resolver Works
  • Look for a URI with the x-csv scheme
  • If not found, return null, which means use the
    default URI handling
  • If found, strip off the x-csv prefix and take the
    remainder as a URI
  • Instead of just returning a stream containing
    that resource, build a new stream that reads
    from it, transforms it, and return that.
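
The steps above might look roughly like the following sketch. It is not the CSVResolver from the attached code; the class name SketchCsvResolver and the pluggable converter function (which stands in for whatever CSV-to-XML reader you have) are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical resolver, for illustration only.
class SketchCsvResolver implements URIResolver {
    static final String SCHEME = "x-csv:";

    // Assumed hook: wraps a reader of raw CSV in a reader of XML text.
    private final Function<Reader, Reader> converter;

    SketchCsvResolver(Function<Reader, Reader> converter) {
        this.converter = converter;
    }

    // Returns the URI that follows "x-csv:", or null if the scheme is absent.
    static String stripScheme(String href) {
        return href != null && href.startsWith(SCHEME)
                ? href.substring(SCHEME.length())
                : null;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        String target = stripScheme(href);
        if (target == null) {
            return null; // not ours: let the processor use default URI handling
        }
        try {
            // Assumes the remainder is an absolute URL and the file is UTF-8.
            Reader raw = new InputStreamReader(
                    URI.create(target).toURL().openStream(), StandardCharsets.UTF_8);
            // Hand back the converting stream, not the raw one.
            return new StreamSource(converter.apply(raw), href);
        } catch (IOException | IllegalArgumentException e) {
            throw new TransformerException("Cannot open " + target, e);
        }
    }
}
```

In plain JAXP, a resolver like this can be installed with TransformerFactory.setURIResolver(...) or Transformer.setURIResolver(...).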

18
How the Multi Resolver Works
  • Just like the CSV Resolver,
  • except that it looks for and dispatches on
    multiple schemes:
  • x-csv for comma-separated-value files
  • x-edi for EDI X12 files
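
One way to sketch that dispatch is a map from scheme name to converter. Again, this is not the MultiResolver from the attached code; SketchMultiResolver, register, and converterFor are names invented for the example.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical multi-scheme resolver, for illustration only.
class SketchMultiResolver implements URIResolver {
    // Scheme name (without the colon) -> converter from raw text to XML text.
    private final Map<String, Function<Reader, Reader>> converters = new HashMap<>();

    SketchMultiResolver register(String scheme, Function<Reader, Reader> converter) {
        converters.put(scheme, converter);
        return this;
    }

    // Returns the converter registered for the href's leading scheme, or null.
    Function<Reader, Reader> converterFor(String href) {
        int colon = href == null ? -1 : href.indexOf(':');
        return colon < 0 ? null : converters.get(href.substring(0, colon));
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        Function<Reader, Reader> converter = converterFor(href);
        if (converter == null) {
            return null; // unknown scheme: use default URI handling
        }
        String target = href.substring(href.indexOf(':') + 1);
        try {
            // Assumes the remainder is an absolute URL and the file is UTF-8.
            Reader raw = new InputStreamReader(
                    URI.create(target).toURL().openStream(), StandardCharsets.UTF_8);
            return new StreamSource(converter.apply(raw), href);
        } catch (IOException | IllegalArgumentException e) {
            throw new TransformerException("Cannot open " + target, e);
        }
    }
}
```

Adding another format then only means registering one more scheme, for example register("x-edi", ...) next to register("x-csv", ...).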

19
A StreamSource Converter
  • Implements java.io.InputStream or java.io.Reader
    (in Java terms, subclasses one of them)
  • When the caller calls read() or read(...), pull
    from the underlying stream
  • Translate on the fly enough characters to
    satisfy the request (at least one)
  • And return that converted text instead of the
    original text.
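
A minimal sketch of such a pull-style converter follows. It is not the CsvToXmlStreamReader from the attached code: the class name, the table/row/cell element names, and the naive comma splitting (no quoted fields) are all assumptions made to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical converter, for illustration only.
class SketchCsvToXmlReader extends Reader {
    private final BufferedReader in;
    private String pending = "<table>"; // converted text not yet handed out
    private int pos = 0;                // next unread position in pending
    private boolean finished = false;   // true once </table> has been queued

    SketchCsvToXmlReader(Reader csv) {
        this.in = new BufferedReader(csv);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (pos == pending.length()) { // drained: convert one more line
            if (finished) {
                return -1;
            }
            String line = in.readLine();
            if (line == null) {
                pending = "</table>";
                finished = true;
            } else {
                pending = rowToXml(line);
            }
            pos = 0;
        }
        // Return at least one converted character, at most len.
        int n = Math.min(len, pending.length() - pos);
        pending.getChars(pos, pos + n, buf, off);
        pos += n;
        return n;
    }

    // Naive split on commas; real CSV quoting is deliberately ignored here.
    static String rowToXml(String line) {
        StringBuilder sb = new StringBuilder("<row>");
        for (String cell : line.split(",", -1)) {
            sb.append("<cell>").append(escape(cell)).append("</cell>");
        }
        return sb.append("</row>").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Convenience for trying it out: drains the reader in small chunks.
    static String convert(String csv) {
        try (Reader r = new SketchCsvToXmlReader(new StringReader(csv))) {
            StringBuilder out = new StringBuilder();
            char[] buf = new char[16];
            for (int n; (n = r.read(buf, 0, buf.length)) != -1; ) {
                out.append(buf, 0, n);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Wrapping any CSV reader in new StreamSource(new SketchCsvToXmlReader(reader)) would then present it to an XSLT or XQuery processor as XML text.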

20
A SAXSource Converter
  • SAXSource is a little easier to write, because
    instead of data being pulled through, you push it
    at your convenience
  • Just implement an org.xml.sax.XMLReader
  • When your parse() method gets called, read that
    input as fast as you please and call the
    methods on the ContentHandler (and maybe
    LexicalHandler, etc.) you were given
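
A push-style reader could be sketched as below. It is not the CsvToXmlSaxReader from the attached code. To avoid spelling out every XMLReader method, the sketch extends org.xml.sax.helpers.XMLFilterImpl (which implements XMLReader) and overrides only parse(); the class name, element names, and naive comma splitting are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical SAX-emitting CSV reader, for illustration only.
class SketchCsvSaxReader extends XMLFilterImpl {
    private static final Attributes NO_ATTRS = new AttributesImpl();

    @Override
    public void parse(InputSource input) throws IOException, SAXException {
        Reader chars = input.getCharacterStream();
        if (chars == null) {
            throw new SAXException("This sketch only handles character streams");
        }
        BufferedReader in = new BufferedReader(chars);
        // The inherited event methods forward to the registered ContentHandler.
        startDocument();
        startElement("", "table", "table", NO_ATTRS);
        for (String line; (line = in.readLine()) != null; ) {
            startElement("", "row", "row", NO_ATTRS);
            for (String cell : line.split(",", -1)) { // naive: no quoted fields
                startElement("", "cell", "cell", NO_ATTRS);
                characters(cell.toCharArray(), 0, cell.length());
                endElement("", "cell", "cell");
            }
            endElement("", "row", "row");
        }
        endElement("", "table", "table");
        endDocument();
    }

    // Convenience for trying it out: logs the events pushed for some CSV text.
    static String events(String csv) {
        StringBuilder log = new StringBuilder();
        SketchCsvSaxReader reader = new SketchCsvSaxReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                log.append('<').append(qName).append('>');
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                log.append("</").append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                log.append(ch, start, length);
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(csv)));
        } catch (IOException | SAXException e) {
            throw new RuntimeException(e);
        }
        return log.toString();
    }
}
```

To hand this to a processor, the reader would typically be wrapped as new SAXSource(reader, inputSource). Whether a given processor also expects particular SAX features or properties to be settable on the reader is not covered by this sketch.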

21
Please, Don't Do This
  • If driving the input via SAXSource, please don't
    do this to start your XML:
  • content.processingInstruction("xml",
    "version='1.0' encoding='utf-8'")
  • because

22
It's not a PI!
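
One hedged illustration of the alternative: when SAX events are serialized, the XML declaration is written by the serializer in response to startDocument() and its output properties, not by the event producer. The sketch below uses the standard JAXP TransformerHandler for that and assumes the default TransformerFactory supports SAX (as the JDK's built-in one does). The class name is made up.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class, for illustration only.
class SketchDeclarationDemo {
    static String serializeEmptyTable() {
        try {
            // Assumption: the default factory is a SAXTransformerFactory.
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler serializer = factory.newTransformerHandler();
            // The encoding is requested as an output property, not a fake PI.
            serializer.getTransformer().setOutputProperty(OutputKeys.ENCODING, "utf-8");
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));

            serializer.startDocument(); // no processingInstruction("xml", ...) here
            serializer.startElement("", "table", "table", new AttributesImpl());
            serializer.endElement("", "table", "table");
            serializer.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```
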
23
References
  • DataDirect XQuery blog
  • http://www.xml-connection.com/
  • DataDirect XML Converters
  • http://www.xmlconverters.com/
  • Stylus Studio
  • http://www.stylusstudio.com/
  • Saxonica
  • http://www.saxonica.com/
  • XML Catalogs
  • http://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html
  • The official web site for SAX
  • http://www.saxproject.org/

24
Explanation of the attached code
.zip file with source and data.
  • Don't read this now! This is reference material
    for after the conference.
  • CSVResolver is the simple CSV resolver. It is a
    subset of the MultiResolver.
  • MultiResolver is a resolver that can handle CSV
    or EDI.
  • CsvToXmlSaxReader reads CSV files and emits them
    as a series of SAX events. See also
    EdiToXmlSaxReader for the EDI equivalent. This is
    a push interface, where we push the data
    through. See also CsvToXmlStreamReader, which is
    a pull interface, since the caller pulls, or
    requests, data from us. StAX is a very effective
    interface that offers event-driven access like
    SAX but uses a pull paradigm like Reader. It
    often results in superior performance, at the
    cost of a considerably larger API and
    consequently a more complicated implementation.
    I didn't do a demo of one. Maybe next year.
  • EdiToXmlSaxReader reads EDI files and emits them
    as a series of SAX events. See also
    CsvToXmlSaxReader for the CSV equivalent.
  • CsvToXmlStreamReader reads CSV files and emits
    them as characters through the java.io.Reader
    interface. This is a pull interface, since the
    caller is asking us for the data. See also
    CsvToXmlSaxReader, which is a push interface,
    where we do the driving.
  • JaxpXsltDemo uses JAXP to drive XSLT through a
    converter.
  • SaxonXQueryDemo uses Saxon to drive XQuery
    through a converter.
  • DataDirectXQueryDemo does the same but using
    DataDirect XQuery.
  • DemoCSVtoXML is a little program that opens a
    file and prints the contents. Then it does it
    again, but inserts the CsvToXmlStreamReader, which
    catches the content and converts it into XML
    character text in-flight. This just proves the
    CsvToXmlStreamReader works.
  • XmlSaxReaderBase is the code that is common
    to the CsvToXmlSaxReader and EdiToXmlSaxReader
    classes.

25
Questions?