Title: SAX: Simple API for XML 1.0. Showing structure of XML with a Java program
1 SAX: Simple API for XML 1.0. Showing structure of XML with a Java program
- The Java program Tree.java runs the SAX parser on an XML file to display its tree structure and can also be used to show parsing errors.
- The program appears in the notes and in the Deitel XML text, chapter 9.
2 An XML file: spacing1.xml
- <?xml version = "1.0"?>
- <!-- Fig. 9.4: spacing1.xml -->
- <!-- Whitespaces in nonvalidating parsing -->
- <!-- XML document without DTD -->
- <test name = " spacing 1 ">
- <example><object>World</object></example>
- </test>
3 SAX: Simple API for XML
- A Java program to show the tree structure of an XML document: Tree.java.
- Run on the command line, where yes turns validation on, no turns it off, and f.xml is the file to parse:
- java Tree yes/no f.xml
4 Run Tree on notvalid.xml
- <?xml version = "1.0"?>
- <!-- Fig. 9.6: notvalid.xml -->
- <!-- Validation and non-validation -->
- <!DOCTYPE test [
- <!ELEMENT test (example)>
- <!ELEMENT example (#PCDATA)>
- ]>
- <test>
- <?test message?>
- <example><item><![CDATA[Hello Welcome!]]></item></example>
- </test>
5 Run Tree on notvalid.xml
- C:\PROGRA~1\JAVA\JDK15~1.0_0\BIN>java Tree no notvalid.xml
- URL: file:C:/PROGRA~1/Java/JDK15~1.0_0/bin/notvalid.xml
- document root
- - element test
- - ignorable
- - proc-inst test "message"
- - ignorable
- - element example
- - element item
- - text "Hello Welcome!"
- - ignorable
- document end
- C:\PROGRA~1\JAVA\JDK15~1.0_0\BIN>
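A tree display like the one above can be approximated with a short SAX handler. This is a minimal sketch, not the Tree.java from the notes: it assumes the JAXP `SAXParserFactory` that ships with current JDKs, and the class name `TreeSketch`, the `outline` method, and the two-space indentation are invented for illustration. Whitespace-only text is dropped, so the output only resembles the listing above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for Tree.java: one indented line per SAX event.
class TreeSketch extends DefaultHandler {
    private final List<String> lines = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private int depth = 0;

    // Record one line, indented two spaces per nesting level.
    private void emit(String label) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        lines.add(line.append(label).toString());
    }

    // The parser may deliver text in several chunks, so buffer it and
    // report it once. Whitespace-only text is dropped in this sketch.
    private void flushText() {
        String t = text.toString().trim();
        text.setLength(0);
        if (!t.isEmpty()) {
            emit("text: \"" + t + "\"");
        }
    }

    @Override
    public void startDocument() {
        emit("document root");
        depth++;
    }

    @Override
    public void endDocument() {
        depth--;
        emit("document end");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        emit("element: " + qName);
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        flushText();
        emit("ignorable");
    }

    @Override
    public void processingInstruction(String target, String data) {
        flushText();
        emit("proc-inst: " + target + " \"" + data + "\"");
    }

    // Parse an XML string and return the recorded outline.
    static List<String> outline(String xml) {
        TreeSketch handler = new TreeSketch();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<test><?test message?><example><item>Hello</item></example></test>";
        for (String line : outline(xml)) {
            System.out.println(line);
        }
    }
}
```

The `ignorable` lines in the slide depend on the DTD telling the parser which whitespace sits in element-only content; a document without a DTD reports that whitespace through `characters` instead.");
if (!out.get(0).equals("document root")) throw new AssertionError(out.toString());
if (!out.contains("  element: test")) throw new AssertionError(out.toString());
if (!out.contains("    proc-inst: test \"message\"")) throw new AssertionError(out.toString());
if (!out.contains("      element: item")) throw new AssertionError(out.toString());
if (!out.contains("        text: \"Hello\"")) throw new AssertionError(out.toString());
if (!out.get(out.size() - 1).equals("document end")) throw new AssertionError(out.toString());
</test>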
6 Running again but showing SAX parse errors
- C:\PROGRA~1\JAVA\JDK15~1.0_0\BIN>java Tree yes notvalid.xml
- URL: file:C:/PROGRA~1/Java/JDK15~1.0_0/bin/notvalid.xml
- document root
- - element test
- - ignorable
- - proc-inst test "message"
- - ignorable
- - element example
- Parse Error: Element type "item" must be declared.
- C:\PROGRA~1\JAVA\JDK15~1.0_0\BIN>
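The difference between the "no" and "yes" runs can be reproduced with JAXP by switching validation on and collecting what the parser reports to the error handler. The following is a sketch under assumptions: `ValidationSketch` and `problems` are hypothetical names, the document is a copy of notvalid.xml embedded as a string, and unlike the Tree run above this handler keeps parsing after the first validity error instead of stopping.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: parse with or without validation and list the errors.
class ValidationSketch {
    // notvalid.xml from the slide, embedded so the sketch is self-contained.
    static final String NOT_VALID =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE test [\n"
        + "<!ELEMENT test (example)>\n"
        + "<!ELEMENT example (#PCDATA)>\n"
        + "]>\n"
        + "<test>\n"
        + "<?test message?>\n"
        + "<example><item><![CDATA[Hello Welcome!]]></item></example>\n"
        + "</test>\n";

    static List<String> problems(String xml, boolean validate) {
        final List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Validity errors are recoverable, so record them and carry on.
            @Override
            public void error(SAXParseException e) {
                found.add("Parse Error: " + e.getMessage());
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validate);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Well-formedness (fatal) errors end the parse.
            found.add("Fatal Error: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("could not run the parser", e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("no:  " + problems(NOT_VALID, false));
        System.out.println("yes: " + problems(NOT_VALID, true));
    }
}
```

The document is well formed, so the non-validating run reports nothing; only the validating run complains about the undeclared `item` element.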
7 The pastry XML file
- <?xml version = "1.0"?>
- <!-- pastry.xml -->
- <!-- Using an external subset -->
- <!DOCTYPE donuts SYSTEM "pastry.dtd">
- <donuts>
- <jelly>grape</jelly>
- <lemon>sour</lemon>
- <lemon>real sour</lemon>
- <glazed>chocolate</glazed>
- </donuts>
8 Running a SAX parser on this (Java code in notes)
- C:\PROGRA~1\JAVA\JDK15~1.0_0\BIN>java MySAXApp pastry.xml
- Start document
- Start element donuts
- Start element jelly
- Characters "grape"
- End element jelly
- Start element lemon
- Characters "sour"
- End element lemon
- Start element lemon
- Characters "real sour"
- End element lemon
- Start element glazed
- Characters "chocolate"
- End element glazed
- End element donuts
- End document
- C:\PROGRA~1\JAVA\JDK15~1.0_0\BIN>
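An event listing of this kind comes from a content handler registered on an `XMLReader`. The code below is a hedged sketch rather than the MySAXApp from the notes: `EventLogSketch` and `log` are hypothetical names, the reader is obtained through JAXP, and whitespace-only text between elements is skipped so the log matches the shape of the listing above without needing pastry.dtd.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for MySAXApp: records each SAX event as a line.
class EventLogSketch extends DefaultHandler {
    private final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // Report buffered text once, skipping whitespace-only runs.
    private void flushText() {
        String t = text.toString().trim();
        text.setLength(0);
        if (!t.isEmpty()) {
            events.add("Characters: \"" + t + "\"");
        }
    }

    @Override
    public void startDocument() {
        events.add("Start document");
    }

    @Override
    public void endDocument() {
        events.add("End document");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        events.add("Start element: " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("End element: " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // Register the handler on an XMLReader and parse an XML string.
    static List<String> log(String xml) {
        EventLogSketch handler = new EventLogSketch();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        String xml = "<donuts>\n <jelly>grape</jelly>\n <lemon>real sour</lemon>\n</donuts>";
        for (String event : log(xml)) {
            System.out.println(event);
        }
    }
}
```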
9 A day planner using the SAX parser for XML and Java
10 SAX: Simple API for XML 2.0. SAX 2.0 recently released
- The Xerces parser is available at
- http://xml.apache.org/xerces
- You may need to search the Apache site. I found the latest version, a zip file, at
- http://apache.cs.utah.edu/xml/xerces-j/
- This SAX v2.0 parser is needed to run the PrintXML.java example, chapter 9, Deitel.
11 Sun App Server: an aside
- This server may be needed for the WSDP (Web Services Developer Pack), which contains a tutorial, the Xerces parser, the Xalan XSL processor, and classes for SOAP.
- C:\Sun\AppServer\docs\about.html
12 Xerces parser
- Deitel's XML How to Program comes with a release of the Xerces parser. You can unzip these files; you'll need them for SOAP in any case. You can also find the Xerces parser on the Apache site.
13 Xerces
- Xerces-J is packaged as a ZIP file for all platforms and operating systems.
- You can run the Java jar command to unpack the distribution:
- jar xf Xerces-J-bin.1.2.0.zip
- jar xf Xerces-J-src.1.2.0.zip
- These commands create a "xerces-1_2_0" sub-directory in the current directory containing all the files.
- Files in the binary package release:
- LICENSE: License for Xerces-J
- Readme.html: Web page redirect to docs/html/index.html
- xerces.jar: Jar file containing all the parser class files
- xercesSamples.jar: Jar file containing all sample class files
- data/: Directory containing sample XML data files
- docs/html/: Directory containing documentation
- docs/apiDocs/: Directory containing Javadoc API for parser framework
14 Xerces
- I just ran the .exe files in the release, which seemed to unpack things OK.
15 Xerces
- Running Xerces: Xerces is a Java program that comes as a jar file. You'll need the path set to your java/bin directory and the classpath set to wherever xerces.jar is. I created batch files to run the SAXCount and DOMCount Java programs, which display the parser at work on XML files.
16 The next few examples
- These examples come with the distribution and are at
- C:\Xerces\xerces-1_2_0\docs\html\domwriter.html
17 The XML files: personal.xml
- <?xml version="1.0" encoding="UTF-8"?>
- <!DOCTYPE personnel SYSTEM "personal.dtd">
- <personnel>
- <person id="Big.Boss">
- <name><family>Boss</family> <given>Big</given></name>
- <email>chief@foo.com</email>
- <link subordinates="one.worker two.worker three.worker four.worker five.worker"/>
- </person>
- <person id="one.worker">
- <name><family>Worker</family> <given>One</given></name>
- <email>one@foo.com</email>
- <link manager="Big.Boss"/>
- </person>
- <person id="two.worker">
- <name><family>Worker</family> <given>Two</given></name>
- <email>two@foo.com</email>
- <link manager="Big.Boss"/>
- </person>
- <person id="three.worker">
18 personal.dtd
- <?xml encoding="UTF-8"?>
- <!ELEMENT personnel (person)+>
- <!ELEMENT person (name,email*,url*,link?)>
- <!ATTLIST person id ID #REQUIRED>
- <!ATTLIST person note CDATA #IMPLIED>
- <!ATTLIST person contr (true|false) 'false'>
- <!ATTLIST person salary CDATA #IMPLIED>
- <!ELEMENT name ((family,given)|(given,family))>
- <!ELEMENT family (#PCDATA)>
- <!ELEMENT given (#PCDATA)>
- <!ELEMENT email (#PCDATA)>
- <!ELEMENT url EMPTY>
- <!ATTLIST url href CDATA 'http://'>
19 Batch files: DOMCount.bat
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xerces-1_2_0\xercesSamples.jar
- cd c:\xerces-1_2_0
- java dom.DOMCount data/personal.xml
- C:\xerces-1_2_0>java dom.DOMCount data/personal.xml
- data/personal.xml: 170 ms (37 elems, 18 attrs, 26 spaces, 242 chars)
20 Batch files: SAXTest.bat
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xerces-1_2_0\xercesSamples.jar
- cd c:\xerces-1_2_0
- java sax.SAXCount data/personal.xml
21 SAXCount
- C:\XERCES~1>SAXTest.bat
- C:\XERCES~1>set PATH=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\PROGRA~1\COMMON~1\ADAPTE~1\System;C:\jakarta\JAKART~1.28\bin;c:\progra~1\java\jdk15~1.0_0\bin;c:\jakarta\jakart~1.28\common\lib;C:\Progra~1\Java\jdk15.0_0\bin
- C:\XERCES~1>set CLASSPATH=c:\progra~1\java\jdk15~1.0_0\lib\tools.jar;c:\progra~1\java\jdk15~1.0_0\bin;c:\progra~1\java\jdk15~1.0_0\bin\hello;c:\jakarta\jakart~1.28\common\lib;c:\xerces-1_2_0\xerces.jar;c:\xerces-1_2_0\xercesSamples.jar
- C:\XERCES~1>cd c:\xerces-1_2_0
- C:\xerces-1_2_0>java sax.SAXCount data/personal.xml
- data/personal.xml: 181 ms (37 elems, 18 attrs, 26 spaces, 242 chars)
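The four numbers in that summary line correspond to four SAX callbacks. As a rough illustration, and not the sax.SAXCount source from the Xerces samples, the sketch below keeps one counter per callback. `CountSketch`, `count`, and `summary` are hypothetical names, and timing is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical counter in the spirit of SAXCount: one tally per event type.
class CountSketch extends DefaultHandler {
    private int elements;
    private int attributes;
    private int spaces;
    private int characters;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements++;
        attributes += atts.getLength();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        characters += length;
    }

    // Only called when a DTD marks the whitespace as sitting in element content.
    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        spaces += length;
    }

    String summary() {
        return String.format("(%d elems, %d attrs, %d spaces, %d chars)",
                elements, attributes, spaces, characters);
    }

    // Parse an XML string and return the handler holding the totals.
    static CountSketch count(String xml) {
        CountSketch handler = new CountSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        String xml = "<personnel><person id=\"Big.Boss\"><name>Boss</name>"
                + "<link manager=\"Big.Boss\"/></person></personnel>";
        System.out.println(count(xml).summary());
    }
}
```

For the small document in `main`, which has no DTD and no whitespace between tags, this gives 4 elements, 2 attributes, 0 spaces, and 4 characters.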
22 SAXWriter.bat: writing out the XML content
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xerces-1_2_0\xercesSamples.jar
- cd c:\xerces-1_2_0
- java sax.SAXWriter data/personal.xml
- The output is on the slides that follow. I cut some blank lines to get it to fit.
23 DOMWriter
- C:\xerces-1_2_0>java dom.DOMWriter data/personal.xml
- data/personal.xml:
- <?xml version="1.0" encoding="UTF-8"?>
- <personnel>
- <person contr="false" id="Big.Boss">
- <name><family>Boss</family> <given>Big</given></name>
- <email>chief@foo.com</email>
- <link subordinates="one.worker two.worker three.worker four.worker five.worker"></link>
- </person>
- <person contr="false" id="one.worker">
- <name><family>Worker</family> <given>One</given></name>
- <email>one@foo.com</email>
- <link manager="Big.Boss"></link>
- </person>
- <person contr="false" id="two.worker">
- <name><family>Worker</family> <given>Two</given></name>
- <email>two@foo.com</email>
- <link manager="Big.Boss"></link>
- </person>
24
- C:\xerces-1_2_0>java sax.SAXWriter data/personal.xml
- data/personal.xml:
- <?xml version="1.0" encoding="UTF-8"?>
- <personnel>
- <person contr="false" id="Big.Boss">
- <name><family>Boss</family> <given>Big</given></name>
- <email>chief@foo.com</email>
- <link subordinates="one.worker two.worker three.worker four.worker five.worker"></link>
- </person>
- <person contr="false" id="one.worker">
- <name><family>Worker</family> <given>One</given></name>
- <email>one@foo.com</email>
- <link manager="Big.Boss"></link>
- </person>
- <person contr="false" id="two.worker">
- <name><family>Worker</family> <given>Two</given></name>
- <email>two@foo.com</email>
- <link manager="Big.Boss"></link>
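Both writers print contr="false" on every person even though personal.xml never sets it: the value comes from the default declared in the DTD's ATTLIST. The sketch below shows the same effect with the JDK's DOM parser and an identity `Transformer`. It is not the dom.DOMWriter sample; `DomWriterSketch` and `roundTrip` are hypothetical names, and the document is a cut-down version of personal.xml with an internal DTD subset so that no external file is needed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical writer: parse into a DOM tree, then serialize it back to text.
class DomWriterSketch {
    // Reduced personal.xml; the DTD gives contr a default value of 'false'.
    static final String SAMPLE =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE personnel [\n"
        + "<!ELEMENT personnel (person)+>\n"
        + "<!ELEMENT person (email)>\n"
        + "<!ATTLIST person id ID #REQUIRED>\n"
        + "<!ATTLIST person contr (true|false) 'false'>\n"
        + "<!ELEMENT email (#PCDATA)>\n"
        + "]>\n"
        + "<personnel><person id=\"Big.Boss\"><email>chief@foo.com</email></person></personnel>";

    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringWriter out = new StringWriter();
            // A Transformer created without a stylesheet copies its input unchanged.
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(SAMPLE));
    }
}
```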
25 Iterator view: some classes missing (org.w3c.dom.)
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xerces-1_2_0\xercesSamples.jar
- cd c:\xerces-1_2_0
- java dom.traversal.IteratorView data/personal.xml
- Not running.
26 Xalan: XSL
- Xalan can be downloaded in the same way as Xerces. It also comes on the Deitel XML CD. Xalan contains software to process XSL. The distribution also contains Xerces.
- There are documents to help you get started at C:\Xalan\Xalan Getting Started.htm
- You'll have to set your path and classpath as we did for Xerces.
- I've provided an example, but your version will depend on where you unpack the zip files.
27 SimpleTransform output
- C:\Xalan\xalan-j_1_2_D02\samples\SimpleTransform>java SimpleTransform
- <?xml version="1.0" encoding="UTF-8"?>
- <out>Hello</out>
- C:\XALAN\XALAN-~1\SAMPLES>
28 SimpleTransform: XML and XSL files
- The XML:
- <?xml version="1.0"?>
- <doc>Hello</doc>
- The XSL:
- <?xml version="1.0"?>
- <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
- <xsl:template match="doc">
- <out><xsl:value-of select="."/></out>
- </xsl:template>
- </xsl:stylesheet>
29 Batch file to run SimpleTransform
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xalan\xalan-j_1_2_D02\xalan.jar;c:\xalan\xalan-j_1_2_D02\samples\xalansamples.jar
- cd c:\xalan\xalan-j_1_2_D02\samples\SimpleTransform
- java SimpleTransform
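The same transformation can be tried without a Xalan 1.2 installation, because current JDKs bundle an XSLT processor behind the `javax.xml.transform` API. This is a sketch under that assumption and is not the SimpleTransform sample itself: the class name `SimpleTransformSketch` is hypothetical, and the XML and XSL from the slides are embedded as strings instead of being read from files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical equivalent of the SimpleTransform sample using JAXP.
class SimpleTransformSketch {
    static final String XML = "<?xml version=\"1.0\"?><doc>Hello</doc>";

    static final String XSL =
          "<?xml version=\"1.0\"?>"
        + "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:template match=\"doc\">"
        + "<out><xsl:value-of select=\".\"/></out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Compile the stylesheet, apply it to the document, return the result text.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```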
30 ApplyXPath: looking for a particular item in the XML
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xalan\xalan-j_1_2_D02\xalan.jar;c:\xalan\xalan-j_1_2_D02\samples\xalansamples.jar
- cd c:\xalan\xalan-j_1_2_D02\samples\ApplyXPath
- java ApplyXPath foo.xml /doc/name/@first
31 ApplyXPath output
- C:\Xalan\XALAN-~1\samples>cd c:\xalan\xalan-j_1_2_D02\samples\ApplyXPath
- C:\Xalan\xalan-j_1_2_D02\samples\ApplyXPath>java ApplyXPath foo.xml /doc/name/@first
- <output>
- DavidDavidDonaldEmilyJackMyriamPaulRobertScottShane</output>
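The expression /doc/name/@first selects the first attribute of every name element under doc. A minimal way to try such an expression is the `javax.xml.xpath` API in the JDK. The sketch below assumes that API rather than the Xalan ApplyXPath sample; `ApplyXPathSketch` and `select` are hypothetical names, and because the contents of foo.xml are not shown in the slides, the three-name document is made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical counterpart to ApplyXPath: evaluate an expression, list the values.
class ApplyXPathSketch {
    // Made-up sample data; the real foo.xml is not reproduced in the slides.
    static final String SAMPLE =
          "<doc>"
        + "<name first=\"David\" last=\"A\"/>"
        + "<name first=\"Emily\" last=\"B\"/>"
        + "<name first=\"Shane\" last=\"C\"/>"
        + "</doc>";

    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // For an attribute node, the node value is the attribute's text.
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad expression or document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(SAMPLE, "/doc/name/@first"));
    }
}
```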
32 PureSAX
- The PureSAX class uses SAX DocumentHandlers and the Xerces SAX parser to produce a stylesheet tree, an XML input tree, and the transformation result tree.
33 Batch file for PureSAX
- set PATH=%PATH%;C:\Progra~1\Java\jdk15.0_0\bin
- set CLASSPATH=%CLASSPATH%;c:\xerces-1_2_0\xerces.jar;c:\xalan\xalan-j_1_2_D02\xalan.jar;c:\xalan\xalan-j_1_2_D02\samples\xalansamples.jar
- cd c:\xalan\xalan-j_1_2_D02\samples\PureSAX
- java PureSAX
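The idea behind PureSAX, feeding both the stylesheet and the input document to the processor as SAX events, has a rough modern analogue in JAXP's `SAXTransformerFactory`. The sketch below illustrates that analogue and is not the Xalan 1.2 PureSAX sample: it assumes the JDK's default `TransformerFactory` supports the SAX extension (the cast fails otherwise), reuses the SimpleTransform XML and XSL as strings, and `PureSaxSketch` is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TemplatesHandler;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical SAX-only pipeline: stylesheet and input both arrive as SAX events.
class PureSaxSketch {
    static final String XML = "<?xml version=\"1.0\"?><doc>Hello</doc>";

    static final String XSL =
          "<?xml version=\"1.0\"?>"
        + "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:template match=\"doc\">"
        + "<out><xsl:value-of select=\".\"/></out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xsl) {
        try {
            // Assumption: the default factory implements the SAX extension.
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();

            // XSLT relies on namespaces, so the reader must report them.
            SAXParserFactory parsers = SAXParserFactory.newInstance();
            parsers.setNamespaceAware(true);
            XMLReader reader = parsers.newSAXParser().getXMLReader();

            // Step 1: parse the stylesheet; its SAX events build the templates.
            TemplatesHandler stylesheetHandler = factory.newTemplatesHandler();
            reader.setContentHandler(stylesheetHandler);
            reader.parse(new InputSource(new StringReader(xsl)));

            // Step 2: parse the input; its SAX events drive the transformation.
            TransformerHandler transformerHandler =
                    factory.newTransformerHandler(stylesheetHandler.getTemplates());
            StringWriter out = new StringWriter();
            transformerHandler.setResult(new StreamResult(out));
            reader.setContentHandler(transformerHandler);
            reader.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (TransformerException | ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX pipeline failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```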
34 An applet to transform XML into HTML
- The applet uses a stylesheet to transform an XML document into HTML. It displays the XML document, the stylesheet, and the HTML output.
- How to use the Xalan-Java applet wrapper:
- Include the XSLTProcessorApplet class in an HTML client.
- Specify the XML source document and XSL stylesheet. You can use the DocumentURL and StyleURL PARAM tags, or the XSLTProcessorApplet setDocumentURL() and setStyleURL() methods. If the XML document contains a stylesheet processing instruction (PI), you do not need to specify an XSL stylesheet.
- Call the XSLTProcessorApplet transformToHTML() method, which performs the transformation and returns the new document as a String.
35 The applet: remarks on classes and running it
- This applet transforms XML into HTML. Given the restrictions imposed by the applet sandbox, the local copy of this applet does not load and run correctly in some environments and with some versions of IE/Netscape. Run the applet from an HTTP server, and these problems disappear.
- The local copy of client.html assumes that xalan.jar and xerces.jar are in the Xalan root directory, two directories above the samples/applet subdirectory. If these JAR files are located elsewhere, you must edit the applet archive attribute in client.html to point to xalan.jar and xerces.jar.
36 The applet before hitting the button
37 Running the applet (in an IE window)
38 A servlet example: delivering XML as HTML
- The client (which you must set up) specifies an XML document and a stylesheet. The servlet performs the transformation and returns the output to the client. You can use media.properties to specify which stylesheet is to be used depending on the client browser/device.
- How to run it:
- Configure your application server (WebSphere or JServ, for example) so it can find the classes (in xalansamples.jar) as well as the stylesheets and properties file in the servlet subdirectory.
- Set up an HTML client to call DefaultApplyXSL with arguments as illustrated below.
- Examples:
- http://localhost/servlet/DefaultApplyXSL?URL=/data.xml&xslURL=/style.xsl
- ...applies the style.xsl stylesheet to the data.xml data. Both files are served from the Web server's HTTP document root.
- http://localhost/servlet/DefaultApplyXSL?URL=/data.xml&xslURL=/style.xsl&debug=true
- ...ensures that XML and XSL processor messages are returned in the event of problems applying style.xsl to data.xml.
- http://localhost/servlet/DefaultApplyXSL/data.xml?xslURL=/style.xsl
- ...applies the style.xsl stylesheet to the data.xml data, just like the first example. This is an alternative way of specifying the XML document by utilizing the HTTP request's path information.
39 More servlet examples
- More examples:
- http://localhost/servlet/DefaultApplyXSL/data.xml
- ...examines data.xml for an associated XSL stylesheet. If multiple XSLs are associated with the data, the stylesheet whose media attribute maps to your browser type will be chosen. If no mapping is successful, the primary associated stylesheet is used.
- http://localhost/servlet/data.xml
- ...provides the same function as the previous example, but this example assumes that /servlet/data.xml has been mapped to be executed by this servlet. The servlet engine may be configured to map all or some .xml files to this servlet through the use of servlet aliases or filters.
- http://localhost/servlet/data.xml?catalog=http://www.xml.org/dtds/oag.xml
- ...supplements any servlet-configured XCatalog with a catalog of supply chain DTDs residing at the XML.ORG DTD repository.
- For more information, see the comments in DefaultApplyXSL.java.