Title: 95-733 Week 5
Basic SAX Example from Chapter 5 of XML and Java
Working with XML SAX Filters as described in Chapter 5
Finding a Pattern using SAX
<?xml version="1.0" encoding="utf-8"?>
<department>
  <employee id="J.D">
    <name>John Doe</name>
    <email>John.Doe_at_foo.com</email>
  </employee>
  <employee id="B.S">
    <name>Bob Smith</name>
    <email>Bob.Smith_at_foo.com</email>
  </employee>
</department>
department.xml
TextMatch.java
import java.io.IOException;
import java.util.Stack;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

public class TextMatch extends DefaultHandler {
    StringBuffer buffer;
    String pattern;
    Stack context;
    public TextMatch(String pattern) {
        this.buffer = new StringBuffer();
        this.pattern = pattern;
        this.context = new Stack();
    }
    protected void flushText() {
        if (this.buffer.length() > 0) {
            String text = new String(this.buffer);
            if (pattern.equals(text)) {
                System.out.print("Pattern '" + this.pattern
                                 + "' has been found around ");
                for (int i = 0; i < this.context.size(); i++) {
                    System.out.print("/" + this.context.elementAt(i));
                }
                System.out.println("");
            }
        }
        this.buffer.setLength(0);
    }
    public void characters(char[] ch, int start, int len) throws SAXException {
        this.buffer.append(ch, start, len);
    }

    public void ignorableWhitespace(char[] ch, int start, int len) throws SAXException {
        this.buffer.append(ch, start, len);
    }

    public void processingInstruction(String target, String data) throws SAXException {
        // Nothing to do because PI does not affect the meaning
        // of a document.
    }
    public void startElement(String uri, String local, String qname,
                             Attributes atts) throws SAXException {
        this.flushText();
        this.context.push(local);
    }

    public void endElement(String uri, String local, String qname) throws SAXException {
        this.flushText();
        this.context.pop();
    }
    public static void main(String[] argv) {
        if (argv.length != 2) {
            System.out.println("TextMatch <pattern> <document>");
            System.exit(1);
        }
        try {
            XMLReader xreader = XMLReaderFactory.createXMLReader(
                "org.apache.xerces.parsers.SAXParser");
            xreader.setContentHandler(new TextMatch(argv[0]));
            xreader.parse(argv[1]);
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } catch (SAXException se) {
            se.printStackTrace();
        }
    }
}
The XMLReader interface declares setContentHandler
and parse.
Looking for Bob.Smith_at_foo.com
D:\McCarthy\www\95-733\examples\chap05>java TextMatch "Bob.Smith_at_foo.com" Department.xml
Pattern 'Bob.Smith_at_foo.com' has been found around /department/employee/email
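The same search can be tried without Xerces on the classpath. The following is a minimal sketch, not the course's code: it assumes the parser bundled with the JDK (obtained through SAXParserFactory), reads the document from a string, and returns the path instead of printing it. The class name PathFinder and the method find are hypothetical. Because the factory's default parser is not namespace-aware, the sketch pushes qname rather than local.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of TextMatch that returns the path of the first match.
class PathFinder extends DefaultHandler {
    private final String pattern;
    private final StringBuilder buffer = new StringBuilder();
    private final Deque<String> context = new ArrayDeque<>();
    private String foundPath; // stays null until the pattern is seen

    PathFinder(String pattern) {
        this.pattern = pattern;
    }

    // Compare the text collected so far with the pattern, then reset the buffer.
    private void flushText() {
        if (foundPath == null && pattern.contentEquals(buffer)) {
            StringBuilder path = new StringBuilder();
            // push() adds at the head, so walk from the tail (outermost element) inward
            for (Iterator<String> it = context.descendingIterator(); it.hasNext();) {
                path.append('/').append(it.next());
            }
            foundPath = path.toString();
        }
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int len) {
        buffer.append(ch, start, len);
    }

    @Override
    public void startElement(String uri, String local, String qname, Attributes atts) {
        flushText();
        context.push(qname);
    }

    @Override
    public void endElement(String uri, String local, String qname) {
        flushText();
        context.pop();
    }

    // Parse an in-memory document and return the path of the match, or null.
    static String find(String xml, String pattern) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        PathFinder handler = new PathFinder(pattern);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.foundPath;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<department><employee id=\"B.S\"><name>Bob Smith</name>"
                   + "<email>Bob.Smith_at_foo.com</email></employee></department>";
        System.out.println(find(xml, "Bob.Smith_at_foo.com"));
        // prints /department/employee/email
    }
}
```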
Filtering XML
Perhaps we would like to modify an existing
XML document. Or perhaps we would like to
generate an XML document from a flat file or a
database. We'll look at six examples that will
make the filtering process clear.
XMLReader
org.xml.sax Interface XMLReader
XMLReader is the interface that an XML parser's
SAX2 driver must implement. This interface
allows an application to set and query features
and properties in the parser, to register event
handlers for document processing, and to
initiate a document parse.
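As a small illustration of setting and querying a feature, the sketch below turns on the standard SAX2 namespaces feature and reads it back. It assumes the JDK's bundled parser obtained through SAXParserFactory; the class name FeatureDemo is hypothetical.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical demo: set a feature on an XMLReader and query it afterwards.
class FeatureDemo {
    // Feature name defined by SAX2; every XMLReader must recognize it.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    static boolean enableNamespaces() throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature(NAMESPACES, true);   // set
        return reader.getFeature(NAMESPACES);  // query
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespaces feature on: " + enableNamespaces());
    }
}
```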
org.xml.sax Interface XMLReader
Two example methods declared in this interface are:
    void setDTDHandler(DTDHandler handler)
        Allow an application to register a DTD event handler.
    void parse(InputSource input)
        Parse an XML document.
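A short sketch of these two methods used together follows. It assumes the JDK's bundled parser; the class NotationLister and its sample document are made up for illustration. DefaultHandler already implements DTDHandler, so the sketch only overrides notationDecl.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: collect the names of notations declared in a DTD.
class NotationLister extends DefaultHandler {
    final List<String> notations = new ArrayList<>();

    // DTD event: called once for each <!NOTATION ...> declaration.
    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        notations.add(name);
    }

    static List<String> list(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        NotationLister handler = new NotationLister();
        reader.setDTDHandler(handler);                         // register the DTD event handler
        reader.parse(new InputSource(new StringReader(xml)));  // parse(InputSource)
        return handler.notations;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!NOTATION gif SYSTEM \"image/gif\">]><doc/>";
        System.out.println(list(xml));
    }
}
```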
XMLReader
Create an XMLReader. Tell it what to parse. Tell it
where its contentHandler is. Tell it to parse.
[Diagram labels: parse, XML source, setContentHandler, contentHandler]
XMLFilter
org.xml.sax.XMLFilter Interface
An XML filter is like an XML reader, except that
it obtains its events from another XML reader
rather than a primary source like an XML
document or database. Filters can modify a stream
of events as they pass on to the final
application. For example, the filter might set
its own contentHandler. The parser will call that
one. This intervening handler can be programmed
to call the application's handler. Thus, the
calls from the parser to the handler are filtered.
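The idea of an intervening handler can be sketched by hand, before bringing in the XMLFilter interface itself. In the sketch below, which assumes the JDK's bundled parser, the parser only knows about the intervening handler, and that handler decides which calls reach the application's handler. All class names are hypothetical, and the rule used (drop email elements) is just an example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class InterveningDemo {
    // The application's handler: records every element name it is told about.
    static class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qname, Attributes atts) {
            names.add(qname);
        }
    }

    // The intervening handler: the parser calls this one, and it chooses
    // which startElement calls to pass on to the application's handler.
    static class Intervening extends DefaultHandler {
        private final ContentHandler application;

        Intervening(ContentHandler application) {
            this.application = application;
        }

        @Override
        public void startElement(String uri, String local, String qname, Attributes atts)
                throws SAXException {
            if (!"email".equals(qname)) {
                application.startElement(uri, local, qname, atts);
            }
        }
    }

    static List<String> run(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        NameCollector application = new NameCollector();
        parser.setContentHandler(new Intervening(application));
        parser.parse(new InputSource(new StringReader(xml)));
        return application.names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<department><employee><name>Bob Smith</name>"
                + "<email>Bob.Smith_at_foo.com</email></employee></department>"));
        // prints [department, employee, name]
    }
}
```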
XMLFilter
package org.xml.sax;

public interface XMLFilter extends XMLReader {
    // This method allows the application to link
    // the filter to a parent reader (which may
    // be another filter). The argument may not be null.
    public void setParent(XMLReader parent);

    // This method allows the application to query the
    // parent reader (which may be another filter).
    // It is generally a bad idea to perform any
    // operations on the parent reader directly:
    // they should all pass through this filter.
    public XMLReader getParent();
}
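Because a filter's parent may itself be a filter, filters can be chained with setParent. The sketch below is one way to see the order in which events travel through such a chain. It assumes the JDK's bundled parser and uses XMLFilterImpl from org.xml.sax.helpers as a ready-made XMLFilter; the Tagging class and the log labels are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class ChainDemo {
    // A filter that records its label when startDocument passes through it.
    static class Tagging extends XMLFilterImpl {
        private final String label;
        private final List<String> log;

        Tagging(String label, List<String> log) {
            this.label = label;
            this.log = log;
        }

        @Override
        public void startDocument() throws SAXException {
            log.add(label);
            super.startDocument(); // pass the event on down the chain
        }
    }

    static List<String> run(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        final List<String> log = new ArrayList<>();

        XMLFilter inner = new Tagging("inner", log);
        inner.setParent(parser);  // inner gets its events from the parser
        XMLFilter outer = new Tagging("outer", log);
        outer.setParent(inner);   // outer gets its events from inner

        outer.setContentHandler(new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("application");
            }
        });
        // The application talks only to the outermost filter.
        outer.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc/>"));
        // prints [inner, outer, application]
    }
}
```

The event reaches the filter closest to the parser first, which is why "inner" is logged before "outer".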
XMLFilter
[Diagram: XMLReader interface, 14 methods; XMLFilter interface, 14 XMLReader methods + 2]
XMLFilter
[Diagram: XMLReader object, XMLFilter object]
All methods of XMLReader are here (in the XMLFilter object). They
may block, pass on, or modify the calls to the parent.
org.xml.sax.helpers Class XMLFilterImpl
All Implemented Interfaces: ContentHandler,
DTDHandler, EntityResolver, ErrorHandler,
XMLFilter, XMLReader
All XMLReader methods are defined. These
methods, by default, pass calls to the parent
XMLReader.
By default, the XMLReader is set to call methods
defined here, in XMLFilterImpl, for XML content.
org.xml.sax.helpers Class XMLFilterImpl
This class is designed to sit between an
XMLReader and the client application's event
handlers. By default, it does nothing but pass
requests up to the reader and events on to the
handlers unmodified, but subclasses can override
specific methods to modify the event stream or
the configuration requests as they pass
through.
Constructor: XMLFilterImpl(XMLReader parent)
    Construct an XML filter with the specified parent.
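To make "override specific methods to modify the event stream" concrete, here is a minimal sketch of a subclass that removes every email element (and anything inside it) before the events reach the application. It assumes the JDK's bundled parser; DropEmailFilter and Echo are hypothetical names, and Echo ignores attributes to keep the sketch short.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class DropEmailDemo {
    // Blocks email elements; everything else passes through unmodified.
    static class DropEmailFilter extends XMLFilterImpl {
        private int skipDepth = 0; // greater than 0 while inside a dropped element

        DropEmailFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String local, String qname, Attributes atts)
                throws SAXException {
            if (skipDepth > 0 || "email".equals(qname)) {
                skipDepth++;
                return;
            }
            super.startElement(uri, local, qname, atts);
        }

        @Override
        public void endElement(String uri, String local, String qname) throws SAXException {
            if (skipDepth > 0) {
                skipDepth--;
                return;
            }
            super.endElement(uri, local, qname);
        }

        @Override
        public void characters(char[] ch, int start, int len) throws SAXException {
            if (skipDepth == 0) {
                super.characters(ch, start, len);
            }
        }
    }

    // The application's handler: rebuilds tags and text (attributes are ignored).
    static class Echo extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qname, Attributes atts) {
            out.append('<').append(qname).append('>');
        }

        @Override
        public void endElement(String uri, String local, String qname) {
            out.append("</").append(qname).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int len) {
            out.append(ch, start, len);
        }
    }

    static String run(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DropEmailFilter filter = new DropEmailFilter(parser);
        Echo echo = new Echo();
        filter.setContentHandler(echo);
        filter.parse(new InputSource(new StringReader(xml)));
        return echo.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<employee id=\"B.S\"><name>Bob Smith</name>"
                + "<email>Bob.Smith_at_foo.com</email></employee>"));
        // prints <employee><name>Bob Smith</name></employee>
    }
}
```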
Some Examples Using Filters
// Filter demo 1
// A very simple SAX program
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import java.io.IOException;
import org.xml.sax.SAXException;
public class MainDriver {
    public static void main(String[] argv) throws SAXException, IOException {
        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");
        // Get a handler
        MyHandler myHandler = new MyHandler();
        // Tell the parser about the handler
        parser.setContentHandler(myHandler);
        // Parse the input document
        parser.parse(argv[0]);
    }
}
class MyHandler extends DefaultHandler {
    // Handle events from the parser
    public void startDocument() throws SAXException {
        System.out.println("startDocument is called");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument is called");
    }
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver department.xml
startDocument is called
endDocument is called
Filter Demo 2
// Filter demo 2
// Adding an XMLFilterImpl that does nothing but supply
// an object that acts as an intermediary.
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.IOException;
import org.xml.sax.SAXException;
public class MainDriver2 {
    public static void main(String[] argv) throws SAXException, IOException {
        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");
        // Get a handler
        MyHandler myHandler = new MyHandler();
        // Get a filter and pass a pointer to the parser
        XMLFilterImpl myFilter = new XMLFilterImpl(parser);
        // After we create the XMLFilterImpl, all of the calls we make
        // on the parser will go through the filter. For example, we will
        // call setContentHandler on the filter and not the parser.
        // When we create the filter (it implements many interfaces),
        // the parser will call filter methods first. These methods will,
        // in turn, call our methods.
        // Tell the XMLFilterImpl about the handler
        myFilter.setContentHandler(myHandler);
        // Parse the input document
        myFilter.parse(argv[0]);
    }
}
class MyHandler extends DefaultHandler {
    // Handle events from the parser
    public void startDocument() throws SAXException {
        System.out.println("startDocument is called");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument is called");
    }
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver2 department.xml
startDocument is called
endDocument is called
Filter Demo 3
// Filter demo 3
// Adding an XMLFilterImpl
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.IOException;
import org.xml.sax.SAXException;
class MyCoolFilterImpl extends XMLFilterImpl {
    public MyCoolFilterImpl(XMLReader parser) {
        super(parser);
    }

    // There are two startDocument methods in this
    // class. This one overrides the inherited method.
    // The inherited method calls the outside
    // contentHandler.
    // The parser calls this method, this method calls
    // the base class method which calls the outside handler.
    public void startDocument() throws SAXException {
        System.out.println("Inside filter");
        super.startDocument();
        System.out.println("Leaving filter");
    }

    public void endDocument() throws SAXException {
        System.out.println("Inside filter");
        // Note: this calls super.startDocument(), not super.endDocument(),
        // so the outside handler's startDocument runs a second time
        // (see the output of this demo).
        super.startDocument();
        System.out.println("Leaving filter");
    }
}
public class MainDriver3 {
    public static void main(String[] argv) throws SAXException, IOException {
        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");
        // Get a handler
        MyHandler myHandler = new MyHandler();
        // Get a filter that we will treat as a parser
        XMLFilterImpl myFilter = new MyCoolFilterImpl(parser);
        // Tell the XMLFilterImpl about the handler
        myFilter.setContentHandler(myHandler);
        // Parse the input document
        myFilter.parse(argv[0]);
    }
}
class MyHandler extends DefaultHandler {
    // Handle events from the parser
    public void startDocument() throws SAXException {
        System.out.println("startDocument is called");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument is called");
    }
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver3 department.xml
Inside filter
startDocument is called
Leaving filter
Inside filter
startDocument is called
Leaving filter
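The second "startDocument is called" in this output follows from the filter's endDocument method calling super.startDocument(). For comparison, here is a sketch of a variant whose endDocument forwards to super.endDocument() instead. It assumes the JDK's bundled parser rather than Xerces and records messages in a list instead of printing them; the class names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class ForwardingFilterDemo {
    // Like MyCoolFilterImpl, except endDocument forwards to super.endDocument().
    static class ForwardingFilter extends XMLFilterImpl {
        private final List<String> log;

        ForwardingFilter(XMLReader parser, List<String> log) {
            super(parser);
            this.log = log;
        }

        @Override
        public void startDocument() throws SAXException {
            log.add("Inside filter");
            super.startDocument();
            log.add("Leaving filter");
        }

        @Override
        public void endDocument() throws SAXException {
            log.add("Inside filter");
            super.endDocument();
            log.add("Leaving filter");
        }
    }

    static List<String> run(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        final List<String> log = new ArrayList<>();
        ForwardingFilter filter = new ForwardingFilter(parser, log);
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument is called");
            }

            @Override
            public void endDocument() {
                log.add("endDocument is called");
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String line : run("<department/>")) {
            System.out.println(line);
        }
    }
}
```

With this variant the fifth line of output becomes "endDocument is called".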
Filter Demo 4
// Filter demo 4
// Passing xml to an XMLSerializer
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer; // not standard
import org.apache.xml.serialize.OutputFormat;  // not standard
public class MainDriver4 {
    public static void main(String[] argv) throws SAXException, IOException {
        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");
        // We need to write to a file
        FileOutputStream fos = new FileOutputStream("Filtered.xml");
        // An XMLSerializer can collect SAX events
        XMLSerializer xmlWriter = new XMLSerializer(fos, null);
        // Tell the parser about the handler (XMLSerializer)
        parser.setContentHandler(xmlWriter);
        // Parse the input document
        // The parser sends events to the XMLSerializer
        parser.parse(argv[0]);
    }
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver4 department.xml
D:\McCarthy\www\95-733\examples\xmlfilter>type filtered.xml
<?xml version="1.0"?>
<department>
  <employee id="J.D">
    <name>John Doe</name>
    <email>John.Doe_at_foo.com</email>
  </employee>
  <employee id="B.S">
    <name>Bob Smith</name>
    <email>Bob.Smith_at_foo.com</email>
  </employee>
  <employee id="A.M">
    <name>Alice Miller</name>
    <url href="http://www.foo.com/amiller/"/>
  </employee>
</department>
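The slides mark org.apache.xml.serialize.XMLSerializer as not standard. As a sketch of one alternative, and not the approach used in the course, the JDK's javax.xml.transform.sax.TransformerHandler is also a ContentHandler and can write the SAX events it receives to a StreamResult. The example below assumes the JDK's bundled parser and transformer, writes to a string rather than a file, and uses the hypothetical class name StandardSerializerDemo.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical demo: copy a document by sending SAX events to a TransformerHandler.
class StandardSerializerDemo {
    static String copy(String xml) throws Exception {
        // Get a namespace-aware parser so local names are populated
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader parser = spf.newSAXParser().getXMLReader();

        // An identity TransformerHandler collects SAX events and writes them out
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = stf.newTransformerHandler();
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // Tell the parser about the handler, then parse
        parser.setContentHandler(serializer);
        parser.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(copy("<department><employee id=\"J.D\">"
                + "<name>John Doe</name></employee></department>"));
    }
}
```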
Filter Demo 5
// Filter demo 5
// Placing a filter between the parser and the
// XMLSerializer
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer; // not standard
import org.apache.xml.serialize.OutputFormat;  // not standard
public class MainDriver5 {
    public static void main(String[] argv) throws SAXException, IOException {
        // Get a parser
        XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");
        // We need to write to a file
        FileOutputStream fos = new FileOutputStream("Filtered.xml");
        // An XMLSerializer can collect SAX events
        XMLSerializer xmlWriter = new XMLSerializer(fos, null);
        // Get a filter
        XMLFilterImpl myFilter = new AnotherCoolFilterImpl(parser);
        // Tell the XMLFilterImpl about the handler (XMLSerializer)
        myFilter.setContentHandler(xmlWriter);
        // Parse the input document
        myFilter.parse(argv[0]);
    }
}
class AnotherCoolFilterImpl extends XMLFilterImpl {
    public AnotherCoolFilterImpl(XMLReader parser) {
        super(parser);
    }

    public void startDocument() throws SAXException {
        System.out.println("Inside filter");
        super.startDocument();
        System.out.println("Leaving filter");
    }

    public void endDocument() throws SAXException {
        System.out.println("Inside filter");
        super.endDocument();
        System.out.println("Leaving filter");
    }
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver5 department.xml
Inside filter
Leaving filter
Inside filter
Leaving filter
Filtered.xml is as before.
Filter Demo 6
// Filter demo 6
// Writing our own parser and passing calls to a filter
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;
import java.io.FileOutputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
import org.apache.xml.serialize.XMLSerializer; // not standard
import org.apache.xml.serialize.OutputFormat;  // not standard
public class MainDriver6 {
    public static void main(String[] argv) throws SAXException, IOException {
        // Get a parser
        XMLReader parser = new MyCoolParser();
        // We need to write to a file
        FileOutputStream fos = new FileOutputStream("Filtered.xml");
        // An XMLSerializer can collect SAX events
        XMLSerializer xmlWriter = new XMLSerializer(fos, null);
        // Tell the parser about the handler (XMLSerializer)
        parser.setContentHandler(xmlWriter);
        // Parse the input document
        parser.parse("Some query or file name or ...");
    }
}
class MyCoolParser extends XMLFilterImpl {
    public MyCoolParser() {}

    public void parse(String aFileNameOrSQLQuery) throws IOException, SAXException {
        char[] ch = new char[10];
        ch[0] = 'H';
        ch[1] = 'i';
        // go to a file or go to a DBMS with a query
        // make calls to call back methods when this
        // code feels it's appropriate
        startDocument();
        startElement("", "MyNewTag", "", null);
        characters(ch, 0, 2);
        endElement("", "MyNewTag", "");
        endDocument();
    }
}
D:\McCarthy\www\95-733\examples\xmlfilter>java MainDriver6
D:\McCarthy\www\95-733\examples\xmlfilter>type filtered.xml
<?xml version="1.0"?>
<MyNewTag>Hi</MyNewTag>
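The same technique can generate an XML document from flat data. The sketch below is an illustration under several assumptions: the input is comma-separated "name,email" lines supplied through the InputSource's character stream, the element names mirror department.xml, and the events are written with the JDK's TransformerHandler instead of the Xerces XMLSerializer. The class names CsvToXmlDemo and CsvReader are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class CsvToXmlDemo {
    // A "parser" with no parent: it reads flat lines and fires SAX events itself.
    static class CsvReader extends XMLFilterImpl {
        private static final AttributesImpl NO_ATTS = new AttributesImpl();

        // Assumes the InputSource carries a character stream (not a system id).
        @Override
        public void parse(InputSource input) throws IOException, SAXException {
            BufferedReader lines = new BufferedReader(input.getCharacterStream());
            startDocument();
            startElement("", "department", "department", NO_ATTS);
            String line;
            while ((line = lines.readLine()) != null) {
                String[] fields = line.split(",");
                if (fields.length != 2) {
                    continue; // skip lines that are not "name,email"
                }
                startElement("", "employee", "employee", NO_ATTS);
                emit("name", fields[0]);
                emit("email", fields[1]);
                endElement("", "employee", "employee");
            }
            endElement("", "department", "department");
            endDocument();
        }

        // Fire the events for one element that contains only text.
        private void emit(String tag, String text) throws SAXException {
            startElement("", tag, tag, NO_ATTS);
            char[] ch = text.toCharArray();
            characters(ch, 0, ch.length);
            endElement("", tag, tag);
        }
    }

    static String convert(String csv) throws Exception {
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = stf.newTransformerHandler();
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        CsvReader reader = new CsvReader();
        reader.setContentHandler(serializer);
        reader.parse(new InputSource(new StringReader(csv)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(convert("John Doe,John.Doe_at_foo.com\n"
                + "Bob Smith,Bob.Smith_at_foo.com\n"));
    }
}
```

Unlike MyCoolParser, this sketch passes an empty AttributesImpl rather than null and fills in the qualified name, since some handlers expect both.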