XML StudySession: Part III

Provided by: qkc
1
XML Study-Session Part III
  • Parsing XML Documents

2
Objectives
  • By completing this study session, you should be
    able to:
  • Use the IBM XML4J Java XML parser.
  • Describe the Document Object Model (DOM).
  • Create a parsing application to display,
    navigate, and modify an XML document.

3
What is parsing?
  • Interpretation of text.
  • The XML parser's job is to load the document,
    check that it follows all necessary rules (at
    minimum, for well-formedness), and build a
    document tree structure that can be passed on to
    the application.
  • The application is any program (e.g. browser,
    reader, middleware) that acts upon the tree
    structure, processing the data it contains.

4
Overview of XML parsing
Fig. 1 (from Building XML Applications, St.
Laurent and Cerami): XML Document, then XML Parser,
then packets of parsed XML data, then XML
Application (the application that manipulates the
XML data). Every XML application includes at least
two pieces: an XML parser and an application to
manipulate the parsed XML data.

5
Types of parsers
  • Validating vs. non-validating
  • A validating parser checks a document against a
    declared DTD. A non-validating parser checks only
    for well-formedness.
  • Tree-based vs. event-driven interface
  • A parser with a tree-based interface reads the
    entire document and creates an internal tree
    representation of the data, which the application
    can then traverse. A standardized API for this
    interface is the W3C DOM.
  • In the event-driven model, the parser reads
    through the document and signals each significant
    parsing event (e.g. start of document, start of
    element, end of element). Callback methods
    handle these events as they occur. This approach
    is used by the Simple API for XML (SAX).
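The event-driven model can be sketched as follows. This is a minimal illustration, not code from the session: it assumes the JAXP classes bundled with the JDK (javax.xml.parsers and org.xml.sax) rather than the XML4J-specific classes, and the class name SaxEventDemo is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; uses JAXP from the JDK, not the XML4J classes.
class SaxEventDemo {

    // Parses the XML text and returns a log of parsing events, one per line.
    static String logEvents(String xml) throws Exception {
        final StringBuilder log = new StringBuilder();

        // Each callback is invoked by the parser when the matching event occurs.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.append("startDocument\n");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.append("startElement ").append(qName).append('\n');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.append("endElement ").append(qName).append('\n');
            }

            @Override
            public void endDocument() {
                log.append("endDocument\n");
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(logEvents("<Pets><Pet><Name>Rover</Name></Pet></Pets>"));
    }
}
```

No tree is built here: the application sees each event once, in document order, and must keep any state it needs itself.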

6
The IBM XML4J parser
  • Open source Java parser developed by IBM and now
    available as part of the xml.apache.org project
    under the codename Xerces.
  • Version 3.1.1 API supports DOM level 1 and SAX
    level 1.
  • Can be downloaded as a .zip file from
    www.alphaworks.ibm.com/tech/xml4j.
  • Ideal for standalone Java applications and
    working with Java servlets.

7
Setting up your environment
  • To use the classes in XML4J, you must set your
    Java CLASSPATH variable so that Java can locate
    the xerces.jar and xercesSamples.jar files.
  • To set the classpath in JCreator:
  • Configure - Options - JDK Profiles - select
    JDK version - Edit - Add Package - add
    d/xml4j/xerces.jar and d/xml4j/xercesSamples.jar
  • To run/execute a project with command-line
    arguments:
  • Project - Project Settings - JDK Tools -
    Select tool type Run Application - select
    - Edit - Parameters - set the Prompt
    for main function argument checkbox to True.

8
Understanding DOM
  • The W3C DOM specifies an interface for treating a
    document as a tree of nodes.
  • A Node object in the Java DOM binding has
    methods such as getChildNodes(), getNextSibling(),
    getParentNode(), and getNodeType().
  • Possible node types in DOM include Element,
    Attribute, Comment, Text, CDATA section, Entity
    reference, Entity, Processing Instruction,
    Document, Document type, Document fragment, and
    Notation.
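A short sketch of walking a tree with these Node methods, counting nodes by type. It assumes the JDK's JAXP DocumentBuilder as the parser (not the XML4J DOMParser); the class name NodeWalkDemo and the sample XML string are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses JAXP from the JDK, not the XML4J classes.
class NodeWalkDemo {

    // Counts the nodes of the given type in the subtree rooted at node.
    // Attributes are not children of their element, so this walk never visits them.
    static int count(Node node, short type) {
        int total = (node.getNodeType() == type) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            total += count(child, type);
        }
        return total;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    static int countElements(String xml) throws Exception {
        return count(parse(xml), Node.ELEMENT_NODE);
    }

    static int countTextNodes(String xml) throws Exception {
        return count(parse(xml), Node.TEXT_NODE);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!-- pets --><Pets><Pet ID=\"001\"><Name>Rover</Name></Pet></Pets>";
        System.out.println("elements: " + countElements(xml));
        System.out.println("text nodes: " + countTextNodes(xml));
    }
}
```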

9
Example (petfile.xml)
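  <Pets>
    <Pet ID="001" Registered="030801">
      <Name>Rover</Name>
      <Age>3</Age>
      <Description>Yellow colored Golden Retriever</Description>
    </Pet>
    <Pet ID="002" Registered="101100">
      <Name>Ella</Name>
      <Age>1</Age>
      <Description>Green and black shelled pond crawler</Description>
    </Pet>
  </Pets>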

10
Example DOM structure
[Figure: DOM tree for petfile.xml. Node labels:
Pets; Pet (twice); ID; Registered; Name; Age;
Description. Values shown: 001, 030801, Rover, 3,
Yellow colored Golden Retriever, Dog.]

11
Understanding DOM (contd.)
  • In XML4J, the interfaces that make up the W3C
    DOM are in the org.w3c.dom package, and the DOM
    parser is the
    org.apache.xerces.parsers.DOMParser class.
  • High-level constructs such as Element and
    Attribute (Attr) in DOM extend the Node
    interface. So, for instance, an Attr object has
    its own methods such as getName() and getValue()
    and also inherits getNodeName() from Node.
  • Complete API documentation can be found online at
    http://xml.apache.org/apiDocs/index.html.
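To illustrate how an attribute object exposes both its own methods and the ones inherited from Node, here is a small sketch. It assumes the JDK's JAXP parser instead of the XML4J DOMParser; the class name AttrDemo and the sample element are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses JAXP from the JDK, not the XML4J classes.
class AttrDemo {

    // Looks up one attribute on the root element and describes it using
    // Attr-specific methods (getName, getValue) and a Node method (getNodeName).
    static String describe(String xml, String attrName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        Attr attr = root.getAttributeNode(attrName);
        if (attr == null) {
            return attrName + " not present";
        }
        return attr.getName() + "=" + attr.getValue()
                + " nodeName=" + attr.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe("<Pet ID=\"001\" Registered=\"030801\"/>", "ID"));
    }
}
```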

12
Creating a parser
  • From the XML Reference page, download and view
    the FirstParser.java sample code.
  • This program will parse an XML document
    (customer.xml, passed as a command-line
    argument) and display the number of a certain
    element (in this case, the number of
    elements) in it.
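The idea of counting elements can be sketched as below. This is not the FirstParser.java sample itself: it assumes the JDK's JAXP parser, and the class name ElementCounter, the element name "customer", and the two-argument command line are all hypothetical.

```java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not the FirstParser.java sample from the session.
class ElementCounter {

    // Returns how many elements with the given tag name occur in the document.
    static int countElements(Document doc, String tagName) {
        return doc.getElementsByTagName(tagName).getLength();
    }

    // Helper for trying the sketch on XML held in a string.
    static Document parseText(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Usage (assumed): java ElementCounter customer.xml customer
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java ElementCounter <file.xml> <tagName>");
            return;
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new File(args[0]));
        System.out.println(args[1] + " elements: " + countElements(doc, args[1]));
    }
}
```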

13
Displaying a document
  • From the XML Reference page, download and view
    the IndentingParser.java sample code.
  • This program will parse and display an entire XML
    document (passed as a command-line argument) with
    proper indentation.
  • Separate handler methods are used to handle the
    document (i.e. root) node, element nodes,
    attributes, CDATA sections, text nodes, and
    Processing Instruction nodes.
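A rough sketch of the per-node-type dispatch described above. It is not the IndentingParser.java sample: it assumes the JDK's JAXP parser, the class name IndentingPrinter is hypothetical, and for brevity it drops whitespace-only text and does not escape special characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not the IndentingParser.java sample from the session.
class IndentingPrinter {

    // Appends an indented rendering of node and its subtree to out.
    static void print(Node node, String indent, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                printChildren(node, indent, out);
                break;
            }
            case Node.ELEMENT_NODE: {
                out.append(indent).append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node a = attrs.item(i);
                    out.append(' ').append(a.getNodeName())
                       .append("=\"").append(a.getNodeValue()).append('"');
                }
                out.append(">\n");
                printChildren(node, indent + "  ", out);
                out.append(indent).append("</").append(node.getNodeName()).append(">\n");
                break;
            }
            case Node.CDATA_SECTION_NODE: {
                out.append(indent).append("<![CDATA[")
                   .append(node.getNodeValue()).append("]]>\n");
                break;
            }
            case Node.TEXT_NODE: {
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.append(indent).append(text).append('\n');
                }
                break;
            }
            case Node.PROCESSING_INSTRUCTION_NODE: {
                out.append(indent).append("<?").append(node.getNodeName())
                   .append(' ').append(node.getNodeValue()).append("?>\n");
                break;
            }
            default:
                break; // other node types are ignored in this sketch
        }
    }

    static void printChildren(Node node, String indent, StringBuilder out) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            print(c, indent, out);
        }
    }

    static String format(String xml) throws Exception {
        Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        print(doc, "", out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(format("<a x=\"1\"><b>hi</b></a>"));
    }
}
```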

14
Navigating a document
  • From the XML Reference page, download and view
    the nav.java sample code.
  • This program will parse the meetings.xml
    document and navigate the tree structure to
    locate the name of the third person.
  • Note that the XML4J parser treats the whitespace
    used to indent the XML document as text nodes.
    We can set the parser to ignore this whitespace
    by calling the parser method
    setIncludeIgnorableWhitespace with the value
    false.
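The effect of whitespace text nodes on navigation can be sketched like this. The sketch does not call setIncludeIgnorableWhitespace; it swaps in a different technique, skipping every non-element child by node type, and it assumes the JDK's JAXP parser. The class name NavDemo and the meeting/person structure are hypothetical, since the layout of meetings.xml is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not the nav.java sample from the session.
class NavDemo {

    // Returns the n-th (1-based) element child of parent, skipping text nodes
    // (including indentation whitespace), comments, and so on. Null if absent.
    static Node nthElementChild(Node parent, int n) {
        int seen = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE && ++seen == n) {
                return c;
            }
        }
        return null;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Number of direct children of the root, whitespace text nodes included.
    static int rawChildCount(String xml) throws Exception {
        return parse(xml).getDocumentElement().getChildNodes().getLength();
    }

    // Text content of the third element child of the root, or null.
    static String thirdChildText(String xml) throws Exception {
        Node third = nthElementChild(parse(xml).getDocumentElement(), 3);
        return third == null ? null : third.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document; the real meetings.xml may be structured differently.
        String xml = "<meeting>\n  <person>Ann</person>\n  <person>Bob</person>\n"
                   + "  <person>Cho</person>\n</meeting>";
        System.out.println("raw children of root: " + rawChildCount(xml));
        System.out.println("third person: " + thirdChildText(xml));
    }
}
```

Indexing getChildNodes() directly would land on a whitespace text node for some positions, which is why the sketch filters by node type first.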

15
Modifying a document
  • From the XML Reference page, download and view
    the XMLWriter.java sample code.
  • This program will parse an XML document
    (customer.xml, passed as a command-line
    argument) and modify it by adding a new XML
    element to every customer.
  • The modified document tree is then written to a
    new file with the name customer2.xml.
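A sketch of the modify-then-write cycle. It is not the XMLWriter.java sample: it assumes the JDK's JAXP parser and the javax.xml.transform API for serialization, and the class name AddElementDemo, the element names "customer" and "status", and the text "active" are hypothetical placeholders.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not the XMLWriter.java sample from the session.
class AddElementDemo {

    // Appends <newTag>text</newTag> to every parentTag element; returns how many
    // parents were modified. The live NodeList is copied first so that the loop
    // is unaffected by the nodes being added.
    static int addChildToEach(Document doc, String parentTag, String newTag, String text) {
        NodeList live = doc.getElementsByTagName(parentTag);
        List<Node> parents = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            parents.add(live.item(i));
        }
        for (Node parent : parents) {
            Element child = doc.createElement(newTag);
            child.appendChild(doc.createTextNode(text));
            parent.appendChild(child);
        }
        return parents.size();
    }

    static Document parseText(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the tree to a string without an XML declaration.
    static String toXml(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(w));
        return w.toString();
    }

    // Usage (assumed): java AddElementDemo customer.xml customer2.xml
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java AddElementDemo <in.xml> <out.xml>");
            return;
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new File(args[0]));
        addChildToEach(doc, "customer", "status", "active"); // hypothetical names
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(new File(args[1])));
    }
}
```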

16
Next session
  • Presenting XML Documents
  • Stylesheets
  • Writing your own XSL applications