XML DOM Tutorial

1
XML DOM Tutorial
  • CSC 309
  • By Meng Lou

2
DOM
  • Introduction
  • Overview
  • Steps for DOM parsing
  • Examples
  • DOM or SAX?
  • Summary

3
Introduction
  • DOM supports navigating and modifying XML
    documents.
  • Hierarchical tree representation of documents
  • Language neutral, e.g., C, Java, CORBA
  • www.w3c.org/DOM

4
Pros and Cons
  • Advantages
  • - Robust API for the DOM tree
  • - Relatively simple to modify the data structure
    and extract data
  • Disadvantages
  • - Stores the entire document in memory
  • - Because DOM was written for any language,
    method naming conventions don't follow standard
    Java conventions
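
One place the naming point shows up is NodeList, which exposes getLength() and item(i) instead of size() and get(i) and cannot be used directly in a for-each loop. The following is a minimal sketch of a small adapter; the class name NodeListNaming, the helper toList, and the sample element names are assumptions made for illustration and are not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class NodeListNaming {
    /** Copies a NodeList into a java.util.List so it works with for-each. */
    static List<Node> toList(NodeList nodes) {
        List<Node> result = new ArrayList<>();
        // NodeList uses getLength()/item(i), not size()/get(i),
        // and it does not implement Iterable.
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i));
        }
        return result;
    }

    /** Builds the hypothetical sample <root><a/><b/></root> in memory. */
    static Element buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            root.appendChild(doc.createElement("a"));
            root.appendChild(doc.createElement("b"));
            return root;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Node child : toList(buildSample().getChildNodes())) {
            System.out.println(child.getNodeName());
        }
    }
}
```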

5
Overview of steps
6
Steps for parsing
  • Specify parser
  • Create a document builder
  • Invoke the parser to create a Document
    representing the XML document
  • Normalize
  • Obtain the root node
  • Modify and examine the properties of nodes
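
The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the class name DomParsingSteps, the helper parseRoot, and the sample XML string are made up for this example, and the default JAXP parser is used instead of an explicitly specified one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

class DomParsingSteps {
    // Hypothetical sample input; any well-formed XML would do.
    static final String SAMPLE_XML =
            "<course code=\"CSC309\"><title>XML DOM</title></course>";

    /** Runs the parsing steps on an XML string and returns the root element. */
    static Element parseRoot(String xml) {
        try {
            // Steps 1-2: no parser is specified here, so JAXP picks its
            // default; the factory then creates a document builder.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();

            // Step 3: invoke the parser to create a Document.
            InputStream in =
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            Document doc = builder.parse(in);

            // Step 4: normalize the tree.
            doc.getDocumentElement().normalize();

            // Step 5: obtain the root node.
            return doc.getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE_XML);
        // Step 6: examine the properties of nodes.
        System.out.println(root.getNodeName());        // course
        System.out.println(root.getAttribute("code")); // CSC309
    }
}
```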

7

8
Specifying a Parser
  • Use the command-line java -D option
  • In the program, use System.setProperty, e.g.
  • System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
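
To check which parser implementation is actually selected, a small program can print the requested property next to the class JAXP returns. This is a sketch only: the class name ParserSelection and its helper are assumptions, and any implementation class passed as an argument has to be on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;

class ParserSelection {
    static final String FACTORY_PROPERTY =
            "javax.xml.parsers.DocumentBuilderFactory";

    /** Returns the class name of the factory implementation JAXP selects. */
    static String activeFactoryClassName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // Optionally pass an implementation class name as the first argument.
        // If that class cannot be loaded, newInstance() throws
        // FactoryConfigurationError.
        if (args.length > 0) {
            System.setProperty(FACTORY_PROPERTY, args[0]);
        }
        System.out.println("Requested: "
                + System.getProperty(FACTORY_PROPERTY, "(default)"));
        System.out.println("In use:    " + activeFactoryClassName());
    }
}
```

The same property can be set without touching the code by passing -Djavax.xml.parsers.DocumentBuilderFactory=... on the java command line.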

9
Create a Document Builder
  • Create an instance of the builder factory, then use
    it to create a DocumentBuilder object
  • DocumentBuilderFactory builderFactory =
    DocumentBuilderFactory.newInstance();
  • DocumentBuilder builder =
    builderFactory.newDocumentBuilder();

10
Create a Document
  • Call the parse method
  • Document doc = builder.parse(someInputStream);
  • The Document class represents the parsed result
    in a tree structure

11
Normalize the Tree
  • Normalization has two effects
  • - Merges adjacent text nodes (for example, text
    that spans multiple lines) into one node
  • - Eliminates empty text nodes
  • doc.getDocumentElement().normalize();
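
Both effects can be observed on a tree built by hand. The sketch below creates an element with three text children, one of them empty, and counts the children before and after normalize(). The class name NormalizeDemo and the element name note are assumptions for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeDemo {
    /** Builds <note> with three text children: "Hello, ", "" and "world". */
    static Element buildFragmentedElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element note = doc.createElement("note");
            doc.appendChild(note);
            note.appendChild(doc.createTextNode("Hello, "));
            note.appendChild(doc.createTextNode("")); // empty text node
            note.appendChild(doc.createTextNode("world"));
            return note;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element note = buildFragmentedElement();
        System.out.println("Children before: " + note.getChildNodes().getLength());
        note.normalize(); // merges adjacent text nodes, drops empty ones
        System.out.println("Children after:  " + note.getChildNodes().getLength());
        System.out.println("Text: " + note.getFirstChild().getNodeValue());
    }
}
```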

12
Obtain the root node
  • Traversing begins at the root node
  • Element rootElement = doc.getDocumentElement();
  • - Element is a subinterface of the more general Node
    interface and represents an XML element
  • - Node represents all the various components of
    an XML document
  • e.g., Document, Element, Attribute, Entity
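
Because every component is a Node, one method can accept any of them and branch on getNodeType(). A minimal sketch, assuming a hypothetical helper kindOf and a made-up sample document, and covering only a few of the node type constants:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeKinds {
    /** Maps a few common node type constants to readable names. */
    static String kindOf(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:  return "Document";
            case Node.ELEMENT_NODE:   return "Element";
            case Node.ATTRIBUTE_NODE: return "Attribute";
            case Node.TEXT_NODE:      return "Text";
            default:                  return "Other";
        }
    }

    /** Builds the hypothetical sample <book id="1">DOM</book> in memory. */
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element book = doc.createElement("book");
            book.setAttribute("id", "1");
            book.appendChild(doc.createTextNode("DOM"));
            doc.appendChild(book);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildSample();
        Element root = doc.getDocumentElement();
        System.out.println(kindOf(doc));                         // Document
        System.out.println(kindOf(root));                        // Element
        System.out.println(kindOf(root.getAttributeNode("id"))); // Attribute
        System.out.println(kindOf(root.getFirstChild()));        // Text
    }
}
```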

13
Examine and Modify Nodes
  • Various properties and methods
  • - getNodeName
  • - getNodeType
  • - getAttributes
  • - getChildNodes
  • - setNodeValue
  • - appendChild
  • - removeChild
  • - replaceChild
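
A short sketch of the modifying methods in use. The class name NodeEditing, the helper item, and the element names list and item are assumptions for illustration. Note the argument order of replaceChild: the new child comes first, then the old one.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NodeEditing {
    /** Creates <item>text</item> owned by the given document. */
    static Element item(Document doc, String text) {
        Element e = doc.createElement("item");
        e.appendChild(doc.createTextNode(text));
        return e;
    }

    /** Builds <list><item>a</item><item>b</item></list>, then edits it. */
    static Element buildAndEdit() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element list = doc.createElement("list");
            doc.appendChild(list);

            Element a = item(doc, "a");
            Element b = item(doc, "b");
            list.appendChild(a);
            list.appendChild(b);

            list.replaceChild(item(doc, "c"), a); // (newChild, oldChild)
            list.removeChild(b);
            // The remaining <item> holds one text node; change its value.
            list.getFirstChild().getFirstChild().setNodeValue("C");
            return list;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element list = buildAndEdit();
        System.out.println(list.getChildNodes().getLength()); // 1
        System.out.println(list.getTextContent());            // C
    }
}
```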

14
Sample Code Bits
    // Walk the DOM tree and print as you go
    public void walk(Node node) {
        int type = node.getNodeType();
        switch (type) {
            case Node.DOCUMENT_NODE:
                System.out.println("<?xml version=\"1.0\" encoding=\""
                        + "UTF-8" + "\"?>");
                break; // end of document
            case Node.ELEMENT_NODE:
                System.out.print('<' + node.getNodeName());
                NamedNodeMap nnm = node.getAttributes();
                if (nnm != null) {
                    int len = nnm.getLength();
                    // ... (rest of walk not shown)
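
The fragment above stops partway through the element case. One way such a walk could be completed is sketched below; it is not the presenter's code. It assumes the walk recurses into child nodes and then emits the closing tag, it writes to a StringBuilder instead of System.out so the result can be inspected, and it does not escape special characters. The names DomWalker, walkChildren, render, and parse are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomWalker {
    /** Appends a textual form of the node and its subtree to out. */
    static void walk(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
                walkChildren(node, out);
                break;
            case Node.ELEMENT_NODE:
                out.append('<').append(node.getNodeName());
                NamedNodeMap nnm = node.getAttributes();
                if (nnm != null) {
                    int len = nnm.getLength();
                    for (int i = 0; i < len; i++) {
                        Node attr = nnm.item(i);
                        out.append(' ').append(attr.getNodeName())
                           .append("=\"").append(attr.getNodeValue()).append('"');
                    }
                }
                out.append('>');
                walkChildren(node, out); // recurse before closing the tag
                out.append("</").append(node.getNodeName()).append('>');
                break;
            case Node.TEXT_NODE:
                out.append(node.getNodeValue());
                break;
            default:
                break; // other node types are ignored in this sketch
        }
    }

    static void walkChildren(Node node, StringBuilder out) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, out);
        }
    }

    static String render(Node node) {
        StringBuilder out = new StringBuilder();
        walk(node, out);
        return out.toString();
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(parse("<a x=\"1\"><b>hi</b></a>")));
    }
}
```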

15
DOM or SAX ?
  • DOM
  • - Suitable for small documents
  • - Easily modify document
  • - Memory intensive
  • SAX (Simple API for XML)
  • - Suitable for large documents
  • - Traverses the document only once
  • - Event driven, saves memory
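
For contrast, a minimal SAX sketch: the parser calls back into a handler as it reads, and no tree is kept in memory. The example counts start tags in a single pass. The class name SaxElementCounter and the sample input are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxElementCounter {
    /** Counts element start tags in one pass over the input. */
    static int countElements(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++; // called once per start tag as the parser reads
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/><c>t</c></a>")); // 4
    }
}
```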

16
Summary
  • DOM is a tree representation of an XML document
    in memory
  • JAXP provides a vendor-neutral interface to the
    underlying parser
  • Every component of the XML document is a Node
  • Use normalization to combine adjacent text nodes
    that span multiple lines