Title: DOM
1 DOM
- Document Object Model
- Presented by
- Anthony Corbett
2 What is the DOM?
- The Document Object Model is a platform- and
language-neutral application programming
interface (API) for valid HTML and well-formed
XML documents.
3 Why use the DOM?
- With the Document Object Model, programmers can
build documents, navigate their structure, and
add, modify, or delete elements and content.
- Anything found in an HTML or XML document can be
accessed, changed, deleted, or added dynamically
using the Document Object Model.
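The build, modify, and delete operations listed above can be sketched in Java with the standard org.w3c.dom binding. This is a minimal illustration rather than a definitive implementation: the class name DomEditSketch and the element names are assumptions made up for the example, and the document is created through JAXP's DocumentBuilderFactory.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; the names are illustrative only.
class DomEditSketch {

    // Builds a small document from scratch, then modifies and deletes content.
    static Document buildAndEdit() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }

        // Build: create a root element with two children.
        Element catalog = doc.createElement("catalog");
        doc.appendChild(catalog);
        Element title = doc.createElement("title");
        title.setTextContent("Draft title");
        catalog.appendChild(title);
        Element note = doc.createElement("note");
        catalog.appendChild(note);

        // Modify: replace the text content and add an attribute.
        title.setTextContent("XML in a Nutshell");
        title.setAttribute("lang", "en");

        // Delete: remove the <note> element from its parent.
        catalog.removeChild(note);
        return doc;
    }

    public static void main(String[] args) {
        Element root = buildAndEdit().getDocumentElement();
        System.out.println(root.getTagName() + " has "
                + root.getChildNodes().getLength() + " child node(s)");
        System.out.println("title = " + root.getFirstChild().getTextContent());
    }
}
```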
4 W3C Recommendations
- The DOM is defined by a set of W3C
Recommendations that describe a logical object
structure of documents and the methods used to
access and manipulate a document.
- The latest Recommendation, the DOM Level 3 Core
Recommendation, which was released on April 7,
2004, provides models for manipulating HTML and
XML documents.
5 Level 1 and Level 2
- The W3C's first release, the Level 1
Recommendation, only specified the structure and
processing of existing HTML and XML documents in
a browser context.
- The Level 2 Recommendation is a set of
specifications that added functionality, updating
the existing Core and HTML modules and adding new
modules for Views, Events, Style, Traversal, and
Range.
6 Level 3
- Level 3 added the Abstract Schemas, Load, Save,
and XPath modules and updated the Core and Events
modules.
- The DOM Level 3 Core Recommendation specifies the
updated Core and XML modules and their underlying
interfaces, which are the subject of this
presentation.
7 DOM Architecture
View of the modules, by feature name, defined by
the DOM specifications.
Image from W3.org
8 Objective of DOM Specifications
- The objective of the DOM specifications is to
provide a standard interface that can be used in
a variety of applications and platforms.
- This means that DOM interfaces are programming
language-neutral and can be implemented in any
language.
9 DOM Notation
- The Document Object Model is specified using the
Interface Definition Language (IDL) notation
defined by the Object Management Group.
10 IDL Example
- interface NodeList {
-     Node item(in unsigned long index);
-     readonly attribute unsigned long length;
- };
- This notation must then be translated into a
specific programming language, like Java.
11 Java binding of IDL example
- package org.w3c.dom;
- public interface NodeList {
-     public Node item(int index);
-     public int getLength();
- }
- Note that there is no length attribute as
described by the IDL example. This is because
Java interfaces cannot have instance data
members. Instead, IDL attributes are usually
mapped to getter and setter methods in the
language-specific interface.
- If the attribute is defined readonly, the
interface will only have a getter method, as in
this example.
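To make the mapping concrete, here is a small sketch of how calling code uses the Java binding of NodeList: the readonly IDL attribute length is read through getLength(), and item() takes a Java int. The class NodeListSketch and its sample element names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example class; the names are illustrative only.
class NodeListSketch {

    // Walks a NodeList using only the two members defined by the binding.
    static List<String> childNames(NodeList list) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < list.getLength(); i++) {   // IDL "length" becomes getLength()
            names.add(list.item(i).getNodeName());     // IDL "item(in unsigned long)" becomes item(int)
        }
        return names;
    }

    // Builds a tiny in-memory document so the example needs no input file.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("catalog");
            doc.appendChild(root);
            root.appendChild(doc.createElement("book"));
            root.appendChild(doc.createElement("magazine"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        NodeList children = sampleDocument().getDocumentElement().getChildNodes();
        System.out.println(childNames(children));
    }
}
```

There is no setLength() to call, which reflects the readonly qualifier in the IDL.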
12 DOM Core - Structure Model
- A document parsed by DOM is represented as a
hierarchical tree of objects in memory.
- Some types of objects may have child objects of
various types, and others are leaf objects that
cannot have anything below them in the document
structure.
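The difference between container and leaf objects can be observed directly. In the sketch below, an Element accepts a Text child, while an attempt to append a child to the Text node is rejected with a DOMException. The class and element names are hypothetical, and the behavior shown assumes a conforming DOM implementation such as the one bundled with the JDK.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

// Hypothetical example class; the names are illustrative only.
class LeafNodeSketch {

    // Returns the DOMException code raised when a child is appended
    // to a Text node, or -1 if the implementation accepted it.
    static short appendToTextNode() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        Element para = doc.createElement("para");
        doc.appendChild(para);

        Text text = doc.createTextNode("Test data.");
        para.appendChild(text);                        // Element nodes may have children

        try {
            text.appendChild(doc.createElement("b"));  // Text nodes are leaves
            return -1;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) {
        boolean rejected = appendToTextNode() == DOMException.HIERARCHY_REQUEST_ERR;
        System.out.println("Text node rejected a child: " + rejected);
    }
}
```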
13 Generic vs Specific Interfaces
- To simplify different types of document
processing and enable efficient implementation of
DOM, there are two distinct ways of accessing a
document tree from within the DOM.
- A tree can be accessed through the generic
interfaces of the Core module or through the
specific interfaces in the XML module.
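As a rough illustration of the two access styles, the sketch below reads the same attribute twice: once using only methods declared on the generic Node interface, and once through a more specific derived interface (Element here). The class name, method names, and the book element are assumptions made for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical example class; the names are illustrative only.
class GenericVsSpecificSketch {

    // Creates <book id="101"/> in memory.
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element book = doc.createElement("book");
            book.setAttribute("id", "101");
            doc.appendChild(book);
            return book;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Generic view: works on any Node, with no cast required.
    static String idViaNode(Node node) {
        NamedNodeMap attrs = node.getAttributes();     // null for non-element nodes
        if (attrs == null) {
            return null;
        }
        Node id = attrs.getNamedItem("id");
        return id == null ? null : id.getNodeValue();
    }

    // Specific view: the Element interface offers a direct accessor.
    static String idViaElement(Element element) {
        return element.getAttribute("id");
    }

    public static void main(String[] args) {
        Element book = sampleElement();
        System.out.println("generic:  " + idViaNode(book));
        System.out.println("specific: " + idViaElement(book));
    }
}
```

The generic style avoids casts and type checks, while the specific style is usually shorter and easier to read.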
14 DOM Core Interfaces
- At the root of the DOM Core generic interfaces is
the Node interface, which provides a generic set
of methods for accessing a document's nodes and
their content.
- The other node-type interfaces, including those
specified in the XML module, are derived from the
Node interface.
15 Derived Node Types
16 Parent, children, siblings
- Each Node-derived object in a parsed DOM document
contains pointers to its parent, child, and
sibling nodes.
- These pointers make it possible to use a
variety of tree-traversal methods to enumerate
document data in the DOM.
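One common traversal built on these pointers is a recursive depth-first walk that follows getFirstChild() and getNextSibling(). The sketch below collects element names in document order; the class name and the sample XML string are hypothetical, and parsing goes through the standard JAXP DocumentBuilder.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; the names are illustrative only.
class TraversalSketch {

    // Parses an XML string into a DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Depth-first walk using only the child and sibling pointers.
    static void collectElementNames(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectElementNames(child, out);
        }
    }

    // Convenience wrapper: element names of a whole document, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        collectElementNames(parse(xml), names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b><c/></b><d/></a>"));
        // The parent pointer leads back up the tree.
        Node b = parse("<a><b/></a>").getDocumentElement().getFirstChild();
        System.out.println("parent of b is " + b.getParentNode().getNodeName());
    }
}
```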
17 Document Node
- Each parsed document causes the creation of a
single Document node, which is the root of the
tree.
- The children of this Document node give access
to the DocumentType node and to a single Element
node that contains the entire body of the parsed
document.
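A short sketch of what the Document node exposes: the example parses a document that carries a DOCTYPE declaration and lists the children of the Document node. The class name and the inline sample document are assumptions made for the illustration. The same two nodes are also reachable directly through getDoctype() and getDocumentElement().

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; the names are illustrative only.
class DocumentNodeSketch {

    // Made-up sample with an internal DTD subset so a DocumentType node exists.
    static final String SAMPLE =
            "<!DOCTYPE sample [<!ELEMENT sample (#PCDATA)>]><sample>Test data.</sample>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Lists the direct children of the Document node as "Kind:name".
    static List<String> topLevelNodes(Document doc) {
        List<String> out = new ArrayList<>();
        for (Node child = doc.getFirstChild(); child != null; child = child.getNextSibling()) {
            String kind;
            if (child.getNodeType() == Node.DOCUMENT_TYPE_NODE) {
                kind = "DocumentType";
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                kind = "Element";
            } else {
                kind = "Other";
            }
            out.add(kind + ":" + child.getNodeName());
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(topLevelNodes(doc));
        System.out.println("doctype name: " + doc.getDoctype().getName());
        System.out.println("root element: " + doc.getDocumentElement().getTagName());
    }
}
```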
18 Mapping Structure to Nodes
- <sample bogus="value"><text_node>Test
data.</text_node></sample>
Image from XML in a Nutshell, 3rd Edition
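The mapping for this fragment can also be checked in code. The sketch below parses the same sample and lists the nodes it finds. Note that the bogus attribute is reached through getAttributes() rather than through the child list, since attributes are not children of their element. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; the names are illustrative only.
class StructureMappingSketch {

    static final String SAMPLE =
            "<sample bogus=\"value\"><text_node>Test data.</text_node></sample>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Records one line per element, attribute, and text node, in document order.
    static void describe(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add("Element " + node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.add("Attr " + attr.getNodeName() + "=" + attr.getNodeValue());
            }
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            out.add("Text " + node.getNodeValue());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            describe(child, out);
        }
    }

    static List<String> describeSample() {
        List<String> out = new ArrayList<>();
        describe(parse(SAMPLE).getDocumentElement(), out);
        return out;
    }

    public static void main(String[] args) {
        for (String line : describeSample()) {
            System.out.println(line);
        }
    }
}
```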
19 XML File
- <?xml-stylesheet type="text/css" href="song.css"?>
- <SONG xmlns="http://www.cafeconleche.org/namespace/song"
  xmlns:xlink="http://www.w3.org/1999/xlink">
- <TITLE>Hot Cop</TITLE>
- <PHOTO ALT="Victor Willis in Cop Outfit"
  HEIGHT="200" WIDTH="100" xlink:href="hotcop.jpg"
  xlink:show="onLoad" xlink:type="simple">
- </PHOTO>
- <COMPOSER>Jacques Morali</COMPOSER>
- <COMPOSER>Henri Belolo</COMPOSER>
- <COMPOSER>Victor Willis</COMPOSER>
- <PRODUCER>Jacques Morali</PRODUCER>
- <PUBLISHER xlink:href="http://www.amrecords.com/"
  xlink:type="simple"> A &amp; M Records
  </PUBLISHER>
- <LENGTH>6:20</LENGTH>
- <YEAR>1978</YEAR>
- <ARTIST>Village People</ARTIST>
- </SONG>
XML file from http://www.ibiblio.org/xml/slides/xmlone/london2002/advancedxml/06.html
20 The DOM representation
Image from ibiblio.org
21 Java example: books.xml
- <catalog>
- <!-- Sample -->
- <book id="101">
- <title>XML in a Nutshell</title>
- <author>Elliotte Rusty Harold, W. Scott
  Means</author>
- <price>39.95</price>
- </book>
- <book id="121">
- <title>Who Moved My Cheese</title>
- <author>Spencer, M.D. Johnson, Kenneth H.
  Blanchard</author>
- <price>19.95</price>
- </book>
- </catalog>
Example from http://www.oracle.com/technology/oramag/oracle/03-sep/o53devxml.html
22 Java example continued
- DOMParser parser = new DOMParser();
- parser.parse("books.xml");
- Document document = parser.getDocument();
- NodeList nodes = document.getElementsByTagName("title");
- for (int i = 0; i < nodes.getLength(); i++) {
-     Element titleElem = (Element) nodes.item(i);
-     Node childNode = titleElem.getFirstChild();
-     if (childNode instanceof Text) {
-         System.out.println("Book title is "
              + childNode.getNodeValue());
-     }
- }
Example from http://www.oracle.com/technology/oramag/oracle/03-sep/o53devxml.html
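The DOMParser class in the slide comes from a specific parser library. For readers who want to run the same logic with only the JDK, here is a self-contained sketch that swaps in the standard JAXP DocumentBuilder and reads the catalog from an inline string instead of books.xml. The class name BookTitlesSketch and the BOOKS_XML constant are assumptions made for the example; the lookup itself follows the slide.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; the names are illustrative only.
class BookTitlesSketch {

    // Inline copy of the books.xml sample so no file is needed.
    static final String BOOKS_XML =
            "<catalog>"
            + "<!-- Sample -->"
            + "<book id=\"101\"><title>XML in a Nutshell</title>"
            + "<author>Elliotte Rusty Harold, W. Scott Means</author>"
            + "<price>39.95</price></book>"
            + "<book id=\"121\"><title>Who Moved My Cheese</title>"
            + "<author>Spencer, M.D. Johnson, Kenneth H. Blanchard</author>"
            + "<price>19.95</price></book>"
            + "</catalog>";

    // Returns the text of every <title> element, in document order.
    static List<String> titles(String xml) {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> result = new ArrayList<>();
        NodeList nodes = document.getElementsByTagName("title");
        for (int i = 0; i < nodes.getLength(); i++) {
            Node childNode = nodes.item(i).getFirstChild();
            if (childNode instanceof Text) {            // skip empty <title/> elements
                result.add(childNode.getNodeValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String title : titles(BOOKS_XML)) {
            System.out.println("Book title is " + title);
        }
    }
}
```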
23 Advantages
- Easy to use
- Rich set of APIs for easy navigation
- Entire tree loaded into memory, allowing random
access to XML document
24 Disadvantages
- Entire XML document must be parsed at one time
- Expensive to load entire tree into memory
- Generic DOM nodes are not ideal for object-type
binding, since objects must be created for all
nodes
25 Summary
Image from http://www.developerlife.com/domintro/
The DOM specifications are a set of interfaces
that define a logical structure of a document and
the methods of accessing the information in the
document.
26 Summary
Image from http://www.developerlife.com/domintro/
Almost all specific interfaces inherit from the
Node interface, which provides generic methods
for accessing information in the object model
tree.
27 Summary
- The DOM is an object model, meaning that all
data is encapsulated and has methods associated
with it.
- The DOM is represented as a hierarchical tree
structure, with the Document node at the root and
a single root Element node beneath it.
Image from http://www.developerlife.com/domintro/
28 Summary
- Each node is of a specific type that holds the
related data and the methods to enumerate that
data.
- Some nodes have children and some nodes are
leaves.
- All nodes have knowledge of their parent,
children, and siblings, thus allowing the use of
well-known tree-traversal methods.
30 Works Cited
- XML in a Nutshell, 3rd Edition
- http://www.w3.org/TR/DOM-Level-3-Core/
- http://www.developerlife.com/domintro/
- http://www.ibiblio.org/xml/slides/xmlone/london2002/advancedxml/
- http://www.oracle.com/technology/oramag/oracle/03-sep/o53devxml.html