Dynamic XML DOM and SAX
John Campbell
White Star Software
Tony Lavinio, PSC Labs
- Two XML Models
- DOM: Document Object Model
- SAX: Simple API for XML
- Differences in Objective
- Differences in Process
- To create dynamic, flexible XML document reading and writing algorithms for distinct source and target structures.
- To compare the two models for handling XML
Objective Detail
- Read an incoming document with one structure and convert it to a known DB structure (DOM and SAX)
- Create an outgoing document from a DB structure in the recipient's format (DOM)
DOM Overview
- Document for populating a DB
The Progress DOM
- A Widget Hierarchy
- Begin with the document
- Start at the root
- Work to the leaf nodes
- Examples
- DB to table to field to data (dynamic buffers)
- Window to frame to fields (widgets)
DOM Syntax
- Widget types
- x-document: the container node
- x-noderef: for pointing to all nodes within the document
- Always pointed to by handles
Dynamic Widget Syntax
- Begin with Create of the object
- Use methods to navigate
- Use attributes to query the node, populate, and extract information
Creating DOM Objects
    define variable hDocument as handle.
    define variable hRoot as handle.
    create x-document hDocument.
    create x-noderef hRoot.
- Create the document
- Create a node for the root
- Primary ones for normal documents
- Several others exist
    load(<document-name>)
    get-document-element(<handle>)
    get-child(<handle>,<integer>)
    get-attribute(
Top-level Method
- load(<type>, <document-name> | <address>, <logical-var>)
- Loads an XML file, and creates a document hierarchy
- Type: file or memptr
- Name: character; address: memptr
- Logical: validate the document or not
Retrieving the First Node
- get-document-element(<handle>)
- Retrieves the root element of the document
Loading a Document
    define variable hDocument as handle.
    define variable hRoot as handle.
    define variable OK as log.
    create x-document hDocument.
    create x-noderef hRoot.
    ok = hDocument:load("file",XMLFile,false).
    hDocument:ge
- This loads the document but does not validate it
- Then gets the root node
Other Primary Navigation Methods
- get-child(<handle>, <integer>)
- Gets the nth child of the current node
- Some nodes have only one child
- Most have many children; some have none
Getting the Attributes
- get-attribute(<character>)
- A node may have multiple attributes
- Each name can be derived from the node
- See following slides
Attributes of XML Objects
- num-children: number of child nodes of this node
- attribute-names: names of all the attributes for this node
- name: name of the current node
- node-value: value of the current node
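The methods and attributes listed so far can be combined into a small tree walker. The following is a minimal sketch, not the presenters' code: the file name cust.xml and the internal procedure DumpNode are hypothetical, and the SUBTYPE attribute (used here to tell elements from text nodes) is not covered in these slides.

```abl
/* Sketch: print every element, its attributes, and its text. */
/* Assumptions: "cust.xml" exists; DumpNode is a made-up helper name. */
define variable hDoc  as handle no-undo.
define variable hRoot as handle no-undo.

create x-document hDoc.
create x-noderef  hRoot.

/* load without validation, then start at the root element */
if hDoc:load("file", "cust.xml", false) then do:
    hDoc:get-document-element(hRoot).
    run DumpNode (hRoot, 0).
end.

delete object hRoot.
delete object hDoc.

procedure DumpNode:
    define input parameter phNode  as handle  no-undo.
    define input parameter piDepth as integer no-undo.

    define variable hChild as handle    no-undo.
    define variable i      as integer   no-undo.
    define variable cAttr  as character no-undo.

    if phNode:subtype = "element" then do:
        put unformatted fill("  ", piDepth) phNode:name skip.
        /* attribute-names is a comma-separated list */
        do i = 1 to num-entries(phNode:attribute-names):
            cAttr = entry(i, phNode:attribute-names).
            put unformatted fill("  ", piDepth + 1)
                "@" cAttr "=" phNode:get-attribute(cAttr) skip.
        end.
    end.
    else if phNode:subtype = "text" and trim(phNode:node-value) <> "" then
        put unformatted fill("  ", piDepth) trim(phNode:node-value) skip.

    /* recurse into each child, root to leaf */
    create x-noderef hChild.
    do i = 1 to phNode:num-children:
        if phNode:get-child(hChild, i) then
            run DumpNode (hChild, piDepth + 1).
    end.
    delete object hChild.
end procedure.
```

Each recursion level creates its own x-noderef, so a parent's handle is not repointed while its children are being visited.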
DB Model
- Designed to map from incoming XML doc to DB
- As well as from DB to external document model
DB Structure
Field-Name   Label   Type  Format  Purpose
-----------  ------  ----  ------  --------------
ObjectType   Type    char  x(25)   Type of Object
NameIn       Name    char  x(25)   Incoming Name
ObjectName   Object  char  x(25)   Outgoing Name
Parent       Parent  char  x(25)   Parent
Active       Active  log   yes/no  Is it active?
Sample of Data Content
Type       ObjectName                 NameIn
---------  -------------------------  ----------
Table      Customer                   Client
Field      City                       ClientCity
Field      CustNum                    CustNo
KeyField   CustNum                    Customer
XMLSchema  Customer
XMLTable   Client                     Customer
XMLField   ClientCity                 City
XMLField   CustNo                     Custnum
XMLQuery   for each customer no-lock
XMLSField  State                      ProvState
XMLSField  Name                       Name
XMLSField  Address
- Translate an incoming document into Progress data
- Convert Progress data to an outgoing document
- Read the schema for the mapping from the root (the document name) to the table to the fields
- This example is the inbound document
Supporting Code
    define variable hTTSchema as handle no-undo.
    define variable hDBTable as handle no-undo.
    define variable hDBField as handle no-undo.
    define variable TableName as char no-undo.
    create temp-table hTTSchema.
    find first mapping no-lock
        where ObjectType = "xmlschema" and active.
    TableName = ObjectName.
    create buffer hDBTable for table TableName. /* e.g. customer */
    for each mapping no-lock
        where ObjectType = "XMLSField" /* schema field */
          and parent = TableName:
        /* point to the DB field in the base record */
        hDBField = hDBTable:buffer-field(ObjectName).
        /* add the temp table field like the base field */
        hTTSchema:add-like-field(hDBField:name,hDBField).
    end.
    hTTSchema:temp-table-prepare(TableName). /* set the schema */
    run xmlparser (input-output table-handle HTTSchema,"Address","te
The Parser
- The main component is an internal procedure which is recursively called
- It gets the name of the node and checks it to see if it is a field
- If it gets a flag indicating it's a new record, a DB record is created
- It then populates the field with the incoming data
The Code
    define input parameter phNode as handle.
    define variable nodepointer as integer.
    define variable hCurrentNode as handle.
    define variable hText as handle.
    define variable NodeName as char.
    define variable AttrCounter as int.
    create x-noderef hCurrentNode.
    create x-noderef hText.
Looping Through the Doc
    repeat nodepointer = 1 to phNode:num-children:
        /* get the name to see if it is a field in the T-T */
        if hCurrentNode:attribute-names <> "" then
        do AttrCounter = 1 to
            NodeName = entry(AttrCounter,
            /* is it really a field ? */
            if valid-handle(hTTBuffer) then
                hTTField = hTTBuffer:buffer-field(NodeName) no-error.
Prepare to Populate the DB Field
            if valid-handle(hTTField) then
                hTTField:buffer-value =
                    hCurrentNode:get-attribute(
                        entry(AttrCounter,hCurrentNode:attribute-names)).
        /* next, see if it is time to create a new record */
        NodeName = hCurrentNode:name.
        if NodeName = CreateNodeName /* from mapping table */
        then hTTBuffer:buffer-create().
        else if valid-handle(hTTBuffer) then
            /* check to see if we have a field to populate */
            hTTField = hTTBuffer:buffer-field(NodeName)
Populate the DB Field
        if valid-handle(hTTField) and NodeName = hTTField:name
        then do:
            /* get the text value of the current node and put it in the field */
            hCurrentNode:get-child(hText,1).
            hTTField:buffer-value = hText:node-value.
        end.
        /* go through all the children of the current node */
        run GetLevels (hCurrentNode).
    end. /* repeat */
    end procedure.
Some Notes
- The value of this model is the ability to move back and forth in the document if need be.
- Documents
A Sample Incoming Doc
    <?xml version="1.0" encoding="UTF-8"?>
    <MessageSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="message.xsd" Version="1.0">
      <Message>
        <MessageCreate>
          <Date>06262002</Date>
          <Time>113818</Time>
          <Source Participant="D">
            <DlrCode>9261</DlrCode>
          </Source>
          <Target Participant="F">
            <MgmtCode>TAL</MgmtCode>
          </Target>
        </MessageCreate>
        <Network NtwrkID="FSRV">
          <Date>06032002</Date>
          <Time>113819</Time>
        </Network>
The Data to be Stored
- Note names used by mapping table
        <MessageType>
          <Request SourceID="N0626113897098" MessageID="20993">
            <BusProcess>
              <AddModAddress>
                <Address AddressCode="1">
                  <Name>Cirque du Soleil</Name>
                  <AddressLine>455 Cote de Neiges</AddressLine>
                  <AddressLine>Basement</AddressLine>
                  <AddressLine>Room 1</AddressLine>
                  <City>Quebec</City>
                  <ProvState>QC</ProvState>
                  <Country>CAN</Country>
                  <PostalZip>H8J3I4</PostalZip>
                </Address>
                <TaxCode>QC</TaxCode>
              </AddModAddress>
            </BusProcess>
          </Request>
        </MessageType>
      </Message>
DOM Summary
- Using the Progress DOM, it is possible to parse incoming files and create outgoing files with relative ease.
- You must know dynamic widgets rather well.
- You need to understand the DOM structure reasonably well.
- You need to know only a modest amount about XML.
DOM Plusses and Minuses
- Always have full context available
- Can go backwards and forwards in document
- Can read and write through DOM
- Uses large amount of memory
- May be overkill for just loading
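Writing through the DOM is listed as an objective, but only reading is shown in the code on these slides. The following is a minimal sketch of the outgoing direction, under stated assumptions: the temp-table ttCust, its sample row, and the output file custout.xml are invented for illustration, and the create-node, append-child, set-attribute and save methods are standard Progress DOM calls that the slides do not otherwise cover.

```abl
/* Sketch: build a small outgoing document from a temp-table. */
/* Assumptions: ttCust and "custout.xml" are illustrative only. */
define temp-table ttCust no-undo
    field CustNum as integer
    field Name    as character.

define variable hDoc   as handle no-undo.
define variable hRoot  as handle no-undo.
define variable hCust  as handle no-undo.
define variable hField as handle no-undo.
define variable hText  as handle no-undo.

create ttCust.
assign ttCust.CustNum = 1
       ttCust.Name    = "Cirque du Soleil".

create x-document hDoc.
create x-noderef  hRoot.
create x-noderef  hCust.
create x-noderef  hField.
create x-noderef  hText.

/* the root element is appended to the document itself */
hDoc:create-node(hRoot, "Customers", "ELEMENT").
hDoc:append-child(hRoot).

for each ttCust:
    /* one <Customer> element per record, key as an attribute */
    hDoc:create-node(hCust, "Customer", "ELEMENT").
    hCust:set-attribute("CustNum", string(ttCust.CustNum)).
    hRoot:append-child(hCust).

    /* the field value lives in a text node under the field element */
    hDoc:create-node(hField, "Name", "ELEMENT").
    hCust:append-child(hField).
    hDoc:create-node(hText, "", "TEXT").
    hField:append-child(hText).
    hText:node-value = ttCust.Name.
end.

hDoc:save("file", "custout.xml").

delete object hText.
delete object hField.
delete object hCust.
delete object hRoot.
delete object hDoc.
```

The same x-noderef handles are reused for every record; they are only pointers into the document, so repointing them does not disturb nodes already appended.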
SAX (Simple API for XML)
- SAX is implemented as callbacks.
- Start the parser, and it calls internal procedures as needed as each node is read.
- Pick the data out of the stream as it flows by. Once it's passed, it's gone.
- "Don't call us, we'll call you" model.
SAX Plusses and Minuses
- Very thrifty for memory
- Callback interface makes for modular code
- Must maintain own context
- Can't go backwards
- Current implementation read-only
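"Must maintain own context" means the callbacks have to remember where they are in the document. One possible way to do that, sketched here as an illustration rather than as the presenters' approach, is to keep the current element path in a variable that startElement extends and endElement trims. The file name cust.xml, the variable cPath, and the Customer/Name structure are assumptions.

```abl
/* Sketch: track the current element path across SAX callbacks. */
/* Assumptions: "cust.xml" contains <Customer><Name>...</Name></Customer>. */
define variable hParser as handle    no-undo.
define variable cPath   as character no-undo. /* e.g. /Customers/Customer/Name */

create sax-reader hParser.
hParser:handler = this-procedure.
hParser:set-input-source("file", "cust.xml").
hParser:sax-parse().
delete object hParser.

procedure startElement:
    define input parameter namespaceURI as character no-undo.
    define input parameter localName    as character no-undo.
    define input parameter qName        as character no-undo.
    define input parameter attributes   as handle    no-undo.

    /* push: entering an element makes the path one level deeper */
    cPath = cPath + "/" + localName.
end procedure.

procedure characters:
    define input parameter charData as memptr  no-undo.
    define input parameter numChars as integer no-undo.

    /* the path tells us which element this text belongs to */
    if cPath matches "*/Customer/Name" then
        put unformatted get-string(charData, 1) skip.
end procedure.

procedure endElement:
    define input parameter namespaceURI as character no-undo.
    define input parameter localName    as character no-undo.
    define input parameter qName        as character no-undo.

    /* pop: drop the last path segment when the element closes */
    cPath = substring(cPath, 1, r-index(cPath, "/") - 1).
end procedure.
```

A path string is enough for simple documents; anything that needs values from earlier elements has to be saved in variables as it streams past, since the parser cannot go back.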
SAX Strategy
- When we see a <Customer> node, we'll start a new customer record.
- Otherwise, if we see an <element> that has the same name as a field, we'll put the content of that element into the like-named field.
DOM Customer XML Example
    define temp-table Custt like Customer.
    create x-document hDoc.
    create x-noderef hRoot.
    create x-noderef hTable.
    create x-noderef hField.
    create x-noderef hText.
    hBuf = buffer Custt:handle.
    hDoc:load("file", "cust.xml", false).
    hDoc:get-document-element(hRoot).
DOM Customer XML (cont.)
    repeat nodepointer = 1 to hRoot:num-children:
        hRoot:get-child(hTable,nodepointer).
        if hTable:name <> "Customer" then next.
        create Custt.
        do counter = 1 to hTable:num-children:
            hTable:get-child(hField,counter).
            if hField:num-children <> 1 then next.
            hDBField = hBuf:buffer-field(hField:name) no-error.
            hField:get-child(hText,1).
            if hText:name <> "#text" then next.
            if not error-status:error then
                hDBField:buffer-value = hText:node-value.
        end.
    end.
SAX Customer XML Example
    define temp-table Custt like Customer.
    create sax-reader hParser.
    /* the callbacks are in this same file */
    hParser:handler = this-procedure.
    hBuf = buffer Custt:handle.
    hParser:set-input-source("file", "cust.xml").
    hParser:sax-parse().
SAX startElement Callback
    procedure startElement:
        define input parameter namespaceURI as character.
        define input parameter localName as character.
        define input parameter qName as character.
        define input parameter attributes as handle.
        if localName = "Customer" then create Custt.
        else do:
            hField = hBuf:buffer-field(localName) no-error.
            if error-status:error then hField = ?.
        end.
    end procedure.
SAX characters Callback
    procedure characters:
        define input parameter charData as memptr.
        define input parameter numChars as integer.
        if valid-handle(hField) then
            hField:buffer-value = get-string(charData,1).
    end procedure.
SAX endElement Callback
    procedure endElement:
        define input parameter namespaceURI as character.
        define input parameter localName as character.
        define input parameter qName as character.
        /* If we don't do this, white space between
           elements will be picked up */
        hField = ?.
    end procedure.
Loading a Document
- With SAX, you don't load a document; you aim the parser at it and let it rip.
    create sax-reader hParser.
    hParser:handler = this-procedure.
    hParser:set-input-source("file", "cust.xml").
    hParser:sax-parse().
- You can also parse from the incoming stream in WebSpeed, or from a MEMPTR.
Which to Use?
- It depends
- DOM for navigation.
- SAX for scalability.
Further Resources
- Progress External Program Interfaces
- Chapter 11, XML Support
- Chapter 12, Simple API for XML
- Progressions, published by White Star Software
- xml_at_peg.com email group
Side Benefits of Using XML
- You don't have to worry about internationalization.
- Convenient way of packaging schema and data together.
- Lots of good tools available, such as Sonic Software's Stylus Studio
