Title: Federal CIO Council XML Web Services Working Group
1. Summary of Native XML Databases Thread
Joseph M. Chiusano, Booz Allen Hamilton
Federal CIO Council XML Web Services Working Group
Washington, DC, July 22, 2003
2. The Native XML Databases thread began on 5/12/03
- Brand sent information on Federal Computer Week article interview to listserv
- This sparked a thread of 40 e-mails, with main sub-threads of
  - Native XML Databases
  - XML and Semantic Clarity
  - XML and Programming Logic
- Each sub-thread is summarized here
3. Native XML Databases
4. A native XML database stores XML in native form
- Native XML databases are employed mostly for document-centric applications
- The XML Web Services Working Group currently has a pilot project involving native XML databases
  - XML Data Exchange Across Multiple Levels of Government Using Native XML Databases
- Main players in native XML database arena: eXcelon and Tamino (commercial) and Xindice, eXist, 4Suite, and ozone (Open Source)
- Middleware products that transfer to/from relational databases, main players: JAXB, .NET, Delphi, and WebSphere (commercial) and Castor, JXQuick, Zeus, and Zope (Open Source)
- Ronald Bourret, XML and Databases, XML 2002 Conference Tutorial, 12/9/02
5. A hybrid native XML/relational approach is preferred in some cases
- "Native XML databases add indexing information to XML repositories and they can be a good choice if you want to retain the original XML representation of the information, provided you only need limited processing of the information therein. Use a hybrid approach (native and relational) for XML storage when you need to perform indexed operations such as searching and aggregation against the information, but also need to retain the XML form in which the information first entered the system."
  - XML Design Handbook, Wrox Press, 2003
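The hybrid idea in the quotation above can be illustrated with a minimal sketch. It is not taken from the thread or from the handbook: it uses an in-memory SQLite table as a stand-in for both stores, and the table name, column names, and sample documents are all hypothetical. Each document is kept verbatim in a text column, while a few fields are copied into indexed relational columns for searching and aggregation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample documents; the element names are illustrative only.
DOCS = [
    "<StockItem><NSN>1234-12-123-1234</NSN><Quantity>5</Quantity></StockItem>",
    "<StockItem><NSN>5678-98-765-4321</NSN><Quantity>12</Quantity></StockItem>",
]

def build_store(docs):
    """Keep each document verbatim and copy selected fields into indexed columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock_item (nsn TEXT, quantity INTEGER, raw_xml TEXT)"
    )
    conn.execute("CREATE INDEX idx_nsn ON stock_item (nsn)")
    for doc in docs:
        root = ET.fromstring(doc)
        conn.execute(
            "INSERT INTO stock_item VALUES (?, ?, ?)",
            (root.findtext("NSN"), int(root.findtext("Quantity")), doc),
        )
    return conn

conn = build_store(DOCS)

# Relational side: aggregation over the extracted column.
total = conn.execute("SELECT SUM(quantity) FROM stock_item").fetchone()[0]

# XML side: the document comes back exactly as it entered the system.
original = conn.execute(
    "SELECT raw_xml FROM stock_item WHERE nsn = ?", ("1234-12-123-1234",)
).fetchone()[0]

print(total)
print(original == DOCS[0])
```

A real deployment would more likely pair a native XML store or an XML-enabled RDBMS with relational tables; the sketch only shows the division of labor between the two representations.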
6. The main disadvantage for native XML databases at this time is a lack of maturity in mechanisms for querying/reporting on XML data, compared to relational mechanisms
- The W3C XQuery Requirements specification is still a W3C Working Draft at this time
  - http://www.w3.org/TR/xquery
- Additionally, people have become comfortable with their RDBMS systems and value the maturity level of the major DB vendors
- They may also be hesitant to migrate fundamentally relational data (like catalog-type info) into native XML stores
- We will continue to see interest in products like Oracle that manage both relational and hierarchical sets
- Recommended book: XML Data Management: Native XML and XML-Enabled Database Systems, Chaudhri, Rashid, and Zicari, Addison Wesley Professional, March 2003
7. XML and Semantic Clarity
8. This sub-thread revolved around the notion of semantic clarity of XML tags in various scenarios
- Example of XML generated directly from a database
    <STOCKITM>
      <NSN>1234-12-123-1234</NSN>
    </STOCKITM>
- Fundamental question: is this good XML?
  - What is good XML?
- There were disparate opinions on this question
9. One side represented the developer-oriented view
- There's no such thing as better XML or worse XML, as far as tag names and organization go; it all depends on the circumstances
- Suppose we are writing XML messages to be passed between two proprietary systems, with hundreds of thousands of messages being passed back and forth each day
  - Is this representation
      <message>
        <SystemProtocol>
          <SystemMessagePartOne>17</SystemMessagePartOne>
          <SystemMessagePartTwo>33</SystemMessagePartTwo>
          <SystemMessagePartThree>22</SystemMessagePartThree>
          <SystemMessagePartFour>38</SystemMessagePartFour>
          <SystemMessagePartFive>41</SystemMessagePartFive>
        </SystemProtocol>
      </message>
10. One side represented the developer-oriented view (contd)
- Suppose we are writing XML messages to be passed between two proprietary systems, with hundreds of thousands of messages being passed back and forth each day (contd)
  - better than this representation?
      <m><p1>17</p1><p2>33</p2><p3>22</p3><p4>38</p4><p5>41</p5></m>
  - In this situation, the most important factor is brevity of the message
- If we get too fixated on the human-readability aspects of XML, we can lose sight of the higher goals of performance, reusability, scalability, etc.
- Context matters!!!
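The brevity argument can be checked with a short sketch. It takes the two representations from the developer-oriented example, written without indentation so that only the tag names differ, and compares their encoded sizes after confirming that both carry the same five values. The helper name leaf_values is hypothetical.

```python
import xml.etree.ElementTree as ET

# Verbose form from the example, written without whitespace between tags.
VERBOSE = (
    "<message><SystemProtocol>"
    "<SystemMessagePartOne>17</SystemMessagePartOne>"
    "<SystemMessagePartTwo>33</SystemMessagePartTwo>"
    "<SystemMessagePartThree>22</SystemMessagePartThree>"
    "<SystemMessagePartFour>38</SystemMessagePartFour>"
    "<SystemMessagePartFive>41</SystemMessagePartFive>"
    "</SystemProtocol></message>"
)

# Terse form from the example.
TERSE = "<m><p1>17</p1><p2>33</p2><p3>22</p3><p4>38</p4><p5>41</p5></m>"

def leaf_values(xml_text):
    """Return the text of every childless element, in document order."""
    return [el.text for el in ET.fromstring(xml_text).iter() if len(el) == 0]

verbose_bytes = len(VERBOSE.encode("utf-8"))
terse_bytes = len(TERSE.encode("utf-8"))

# Both forms carry the same payload; only the markup overhead differs.
print(leaf_values(VERBOSE) == leaf_values(TERSE))
print(verbose_bytes, terse_bytes)
```

At hundreds of thousands of messages per day, the per-message difference is multiplied accordingly, which is the point the developer-oriented side was making. Compression on the wire would narrow the gap, so the comparison is indicative rather than conclusive.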
11. The other side represented the business analyst-oriented view
- Human readability and semantic clarity of tag names are paramount
- Earlier example would be better represented as
    <SupplyMaterielItem>
      <FederalClassificationCode>1234</FederalClassificationCode>
      <NationalID>12-123-1234</NationalID>
    </SupplyMaterielItem>
- This version applies information analysis and ISO 11179 rules to produce a more semantically clear representation
- This is more in line with the 10 design principles of XML
  - XML documents should be human-legible and reasonably clear
  - Terseness in XML markup is of minimal importance
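A minimal sketch of how the database-generated form could be rewritten into the more descriptive form follows. It assumes, based only on the example values, that the first hyphen-delimited group of the NSN is the classification code and the remainder is the national ID; the function name to_descriptive is hypothetical.

```python
import xml.etree.ElementTree as ET

def to_descriptive(stock_xml):
    """Rewrite <STOCKITM><NSN>...</NSN></STOCKITM> using the descriptive tag names."""
    nsn = ET.fromstring(stock_xml).findtext("NSN")
    # Assumption: split at the first hyphen, as the example values suggest.
    classification, national_id = nsn.split("-", 1)
    item = ET.Element("SupplyMaterielItem")
    ET.SubElement(item, "FederalClassificationCode").text = classification
    ET.SubElement(item, "NationalID").text = national_id
    return ET.tostring(item, encoding="unicode")

print(to_descriptive("<STOCKITM><NSN>1234-12-123-1234</NSN></STOCKITM>"))
```

The sketch also shows that the clearer representation is not just a renaming: the analysis step separates one opaque field into two named concepts.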
12. The design approach used with XML should fit the task at hand
- Just like you wouldn't use Visual Basic to write a device driver, you shouldn't try to use one particular design approach to XML to solve every problem (no matter what the designers originally said)
- The Scalable Vector Graphics (SVG) W3C specification is an example of tags that are meant to be machine-processable and not human-readable
- Examples
  - <rect id="RectElement" x="300" y="100" width="300" height="100" fill="rgb(255,255,0)">
  - <animate attributeName="x" attributeType="XML" begin="0s" dur="9s" fill="freeze" from="300" to="0"/>
  - <set attributeName="visibility" attributeType="CSS" to="visible" begin="3s" dur="6s" fill="freeze"/>
13. XML tags cannot by nature convey sufficient semantic clarity
- What does a tag of LastName mean?
  - There is no way for XML to indicate by use of a LastName tag that the element is equivalent to the concept of a family name
  - Order of names is culture-specific
- The only solution is to explicitly document the meanings of elements and attributes
- There are various approaches to this, such as
  - XML Schema documentation
  - Data Dictionaries
  - Semantic Registries
  - Metadata Registries
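As an illustration of the first approach, XML Schema documentation, the sketch below embeds a definition for LastName in an xs:annotation/xs:documentation element and reads it back. The schema fragment and the wording of the definition are hypothetical and were not part of the thread; the helper name element_docs is likewise an assumption.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment; the definition text is illustrative only.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="LastName" type="xs:string">
    <xs:annotation>
      <xs:documentation>Family name of the person, regardless of the order in which names are written.</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
"""

def element_docs(schema_text):
    """Map each top-level element name to its xs:documentation text."""
    root = ET.fromstring(schema_text)
    docs = {}
    for el in root.findall(f"{XS}element"):
        doc = el.find(f"{XS}annotation/{XS}documentation")
        has_text = doc is not None and doc.text
        docs[el.get("name")] = doc.text.strip() if has_text else ""
    return docs

print(element_docs(SCHEMA)["LastName"])
```

The tag name stays short, and the meaning lives in documentation that tools can extract, for example to populate a data dictionary or a registry entry.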
14. Keeping tags human readable might be more expensive initially, but
- It can lead to less maintenance downstream
- It can lead to greater reuse
- It can lead to cost savings
15. XML and Programming Logic
16. There are certain specifications that use XML in a programming sense, e.g. to represent methods
- Example from OASIS Directory Services Markup Language (DSML) Version 2
  - <searchRequest dn="OU=Marketing,DC=Example,DC=COM"/>
- This appears to move XML toward being a full object model rather than exclusively a serialization
- There are also cases in which schemata represent IF/THEN constructs as XML
  - Is this a good idea?
  - If so, there should be some common approach to this use of XML
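To make the "XML as method call" idea concrete, here is a minimal sketch in which the element name selects a handler and the attributes act as its arguments. It is an illustration only, not how a DSML gateway is implemented: the directory contents, the HANDLERS table, and the function names are hypothetical, and the DSML namespace and the other parts of a real searchRequest are omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory directory; a real gateway would query a directory server.
DIRECTORY = {
    "OU=Marketing,DC=Example,DC=COM": ["CN=Alice", "CN=Bob"],
}

def search_request(attrs):
    """Handler invoked for a <searchRequest> element."""
    return DIRECTORY.get(attrs["dn"], [])

# The element name selects the "method"; the attributes are its arguments.
HANDLERS = {"searchRequest": search_request}

def dispatch(xml_text):
    """Parse one request element and call the handler registered for its tag."""
    element = ET.fromstring(xml_text)
    handler = HANDLERS[element.tag]
    return handler(element.attrib)

print(dispatch('<searchRequest dn="OU=Marketing,DC=Example,DC=COM"/>'))
```

Seen this way, the document is less a serialization of data than a description of an operation to perform, which is the shift the sub-thread was questioning.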
17. This is a wave that began with XSLT and is becoming more prevalent today
- More people are seeing the value of a "declarative programming" approach in which much of the processing is "pushed down" into processing engines rather than being stipulated command-by-command as with 3GL languages
- Examples
  - OASIS Business Process Execution Language For Web Services (BPEL4WS)
  - Web Services Choreography Interface (WSCI)
  - W3C XForms
18. Questions?