Title: A Tool Kit for Implementing XML Schema Naming and Design Rules. OASIS Symposium: The Meaning of Interoperability
1 A Tool Kit for Implementing XML Schema Naming and Design Rules
OASIS Symposium: The Meaning of Interoperability
May 9, 2006
- Josh Lubell, lubell_at_nist.gov
- National Institute of Standards and Technology
- Manufacturing Systems Integration Division
2 XML Exchange Schemas are Bridges
3 But Bridges Must Be Designed Properly
4 A Solution: Naming and Design Rules
- Encode XML schema best practices
- Enforce a particular modeling methodology
- Ensure common naming conventions
  - Use of camel case
  - Allowable acronyms
- But NDRs can be difficult to apply
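As a rough illustration of how a naming convention like those listed above could be made machine-checkable, here is a minimal Python sketch. The specific convention (upper camel case, with acronyms permitted only if they appear on an approved list) and the names `ALLOWED_ACRONYMS` and `check_name` are assumptions made for illustration; they are not taken from any particular NDR.

```python
import re

# Hypothetical approved-acronym list; a real NDR would supply its own.
ALLOWED_ACRONYMS = {"ID", "URI", "UBL"}

# Split a name into capitalized words, acronym runs, and digit runs.
_TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|[0-9]+")


def check_name(name):
    """Return a list of problems found in `name` (an empty list means it passes)."""
    problems = []
    if not re.fullmatch(r"[A-Za-z0-9]+", name):
        problems.append("contains characters other than letters and digits")
        return problems
    if not name[0].isupper():
        problems.append("does not start with an upper-case letter")
    for token in _TOKEN.findall(name):
        # A run of two or more capitals is treated as an acronym.
        if len(token) > 1 and token.isupper() and token not in ALLOWED_ACRONYMS:
            problems.append(f"acronym {token!r} is not on the allowed list")
    return problems
```

Under these assumptions, `check_name("PartyID")` returns an empty list, while `check_name("XMLSchema")` reports that `XML` is not an approved acronym.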
5 Barriers to NDR Usefulness
- Proliferation
  - How do I decide which NDR set to adopt?
  - Should I develop my own NDR?
- Lack of structure
  - NDR documents usually in proprietary word processor formats
  - Inhibits rule reuse
  - Limited versioning and traceability
- Ambiguity
  - Rules written in English rather than computer-interpretable language
  - NDR enforcement not automatic
6 Schematron as an NDR Implementation Method
- Advantages
  - XML-native (based on XPath)
  - Rule-based
  - Can test for co-occurrence constraints
  - User-configurable diagnostic messages
  - ISO standard
- Disadvantage
  - Less versatile than a general-purpose programming language
7 Example from Universal Business Language NDR
[ELD1] Each UBL:DocumentSchema MUST identify one and only one global element declaration that defines the document ccts:AggregateBusinessInformationEntity being conveyed in the Schema expression. That global element MUST include an xsd:annotation child element which MUST further contain an xsd:documentation child element that declares "This element MUST be conveyed as the root element in any instance document based on this Schema expression."
8 Implementation Observations
[Slide repeats rule ELD1 from the previous slide, annotated with these callouts: rule label, namespace dependence (twice), subrule 1, subrule 2, context 1, context 2]
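To make the two parts of rule ELD1 concrete, the following is a minimal sketch of a check written in Python rather than Schematron (the method the slides use). It rests on a simplifying assumption: every global `xsd:element` in the schema counts toward the "one and only one" limit, with no attempt to decide whether the element represents the document's aggregate business information entity. The function name `check_eld1` is hypothetical.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"
REQUIRED_TEXT = ("This element MUST be conveyed as the root element in any "
                 "instance document based on this Schema expression")


def check_eld1(schema_xml):
    """Return a list of violation messages for a simplified reading of ELD1."""
    root = ET.fromstring(schema_xml)
    # First sentence of the rule (simplified): exactly one global element.
    global_elements = root.findall(XSD + "element")
    if len(global_elements) != 1:
        return [f"expected exactly 1 global element, found {len(global_elements)}"]
    # Second sentence: that element carries the required documentation text.
    docs = global_elements[0].findall(f"{XSD}annotation/{XSD}documentation")
    # Normalize whitespace so line breaks in the schema do not matter.
    texts = [" ".join("".join(d.itertext()).split()) for d in docs]
    if not any(REQUIRED_TEXT in t for t in texts):
        return ["global element lacks the required xsd:documentation text"]
    return []
```

Even this simplified version depends on the XML Schema namespace URI being hard-coded, which is one form of the namespace dependence the slide points out.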
9 UBL Lessons Learned
- Implementation non-trivial even for a seemingly simple rule
- Some rules require a general-purpose programming language for implementation
  - [GNR1] UBL XML element, attribute and type names MUST be in the English language, using the primary English spellings provided in the Oxford English Dictionary.
  - [GNR7] UBL XML element, attribute and type names MUST be in singular form unless the concept itself is plural.
- Some rules cannot be implemented at all
  - [NMS6] UBL published namespaces MUST never be changed.
  - [VER10] UBL Schema and schema module minor version changes MUST not break semantic compatibility with prior versions.
- MUST versus SHOULD versus MAY
  - More on MAY later
10 Dept. of Navy (DON) NDR Case Study
- 128 rules
- Based on UBL NDR
- Why choose the DON NDR?
  - Help developers write better schemas for Federal government applications
  - Gain insight into best practices for NDR development (particularly reuse of existing NDRs)
  - Publicly available
  - A Navy standard
11 DON NDR Testability (using Schematron)
12 Issue: Use of MAY
- A rule saying that something MAY occur, strictly speaking, will always pass
- But this may not be the rule creator's intent
- Example: [CTD8] Code and ID ccts:BBIE Property complex types MAY use the xsd:choice element to reference global elements defined in standardized ID Scheme or Code List Schema modules.
- Approaches
  - Consider rule as guidance only (don't implement)
  - Interpret MAY as discouragement, e.g. warning: referencing global element using xsd:choice
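The second approach, treating MAY as a warning rather than a pass/fail test, can be sketched as follows. The severity mapping and the names `SEVERITY`, `Finding`, and `report` are hypothetical and not part of any tool described in these slides.

```python
from dataclasses import dataclass

# Hypothetical mapping from requirement keywords to report severities.
SEVERITY = {"MUST": "error", "SHOULD": "warning", "MAY": "warning"}


@dataclass
class Finding:
    rule_id: str
    severity: str
    message: str


def report(rule_id, keyword, condition_detected, message):
    """Return a Finding when the tested condition is detected, else None.

    For MUST and SHOULD rules the condition is a violation; for a MAY rule
    it is the permitted-but-discouraged construct itself.
    """
    if not condition_detected:
        return None
    return Finding(rule_id, SEVERITY[keyword], message)
```

For example, `report("CTD8", "MAY", True, "referencing global element using xsd:choice")` yields a finding with severity `warning`, so the schema still passes but the developer is alerted.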
13 Issue: Requirement for External Resources
- [GNR1] UBL XML element, attribute and type names MUST be in the English language, using the primary English spellings provided in the Oxford English Dictionary.
  - Implementation requires access to electronic OED
- And the DON adaptation of this rule has additional requirements
  - [GNR1] XML element, attribute, and type names MUST be in the English language, using the Oxford English Dictionary for Writers and Editors (Latest Ed.). Where both American and English spellings of the same word are provided, the American spelling MUST be used.
  - Electronic OED must be fully up to date
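The dependence on an external resource becomes obvious once the rule is coded. In the minimal Python sketch below, the tiny `DICTIONARY` set and the `AMERICAN_SPELLING` table are stand-ins invented for illustration; a real implementation would need licensed, current dictionary data, which is exactly the difficulty this slide raises. The function names are also hypothetical.

```python
import re

# Stand-in word list; a real check needs an up-to-date electronic dictionary.
DICTIONARY = {"color", "colour", "code", "party", "name"}
# Hypothetical British-to-American spelling pairs for the DON variant.
AMERICAN_SPELLING = {"colour": "color"}


def split_words(name):
    """Split a camel-case name into lower-case words (digits are ignored)."""
    return [w.lower() for w in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)]


def check_gnr1(name, prefer_american=False):
    """Return messages for words not in DICTIONARY or, optionally, not American."""
    messages = []
    for word in split_words(name):
        if word not in DICTIONARY:
            messages.append(f"{word!r} not found in dictionary")
        elif prefer_american and word in AMERICAN_SPELLING:
            messages.append(f"use {AMERICAN_SPELLING[word]!r} instead of {word!r}")
    return messages
```

With `prefer_american=False` the check approximates the UBL wording of the rule; with `prefer_american=True` it approximates the DON wording, so the same name (`ColourCode`) passes one and fails the other.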
14 Issue: Rule Proliferation
- Illustrated by UBL rule GNR1 versus DON rule GNR1
- DON rule same as UBL rule, but with added constraints
  - American spelling favored
  - Latest OED edition required
- But no explicit relationship specified in DON NDR!
- Both rules have same ID, even though they are different rules
- Improved traceability and reusability would reduce the confusion
15 Issue: Ambiguous Terminology
- More rigor needed in NDR definitions
- Example: xsd:SchemaExpression
  - Not defined in W3C XML Schema recommendation
  - Used but not defined in DON NDR
  - Defined in UBL NDR to mean a concept
16 Issue: Mixed Content
- Essential for representing semi-structured data
- But allowing it makes the NDR more complicated
- UBL NDR forbids mixed content
- DON NDR allows it, but only if defined by a namespace from a Navy-approved standard (e.g. XHTML)
- But XHTML element and attribute names violate rule GNR1!
17 Quality of Design (QoD) Tool
- Contains rules based on naming and design guidelines (NDRs) from a number of sources
- Stores executable test cases written in Schematron and Java Expert System Shell (Jess)
- Executes tests against user-provided schemas and reports results
- Rules grouped into test profiles
18 Why QoD?
- Addresses proliferation of NDRs
  - Overlapping NDR standards
- Supports reusability of rules
- Highlights ambiguous rules
- Provides an explicit structure for rules in NDRs
- Automates rule enforcement
- Enables versioning and traceability of rules
19 Candidate NDRs
- OASIS Universal Business Language (UBL)
- US Department of the Navy (DON)
- Korean Institute for Electronic Commerce
- Open Applications Group (OAGIS)
- US Air Force
- US Federal CIO Council XML Working Group
- ASC X12 (CICA)
- FIATECH (capital facilities industry)
20 Architecture of QoD Web Application
21 Characteristics of Rules
- Coverage: full, partial, none
- Applicability: indicates type of schema (document, low, or aggregate) the rule applies to
- Rationale: reason for rule from a list of justifications
- Requirement: text from the NDR document
- Implementation File: URI of the file containing the implementation of the rule
22 Example XML Description of a Rule using QoD Exchange Schema
<testProfile>
  <source id="ubl">
    <organization>OASIS</organization>
    <orgURL>http://www.oasis-open.org</orgURL>
    <title>Universal Business Language (UBL) Naming and Design Rules</title>
    <version>1.0</version>
    <date>2004-11-15</date>
    <docURL>http://docs.oasis-open.org/ubl/cd-UBL-NDR-1.0.1</docURL>
  </source>
  <ruleSet id="ELD">
    <name>Element Declaration Rules</name>
    <rule id="ELD1">
      <coverage>full</coverage>
      <schema>D</schema>
      <rationale>structural clarity</rationale>
      <requirement>Each UBL:DocumentSchema MUST identify one and ...</requirement>
      <implementation file="example.scmteld1" type="schematron"/>
    </rule>
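A rule description of this kind is straightforward to load into a program. The sketch below, in Python, reads the rule characteristics from the previous slide into a small record type. It assumes the elements are in no namespace, and it uses a trimmed copy of the excerpt with closing tags added so that it parses; the names `Rule` and `load_rules` are hypothetical and not part of the QoD tool.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str
    coverage: str
    schema: str
    rationale: str
    requirement: str
    implementation_file: str
    implementation_type: str


def load_rules(profile_xml):
    """Extract every <rule> element from a test profile document."""
    root = ET.fromstring(profile_xml)
    rules = []
    for rule in root.iter("rule"):
        impl = rule.find("implementation")
        rules.append(Rule(
            rule_id=rule.get("id"),
            coverage=rule.findtext("coverage"),
            schema=rule.findtext("schema"),
            rationale=rule.findtext("rationale"),
            requirement=rule.findtext("requirement"),
            implementation_file=impl.get("file"),
            implementation_type=impl.get("type"),
        ))
    return rules


# Trimmed copy of the excerpt, with closing tags added (an assumption).
PROFILE = """<testProfile>
  <ruleSet id="ELD">
    <name>Element Declaration Rules</name>
    <rule id="ELD1">
      <coverage>full</coverage>
      <schema>D</schema>
      <rationale>structural clarity</rationale>
      <requirement>Each UBL:DocumentSchema MUST identify one and ...</requirement>
      <implementation file="example.scmteld1" type="schematron"/>
    </rule>
  </ruleSet>
</testProfile>"""

rules = load_rules(PROFILE)
```

Keeping the requirement text, rationale, and implementation pointer together in one record is what makes the versioning and traceability goals listed earlier practical.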
23 QoD Test Profile Exchange
32 Application to Developing XML Schemas
- Currently a limited set of rules is implemented
  - Recently implemented subset of DON NDR in Schematron
- Tested with a small but varied set of sample schemas
  - Navy IETM Schema Q70IETM (Interactive Electronic Technical Manual)
  - Grants.gov
  - AEX (building and construction industry)
  - US Dept. of Defense
- Provided meaningful results to schema developers
33 Examples of types of warnings found in developing XML Schemas
- Global elements declared in non-desirable places
- Anonymous/local types defined in non-desirable places
- Global schemas that do not declare a default namespace
- Document/Transaction level schemas that define multiple global elements
- Re-declaration of elements and types (e.g. programType) in different namespaces
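The last warning in the list, re-declaration of a name in different namespaces, requires comparing several schemas at once. Here is a minimal Python sketch of one way to detect it, assuming each schema is available as an XML string and that only top-level `xsd:element`, `xsd:complexType`, and `xsd:simpleType` declarations matter. The function name `find_redeclarations` is hypothetical, and this is not how the QoD tool itself is implemented.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XSD = "{http://www.w3.org/2001/XMLSchema}"
GLOBAL_KINDS = ("element", "complexType", "simpleType")


def find_redeclarations(schemas):
    """Map each (kind, name) declared in more than one namespace to those namespaces."""
    seen = defaultdict(set)
    for schema_xml in schemas:
        root = ET.fromstring(schema_xml)
        # A schema without targetNamespace is recorded under the empty string.
        namespace = root.get("targetNamespace", "")
        for kind in GLOBAL_KINDS:
            for decl in root.findall(XSD + kind):
                seen[(kind, decl.get("name"))].add(namespace)
    return {key: sorted(ns) for key, ns in seen.items() if len(ns) > 1}
```

Given two schemas that both declare a complex type named `programType` under different target namespaces, the function returns one entry keyed by `("complexType", "programType")` listing both namespaces.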
34 Lessons Learned in Coding NDRs
- NDR documents need to be regarded as rigorous technical documentation
- More review needed
- Better authoring tools needed
- Rules that cannot be implemented are non-enforceable
- Definition of NDRs is non-trivial
- Many rules cannot be tested
- Many rules are more difficult to implement than thought
- Difficult to reuse rules due to namespace definitions
- Often rules are ambiguous or unclear
- Implementation of rules is non-trivial
- Testing of rules is complex
- All boundary conditions need to be thought of and covered
- Legacy data and 3rd party schemas need to be addressed in NDRs
35 What's Next
- Continue to expand our NDR rule-base
- Continue to enhance software based on user requirements
- Produce a tool kit for NDR developers
  - Enhance QoD schema to represent entire NDR document
  - Provide authoring templates
- Identify collaborators for future work
  - If interested, contact me!
36 Summary
- A process for XML schema development is necessary
- Tools can automate the process, thereby reducing labor and deployment time
- Definition and implementation of NDRs is non-trivial but necessary to support reuse of schemas
- Enforcing NDRs will ultimately make XML schemas more interoperable
37 For More Information
- Lubell, et al., Implementing XML Schema Naming and Design Rules, submitted to Extreme Markup Languages 2006
- QoD information page: http://www.nist.gov/msid/QOD
- QoD SourceForge project: http://qod.sourceforge.net