Title: A Document Standards Interoperability Framework for DocBook, DITA, ODF and More
1. A Document Standards Interoperability Framework for DocBook, DITA, ODF and More!
www.oasis-open.org
Scott Hudson
2. Fraternal Twins?
DocBook
- Optimized for book-oriented deliverables
- Widely supported
- Well-documented
- Large user community
- Designed for multiple outputs
DITA
- Designed for topic-based authoring and re-use
- Designed for interoperability
- Highly flexible specialization
- Designed for multiple outputs
Both designed for technical documentation!
3. In Search of the Holy Grail
- Content sharing, reuse and re-purposing have always been the Holy Grail of publishing
4. Interoperability is Needed!
- Sharing information between trading partners or OEMs who use different standards
- Sharing information between areas in the same organization that have implemented different standards
- Sharing content between non-XML users and XML content developers (knowledge transfer)
- Transitional strategy as an organization moves between content standards, but has existing applications that share data
- Compatibility with tools that are optimized for a different standard
- Syndication!
- Leverage, repurpose content from many sources
5. Today's Reality
- Old and New Forms of Collaboration
  - OEMs and Partners
  - Intra-Organizational
  - Mergers/Acquisitions
  - Syndication
- Multiple XML Document Standards
  - DocBook
  - DITA
  - TEI
  - ODF
  - Microsoft Office Open XML
- How do I reconcile/leverage/reuse content from disparate standards?
6. Interoperability: Current State
- Because DocBook, DITA and other standards will co-exist, these standards need to be interoperable, but they're not. Yet.
- A common set of element definitions and models as a base for each standard will require much more collaboration between standards.
- Both standards contain methods to extend their respective standard
  - Specializations in DITA
  - Customization layers in DocBook
7. Insight: Parallel View of DITA and DocBook
[Figure: side-by-side outlines of a DITA BookMap and a DocBook master document. The DITA side nests Topic (Front Matter), Topic, Topic (Sub-Topic), Topic (Sub-Sub-Topic) and Topic (Back Matter); the DocBook side nests Front Matter, Chapter, Section (Topic), Section (Sub-Topic), Section (Sub-Sub-Topic) and Back Matter. Each DITA Topic lines up with a DocBook Section.]
8. A Closer Look At DITA Structures
[Figure: generic content model with two panels, Topics and Publications, labeled with DITA element names. Publication-level assembly: Book (bookmap), Title (booktitle), Meta (bookmeta), Front Matter (frontmatter), Chapter (chapter, appendix, etc.), Topic Reference (topicref, href). Chapter-level assembly and topics: Topic (topic), Title (title), Meta (topicmeta), Body (topicbody), nested topics and topic chunks reused via conref (href), Media (image, object) pointing to a Media Object (jpg, gif, mpeg, wav, etc.).]
9. A Closer Look At DocBook Structures
[Figure: generic content model with two panels, Topics and Publications, labeled with DocBook element names. Publication-level assembly: Book (book), Title (title), Meta (info, bookinfo), Front Matter, Chapter (chapter, appendix, etc.), Topic Reference (entity reference or XInclude). Chapter-level assembly and topics: Topic (section), Title (title), Meta (info, sectioninfo), Body (implied), nested topics and topic chunks included via entity reference or XInclude, Media (mediaobject, inlinemediaobject) pointing to a Media Object (jpg, gif, mpeg, wav, etc.).]
10. Strategies for Interoperability
- Content Interoperability
  - Shoehorning
- Processing Interoperability
  - Uni-directional
- Roundtrip Interoperability
  - Bi-directional
11. Content Interoperability
- DITA elements can be specialized with DocBook element names so authors can create content as valid simplified DocBook sections and refsections, while also being valid DITA specialized topics!
- Likewise, DITA elements can be added to a DocBook customization layer to allow DITA topics as siblings of DocBook sections.
- Drawback: No official specialization or customization layer exists. If you customize DocBook, it's not DocBook anymore! Is there a better way?
12. Processing Interoperability
- DocBook books should be able to include DITA topics by reference. Using a transform, this content could be preprocessed into DocBook and finally output with the DocBook stylesheets.
- Similarly, DITA maps or topics should be able to reference DocBook articles or sections. Using a transform, this content could be preprocessed to produce DITA and then processed using the DITA Open Toolkit.
- Drawback: How do you handle validation? How do you address different versions of the standards? Is there a better way?
13. Roundtrip Interoperability
- Roundtrip interoperability uses transforms to convert content directly from DocBook to DITA and vice versa
- Drawback: Requires multiple stylesheets to address different versions of the standards. Does not allow roundtrip to other standards. Is there a better way?
14. Version Chaos
[Figure: the versions that direct transforms must cover: DITA 1.0 and 1.1; DocBook 4.2, 4.3, 4.4, 4.5, 5.0, S1.0 and S1.1.]
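15. What a Tangled Web we Weave
[Figure: version chart extended to a third standard: DITA 1.0 and 1.1; DocBook 4.2, 4.3, 4.4, 4.5, 5.0, S1.0 and S1.1; ODF 1.0; and "More?" for further standards.]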
16. Version Control
[Figure: DITA (1.0, 1.1), DocBook (4.2, 4.3, 4.4, 4.5, 5.0, S1.0, S1.1), ODF (1.0) and "More?" arranged around a central InteropFramework.]
17. Doc Standards Interoperability Framework
- Yes, there is a better way!
- Roundtrip interoperability via content transforms to a neutral interchange format
- Maintain stylesheet transforms to and from this neutral interchange format
- Allows interoperability between more than just DocBook and DITA (for example, Open Document Format and more!)
- And without inventing a new grammar
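The case for a neutral hub can be checked with a quick count. The sketch below is illustrative only: it assumes that direct conversion needs one one-way stylesheet per ordered pair of format versions, and that the hub approach needs one stylesheet to and one from the interchange format per version. The function names and the version list (read off the version charts) are assumptions, not part of the framework.

```python
def pairwise_transforms(n_formats: int) -> int:
    """One-way stylesheets needed when every format converts directly to every other."""
    return n_formats * (n_formats - 1)


def hub_transforms(n_formats: int) -> int:
    """One-way stylesheets needed when every format converts to and from one neutral hub."""
    return 2 * n_formats


# Format versions as read from the version charts (labels are an assumption).
versions = [
    "DITA 1.0", "DITA 1.1",
    "DocBook 4.2", "DocBook 4.3", "DocBook 4.4", "DocBook 4.5", "DocBook 5.0",
    "DocBook S1.0", "DocBook S1.1",
    "ODF 1.0",
]

n = len(versions)
print(f"{n} versions: {pairwise_transforms(n)} direct stylesheets, "
      f"{hub_transforms(n)} via a hub")
```

Under these assumptions the direct approach grows quadratically with each added standard or version, while the hub approach grows by two stylesheets per addition.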
18. Doc Standards Interoperability Framework
Bridging content standards in a meaningful way!
19. Why Include ODF?
- People already use it
- Proof point for providing interoperability to other standards
  - Microsoft OpenXML
  - LegalXML
  - TEI
  - ?
- Potential use of ODF as a surrogate authoring application that doesn't know about DocBook or DITA!
20. Choosing an Interoperability Framework Standard
- Not another new grammar! Please?!?
  - Hard to keep up with existing standards
- Re-use/re-purpose an existing standard
  - Leverage existing tools, technology
  - Shorter learning curve, faster adoption rate
- What do DITA, DocBook, ODF (and others) have in common?
  - Designed for producing content!
  - Common structural components
    - Headings (Sections)
    - Paragraphs
    - Lists
    - Tables
    - Images
    - etc.
21. What's Behind the Interoperability Framework?
- XHTML Microformats
  - http://microformats.org/about/
  - Semantic HTML
  - Humans first, machines second
- Provides basic structural abstractions common to content-based XML standards (DocBook, DITA, ODF, etc.)
- All of these standards have HTML renditions anyway
- Transitional XHTML 1.0
  - Well-formed XML with a DTD using HTML markup
22. Implementing Microformats
- Use XHTML elements for common structures
  - table, p, ul, ol, img, code, pre, abbr, acronym
- Use generic structural elements as abstraction layer
  - div, span
- Use title attribute to store original element name
- Use class attribute to specify semantic category
- Preserve additional semantic data
  - Use <object> tag
    - Valid virtually anywhere
    - Attributes as <param> elements
    - Enable introspection

    <div title="note" class="admonitionblock">
      <object class="element-definition">
        <param name="type" value="important"/>
        <param name="audience" value="dev advanced"/>
      </object>
      ...
    </div>

  - Use namespaced attribute/value pairs
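The encoding rules above can be sketched in a few lines of Python. This is a minimal illustration, not the framework's actual stylesheet: the function name `to_interchange` and the choice to pass the semantic category as an argument are assumptions, and child content is left out to keep the focus on how the element name and attributes are recorded.

```python
import xml.etree.ElementTree as ET


def to_interchange(source: ET.Element, category: str) -> ET.Element:
    """Wrap a source element as a div that records its name and attributes.

    title holds the original element name, class holds the semantic category,
    and each source attribute becomes a <param> inside an <object>.
    """
    div = ET.Element("div", {"title": source.tag, "class": category})
    if source.attrib:
        definition = ET.SubElement(div, "object", {"class": "element-definition"})
        for name, value in source.attrib.items():
            ET.SubElement(definition, "param", {"name": name, "value": value})
    return div


# A source note with two attributes, as in the slide's example.
note = ET.fromstring('<note type="important" audience="dev advanced"/>')
print(ET.tostring(to_interchange(note, "admonitionblock"), encoding="unicode"))
```

Because the attributes survive as name/value pairs, a later transform can inspect them without knowing the source grammar.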
23. Preserving Source Mappings
- The Goal: Preserve source semantics in the interchange
- DITA: outputclass is the appropriate mapping extension to preserve semantics
- DocBook: remap attribute designed for preserving source transforms
- DocStandards Interoperability Framework utilizes a combination of title attribute and an object child or namespaced attributes to store source metadata
- <head> element contains metadata (origin format, Interop declarations)
- <body> element contains content payload
24. Mapping the Standards
- To develop the interoperability framework, a mapping of content elements between the standards will be needed

DITA Element            DocBook Element
title                   title
steps                   procedure
step                    step
substeps                substeps
p                       para
ul                      itemizedlist
ol                      orderedlist
note type="note"        note
note type="caution"     caution
note type="warning"     warning
b                       emphasis role="bold"
u                       emphasis role="underline"
i                       emphasis role="italic"

Further columns on the slide: DocStandards Interoperability Framework Elements; Other Doc Standards (ODF, etc.)
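The DITA and DocBook columns of this mapping can be held as a lookup table. The sketch below is only one possible representation: the tuple keys, the `docbook_for` helper and the reverse table are illustrative assumptions, and the entries are limited to the thirteen pairs listed on the slide.

```python
from typing import Optional, Tuple

# (DITA element, note type or None) -> (DocBook element, emphasis role or None)
DITA_TO_DOCBOOK = {
    ("title", None): ("title", None),
    ("steps", None): ("procedure", None),
    ("step", None): ("step", None),
    ("substeps", None): ("substeps", None),
    ("p", None): ("para", None),
    ("ul", None): ("itemizedlist", None),
    ("ol", None): ("orderedlist", None),
    ("note", "note"): ("note", None),
    ("note", "caution"): ("caution", None),
    ("note", "warning"): ("warning", None),
    ("b", None): ("emphasis", "bold"),
    ("u", None): ("emphasis", "underline"),
    ("i", None): ("emphasis", "italic"),
}

# The listed pairs are one-to-one, so the table can simply be inverted.
DOCBOOK_TO_DITA = {docbook: dita for dita, docbook in DITA_TO_DOCBOOK.items()}


def docbook_for(dita_name: str, note_type: Optional[str] = None) -> Optional[Tuple[str, Optional[str]]]:
    """Return the DocBook element and role for a DITA element, or None if unmapped."""
    return DITA_TO_DOCBOOK.get((dita_name, note_type))


print(docbook_for("steps"))
print(docbook_for("note", "caution"))
print(DOCBOOK_TO_DITA[("emphasis", "italic")])
```

Returning None for unmapped elements leaves the caller to decide how to handle markup that has no counterpart.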
25. Mapping Rules
- Not all elements have a 1:1 mapping
- Some markup will be implied to and from framework

Interoperability Framework (XHTML)
  <img src="foo.png"/>
DocBook
  <mediaobject><imageobject><imagedata fileref="foo.png"/></imageobject></mediaobject>
DITA
  <image href="foo.png"/>
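The image example shows markup being implied in one direction: a single XHTML element expands into three nested DocBook elements. A minimal Python sketch of both conversions follows; the function names are hypothetical, and the DocBook side uses the fileref attribute, which is where DocBook's imagedata element carries the file name.

```python
import xml.etree.ElementTree as ET


def img_to_docbook(img: ET.Element) -> ET.Element:
    """Expand an XHTML img into mediaobject/imageobject/imagedata."""
    media = ET.Element("mediaobject")
    image_object = ET.SubElement(media, "imageobject")
    ET.SubElement(image_object, "imagedata", {"fileref": img.get("src", "")})
    return media


def img_to_dita(img: ET.Element) -> ET.Element:
    """Map an XHTML img to a DITA image; here the mapping is one element to one."""
    return ET.Element("image", {"href": img.get("src", "")})


img = ET.fromstring('<img src="foo.png"/>')
print(ET.tostring(img_to_docbook(img), encoding="unicode"))
print(ET.tostring(img_to_dita(img), encoding="unicode"))
```

Going the other way, a transform from DocBook has to collapse the wrapper elements, which is why the two wrappers count as implied markup.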
26. Example DITA Source
27. Example Interoperability Format
28. Interoperability Format to DocBook
29. Interoperability Format to DITA
30. Interoperability Format to ODF
31. How High is the Fidelity?
32. Roundtrip Fidelity
DocBook
  <emphasis role="bold">
Interoperability Framework
  <b title="emphasis"><object class="element-definition"><param name="role" value="bold"/></object>
DITA
  <b outputclass="emphasis" otherprops="role,bold"/>
Interoperability Framework
  <b title="b"><object class="element-definition"><param name="otherprops" value="role,bold"/><param name="outputclass" value="emphasis"/></object>
DocBook
  <emphasis role="bold" remap="b">
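The first half of this roundtrip (DocBook to the interchange format to DITA) can be sketched as two small functions. This is an illustration under stated assumptions, not the framework's stylesheets: the function names are hypothetical, only the three emphasis roles from the mapping table are handled, and joining several name/value pairs in otherprops with spaces is a guess, since the slide shows a single pair.

```python
import xml.etree.ElementTree as ET

# Emphasis roles from the mapping table and the XHTML element each one becomes.
ROLE_TO_HTML = {"bold": "b", "underline": "u", "italic": "i"}


def docbook_to_interchange(emphasis: ET.Element) -> ET.Element:
    """Turn <emphasis role="..."> into an XHTML inline element that records its origin."""
    role = emphasis.get("role")
    html = ET.Element(ROLE_TO_HTML[role], {"title": emphasis.tag})
    definition = ET.SubElement(html, "object", {"class": "element-definition"})
    ET.SubElement(definition, "param", {"name": "role", "value": role})
    definition.tail = emphasis.text  # the text content follows the object child
    return html


def interchange_to_dita(html: ET.Element) -> ET.Element:
    """Turn the interchange element into DITA, keeping the origin in outputclass/otherprops."""
    definition = html.find("object")
    pairs = [f"{p.get('name')},{p.get('value')}" for p in definition.findall("param")]
    dita = ET.Element(html.tag, {"outputclass": html.get("title"),
                                 "otherprops": " ".join(pairs)})
    dita.text = definition.tail
    return dita


source = ET.fromstring('<emphasis role="bold">important</emphasis>')
interchange = docbook_to_interchange(source)
print(ET.tostring(interchange, encoding="unicode"))
print(ET.tostring(interchange_to_dita(interchange), encoding="unicode"))
```

The DITA result still carries enough information (outputclass and otherprops) for a return transform to rebuild the original emphasis element.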
33. DITA Roundtrip Example
DITA
  <concept>
Interoperability Framework
  <div title="concept" class="topicblock"><object class="element-definition"/>
DocBook
  <section remap="concept">
Interoperability Framework
  <div title="section" class="topicblock"><object class="element-definition"><param name="remap" value="concept"/></object></div>
DITA
  <topic outputclass="section"> OR <concept outputclass="section">
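The full DITA roundtrip can be traced step by step in Python. The sketch below is a simplified assumption of how such transforms might behave: all function names are hypothetical, content is ignored, and the final step restores the original topic type from the remap value when it names a DITA topic type, falling back to a generic topic otherwise (the slide lists both outcomes).

```python
import xml.etree.ElementTree as ET

DITA_TOPIC_TYPES = {"topic", "concept", "task", "reference"}


def dita_to_interchange(topic: ET.Element) -> ET.Element:
    """DITA topic -> interchange div; the element name goes into title."""
    div = ET.Element("div", {"title": topic.tag, "class": "topicblock"})
    ET.SubElement(div, "object", {"class": "element-definition"})
    return div


def interchange_to_docbook(div: ET.Element) -> ET.Element:
    """Interchange div -> DocBook section; the source name is kept in remap."""
    return ET.Element("section", {"remap": div.get("title")})


def docbook_to_interchange(section: ET.Element) -> ET.Element:
    """DocBook section -> interchange div; attributes become param elements."""
    div = ET.Element("div", {"title": section.tag, "class": "topicblock"})
    definition = ET.SubElement(div, "object", {"class": "element-definition"})
    for name, value in section.attrib.items():
        ET.SubElement(definition, "param", {"name": name, "value": value})
    return div


def interchange_to_dita(div: ET.Element) -> ET.Element:
    """Interchange div -> DITA topic, using remap to recover the topic type."""
    remap = None
    for param in div.findall("./object/param"):
        if param.get("name") == "remap":
            remap = param.get("value")
    tag = remap if remap in DITA_TOPIC_TYPES else "topic"
    return ET.Element(tag, {"outputclass": div.get("title")})


step1 = dita_to_interchange(ET.Element("concept"))
step2 = interchange_to_docbook(step1)
step3 = docbook_to_interchange(step2)
step4 = interchange_to_dita(step3)
for step in (step1, step2, step3, step4):
    print(ET.tostring(step, encoding="unicode"))
```

The roundtrip returns a concept rather than the exact original markup: the added outputclass is the trace left by the trip through DocBook.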
34. Purity vs. Practicality
- It's not pure [Insert XML Standard Here]
35. What Can You Do With the Framework?
- Enable interoperability between two or more standards
- Enable interoperability between different versions of each applied standard
- Unlock content in proprietary formats for initial migration to XML Document Standards
- Apply the 80/20 rule to semantic accuracy
36. Proposed OASIS TC
- Flatirons has proposed an OASIS Technical Committee to continue evolving the Interoperability Framework
- docstandards-interop-discuss_at_lists.oasis-open.org
- Standards members include
  - Michael Priestley (DITA)
  - Scott Hudson (DocBook, DITA)
  - Don Harbison (ODF)
  - Jim Earley (DocBook, DITA)
37. Proposed OASIS TC charter
- The Doc Standards Interoperability TC is intended to address
  - the development and documentation of scenarios for cross-standard content sharing
  - a specification for an interoperability framework, including mappings from participating standard formats to the framework
  - and requirements on participating standards to improve interoperability.
38. A Call to Arms
- We need your
  - Awareness
  - Support
  - Participation!
- For more information
  - Email thoughtleader_at_flatironssolutions.com to subscribe to our whitepaper mailing list
  - Download our whitepaper at flatironssolutions.com
  - docstandards-interop-discuss_at_lists.oasis-open.org