RELAX NG - PowerPoint PPT Presentation

Provided by: davidma75

1
RELAX NG
2
Caveat
  • I did not have a RELAX NG validator when I
    wrote these slides. Therefore, if an example
    appears to be wrong, it probably is.

3
What is RELAX NG?
  • RELAX NG is a schema language for XML
  • It is an alternative to DTDs and XML Schemas
  • It is based on earlier schema languages, RELAX
    and TREX
  • It is not a W3C standard, but is an OASIS
    standard
  • OASIS is the Organization for the Advancement of
    Structured Information Standards
  • ebXML (Electronic Business using eXtensible
    Markup Language) is a joint effort of OASIS and
    UN/CEFACT (United Nations Centre for Trade
    Facilitation and Electronic Business)
  • OASIS maintains the highly popular DocBook DTD
    for describing books, articles, and technical
    documents
  • RELAX NG has recently been adopted as an ISO/IEC
    standard

4
Design goals
  • Simple and easy to learn
  • Uses XML syntax
  • But there is also a concise (non-XML) syntax
  • Does not change the information set of an XML
    document
  • (That is, validation only accepts or rejects a
    document; it does not add default attribute
    values or type annotations to it)
  • Supports XML namespaces
  • Treats attributes uniformly with elements so far
    as possible
  • Has unrestricted support for unordered content
  • Has unrestricted support for mixed content
  • Has a solid theoretical basis
  • Can make use of a separate datatyping language
    (such as W3C XML Schema Datatypes)

5
RELAX NG tools
  • Jing
  • An open source validator written in Java
  • Sun's MSV
  • Another validator
  • DTDinst
  • Translates from DTDs into RNG (RELAX NG) syntax
    or RNG compact syntax
  • Trang
  • Translates RNG compact syntax into RNG syntax
  • Translates RNG or RNG compact syntax into DTDs
  • Sun's RELAX NG Converter
  • Translates DTDs into RNG syntax (but not well)
  • Translates an XML Schema subset into RNG syntax
    (imperfectly)

6
Basic structure
  • A RELAX NG specification is written in XML, so it
    obeys all XML rules
  • The RELAX NG specification has one root element
  • The document it describes also has one root
    element
  • The root element of the specification is element
  • If the root element of your document is book,
    then the RELAX NG specification begins
  • <element name="book"
    xmlns="http://relaxng.org/ns/structure/1.0">
  • and ends
  • </element>

7
Data elements
  • RELAX NG makes a clear separation between
  • the structure of a document (which it describes)
  • the datatypes used in the document (which it gets
    from somewhere else, such as from XML Schemas)
  • For starters, we will use the two (RELAX
    NG-defined) elements
  • <text> ... </text> (usually written <text/>)
  • Plain character data, not containing other
    elements
  • <empty></empty> (usually written <empty/>)
  • Does not contain anything
  • Other datatypes, such as <double>...</double>, are
    not defined in RELAX NG
  • To inherit datatypes from XML Schemas, use
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
    as an attribute of the root element

8
Defining tags
  • To define a tag (and specify its content), use
      <element name="myElement">
        <!-- Content goes here -->
      </element>
  • Example: The DTD
      <!ELEMENT name (firstName, lastName)>
      <!ELEMENT firstName (#PCDATA)>
      <!ELEMENT lastName (#PCDATA)>
  • Translates to
      <element name="name">
        <element name="firstName"> <text/> </element>
        <element name="lastName"> <text/> </element>
      </element>
  • Note: As in the DTD, the components must occur in
    order
9
RELAX NG describes patterns
  • Your RELAX NG document specifies a pattern that
    matches your valid XML documents
  • For example, the pattern
  • <element name="name">
      <element name="firstName"> <text/> </element>
      <element name="lastName"> <text/> </element>
    </element>
  • Will match the XML
  • <name>
      <firstName>David</firstName>
      <lastName>Matuszek</lastName>
    </name>
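
A counter-example may help here. Assuming the same name pattern as on this slide, the instance below should be rejected, because the children of a sequence must appear in the order the pattern lists them. Like the other examples in these slides, this is a sketch rather than validator output.

```xml
<!-- Does NOT match the pattern: lastName comes before firstName -->
<name>
  <lastName>Matuszek</lastName>
  <firstName>David</firstName>
</name>
```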

10
Easy tags
  • <zeroOrMore> ... </zeroOrMore>
  • The enclosed content occurs zero or more times
  • <oneOrMore> ... </oneOrMore>
  • The enclosed content occurs one or more times
  • <optional> ... </optional>
  • The enclosed content occurs once or not at all
  • <choice> ... </choice>
  • Any one of the enclosed elements may occur
  • <!-- An XML comment - not a container, and may
    not contain two consecutive hyphens -->

11
Example
  • <element name="addressList">
      <zeroOrMore>
        <element name="name">
          <element name="firstName"> <text/> </element>
          <element name="lastName"> <text/> </element>
        </element>
        <element name="address">
          <choice>
            <element name="email"> <text/> </element>
            <element name="USPost"> <text/> </element>
          </choice>
        </element>
      </zeroOrMore>
    </element>
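
To make the addressList pattern concrete, here is a sketch of an instance document that should match it. The addresses and the second person are invented placeholders, not data from the slides. Each repetition is a name followed by an address, and each address holds exactly one of email or USPost. An empty <addressList/> should also match, since the content is zeroOrMore.

```xml
<addressList>
  <name>
    <firstName>David</firstName>
    <lastName>Matuszek</lastName>
  </name>
  <address>
    <!-- first alternative of the choice -->
    <email>someone@example.org</email>
  </address>
  <name>
    <firstName>Jane</firstName>
    <lastName>Doe</lastName>
  </name>
  <address>
    <!-- second alternative of the choice -->
    <USPost>123 Example Street, Anytown</USPost>
  </address>
</addressList>
```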

12
Enumerations
  • The <value>...</value> pattern matches a
    specified value
  • Example:
      <element name="gender">
        <choice>
          <value>male</value>
          <value>female</value>
        </choice>
      </element>
  • The contents of <value> are subject to whitespace
    normalization
  • Leading and trailing whitespace is removed
  • Internal sequences of whitespace characters are
    collapsed to a single blank
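
The effect of whitespace normalization can be illustrated with two separate instance documents, assuming the gender pattern from this slide. The first should match because the surrounding whitespace is removed before comparison. The second should not, because the comparison is still case sensitive.

```xml
<!-- Matches: leading and trailing whitespace is ignored -->
<gender>
  female
</gender>

<!-- Does NOT match: "Female" is not one of the listed values -->
<gender>Female</gender>
```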

13
More about data
  • Remember: To inherit datatypes from XML Schemas,
    add this attribute to the root element:
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
  • You can access the inherited types with the
    <data> tag, for instance, <data type="double"/>
  • The <data> pattern must match the entire content
    of the enclosing tag, not just part of it
  • <element name="illegalUse">
      <!-- Don't do this! -->
      <data type="double"/>
      <element name="moreStuff"> <text/> </element>
    </element>
  • If you don't specify a datatype library, RELAX NG
    defines the following built-in types for you (along
    with <text/> and <empty/>)
  • <data type="string"/>: No whitespace normalization
    is done
  • <data type="token"/>: Whitespace is normalized
    (leading and trailing whitespace is removed, and
    internal runs are collapsed to a single blank)
14
<group>
  • <group>...</group> is used as fat parentheses
  • Example
  • <choice>
      <element name="name"> <text/> </element>
      <group>
        <element name="firstName"> <text/> </element>
        <element name="lastName"> <text/> </element>
      </group>
    </choice>
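
Since a <choice> cannot stand alone in a document, the sketch below wraps the slide's pattern in a hypothetical person element (that name is an assumption, not from the slides). The comments show the two shapes of content it should accept.

```xml
<element name="person" xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <!-- accepts: <person><name>David Matuszek</name></person> -->
    <element name="name"> <text/> </element>
    <!-- accepts: <person><firstName>David</firstName>
                          <lastName>Matuszek</lastName></person> -->
    <group>
      <element name="firstName"> <text/> </element>
      <element name="lastName"> <text/> </element>
    </group>
  </choice>
</element>
```

Without the <group>, the three elements would be three separate alternatives of the choice, and a person could contain only one of them.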

15
Attributes
  • Attributes are defined practically the same way
    as elements
  • <attribute name="attributeName">...</attribute>
  • Example
  • <element name="name">
      <attribute name="title"> <text/> </attribute>
      <element name="firstName"> <text/> </element>
      <element name="lastName"> <text/> </element>
    </element>
  • Matches
  • <name title="Dr.">
      <firstName>David</firstName>
      <lastName>Matuszek</lastName>
    </name>

16
More about attributes
  • With attributes, as with elements, you can use
    <optional>, <choice>, and <group>
  • It doesn't make sense to use <oneOrMore> or
    <zeroOrMore> with attributes
  • In keeping with the usual XML rules,
  • The order in which you list elements is
    significant
  • The order in which you list attributes is not
    significant
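
As a sketch of these combinators applied to attributes, the pattern below makes title optional and requires exactly one of two further attributes. The attribute names id and ref are invented for this illustration.

```xml
<element name="name" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- the title attribute may be omitted -->
  <optional>
    <attribute name="title"> <text/> </attribute>
  </optional>
  <!-- exactly one of these two attributes must be present -->
  <choice>
    <attribute name="id"> <text/> </attribute>
    <attribute name="ref"> <text/> </attribute>
  </choice>
  <element name="firstName"> <text/> </element>
  <element name="lastName"> <text/> </element>
</element>
```

Because attribute order is not significant, both <name title="Dr." id="n1"> and <name id="n1" title="Dr."> should be accepted as start tags.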

17
Still more about attributes
  • <attribute name="attributeName"> <text/>
    </attribute> can be (and usually is)
    abbreviated as <attribute name="attributeName"/>
  • However, <element name="elementName"> <text/>
    </element> cannot be abbreviated as <element
    name="elementName"/>
  • If an element has no attributes and no content,
    you must use <empty/> explicitly
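
The following sketch puts both rules side by side: an abbreviated attribute, and an element that needs an explicit <empty/>. The names document, lang, and pageBreak are assumptions chosen for the example.

```xml
<element name="document" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- abbreviated form: the attribute's content defaults to <text/> -->
  <attribute name="lang"/>
  <element name="title"> <text/> </element>
  <zeroOrMore>
    <!-- no attributes and no content, so <empty/> is written out -->
    <element name="pageBreak"> <empty/> </element>
  </zeroOrMore>
</element>
```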

18
<list>
  • <list> pattern </list> matches a
    whitespace-separated list of tokens, and applies
    the pattern to those tokens
  • Example:
      <!-- A floating-point number and some integers -->
      <element name="vector">
        <list>
          <data type="float"/>
          <oneOrMore>
            <data type="int"/>
          </oneOrMore>
        </list>
      </element>
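
Two separate instance documents illustrate the vector pattern, assuming the schema's root declares the W3C XML Schema datatype library so that float and int are available. The first should match. The second should not, because at least one integer must follow the float.

```xml
<!-- Matches: one float, then one or more ints, separated by whitespace -->
<vector>2.5 1 2 3</vector>

<!-- Does NOT match: the required integers are missing -->
<vector>2.5</vector>
```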

19
<interleave>
  • <interleave> ... </interleave> allows the
    contained elements to occur in any order
  • <interleave> is more sophisticated than you might
    expect
  • If a contained element can occur more than once,
    the various instances do not need to occur
    together

20
Interleave example
  • <element name="contactInformation">
      <interleave>
        <zeroOrMore>
          <element name="phone"> <text/> </element>
        </zeroOrMore>
        <oneOrMore>
          <element name="email"> <text/> </element>
        </oneOrMore>
      </interleave>
    </element>
  • <contactInformation>
      <email>dave_at_acm.org</email>
      <phone>215-898-8122</phone>
      <email>matuszek_at_central.cis.upenn.edu</email>
    </contactInformation>

21
<mixed>
  • <mixed> allows mixed content, that is, both text
    and patterns
  • If pattern is a RELAX NG pattern, then <mixed>
    pattern </mixed> is shorthand for <interleave>
    <text/> pattern </interleave>

22
Example of <mixed>
  • Pattern
  • <element name="words">
      <mixed>
        <zeroOrMore>
          <choice>
            <element name="bold"> <text/> </element>
            <element name="italic"> <text/> </element>
          </choice>
        </zeroOrMore>
      </mixed>
    </element>
  • Matches
  • <words>This is <italic>not</italic> a
    <bold>great</bold> example, <italic>but</italic>
    it should suffice.</words>
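
Applying the shorthand rule from the previous slide, the words pattern can be written out in its expanded form. This sketch should accept the same documents as the <mixed> version.

```xml
<element name="words" xmlns="http://relaxng.org/ns/structure/1.0">
  <interleave>
    <!-- text may appear anywhere between the child elements -->
    <text/>
    <zeroOrMore>
      <choice>
        <element name="bold"> <text/> </element>
        <element name="italic"> <text/> </element>
      </choice>
    </zeroOrMore>
  </interleave>
</element>
```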

23
The need for named patterns
  • So far, we have defined elements exactly at the
    point that they can be used
  • There is no equivalent of
  • <!ELEMENT person (name)>
    <!ELEMENT name (firstName, lastName)>
    ...use person several places in the DTD...
  • With the RELAX NG we have discussed so far, each
    time we want to include a person, we would need
    to explicitly define both person and name at that
    point
  • <element name="person">
      <element name="name">
        <element name="firstName"> <text/> </element>
        <element name="lastName"> <text/> </element>
      </element>
    </element>
  • The <grammar> element solves this problem

24
Syntax of <grammar>
  • <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <start>
        ...usual RELAX NG elements, which may include
        <ref name="DefinedName"/>
      </start>
      <!-- One or more of the following -->
      <define name="DefinedName">
        ...usual RELAX NG elements, attributes, groups,
        etc.
      </define>
    </grammar>

25
Use of <grammar>
  • To write a <grammar>,
  • Make <grammar> the root element of your
    specification
  • Hence it should say
    xmlns="http://relaxng.org/ns/structure/1.0"
  • Use, as the <start> element, a pattern that
    matches the entire (valid) XML document
  • In each <define> element, write a pattern that
    you want to use in other places in the
    specification
  • Wherever you want to use a defined element,
    put <ref name="NameOfDefinedElement"/>
  • Note that defined elements may be used in
    definitions, not just in the <start> element
  • Definitions may even be recursive, but
  • Recursive references must be in an element, not
    an attribute
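
A recursive definition might look like the sketch below, where a section can contain further sections to any depth. The names Section, section, and title are assumptions for illustration. The self-reference is legal because it sits inside an <element> pattern.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="Section"/>
  </start>
  <define name="Section">
    <element name="section">
      <element name="title"> <text/> </element>
      <zeroOrMore>
        <!-- recursive reference, nested inside <element name="section"> -->
        <ref name="Section"/>
      </zeroOrMore>
    </element>
  </define>
</grammar>
```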

26
Long example of <grammar>
  • <!ELEMENT name (firstName, lastName)>
  • <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <start>
        <ref name="Name"/>
      </start>
      <define name="Name">
        <element name="name">
          <element name="firstName"> <text/> </element>
          <ref name="LastName"/>
        </element>
      </define>
      <define name="LastName">
        <element name="lastName"> <text/> </element>
      </define>
    </grammar>

Note: XML is case sensitive. The defined names (Name,
LastName) are capitalized differently from the element
names (name, lastName).
27
Common usage I
  • A typical way to use RELAX NG is to use a
    <grammar> with just the root element in <start>
    and every element described by a <define>
  • <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <start>
        <ref name="NOVEL"/>
      </start>
      <define name="NOVEL">
        <element name="novel">
          <ref name="TITLE"/>
          <ref name="AUTHOR"/>
          <oneOrMore>
            <ref name="CHAPTER"/>
          </oneOrMore>
        </element>
      </define>
      ...more...

28
Common usage II
  • <define name="TITLE">
      <element name="title"> <text/> </element>
    </define>
  • <define name="AUTHOR">
      <element name="author"> <text/> </element>
    </define>
  • <define name="CHAPTER">
      <element name="chapter">
        <oneOrMore>
          <ref name="PARAGRAPH"/>
        </oneOrMore>
      </element>
    </define>
  • <define name="PARAGRAPH">
      <element name="paragraph"> <text/> </element>
    </define>
  • </grammar>

29
Replacing DTDs
  • With <grammar> and multiple <define>s, we can do
    essentially the same things as a DTD
  • Advantages
  • RELAX NG is more expressive than a DTD: we can
    interleave elements, specify data types, allow
    specific data values, use namespaces, and control
    the mixing of data and patterns
  • RELAX NG is written in XML
  • RELAX NG is relatively easy to understand
  • Disadvantages
  • RELAX NG is extremely verbose
  • But there is a compact syntax that is much
    shorter
  • RELAX NG is not (yet) nearly as well known
  • Hence there are fewer tools to work with it
  • This situation seems to be changing

30
The End
So by this maxim be impressed,
USE THE TOOLS THAT WORK THE BEST.
Do not yield your sovereign judgment,
To any sort of political fudgement.
The criterion of sound design
Should be, must be, your guideline.
And if you're designing documents,
Try RNG. We charge no rents.
-- John Cowan