XML Validation III: Schemas, RELAX NG

Provided by: rbu

1
XML Validation III: Schemas, RELAX NG
  • Robin Burke
  • ECT 360

2
Outline
  • Types
  • Built-in
  • Named
  • Anonymous
  • Type Derivation
  • Schema Organization
  • Break
  • RELAX NG

3
Built-in types
  • Part of the schema language
  • Base types
  • 19 fundamental types
  • Examples string, decimal
  • Derived types
  • 25 more types that use the base types
  • Examples ID, positiveInteger

4
Built-in types, cont'd
5
User-defined types
  • Any use of complexType can be turned into a
    user-defined type
  • usually called "standalone"
  • Simple types can be derived from the built-in
    types

6
Standalone types
  • A type can stand outside of an element definition
  • must have a name
  • <xs:complexType name="bar-n-baz">
  •   <xs:sequence>
  •     <xs:element ref="bar" />
  •     <xs:element ref="baz" />
  •   </xs:sequence>
  • </xs:complexType>
  • Used in element definition
  • <xs:element name="foo" type="bar-n-baz" />

7
Mixed content
  • Can specify that an element has mixed content
  • <xs:complexType name="bar-n-baz" mixed="true">
  •   <xs:sequence>
  •     <xs:element ref="bar" />
  •     <xs:element ref="baz" />
  •   </xs:sequence>
  • </xs:complexType>

8
Mixed content, cont'd
  • Schema cannot control where the text appears
  • If this is legal
  • <foo>text here <bar>thud</bar><baz>grunt</baz></foo>
  • So is this
  • <foo><bar>thud</bar>more text<baz>grunt</baz>still more</foo>
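
The point about text placement can be checked outside a schema validator. The sketch below uses only Python's standard library: it parses the two documents from this slide and shows that the child elements (what the sequence constrains) are identical, while the loose text sits in different places. The helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The two instance documents from the slide: same children, text in different places.
DOC_A = "<foo>text here <bar>thud</bar><baz>grunt</baz></foo>"
DOC_B = "<foo><bar>thud</bar>more text<baz>grunt</baz>still more</foo>"


def child_names(xml_text):
    """Return the child element names in document order (what the sequence constrains)."""
    return [child.tag for child in ET.fromstring(xml_text)]


def loose_text(xml_text):
    """Return the non-empty character data sitting directly inside the root element."""
    root = ET.fromstring(xml_text)
    pieces = [root.text] + [child.tail for child in root]
    return [p for p in pieces if p and p.strip()]


if __name__ == "__main__":
    print(child_names(DOC_A), loose_text(DOC_A))
    print(child_names(DOC_B), loose_text(DOC_B))
```

Both documents have the same element sequence, which is all a mixed-content type constrains.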

9
Deriving types
  • DTDs do not allow type restrictions
  • beyond enumeration, CDATA, token
  • for attributes
  • #PCDATA
  • for content
  • Schemas have built-in types
  • also capability to create your own

10
Derivation operations
  • list
  • sequence of values
  • union
  • combine two types
  • allowing either
  • restriction
  • placing limits on the legal values

11
List
  • <xs:element name="partList">
  •   <xs:simpleType>
  •     <xs:list itemType="partNo" />
  •   </xs:simpleType>
  • </xs:element>
  • <partList>PN334-04 PN223-89 PQ1112-03</partList>
  • Items must be separated by whitespace
  • probably more useful to do this with document
    structure
  • partList -> partNo
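
A rough Python sketch of what a list type means for a processor: the value is split on whitespace and every item must be valid for the item type. The slides do not define partNo, so the PART_NO pattern here is a hypothetical stand-in guessed from the three sample values.

```python
import re

# Hypothetical stand-in for the partNo item type; the slides do not define it.
PART_NO = re.compile(r"[A-Z]{2}\d+-\d{2}")


def parse_part_list(value):
    """Split a list-typed value on whitespace and check every item against PART_NO."""
    items = value.split()  # no-arg split() treats any run of whitespace as one separator
    bad = [item for item in items if not PART_NO.fullmatch(item)]
    if bad:
        raise ValueError(f"not valid part numbers: {bad}")
    return items
```

One consequence of splitting on whitespace is that an item can never contain a space itself, which is one reason separate partNo elements are often the better design.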

12
Union
  • Allows data of either type to be used
  • Example
  • <xs:simpleType name="partNumberField">
  •   <xs:union memberTypes="partNumberType noPartNum" />
  • </xs:simpleType>
  • Database situation
  • null is a possible value
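
The union idea can be sketched in Python as "try each member type in order and accept the value if any of them does". Neither partNumberType nor noPartNum is defined in the slides, so both patterns below are assumptions made only for illustration.

```python
import re

# Both member types are hypothetical stand-ins; the slides name them but do not define them.
MEMBER_TYPES = {
    "partNumberType": re.compile(r"[A-Z]{2}\d+-\d{2}"),
    "noPartNum": re.compile(r"none"),
}


def matching_member(value):
    """Return the name of the first member type that accepts value, or None."""
    for name, pattern in MEMBER_TYPES.items():
        if pattern.fullmatch(value):
            return name
    return None
```

A value such as "none" plays the role of the database null: it is not a part number, but the union still accepts it.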

13
Restriction
  • Most useful
  • Allows the designer to state exactly what values are
    legal
  • prices must be non-negative
  • SSN must follow a certain pattern
  • in-stock must be yes or no
  • etc.

14
Restriction, cont'd
  • Restrict a base type
  • according to "facets"
  • Different facets available for different data
    types

15
Facets
16
Example enumeration
  • <xs:simpleType name="grade">
  •   <xs:restriction base="xs:string">
  •     <xs:enumeration value="A"/>
  •     <xs:enumeration value="B"/>
  •     <xs:enumeration value="C"/>
  •     <xs:enumeration value="D"/>
  •     <xs:enumeration value="F"/>
  •     <xs:enumeration value="I"/>
  •   </xs:restriction>
  • </xs:simpleType>

17
Example numeric
  • <xs:simpleType name="drinkingAge">
  •   <xs:restriction base="xs:positiveInteger">
  •     <xs:minInclusive value="21"/>
  •   </xs:restriction>
  • </xs:simpleType>

18
Example pattern
  • Regular expressions again
  • derived from Perl
  • <xs:simpleType>
  •   <xs:restriction base="xs:string">
  •     <xs:pattern value="([A-DFI])(\+|\-)?" />
  •   </xs:restriction>
  • </xs:simpleType>
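
A quick way to get a feel for a pattern facet is to try it in an ordinary regex engine. The sketch below assumes the intended pattern is a letter grade (A to D, F, or I) with an optional plus or minus; the transcript of the slide is damaged at this point, so that reading is a best guess. Python's regex dialect is not identical to the schema one, but this particular pattern behaves the same in both.

```python
import re

# Assumed reading of the slide's pattern: a letter grade with an optional plus or minus.
GRADE_PATTERN = r"([A-DFI])(\+|\-)?"


def satisfies_pattern(value, pattern=GRADE_PATTERN):
    """Pattern facets must match the whole value, so use fullmatch rather than search."""
    return re.fullmatch(pattern, value) is not None
```

Note the use of fullmatch: a schema pattern is implicitly anchored at both ends, so "AB" or "xA-" do not pass just because part of them matches.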

19
Inheritance
  • facet restrictions are inherited
  • new type derivations must honor them
  • but can restrict them further
  • but new derivations can alter other facets
  • For example
  • monetary type
  • fractionDigits facet = 2
  • loan amount type
  • derived from monetary type
  • maxInclusive = 100000
  • car loan amount
  • derived from loan amount type
  • maxInclusive = 30000
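
The monetary / loan / car loan chain can be modelled in a few lines of Python to show how facets accumulate. This is only a sketch of the idea, not how a validator is implemented: derive and is_valid are hypothetical helpers, only two facets are handled, and the upper bound is treated as an inclusive maximum.

```python
from decimal import Decimal


def derive(base, **new_facets):
    """Derive a type: inherit the base facets, then add or tighten facets."""
    facets = dict(base)
    for name, value in new_facets.items():
        # An inherited upper bound may be lowered but never raised.
        if name == "maxInclusive" and name in facets and value > facets[name]:
            raise ValueError("a derived type cannot loosen an inherited maxInclusive")
        facets[name] = value
    return facets


def is_valid(value, facets):
    """Check a decimal string against the fractionDigits and maxInclusive facets."""
    number = Decimal(value)
    # normalize() drops trailing zeros, so "25000.50" counts as one fraction digit.
    digits_after_point = max(0, -number.normalize().as_tuple().exponent)
    if digits_after_point > facets.get("fractionDigits", digits_after_point):
        return False
    return number <= facets.get("maxInclusive", number)


monetary = {"fractionDigits": 2}
loan_amount = derive(monetary, maxInclusive=Decimal("100000"))
car_loan_amount = derive(loan_amount, maxInclusive=Decimal("30000"))
```

The car loan type never mentions fractionDigits, yet it still rejects a value with three decimal places because the facet is inherited from the monetary type two levels up.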

20
Fixed Facets
  • Possible to prevent users from changing a certain
    facet in any way
  • fixed="true" in facet declaration
  • similar to "final" keyword in Java
  • Example
  • <simpleType name="atLeastOneHundred">
  •   <restriction base="integer">
  •     <minInclusive value="100" fixed="true" />
  •   </restriction>
  • </simpleType>
  • minInclusive cannot be changed when inherited
  • lower would be illegal anyway
  • the "fixed" attribute means it cannot be altered
    upward

21
Complex Types
  • (not discussed in book)
  • Possible to derive from complex types
  • i.e. elements
  • Use complexContent
  • Possibilities
  • extension
  • restriction
  • elements
  • attributes

22
Complex Type Extension
  • can add elements to existing complex type
  • only at the end

23
Complex Type Restriction
  • Adding additional attributes
  • Odd syntax
  • entire element definition must be repeated
  • Not much benefit to inheritance
  • validation checks for consistency with supertype

24
Example
  • grades schema

25
Schema design
  • Questions to ask
  • what kind of document?
  • narrative
  • data-centric
  • what kind of processing?
  • web page output
  • complex queries

26
Document modeling
  • Get examples
  • Get style guides / rules
  • For each data element
  • ask how many
  • ask what legal values
  • ask about sub-parts
  • ask about exceptions

27
Design decisions
  • Attribute vs element
  • Level of granularity
  • Naming
  • Schema structure

28
Attribute vs element
  • Some specific rules
  • ID must be attribute
  • General principle
  • data vs metadata
  • Element
  • for document content
  • Attribute
  • for information about content
  • Not always easy to tell!

29
Element
  • Consists of document content
  • Will be shown to a human user
  • Contains substructure
  • Sequence may be important
  • Could be very long
  • Presence depends on other values

30
Attribute
  • (Opposite of above)
  • Must be from an enumeration of values
  • Also
  • consistency

31
Level of granularity
  • How detailed to model the data?
  • Very detailed
  • more work to markup
  • more detail in expressing the schema
  • exceptions must be handled
  • Less detailed
  • easier to mark up
  • easier to schematize
  • document contents less accessible

32
Element content granularity
  • Fine grained model
  • salutation, first name, middle name, last name,
    appellation
  • Coarse grained model
  • name
  • Tradeoff
  • search / sort / organize
  • document creation

33
Levels vs recursion
  • Named levels
  • <chapter>
  •   <section>
  •     <subsection>
  •       <subsubsection>
  • Recursion
  • <section>
  •   <section>
  •     <section>
  •       <section>
  • Tradeoff
  • ability to rearrange
  • transparency of markup

34
Naming
  • Case convention
  • uppercase is bad
  • lowercase better
  • Multiple words
  • CapCase
  • camelCase
  • Underline_Convention

35
Structure
  • Nested
  • "russian doll"
  • schema looks like the document
  • small schema only
  • Flat
  • elements defined at global level
  • references used in complex type definitions
  • Type-based
  • "venetian blind"
  • all schema complexity in type definitions
  • one global element

36
Break
37
RELAX NG
  • XML Schemas are big
  • a lot of the page consists of
  • < > /
  • repeated element names
  • RELAX NG
  • created as an alternate validation language
  • compact, non-XML syntax
  • also XML syntax

38
Example
  • element grades {
  •   element grade {
  •     element student { text },
  •     element assigned-grade { text }
  •   }
  • }
  • Equivalent to
  • <!ELEMENT grades (grade)>
  • <!ELEMENT grade (student, assigned-grade)>
  • <!ELEMENT student (#PCDATA)>
  • <!ELEMENT assigned-grade (#PCDATA)>

39
Attributes
  • element grades {
  •   element grade {
  •     element student {
  •       text,
  •       attribute id { text }
  •     },
  •     element assigned-grade { text },
  •     attribute assignment { text }
  •   }
  • }

40
Types
  • instead of text
  • use appropriate built-in data type
  • attribute age { xsd:positiveInteger }
  • facets
  • qualify with name / value pair
  • attribute drinkingAge {
  •   xsd:positiveInteger { minInclusive = "21" }
  • }

41
What does this one say?
  • element grade {
  •   element student { .... },
  •   ( element assigned-grade {
  •       xsd:string { pattern = "([A-D](\+|\-)?|F)" } }
  •   | ( element assigned-grade { "I" },
  •       element reason { text } ) )
  • }
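
One way to answer the question is to write out the rule as ordinary code. The Python sketch below assumes the schema is a choice between two alternatives: a letter grade (A to D with optional plus or minus, or a bare F) with no reason, or the grade "I" followed by a reason element. That reading is an interpretation of the slide, and the checker ignores the content of student, so it is a paraphrase of the rule and not a RELAX NG validator.

```python
import re
import xml.etree.ElementTree as ET

# Assumed reading of the slide's pattern: A-D with optional +/-, or a bare F.
LETTER_GRADE = re.compile(r"([A-D](\+|\-)?|F)")


def grade_is_valid(xml_text):
    """Check one <grade> element against the two alternatives in the schema."""
    children = list(ET.fromstring(xml_text))
    names = [child.tag for child in children]
    if names == ["student", "assigned-grade"]:
        # First alternative: an ordinary letter grade, no reason allowed.
        return LETTER_GRADE.fullmatch(children[1].text or "") is not None
    if names == ["student", "assigned-grade", "reason"]:
        # Second alternative: an incomplete, which must come with a reason.
        return (children[1].text or "") == "I"
    return False
```

The interesting part is the dependency between elements: whether reason may appear depends on the value of assigned-grade, which is the kind of rule the summary slide has in mind when it calls RELAX NG more expressive than Schema.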

42
The point
  • A schema language has two purposes
  • lets the language designer state a design
  • lets the system validate documents against that
    design
  • Any language that serves these purposes can be used

43
Validation languages
  • DTD
  • SGML holdover
  • ugly
  • fairly simple to express
  • Schema
  • complete
  • extensible
  • baroque
  • unreadable
  • RELAX NG
  • readable
  • esp. compact syntax
  • more expressive than Schema
  • fewer tools

44
Next week
  • Presentations