Processing of structured documents

1
Processing of structured documents
  • Spring 2002, Part 2
  • Helena Ahonen-Myka

2
XML Namespaces
  • An XML document may contain multiple markup
    vocabularies
  • reuse of existing markup, e.g. including HTML
    markup in some document type
  • An XML namespace is a collection of names,
    identified by a URI reference, which are used in
    XML documents as element types and attribute names

3
Author A writes a document
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
</references>

4
Author B adds some rating.
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <rating>5 stars</rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <rating>3 stars</rating>
</references>

5
Also Author C wants to add some rating...
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <rating>G</rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <rating>PG</rating>
</references>

6
Author D would like to combine the documents...
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <rating>5 stars</rating>
  <rating>G</rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <rating>3 stars</rating>
  <rating>PG</rating>
</references>

7
Which rating? -> different names
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <qa-rating>5 stars</qa-rating>
  <pa-rating>G</pa-rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <qa-rating>3 stars</qa-rating>
  <pa-rating>PG</pa-rating>
</references>

8
Namespaces give a disciplined method for naming
<?xml version="1.0"?>
<references
    xmlns:qa="http://joker.com/2000/star-rating"
    xmlns:pa="http://penguin.xmli.com/2000/review"
    xmlns="http://pineapplesoft.com/1999/ref">
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <qa:rating>5 stars</qa:rating>
  <pa:rating>G</pa:rating>
  ...
</references>

9
Namespaces
  • xmlns:qa="http://joker.com/2000/star-rating"
  • qa: the prefix
  • http://joker.com/2000/star-rating: the namespace
  • a unique name (the URI guarantees this); no need
    to retrieve anything from the address
  • xmlns="http://pineapplesoft.com/1999/ref"
  • the default namespace
  • elements without prefixes belong to this
    namespace:
  • references, name, link
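
As a minimal sketch of how a namespace-aware parser sees these declarations, the following uses Python's standard xml.etree.ElementTree module (an assumption: the slides do not name any tool). The document is a shortened version of the example with the qa, pa and default namespaces. ElementTree reports every element name in the form {namespace}local-name, so the prefix itself disappears and the unprefixed elements end up in the default namespace.

```python
import xml.etree.ElementTree as ET

# Shortened version of the example document; the elided part is left out.
DOC = """<?xml version="1.0"?>
<references xmlns:qa="http://joker.com/2000/star-rating"
            xmlns:pa="http://penguin.xmli.com/2000/review"
            xmlns="http://pineapplesoft.com/1999/ref">
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <qa:rating>5 stars</qa:rating>
  <pa:rating>G</pa:rating>
</references>"""

root = ET.fromstring(DOC)

# Expanded names: each prefix is replaced by its namespace URI.
expanded_names = [child.tag for child in root]
for name in expanded_names:
    print(name)

# An unprefixed attribute is in no namespace, even with a default declared.
link_attributes = root[1].attrib
print(link_attributes)
```

The prefix is only a local abbreviation: two documents that bind different prefixes to the same URI produce the same expanded names.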

10
Namespaces
  • qa:rating
  • a qualified name (QName)
  • scoping
  • The namespace is valid for the element where it
    is declared and all the elements within its
    content

11
Scoping
<?xml version="1.0"?>
<ref:references xmlns:ref="http://pineapplesoft.com/1999/ref">
  <ref:name>Macmillan</ref:name>
  <ref:link href="http://www.mcp.com"/>
  <pa:rating xmlns:pa="http://penguin.xmli.com/2000/review">G</pa:rating>
  <ref:name>ABC News</ref:name>
  <ref:link href="http://www.abcnews.com"/>
  <qa:rating xmlns:qa="http://joker.com/2000/star-rating">3 stars</qa:rating>
</ref:references>
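
The scoping rule can be checked with a small experiment. This is a sketch only, again assuming Python's standard xml.etree.ElementTree as the parser; the helper name is hypothetical. A prefix used outside the element that declares it cannot be resolved, and the parser rejects the document.

```python
import xml.etree.ElementTree as ET

REF = "http://pineapplesoft.com/1999/ref"
PA = "http://penguin.xmli.com/2000/review"

# pa is declared on the rating element itself, so it is in scope there.
in_scope = f"""<ref:references xmlns:ref="{REF}">
  <pa:rating xmlns:pa="{PA}">G</pa:rating>
</ref:references>"""

# The second pa:rating lies outside the element that declares pa.
out_of_scope = f"""<ref:references xmlns:ref="{REF}">
  <pa:rating xmlns:pa="{PA}">G</pa:rating>
  <pa:rating>PG</pa:rating>
</ref:references>"""

def prefixes_resolve(text):
    """Return True if the parser can resolve every prefix in text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(prefixes_resolve(in_scope))      # True
print(prefixes_resolve(out_of_scope))  # False
```
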
12
Namespaces and DTD
  • XML 1.0 DTDs are not namespace-aware
  • all the elements and attributes that are in some
    namespace have to be declared using the
    corresponding prefix
  • for elements with prefix pre, an attribute
    xmlns:pre has to be declared

13
Namespaces and DTD
<?xml version="1.0"?>
<!DOCTYPE ref:references [
<!ELEMENT ref:references
          (ref:name, ref:link, (pa:rating | qa:rating))+>
<!ATTLIST ref:references xmlns:ref CDATA #REQUIRED>
<!ELEMENT ref:name (#PCDATA)>
<!ELEMENT ref:link EMPTY>
<!ATTLIST ref:link href CDATA #REQUIRED>
<!ELEMENT pa:rating (#PCDATA)>
<!ATTLIST pa:rating xmlns:pa CDATA #REQUIRED>
<!ELEMENT qa:rating (#PCDATA)>
<!ATTLIST qa:rating xmlns:qa CDATA #REQUIRED>
]>

14
DTD: external and internal subsets
  • the external and internal subsets together make
    up the DTD; the internal subset has higher
    precedence
  • syntax:
    <!DOCTYPE root-type-name SYSTEM "ex.dtd"
      <!-- external subset in file ex.dtd -->
    [
      <!-- internal subset may come here -->
    ]>
  • the internal subset may declare new elements
    (with attributes) or new attributes for existing
    elements
  • namespaces facilitate the control of name
    conflicts

15
Namespaces and XML Schema
  • An XML Schema document contains declarations of
    namespaces that are used in the document
  • e.g. xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    for the elements with special XML Schema
    semantics
  • Target namespace: the definitions included in
    this schema define the names in this namespace
  • targetNamespace="uri:mywork"

16
Namespaces and XML Schema
  • In XML Schema, schema components from different
    target namespaces can be used together
  • -> enables the schema validation of instance
    content defined across multiple namespaces

17
XML Information set
  • An XML document's information set consists of a
    number of information items
  • an information item is an abstract description of
    some part of an XML document
  • mainly to be used in other specifications
  • each information item has a set of associated
    named properties

18
XML Information set
  • Tree structure provided by the processor (no
    special interface is specified)
  • e.g. entities expanded to their replacement text,
    attributes with their default values
  • properties: e.g. for each element, its child
    elements and attributes

19
Information items
  • document information item
  • element information items
  • attribute information items
  • processing instruction information items
  • unexpanded entity reference information items
  • character information items

20
Information items (cont.)
  • comment information items
  • document type declaration information item
  • unparsed entity information items
  • notation information items
  • namespace information items

21
Example: document information item
  • There is exactly one document information item in
    the information set
  • all information items are accessible from the
    properties of the document information item,
    either directly or indirectly through the
    properties of other information items

22
Example: document information item
  • Properties
  • children
  • document element
  • notations
  • unparsed entities
  • base URI
  • character encoding scheme
  • standalone
  • version
  • all declarations processed

23
Example: element information items
  • There is an element information item for each
    element appearing in the XML document
  • one of the element information items is the value
    of the document element property of the document
    information item (root element)
  • all other element information items are
    accessible recursively

24
Example: element information items
  • An element information item has the following
    properties
  • namespace name
  • local name
  • prefix
  • children
  • attributes
  • namespace attributes
  • in-scope namespaces
  • base URI
  • parent

25
Example
<?xml version="1.0"?>
<msg:message doc:date="19990421"
    xmlns:doc="http://doc.example.org/namespaces/doc"
    xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>

26
The information set for the sample document
  • A document information item
  • an element information item with namespace name
    http://message.example.org/, local part
    message, and prefix msg

27
The information set for the sample document
(cont.)
  • an attribute information item with the namespace
    name http://doc.example.org/namespaces/doc,
    local part date, prefix doc, and normalized
    value 19990421
  • three namespace information items for the
    http://www.w3.org/XML/1998/namespace,
    http://doc.example.org/namespaces/doc, and
    http://message.example.org/ namespaces

28
The information set for the sample document
(ctnd.)
  • Two attribute information items for the namespace
    attributes
  • eleven character information items for the
    character data
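
Several of these properties can be inspected directly. The sketch below assumes Python's standard xml.dom.minidom, which is a DOM implementation and not an Infoset API, so the mapping from DOM attributes to information item properties is only approximate. It parses the sample document and reads the namespace name, local part and prefix of the element and of the doc:date attribute, the two namespace attributes, and the character data.

```python
from xml.dom import minidom

DOC = ('<?xml version="1.0"?>'
       '<msg:message doc:date="19990421" '
       'xmlns:doc="http://doc.example.org/namespaces/doc" '
       'xmlns:msg="http://message.example.org/"'
       '>Phone home!</msg:message>')

document = minidom.parseString(DOC)
element = document.documentElement

# Element item: namespace name, local part, prefix.
print(element.namespaceURI, element.localName, element.prefix)

# Attribute item doc:date, looked up by namespace name and local part.
date = element.getAttributeNodeNS(
    "http://doc.example.org/namespaces/doc", "date")
print(date.prefix, date.value)

# The namespace declarations are reported as attributes as well.
declarations = sorted(
    element.attributes.item(i).name
    for i in range(element.attributes.length)
    if element.attributes.item(i).name.startswith("xmlns"))
print(declarations)

# Character items: one per character of the element content.
text = "".join(node.data for node in element.childNodes)
print(len(text))
```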

29
XML 1.0 reporting requirements
  • For instance
  • an XML processor must always provide all
    characters in a document that are not part of
    markup to the application
  • a validating XML processor must inform the
    application which of the character data in a
    document is white space appearing within element
    content
  • an XML processor must normalize line-ends to LF
    before passing them to the application

30
XML 1.0 reporting requirements (ctnd.)
  • A validating XML processor must include the
    replacement text of an entity in place of an
    entity reference
  • an XML processor must supply the default value of
    attributes declared in the DTD for a given
    element type but not appearing in the element's
    start tag

31
What is not in the information set?
  • For instance,
  • the document type name
  • the difference between the two forms of an empty
    element: <foo/> and <foo></foo>
  • the order of attributes within a start-tag
  • white space within start-tags (other than
    significant white space in attribute values) and
    end-tags
  • the difference between CR, CR-LF, and LF line
    termination
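
A parser that reports roughly the information set cannot tell these variants apart. The following sketch assumes Python's standard xml.etree.ElementTree; the view helper is hypothetical and reduces a one-element document to what the application receives (name, attributes as an unordered mapping, character data).

```python
import xml.etree.ElementTree as ET

def view(text):
    """Name, attributes (unordered) and character data of the root element."""
    element = ET.fromstring(text)
    return (element.tag, dict(element.attrib), element.text or "")

# The two forms of an empty element.
same_empty = view("<foo/>") == view("<foo></foo>")

# The order of attributes within a start-tag.
same_order = view('<foo a="1" b="2"/>') == view('<foo b="2" a="1"/>')

# White space within start-tags and end-tags.
same_space = view('<foo a="1"></foo>') == view('<foo   a = "1"  ></foo  >')

# CR-LF and CR are both reported as LF.
line_ends = view("<foo>a\r\nb\rc</foo>")[2]

print(same_empty, same_order, same_space, repr(line_ends))
```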

32
XML Schema
  • DTDs have drawbacks
  • they can only define the element structure and
    attributes
  • they cannot define any database-like constraints
    for elements
  • Value (min, max, etc.)
  • Type (integer, string, etc.)
  • DTDs are not written in XML and cannot thus be
    processed with the same tools as XML documents,
    XSL(T), etc.
  • difficult to combine different vocabularies
    (namespaces)
  • XML Schemas
  • are written in XML
  • avoid most of the DTD drawbacks

33
XML Schema
  • XML Schema Part 1: Structures
  • Element structure definition as with DTDs:
    elements and attributes; also enhanced ways to
    control structures
  • XML Schema Part 2: Datatypes
  • Primitive datatypes (string, boolean, float,
    etc.)
  • Datatypes derived from primitive datatypes (time,
    recurringDate)
  • Constraining facets for each datatype (minLength,
    maxLength, pattern, precision, etc.)
  • The following is based on
  • XML Schema Part 0: Primer (2.5.2001)

34
Reminder: DTD declarations
  • <!ELEMENT name (fname, lname)>
  • <!ELEMENT address (name, street, ((city, state,
    zipcode) | (zipcode, city)))>
  • <!ELEMENT contact
    (address, phone, email?)>
  • <!ELEMENT fname (#PCDATA)>

35
A sample document
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>

36
Continues...
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <city>Old Town</city>
    <state>PA</state>
    <zip>95819</zip>
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>

37
continues
  <items>
    <item partNum="872-AA">
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item partNum="926-AA">
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>
  </items>
</purchaseOrder>

38
DTD
<!ELEMENT purchaseOrder (shipTo, billTo, comment?, items)>
<!ATTLIST purchaseOrder orderDate CDATA #REQUIRED>
<!ELEMENT shipTo (name, street, city, state, zip)>
<!ATTLIST shipTo country CDATA #REQUIRED>
<!ELEMENT billTo (name, street, city, state, zip)>
<!ATTLIST billTo country CDATA #REQUIRED>
<!ELEMENT comment (#PCDATA)>
<!ELEMENT items (item*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT street (#PCDATA)>

39
DTD continues
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT item (productName, quantity, USPrice,
                (comment | shipDate))>
<!ATTLIST item partNum CDATA #REQUIRED>
<!ELEMENT productName (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT USPrice (#PCDATA)>
<!ELEMENT shipDate (#PCDATA)>

40
Complex and simple types
  • Schema defines types for elements and attributes
  • complex types allow elements in their content
    and may have attributes
  • simple types cannot have element content and
    cannot have attributes
  • elements can have complex or simple types,
    attributes can have simple types

41
XML Schema structure
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation> </xsd:annotation>
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence> </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>
</xsd:schema>

42
USAddress type
<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city" type="xsd:string"/>
    <xsd:element name="state" type="xsd:string"/>
    <xsd:element name="zip" type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>

43
PurchaseOrderType
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

44
Shared types, references
  • element declarations for shipTo and billTo
    associate different element names with the same
    complex type
  • attribute declarations must reference simple
    types
  • element comment declared on the top level of the
    schema (here reference only)

45
Occurrence constraints
  • minOccurs, maxOccurs (both default to 1)
  • minOccurs: minimum number of times an element may
    appear
  • an element is optional if minOccurs="0"
  • maxOccurs: maximum number of times an element may
    appear
  • attributes may appear once or not at all

46
Attributes: use, default and fixed (in attribute
declarations)
  • The attribute use is used in an attribute
    declaration to indicate whether the attribute is
    required, optional or prohibited
  • a default value may be provided if use is set to
    optional
  • if the instance does not give a value, the
    default is used

47
Attributes: use, default and fixed (in attribute
declarations)
  • The attribute fixed
  • the value of the attribute is the value of fixed

<xsd:attribute name="temp1" type="xsd:decimal"
               use="optional" default="37"/>
<xsd:attribute name="temp2" type="xsd:decimal"
               use="optional" fixed="37"/>
<xsd:attribute name="temp2" type="xsd:decimal"
               use="required" fixed="37"/>

48
Items
<xsd:complexType name="Items">
  <xsd:sequence>
    <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="productName" type="xsd:string"/>
          <xsd:element name="quantity">
            <xsd:simpleType>
              <xsd:restriction base="xsd:positiveInteger">
                <xsd:maxExclusive value="100"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="USPrice" type="xsd:decimal"/>
          <xsd:element ref="comment" minOccurs="0"/>
          <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
        </xsd:sequence>
        <xsd:attribute name="partNum" type="Sku" use="required"/>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

49
Anonymous type definitions
  • Schemas can be constructed by defining sets of
    named types such as PurchaseOrderType on the top
    level and then declaring elements such as
    purchaseOrder
  • if a type is used only once, it is more compactly
    defined as an anonymous type

50
Anonymous type definitions
  • You can define anonymous types by the lack of
    type in an element declaration and by the
    presence of an unnamed (simple or complex) type
    definition following the element name
  • see the Items type definition

51
Global elements and attributes
  • Global elements and attributes have declarations
    that appear as the children of the schema element
  • global elements and attributes can be referenced
    in one or more declarations using the ref
    attribute

52
Global elements and attributes
  • global elements can appear in the instance
    document in the place where they have been
    referenced, or at the top level of the document
  • global declarations cannot contain references
  • global declarations cannot contain occurrence
    constraints

53
Simple types
  • Built-in types
  • e.g. string, integer, positiveInteger, decimal,
    float, boolean, time, date, recurringDay,
    uriReference, language, ID, IDREF
  • must have XML Schema namespace prefix
  • derived types
  • derived from built-in and other derived types
  • by defining restrictions to the base type
  • each base type has a set of facets that can be
    used for restrictions

54
Facets
  • XML Schema defines 15 facets
  • e.g. string has facets length, minLength,
    maxLength, pattern, enumeration
  • e.g. integer has facets pattern, enumeration,
    maxInclusive, maxExclusive, minInclusive,
    minExclusive, precision, scale

55
Defining a new type of integer
  • New type whose range of values is between 10000
    and 99999

<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType>

56
Patterns
<xsd:simpleType name="Sku">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{3}-[A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>
  • three digits followed by a hyphen followed by
    two upper-case ASCII letters
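
The effect of the pattern facet can be tried out with an ordinary regular expression engine. This is a sketch in Python, not a schema validator: it assumes the pattern is \d{3}-[A-Z]{2} as described in the bullet above, and the function name is hypothetical. XML Schema patterns must match the whole value, which corresponds to re.fullmatch in Python.

```python
import re

# Three digits, a hyphen, two upper-case ASCII letters.
SKU_PATTERN = re.compile(r"\d{3}-[A-Z]{2}")

def is_sku(value):
    """True if the whole value matches the Sku pattern."""
    return SKU_PATTERN.fullmatch(value) is not None

for candidate in ["872-AA", "926-AA", "872-AAA", "87-AA", "872-aa"]:
    print(candidate, is_sku(candidate))
```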

57
Enumeration facet
  • Limits values to a set of distinct values

<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on -->
  </xsd:restriction>
</xsd:simpleType>

58
List types
  • List types consist of sequences of values of a
    simple type

<xsd:element name="listOfMyInt" type="listOfMyIntType"/>
<xsd:simpleType name="listOfMyIntType">
  <xsd:list itemType="myInteger"/>
</xsd:simpleType>

Instance:
<listOfMyInt>20003 15037 95977 95945</listOfMyInt>

59
Union types
  • Type can be chosen from a set

<xsd:element name="zips" type="zipUnion"/>
<xsd:simpleType name="zipUnion">
  <xsd:union memberTypes="USState listOfMyIntType"/>
</xsd:simpleType>

<zips>CA</zips>
<zips>95630 95977 95945</zips>
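
How restriction, list and union types combine can be mimicked in a few lines of plain Python. This is only an illustration of the value spaces, not schema validation: the function names are hypothetical, and US_STATES is an assumed excerpt of the USState enumeration (the slides list AK, AL, AR "and so on").

```python
import re

# Assumed excerpt of the USState enumeration.
US_STATES = {"AK", "AL", "AR", "CA", "PA"}

def is_my_integer(token):
    """myInteger: an integer between 10000 and 99999 inclusive."""
    if re.fullmatch(r"[+-]?[0-9]+", token) is None:
        return False
    return 10000 <= int(token) <= 99999

def is_list_of_my_int(value):
    """listOfMyIntType: white-space-separated myInteger items."""
    return all(is_my_integer(token) for token in value.split())

def is_zip_union(value):
    """zipUnion: the value belongs to at least one member type."""
    return value in US_STATES or is_list_of_my_int(value)

print(is_list_of_my_int("20003 15037 95977 95945"))  # True
print(is_zip_union("CA"))                            # True
print(is_zip_union("95630 95977 95945"))             # True
print(is_zip_union("9563"))                          # False
```
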
60
Element content
  • How to define attributes for elements with simple
    type content?
  • In the instance: <internationalPrice
    currency="EUR">423.45</internationalPrice>
  • in the sample schema, <xsd:element name="USPrice"
    type="xsd:decimal"/> comes close
  • but simple types cannot have attributes
  • -> a complex type has to be defined

61
Element content
  • New complex type is derived from type decimal

<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>

62
Mixed content
  • Element contains both character data and
    subelements

<letterBody>
  <salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
  Your order of <quantity>1</quantity>
  <productName>Baby Monitor</productName>
  shipped from our warehouse on
  <shipDate>1999-05-21</shipDate>
</letterBody>
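
From the application's point of view, mixed content means that character data and subelements arrive interleaved. A minimal sketch, assuming Python's standard xml.etree.ElementTree: character data before the first child is stored in the parent's text attribute, and character data following a child is stored in that child's tail attribute.

```python
import xml.etree.ElementTree as ET

LETTER = """<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your order of <quantity>1</quantity> <productName>Baby Monitor</productName>
shipped from our warehouse on <shipDate>1999-05-21</shipDate>
</letterBody>"""

body = ET.fromstring(LETTER)
salutation = body.find("salutation")

# Character data around the name subelement.
print(repr(salutation.text))               # 'Dear Mr.'
print(repr(salutation.find("name").tail))  # '.'

# itertext() yields character data and subelement content in document order;
# runs of white space are collapsed here for display only.
full_text = " ".join("".join(body.itertext()).split())
print(full_text)
```
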
63
Mixed content
<xsd:element name="letterBody">
  <xsd:complexType mixed="true">
    <xsd:sequence>
      <xsd:element name="salutation">
        <xsd:complexType mixed="true">
          <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="quantity" type="xsd:positiveInteger"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

64
Empty content
  • Assume we want the internationalPrice element to
    have both the unit of currency and the price as
    attribute values
  • <internationalPrice currency="EUR" value="423.45"/>
  • i.e. the element has no content
  • solution: no elements are defined in the content
    model

65
Empty content
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="currency" type="xsd:string"/>
        <xsd:attribute name="value" type="xsd:decimal"/>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>

66
Shorthand for empty complex type
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:attribute name="currency" type="xsd:string"/>
    <xsd:attribute name="value" type="xsd:decimal"/>
  </xsd:complexType>
</xsd:element>

67
anyType
  • The anyType seen in the definition for an empty
    content model represents an abstraction which is
    the base type from which all simple and complex
    types are derived
  • anyType does not constrain its content in any way
  • can be used like other types
  • is a default if no type is specified
  • <xsd:element name="anything"/>

68
Building content models
  • <xsd:sequence>: fixed order
  • <xsd:choice>: (1) choice of alternatives
  • <xsd:group>: grouping (also named)
  • <xsd:all>: no order specified

69
Nested choice and sequence groups
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:choice>
      <xsd:group ref="shipAndBill"/>
      <xsd:element name="singleUSAddress" type="USAddress"/>
    </xsd:choice>
    <xsd:element name="items" type="Items"/>
  </xsd:sequence>

70
Nested choice and sequence groups
<xsd:group name="shipAndBill">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
  </xsd:sequence>
</xsd:group>

71
An all group
  • An all group: all the elements in the group may
    appear once or not at all, and they may appear in
    any order
  • limited to the top level of any content model
  • has to be the only child at the top
  • the group's children must all be individual
    elements (no groups), and no element in the
    content model may appear more than once

72
An all group
<xsd:complexType name="PurchaseOrderType">
  <xsd:all>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
  </xsd:all>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

73
Attribute groups
  • Also attribute definitions can be grouped and
    named

<xsd:element name="item">
  <xsd:complexType>
    <xsd:sequence> </xsd:sequence>
    <xsd:attributeGroup ref="ItemDelivery"/>
  </xsd:complexType>
</xsd:element>
<xsd:attributeGroup name="ItemDelivery">
  <xsd:attribute name="partNum" type="Sku"/>
</xsd:attributeGroup>

74
Namespaces and XML Schema
  • An XML Schema document contains declarations of
    namespaces that are used in the document
  • e.g. xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    for the elements with special XML Schema
    semantics
  • Target namespace: the definitions included in
    this schema define the names in this namespace
  • targetNamespace="uri:mywork"

75
Namespaces and XML Schema
  • In XML Schema, schema components from different
    target namespaces can be used together
  • -> enables the schema validation of instance
    content defined across multiple namespaces

76
Importing schema declarations
  • Every top-level schema component is associated
    with a target namespace (or, explicitly, with
    none, if the target namespace is not defined)
  • a component may refer to another component that
    is in a different namespace, using an import
    element

77
Import
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:html="http://www.w3.org/1999/xhtml"
        targetNamespace="uri:mywork"
        xmlns:my="uri:mywork">
  <import namespace="http://www.w3.org/1999/xhtml"/>
  <complexType name="myType">
    <sequence>
      <element ref="html:p" minOccurs="0"/>
    </sequence>
  </complexType>
  <element name="myElt" type="my:myType"/>
</schema>

78
Type libraries
  • As XML schemas become more widespread, schema
    authors will want to create simple and complex
    types that can be shared and used as the basic
    building blocks for building new schemas
  • XML Schemas already provide types that play this
    role: the simple types
  • other examples: currency, units of measurement,
    business addresses

79
Example: currencies
<schema targetNamespace="http://www.example.com/Currency"
        xmlns:c="http://www.example.com/Currency"
        xmlns="http://www.w3.org/2000/08/XMLSchema">
  <complexType name="Currency">
    <simpleContent>
      <extension base="decimal">
        <attribute name="name">
          <simpleType>
            <restriction base="string">
              <enumeration value="AED"/>
              <enumeration value="AFA"/>
              <enumeration value="ALL"/>

80
Extending content models
  • Mixed content models
  • an element can contain, in addition to
    subelements, also arbitrary character data
  • import
  • an element can contain elements whose types are
    imported from external namespaces
  • e.g. this element may contain an HTML p element
    here
  • more flexible way
  • any element, any attribute

81
Example
<purchaseReport xmlns="http://www.example.com/Report">
  <regions> <!-- part sales by regions --> </regions>
  <parts> <!-- part descriptions --> </parts>
  <htmlExample>
    <table xmlns="http://www.w3.org/1999/xhtml"
           border="0" width="100%">
      <tr>
        <th align="left">Zip Code</th>
        <th align="left">Part Number</th>
        <th align="left">Quantity</th>
      </tr>
      <tr><td>95819</td><td> </td><td> </td></tr>
      <tr><td> </td><td>872-AA</td><td>1</td></tr>
      ...

82
Including an HTML table
  • To permit the appearance of HTML in the instance
    document we modify the report schema by declaring
    the content of the element htmlExample by the any
    element
  • in general, an any element specifies that any
    well-formed XML is permissible in a type's
    content model
  • in the example, we require the XML to belong to
    the namespace http://www.w3.org/1999/xhtml
  • -> the XML should be XHTML

83
Schema declaration with any
<element name="purchaseReport">
  <complexType>
    <sequence>
      <element name="regions" type="r:RegionsType"/>
      <element name="parts" type="r:PartsType"/>
      <element name="htmlExample">
        <complexType>
          <sequence>
            <any namespace="http://www.w3.org/1999/xhtml"
                 minOccurs="1" maxOccurs="unbounded"
                 processContents="skip"/>
          </sequence>
          ...

84
Schema validation
  • The attribute processContents
  • skip: no validation
  • strict: an XML processor is obliged to obtain the
    schema associated with the required namespace and
    validate the HTML appearing within the
    htmlExample element
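
With skip, the only thing left to check for the wildcard is the namespace and the number of children. The following sketch shows that check in Python with the standard xml.etree.ElementTree; it is not a schema processor, the function name is hypothetical, and both test documents are shortened versions of the purchase report made up for the illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
REPORT = "http://www.example.com/Report"

def html_example_ok(report_text):
    """Apply only the namespace and occurrence constraints of the wildcard."""
    root = ET.fromstring(report_text)
    example = root.find(f"{{{REPORT}}}htmlExample")
    if example is None:
        return False
    children = list(example)
    # minOccurs="1": at least one child; namespace: all children in XHTML.
    return bool(children) and all(
        child.tag.startswith(f"{{{XHTML}}}") for child in children)

# The table declares the XHTML namespace as its default namespace.
with_xhtml = f"""<purchaseReport xmlns="{REPORT}">
  <htmlExample>
    <table xmlns="{XHTML}"><tr><td>95819</td></tr></table>
  </htmlExample>
</purchaseReport>"""

# Without the declaration the table inherits the report namespace.
without_xhtml = f"""<purchaseReport xmlns="{REPORT}">
  <htmlExample>
    <table><tr><td>95819</td></tr></table>
  </htmlExample>
</purchaseReport>"""

print(html_example_ok(with_xhtml))     # True
print(html_example_ok(without_xhtml))  # False
```

The content of the table is never inspected here, which corresponds to skip; under strict the processor would additionally validate it against the XHTML schema.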

85
anyAttribute
<element name="htmlExample">
  <complexType>
    <sequence>
      <any namespace="http://www.w3.org/1999/xhtml"
           minOccurs="1" maxOccurs="unbounded"
           processContents="skip"/>
    </sequence>
    <anyAttribute
        namespace="http://www.w3.org/1999/xhtml"/>
  </complexType>
</element>