Title: XML Technology in E-Commerce
1. XML Technology in E-Commerce
- Lecture 2
- Logical and Physical Structure, Validity, DTD, XML Schema
2. Lecture Outline
- Logical and Physical Structure of XML Documents
- Validity
- DTD
  - Element declarations
  - Attribute declarations
- XML Schema
  - Element and Attribute declarations
  - Simple type definitions
  - Complex type definitions
3. Logical and Physical Structure
- By definition, each XML document has both a logical and a physical structure
- Markup is used to describe both structures
- The two structures must be properly nested according to the rules of the specification
- See "Logical and Physical Structure of XML Documents"
4. Logical Structure
- An XML document is an information item
- The document's logical structure represents the information as it is perceived by the user (application)
5. Physical Structure
- An XML document is also a physical entity
- The content that we perceive logically can be distributed across several physical entities, which together form the physical structure
[Figure: one logical document distributed across three physical entities]
- Logical view: <students> <student>John Smith</student> <student>John Smith Jr.</student> </students>
- Entity 1: <students> </students>
- Entity 2: <student>John Smith</student>
- Entity 3: <student>John Smith Jr.</student>
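One way the students document could be split across physical entities is with external parsed entities. The following is only a sketch: the file names (students.xml, student1.xml, student2.xml) and the entity names are assumptions made for illustration and do not come from the lecture.

```xml
<?xml version="1.0"?>
<!-- Document entity, hypothetical file students.xml -->
<!DOCTYPE students [
  <!-- Each external parsed entity lives in its own file (names assumed) -->
  <!ENTITY student1 SYSTEM "student1.xml">
  <!ENTITY student2 SYSTEM "student2.xml">
]>
<students>
  &student1;
  &student2;
</students>

<!-- student1.xml would contain: <student>John Smith</student> -->
<!-- student2.xml would contain: <student>John Smith Jr.</student> -->
```

After a parser that reads external entities has replaced both references, the application sees the single logical document with two student elements.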
6. Valid XML Documents
- Well-formedness constraints do not specify element and attribute names and types, or the structure of the instance document
- Validity constraints specify element and attribute names and types, and the document structure
  - DTD-based validation and Schema-based validation
- Parsers
  - Non-validating parsers check documents for well-formedness only
  - Validating parsers check documents for both well-formedness and validity constraints
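The difference between the two kinds of constraint can be seen in a document that is well-formed but not valid. This is an illustrative sketch; the element names name and nickname are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE student [
  <!ELEMENT student (name)>
  <!ELEMENT name (#PCDATA)>
]>
<!-- Well-formed: every tag is closed and properly nested.
     Not valid: the DTD requires a name child, and nickname is not declared. -->
<student>
  <nickname>Johnny</nickname>
</student>
```

A non-validating parser would accept this document, while a validating parser would be expected to report the missing name element and the undeclared nickname element.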
7. DTD Validation
8. DTD
- DTD: Document Type Definition
- A DTD is a grammar for a class of XML documents
- Document Type Declaration
  - Contains the DTD for an XML document
  - External subset
    - <!DOCTYPE root SYSTEM "myDTD.dtd">
  - Internal subset
    - <!DOCTYPE root [
        markup declarations
      ]>
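A single document type declaration may combine both subsets. The sketch below reuses the myDTD.dtd file name from the slide; the content assumed for that file and the course entity are hypothetical.

```xml
<?xml version="1.0"?>
<!-- The external subset is read from myDTD.dtd.
     The internal subset in [ ] adds one more declaration. -->
<!DOCTYPE root SYSTEM "myDTD.dtd" [
  <!ENTITY course "XML Technology in E-Commerce">
]>
<root>&course;</root>

<!-- myDTD.dtd is assumed to contain: <!ELEMENT root (#PCDATA)> -->
```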
9. DTD: Markup Declarations
- Element type declarations
- Attribute list declarations
- Entity declarations: declare the entities that form the document's physical structure (see "Logical and Physical Structure of XML Documents")
- Notation declarations
- The Document Type Declaration can also contain processing instructions and comments
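As a rough illustration, an internal subset containing one declaration of each kind might look as follows. The names report, logo, gif, logo.gif, and the app-hint processing instruction are all assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE report [
  <!-- Element type declaration -->
  <!ELEMENT report (#PCDATA)>
  <!-- Attribute list declaration -->
  <!ATTLIST report logo ENTITY #IMPLIED>
  <!-- Notation declaration -->
  <!NOTATION gif SYSTEM "image/gif">
  <!-- Entity declaration (an unparsed entity that uses the notation) -->
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
  <!-- A processing instruction is also allowed here -->
  <?app-hint cache="no"?>
]>
<report logo="logo">Quarterly results</report>
```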
10. DTD: Element Type Declaration
- Specifies the element type and content
  - <!ELEMENT Name contentSpec>
- Element content
  - Empty
    - <!ELEMENT homepage EMPTY>
  - Any
    - <!ELEMENT container ANY>
  - Only elements (element content)
  - Mixed
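The EMPTY and ANY declarations from the slide can be tried out in a small document. In this sketch the href attribute and the URL are assumptions added so that the empty element carries some information.

```xml
<?xml version="1.0"?>
<!DOCTYPE container [
  <!ELEMENT container ANY>
  <!ELEMENT homepage EMPTY>
  <!ATTLIST homepage href CDATA #REQUIRED>
]>
<container>
  Character data and any declared element may appear here.
  <!-- An EMPTY element has no content, only attributes -->
  <homepage href="http://www.example.org/"/>
</container>
```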
11. DTD Element Content: Content Model
- Content model building blocks
  - Choice
    - (p | list | table | form)
  - Sequence
    - (street, zip, city, country)
  - Occurrence specifiers
    - ? (zero or one), * (zero or more), + (one or more)
  - Example
    - <!ELEMENT person (name, address, homepage?, note)>
- See also Deitel 6.4.1, page 139
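The building blocks can be combined in one content model. The sketch below is a variant of the person declaration, not the one from the slide: the occurrence specifiers + (one or more) and * (zero or more) and the phone/email choice are assumptions added to show each construct.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!-- Sequence with a nested choice and occurrence specifiers -->
  <!ELEMENT person   (name, address+, (phone | email), homepage?, note*)>
  <!ELEMENT address  (street, zip, city, country)>
  <!ELEMENT name     (#PCDATA)>
  <!ELEMENT street   (#PCDATA)>
  <!ELEMENT zip      (#PCDATA)>
  <!ELEMENT city     (#PCDATA)>
  <!ELEMENT country  (#PCDATA)>
  <!ELEMENT phone    (#PCDATA)>
  <!ELEMENT email    (#PCDATA)>
  <!ELEMENT note     (#PCDATA)>
  <!ELEMENT homepage EMPTY>
]>
<person>
  <name>John Smith</name>
  <address>
    <street>1 Main Street</street>
    <zip>12345</zip>
    <city>Springfield</city>
    <country>USA</country>
  </address>
  <email>john.smith@example.org</email>
  <!-- homepage is optional and omitted; note may repeat -->
  <note>First note</note>
  <note>Second note</note>
</person>
```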
12. DTD Element Content: Mixed Content
- Elements with mixed content can contain other elements and character data, or only character data
  - <!ELEMENT note (#PCDATA | em | strong | abbr)*>
  - <!ELEMENT p (#PCDATA | em | i | b | a | ul)*>
  - <!ELEMENT street (#PCDATA)>
  - <!ELEMENT city (#PCDATA)>
- Other examples: Deitel 6.4.2, page 143
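A short instance shows what a mixed content model permits. The declaration of note follows the slide; the declarations of em, strong, and abbr and the sample text are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note   (#PCDATA | em | strong | abbr)*>
  <!ELEMENT em     (#PCDATA)>
  <!ELEMENT strong (#PCDATA)>
  <!ELEMENT abbr   (#PCDATA)>
]>
<!-- Text and the listed elements may appear in any order, any number of times -->
<note>Read the <abbr>DTD</abbr> chapter <em>before</em> the lab and <strong>bring</strong> your <em>notes</em>.</note>
```

Because the model is a repeated choice, the DTD cannot require, for example, that abbr comes before em.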
13. DTD: Attribute List Declaration
- Attributes are always associated with a particular element
- Attribute list declaration format
  - <!ATTLIST elName
      attrName1 attrType1 attrDefault1
      attrName2 attrType2 attrDefault2
    >
- Attribute types
  - String type
  - Tokenized type
  - Enumerated type
14. DTD Attribute Declarations: Attribute Types
- String type
  - <!ATTLIST person age CDATA #REQUIRED>
- Tokenized types
  - ID, IDREF, IDREFS (Deitel 6.6.1, page 147)
  - ENTITY, ENTITIES (Deitel 6.6.1, page 150; "Logical and Physical Structure of XML Documents")
  - NMTOKEN, NMTOKENS (Deitel 6.6.1, page 152)
  - <!ATTLIST person id ID #REQUIRED>
- Enumerated type
  - <!ATTLIST person gender (M | F) #IMPLIED>
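The attribute types can be combined in one attribute list. In this sketch the id, age, and gender attributes follow the slide, while the people wrapper element and the manager and role attributes are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE people [
  <!ELEMENT people (person+)>
  <!ELEMENT person (#PCDATA)>
  <!ATTLIST person
    id      ID       #REQUIRED
    age     CDATA    #REQUIRED
    gender  (M | F)  #IMPLIED
    manager IDREF    #IMPLIED
    role    NMTOKEN  #IMPLIED
  >
]>
<people>
  <!-- Each ID value must be unique within the document -->
  <person id="p1" age="45" gender="M" role="team-lead">John Smith</person>
  <!-- An IDREF value must match the ID of some element -->
  <person id="p2" age="23" manager="p1">John Smith Jr.</person>
</people>
```

Note that age is declared as CDATA, so a value such as "forty" would still be valid.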
15. DTD Attribute Declarations: Attribute Defaults
- Provide information about the attribute's presence
  - #REQUIRED
    - The attribute must always be present
  - #IMPLIED
    - The attribute may be absent; there is no default value
  - Default value
    - <!ATTLIST list type (ol | ul) "ul">
    - <!ATTLIST list type (ol | ul) #FIXED "ul">
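The four kinds of attribute default can be compared side by side. The type attribute follows the slide; the lists wrapper element and the version, title, and id attributes are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE lists [
  <!ELEMENT lists (list+)>
  <!ELEMENT list (#PCDATA)>
  <!ATTLIST list
    id      ID        #REQUIRED
    title   CDATA     #IMPLIED
    type    (ol | ul) "ul"
    version CDATA     #FIXED "1.0"
  >
]>
<lists>
  <!-- type and version are omitted: a processor that reads the DTD
       supplies type="ul" and version="1.0" -->
  <list id="a">first</list>
  <!-- A fixed attribute may be written out, but only with its fixed value -->
  <list id="b" type="ol" version="1.0">second</list>
</lists>
```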
16. Summary on DTD Validation
- A DTD is a grammar that specifies element and attribute types and names
- A DTD contains declarations for the entities and notations used in the document's physical structure (see "Logical and Physical Structure of XML Documents")
- Mixed element content cannot constrain the order of sub-elements
- The set of attribute value types does not include primitive data types such as integer, date, or time
- Demo: DTD validation with XML Spy
Read: Deitel 6; "Logical and Physical Structure of XML Documents"
Assignment: Deitel Ex 6.6 and Ex 6.7, page 164
18. Schema Validation
19. XML Schema
- XML Schema constrains the structure of XML documents and the names and types of their elements and attributes
- There are several schema proposals; we will discuss W3C XML Schema
- The Schema specification defines an abstract data model for schemas and the corresponding XML representation
- A schema is a set of components
- There are 13 schema components, divided into three groups
  - Primary components
  - Secondary components
  - Helper components
20. Schema: XML Representation
- schema element
  - <xs:schema xmlns:xs="http://www.w3.org/2000/10/XMLSchema" version="1.0">
      <xs:attribute ...>
    </xs:schema>
- Current namespace URI (30 March, no support in XML Spy 3.5)
  - http://www.w3.org/2001/XMLSchema
- Components
  - Element declarations
  - Attribute declarations
  - Simple type definitions
  - Complex type definitions
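A small schema document containing one component of each of the four kinds might look like this. It is only a sketch: it assumes the http://www.w3.org/2001/XMLSchema namespace with the xs prefix, and the names genderType, gender, personType, and person are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0">

  <!-- Simple type definition -->
  <xs:simpleType name="genderType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="M"/>
      <xs:enumeration value="F"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Attribute declaration -->
  <xs:attribute name="gender" type="genderType"/>

  <!-- Complex type definition -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute ref="gender"/>
  </xs:complexType>

  <!-- Element declaration -->
  <xs:element name="person" type="personType"/>

</xs:schema>

<!-- A matching instance: <person gender="M"><name>John Smith</name></person> -->
```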
21. Schema: Element Declaration
- Syntax
  - <element name="myElement" type="myType"/>
  - <element ref="myElement"/>
- Occurrence
  - minOccurs and maxOccurs attributes
  - <element ref="myElement" minOccurs="2" maxOccurs="12"/>
  - <element ref="myElement" minOccurs="0" maxOccurs="unbounded"/>
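Occurrence constraints apply to element declarations or references inside a content model. The sketch below assumes the 2001 namespace and hypothetical element names (students, student, note).

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global declarations that can be referenced with ref -->
  <xs:element name="student" type="xs:string"/>
  <xs:element name="note" type="xs:string"/>

  <xs:element name="students">
    <xs:complexType>
      <xs:sequence>
        <!-- Between 2 and 12 student elements -->
        <xs:element ref="student" minOccurs="2" maxOccurs="12"/>
        <!-- Any number of note elements, including none -->
        <xs:element ref="note" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

When minOccurs and maxOccurs are omitted, both default to 1.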
22. Schema: Attribute Declaration (1)
- Syntax
  - <attribute name="myAttr" type="myAttrType"/>
  - <attribute ref="myAttr"/>
- Defaults
  - use and value attributes
  - <attribute ref="myAttr" use="required"/>
  - <attribute ref="myAttr" use="default" value="37"/>
  - <attribute ref="myAttr" use="fixed" value="37"/>
23. Schema: Attribute Declaration (2)
- Changes in the syntax of attribute occurrence constraints (made on 30 March, currently not supported by XML Spy 3.5)
- Defaults
  - use, default, and fixed attributes
  - <attribute ref="myAttr" use="required"/>
  - <attribute ref="myAttr" use="optional" default="37"/>
  - <attribute ref="myAttr" fixed="37"/>
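In the newer syntax the three cases could appear together on one element. This sketch assumes the 2001 namespace; the item element and its id, quantity, and currency attributes are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item">
    <xs:complexType>
      <!-- Must be present in every instance -->
      <xs:attribute name="id" type="xs:ID" use="required"/>
      <!-- May be absent; the value 37 is then assumed -->
      <xs:attribute name="quantity" type="xs:positiveInteger"
                    use="optional" default="37"/>
      <!-- May be absent; if present, it must have exactly this value -->
      <xs:attribute name="currency" type="xs:string" fixed="EUR"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- A matching instance: <item id="i1"/> -->
```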
24. Schema: Type Definitions
- XML Schema provides two kinds of type definition
  - Simple types: specify constraints on strings that can be used as values of attributes and of elements with only character data content
  - Complex types: specify the attributes and content model of document elements
- Type definition hierarchy
  - Types defined by restriction
  - Types defined by extension
  - Root type: anyType
25. Schema: Simple Types
- Usage: for attribute values and for the content of elements without attributes and children
  - <phone>222-33-22-444-1</phone>
  - <age>23</age>
- A set of built-in simple datatypes is defined in the XML Schema Datatypes specification (see XML Schema Primer, Appendix B, Table b1.a)
- Each simple type is a restriction of another simple type
26. Schema: Simple Type Definition
- Syntax
  - <simpleType name="mySimpleType">
      content: (restriction | union | list)
    </simpleType>
- Restrictions
  - <simpleType name="mySimpleType">
      <restriction base="integer">
        <minInclusive value="25"/>
        <maxInclusive value="100"/>
      </restriction>
    </simpleType>
- Facets (see XML Schema Primer, Appendix B)
- List and Union Types (see XML Schema Primer 2.3.1 and 2.3.2)
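The three ways of defining a simple type can be shown in one schema. This is a sketch assuming the 2001 namespace; the type and element names (ageType, unknownType, ageListType, ageOrUnknownType, age, ages, ageOrUnknown) are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction: integers from 25 to 100 inclusive -->
  <xs:simpleType name="ageType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="25"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Restriction with an enumeration facet -->
  <xs:simpleType name="unknownType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="unknown"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: whitespace-separated ageType values, e.g. "25 40 99" -->
  <xs:simpleType name="ageListType">
    <xs:list itemType="ageType"/>
  </xs:simpleType>

  <!-- Union: either an ageType value or the string "unknown" -->
  <xs:simpleType name="ageOrUnknownType">
    <xs:union memberTypes="ageType unknownType"/>
  </xs:simpleType>

  <xs:element name="age" type="ageType"/>
  <xs:element name="ages" type="ageListType"/>
  <xs:element name="ageOrUnknown" type="ageOrUnknownType"/>

</xs:schema>
```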
27. Schema: Complex Types
- A complex type definition contains a set of attribute declarations and a content model, which together specify the attributes and content of a set of elements
- A complex type can be
  - a restriction of another complex type
  - an extension of a simple or complex type
  - a restriction of anyType
- The extension mechanism appends additional content at the end of the base definition's content model and/or adds new attribute declarations
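Extension of a complex type could be sketched as follows, assuming the 2001 namespace. The names addressType, intlAddressType, address, and lang are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: street followed by city -->
  <xs:complexType name="addressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Derived type: country is appended after the base content,
       and one new attribute is added -->
  <xs:complexType name="intlAddressType">
    <xs:complexContent>
      <xs:extension base="addressType">
        <xs:sequence>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="lang" type="xs:language"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="address" type="intlAddressType"/>

</xs:schema>
```

An instance of address would then contain street, city, and country, in that order.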
28. Schema: Complex Type Definition
- Elements with text-only content and attributes: extension of simple types
  - <height units="m">125</height>
  - <complexType name="measurement">
      <simpleContent>
        <extension base="decimal">
          <attribute name="units" type="string"/>
        </extension>
      </simpleContent>
    </complexType>
  - <element name="height" type="measurement"/>
29. Schema: Element Content Model
- Model group elements (see XML Schema Primer 2.7)
  - sequence
  - choice
  - all
  - group
- Mixed content
  - <complexType name="noteType" mixed="true">
      <choice maxOccurs="unbounded">
        <element name="em" type="string"/>
        <element name="b" type="string"/>
        <element name="i" type="string"/>
      </choice>
    </complexType>
- Empty elements (see XML Schema Primer 2.5.3)
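The model group elements and an empty element could be sketched in one schema, assuming the 2001 namespace. All names (contactGroup, person, phone, email, size, width, height, homepage, href) are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- group: a named model group that can be reused via ref -->
  <xs:group name="contactGroup">
    <!-- choice: exactly one of the alternatives -->
    <xs:choice>
      <xs:element name="phone" type="xs:string"/>
      <xs:element name="email" type="xs:string"/>
    </xs:choice>
  </xs:group>

  <xs:element name="person">
    <xs:complexType>
      <!-- sequence: children in the given order -->
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:group ref="contactGroup"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="size">
    <xs:complexType>
      <!-- all: each child once, in any order -->
      <xs:all>
        <xs:element name="width" type="xs:decimal"/>
        <xs:element name="height" type="xs:decimal"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

  <!-- Empty element: a complex type with attributes but no content model -->
  <xs:element name="homepage">
    <xs:complexType>
      <xs:attribute name="href" type="xs:anyURI" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```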
30. Schema: Additional Features
- Anonymous types (Primer 2.4)
- Attribute groups (Primer 2.8)
- Namespaces (Primer 3.1)
- Deriving types by extension (Primer 4.2)
- Schema modularization (Primer 4.1)
- Annotations (Primer 2.6)
- Relating schema and document instances (Primer 5.6, Deitel 7.6)
- Demo: Schema validation with XML Spy
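One common way to relate an instance document to a schema without a target namespace is a location hint on the root element. The sketch below assumes the 2001 instance namespace and a hypothetical schema file named measurement.xsd that declares the height element.

```xml
<?xml version="1.0"?>
<!-- The xsi attribute is a hint telling the processor where the schema may be found -->
<height xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="measurement.xsd"
        units="m">125</height>
```

A processor is free to ignore the hint and obtain the schema in some other way.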
31. Summary on XML Schema
- Expressed in XML
- Based on an explicit notion of types for elements and attribute values
- Provides namespace control
- Uses extension and restriction for type derivation
- Lacks support for entities
Read: Deitel 7; XML Schema Primer (24.10.2000 version)
Skip: Deitel 7.3 to 7.5; Primer 5.1 to 5.3 and 5.5
Assignment: Write a schema for planner.xml (Deitel 5.9, page 126) and compare it with the syntax in Deitel 7.7. Validate with XML Spy. Use Chapter 2 and Appendix B from the Primer, and Deitel 7.6.