Title: Introduction to XML and DTD
1. Unit 1
- Introduction to XML and DTD
2. Section Objectives
- Roots of XML
- XML Document Structure
- Document Type Definition (DTD)
3. Mark-up Languages
- HTML: Familiar
- Purely Sequential
- Paragraph Tag <P>
- Problem: Content Wrapped in Mark-up
4. Roots of XML
- SGML
- HTML
- XML
- Document Type Definitions
- Valid documents
5. Roots of XML
6. Introducing XML
Example1.xml
Example1.html
7. Introducing XML
XML in IE5 browser
8. Exercise 1
- Using a simple text editor (e.g. notepad.exe),
modify the structure of the XML document from
Example1.xml so that the cars have their own
enclosing <car></car> tags.
- Edit the file Exercise1.xml.
- View the result in IE5.
- How do you know if it is well-formed?
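One way to answer the last question without a browser is to hand the file to any XML parser and see whether it complains. The sketch below uses Python's standard `xml.etree.ElementTree`; Python is not part of this unit, and the car data is a hypothetical stand-in for Exercise1.xml, whose real contents are not shown on the slides.

```python
import xml.etree.ElementTree as ET

def check_well_formed(text):
    """Return (True, None) if text parses, else (False, parser message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        # The message includes the line and column where parsing stopped.
        return False, str(err)

# Hypothetical document: every car sits in its own <car></car> pair.
GOOD = """<cars>
  <car><make>Ford</make><model>Fiesta</model></car>
  <car><make>Rover</make><model>Mini</model></car>
</cars>"""

# Same document with the last </car> closing tag forgotten.
BAD = GOOD.replace("</model></car>\n</cars>", "</model>\n</cars>")

print(check_well_formed(GOOD))
print(check_well_formed(BAD))
```

A parser that accepts the file only tells you it is well-formed, not that it is valid against a DTD.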
9. Section Objectives
- Roots of XML
- XML Document Structure
- Document Type Definition (DTD)
10. Structure of XML Document
Root 0
A child has only ONE parent. Child 0.1.2, for
example, has parent 0.1.
11. Structure of XML Document
Book
Chapter 1
Section 1 for Chapter 1
Chapter 1 Heading
Section 1 Heading
Paragraph 1
12. Structure of XML Document
Book
Chapter
Section
Chapter Heading
Paragraph
Section Heading
Entries
Text
Figure
13. Sample Book XML
<book>
  <chapter>
    <chapter.heading>CONTENT</chapter.heading>
    <section>
      <section.heading>CONTENT</section.heading>
      <paragraph>
        <text>CONTENT</text>
        <image>CONTENT</image>
        <text>CONTENT</text>
        ...
      </paragraph>
    </section>
  </chapter>
</book>
Example2.xml
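The tree structure of the sample book can be inspected programmatically. This is a minimal sketch using Python's `xml.etree.ElementTree` (an assumption of this example, not a tool used in the unit); the helper name `outline` is made up for illustration. It prints the nesting and builds a child-to-parent map, which shows that every element except the root has exactly one parent.

```python
import xml.etree.ElementTree as ET

BOOK = """<book>
  <chapter>
    <chapter.heading>CONTENT</chapter.heading>
    <section>
      <section.heading>CONTENT</section.heading>
      <paragraph>
        <text>CONTENT</text>
        <image>CONTENT</image>
        <text>CONTENT</text>
      </paragraph>
    </section>
  </chapter>
</book>"""

root = ET.fromstring(BOOK)

def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

# Map each child element to its single parent; the root has no entry.
parent_of = {child: parent for parent in root.iter() for child in parent}

print("\n".join(outline(root)))
```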
14. XML Elements
- Opening tag
- Content
- Closing tag
- Empty Element
15. XML Attributes
- An element can have zero or more attributes
- Values must be in quotes
- Associate data with element
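The attribute rules above can be checked with a small sketch. It assumes Python's `xml.etree.ElementTree` as the parser; the helper `parses` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Two attributes associate a price and a currency with the element.
book = ET.fromstring('<book price="12.45" currency="GBP"></book>')
print(book.attrib)

# An element with no attributes is equally acceptable.
plain = ET.fromstring("<book></book>")
print(plain.attrib)

# Single or double quotes both work; a missing quote pair does not.
print(parses("<book price='12.45'/>"))
print(parses("<book price=12.45/>"))
```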
16. Well Formed XML
- There must be a single root element
- All elements must be properly nested
- Element tags must match (<tag></tag>)
- Empty element may have a single tag definition
(<tag/>)
- Valid names start with a letter or an _
(underscore) followed by any combination of
letters, digits, hyphens, full stops or colons.
Colons have a special meaning in allowing a name
to be made unique through namespaces
- Names must contain no white space characters
(space, tab or new line characters)
- The content between tags must not explicitly
contain the characters < or &; write &lt; and
&amp; instead (> is usually written as &gt; too)
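Each rule can be illustrated with one document that obeys it and one that breaks it. The sketch below feeds such samples to Python's `xml.etree.ElementTree`; the sample documents and the helper `parses` are made up for this example.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

WELL_FORMED = [
    "<book><chapter/></book>",               # single root, empty-element tag
    "<_part.no-1>CONTENT</_part.no-1>",      # name: _ then letters . - digit
    "<text>1 &lt; 2 &amp; 3 &gt; 2</text>",  # markup characters escaped
]

BROKEN = [
    "<book/><book/>",                    # two root elements
    "<book><chapter></book></chapter>",  # improperly nested
    "<Book></book>",                     # tags differ in case, so no match
    "<1book/>",                          # name starts with a digit
    "<my book/>",                        # white space inside a name
    "<text>1 < 2</text>",                # literal < in content
    "<text>fish & chips</text>",         # literal & in content
]

for sample in WELL_FORMED + BROKEN:
    print(parses(sample), sample)
```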
17. Other XML Elements and References
- CDATA
- Processing Instructions
- Comments
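A short sketch of how these three constructs look in a document and what a parser makes of them. It assumes Python's `xml.etree.ElementTree` with its default settings; the stylesheet name `style.css` is a placeholder.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="style.css"?>
<!-- A comment: not part of the document's character data -->
<example>
  <!-- Comments may also appear inside elements -->
  <code><![CDATA[if (a < b && b > c) { swap(a, c); }]]></code>
</example>"""

root = ET.fromstring(DOC)
code = root.find("code")

# Inside a CDATA section, < and & are ordinary characters.
print(code.text)

# The default parser drops comments and processing instructions,
# so <code> is the only child it reports for <example>.
print([child.tag for child in root])
```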
18. Other XML Elements and References
- XML Declaration
- Version
- External DTD
- Character encoding
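The XML declaration's parts can be seen in a single line: `version`, `encoding`, and `standalone` (which says whether the document depends on an external DTD). The sketch below, assuming Python's `xml.etree.ElementTree`, shows why the declared encoding has to match the bytes actually stored; the element name and text are invented for the example.

```python
import xml.etree.ElementTree as ET

# The byte 0xE9 is "e with acute accent" in ISO-8859-1.
LATIN1 = (b'<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>'
          b'<name>Caf\xe9</name>')

# Same bytes, but the declaration wrongly claims UTF-8.
MISLABELLED = LATIN1.replace(b"ISO-8859-1", b"UTF-8")

def decodes(raw):
    """True if the parser can read raw using its declared encoding."""
    try:
        ET.fromstring(raw)
        return True
    except ET.ParseError:
        return False

name = ET.fromstring(LATIN1)
print(name.text)
print(decodes(MISLABELLED))
```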
19. Other XML Elements and References
- Entities
- Character References
- Elements or Attributes
- White space
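The predefined entities and character references can be tried out as follows. This is a sketch assuming Python's `xml.etree.ElementTree` and `xml.sax.saxutils.escape`; the sample strings are invented.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# &lt; &gt; &amp; are predefined entities; &#169; and &#xA9; are the
# decimal and hexadecimal character references for the same character.
para = ET.fromstring("<p>&lt;b&gt; &amp; &#169; &#xA9;</p>")
print(para.text)

# Going the other way: escape raw text before placing it in a document.
raw = "price < 10 & weight > 2"
escaped = escape(raw)
round_trip = ET.fromstring(f"<p>{escaped}</p>").text
print(escaped)
```

The parser always returns the replaced characters, so the application never sees the references themselves.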
20. Section Objectives
- Roots of XML
- XML Document Structure
- Document Type Definition (DTD)
21. Introduction to DTD
- Structured Declarations
- Well-formed XML: prior discussion
- Valid XML: current discussion
- DTDs do not conform to XML syntax
- Knowing the DTD allows you to create valid XML
documents
22. Advantages of DTD
- A document that obeys its DTD is valid and well formed
- Specification known by developer
- Reliable Communication between applications
- Specify default values
23. XML Document Structure
24. The Basic Declaration
<!DOCTYPE root_name [
  ... declarations of elements, attributes and entities ...
]>
<!DOCTYPE root_name SYSTEM "location">
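The two forms of the declaration, internal and external, can be compared with a small sketch. It assumes Python's `xml.dom.minidom`, which records the DOCTYPE but does not fetch or validate against the external file; `book.dtd` is a hypothetical location.

```python
from xml.dom import minidom

# Declarations written inside the document (internal subset).
INTERNAL = """<!DOCTYPE book [
  <!ELEMENT book (#PCDATA)>
]>
<book>CONTENT</book>"""

# Declarations kept in a separate file named by SYSTEM.
EXTERNAL = """<!DOCTYPE book SYSTEM "book.dtd">
<book>CONTENT</book>"""

internal_doctype = minidom.parseString(INTERNAL).doctype
external_doctype = minidom.parseString(EXTERNAL).doctype

# The name after DOCTYPE must be the name of the root element.
print(internal_doctype.name, internal_doctype.systemId)
print(external_doctype.name, external_doctype.systemId)
```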
25. The Basic Declaration
<!DOCTYPE rootname [
  <!ELEMENT rootname ...>
  <!ELEMENT ...>
  <!ELEMENT ...>
  <!ATTLIST ...>
  <!ATTLIST ...>
  <!ATTLIST ...>
  <!ENTITY ...>
  <!ENTITY ...>
  <!NOTATION ...>
  <!NOTATION ...>
  etc. etc.
]>
26. Declaring Elements
- Syntax
- Sequence
- Selection
27. Declaring Elements
28. Content Definitions
- Elements
- #PCDATA
- EMPTY
- ANY
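The content definitions, together with sequence (`,`) and selection (`|`), can be seen in a small DTD. The sketch below assumes Python's `xml.parsers.expat`, which reports each `<!ELEMENT>` declaration to a handler without validating the document; the element names are borrowed from the book example and the `KIND` table is invented for readability.

```python
from xml.parsers import expat

DOC = """<!DOCTYPE book [
  <!ELEMENT book    (title, chapter+)>
  <!ELEMENT chapter (text | figure)*>
  <!ELEMENT title   (#PCDATA)>
  <!ELEMENT text    (#PCDATA)>
  <!ELEMENT figure  EMPTY>
  <!ELEMENT note    ANY>
]>
<book><title>CONTENT</title><chapter><text>CONTENT</text><figure/></chapter></book>"""

# Readable labels for the content-model types expat reports.
KIND = {
    expat.model.XML_CTYPE_SEQ: "sequence",
    expat.model.XML_CTYPE_CHOICE: "selection",
    expat.model.XML_CTYPE_MIXED: "#PCDATA",
    expat.model.XML_CTYPE_EMPTY: "EMPTY",
    expat.model.XML_CTYPE_ANY: "ANY",
}

declared = {}

def on_element_decl(name, model):
    # model is a tuple: (type, quantifier, name, children)
    declared[name] = KIND.get(model[0], "other")

parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)

for name, kind in declared.items():
    print(name, kind)
```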
29. Entities and Entity Declarations
- Standard Entities
- General Entities
30. Entities and Entity Declarations
- Character Entities
- Escaping Characters
- Parameter Entities
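A general entity is declared once in the DTD and then referenced as `&name;` in the document. This sketch assumes Python's `xml.etree.ElementTree`, which expands entities declared in the internal subset; the entity name `publisher` and its value are invented for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE book [
  <!ENTITY publisher "Example Press">
]>
<book><imprint>Published by &publisher;</imprint></book>"""

imprint = ET.fromstring(DOC).find("imprint")
print(imprint.text)

def parses(text):
    """True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Referencing an entity that was never declared is an error.
print(parses("<book>&unknown;</book>"))
```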
31. Notations
<!NOTATION jpg SYSTEM "D:\Program
Files\Plus!\Microsoft Internet\iexplore.exe">
32. Defining Attributes
<!ATTLIST element_name attribute_name
attribute_type default_values/requirements>
<book price="12.45" currency="GBP"></book>
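The last part of an attribute definition is either a default value or a requirement such as `#REQUIRED` or `#IMPLIED`. The sketch below assumes Python's `xml.etree.ElementTree`, whose underlying parser fills in defaults declared in the internal subset but does not enforce `#REQUIRED`; the `isbn` attribute is invented for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE book [
  <!ATTLIST book price    CDATA #REQUIRED
                 currency CDATA "GBP"
                 isbn     CDATA #IMPLIED>
]>
<book price="12.45"></book>"""

book = ET.fromstring(DOC)

# currency was not written in the document: the declared default is
# supplied. isbn is optional with no default, so it is simply absent.
print(book.attrib)
```

Checking that a `#REQUIRED` attribute is actually present needs a validating parser.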
33. Attribute Requirements
34. Attribute Types
- CDATA
- Enumerated Types
- ID and IDREF
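These attribute types might be declared as in the sketch below. It assumes Python's `xml.parsers.expat`, which reports each attribute declaration to a handler; the `library` document and its attribute names are hypothetical.

```python
from xml.parsers import expat

DOC = """<!DOCTYPE library [
  <!ATTLIST book id       ID          #REQUIRED
                 sequel   IDREF       #IMPLIED
                 title    CDATA       #REQUIRED
                 currency (GBP | USD) "GBP">
]>
<library>
  <book id="b1" title="CONTENT"/>
  <book id="b2" title="CONTENT" sequel="b1"/>
</library>"""

types = {}
defaults = {}

def on_attlist_decl(element, attribute, attr_type, default, required):
    # Called once per attribute declared in an <!ATTLIST>.
    types[attribute] = attr_type
    defaults[attribute] = default

parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOC, True)

for attribute, attr_type in types.items():
    print(attribute, attr_type, defaults[attribute])
```

An ID value must be unique within the document and an IDREF must match some ID; only a validating parser checks this.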
35. Attribute Types
- NMTOKEN
<!ATTLIST book date_in NMTOKEN #IMPLIED>
<book date_in="12-09-1999"></book>
<!ATTLIST book owners NMTOKENS #REQUIRED>
<book owners="Shakespeare Bacon"></book>
36. Attribute Types
- ENTITY
<!ENTITY fig5.jpg SYSTEM "E:\XML Course\fig
5.0.jpg" NDATA jpg>
<figure image="fig5.jpg"/>
<!ELEMENT figure EMPTY>
<!ATTLIST figure image ENTITY #REQUIRED>
37. Attribute Types
- NOTATION
<!NOTATION jpg SYSTEM "D:\Program
Files\Plus!\Microsoft Internet\iexplore.exe">
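The ENTITY and NOTATION declarations work together: the entity names a non-XML file, and its NDATA notation names the application that handles that format. The sketch below assumes Python's `xml.parsers.expat`, which passes both declarations to handlers without opening either path; the drive letters are reconstructed from the slides.

```python
from xml.parsers import expat

DOC = r"""<!DOCTYPE figure [
  <!NOTATION jpg SYSTEM "D:\Program Files\Plus!\Microsoft Internet\iexplore.exe">
  <!ENTITY fig5.jpg SYSTEM "E:\XML Course\fig 5.0.jpg" NDATA jpg>
  <!ELEMENT figure EMPTY>
  <!ATTLIST figure image ENTITY #REQUIRED>
]>
<figure image="fig5.jpg"/>"""

notations = {}
entities = {}

def on_notation(name, base, system_id, public_id):
    notations[name] = system_id

def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
    # For an unparsed (NDATA) entity, value is None and notation is set.
    entities[name] = (system_id, notation)

parser = expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.EntityDeclHandler = on_entity
parser.Parse(DOC, True)

print(notations)
print(entities)
```

The application, not the parser, decides what to do with the file and the helper program.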
38. Normalisation and White Space
- Quotes Stripped
- Entities Replaced
- xml:space
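Attribute value normalisation can be observed directly. This is a sketch assuming Python's `xml.etree.ElementTree`; the `note` attribute and its text are invented, and `owners` reuses the NMTOKENS declaration from the attribute type slides.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE book [
  <!ATTLIST book owners NMTOKENS #REQUIRED>
]>
<book owners="  Shakespeare   Bacon "
      note='Tom &amp; Jerry,
second edition'
      xml:space="preserve"/>"""

book = ET.fromstring(DOC)

# Quotes are stripped; for a tokenised type such as NMTOKENS, leading
# and trailing spaces are dropped and runs of spaces collapse to one.
print(repr(book.get("owners")))

# Entities are replaced, and a line break inside a value becomes a space.
print(repr(book.get("note")))

# xml:space is passed through for the application to act on.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"
print(book.get(XML_NS + "space"))
```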