XML Syntax: Writing XML and Designing DTDs

Slides: 43
Provided by: alanjro

Transcript and Presenter's Notes

1
XML Syntax - Writing XML and Designing DTDs
2
HTML 1st Example
  • <html><head><title>Chocolate Cake</title><body>
  • <b>Ingredient List</b><hr />
  • <br>2 cups flour
  • <br>1 cup sugar
  • <br>2 bars chocolate
  • <br>1 cup milk
  • <br><br><b>Instructions</b>
  • <hr><br>Mix flour, sugar and milk
  • <br>Eat chocolate
  • <br>Bake at 400 degrees
  • </body></html>

3
XML Document Structure
  • Text file containing elements, attributes and text
  • <?xml version="1.0" ?>
  • <Recipe name="Chocolate Cake" type="Dessert">
  • <IngredientList>
  • <Ingredient>2 cups flour</Ingredient>
  • <Ingredient>1 cup sugar</Ingredient>
  • </IngredientList>
  • <Instruction>Sift the flour</Instruction>
  • </Recipe>
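
As a rough sketch of how a parser sees this structure, the recipe can be read with Python's standard xml.etree.ElementTree module. The variable names below are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

# The recipe document from the slide, as a string.
RECIPE_XML = """<?xml version="1.0" ?>
<Recipe name="Chocolate Cake" type="Dessert">
  <IngredientList>
    <Ingredient>2 cups flour</Ingredient>
    <Ingredient>1 cup sugar</Ingredient>
  </IngredientList>
  <Instruction>Sift the flour</Instruction>
</Recipe>"""

root = ET.fromstring(RECIPE_XML)          # the single document element
recipe_name = root.get("name")            # attribute on the opening tag
ingredients = [item.text for item in root.iter("Ingredient")]  # nested elements
instruction = root.findtext("Instruction")                     # element text

print(root.tag, recipe_name, ingredients, instruction)
```

Elements become nodes, attributes become a name/value mapping on the node, and text is whatever sits between an opening and closing tag.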

5
10 Rules for Well-Formed XML: 1. Start with the XML
declaration
  • <?xml version="1.0" ?>
  • Strictly optional in XML 1.0, but if present it must be the first thing in the file

6
2. Must be only one document element
  • Valid Example(s)
  • <?xml version="1.0" ?>
  • <recipe>
  • </recipe>
  • or
  • <recipeBook>
  • <recipe></recipe>
  • <recipe></recipe>
  • </recipeBook>
  • Invalid Example
  • <?xml version="1.0"?>
  • <recipe>
  • </recipe>
  • <recipe>
  • </recipe>
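
One way to see this rule in action is to hand both shapes to a parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One document element wrapping two recipes: accepted.
one_root = "<recipeBook><recipe></recipe><recipe></recipe></recipeBook>"
# Two top-level elements: rejected, the second one is junk after the root.
two_roots = "<recipe></recipe><recipe></recipe>"

print(is_well_formed(one_root), is_well_formed(two_roots))
```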

7
3. Match opening and closing tags
  • Carry over from HTML origins
  • <hr> <p> or <bold><italic></bold></italic>
  • Browsers forgive, XML parsers do NOT
  • <p></p> or <br />
  • <bold><italic></italic></bold>
  • <recipe></recipe>
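
The matching rule is essentially a stack discipline: every closing tag must match the most recently opened tag. The following is a simplified sketch of that idea, not a real parser; the TAG regex and the function name first_nesting_error are assumptions for illustration, and attribute values containing ">" are not handled.

```python
import re

# Simplified tag matcher: optional "/", a name, anything up to ">", optional "/".
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^<>]*?(/?)>")

def first_nesting_error(markup):
    """Return a message for the first nesting problem, or None if tags match."""
    stack = []
    for closing, name, self_closing in TAG.findall(markup):
        if self_closing:
            continue                      # <br /> opens and closes itself
        if not closing:
            stack.append(name)            # opening tag: remember it
        elif not stack or stack.pop() != name:
            return f"unexpected </{name}>"  # closes something that is not innermost
    return f"unclosed <{stack[-1]}>" if stack else None

print(first_nesting_error("<bold><italic></bold></italic>"))
print(first_nesting_error("<bold><italic></italic></bold>"))
```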

8
4. Comments allowed, but not inside attribute or
element tag
  • <!-- Isn't XML really cool? -->
  • <!-- Just like being a student!!! -->

9
5. Element and attribute names must start with a
letter (an underscore is also allowed)
  • <Recipe> OK
  • <Second third="false"> OK
  • <2nd> INVALID
  • <Recipe 2nd="true"> INVALID
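
A quick way to test candidate names is a regular expression. This is an ASCII-only simplification of the XML name rule (the real rule also admits many non-ASCII letters); the names NAME and is_xml_name are invented for this sketch.

```python
import re

# ASCII-only approximation: first character is a letter, "_" or ":",
# later characters may also be digits, "." or "-".
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")

def is_xml_name(name: str) -> bool:
    """Check a candidate element or attribute name against the simplified rule."""
    return NAME.fullmatch(name) is not None

for candidate in ["Recipe", "third", "2nd", "e-mail", "_id"]:
    print(candidate, is_xml_name(candidate))
```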

10
6. Attributes must go in the opening tag
  • Valid
  • <recipe name="Chocolate Cake"
    category="Dessert"></recipe>
  • Invalid
  • <recipe></recipe name="Chocolate Cake">

11
7. Attributes must be enclosed in matching quotes
  • Can use either single or double quotes but must
    use same type to start and end attribute value
  • Name="Australian Computer Society"
  • Name='Australian Computer Society'

12
Let's finish these rules!
  • 8. Only simple text for attributes, no nested
    values. Nesting is allowed in elements, not in
    attributes.
  • 9. Use &lt; &amp; &gt; &quot; and &apos; for the
    special characters < & > " and '.
  • 10. Write empty elements using <recipe /> syntax
    if there are no nested values; the tag can still
    carry attributes: <recipe type="dessert" />.
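
When XML is generated from a program, rule 9 is usually delegated to a library rather than done by hand. A small sketch with Python's standard xml.sax.saxutils helpers follows; the sample strings are invented for illustration.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

raw_text = "Mix flour & sugar until volume < 2 cups"
raw_name = 'Mum\'s "Best" Cake'

# escape() replaces &, < and > in element text;
# quoteattr() escapes the value and wraps it in suitable quotes.
xml_doc = f"<recipe name={quoteattr(raw_name)}>{escape(raw_text)}</recipe>"
print(xml_doc)

# Parsing the result gives back the original, unescaped strings.
parsed = ET.fromstring(xml_doc)
print(parsed.get("name"), "|", parsed.text)
```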

13
With these 10 rules, we have a well-formed XML
document
  • It means the XML can be read, processed or
    parsed.
  • It doesn't mean the structure makes sense.
  • <recipe model="Holden">
  • <chapter></chapter>
  • <engine cylinders="4"></engine>
  • </recipe>

14
Examples
  • Buggy dictionary
  • Non-buggy dictionary
  • FIDA

15
DTD: Document Type Definition
  • Allows us to define the exact elements and
    attributes for the document
  • These effectively become the rules of our own
    markup language, the extensible part of XML
  • A DTD really only defines the structure; it is
    limited in what it can validate about the text
    values of an element or attribute.

16
Recipe DTD
  • <!ELEMENT Recipe (Name, Description?,
    Ingredients?, Instructions?)>
  • <!ELEMENT Name (#PCDATA)>
  • <!ELEMENT Ingredient (Qty, Item)>
  • <!ELEMENT Qty (#PCDATA)>
  • <!ATTLIST Qty unit CDATA #REQUIRED>
  • <!ELEMENT Item (#PCDATA)>
  • <!ATTLIST Item optional CDATA "0" isVegetarian
    CDATA "true">

17
Elements
  • Basic rules
  • Start tag <tag_name> and end tag </tag_name>
  • Tags must be nested
  • <tag1><tag2></tag2></tag1>
  • Tags may be empty (no enclosed data)
  • <empty_tag/>
  • Whitespace in element content usually ignored
  • <section><p> </p></section>
  • <section> <p> </p></section>

18
Element Declarations
  • Used to define new elements and their content
  • <!ELEMENT name (#PCDATA)> → <name> </name>
  • Empty element has no content
  • <!ELEMENT name EMPTY> → <name/>
  • When children allowed - any or model group
  • <!ELEMENT name ANY>
  • <!ELEMENT person (name, e-mail)>

19
Model Groups
  • Used to define content of elements
  • <!ELEMENT person (name, e-mail)>
  • Used to define hierarchies of elements
  • <!ELEMENT name (fname, surname)>
  • <!ELEMENT fname (#PCDATA)>
  • <!ELEMENT surname (#PCDATA)>
  • <!ELEMENT e-mail (#PCDATA)>
  • Control organisation of elements
  • Sequence connector - ',' - (A, B, C) then
  • Choice connector - '|' - (A | B | C) or

20
Model Group Quantity Indicators
  • Describe constraints on elements in DTD
  • A? - may occur 0..1
  • A+ - must occur 1..*
  • A* - may occur 0..*
  • A | B - either A or B
  • A, B - A followed by B
  • (A, B) ((A, B?) | C)
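
Connectors and quantity indicators behave much like regular-expression operators applied to child element names. The sketch below makes that analogy concrete by translating a content model into a Python regex; it is an illustration only, not how a validating parser works, it ignores #PCDATA, and the function names model_to_regex and children_match are invented here.

```python
import re

def model_to_regex(model: str) -> "re.Pattern[str]":
    """Translate a DTD content model such as "(A, B?)" into a regex over child names."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[()|?*+,]", model):
        if token == ",":
            continue                               # sequence is plain concatenation
        elif token in "()|?*+":
            parts.append("(?:" if token == "(" else token)
        else:
            # Each child name is followed by a space so names cannot run together.
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def children_match(model: str, children: list) -> bool:
    """Check whether a list of child element names satisfies the content model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

print(children_match("(name, e-mail)", ["name", "e-mail"]))
print(children_match("(name, e-mail)", ["e-mail", "name"]))
print(children_match("((A, B?) | C)*", ["A", "C", "A", "B"]))
```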

21
Attributes
  • Provide additional information about an element
  • Enclosed by quotes - either " or '
  • Case-sensitive
  • May be character data or tokenized
  • value="Blue Peter" (character data)
  • value="blue" (single token)
  • value="red green blue" (tokens)
  • Values may be enumerated or defaulted (DTD)

22
Attribute Declarations
  • Attributes can be attached to elements
  • Declared separately in ATTLIST declaration
  • <!ATTLIST tag ... >
  • Rest of definition specifies
  • attribute name
  • attribute type
  • default value

23
Attribute Names and Types
  • Attribute name
  • <!ATTLIST tag name type default>
  • <!ATTLIST tag first_attr ... second_attr ... third_attr ... >
  • Attribute types
  • CDATA, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, ID, IDREF, IDREFS, NOTATION, name group

24
Attribute Types
  • CDATA - Character data
  • NMTOKEN - Single token
  • NMTOKENS - Multiple tokens
  • ENTITY - Attribute is entity ref
  • ENTITIES - Multiple entity refs
  • ID - Unique ID
  • IDREF - Match to ID
  • IDREFS - Match to multiple IDs
  • NOTATION - Describe non-XML data
  • Name group - Restricted list

25
Attribute Types
  • CDATA - <!ATTLIST person name CDATA ... >
  • NMTOKEN - <!ATTLIST mug color NMTOKEN ... >
  • NMTOKENS - <!ATTLIST temp values NMTOKENS ... >
  • ENTITY - <!ATTLIST person photo ENTITY ... >
  • ENTITIES - <!ATTLIST album photos ENTITIES ... >
  • ID - <!ATTLIST person id ID ... >
  • IDREF - <!ATTLIST person father IDREF ... >
  • IDREFS - <!ATTLIST person children IDREFS ... >
  • NOTATION - <!ATTLIST image format NOTATION (TeX|TIFF) ... >
  • Name group - <!ATTLIST point coord (X|Y|Z) ... >

26
Attribute Types
  • CDATA - name="Tom Jones"
  • NMTOKEN - color="red"
  • NMTOKENS - values="12 15 34"
  • ENTITY - photo="MyPic"
  • ENTITIES - photos="pic1 pic2"
  • ID - ID="P09567"
  • IDREF - IDREF="P09567"
  • IDREFS - IDREFS="A01 A02"
  • NOTATION - FORMAT="TeX"
  • Name group - coord="X"

27
Default Attribute Values
  • Can specify a default attribute value for when
    it's missing from the XML document, or state that
    a value must be entered
  • #REQUIRED - Must be specified
  • #IMPLIED - May be specified
  • "default" - Default value if unspecified
  • #FIXED - Only one value allowed
  • <!ATTLIST tag name type default>
  • <!ATTLIST seqlist sepchar NMTOKEN #REQUIRED type (alpha|num) "num">

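
Defaults declared in an internal DTD subset can be observed even with a non-validating parser. The sketch below assumes the expat parser behind Python's xml.etree.ElementTree, which fills in declared defaults but does not validate, so a missing #REQUIRED attribute would not be reported. The <!ELEMENT> line and the sample content are added for illustration.

```python
import xml.etree.ElementTree as ET

# The seqlist declaration from the slide, placed in an internal DTD subset.
DOC = """<!DOCTYPE seqlist [
  <!ELEMENT seqlist (#PCDATA)>
  <!ATTLIST seqlist sepchar NMTOKEN #REQUIRED
                    type (alpha|num) "num">
]>
<seqlist sepchar="-">1-2-3</seqlist>"""

seqlist = ET.fromstring(DOC)

# "type" was not written in the document, so the declared default is supplied.
print(seqlist.get("sepchar"), seqlist.get("type"))
```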
28
Declarations
  • Instructions for the XML processor
  • Format - <! ... > or <! ... [ <! ... > ]>
  • Document type - <!DOCTYPE ... >
  • Character data - <![CDATA[ ... ]]>
  • Entities - <!ENTITY ... >
  • Notation - <!NOTATION ... >
  • Element - <!ELEMENT ... >
  • Attributes - <!ATTLIST ... >
  • <![INCLUDE[ ... ]]> and <![IGNORE[ ... ]]>

29
Document Type Declaration
  • Identifies the name of the document root element
  • <!DOCTYPE My_XML_Doc>
  • May also add entity definitions and DTD
  • <!DOCTYPE My_XML_Doc [ ... ]><My_XML_Doc>
    ...</My_XML_Doc>

30
Comment Declaration
  • Comments are not considered part of XML document
    and should not be published
  • <!-- A comment -->
  • Cannot have additional '--' in comment
  • Cannot embed inside other declarations

31
Character Data Declaration
  • For occasions when text must contain
    uninterpreted markup characters
  • Press &lt;&lt;&lt;ENTER&gt;&gt;&gt;
  • <![CDATA[Press <<<ENTER>>>]]>
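
Both spellings should reach the application as the same text. A minimal check with Python's standard xml.etree.ElementTree follows; the <note> wrapper element is hypothetical and only there to make a complete document.

```python
import xml.etree.ElementTree as ET

# Markup characters written with entity references...
escaped = ET.fromstring("<note>Press &lt;&lt;&lt;ENTER&gt;&gt;&gt;</note>")
# ...and the same text inside a CDATA section, where "<" and ">" are left alone.
cdata = ET.fromstring("<note><![CDATA[Press <<<ENTER>>>]]></note>")

print(escaped.text)
print(cdata.text)
```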

32
Processing Instructions
  • Information required by an external application
  • Processing Instructions
  • Format - <? ... ?>
  • XML PI - <?xml version='1.0' ?>
  • Confusingly, this is called the XML declaration;
    it looks like a processing instruction, although
    the XML specification treats it separately

33
Entities
  • XML document may be distributed among a number of
    files
  • Each unit of information is called an entity
  • Each entity has a name to identify it
  • Defined using an entity declaration
  • Used by calling an entity reference

34
When to use Entities
  • Use an entity when the information
  • Is used in several places
  • May be represented differently
  • Is part of a larger document that needs to be
    split up to be manageable
  • Conforms to a data format other than XML

35
Types of Entity
  • General Entity
  • Referred to in XML document
  • Parameter Entity
  • Referred to in markup declarations in DTD
  • Internal Entity
  • Stored in main document
  • Text content only
  • External Entity
  • Stored externally to the main document
  • Text or binary
  • Can use to group many internal entities together

36
General Entities
  • Declared in 'Document Type Declaration'
  • <!DOCTYPE My_XML_Doc [ <!ENTITY name
    "replacement"> ]>
  • <!ENTITY xml "eXtensible Markup Language">
  • The &xml; includes entities
  • The eXtensible Markup Language includes entities
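
The replacement can be watched happening in a parser. This sketch assumes Python's standard xml.etree.ElementTree, whose underlying expat parser expands general entities declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# The entity is declared inside the document type declaration
# and referenced as &xml; in the element content.
DOC = """<!DOCTYPE My_XML_Doc [
  <!ENTITY xml "eXtensible Markup Language">
]>
<My_XML_Doc>The &xml; includes entities</My_XML_Doc>"""

root = ET.fromstring(DOC)
print(root.text)  # the reference has been replaced by the entity's text
```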

37
Parameter Entities
  • Declared in 'Document Type Declaration'
  • <!DOCTYPE My_XML_Doc [ <!ENTITY % name
    "replacement"> ]>
  • <!ENTITY % param "(para | list)">
  • <!ELEMENT section (%param;)>

38
External Entities
  • External Text Entities
  • Location specified with SYSTEM keyword
  • <!ENTITY ent SYSTEM "/ENTS/MYENT.XML">
  • May specify with public identifier
  • <!ENTITY ent PUBLIC "-//EBI//ENTITIES ents//EN"
    ... >
  • External Binary Entities
  • Need to identify format of data - NDATA
  • <!ELEMENT pic EMPTY><!ATTLIST pic name ENTITY
    #REQUIRED><!ENTITY photo SYSTEM
    "/ENTS/photo.tif" NDATA TIFF>
  • Referenced by empty element
  • A photograph <pic name="photo"/>.

39
Restrictions on Entities
  • General text entities
  • Can appear in element content
  • <para> &ent; </para>
  • Can appear in attribute value
  • <para name="&ent;"> </para>
  • Can appear in internal entity content
  • <!ENTITY cod "&ent;">
  • Cannot appear in other parts of DTD

40
Restrictions on Entities (2)
  • Binary entities
  • If entity content is not XML, the entity cannot
    be used as a textual reference
  • Error - <!ELEMENT sec (para|&photo;)>
  • Error - <para> &photo; </para>
  • Binary entity can only appear as an attribute of
    type ENTITY
  • <!ENTITY photo SYSTEM "photo.tif" NDATA
    TIFF><!ELEMENT pic (#PCDATA)><!ATTLIST pic
    name ENTITY #REQUIRED>

41
Parameter Entities
  • Use parameter entities within DTD
  • <!ENTITY % common "(para|list|table)"><!ELEMENT
    chapter ((%common;), section)><!ELEMENT
    section (%common;)>
  • Safest to include parentheses in entity
    definition and around entity reference

42
Putting it all together...
  • Have now been introduced to the main components
    and rules of XML and DTDs
  • Entities, elements, declarations, processing
    instructions, attribute lists
  • Use all these components in the 'Document
    Type Definition' (DTD) to specify the rules about
    the format of the XML document