Document Type Definitions (DTD) - PowerPoint PPT Presentation

Provided by: metadesign5

1
Document Type Definitions (DTD)
2
Definition
  • XML is static data; it has to be consumed by applications.
  • Applications can work only with a specific format of data. For
    example, an editor expects a specific vocabulary, say text, bold,
    italic, etc. However, XML is by nature extensible, and a
    non-validating parser checks only that a document is well formed.
  • There is a need to constrain XML optionally, when required.
  • It is easier to build applications that use XML if a document
    can easily be checked for validity.
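
The gap between "well formed" and "valid" can be seen with any non-validating parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (not part of the original slides); the helper name is_well_formed and the sample element names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags: well formed, whatever the element names are.
good = "<person><name>Bill</name><shoe_size>9</shoe_size></person>"
# The <name> element is never closed: not well formed.
bad = "<person><name>Bill</person>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

Nothing here says whether a person is allowed to have a shoe_size; that is the kind of constraint a DTD adds.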

3
Definition
  • DTDs allow us to validate XML. A DTD lets us specify:
  • which elements are allowed in a document
  • the relationships between elements (the children of an element;
    one-to-one, one-to-many relationships, etc.)
  • the attributes that an element can have
  • the type of data that can be present in an attribute (but not in
    an element)

4
A simple DTD example
  • Consider the following XML:
      <person>
        <name>
          <first_name>Bill</first_name>
          <last_name>Gates</last_name>
        </name>
        <profession>CEO</profession>
      </person>
  • The DTD for this XML is given below:
      <!ELEMENT person (name, profession*)>
      <!ELEMENT name (first_name, last_name)>
      <!ELEMENT first_name (#PCDATA)>
      <!ELEMENT last_name (#PCDATA)>
      <!ELEMENT profession (#PCDATA)>

5
DTD, explained
  • Each line is an element declaration. The first line says that a
    person element consists of one name element followed by zero or
    more profession elements. The second line says that name consists
    of one first_name followed by one last_name. The last three lines
    say that first_name, last_name, and profession contain #PCDATA
    (parsed character data). In other words, these elements contain
    text and no child elements.

6
Referencing DTDs
  • Referencing an external DTD from the XML file:
      <!DOCTYPE person SYSTEM "person.dtd">
  • This means that the DTD for the document is person.dtd and that
    it is located in the same directory/URL as the XML file.
  • Some DTDs are well known and are bundled with the XML parser. In
    this case, a PUBLIC ID can be given; it is still followed by a
    SYSTEM ID, which the parser falls back on if it does not
    recognize the PUBLIC ID.
      <!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
          "http://my.netscape.com/publish/formats/rss-0.91.dtd">

7
Internal DTDs
  • DTD definitions can be embedded in the XML file itself:
      <!DOCTYPE person [
        <!ELEMENT person (name, profession*)>
        <!ELEMENT name (first_name, last_name)>
        <!ELEMENT first_name (#PCDATA)>
        <!ELEMENT last_name (#PCDATA)>
        <!ELEMENT profession (#PCDATA)>
      ]>
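
To see that a parser really does read the declarations in an internal DTD, they can be listed programmatically. This is a sketch only, assuming Python's standard xml.parsers.expat module (not mentioned in the slides); the function name declared_elements is hypothetical. Note that expat reports the declarations but does not validate the document against them.

```python
import xml.parsers.expat

DOCUMENT = """<!DOCTYPE person [
  <!ELEMENT person (name, profession*)>
  <!ELEMENT name (first_name, last_name)>
  <!ELEMENT first_name (#PCDATA)>
  <!ELEMENT last_name (#PCDATA)>
  <!ELEMENT profession (#PCDATA)>
]>
<person>
  <name><first_name>Bill</first_name><last_name>Gates</last_name></name>
  <profession>CEO</profession>
</person>"""


def declared_elements(document):
    """Collect the element names declared in the document's internal DTD."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this handler once per <!ELEMENT ...> declaration.
    parser.ElementDeclHandler = lambda name, model: names.append(name)
    parser.Parse(document, True)
    return names


print(declared_elements(DOCUMENT))
```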

8
Mixed DTDs
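  • A document can combine both forms: an external DTD supplies some
    declarations and the internal subset supplies the rest.
      <!DOCTYPE person SYSTEM "name.dtd" [
        <!ELEMENT person (name, profession*)>
        <!ELEMENT profession (#PCDATA)>
      ]>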

9
Validating Documents using DTD
  • Parsers may or may not validate. When parsing APIs are invoked
    from a program, validation usually has to be turned on by means
    of a flag provided by the API.
  • Browsers generally do not validate XML files against DTDs.
  • Validation can be tested using a sample utility from the
    open-source Xerces parser:
      java sax.SAXCount -v person.xml
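
The point that validation is off unless something turns it on can be illustrated with a parser that never validates. A minimal sketch, assuming Python's standard xml.etree.ElementTree, which is a non-validating parser; the document below is a made-up variant of the person example and deliberately breaks its own DTD.

```python
import xml.etree.ElementTree as ET

# The internal DTD says a person must start with a name, but the
# document has none, so it is well formed yet invalid.
INVALID_PERSON = """<!DOCTYPE person [
  <!ELEMENT person (name, profession*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT profession (#PCDATA)>
]>
<person><profession>CEO</profession></person>"""

# No error is raised: the DTD is read but not enforced.
root = ET.fromstring(INVALID_PERSON)
children = [child.tag for child in root]
print(root.tag, children)
```

Catching the missing name requires a validating parser, such as Xerces with validation switched on as in the command above.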

10
DTD constructs
  • Element declaration
  • Every element used in the XML file must be declared, using the
    statement
      <!ELEMENT element_name (content_model)>
  • Here element_name is the name of the element. The content model
    can either be a simple type (#PCDATA) or a sequence of other
    elements.

11
DTD constructs
  • Element types
  • #PCDATA
  • This says that the element may contain any parsed character
    data, but not any child elements.
  • Child Element
  • If an element has exactly one child, then its content model is
    just that child element:
      <!ELEMENT fax (phone_number)>
  • Sequences
  • The most elementary sequence is one where an element has more
    than one child in a particular order, but only one of each type:
      <!ELEMENT name (first_name, last_name)>

12
DTD Constructs
  • The previous example indicates that
  • a name should contain both a first_name and a last_name
  • if either of them is missing, the document is invalid
  • if last_name precedes first_name, the document is invalid
  • Specifying how many children are allowed
  • ? - zero or one child
  • * - zero or many children
  • + - one or many children
  • Exercise: What do the following declarations mean?
      <!ELEMENT name (first_name, middle_name?, last_name)>
      <!ELEMENT person (name, profession, hobbies)>

13
DTD Constructs
  • Choices
  • If an element can contain either one element or another as a
    child, then the choice symbol | is used:
      <!ELEMENT circle (centre, (radius | diameter))>
  • Empty elements
  • If an XML element is going to contain no data and no child
    elements, it is declared as empty:
      <image src="photo.jpg" width="10" height="10"/>
  • This will have the type EMPTY:
      <!ELEMENT image EMPTY>
  • An element declared as empty can still have attributes
14
DTD Constructs
  • ANY
  • Some elements can be declared with a type of ANY. This means
    that there are no constraints on the content of this element: it
    can contain character data and any declared child elements, in
    any order. Its attributes still have to be declared.
      <!ELEMENT page ANY>

15
DTD Constructs
  • Attribute Declarations
  • The valid attributes of an element can be declared as follows:
      <!ATTLIST person
                born CDATA #REQUIRED
                died CDATA #IMPLIED>
  • This can also be written as:
      <!ATTLIST person born CDATA #REQUIRED>
      <!ATTLIST person died CDATA #IMPLIED>
  • Attribute Types
  • Unlike elements, attributes can be given different types. The
    valid attribute types include
  • CDATA: character data
  • NMTOKEN: a name token, built from the characters allowed in XML
    names (unlike a name, it may start with any of them)
  • Enumerations: this is not a keyword. Instead, it is a set of
    tokens separated by the | sign, which form the valid values for
    an attribute
  • Example:
      <!ATTLIST date month (January | February | March | April) #REQUIRED>
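
What a validator does with attribute declarations can be sketched by hand. The Python below is an assumption-laden illustration, not a DTD parser: the ATTRIBUTE_RULES table is a hand-written mirror of the two ATTLIST examples above, and the function name attribute_errors is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written mirror of:
#   <!ATTLIST person born CDATA #REQUIRED died CDATA #IMPLIED>
#   <!ATTLIST date month (January | February | March | April) #REQUIRED>
# Each rule is (set of allowed values, or None for CDATA; required flag).
ATTRIBUTE_RULES = {
    "person": {"born": (None, True), "died": (None, False)},
    "date": {"month": ({"January", "February", "March", "April"}, True)},
}


def attribute_errors(element):
    """Return a list of attribute problems found on one element."""
    rules = ATTRIBUTE_RULES.get(element.tag, {})
    errors = []
    for name, (allowed, required) in rules.items():
        value = element.get(name)
        if value is None:
            if required:
                errors.append(f"{element.tag}: missing required attribute {name}")
        elif allowed is not None and value not in allowed:
            errors.append(f"{element.tag}: {name}={value!r} is not an allowed value")
    for name in element.attrib:
        if name not in rules:
            errors.append(f"{element.tag}: undeclared attribute {name}")
    return errors


print(attribute_errors(ET.fromstring('<person died="2000"/>')))
print(attribute_errors(ET.fromstring('<date month="May"/>')))
```

A #REQUIRED attribute must be present, an #IMPLIED one may be omitted, and an enumerated attribute must take one of the listed tokens.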

16
Including a DTD within another DTD
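  • Declare the other DTD as an external parameter entity, then
    reference it:
      <!ENTITY % names SYSTEM "names.dtd">
      %names;
  • Entity References
  • Entity references allow you to define a token that stands for
    some other text. For example, a parameter entity (an entity for
    use inside a DTD) can be defined as
      <!ENTITY % pcd "(#PCDATA)">
  • It can be used as follows (in an external DTD; inside the
    internal subset, parameter entity references may appear only
    between declarations, not within them):
      <!ELEMENT name %pcd;>
  • References to the entity are enclosed within % and ;. Wherever
    the entity reference is encountered, it is substituted with the
    corresponding text.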
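
The substitution described above can be modelled as plain text replacement. The following Python is a deliberately simplified sketch under stated assumptions: it understands only internal parameter entities written with double quotes, it is not a real DTD processor, and the function name expand_parameter_entities is hypothetical. (A real parser also pads the replacement text with a space on each side.)

```python
import re

ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r"%([\w.-]+);")


def expand_parameter_entities(dtd):
    """Replace each %name; reference with its declared replacement text."""
    entities = {}

    def substitute(text):
        # References to undeclared entities are left untouched.
        return ENTITY_REF.sub(
            lambda m: entities.get(m.group(1), m.group(0)), text
        )

    output = []
    position = 0
    for decl in ENTITY_DECL.finditer(dtd):
        # Expand references in the text that precedes this declaration.
        output.append(substitute(dtd[position:decl.start()]))
        # References inside an entity's value are expanded when it is declared.
        entities[decl.group(1)] = substitute(decl.group(2))
        position = decl.end()
    output.append(substitute(dtd[position:]))
    return "".join(output).strip()


DTD_TEXT = '<!ENTITY % pcd "(#PCDATA)">\n<!ELEMENT name %pcd;>'
print(expand_parameter_entities(DTD_TEXT))
```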

17
Exercise
  • Write a DTD for the organization chart created in the previous
    exercise.
  • Write a DTD for the library XML file created in the previous
    exercise.
  • Test the XML files against the DTDs.

18
Namespaces
  • Let's begin with an exercise!
  • In the previous exercises, an organization chart XML file and a
    library XML file were created, and a DTD was written for each of
    them. In this exercise:
  • Create an XML file called organization.xml that contains both
    the organization chart XML and the library XML. The file should
    be validated, so ensure that a DTD is created for it. Since the
    DTDs for the two XML files have been created already, ensure
    that these are reused.

19
Namespaces: the need
  • XML elements can be reused and assembled to create other XML
    files.
  • When such assembling is done, there can be name clashes. Common
    terms like name, age, etc. tend to appear in different
    applications. Sometimes the same term has different meanings in
    different documents; for example, table could mean an HTML
    table, a multiplication table, or a piece of furniture,
    depending on the application.
  • Applications need to identify each element uniquely in order to
    perform operations such as validation, display, etc. An HTML
    table should be displayed and validated differently from a
    multiplication table.

20
Namespace Definition contd
  • Namespaces can be defined using the following syntax:
      <og:employee og:name="Bill Gates" xmlns:og="http://abc.com">
        <og:designation>CEO</og:designation>
        <og:subordinates>
          <og:employee og:name="ABC">
            <og:designation>VP</og:designation>
          </og:employee>
        </og:subordinates>
      </og:employee>
  • General form:
      <prefix:tag xmlns:prefix="uri">
  • The syntax consists of two steps
  • Define the prefix:
      xmlns:prefix="http://www.abc.com"
  • Use the prefix:
      <og:employee>
  • The syntax looks a little odd because both steps appear in the
    same start tag, so the prefix appears to be used before it is
    declared!

21
Namespace definition (contd)
  • The URI can be any valid URI, typically a URL; it need not point
    to anything that actually exists. The only requirement is that
    it distinguishes this namespace from any other namespace used in
    the XML document.
  • Namespaces are always declared at element level. A declaration
    is in scope for the element on which it appears and for its
    descendants, but an element belongs to the namespace only if it
    carries the prefix. Subordinates must therefore be prefixed with
    the same prefix to indicate that they belong to the namespace.
  • Sub-elements inside a main element can belong to a different
    namespace, in which case the namespace for the sub-element can
    be declared explicitly on it.
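
That the URI, not the prefix, is what identifies a namespace can be checked with a namespace-aware parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree, which reports every name in the form {uri}local; the second document and its org prefix are made up for the comparison.

```python
import xml.etree.ElementTree as ET

DOC_OG = """<og:employee og:name="Bill Gates" xmlns:og="http://abc.com">
  <og:designation>CEO</og:designation>
</og:employee>"""

# Same content, but a different (hypothetical) prefix bound to the same URI.
DOC_ORG = """<org:employee org:name="Bill Gates" xmlns:org="http://abc.com">
  <org:designation>CEO</org:designation>
</org:employee>"""


def qualified_names(document):
    """Return the namespace-qualified tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(document).iter()]


print(qualified_names(DOC_OG))
print(qualified_names(DOC_OG) == qualified_names(DOC_ORG))
```

Both documents yield the same qualified names, because the prefix is only a local shorthand for the URI.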

22
Declaring a default namespace
  • If all the sub-elements of an element come from the same
    namespace, then a default namespace can be declared as follows:
      <employee xmlns="http://www.abc.com">
        <subordinates>
        </subordinates>
      </employee>
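
The effect of a default namespace can be inspected in the same way. A small sketch, again assuming Python's standard xml.etree.ElementTree; the name attributes and the nested employee are added here for illustration and are not in the slide's example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<employee name="Bill Gates" xmlns="http://www.abc.com">
  <subordinates>
    <employee name="ABC"/>
  </subordinates>
</employee>"""

root = ET.fromstring(DOCUMENT)

# Every unprefixed element inherits the default namespace.
tags = [element.tag for element in root.iter()]
print(tags)

# Unprefixed attributes do not: they stay in no namespace.
print(root.attrib)

# Lookups must use the URI; the prefix chosen here is arbitrary.
ns = {"abc": "http://www.abc.com"}
subordinates = root.findall("abc:subordinates/abc:employee", ns)
print(len(subordinates))
```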

23
Namespaces and DTDs
  • Namespaces and DTDs are orthogonal. XML documents can have DTDs,
    namespaces, or both.
  • If an element is qualified by a prefix in the XML document, then
    the DTD must also declare the element with the same prefix and
    name. Example:
      <!ELEMENT dc:title (#PCDATA)>
  • Sometimes two DTDs may use the same prefix. In such a case, the
    prefix in one of the DTDs should be changed. It is easier to
    change the prefix in a DTD if it is defined as an entity
    reference.

24
Namespace prefixes in DTD
      <!ENTITY % dc-prefix "dc">
      <!ENTITY % dc-colon ":">
      <!ENTITY % dc-title "%dc-prefix;%dc-colon;title">
      <!ENTITY % dc-creator "%dc-prefix;%dc-colon;creator">
      <!ELEMENT %dc-title; (#PCDATA)>
      <!ELEMENT %dc-creator; (#PCDATA)>

25
Exercise
  • Create an XML file that shows both the organization chart and
    the library. Ensure that they are validated using proper DTDs.