Extensible Markup Language XML - PowerPoint PPT Presentation

Slides: 25
Provided by: web50
1
Extensible Markup Language (XML)
3
Introduction
  • Extensible Markup Language (XML) was developed in
    1996 by the World Wide Web Consortium's (W3C's)
    XML Working Group.
  • XML is a portable, widely supported, open
    technology for describing data.
  • It is a standard for storing data that is
    exchanged between applications.
  • It is both human-readable and machine-readable.
  • Examples of data that XML can describe:
    mathematical formulae, software configuration
    instructions, music, recipes, and financial reports.

4
XML Documents
  • An XML document

<?xml version = "1.0"?>
<!-- Fig. 18.1: article.xml -->
<!-- Article structured with XML -->
<article>
   <title>Simple XML</title>
   <date>December 6, 2001</date>
   <author>
      <firstName>Tem</firstName>
      <lastName>Nieto</lastName>
   </author>
   <summary>XML is pretty easy.</summary>
   <content>In this chapter, we present a wide
      variety of examples that use XML.
   </content>
</article>
  • What to look for
  • XML declaration (line 1)
  • Version information parameter
  • Comments
  • Tags, delimited by < and >
  • Start tag, end tag
  • Elements
  • Root element
  • Element hierarchies
  • Filename extension: .xml
  • XML parsers: msxml, Xerces
  • Style sheet (next slide)
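
The items above can be checked with any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module in place of the msxml and Xerces parsers named on the slide. The helper name is_well_formed is hypothetical. It parses article.xml from a string, reads the root element and its children, and shows that a mismatched end tag is rejected.

```python
import xml.etree.ElementTree as ET

# article.xml from the slide, held in a string for a self-contained example.
ARTICLE_XML = """<?xml version = "1.0"?>
<!-- Article structured with XML -->
<article>
   <title>Simple XML</title>
   <date>December 6, 2001</date>
   <author>
      <firstName>Tem</firstName>
      <lastName>Nieto</lastName>
   </author>
   <summary>XML is pretty easy.</summary>
   <content>In this chapter, we present a wide variety of examples that use XML.</content>
</article>"""


def is_well_formed(text):
    """Hypothetical helper: True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(ARTICLE_XML)
child_tags = [child.tag for child in root]  # direct children, in document order

print(root.tag)    # name of the root element
print(child_tags)  # names of the root's child elements
print(is_well_formed("<article><title>Oops</article>"))  # mismatched end tag
```

The parser discards the comments by default, so only elements appear in the result.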

5
XML Documents
  • An XML document displayed with a style sheet
  • What to look for
  • Container elements are marked by + and - signs
  • A minus sign indicates that the container (parent)
    element's child elements are being displayed.
  • Clicking the minus sign collapses the display
  • XML data have a tree structure made of nodes

6
XML Documents
<?xml version = "1.0"?>
<!-- Fig. 18.3: letter.xml -->
<!-- Business letter formatted with XML -->
<letter>
   <contact type = "from">
      <name>Jane Doe</name>
      <address1>Box 12345</address1>
      <address2>15 Any Ave.</address2>
      <city>Othertown</city>
      <state>Otherstate</state>
      <zip>67890</zip>
      <phone>555-4321</phone>
      <flag gender = "F" />
   </contact>
   <contact type = "to">
      <name>John Doe</name>
      <address1>123 Main St.</address1>
      <address2></address2>
      <city>Anytown</city>
      <state>Anystate</state>
      <zip>12345</zip>
      <phone>555-1234</phone>
      <flag gender = "M" />
   </contact>
   <salutation>Dear Sir</salutation>
   <paragraph>It is our privilege to inform you about our new
      database managed with <technology>XML</technology>. This
      new system allows you to reduce the load on your
      inventory list server by having the client machine
      perform the work of sorting and filtering the data.
   </paragraph>
   <paragraph>Please visit our Web site for availability
      and pricing.
   </paragraph>
   <closing>Sincerely</closing>
   <signature>Ms. Doe</signature>
</letter>
  • An XML document
  • What to look for
  • 1. Root element: letter
  • 2. Child elements: contact, salutation, paragraph,
    closing, signature
  • 3. Placement of data in attributes
  • <contact type = "from">
  • 4. Empty elements
  • <flag gender = "F"></flag>, which can also be
    written <flag gender = "F" />
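
Attributes and empty elements can be read back as follows. This is a sketch only, assuming Python's standard xml.etree.ElementTree module. The letter is trimmed to the contact names and flags, and the second flag is deliberately written with separate start and end tags to show that both spellings of an empty element parse the same way.

```python
import xml.etree.ElementTree as ET

# Trimmed version of letter.xml: only the parts needed for this illustration.
LETTER_XML = """<letter>
   <contact type = "from">
      <name>Jane Doe</name>
      <flag gender = "F" />
   </contact>
   <contact type = "to">
      <name>John Doe</name>
      <flag gender = "M"></flag>
   </contact>
</letter>"""

letter = ET.fromstring(LETTER_XML)

# Data stored in attributes is read with get(); element text with findtext().
contacts = {c.get("type"): c.findtext("name") for c in letter.findall("contact")}
genders = [c.find("flag").get("gender") for c in letter.findall("contact")]

# An empty element has no text and no child elements, in either spelling.
flags_empty = all(f.text is None and len(f) == 0 for f in letter.iter("flag"))

print(contacts)
print(genders)
print(flags_empty)
```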

7
XML Documents
8
XML Namespaces
  • Object-oriented programming languages such as C#
    provide large class libraries that group their
    features into namespaces. Namespaces prevent naming
    collisions between programmer-defined identifiers
    and class library identifiers.
  • XML also provides namespaces, which are a means of
    uniquely identifying XML elements.
  • XML-based languages, called vocabularies, such as
    XML Schema, Extensible Stylesheet Language, and
    BizTalk use namespaces to identify their
    elements.
  • Elements are differentiated via namespace
    prefixes, which identify the namespace to which an
    element belongs:
    <bank_of_america:check>Amount</bank_of_america:check>

<?xml version = "1.0"?>
<!-- Fig. 18.4: namespace.xml -->
<!-- Demonstrating namespaces -->
<text:directory xmlns:text = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">
   <text:file filename = "book.xml">
      <text:description>A book list</text:description>
   </text:file>
   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>
</text:directory>
9
XML Namespaces
  • What to look for
  • Attribute xmlns creates two namespace prefixes:
    text and image
  • Each namespace prefix is bound to a URI
  • Document authors must ensure that URIs are unique
  • A common practice is to use URLs as URIs.
  • Specify default namespaces (next slide).

10
XML Namespaces
<?xml version = "1.0"?>
<!-- Fig. 18.5: defaultnamespace.xml -->
<!-- Using default namespaces -->
<directory xmlns = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">
   <file filename = "book.xml">
      <description>A book list</description>
   </file>
   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>
</directory>
  • What to look for
  • <directory> declares a default namespace using
    attribute xmlns with a URI as its value.
  • Child elements need not be qualified by a
    namespace prefix
  • Element file is in the namespace
    urn:deitel:textInfo
  • Compare with the preceding code, where file and
    description were prefixed with text.
  • Notice that the default namespace is overridden
    by the image prefix.
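
How a parser resolves default and prefixed namespaces can be seen in a short sketch. It assumes Python's standard xml.etree.ElementTree module, which reports every element name as {namespace URI}local-name. The prefixes t and img in the lookup table are arbitrary choices for this example; only the URIs have to match the document.

```python
import xml.etree.ElementTree as ET

# defaultnamespace.xml from the slide, held in a string.
DIRECTORY_XML = """<directory xmlns = "urn:deitel:textInfo"
   xmlns:image = "urn:deitel:imageInfo">
   <file filename = "book.xml">
      <description>A book list</description>
   </file>
   <image:file filename = "funny.jpg">
      <image:description>A funny picture</image:description>
      <image:size width = "200" height = "100"/>
   </image:file>
</directory>"""

root = ET.fromstring(DIRECTORY_XML)

# Both children have the local name "file" but belong to different namespaces.
tags = [child.tag for child in root]

# Prefixes used for lookups are local to this code (an assumption of the sketch).
ns = {"t": "urn:deitel:textInfo", "img": "urn:deitel:imageInfo"}
text_file = root.find("t:file", ns).get("filename")
image_file = root.find("img:file", ns).get("filename")

print(root.tag)
print(tags)
print(text_file, image_file)
```

The unprefixed file element picks up the default namespace, while image:file overrides it, so the two elements never collide.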

11
Document Object Model (DOM)
  • XML documents are text files, so data can be
    retrieved from them using file I/O.
  • XML parsers store document data as tree
    structures in memory.
  • This hierarchical tree structure is called a
    Document Object Model (DOM) tree, and a parser
    that creates it is called a DOM parser.

  • Parent nodes
  • Child nodes
  • Sibling nodes
  • Ancestor nodes
  • Root node
  • C#: namespace System.Xml

DOM tree for article.xml:

article
   title
   date
   author
      firstName
      lastName
   summary
   contents
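
The node relationships listed above can be explored with a short sketch. It assumes Python's standard xml.dom.minidom module as the DOM parser, not the C# System.Xml classes the slide refers to. The function name outline is hypothetical, and the article is shortened to keep the example small.

```python
from xml.dom import minidom

# Shortened article.xml, held in a string.
ARTICLE_XML = """<article>
   <title>Simple XML</title>
   <date>December 6, 2001</date>
   <author>
      <firstName>Tem</firstName>
      <lastName>Nieto</lastName>
   </author>
   <summary>XML is pretty easy.</summary>
</article>"""


def outline(node, depth=0):
    """Hypothetical helper: list element names, indented by tree depth."""
    lines = []
    for child in node.childNodes:
        # Skip text nodes (including whitespace) and keep only elements.
        if child.nodeType == child.ELEMENT_NODE:
            lines.append("   " * depth + child.tagName)
            lines.extend(outline(child, depth + 1))
    return lines


doc = minidom.parseString(ARTICLE_XML)
tree_lines = outline(doc)

root_name = doc.documentElement.tagName                 # root node
first_name = doc.getElementsByTagName("firstName")[0]   # a child node
parent_name = first_name.parentNode.tagName             # its parent node

print("\n".join(tree_lines))
print(root_name, parent_name)
```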
12
Document Object Model (DOM)
  • XML Node Reader Code Example
  • Data used by this program

13
Document Object Model (DOM)
  • What to look for
  • 1. Code
  • 2. Data
  • 3. Use of class TreeView
  • 4. Method BuildTree: how the tree is built and
    displayed.
  • 5. Use of the switch statement
  • 6. How the tree depth increases.
  • 7. XmlTextWriter stream.
  • 8. XmlTextReader
  • 9. Improving efficiency using XPathNavigator.

14
Document Object Model (DOM)
  • 1. Code
  • 2. Data
  • 3. The program uses a TreeView control and TreeNode
    objects to display the XML node structure
  • 4. The TreeNode list is updated each time the
    XPathNavigator is positioned on a new node. Nodes
    are added to and deleted from the TreeView to
    reflect the XPathNavigator's location in the DOM
    tree.
  • 5. XPathDocument object
  • 6. Method CreateNavigator
  • 7. Traversal methods: MoveToFirstChild,
    MoveToParent, MoveToNext, MoveToPrevious
  • 8. XPathNodeIterator
  • 9. Method DisplayIterator
15
Document Type Definitions
  • XML documents can reference optional documents
    that specify how the XML documents should be
    structured. These optional documents are called
    Document Type Definitions (DTDs) and Schemas.
  • When a DTD is provided, validating parsers read
    the DTD and check the XML document's structure
    against it.
  • If the XML document conforms to the DTD, the XML
    document is valid.
  • DTDs provide a means of type checking XML
    documents: they confirm that elements contain the
    proper attributes and child elements, in the
    proper sequence.
  • They use Extended Backus-Naur Form (EBNF) grammar
    to describe an XML document's content.

16
Document Type Definitions
<!-- Fig. 18.12: letter.dtd -->
<!-- DTD document for letter.xml -->
<!ELEMENT letter ( contact+, salutation, paragraph+,
   closing, signature )>
<!ELEMENT contact ( name, address1, address2, city, state,
   zip, phone, flag )>
<!ATTLIST contact type CDATA #IMPLIED>
<!ELEMENT name ( #PCDATA )>
<!ELEMENT address1 ( #PCDATA )>
<!ELEMENT address2 ( #PCDATA )>
<!ELEMENT city ( #PCDATA )>
<!ELEMENT state ( #PCDATA )>
<!ELEMENT zip ( #PCDATA )>
<!ELEMENT phone ( #PCDATA )>
<!ELEMENT flag EMPTY>
<!ATTLIST flag gender (M | F) "M">
<!ELEMENT salutation ( #PCDATA )>
<!ELEMENT closing ( #PCDATA )>
<!ELEMENT paragraph ( #PCDATA )>
<!ELEMENT signature ( #PCDATA )>
  • What to look for
  • 1. Rules for element letter: a) one or more contact
    elements, b) one salutation, c) one or more
    paragraph elements, d) one closing and one signature
  • 2. ? and * indicate optional data (zero or one
    occurrence, and zero or more occurrences)
  • 3. Items are ordered
  • 4. ATTLIST defines attributes, e.g. type for the
    contact element.
  • 5. #IMPLIED specifies that, if the attribute is
    missing, the program can provide a value or ignore
    it.
  • 6. #REQUIRED means mandatory.
  • 7. #FIXED means immutable.
  • 8. A #PCDATA element can store parsed character
    data but no markup. & must be replaced by &amp; and
    < by &lt;
  • 9. An EMPTY element cannot contain character data.
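
The content model of letter is, in effect, a regular expression over the names of its child elements. The following sketch makes that idea concrete under loud assumptions: Python's standard library has no validating DTD parser, so this hand-written check (the function name matches_letter_model is hypothetical) covers only the order and number of the children of letter. It does not check contact's children, attributes, or #PCDATA content, and it is no substitute for a validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Content model of <letter> from letter.dtd:
#   ( contact+, salutation, paragraph+, closing, signature )
# Each child name is followed by one space so names cannot run together.
LETTER_MODEL = re.compile(r"(contact )+salutation (paragraph )+closing signature ")


def matches_letter_model(xml_text):
    """Hypothetical check: do the children of <letter> follow the content model?"""
    root = ET.fromstring(xml_text)
    if root.tag != "letter":
        return False
    child_sequence = "".join(child.tag + " " for child in root)
    return LETTER_MODEL.fullmatch(child_sequence) is not None


GOOD = ("<letter><contact/><contact/><salutation/>"
        "<paragraph/><paragraph/><closing/><signature/></letter>")
WRONG_ORDER = ("<letter><salutation/><contact/>"
               "<paragraph/><closing/><signature/></letter>")
NO_PARAGRAPH = "<letter><contact/><salutation/><closing/><signature/></letter>"

print(matches_letter_model(GOOD))
print(matches_letter_model(WRONG_ORDER))
print(matches_letter_model(NO_PARAGRAPH))
```

The + in the DTD maps directly onto + in the regular expression, and the commas map onto plain sequencing, which is why the order of the items matters.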
17
Document Type Definitions
<?xml version = "1.0"?>
<!-- Fig. 18.13: letter2.xml -->
<!-- Business letter formatted with XML -->
<!DOCTYPE letter SYSTEM "letter.dtd">
<letter>
   <contact type = "from">
      <name>Jane Doe</name>
      <address1>Box 12345</address1>
      <address2>15 Any Ave.</address2>
      <city>Othertown</city>
      <state>Otherstate</state>
      <zip>67890</zip>
      <phone>555-4321</phone>
      <flag gender = "F" />
   </contact>
   <contact type = "to">
      <name>John Doe</name>
      <address1>123 Main St.</address1>
      <address2></address2>
      <city>Anytown</city>
      <state>Anystate</state>
      <zip>12345</zip>
      <phone>555-1234</phone>
      <flag gender = "M" />
   </contact>
   <salutation>Dear Sir</salutation>
   <paragraph>It is our privilege to inform you about our new
      database managed with XML. This new system allows you
      to reduce the load on your inventory list server by
      having the client machine perform the work of sorting
      and filtering the data.
   </paragraph>
   <paragraph>Please visit our Web site for availability
      and pricing.
   </paragraph>
   <closing>Sincerely</closing>
   <signature>Ms. Doe</signature>
</letter>
  • This document conforms to letter.dtd.
  • Notice that it references letter.dtd in its
    DOCTYPE declaration.
  • Microsoft's XML Validator is available as a free
    download.

18
Microsoft XML Schemas
  • Schemas are an alternative to DTDs.
  • DTDs cannot be manipulated programmatically
    (searched, modified), and they do not provide a
    means of describing an element's data type.
  • Schemas do not use EBNF.
  • Schemas are XML documents.

<?xml version = "1.0"?>
<!-- Fig. 18.17: book.xdr -->
<!-- Schema document to which book.xml conforms -->
<Schema xmlns = "urn:schemas-microsoft-com:xml-data">
   <ElementType name = "title" content = "textOnly"
      model = "closed" />
   <ElementType name = "book" content = "eltOnly" model = "closed">
      <element type = "title" minOccurs = "1" maxOccurs = "1" />
   </ElementType>
   <ElementType name = "books" content = "eltOnly" model = "closed">
      <element type = "book" minOccurs = "0" maxOccurs = "*" />
   </ElementType>
</Schema>

<?xml version = "1.0"?>
<!-- Fig. 18.16: bookxdr.xml -->
<!-- XML file that marks up book data -->
<books xmlns = "x-schema:book.xdr">
   <book>
      <title>C How to Program</title>
   </book>
   <book>
      <title>Java How to Program, 4/e</title>
   </book>
   <book>
      <title>Visual Basic .NET How to Program</title>
   </book>
   <book>
      <title>Advanced Java 2 Platform How to Program</title>
   </book>
   <book>
      <title>Python How to Program</title>
   </book>
</books>
  • What to look for
  • title cannot contain child elements
  • Attribute content with value textOnly specifies
    that the element contains only parsed character
    data.
  • Attribute model with value closed means that a
    conforming XML document can contain only elements
    specified in the schema.
  • eltOnly means the element can contain only child
    elements, not mixed content such as text and
    other elements.
  • title is a child element of book. minOccurs and
    maxOccurs set how many times it may occur (here,
    exactly once).
19
W3C XML Schema
  • Like Microsoft, the W3C has created its own schema
    language, W3C XML Schema.
  • W3C XML Schema documents use the filename
    extension .xsd

<?xml version = "1.0"?>
<!-- Fig. 18.19: book.xsd -->
<!-- Simple W3C XML Schema document -->
<xsd:schema xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
   xmlns:deitel = "http://www.deitel.com/booklist"
   targetNamespace = "http://www.deitel.com/booklist">
   <xsd:element name = "books" type = "deitel:BooksType"/>
   <xsd:complexType name = "BooksType">
      <xsd:sequence>
         <xsd:element name = "book" type = "deitel:BookType"
            minOccurs = "1" maxOccurs = "unbounded"/>
      </xsd:sequence>
   </xsd:complexType>
   <xsd:complexType name = "BookType">
      <xsd:sequence>
         <xsd:element name = "title" type = "xsd:string"/>
      </xsd:sequence>
   </xsd:complexType>
</xsd:schema>

<?xml version = "1.0"?>
<!-- Fig. 18.18: bookxsd.xml -->
<!-- Document that conforms to W3C Schema -->
<deitel:books xmlns:deitel = "http://www.deitel.com/booklist">
   <book>
      <title>e-Business and e-Commerce How to Program</title>
   </book>
   <book>
      <title>Python How to Program</title>
   </book>
</deitel:books>
20
W3C XML Schema
  • What to look for in the preceding slide
  • W3C XML Schema namespace
  • xsd prefix
  • Root element schema contains elements that
    define the XML document's structure.
  • Binding of a URI to a namespace prefix
  • targetNamespace: the namespace for elements and
    attributes that this schema defines.
  • element defines an element
  • Name and data type of an element.
  • Any element that contains attributes or children
    is a complex type
  • Simple types such as xsd:string, xsd:date, xsd:int
21
Schema Validation in C#
  • Class XmlValidatingReader, from the .NET Framework
    Class Library (FCL), performs XML validation
  • Code Example
  • XDR validator, XSD validator
  • XmlSchemaCollection
  • Adding elements to collection
  • XmlReader
  • Registration of ValidationEventHandler
  • Node-by-node validation

22
Extensible Stylesheet Language XslTransform
  • Extensible Stylesheet Language (XSL) is an XML
    vocabulary for formatting XML data.
  • XSL Transformations (XSLT) creates formatted
    text-based documents from XML documents.
  • The process is called a transformation and
    involves two tree structures.
  • The source tree is the XML document being
    transformed.
  • The result tree is the result of the
    transformation.
  • XSLT processors include Microsoft's msxml and
    Apache's Xalan.

23
Extensible Stylesheet Language XslTransform
<?xml version = "1.0"?>
<!-- Fig. 18.23: sorting.xml -->
<!-- Usage of elements and attributes -->
<?xml:stylesheet type = "text/xsl" href = "sorting.xsl"?>
<book isbn = "999-99999-9-X">
   <title>Deitel&apos;s XML Primer</title>
   <author>
      <firstName>Paul</firstName>
      <lastName>Deitel</lastName>
   </author>
   <chapters>
      <frontMatter>
         <preface pages = "2"/>
         <contents pages = "5"/>
         <illustrations pages = "4"/>
      </frontMatter>
      <chapter number = "3" pages = "44">
         Advanced XML</chapter>
      <chapter number = "2" pages = "35">
         Intermediate XML</chapter>
      <appendix number = "B" pages = "26">
         Parsers and Tools</appendix>
      <appendix number = "A" pages = "7">
         Entities</appendix>
      <chapter number = "1" pages = "28">
         XML Fundamentals</chapter>
   </chapters>
   <media type = "CD"/>
</book>
XML Document
  • XSL document that transforms the XML
  • The line beginning <?xml:stylesheet is a
    processing instruction (PI), which contains
    application-specific information that is embedded
    in the XML document.
  • XSLT documents contain one or more xsl:template
    elements that specify which information is output
    to the result tree.
  • The first template element in the XSL document
    matches the document's root node. When the
    document root is encountered, the template is
    applied, and any text marked up by this element
    that is not in the namespace referenced by xsl is
    output to the result tree.
  • This XSL style sheet creates an XHTML document.
  • What to look for
  • Document title
  • Book's author
  • Extracting child elements
  • Sorting of chapters
  • Use of an XSL variable to store the value of a
    book's page count
  • Code example to apply style sheet to XML document
  • Style sheet (sports.xsl)
  • sports.xml
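
The style sheet sorting.xsl is not included in this transcript, and Python's standard library has no XSLT processor, so the following is not a transformation. It is only a sketch, assuming Python's standard xml.etree.ElementTree module, of the effect described above: ordering the chapters by their number attribute and storing a page count in a variable. Which elements contribute to the page count is an assumption here; every element that carries a pages attribute is included.

```python
import xml.etree.ElementTree as ET

# The chapters portion of sorting.xml, without the processing instruction.
BOOK_XML = """<book isbn = "999-99999-9-X">
   <chapters>
      <frontMatter>
         <preface pages = "2"/>
         <contents pages = "5"/>
         <illustrations pages = "4"/>
      </frontMatter>
      <chapter number = "3" pages = "44">Advanced XML</chapter>
      <chapter number = "2" pages = "35">Intermediate XML</chapter>
      <appendix number = "B" pages = "26">Parsers and Tools</appendix>
      <appendix number = "A" pages = "7">Entities</appendix>
      <chapter number = "1" pages = "28">XML Fundamentals</chapter>
   </chapters>
</book>"""

book = ET.fromstring(BOOK_XML)

# Chapters sorted numerically by their number attribute.
chapters = sorted(book.iter("chapter"), key=lambda c: int(c.get("number")))
chapter_titles = [c.text.strip() for c in chapters]

# Appendices sorted alphabetically by their letter.
appendices = sorted(book.iter("appendix"), key=lambda a: a.get("number"))
appendix_titles = [a.text.strip() for a in appendices]

# Page count over every element that has a pages attribute (assumption).
total_pages = sum(int(e.get("pages")) for e in book.iter()
                  if e.get("pages") is not None)

print(chapter_titles)
print(appendix_titles)
print(total_pages)
```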

24
Assignment