Introduction to XML - PowerPoint PPT Presentation

Provided by: philipp76
1
Introduction to XML
2
Contents
  • XML Motivation
  • XML DTD
  • XML Name Space
  • XML Schema

3
Quick Introduction
  • XML stands for eXtensible Markup Language
  • It is a language for creating new languages
  • In particular, it is designed to create tagged
    languages similar to HTML
  • It is considered extensible because it allows
    the developer to create new tags, as compared to
    HTML, where the set of tags is fixed and new tags
    are ignored by browsers


4
An additional problem
  • An additional problem can be seen by viewing the
    HTML source of the CNN website
  • This page is filled with headlines and
    text/images that support those headlines
  • A major headline looks like this:
  • <H3><A href="..." class="t1">Earliest certified election results in Florida 6 p.m. EST</A></H3>
  • A minor headline looks like this:
  • &nbsp;&nbsp;&#149;&nbsp;<a href="...">Bush sues 4 counties over absentee ballots</a><br>


5
The XML approach
  • Imagine if the source for CNN's webpage looked
    like this:
  • <story>
  • <headline class="important">Election returns due at 6 PM EST.</headline>
  • <supportingText>Blah Blah Blah</supportingText>
  • </story>
  • Here, structure is preserved
  • It would be very easy to write a program to
    grab the headlines out of this document
  • How do we handle presentation?
  • XSLT
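
The claim that grabbing headlines becomes easy can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree. This is only an illustration: the <page> wrapper element, the second story, and its class value "minor" are assumptions added so that there is more than one headline to find.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source: the slide's <story> markup, wrapped in an
# assumed <page> root so the document can hold more than one story.
PAGE = """
<page>
  <story>
    <headline class="important">Election returns due at 6 PM EST.</headline>
    <supportingText>Blah Blah Blah</supportingText>
  </story>
  <story>
    <headline class="minor">Bush sues 4 counties over absentee ballots</headline>
    <supportingText>Blah Blah Blah</supportingText>
  </story>
</page>
"""

def headlines(xml_text, css_class=None):
    """Return headline texts, optionally filtered by their class attribute."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter("headline"):
        if css_class is None or node.get("class") == css_class:
            found.append(node.text.strip())
    return found

all_headlines = headlines(PAGE)
important = headlines(PAGE, "important")
print(all_headlines)
print(important)
```

Because the markup names what each piece of text is, the program needs no knowledge of fonts, heading levels, or link classes.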


6
XML definitions
  • An XML document consists of the following parts
  • a Document Type Definition (or DTD), which is
    optional
  • Data
  • The DTD defines the structure of the data that
    follows it. A parser can thus read the DTD and
    know how to parse the data that follows it
  • As such, XML documents are said to be
    self-describing: all the information for parsing
    the data is contained in the document itself


7
Well-Formed XML Documents
  • XML documents are considered well-formed if they
    conform to the XML syntax rules
  • Well-formed documents can be parsed by any XML
    parser without the need for a DTD
  • The parser can use the syntax rules to parse the
    document cleanly, but without the DTD it cannot
    tell whether the document is valid


8
Valid XML Documents
  • An XML document is considered valid if
  • (1) it is well-formed and
  • (2) it conforms to the rules specified in its
    associated DTD
  • That is, if the DTD says that a <p> tag can only
    contain <b> tags and plain text, then a <p> tag
    which contains an <em> tag is considered invalid


9
XML Elements
  • XML documents consist of one or more elements.
  • Elements consist of a pair of tags and
    (optionally) enclosed text:
    <TITLE>The XML Companion</TITLE>
  • Elements may have attributes:
    <TITLE type="book">The XML Companion</TITLE>
  • Elements may contain other elements:
    <REFERENCE><TITLE type="book">The XML Companion</TITLE></REFERENCE>
  • Empty elements may be self-closing:
    <PICTURE src="mypic.jpg"></PICTURE> or
    <PICTURE src="mypic.jpg"/>
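
As a rough illustration of these parts, the sketch below parses the slide's REFERENCE example with Python's standard xml.etree.ElementTree and reads back the tag name, enclosed text, and attributes. It also compares the two spellings of the empty PICTURE element. The variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

reference = ET.fromstring(
    '<REFERENCE><TITLE type="book">The XML Companion</TITLE></REFERENCE>'
)
title = reference.find("TITLE")   # child element nested inside REFERENCE

print(reference.tag)   # element name
print(title.text)      # enclosed text
print(title.attrib)    # attributes as a dictionary

# The long and the self-closing form of an empty element parse identically.
long_form = ET.fromstring('<PICTURE src="mypic.jpg"></PICTURE>')
short_form = ET.fromstring('<PICTURE src="mypic.jpg"/>')
same = (long_form.tag, long_form.attrib, long_form.text) == (
    short_form.tag, short_form.attrib, short_form.text)
print(same)
```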

10
XML Rules (vs HTML)
  • All elements must be closed.
  • Elements cannot overlap:
    <B>bold <I>bold italic</B> italic</I>
    is illegal!
  • XML is case sensitive: <b> and <B> are different
    tags
  • Attribute values must be enclosed in quotation
    marks: <PICTURE src="mypic.jpg"/>
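
One way to see these rules in action is to hand deliberately broken fragments to a parser. The sketch below uses Python's standard xml.etree.ElementTree, which raises ParseError on any violation of the syntax rules. The sample fragments and the helper name is_well_formed are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One correct fragment followed by one violation of each rule on the slide.
samples = {
    "properly nested": "<P><B>bold <I>bold italic</I></B></P>",
    "unclosed element": "<P>text<BR></P>",
    "overlapping elements": "<P><B>bold <I>bold italic</B> italic</I></P>",
    "case mismatch": "<b>bold</B>",
    "unquoted attribute": "<PICTURE src=mypic.jpg />",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

An HTML browser would typically render all five fragments; an XML parser accepts only the first.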

11
A Simple XML Document
<?xml version="1.0"?>
<booklist title="Some XML Books">
</booklist>
XML declaration
Root element (one per document)
12
A Simple XML Document
<?xml version="1.0"?>
<!DOCTYPE booklist SYSTEM "books.dtd">
<booklist title="Some XML Books">
</booklist>
Define root element and specify DTD.
13
A Simple XML Document
<?xml version="1.0"?>
<!DOCTYPE booklist SYSTEM "books.dtd">
<!-- This is a comment -->
<booklist title="Some XML Books">
</booklist>
This is a comment (as in SGML / HTML)
14
A Simple XML Document
<?xml version="1.0"?>
<!DOCTYPE booklist SYSTEM "books.dtd">
<!-- This is a comment -->
<?xml-stylesheet type="text/xsl" href="iti-xml2.xsl"?>
<booklist title="Some XML Books">
</booklist>
This defines the XSL stylesheet
15
A Simple XML Document
<?xml version="1.0"?>
<!DOCTYPE booklist SYSTEM "books.dtd">
<!-- This is a comment -->
<?xml-stylesheet type="text/xsl" href="books3.xsl"?>
<?cocoon-process type="xslt"?>
<booklist title="Some XML Books">
</booklist>
This is a Cocoon processing directive (NB not
standard XML, but required by Cocoon 1.7.4).
16
Adding Content
<booklist title="Some XML Books">
  <book>
    <author>
      <name>St. Laurent</name>
      <initial>S</initial>
    </author>
    <date>1998</date>
    <title edition="Second">XML: A Primer</title>
    <publisher>MIS Press</publisher>
    <website href="http://www.simonstl.com/xmlprim/"/>
    <rating stars="4"/>
  </book>
</booklist>
17
Benefits of a DTD
  • DTDs are optional in XML
  • DTD allows validation of documents
  • DTD defines the application
  • Vital for collaborative development
  • IPR implications
  • DTD allows entity definitions (i.e. symbols,
    shortcuts, foreign characters, etc.).

18
Document Declaration
  • The document type declaration comes after the
    XML declaration
  • Its tag name is DOCTYPE
  • There are two forms
  • internal
  • <!DOCTYPE greeting [ ... DTD goes here ... ]>
  • external
  • <!DOCTYPE greeting SYSTEM "greeting.dtd">
  • We will cover the first form


19
DTD Syntax
  • The DTD is where you declare the elements (a.k.a.
    tags) and attributes that will appear in your
    XML document
  • In defining elements, you use regular expressions
    to declare the order in which elements are to
    appear
  • Attributes can be associated with elements and
    can have default values associated with them


20
DTD for a Class Gradebook
  • <!DOCTYPE gradebook [
  • <!ELEMENT gradebook (class, student*)>
  • <!ELEMENT class (name, studentsEnrolled)>
  • <!ATTLIST class semester CDATA #REQUIRED>
  • <!ELEMENT name (#PCDATA)>
  • <!ELEMENT studentsEnrolled (#PCDATA)>
  • <!ELEMENT student (name, grade*)>
  • <!ELEMENT grade (#PCDATA)>
  • <!ATTLIST grade name CDATA #REQUIRED>
  • ]>


21
An XML Example from the DTD
  • <?xml version="1.0"?>
  • <!DOCTYPE gradebook [ insert DTD from slide 20
    here ]>
  • <gradebook>
  • <class semester="Fall 2000">
  • <name>CSCI 3308</name>
  • <studentsEnrolled>117</studentsEnrolled>
  • </class>
  • <student>
  • <name>Ken Anderson</name>
  • <grade name="lab0">10</grade>
  • <grade name="lab1">9</grade>
  • </student>
  • </gradebook>
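
The remark that DTD element declarations are regular expressions over child elements can be made concrete with a small Python sketch. It is not a DTD validator: it ignores attributes and #PCDATA and only checks the order of child elements. The content models are assumed to read (class, student*) and (name, grade*); the occurrence markers are an assumption, made because the example document repeats the grade element. The function names are invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Element content models assumed from the gradebook DTD (elements that hold
# only #PCDATA are left out, since they have no child elements to order).
CONTENT_MODELS = {
    "gradebook": "(class, student*)",
    "class": "(name, studentsEnrolled)",
    "student": "(name, grade*)",
}

def model_to_regex(model):
    """Translate a DTD content model into a regex over 'tag;' tokens."""
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + ";)",
        model,
    )
    # Commas mean "followed by", which is plain concatenation in a regex.
    return pattern.replace(",", "").replace(" ", "")

def children_match(element):
    """Check an element's child sequence against its content model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:   # no element-content model recorded for this tag
        return True
    sequence = "".join(child.tag + ";" for child in element)
    return re.fullmatch(model_to_regex(model), sequence) is not None

GRADEBOOK = """<gradebook>
  <class semester="Fall 2000">
    <name>CSCI 3308</name>
    <studentsEnrolled>117</studentsEnrolled>
  </class>
  <student>
    <name>Ken Anderson</name>
    <grade name="lab0">10</grade>
    <grade name="lab1">9</grade>
  </student>
</gradebook>"""

root = ET.fromstring(GRADEBOOK)
valid = all(children_match(el) for el in root.iter())

# Putting <grade> before <name> breaks the (name, grade*) model.
bad_student = ET.fromstring(
    "<student><grade name='lab0'>10</grade><name>Ken</name></student>"
)
print(valid, children_match(bad_student))
```

The DTD operators *, +, ? and | carry over to the regex unchanged, which is why the translation is so short.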


22
Schema Overview
  • An XML schema is an XML document containing a
    formal description of what comprises a valid XML
    document. It defines the elements of an XML
    document and how they are structured.
  • The following schema instructions and guidelines
    refer to a schema written in the W3C XML Schema
    language: http://www.w3.org/2001/XMLSchema
  • An XML document described by a schema is called
    an instance document. If it satisfies all the
    constraints specified by the schema, it is
    considered to be schema-valid.
  • Various methods can be used to associate an
    instance document with a schema. Here we use the
    xsi:schemaLocation attribute on the root element
    of the instance document.
  • To allow the exchange of XML documents between
    different organizations, proper use of
    namespaces is required to prevent
    misunderstandings.

23
Namespaces
  • Namespaces have two purposes in XML
  • To distinguish between elements and attributes
    from different vocabularies with different
    meanings that happen to share the same name
  • To group all the related elements and attributes
    from a single XML application together so that
    software can easily recognize them.
  • Namespaces are implemented by attaching a prefix
    to each element and attribute name, separated by
    a colon. The part before the colon is called the
    prefix, the part after the colon is called the
    local part, and the complete name, including the
    colon, is called the qualified name, QName, or
    raw name.
  • Example
  • <fi:InterestFisheries>
  • <iccat:InterestFisheries>

24
Namespaces
  • Each prefix is mapped to a URI by an xmlns:prefix
    attribute. The URI is the real namespace, while
    the prefix is only a conventional abbreviation.
  • Examples
  • xmlns:iccat="http://www.iccat.es/schema"
  • xmlns:fi="http://www.fao.org/fi/figis/devcon/"
  • xmlns:xs="http://www.w3.org/2001/XMLSchema"
  • xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  • xsi:schemaLocation="http://www.iccat.es/schema iccat.xsd"
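
The point that the URI is the real namespace and the prefix only shorthand can be sketched with Python's standard xml.etree.ElementTree, which expands every prefixed name into the form {uri}localpart. In the document below only the two namespace URIs and the element name InterestFisheries come from the slides; the <report> wrapper and the text content are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the wrapper element and the text are made up.
DOC = """
<report xmlns:fi="http://www.fao.org/fi/figis/devcon/"
        xmlns:iccat="http://www.iccat.es/schema">
  <fi:InterestFisheries>FAO view</fi:InterestFisheries>
  <iccat:InterestFisheries>ICCAT view</iccat:InterestFisheries>
</report>
"""

root = ET.fromstring(DOC)

# The parser replaces each prefix with its URI: "{uri}localpart".
expanded = [child.tag for child in root]
print(expanded)

# A lookup may use any prefix, as long as it maps to the same URI.
ns = {"x": "http://www.iccat.es/schema"}
iccat_text = root.find("x:InterestFisheries", ns).text
print(iccat_text)
```

The two InterestFisheries elements share a local part but remain distinct, because their expanded names differ.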

25
XML Namespaces
  • Namespaces in XML are optional
  • Namespaces ensure that element names are unique
    and resolve conflicts among names of elements
  • In different contexts a given tag might mean
    different things, e.g. consider <BOOK>
  • To me it might mean a book in a bibliography
  • To a bookshop it might contain stock details
  • To a travel agent it might contain information
    about flight bookings!
  • Namespaces attach unique labels to a given tag
    set.
  • URLs are usually used as namespace labels.

26
Document Sources and More Information
  • References
  • Kenneth M. Anderson,
    http://www.cs.colorado.edu/users/kena/classes/3308/f00/lectures
  • Tim Brailsford,
    http://www.cs.nott.ac.uk/tjb/iti-xml/
  • General XML Information
  • http://www.w3c.org/xml/, http://www.xml.com/
  • Free XML Parsers
  • http://xml.apache.org/
  • Java and C++ parsers (with bindings for Perl
    and COM)
  • http://www.alphaworks.ibm.com/tech/xml4j
  • IBM's Java parser for XML
  • http://www.alphaworks.ibm.com/tech/xml4c
  • IBM's C++ parser for XML
  • http://www.opentext.com/services/content_management_services/xml_sgml_solutions.html

27
Introduction to XML Schema
28
Resources
  • An Introduction to XML Schema
  • http://www.cs.colorado.edu/kena/classes/7818/f01/presentations/schema.ppt
  • XML Schema is a W3C Recommendation
  • http://www.w3.org/XML/Schema

29
Motivation
  • Purpose of DTD
  • Sharing grammar/data with others
  • Validation by the parser
  • Defaulting of values.
  • Shortcomings of DTD
  • a very limited capability for specifying
    datatypes.
  • incompatible set of datatypes with those found in
    databases
  • inconsistent syntax with XML

30
XML Schema Requirements
  • Structural Schemas
  • Besides providing mechanisms analogous to DTDs,
    there are specific goals beyond DTDs
  • Integration with namespaces
  • Definition of incomplete constraints on the
    content of an element type
  • Integration of structural schemas with primitive
    data types
  • Inheritance

31
XML Schema Requirements (2)
  • Primitive Data Typing
  • Based on experience with SQL, Java primitives.
  • Conformance
  • Define the relation of schemata to XML document
    instances, and obligations on schema-aware
    processors.

32
Example (DTD)
  • BookStore.dtd
  • <!ELEMENT BookStore (Book+)>
  • <!ELEMENT Book (Title, Author, Date, ISBN,
    Publisher)>
  • <!ELEMENT Title (#PCDATA)>
  • <!ELEMENT Author (#PCDATA)>
  • <!ELEMENT Date (#PCDATA)>
  • <!ELEMENT ISBN (#PCDATA)>
  • <!ELEMENT Publisher (#PCDATA)>

33
Example (Schema)
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
  <xsd:element name="Date" type="xsd:string"/>
  <xsd:element name="ISBN" type="xsd:string"/>
  <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>
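
Unlike a DTD, a schema is itself an XML document, so ordinary XML tools can read it. The sketch below uses Python's standard xml.etree.ElementTree to list what an abbreviated copy of the BookStore schema declares. It does not validate anything; a schema-aware processor would be needed for that. The abbreviation (Book reduced to two children) and the variable names are choices made for this example.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Abbreviated copy of the BookStore schema (Book reduced to two children).
SCHEMA = """<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# Top-level declarations: name -> declared type ("complex" if defined inline).
declared = {
    el.get("name"): el.get("type", "complex")
    for el in schema.findall(XSD + "element")
}

# Element references inside Book, with their occurrence bounds.
book = schema.find(XSD + "element[@name='Book']")
refs = [
    (el.get("ref"), el.get("minOccurs"), el.get("maxOccurs"))
    for el in book.iter(XSD + "element")
    if el.get("ref")
]
print(declared)
print(refs)
```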
34
Example (vocabulary)
35
Data Types
  • Complex types allow elements in their content
    and may carry attributes
  • Simple types cannot have element content and
    cannot carry attributes, such as integer.
  • The ur-type definition is present in each XML
    Schema, serving as the root of the type
    definition hierarchy for that schema.
  • Primitive datatypes are those that are not
    defined in terms of other datatypes
  • Derived datatypes are those that are defined in
    terms of other datatypes.

36
(No Transcript)
37
An Instance Document
<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>July, 1998</Date>
    <ISBN>94303-12021-43892</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
  ...
</BookStore>
Callouts on the slide: (1) the xmlns declaration, (2) the
xmlns:xsi declaration, (3) the xsi:schemaLocation attribute
38
Multiple Level Checking
BookStore.XML
BookStore.xsd
Validator
XMLSchema.xsd
39
Web Service Summary
40
Web Service
  • Three Main Parts
  • Simple Object Access Protocol (SOAP)
  • Web Services Description Language (WSDL)
  • Universal Description, Discovery, and Integration
    (UDDI)

41
Web Service
Web Service Stack Diagram

42
Web Service
  • Introduction to SOAP
  • SOAP is used to transfer information
  • Simple XML Message
  • Remote Procedure Call
  • Strengths of SOAP
  • Lightweight Protocol
  • Text-based XML format
  • Can use HTTP Protocol

43
Web Service
  • Simple Object Access Protocol (SOAP)
  • SOAP Message
  • Envelope
  • Header: client authentication, transaction
    management
  • Body: contains the information intended for the
    final receiver
  • Fault element

44
Web Service
  • Simple Object Access Protocol (SOAP)
  • SOAP Message
  • Envelope: top element of a SOAP message
  • Header: client authentication, transaction
    management
  • actor and mustUnderstand attributes of the auth
    element
  • Body: contains the information intended for the
    final receiver
  • Information: RPC request, RPC result, error in
    execution
  • Fault element
  • SOAP Encoding
  • How to process data
  • Ex) String title = "Book" becomes
    <title xsi:type="xsd:string">Book</title>
  • encodingStyle attribute
  • env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
  • Simple Type
  • Compound Type

45
Web Service
  • SOAP Encoding
  • Simple Type
  • int
  • float
  • negativeInteger
  • string
  • enumeration
  • Compound Type
  • Compound type value and structure
  • Array

46
Web Service
  • SOAP Encoding
  • Compound Type
  • <ns0:addBook3>
  • <Book_1 href="#ID1"/>
  • </ns0:addBook3>
  • <book id="ID1" xsi:type="ns1:Book">
  • <title xsi:type="xsd:string">book1</title>
  • <price xsi:type="xsd:int">29000</price>
  • </book>
  • Deserialization by the message receiver
  • Book Book_1 = new Book();
  • Book_1.setTitle("book1");
  • Book_1.setPrice(29000);
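
The deserialization step can be sketched in Python to show how a receiver might follow the href on the parameter to the element carrying the matching id, and then build an object from it. This is a simplified illustration, not how any particular SOAP toolkit works: namespace prefixes and xsi:type attributes are dropped, a <body> wrapper is assumed, the href is assumed to be a local reference of the form "#ID1", and the Book class and resolve function are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    price: int

# Simplified body: prefixes and type attributes removed, wrapper assumed.
BODY = """
<body>
  <addBook3><Book_1 href="#ID1"/></addBook3>
  <book id="ID1"><title>book1</title><price>29000</price></book>
</body>
"""

def resolve(root, href):
    """Follow a local href such as "#ID1" to the element with that id."""
    target = href.lstrip("#")
    for el in root.iter():
        if el.get("id") == target:
            return el
    raise KeyError(href)

root = ET.fromstring(BODY)
param = root.find("addBook3/Book_1")        # the parameter only holds a reference
node = resolve(root, param.get("href"))     # the referenced <book> holds the data
book_1 = Book(title=node.findtext("title"), price=int(node.findtext("price")))
print(book_1)
```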

47
Web Service
  • Example of a SOAP message: request to get the
    weather for a zip code

<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:getTemp xmlns:ns1="urn:xmethods-Temperature"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <zipcode xsi:type="xsd:string">11211</zipcode>
    </ns1:getTemp>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
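
To illustrate how a receiver might read such a request, the sketch below parses the message with Python's standard xml.etree.ElementTree and pulls out the operation name, its namespace, and the arguments. It assumes the first child of Body is the RPC call, which holds for this message but is a simplification in general.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

REQUEST = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:getTemp xmlns:ns1="urn:xmethods-Temperature"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <zipcode xsi:type="xsd:string">11211</zipcode>
    </ns1:getTemp>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

envelope = ET.fromstring(REQUEST)
body = envelope.find(f"{{{SOAP_ENV}}}Body")
call = body[0]                                   # assumed: first child is the RPC call
namespace, operation = call.tag[1:].split("}")   # "{uri}local" -> uri, local
arguments = {arg.tag: arg.text for arg in call}
print(namespace, operation, arguments)
```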
48
Web Service
  • Simple Object Access Protocol (SOAP)
  • SOAP Message Transport
  • Binding: how to combine SOAP with a transport
    protocol
  • HTTP Binding
  • HTTP: start line, header, body
  • SOAPAction: HTTP header used for RPC
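
The three parts of the HTTP binding can be sketched with Python's standard urllib.request by building a POST request without sending it. The endpoint URL and the empty SOAPAction value are assumptions for illustration, and the envelope is reduced to an empty Body.

```python
import urllib.request

# Hypothetical endpoint and SOAPAction value, for illustration only.
ENDPOINT = "http://example.com/soap/temperature"
SOAP_ACTION = '""'

envelope = (
    '<SOAP-ENV:Envelope '
    'xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Body/>"
    "</SOAP-ENV:Envelope>"
)

request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),    # HTTP body: the SOAP envelope
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": SOAP_ACTION,    # HTTP header carried with the request
    },
    method="POST",                    # HTTP start line: a POST request
)

# Nothing is sent; the request object is only inspected here.
headers = {name.lower(): value for name, value in request.header_items()}
print(request.get_method(), request.full_url)
print(headers)
```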

49
Web Service
  • Web Services Description Language (WSDL)
  • Specification of Web Service Function
  • Document Structure

<definitions>
  <types> Complex types for arguments and return types </types>
  <message> Describes arguments and return values </message>
  <portType>  // interface
    <operation> Describes remote procedures </operation>
  </portType>
  <binding> Protocol used for invoking: SOAP (application client),
            HTTP (Web client), MIME (char, binary data) </binding>
  <service>
    <port> URL of Web Service (endpoint) </port>
  </service>
</definitions>
50
Web Service
  • Universal Description, Discovery, and Integration
    (UDDI)
  • Create, store, and search information
  • UDDI Data Structure
  • White Page information: company name,
    address, telephone number, and description
  • Yellow Page information: classification by
    industry (NAICS), by product (UNSPSC), and by
    area
  • Green Page information: technical information
    about the company, e.g. endpoint URL, URL of the
    WSDL document

51
Web Service
UDDI Structure
Element Name          Usage                                                               Information Classification
<businessEntity>      Company name, address                                               Corresponds to White Page
<publisherAssertion>  Association among businessEntity elements                           Corresponds to White Page
<identifierBag>       Substitution ID for businessEntity                                  Corresponds to Yellow Page
<categoryBag>         Information for classification                                      Corresponds to Yellow Page
<businessService>     Web Service name and description for company                        Corresponds to Green Page
<bindingTemplate>     Endpoint URL, tModel reference                                      Corresponds to Green Page
<tModel>              URL of WSDL to define methods, argument data types for Web service  Corresponds to Green Page
52
Web Service
<tModel> </tModel>
<businessEntity>
  <businessService>
    <bindingTemplate> Reference </bindingTemplate>
  </businessService>
</businessEntity>
<publisherAssertions>
  <publisherAssertion> Association </publisherAssertion>
</publisherAssertions>
53
Web Service Demonstration
  • Web Service in J2EE