XML in Brief

Provided by: asuman9

1
XML in Brief
http://www.srdc.metu.edu.tr/ibrahim/courses/cse537/
2
XML: Extensible Markup Language
  • Extensible Markup Language has become the
    universal standard for representing data and
    metadata
  • A subset of SGML (Standard Generalized Markup
    Language)
  • Defined by the W3C: http://www.w3.org/TR/REC-xml
  • Started out as a standard data exchange format
    for the Web (can be considered a successor of
    HTML)
  • It has quickly become the fundamental instrument
    in the development of Web-based online
    information services and electronic commerce
    applications
  • Almost all recent electronic commerce standards
    are based on XML

3
XML
  • Provides the ability to describe data in an open,
    text-based format, so it can easily be delivered
    over standard HTTP
  • Content-oriented tagging provides self-describing
    data that is both machine- and human-readable
  • HTML provides a universal method for displaying
    data; XML provides a universal method for
    describing data
  • Many applications on the Web use XML for hosting
    large amounts of structured and semi-structured
    data
  • Representation of information in XML documents
    has been increasing at an astonishing pace
  • According to Meta Group, by 2005 about 65% of
    corporate data will be stored in XML format

4
Maturity of Web Infrastructure
[Figure: diagram with the labels Technology, Standard, Innovation, Browse the Web, Program the Web]
5
XML helps address the challenge
  • The data is self-describing
  • i.e. the meaning of the data is included:
    identifiers surround every bit of data,
    indicating what it means
  • A very flexible method of representing transmitted
    information
  • e.g. batch orders sent together can have
    different fields and formats without breaking the
    applications on each end
  • Open, standard technologies for moving,
    processing and validating the data
  • e.g. an XML parser can automatically parse,
    validate, and feed the information to an
    application, instead of every application having
    to include this functionality itself (in an
    object-oriented environment)

6
XML: An Example
Data stream in a typical interface:
    Electronic Commerce, 100, Jones, 25, Addison-Wesley
Same data stream in XML:
    <BOOK>
      <TITLE>Electronic Commerce</TITLE>
      <QUANTITY>100</QUANTITY>
      <AUTHOR>Jones</AUTHOR>
      <PRICE>25</PRICE>
      <PUBLISHER>Addison-Wesley</PUBLISHER>
    </BOOK>
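
As a rough illustration of the difference between the two data streams, the following Python sketch reads the same record both ways using only the standard library. The variable names are made up for this example; the values are the ones from the slide.

```python
import xml.etree.ElementTree as ET

# The same record in both forms (values taken from the slide).
csv_record = "Electronic Commerce, 100, Jones, 25, Addison-Wesley"
xml_record = """<BOOK>
  <TITLE>Electronic Commerce</TITLE>
  <QUANTITY>100</QUANTITY>
  <AUTHOR>Jones</AUTHOR>
  <PRICE>25</PRICE>
  <PUBLISHER>Addison-Wesley</PUBLISHER>
</BOOK>"""

# Positional format: the reader must already know what each column means.
fields = [f.strip() for f in csv_record.split(",")]

# Tagged format: every value carries its own name.
book = ET.fromstring(xml_record)
record = {child.tag: child.text.strip() for child in book}

print(fields[3])        # meaning depends on the position in the stream
print(record["PRICE"])  # meaning is stated by the tag
```

Reordering the fields or adding a new one would break the positional reader, while the tagged reader keeps working as long as the tag names stay the same.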
7
Mark up (or Tagging)
  • XML uses textual markup to define data
  • An XML document is composed of a collection of
    tagged elements, each containing
  • a start tag (<tagname>),
  • an end tag (</tagname>),
  • and the content between the two tags, as follows:
  • <tagname> content </tagname>
  • Example:
  • <PONumber> 1234ABCD </PONumber>

8
Tagging Data in XML
<PONumber> 1234ABCD </PONumber>
  • Considering the content only, it is not possible
    to understand what 1234ABCD stands for
  • The tag name PONumber intuitively tells that the
    content is a purchase order number
  • Similarly, an XML element might be tagged as
    name, gender, birth date, salary, price, and so on
  • XML is extensible in the sense that users can
    create their own tag names, which are neither
    predefined nor limited

9
Structure of XML Documents
  • XML models tree-structured (hierarchical),
    semi-structured data
  • A well-formed XML document contains exactly one
    root element, which constitutes the top level of
    nesting
  • Tagged elements may be nested to any depth to
    provide structured data, or may be repeated to
    represent a list of values, or a mixture of those
  • XML also provides attributes to describe
    additional information or properties of elements
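
The points above (single root, nesting, repetition, attributes) can be seen in a small Python sketch. The `order` document and the `outline` helper are hypothetical, written only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one root, nested elements, a repeated element,
# and attributes carrying extra properties.
doc = """<order id="A1">
  <customer><name>Jones</name></customer>
  <item sku="9344">16</item>
  <item sku="1200">2</item>
</order>"""


def outline(elem, depth=0):
    """Return one line per element; indentation shows the nesting depth."""
    attrs = " ".join(f'{k}="{v}"' for k, v in elem.attrib.items())
    lines = ["  " * depth + elem.tag + (" " + attrs if attrs else "")]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(doc)
tree_lines = outline(root)
print("\n".join(tree_lines))

# Repeated elements behave like a list of values.
skus = [item.get("sku") for item in root.findall("item")]
```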

10
Processing Instructions
  • Provide commands or information to an
    application that processes the document
  • <?targetApp instructions?>
  • targetApp is the name of the application that
    is expected to do the processing
  • instructions is a string of characters that
    contains the information to be passed to the
    processor

11
Processing Instruction Example
    <in.stock.quantity>
      <?EXECUTE type="PL/SQL"
                name="retrieve.in.stock.quantity"
                source="SELECT avail_Qty
                        FROM inventory_table
                        WHERE prodID=123" ?>
    </in.stock.quantity>
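
A minimal sketch of how an application might pick up such an instruction, using Python's `xml.dom.minidom`. The shortened instruction body is an assumption made to keep the example small; what the application does with the target and data is entirely up to it.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<in.stock.quantity>"
    '<?EXECUTE type="PL/SQL" name="retrieve.in.stock.quantity"?>'
    "</in.stock.quantity>"
)

# Collect (target, data) for every processing instruction under the root.
# The parser does not interpret the data; it just hands it over as a string.
instructions = [
    (node.target, node.data)
    for node in doc.documentElement.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```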

12
XML Namespaces
  • Namespaces are a simple and straightforward way
    to distinguish names used in XML documents, no
    matter where they come from
  • They provide a way to give elements and attributes
    programmer-friendly names that will be unique
    across the whole Internet
  • The prefixes are linked to the full names using
    attributes on the top element whose names begin
    with xmlns:
  • The prefixes are just shorthand placeholders for
    the full names, which are URIs, typically URLs
    (that is, Web addresses)

13
Example
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY rdf.common.schema SYSTEM
        "http://www.srdc.metu.edu.tr/sc/common/common.schema.rdf">
    ]>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax"
        xmlns:dc="http://purl.org/metadata/dublin_core">
      &rdf.common.schema;
      <rdf:Description about="http://www.srdc.metu.edu.tr/sc/M1/M1.catalog.xml">
        <dc:Title>M1's Electronic Catalog</dc:Title>
        <dc:Subject>Manufacturer</dc:Subject>
        <dc:Type>Electronic_Catalog</dc:Type>
        <dc:Format>CBL 1.2 - catalog.dtd</dc:Format>
        <dc:Identifier>"www.srdc.metu.edu.tr/sc/M1/M1.catalog.rdf"</dc:Identifier>
        <dc:Date>1999-02-02</dc:Date>
        <dc:Description>This resource contains the electronic catalog for
          manufacturer M1 in XML conforming to CBL 1.2</dc:Description>
      </rdf:Description>
    </rdf:RDF>
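
To show that a prefix is only shorthand for the full name, here is a small Python sketch with `xml.etree.ElementTree`. The two namespace URIs and the element names are hypothetical and are not the ones used on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this sketch.
doc = """<cat:catalog xmlns:cat="http://example.org/catalog"
                      xmlns:inv="http://example.org/inventory">
  <cat:title>Spring list</cat:title>
  <inv:title>Warehouse 7</inv:title>
</cat:catalog>"""

root = ET.fromstring(doc)

# The prefixes used in a query need not match those in the document;
# only the URI each prefix maps to matters.
ns = {"c": "http://example.org/catalog", "i": "http://example.org/inventory"}
catalog_title = root.find("c:title", ns).text
inventory_title = root.find("i:title", ns).text

# Internally the parser stores the full name: {namespace-URI}local-name.
expanded_root_name = root.tag
print(expanded_root_name)
```

Both `title` elements have the same local name, yet they are distinct because their full names differ.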

14
Document Type Definition
15
Example XML document
    <?xml version="1.0" ?>
    <PurchaseOrderRequest>
      <PONumber>1234ABCD</PONumber>
      <PurchaseOrderDate>20030601</PurchaseOrderDate>
      <LineItem>
        <Item_Identification No="9344" />
        <QuantityOrdered>16</QuantityOrdered>
        <UnitPrice>95</UnitPrice>
        <!-- this is the discount price we get -->
      </LineItem>
      <LineItem> </LineItem>
    </PurchaseOrderRequest>

16
Giving Structure to Data
[Figure: element tree with the nodes PurchaseOrderRequest, PONumber, PurchaseOrderDate, LineItem, Item_Identification, QuantityOrdered, UnitPrice]
17
DTD: Document Type Definition
  • The principal purpose is to declare the
    hierarchy of document elements
  • A document type definition defines:
  • The names of the elements,
  • The content model of each element,
  • How often and in which order elements may appear,
  • Whether the end-tags can be shortcut,
  • The possible presence of attributes and their
    default values,
  • The names of the entities

18
An Example DTD
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE simple [
      <!ELEMENT PurchaseOrderRequest (PONumber,
                                      PurchaseOrderDate, LineItem)>
      <!ELEMENT LineItem (Item_Identification,
                          QuantityOrdered,
                          UnitPrice)>
      <!ELEMENT Item_Identification (#PCDATA)>
      <!ELEMENT QuantityOrdered (#PCDATA)>
      <!ELEMENT UnitPrice (#PCDATA)>
      <!-- This is a comment line to state that the
           other elements are skipped --> ...
    ]>

19
DTD Syntax Rules
  • A DTD specifies the structure of an XML element
    by declaring the names of its sub-elements and
    attributes
  • Sub-element structure is specified using the
    operators:
  • , (comma): Sequence indicator
  • * (asterisk): Set with zero or more elements
  • + (plus sign): Set with one or more elements
  • ? (question mark): Optional
  • | (vertical bar): Or
  • () (parentheses): Grouping
  • Main constructs of DTDs:
  • Elements
  • Attributes
  • Entities

20
Main Constructs of DTD
  • Elements
  • <!ELEMENT element-name content-type>
  • where content-type is one of
    EMPTY | (#PCDATA) | (sub.element.list) | ANY
  • EMPTY: The element may not have any content.
  • #PCDATA: The element may have character data
    content, but may not contain any tagged data
    (sub-elements).
  • ANY: The element may contain character data
    and/or any tagged data; these may be mixed in
    any order.

21
Element Definition Examples
  • <!ELEMENT male EMPTY>
  • Use in XML document: <male></male>
  • or its shortcut: <male/>
  • <!ELEMENT bold (#PCDATA)>
  • Use in XML document: <bold> you can write a
    note here </bold>
  • (The data in between may not contain special
    characters such as < > &)
  • <!ELEMENT parag (#PCDATA | bold)*>
    (mixed content)
  • Use in XML document: <parag> This is a
    paragraph with <bold> bold </bold> and regular
    <bold> text </bold> </parag>

22
Element Definition Examples
  • Sequence: <!ELEMENT elem (A, B)>
  • Use in XML document: <elem> <A>...</A>
    <B>...</B> </elem>
  • Choice: <!ELEMENT elem (A | B)>
  • Use in XML document: <elem> <A>...</A> </elem>
  • Combination: <!ELEMENT elem (A, (B | C), D)>
  • Use in XML document: <elem> <A>...</A>
    <B>...</B> <D>...</D> </elem>
  • Grouping: <!ELEMENT elem (A, (((B | C), D), (E | F))?)>
  • Use in XML document: <elem> <A>...</A> </elem>
  • <!ELEMENT freedata ANY>
  • Use in XML document: <freedata> This
    <verb>is</verb> a <bold> bold </bold> <blue>
    <obj> text </obj> </blue> </freedata>
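
The sequence, choice, and grouping operators behave much like a regular expression over the list of child element names. The sketch below makes that analogy concrete in Python. It is not a DTD validator: `content_model_to_regex` and `conforms` are hypothetical helpers, and they handle only element names with the operators `,` `|` `?` `*` `+` and parentheses (no #PCDATA, EMPTY, or ANY).

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate a DTD element content model into a regular expression."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[(),|?*+]", model):
        if token == "(":
            parts.append("(?:")
        elif token == ",":
            continue  # a sequence is plain concatenation
        elif token in ")|?*+":
            parts.append(token)
        else:
            # An element name; the trailing space separates child names.
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def conforms(model, child_names):
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None


elem = ET.fromstring("<elem><A/><B/><D/></elem>")
children = [child.tag for child in elem]
print(conforms("(A, (B | C), D)", children))
print(conforms("(A, B)", children))
```

A real validating parser does considerably more (attributes, entities, mixed content), so this is only meant to show how the operators combine.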

23
Main Constructs of DTD
  • Attributes
  • <!ATTLIST element-name
        attribute-name attribute-type
        presence-spec default-value
    >
  • element-name: Name of the element that the
    attributes are defined for.
  • attribute-name: Name of the attribute
    being defined.
  • attribute-type: Indicates the type of the
    attribute, or the list of values that it may have.
  • presence-spec: Defines the presence
    requirements of the attribute:
  • #FIXED: The attribute always has the same
    fixed value, thus it does not need to be
    presented with the element.
  • #IMPLIED: The attribute is optional; it takes
    the given value only if it is presented.
  • #REQUIRED: The attribute must always be
    presented with the element.

24
Attribute Definition Examples
  • <!ELEMENT price (#PCDATA)>
  • <!ATTLIST price
        currency CDATA #REQUIRED>
  • Use in XML document:
    <price currency="USD">123.50</price>
  • (Conforms to the DTD even if the currency code
    is mistakenly written as UST!)
  • <!ATTLIST price
        currency (USD | CAD | GBP | TRL) #REQUIRED>
  • <!ATTLIST price
        currency (USD | CAD | GBP | TRL) "TRL">
  • <!ATTLIST price
        currency CDATA #IMPLIED>
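
The difference between a CDATA attribute and an enumerated one can be mimicked with a hand-written check in Python. This is only an illustration of the rule, not how a validating parser is implemented; `check_currency` and `ALLOWED_CURRENCIES` are names invented for this sketch.

```python
import xml.etree.ElementTree as ET

# The value list from the enumerated ATTLIST declaration on the slide.
ALLOWED_CURRENCIES = {"USD", "CAD", "GBP", "TRL"}


def check_currency(price_xml, enumerated):
    """Return True if the currency attribute would satisfy the declaration.

    enumerated=False models `currency CDATA #REQUIRED`;
    enumerated=True models `currency (USD | CAD | GBP | TRL) #REQUIRED`.
    """
    value = ET.fromstring(price_xml).get("currency")
    if value is None:
        return False  # #REQUIRED: the attribute must be present
    if enumerated:
        return value in ALLOWED_CURRENCIES
    return True  # CDATA accepts any string, including typos


typo = '<price currency="UST">123.50</price>'
print(check_currency(typo, enumerated=False))  # any text passes as CDATA
print(check_currency(typo, enumerated=True))   # the typo is rejected
```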

25
Main Constructs of DTD
  • General Entities
  • <!ENTITY entity-name "entity-value">
  • Entities are used to define shortcuts to common
    text or common definitions.
  • Entities can be either internal or external.
  • Internal entity:
  • <!ENTITY cValue "this is an example of
    commonly used text">
  • External entity:
  • <!ENTITY cValue SYSTEM
    "http://www.abc.com/commontxt.xml">
  • Use in XML document: <text> &cValue; </text>
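
A short Python sketch of internal entity expansion, using the `cValue` entity from the slide declared in an internal DTD subset. It relies on the standard-library parser expanding internal general entities; only the internal case is shown.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<!DOCTYPE text [
  <!ENTITY cValue "this is an example of commonly used text">
]>
<text>&cValue;</text>"""

# The parser replaces the reference &cValue; with the declared text,
# so the application never sees the entity name.
expanded = ET.fromstring(doc).text
print(expanded)
```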

26
Main Constructs of DTD
  • Parameter Entities
  • <!ENTITY % entity-name "entity-value">
  • Parameter entities can only be used in DTDs.
  • Using an example parameter entity:
  • <!ENTITY % myURL "http://www.abc.com">
  • <!ENTITY cValue SYSTEM "%myURL;/commontxt.xml">
  • Or
  • <!ENTITY % dataType "INTEGER | FLOAT | STRING">
  • <!ATTLIST varName
        varType (%dataType;) #REQUIRED>

27
Limitations of DTD
  • Extremely limited data typing (no Integer, Float,
    Enumeration or formatted data descriptions, etc.)
  • Very limited power to describe the number of
    occurrences of an element (remember *, +, ?)
  • No support for inheritance in schema definitions,
    which results in very large DTDs
  • Not enough support for self-documentation
  • DTD is not XML: DTDs are written in their own
    language, not in XML

28
XML Parsers
29
XML Parsers
  • A parser takes an XML document and makes its
    structure and content available to an application
    through an API
  • There are two main Application Programming
    Interfaces (APIs) that parsers provide:
  • Document Object Model (DOM) and
  • Simple API for XML (SAX)
  • Today, many parsers are both DOM and SAX compliant

30
The SAX API
  • The Simple API for XML (SAX) is an event-based
    parsing API.
  • It reads an XML document from beginning to end;
    each time it recognizes a construct, it raises
    the relevant event.
  • For example:
  • When a start tag (<...>) is read, startElement
    is raised
  • When an end tag (</...>) is read, endElement
    is raised
  • When character data is read, the characters
    event is raised
  • A SAX parser cannot modify the XML document
  • Events are actually methods of the ContentHandler
    object.

31
A SAX API example
  • <PurchaseOrderRequest>: parser calls
    startElement
  • <PONumber> 1234ABCD </PONumber>: parser calls
    startElement, characters, and endElement
  • <PODate>20030601</PODate>: parser calls
    startElement, characters, and endElement
  • <Quantity><box>3</box></Quantity>: parser
    calls startElement, startElement, characters,
    endElement and endElement
  • </PurchaseOrderRequest>: parser calls endElement
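
The call sequence above can be reproduced with Python's `xml.sax` module. In this sketch, `EventLog` is a hypothetical handler that simply records which ContentHandler method was called; the document is a shortened version of the one on the slide.

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Record every SAX event as a (method name, value) pair."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        # Skip whitespace-only chunks to keep the log readable.
        if content.strip():
            self.events.append(("characters", content.strip()))


doc = (
    b"<PurchaseOrderRequest>"
    b"<PONumber>1234ABCD</PONumber>"
    b"<Quantity><box>3</box></Quantity>"
    b"</PurchaseOrderRequest>"
)

log = EventLog()
xml.sax.parseString(doc, log)
for event in log.events:
    print(event)
```

The handler only observes the stream as it goes by; it never holds the whole document, which is why SAX suits large inputs but cannot modify them in place.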

32
XML DOM Parser
A parser validates and makes the data contained
in an XML document available to the application
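
For contrast with SAX, a DOM parser builds the whole document as an in-memory tree that the application can navigate and change. A minimal sketch with Python's `xml.dom.minidom` follows; note that minidom is a non-validating parser, so the validation step mentioned on the slide is not shown here.

```python
from xml.dom import minidom

dom = minidom.parseString(
    "<PurchaseOrderRequest>"
    "<PONumber>1234ABCD</PONumber>"
    "<LineItem><QuantityOrdered>16</QuantityOrdered></LineItem>"
    "</PurchaseOrderRequest>"
)

# Random access: jump straight to any element in the in-memory tree.
qty_node = dom.getElementsByTagName("QuantityOrdered")[0]
original_qty = qty_node.firstChild.data

# Unlike with SAX, the tree can be modified and written back out.
qty_node.firstChild.data = "20"
updated_xml = dom.documentElement.toxml()
print(updated_xml)
```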
33
Extensible Stylesheet Language (XSL)
34
Extensible Stylesheet Language
  • XSL describes the presentation of data
  • Three parts of XSL
  • XSLT (language for transforming XML documents)
  • XPath (language for extracting parts of an XML
    document)
  • XSL Formatting Objects (vocabulary for formatting
    XML documents)

35
XSL Functions
  • Transforms XML into any other language or format
  • Filters and sorts XML data
  • Extracts parts of an XML document
  • Formats XML data based on the data value (like
    displaying negative numbers in red)
  • Outputs XML data to different devices, like
    screen, paper or voice (SpeechML)
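
Python's standard library does not include an XSLT processor, so the sketch below only imitates in plain Python what a stylesheet would do declaratively: extract, filter, sort, and format by value. The `accounts` document is hypothetical.

```python
import xml.etree.ElementTree as ET

source = ET.fromstring("""<accounts>
  <account name="beta" balance="-40"/>
  <account name="alpha" balance="120"/>
  <account name="gamma" balance="15" closed="yes"/>
</accounts>""")

# Extract with a path expression, then filter out closed accounts.
open_accounts = [a for a in source.findall("account") if a.get("closed") != "yes"]

# Sort by name and build the target document (an HTML list),
# formatting negative balances in red.
target = ET.Element("ul")
for acc in sorted(open_accounts, key=lambda a: a.get("name")):
    li = ET.SubElement(target, "li")
    li.text = f'{acc.get("name")}: {acc.get("balance")}'
    if float(acc.get("balance")) < 0:
        li.set("style", "color:red")

html = ET.tostring(target, encoding="unicode")
print(html)
```

With XSLT the same rules would live in a separate stylesheet, so the source data could be rendered differently for each output device without touching application code.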

36
XML Style Sheet Processing
Source: Software AG
38
XSLT Processor
  • Converts an XML document to another form
  • An XSL style sheet is a set of transformation
    instructions for converting a source XML document
    to a target document

39
Multiple uses of data with XML and XSL
  • Data is created once and can be delivered in many
    different formats

Source: OASIS
40
Impact of XML
  • Interchange mechanism between applications
  • Rapidly becoming the database language of the Web
  • XML client/server/server transactions over HTTP
  • XML properties
  • Scalable
  • Maintainable
  • Easy to use (spreadsheet style skills)
  • Interoperable (exchange business components)

41
Why use XML?
42
Why use XML?
  • It is a W3C recommendation and has become a
    universally accepted standard way of structuring
    data and metadata (syntax)
  • It models tree/forest-type hierarchical,
    semi-structured data
  • It facilitates implementation-level
    interoperability by providing
    hardware-representation-free data exchange
  • Anyone can invent new tags for any purpose and
    define their meaning in DTDs or schemas
  • Content-oriented tagging provides self-describing
    and human-readable data
  • Strong industrial and academic support
  • A lot of free or inexpensive tools
  • Expected to be the dominant data representation
    format on the Internet in the near future

43
XML versus EDI
44
XML vs EDI
45
XML vs EDI
46
Interoperability and XML
47
Extensibility in XML
  • Anyone can invent new tags and attach a meaning
    to those tags
  • For example
  • <HandHeldDevice> This device </HandHeldDevice>
  • <MobileDevice> This device </MobileDevice>
  • But if every user creates their own XML definitions
    for describing their data, it is not possible to
    achieve interoperability

48
Agreement on tags is necessary
  • A tagged document is not very useful without some
    kind of agreement on the tags among
    inter-operating applications

[Figure labels: Mobile Device, Hand Held Device]
49
Standardizing XML Documents
  • There are many efforts in this respect:
  • ebXML for eBusiness
  • Commerce XML
  • Common Business Library (xCBL)
  • Universal Business Language
  • RosettaNet
  • Web Service standards
  • HL7 for health care
  • Almost any standard that you can think of is
    being developed in XML

50
eBusiness XML standards
  • ebXML (eBusiness XML)
  • eCX (eCatalog XML)
  • Tracker XML (for import/export)
  • tXML (Transportation XML, for logistics)
  • (VISA XML Invoice Specification)
  • XBRL (eXtensible Business Reporting Language)

51
Industry XML Schemas
  • Many XML schemas have been defined for specific
    tasks or marketplaces

Source: Microsoft
52
Questions