XML for Beginners

Provided by: sueelle
1
XML for Beginners
  • Whys, Whats, Hows

2
HTML
  • Hypertext Markup Language
  • Used to apply structural markup
  • Limited number of tags
  • Not extensible (new tags/elements cannot be created
    on the fly, so the fixed tag set is machine-readable)
  • Employs attributes
  • May employ scripting (JavaScript, etc.)
  • May employ cascading style sheets (CSS)
  • Used for the web, HTML help, etc.
  • ANSI text file

3
Defining XML
  • The Extensible Markup Language (XML) is a subset
    of SGML. Its goal is to enable generic SGML to be
    served, received, and processed on the Web in the
    way that is now possible with HTML. XML has been
    designed for ease of implementation and for
    interoperability with both SGML and HTML.

4
XML, SGML, HTML
  • XML is a profile of SGML.
  • XML is less complex than SGML but more complex
    than HTML.
  • It has been said that XML provides 80% of the
    benefit of SGML with 20% of the effort.

5
What is XML?
  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to describe data
  • XML tags are not predefined. You must define your
    own tags
  • XML uses a Document Type Definition (DTD) or an
    XML Schema to describe the data
  • XML with a DTD or XML Schema is designed to be
    self-descriptive

6
Difference between XML and HTML
  • XML was designed to carry data
  • XML is not a replacement for HTML. XML and HTML
    were designed with different goals.
  • XML was designed to describe data and to focus on
    what data is. HTML was designed to display data
    and to focus on how data looks.
  • HTML is about displaying information, while XML
    is about describing information.

7
XML
  • Extensible Markup Language
  • Semantic markup applied
  • Extensible: any type of element can be used
  • Must be well-formed (and valid, where a DTD or
    schema applies) to work
  • May employ attributes
  • May employ XSL for transform, format
  • Standardized text format designed specifically
    for transmitting structured content to web
    applications
  • Unambiguous: easy-to-understand hierarchy
  • Elements have parent-child relationships
  • Unicode text file
  • Able to validate in IE with the proper tools
  • Strict formatting against DTD
  • Can be sorted more easily than HTML
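
The parent-child hierarchy described above can be made concrete with a short sketch. It uses Python's standard xml.etree.ElementTree module; the library/book/title element names are invented for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are author-defined, which is
# what "extensible" means in practice.
DOC = """<library>
  <book id="b1">
    <title>XML Basics</title>
    <year>2001</year>
  </book>
  <book id="b2">
    <title>Advanced Markup</title>
    <year>1999</year>
  </book>
</library>"""

root = ET.fromstring(DOC)


def outline(element, depth=0):
    """Return one indented line per element to show the hierarchy."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(root)
print("\n".join(tree_lines))

# Attributes are read per element; here, the id of every book.
book_ids = [book.get("id") for book in root.findall("book")]
print(book_ids)
```

Each element has exactly one parent, so the whole document can be walked with a simple recursive function.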

8
The design goals for XML
  • XML shall be straightforwardly usable over the
    Internet.
  • Supported by browsers
  • Platform independent
  • XML shall support a wide variety of applications.
  • XML shall be compatible with SGML.
  • It shall be easy to write programs which process
    XML documents.
  • The number of optional features in XML is to be
    kept to the absolute minimum, ideally zero.
  • However, the absolute requirements are global and
    few in number

9
The design goals for XML
  • XML documents should be human-legible and
    reasonably clear.
  • Human-legible: humans can read them, though not
    all humans can necessarily interpret them
  • Humans do not have to have specific applications
    (other than a text editor) to look at an XML file
    in order to make sense of it.
  • The XML design should be prepared quickly.
  • XML editors have evolved, however, that make this
    much easier.
  • It's like the difference between hard-coding HTML
    and using an HTML editor.
  • The design of XML shall be formal and concise.
  • XML documents shall be easy to create.
  • Terseness in XML markup is of minimal importance.

10
Why XML
  • HTML standards change too slowly.
  • HTML 3.2 and 4.0 exist, but further development
    is slow.
  • Difficulty of constraining everyone to use the
    same set of rules.
  • XML solution:
  • Flexible framework that allows the users to
    declare their own sets of constraints.
  • Within a high-level limited set of very demanding
    rules.

11
HTML vs. XML
  • HTML 3.2 together with CGI scripts, Java applets,
    and JavaScript (and its derivatives), plus
    plug-ins such as Shockwave, RealPlayer, and
    QuickTime provide Web authors and commercial
    sites with a rich array of techniques for
    displaying content that is visually compelling
    and possibly even informative. However, these
    techniques do little if anything for the
    representation of structured data unless one
    introduces middleware solutions.

12
Why XML
  • Browser-specific markup impairs the concept of a
    World Wide Web that is universal and
    platform-independent.
  • Browser-specific extensions to the rules.
  • Difficulties in getting all options to display in
    all browsers.
  • The great browser wars (Microsoft vs. Netscape,
    Opera, etc.)

13
Why XML
  • HTML is incapable of delivering web-enabled
    applications.
  • Web-based working environments
  • Group authoring in distributed environments
  • Group database management in distributed
    environments (for example, multilingual
    terminology and TM database initiatives)
  • Lean-client, web-server-based applications

14
Why XML
  • Search engines return too many hits.
  • Use of axiom-based ontology resources (OWL
    environment) instead of keyword-based searches
  • Enabling inference-based computing on the Web
  • Expansion of the depth of metadata that can be
    associated with specific pages and sets of
    information
  • Metadata initiatives
  • RDF development
  • Dublin Core
  • EPA, Institutes of Health, Terminology data
    management, etc.

15
Why XML: Character Encoding
  • HTML is based on ISO 8859
  • 15 parts: multiple pages, each for a different
    block of characters
  • Impossible to encode multiple, divergent character
    sets at once (e.g., Russian, English, and Chinese
    at the same time)
  • XML is based on Unicode
  • The Unicode Standard: Unicode defines the
    universal character set. Its primary goal is to
    provide an unambiguous encoding of the content of
    plain text, ultimately covering all languages in
    the world.
  • http://www.unicode.org/standard/WhatIsUnicode.html
  • Unicode in XML and Other Markup Languages:
    http://www.w3.org/TR/unicode-xml/
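
The encoding limitation above can be checked directly. The sketch below is illustrative only: it tries to encode one mixed-script string (sample text invented here) with two ISO 8859 parts and with UTF-8, then parses a UTF-8 XML document containing the same text.

```python
import xml.etree.ElementTree as ET

# Sample text mixing Latin, Cyrillic, and Chinese characters (invented).
TEXT = "English, Русский, 中文"


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# ISO 8859-1 covers Western European characters and ISO 8859-5 covers
# Cyrillic; neither covers all three scripts at once.
results = {
    codec: can_encode(TEXT, codec)
    for codec in ("iso-8859-1", "iso-8859-5", "utf-8")
}
print(results)

# An XML document declares its encoding and can carry all three scripts.
xml_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f"<greeting>{TEXT}</greeting>"
).encode("utf-8")
parsed_text = ET.fromstring(xml_bytes).text
print(parsed_text == TEXT)
```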

16
Example: HTML
  • <B>John Q Public</B> <P> john.q.public.1@gsfc.nasa.gov<BR>
    phone 301-286-aaaa<BR> fax 301-286-bbbb<BR>
    Bldg. 23, Rm. 999<BR> NASA<BR>
    Goddard Space Flight Center<BR>
    588.0<BR>Greenbelt, MD 20221<BR>

17
Example: XML
  • <EMPLOYEE>
      <NAME>
        <FIRST>John</FIRST> <MIDDLE>Q</MIDDLE> <LAST>Public</LAST>
      </NAME>
      <EMAIL>john.q.public.1@gsfc.nasa.gov</EMAIL>
      <PHONE>301-286-aaaa</PHONE>
      <FAX>301-286-bbbb</FAX>
      <LOCATION>
        <BUILDING>Bldg. 23</BUILDING> <ROOM>999</ROOM>
      </LOCATION>
      <ADDRESS>
        <ORG>NASA</ORG>
        <CENTER>Goddard Space Flight Center</CENTER>
        <MAILSTOP>588.0</MAILSTOP>
        <CITY>Greenbelt</CITY> <STATE>MD</STATE> <ZIP>20221</ZIP>
      </ADDRESS>
    </EMPLOYEE>
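
Because each value in the XML version sits inside a named element, a program can look it up by name rather than by position between line breaks. A minimal sketch with Python's standard xml.etree.ElementTree, using an abbreviated copy of the employee record:

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the EMPLOYEE record from the example slide.
EMPLOYEE_XML = """<EMPLOYEE>
  <NAME><FIRST>John</FIRST><MIDDLE>Q</MIDDLE><LAST>Public</LAST></NAME>
  <PHONE>301-286-aaaa</PHONE>
  <LOCATION><BUILDING>Bldg. 23</BUILDING><ROOM>999</ROOM></LOCATION>
  <ADDRESS><CITY>Greenbelt</CITY><STATE>MD</STATE><ZIP>20221</ZIP></ADDRESS>
</EMPLOYEE>"""

employee = ET.fromstring(EMPLOYEE_XML)

# findtext takes a path of element names and returns the text it finds.
last_name = employee.findtext("NAME/LAST")
room = employee.findtext("LOCATION/ROOM")
city = employee.findtext("ADDRESS/CITY")
state = employee.findtext("ADDRESS/STATE")
city_state = f"{city}, {state}"

print(last_name, room, city_state)
```

Doing the same with the HTML version would mean guessing which text between <BR> tags is the room number, since the markup says nothing about what each line means.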

18
Advantages of XML
  • XML preserves the semantics and structure of
    data.
  • XML presents information as hierarchical data.
  • XML data can be parsed and processed in the same
    way that you can process application-specific
    database data.
  • But because it is human-readable (i.e.,
    universally accessible and uncompiled data), it
    can be accessed and used by a variety of
    different systems.
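
As a rough illustration of processing XML like database data, the sketch below turns each record into a row (a dictionary), then filters and sorts the rows. The STAFF wrapper element and the sample records are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of records; only EMPLOYEE, LAST, and BUILDING echo the
# element names used in the example slide.
STAFF_XML = """<STAFF>
  <EMPLOYEE><LAST>Public</LAST><BUILDING>23</BUILDING></EMPLOYEE>
  <EMPLOYEE><LAST>Doe</LAST><BUILDING>7</BUILDING></EMPLOYEE>
  <EMPLOYEE><LAST>Adams</LAST><BUILDING>23</BUILDING></EMPLOYEE>
</STAFF>"""

staff = ET.fromstring(STAFF_XML)

# One dictionary per EMPLOYEE, keyed by child element name: a table row.
rows = [
    {child.tag: child.text for child in employee}
    for employee in staff.findall("EMPLOYEE")
]

# Roughly "SELECT LAST ... WHERE BUILDING = '23' ORDER BY LAST".
in_building_23 = sorted(row["LAST"] for row in rows if row["BUILDING"] == "23")
print(in_building_23)
```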

19
XML Criteria
  • An XML document must be well-formed.
  • To be valid, an XML document must also conform to
    its DTD, schema, or internally declared values.
  • Declaration of legal tags
  • Declaration of legal attributes
  • Declaration of legal attribute values
  • Declaration of permissible instances

20
Well-formedness
  • End tags must be present
  • Nested tags must not overlap
  • Start and end tags must match in case (XML is
    case-sensitive); XHTML additionally requires
    lowercase tag and attribute names
  • Use of quote marks for attribute values is
    required, as is consistent use of single or
    double quotes
  • End tags or end markers are required, even for
    empty tags
  • http://www.w3.org/TR/xhtml1/#h-4.1
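
These rules can be tried out with any XML parser, since a conforming parser rejects documents that are not well-formed. The sketch below uses Python's standard xml.etree.ElementTree; the sample strings are invented, one per rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One invented sample per rule on the slide.
samples = {
    "well-formed": '<note lang="en"><to>Ann</to><br/></note>',
    "missing end tag": "<note><to>Ann</note>",
    "overlapping tags": "<b><i>text</b></i>",
    "case mismatch": "<Note>text</note>",
    "unquoted attribute": "<note lang=en>text</note>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Note that the empty element is written as <br/>, which serves as both start and end tag.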

21
Validity
  • A DTD or schema defines specific rules
  • Writers of valid documents follow these rules in
    addition to well-formedness constraints
  • Two-part checking: well-formedness check, then
    validity check
  • Functionality provided by Internet Explorer 6
  • http://msxml.com/xml_tutorial/valid-xml.html
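
The two-part check can be sketched as follows. Python's standard xml.etree.ElementTree parser checks well-formedness only, so the second step here uses a small hand-written rule table (RULES, a hypothetical stand-in for a DTD) that lists which child elements each element may contain. This is an illustration of the idea, not a real DTD or schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules standing in for a DTD: element name -> allowed children.
RULES = {
    "EMPLOYEE": {"NAME", "EMAIL"},
    "NAME": {"FIRST", "LAST"},
    "FIRST": set(),
    "LAST": set(),
    "EMAIL": set(),
}


def check(text):
    """Step 1: well-formedness. Step 2: conformance to RULES."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return "not well-formed"
    for element in root.iter():
        allowed = RULES.get(element.tag)
        if allowed is None:  # element name was never declared
            return "invalid"
        if any(child.tag not in allowed for child in element):
            return "invalid"
    return "valid"


print(check("<EMPLOYEE><NAME><FIRST>John</FIRST></NAME></EMPLOYEE>"))
print(check("<EMPLOYEE><NAME><FIRST>John</NAME></EMPLOYEE>"))
print(check("<EMPLOYEE><NICKNAME>JQ</NICKNAME></EMPLOYEE>"))
```

The third document is well-formed but uses an element the rules never declared, which is the difference between the two checks.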

22
xml:lang
  • Universal locale-oriented language code
  • Language code + country code (e.g., en-US)
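
A short sketch of reading the attribute with Python's standard xml.etree.ElementTree. The parser reports the reserved xml: prefix by its namespace name, so the attribute key is written out in full below; the greetings document and the split_tag helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document with one greeting per locale.
DOC = """<greetings>
  <greeting xml:lang="en-US">Hello</greeting>
  <greeting xml:lang="fr-CA">Bonjour</greeting>
  <greeting xml:lang="de">Hallo</greeting>
</greetings>"""

root = ET.fromstring(DOC)


def split_tag(value):
    """Split a simple tag such as 'en-US' into (language, country or None)."""
    language, _, country = value.partition("-")
    return language, country or None


by_lang = {g.get(XML_LANG): g.text for g in root.iter("greeting")}
for tag, text in by_lang.items():
    print(split_tag(tag), text)
```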

23
Viewing an XML document in IE