Title: XML
1. XML
- eXtensible Markup Language
2. XML
- A method of defining a format for exchanging documents and data
- Allows one to define a dialect of XML
- A library of tags, with associated structure

  <config>
    <descriptor type="FILE" name="source">
      <attribute name="media_type" type="svalue"/>
      <attribute name="frame_rate" type="svalue"/>
    </descriptor>
  </config>
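As a rough illustration of what a "library of tags, with associated structure" buys you, the sketch below reads the descriptor configuration shown on this slide with Python's standard `xml.etree.ElementTree` module. The function name `list_attributes` is made up for this example.

```python
import xml.etree.ElementTree as ET

# The descriptor configuration from the slide, as a string.
CONFIG_XML = """\
<config>
  <descriptor type="FILE" name="source">
    <attribute name="media_type" type="svalue"/>
    <attribute name="frame_rate" type="svalue"/>
  </descriptor>
</config>
"""


def list_attributes(xml_text):
    """Return (descriptor name, attribute name, attribute type) triples."""
    root = ET.fromstring(xml_text)  # root is the <config> element
    rows = []
    for descriptor in root.findall("descriptor"):
        for attribute in descriptor.findall("attribute"):
            rows.append((descriptor.get("name"),
                         attribute.get("name"),
                         attribute.get("type")))
    return rows


if __name__ == "__main__":
    for row in list_attributes(CONFIG_XML):
        print(row)
```

Because the structure is carried by the tags themselves, the reading code only names the elements it cares about and needs no hand-written tokenizer.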
3. The Social Benefits
- Can specify an interchange format concisely and accurately enough to set up a validation service easily
- There is plenty of available software for dealing with XML files and translating from one format into another
4. Downsides
- Sometimes defining a representation can be a pain
  - Deciding what to leave as content and what to move to attributes
  - XML Schemas are confusing, while DTDs do not offer enough control
- Verbose
  - ViPER files grew about 2x uncompressed, 4/3x gzip compressed
- Difficult to read
  - Lots of < and > characters and end tags get in the way of the data
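The verbosity point can be checked on your own data with a few lines of Python. The sketch below is only an illustration: the `<span>` records and the plain-text layout are hypothetical, not the ViPER format, and the ratios you measure will depend entirely on the data.

```python
import gzip


def make_records(count):
    """Build the same hypothetical frame spans as plain text and as XML."""
    spans = [(i, i + 10) for i in range(count)]
    plain = "".join(f"{start} {end}\n" for start, end in spans)
    xml = "<spans>\n" + "".join(
        f'  <span start="{start}" end="{end}"/>\n' for start, end in spans
    ) + "</spans>\n"
    return plain.encode("utf-8"), xml.encode("utf-8")


def size_report(count=1000):
    """Return raw and gzip-compressed sizes, in bytes, for both encodings."""
    plain, xml = make_records(count)
    return {
        "plain_raw": len(plain),
        "xml_raw": len(xml),
        "plain_gzip": len(gzip.compress(plain)),
        "xml_gzip": len(gzip.compress(xml)),
    }


if __name__ == "__main__":
    for label, size in size_report().items():
        print(f"{label}: {size} bytes")
```

Repeated tag names compress well, which is why the overhead tends to shrink once the file is gzipped.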
5. The Real Benefits to The Programmer
- XML Schema (or DTDs) allow you to validate a document without having to examine it yourself
- XPath allows you to specify a node, or set of nodes, in a document quickly and easily
- SAX makes it easy to write a quick parser
- DOM makes it so you don't even have to do that
- XSLT allows you to transform an XML document into another document, possibly not even standard XML
- Etc.
6. XML As A File Format
- Makes parsing simpler, but currently no methods for making saving easier
- Saves you from dealing with things like character encoding and date formatting
- No more difficult than making up your own
- An unfamiliar or forgotten file grants more affordances than an XML or binary file
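To make the character-encoding point concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The `<note>` element and the function names are invented for the example. The encoding is chosen once when writing, recorded in the XML declaration, and picked up automatically when reading.

```python
import xml.etree.ElementTree as ET


def save_note(text, encoding="iso-8859-1"):
    """Serialize a one-element document to bytes in the given encoding.

    For an encoding other than UTF-8 or US-ASCII, ElementTree writes an
    XML declaration that names the encoding.
    """
    root = ET.Element("note")
    root.text = text
    return ET.tostring(root, encoding=encoding)


def load_note(data):
    """Parse the bytes back; the parser reads the encoding from the declaration."""
    return ET.fromstring(data).text


if __name__ == "__main__":
    data = save_note("Café")
    print(data)
    print(load_note(data))
```

The reading side never mentions an encoding, which is the convenience the slide refers to.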
7. Defining A Dialect
- XML Schema: Structure and Data
- Define elements and attributes
- Associate them with data types

  <?xml version="1.0" encoding="UTF-8"?>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              targetNamespace="http://lamp.cfar.umd.edu/viper"
              xmlns:viper="http://lamp.cfar.umd.edu/viper"
              elementFormDefault="qualified">
    <xsd:element name="viper"/>
    <xsd:element name="config"/>
  </xsd:schema>
8. Schema Datatypes
- Can create and assign datatypes to attributes and elements. For example:

  <xsd:element name="data" type="xsd:base64Binary"/>
  <xsd:attribute name="span" type="viper:framespanType"/>
  <xsd:simpleType name="framespanType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d\\d"/>
    </xsd:restriction>
  </xsd:simpleType>
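What a datatype check amounts to can be sketched in plain Python. This is not a schema validator, only an imitation of two facets. The pattern `\d+:\d+` (a start frame, a colon, an end frame) is an assumption chosen for illustration and is not taken from the ViPER schema. The Base64 check is also stricter than `xsd:base64Binary`, which tolerates embedded whitespace.

```python
import base64
import re

# Assumed pattern for a frame span such as "1:10" (illustrative only,
# not copied from the ViPER schema).
FRAMESPAN_PATTERN = re.compile(r"\d+:\d+")


def is_framespan(value):
    """Mimic an xsd:pattern facet: the whole value must match the pattern."""
    return FRAMESPAN_PATTERN.fullmatch(value) is not None


def is_base64_binary(value):
    """Rough stand-in for xsd:base64Binary: accept only strictly valid Base64."""
    try:
        base64.b64decode(value, validate=True)
        return True
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


if __name__ == "__main__":
    print(is_framespan("1:10"), is_framespan("1:10x"))
    print(is_base64_binary("aGVsbG8="), is_base64_binary("not base64!"))
```

With a schema, checks like these are declared once in the dialect definition instead of being rewritten in every program that reads the files.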
9. Schema Structures
- Can specify order and contents of elements
  - Sequence, choice, mixed, etc. allow specifying how and where elements appear
- Substitution groups allow one tag to take the place of another
- Can group elements without placing them into types
10. Extensibility
- Inheritance
  - Can extend complex elements by adding more attributes and elements to the bottom
  - Can restrict the data using the <restriction/> element
- The <any/> and <anyAttribute/> elements
  - The ultimate in extensibility: they allow any valid XML in from a given namespace or range of namespaces
11. Parsing
- Using the DOM
  - The DOM provides a tree structure that represents the document
  - Memory heavy
- Using SAX
  - Event driven
  - Lightweight
  - Better for large documents
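The two parsing styles can be compared side by side with Python's standard `xml.dom.minidom` and `xml.sax` modules. This is a minimal sketch: the sample document, including the second descriptor (`OBJECT` / `person`), is hypothetical, and the function names are made up for the example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document; only the first descriptor comes from the slides.
DOC = """<config>
  <descriptor type="FILE" name="source"/>
  <descriptor type="OBJECT" name="person"/>
</config>"""


def names_with_dom(xml_text):
    """DOM: build the whole tree in memory, then query it."""
    dom = minidom.parseString(xml_text)
    return [node.getAttribute("name")
            for node in dom.getElementsByTagName("descriptor")]


class DescriptorHandler(xml.sax.ContentHandler):
    """SAX: react to start-element events as the parser streams the input."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        if name == "descriptor":
            self.names.append(attrs.get("name"))


def names_with_sax(xml_text):
    """Run the SAX parser over the text and collect descriptor names."""
    handler = DescriptorHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.names


if __name__ == "__main__":
    print(names_with_dom(DOC))
    print(names_with_sax(DOC))
```

The DOM version keeps every node around for later queries. The SAX version keeps only what the handler chooses to store, which is why it suits large documents.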
12. XPath
- The common language for selecting individual pieces of an XML document, shared between XPointer and XSLT
- Also used for defining uniqueness constraints in Schemas
- DOM Level 3 will support selecting by XPath
- Looks sort of like a JavaScript DOM call
- /viper/config/descriptor[@type='FILE']
  - Selects all of the descriptor nodes that are of type FILE
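The path expression on this slide can be tried with the limited XPath subset in Python's standard `xml.etree.ElementTree`. The sample document below is hypothetical and leaves out the ViPER namespace for brevity. With a namespaced document, each step in the path would need a namespace prefix mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document without namespaces.
DOC = """<viper>
  <config>
    <descriptor type="FILE" name="source"/>
    <descriptor type="OBJECT" name="person"/>
  </config>
</viper>"""


def file_descriptors(xml_text):
    """Return the descriptor elements whose type attribute is FILE."""
    root = ET.fromstring(xml_text)  # root is the <viper> element
    # ElementTree paths are relative to the element being searched, so the
    # absolute path /viper/config/descriptor[@type='FILE'] is written as:
    return root.findall("./config/descriptor[@type='FILE']")


if __name__ == "__main__":
    for descriptor in file_descriptors(DOC):
        print(descriptor.get("name"))
```

The whole selection is a single string, so it can live in configuration or a schema constraint rather than in hand-written tree-walking code.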
13. Resources
- www.xml.com
  - O'Reilly's XML resource
- www.w3.org
  - The standards themselves, and lots of good links to implementations
- xml.apache.org
  - DOM, SAX, and XSLT for C++ and Java