Title: K. Scott Morrison
1 XML: The Promise and the Reality
- K. Scott Morrison
- IBM Pacific Development Centre
- Vancouver
2 XML Hype
XML will replace HTML
XML is for documents
XML is for data
XML will replace all message formats
3 Profile Edit Screen
NAME: K. Scott Morrison
ADDRESS: 8999 Nelson Way
CITY: Burnaby
STATE/PROV: B.C.
COUNTRY: Canada
CODE: V5A 4B5
TEL: (604) 293-5753
FAX: (604) 473-5807
CREDIT CARD1 TYPE: VISA NUM: 123456789 EXP: 12/00
CREDIT CARD2 TYPE: AMEX NUM: 987654321 EXP: 04/01
4 Screen Scrape
[Diagram: Remote System, Screen Scrape Server, Persistent Store]
5 Binary Representation
6 Binary Representation
- Issues
- Can't determine structure from data
- Portability
- Fixed field length
- Brittle interfaces
- Must modify all clients and servers simultaneously
- Mapping code typically buried in applications
- Significant maintenance problem
- Distribution of message map
- Not human readable
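The issues listed above can be seen in a small sketch. It assumes a hypothetical message map (a 20-byte name, a 10-byte city, and a 2-byte expiry); none of these field sizes come from the slides. The bytes alone say nothing about their structure, so both sides must hold an identical copy of the map.

```python
import struct

# Hypothetical message map (an assumption for illustration):
# 20-byte name, 10-byte city, 2-byte unsigned expiry, little-endian.
RECORD_FORMAT = "<20s10sH"


def pack_profile(name, city, expiry):
    """Encode a profile. Short strings are NUL-padded, long ones silently truncated."""
    return struct.pack(RECORD_FORMAT, name.encode("ascii"), city.encode("ascii"), expiry)


def unpack_profile(record):
    """Decode a record. This only works if the reader holds the same message map."""
    name, city, expiry = struct.unpack(RECORD_FORMAT, record)
    return name.rstrip(b"\0").decode("ascii"), city.rstrip(b"\0").decode("ascii"), expiry


record = pack_profile("K. Scott Morrison", "Burnaby", 1200)
print(record)                  # opaque bytes, not human readable
print(unpack_profile(record))

# Fixed field length: a 30-character name loses its last 10 characters.
truncated_name = unpack_profile(pack_profile("A" * 30, "X", 1))[0]
print(len(truncated_name))
```

Changing any field width in `RECORD_FORMAT` breaks every client and server that still holds the old map, which is the brittleness the slide describes.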
7 ANSI_X3.4-1968 (US-ASCII) Text Representation
8 US-ASCII Text Representation
- Standards-based, reasonably portable
- Human readable
- Can make conjectures about semantics
- Issues
- Limited character set
- NLS problems
- No real structure: hierarchy, lists, etc.
- Still very brittle
- Distribution of message maps
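A fixed-column ASCII record illustrates both sides of this slide. The sketch below assumes a hypothetical column layout (`MESSAGE_MAP`) and a hypothetical non-ASCII city value. The record is readable by eye, but the reader still needs the map, and text outside US-ASCII cannot be encoded at all.

```python
# Hypothetical message map (an assumption): field name -> (start, end) columns.
MESSAGE_MAP = {"name": (0, 20), "city": (20, 30), "expiry": (30, 34)}


def format_record(name, city, expiry):
    """Pad or truncate each field to its column width, then encode as US-ASCII."""
    line = f"{name:<20.20}{city:<10.10}{expiry:<4.4}"
    return line.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text


def parse_record(record):
    """Slice the record by column offsets; the offsets live only in MESSAGE_MAP."""
    text = record.decode("ascii")
    return {field: text[start:end].rstrip() for field, (start, end) in MESSAGE_MAP.items()}


record = format_record("K. Scott Morrison", "Burnaby", "1200")
print(record)
print(parse_record(record))

# Limited character set: an accented city name (illustrative value) is rejected.
try:
    format_record("K. Scott Morrison", "Montréal", "1200")
    nls_failed = False
except UnicodeEncodeError:
    nls_failed = True
print(nls_failed)
```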
9 Proprietary Tagging
10 Proprietary Tagging
- Human understandable
- Less brittle interface
- Delimited text using tag and <CR>
- Issues
- Distribution of tag semantics
- Non-standard
- Character escaping issues
- E.g. <CR>, etc.
- No sense of hierarchy
- Programmer intensive parsers, handlers, etc
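The escaping problem can be shown with a hand-written parser. The slides do not give the exact wire format, so this sketch assumes `TAG=value` pairs separated by carriage returns; the "Suite 2" address line is an invented value used only to trigger the failure.

```python
def encode(fields):
    """Naive encoder: values are written as-is, with no escaping of '=' or '\\r'."""
    return "\r".join(f"{tag}={value}" for tag, value in fields.items())


def decode(message):
    """Naive decoder: split on carriage return, then on the first '='."""
    fields = {}
    for line in message.split("\r"):
        tag, _, value = line.partition("=")
        fields[tag] = value
    return fields


ok = decode(encode({"NAME": "K. Scott Morrison", "CITY": "Burnaby"}))
print(ok)

# A value that itself contains the delimiter corrupts the message:
# the second address line is parsed as a bogus tag with an empty value.
broken = decode(encode({"ADDRESS": "8999 Nelson Way\rSuite 2", "CITY": "Burnaby"}))
print(broken)
```

Every application using such a format has to invent and distribute its own escaping rules, which is part of why the parsers become programmer intensive.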
11 Formalized Tagging: Markup
- Markup is meta-data
- Adds information about text
- What it means, how to interpret, how to render, etc.
- Markup delimits
- START-----------END
- Interface is less brittle
- Markup works as a container
- Markup adds structure
- Hierarchy, etc
12 Markup Can Be Stylistic
Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod
tincidunt ut laoreet dolore magna aliquam erat
volutpat. Ut wisi enim ad minim veniam, quis
nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.
<FONT FACE="Times New Roman"> Lorem ipsum dolor
sit amet, consectetuer adipiscing elit, sed diam
nonummy nibh euismod tincidunt ut laoreet dolore
magna aliquam erat volutpat. Ut wisi enim ad <B>
minim </B> veniam, quis nostrud exerci tation <I>
ullamcorper </I> suscipit lobortis nisl ut
aliquip ex ea <U> commodo </U> consequat.
</FONT>
13 Markup Can Be Structural
Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod
tincidunt ut laoreet dolore magna aliquam erat
volutpat. Ut wisi enim ad minim veniam, quis
nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.
<P> Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod
tincidunt ut laoreet dolore magna aliquam erat
volutpat. </P>
<P> Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper
suscipit lobortis nisl ut aliquip ex ea commodo
consequat. </P>
14 Markup Can Be Semantic
Lorem ipsum dolor sit amet. consectetuer
adipiscing elit, sed diam nonummy nibh euismod
tincidunt ut laoreet dolore magna aliquam erat
volutpat. Ut wisi enim ad minim veniam, quis
nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.
<TITLE> Lorem ipsum dolor sit amet. </TITLE>
<BODY> consectetuer adipiscing elit, sed diam nonummy
nibh euismod tincidunt ut laoreet dolore magna
aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper
suscipit lobortis nisl ut aliquip ex ea commodo
consequat. </BODY>
15 Simple Markup Example 1
<Message>
  Hello, World!
</Message>
16 Simple Markup Example 2
<MessageContainer>
  <Message>
    Hello, World!
  </Message>
  <Message>
    Goodbye, World!
  </Message>
</MessageContainer>
17 Profile Using Markup
<Profile>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <City> Burnaby </City>
  <StateProvince> BC </StateProvince>
  <Country> Canada </Country>
  <ZipPostalCode> V5A 1B5 </ZipPostalCode>
  <Telephone> (604) 293-5753 </Telephone>
  <FAX> (604) 473-5807 </FAX>
18 Profile Using Markup (cont.)
  <Card>
    <Type> VISA </Type>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
  <Card>
    <Type> AMEX </Type>
    <Number> 987654321 </Number>
    <Expiry> 0401 </Expiry>
  </Card>
</Profile>
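Because the profile is now self-describing, a generic parser can walk it without a message map. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` on an abbreviated copy of the profile; the choice of Python and of this parser is an assumption, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the profile from the slides.
PROFILE_XML = """
<Profile>
  <Name>K. Scott Morrison</Name>
  <City>Burnaby</City>
  <Card><Type>VISA</Type><Number>123456789</Number><Expiry>1200</Expiry></Card>
  <Card><Type>AMEX</Type><Number>987654321</Number><Expiry>0401</Expiry></Card>
</Profile>
"""

root = ET.fromstring(PROFILE_XML)

# Fields are located by tag name, not by byte offset or column position.
name = root.findtext("Name")

# Repeated <Card> elements form a list; each card is itself a small container.
cards = [{child.tag: child.text for child in card} for card in root.findall("Card")]

print(name)
for card in cards:
    print(card["Type"], card["Number"], card["Expiry"])
```

Adding a third `<Card>` or a new optional element would not break this reader, which is the "less brittle interface" the earlier slides argue for.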
19 eXtensible Markup Language
- Extensible
- Tag set is not fixed
- Structural
- Deep, hierarchical nesting of structures
- Ordered lists (unordered with Schema)
- Can infer meaning from structure
- Well-formed document requirement
- Can check structure against a schema
- Valid (DTDs and Schema)
20 eXtensible Markup Language
- Portable
- Text-based: Unicode
- All parsers must support UTF-8 (a superset of US-ASCII) and UTF-16
- Parsers may support EBCDIC, UCS-4, ASCII, ISO 646, ISO 8859, Shift-JIS, EUC, etc.
- Human readable
- Machine understandable
- W3C standard
- Rich set of emerging tools
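The portability claim can be checked by serializing one message in two encodings and parsing both. This sketch assumes Python's standard `xml.etree.ElementTree` parser and an illustrative non-ASCII greeting that does not appear in the slides.

```python
import xml.etree.ElementTree as ET

# Illustrative non-ASCII text (an assumption, not from the slides).
TEXT = "Grüße, 世界"


def as_document(encoding):
    """Build the same message and encode it with the declared encoding."""
    doc = f'<?xml version="1.0" encoding="{encoding}"?><Message>{TEXT}</Message>'
    return doc.encode(encoding)


utf8_bytes = as_document("utf-8")
utf16_bytes = as_document("utf-16")  # Python's utf-16 codec prepends a byte-order mark

# The byte streams differ, but the parsed character data is identical.
from_utf8 = ET.fromstring(utf8_bytes).text
from_utf16 = ET.fromstring(utf16_bytes).text

print(utf8_bytes != utf16_bytes)
print(from_utf8 == from_utf16 == TEXT)
```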
21 XML Tooling: IE5
22 Isn't This Just Like HTML?
- HTML is a markup language based on SGML
- Key differences
- HTML has a fixed set of tags
- HTML mixes stylistic, structural, and semantic tags
- HTML does not support deep nesting and hierarchy
- HTML need not be well formed
23 XML Issues
- XML documents must be well formed
- Means that structure can be inferred
Well formed
<Message> Hello, World! </Message>
Not well formed
<Message> Hello, World!
Hello, World! </Message>
<Message> Hello, World! </MESSAGE>
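The four sample messages on this slide can be checked mechanically. This is a minimal sketch that treats "the parser accepts it" as the test, using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is introduced here for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<Message> Hello, World! </Message>",   # matching start and end tags
    "<Message> Hello, World!",              # end tag missing
    "Hello, World! </Message>",             # start tag missing
    "<Message> Hello, World! </MESSAGE>",   # tag names are case sensitive
]

results = [is_well_formed(sample) for sample in samples]
for sample, accepted in zip(samples, results):
    print(f"{accepted!s:5}  {sample}")
```

Only the first message is accepted. No DTD or schema is involved at this stage; the parser is checking tag pairing and nesting only.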
24 XML Issues
- Knowledge of document organization
- Distribution of document organization
- Character set issues
- Document focus
- Parser speed and complexity
25 Message Organization: DTDs
- Document Type Definition
- Used to validate documents
- Issues
- Doesn't use XML syntax, not extensible
- Writing good DTDs is hard
- Very awkward and limited language constructs, uses EBNF grammar
- No namespaces
- No inheritance, defaults, ranges, enums
26 Profile DTD Example
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE Profile [
  <!ELEMENT Profile (Name, Address, City, StateProvince, Country,
                     ZipPostalCode, Telephone, FAX, Card)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Address (#PCDATA)>
  <!-- Lines removed for clarity -->
  <!ELEMENT Card (Type, Number, Expiry)>
  <!ELEMENT Type (#PCDATA)>
  <!ELEMENT Number (#PCDATA)>
  <!ELEMENT Expiry (#PCDATA)>
]>
<Profile>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <!-- Lines removed for clarity -->
  <Card>
    <Type> VISA </Type>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
  <!-- Lines removed for clarity -->
</Profile>
27 Message Organization: Schema
- Alternative to DTDs
- XML syntax
- Much more expressive
- Has namespace support
- Has inheritance, defaults, ranges (min/max), enumerations, sequences, unordered lists
28 Transforms: XSL
Source Document
<Profile>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <City> Burnaby </City>
  <StateProvince> BC </StateProvince>
  <Country> Canada </Country>
  <ZipPostalCode> V5A 1B5 </ZipPostalCode>
  <Telephone> (604) 293-5753 </Telephone>
  <FAX> (604) 473-5807 </FAX>
  <Card>
    <Name> VISA </Name>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
  <Card>
    <Name> AMEX </Name>
    <Number> 987654321 </Number>
    <Expiry> 0401 </Expiry>
  </Card>
</Profile>
Destination Document
<VisaCard>
  <CardNumber> 123456789 </CardNumber>
  <Expiry> 1200 </Expiry>
  <Client>
    <Name> K. Scott Morrison </Name>
    <Telephone> (604) 293-5753 </Telephone>
  </Client>
</VisaCard>
XSL Engine
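To make the source-to-destination mapping concrete, here is a hand-written equivalent in Python. It is not XSL and does not use an XSL engine; it only sketches the same select, rename, and rearrange steps with the standard `xml.etree.ElementTree` module, on an abbreviated copy of the source document. The function name `to_visa_card` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the source document from the slide.
SOURCE = """<Profile>
  <Name>K. Scott Morrison</Name>
  <Telephone>(604) 293-5753</Telephone>
  <Card><Name>VISA</Name><Number>123456789</Number><Expiry>1200</Expiry></Card>
  <Card><Name>AMEX</Name><Number>987654321</Number><Expiry>0401</Expiry></Card>
</Profile>"""


def to_visa_card(profile):
    """Build the destination <VisaCard> document from a parsed <Profile>."""
    # Select the VISA card and discard the other cards.
    card = next(c for c in profile.findall("Card") if c.findtext("Name") == "VISA")

    visa = ET.Element("VisaCard")
    ET.SubElement(visa, "CardNumber").text = card.findtext("Number")  # renamed element
    ET.SubElement(visa, "Expiry").text = card.findtext("Expiry")      # copied as-is

    # Rearrange: client details move under a new <Client> wrapper.
    client = ET.SubElement(visa, "Client")
    ET.SubElement(client, "Name").text = profile.findtext("Name")
    ET.SubElement(client, "Telephone").text = profile.findtext("Telephone")
    return visa


destination = to_visa_card(ET.fromstring(SOURCE))
print(ET.tostring(destination, encoding="unicode"))
```

An XSL stylesheet expresses the same mapping declaratively, so it can be changed without recompiling the application that hosts the engine.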
29 eXtensible Stylesheet Language
- XSL = XSL Transforms (XSLT) + Formatting Objects and Properties
- Some basic XSLT functions
- Insertion of static text (like templates)
- Copy, discard, or rearrange source text
- Compute new text from source
30 XSLT Example: XML to HTML
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <html>
      <body>
        <P> Address is
          <xsl:value-of select="Profile/Address"/>
        </P>
        <P> Name is
          <xsl:value-of select="Profile/Name"/>
        </P>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
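For readers without an XSL engine at hand, the effect of this stylesheet can be approximated in plain Python. The sketch below is an assumption-laden stand-in, not XSLT: it inserts the same static text and pulls the same two values from the profile, using the standard `xml.etree.ElementTree` module and a hypothetical helper `profile_to_html`.

```python
import xml.etree.ElementTree as ET


def profile_to_html(profile_xml):
    """Produce the HTML page the stylesheet describes: one <P> per selected value."""
    profile = ET.fromstring(profile_xml)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    # Same order as the stylesheet: Address first, then Name.
    for label in ("Address", "Name"):
        paragraph = ET.SubElement(body, "P")
        paragraph.text = f"{label} is {profile.findtext(label)}"
    return ET.tostring(html, encoding="unicode")


page = profile_to_html(
    "<Profile><Name>K. Scott Morrison</Name><Address>8999 Nelson Way</Address></Profile>"
)
print(page)
```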
31 XML as Data
<Profile>
  <ID> 123456789 </ID>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <Card>
    <Name> VISA </Name>
    <Number> 123456789 </Number>
    <Expiry> 1200 </Expiry>
  </Card>
</Profile>
XML Document
XML-Db Extender
Profile Table
Database
Card Table
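The mapping from one XML document to a Profile table and a Card table can be sketched by hand. This example does not use the extender product named on the slide; it assumes an in-memory SQLite database, and the table and column names are invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<Profile>
  <ID>123456789</ID>
  <Name>K. Scott Morrison</Name>
  <Address>8999 Nelson Way</Address>
  <Card><Name>VISA</Name><Number>123456789</Number><Expiry>1200</Expiry></Card>
</Profile>"""


def store_profile(conn, xml_text):
    """Shred one <Profile> document into a profile row and one row per <Card>."""
    profile = ET.fromstring(xml_text)
    profile_id = profile.findtext("ID")
    conn.execute(
        "INSERT INTO profile VALUES (?, ?, ?)",
        (profile_id, profile.findtext("Name"), profile.findtext("Address")),
    )
    conn.executemany(
        "INSERT INTO card VALUES (?, ?, ?, ?)",
        [
            (profile_id, c.findtext("Name"), c.findtext("Number"), c.findtext("Expiry"))
            for c in profile.findall("Card")
        ],
    )


# Hypothetical schema: the nested <Card> elements become rows keyed by profile_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile (id TEXT PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("CREATE TABLE card (profile_id TEXT, name TEXT, number TEXT, expiry TEXT)")
store_profile(conn, DOC)

rows = conn.execute(
    "SELECT p.name, c.name, c.expiry FROM profile p JOIN card c ON c.profile_id = p.id"
).fetchall()
print(rows)
```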
32 XML as Data: Messaging
[Diagram: Client System and Server System (with RDBMS) exchanging a Request XML Message and a Response XML Message]
- Transport Examples
- HTTP over sockets
- MQSeries
- etc
- Message Infrastructure
- ebXML
- SOAP
- etc
- Message Formats
- OTA
- TravelFrame
- etc
33 Schema Distribution
[Diagram: an XML Message between Source System and Destination System, with three ways to distribute the DTD: Embedded, Replicated, or Centralized in a Schema Repository]
34 Applications: Interoperability
35 Applications: Interoperability
Thick PC Clients
MQSeries Integrator routing and transforming XML messages
UNIX Servers
36 Applications: eBusiness
Internet
Web/WML/XML Servers
37 Summary
- XML is here to stay
- There will be heavy vendor support for it because:
- 1. XML is standardized
- 2. XML will be the B2B message format
- It can be leveraged now in most domains