Combined XML, SGML Issues - PowerPoint PPT Presentation


1
Combined XML, SGML Issues
  • William J. "Bill" McCalpin
  • MIT, LIT, CDIA, EDP
  • AIIM 2002 - March 6, 2002

2
About MHE
  • MHE is the print2image2Internet consulting firm
  • MHE's principals have nearly 40 years of
    experience in electronic print streams, in taking
    electronic print streams to imaging systems, and
    now in taking legacy information to the Internet
  • See http://www.mhe-consulting.com

3
About the Speaker
  • William J. "Bill" McCalpin is a principal at MHE
  • Mr. McCalpin was the first - and for years the
    only - person in the world to have the MIT, LIT,
    CDIA, and EDP designations
  • Mr. McCalpin serves on the AIIM Accreditation
    Committee and AIIM Conference Committee

4
About the Speaker (cont.)
  • Mr. McCalpin is on the Xplor Board of Directors
    and is Treasurer
  • Mr. McCalpin recently completed a two-year stint
    as Xploration Editor-in-Chief
  • Mr. McCalpin is a frequent speaker at both AIIM
    and Xplor

5
What Do You Say When They Ask You, "When Are You
Going To Support XML?"
6
But The Real Question Is, "Why Should I Support
XML?"
7
Agenda
  • What is XML?
  • What do we do in e-Business?
  • When do you want to use XML?
  • The Right Way and the Wrong Way to use XML
  • The Flow of Information
  • The XML Bubble
  • The answer to when and why

8
What is XML?
9
XML And SGML
  • XML is eXtensible Markup Language
  • XML is a subset (an application profile) of SGML,
    Standard Generalized Markup Language, an ISO
    standard (ISO 8879)
  • XML is extensible because people and
    enterprises with common interests get together to
    define the tags which describe their data

10
XML and HTML
  • HTML is a tagged language, but the tags are 40 or
    50 grammatical tags like <p> or <h1>
  • XML is a tagged language, and the tags are
    (usually) created and agreed to by domains or
    vertical industry segments, e.g., <account_number>
    or <city>

11
The Document
  • A document is an organized collection of
    information in time
  • A document contains information which can be
    understood by human or machine, and has validity
    at some period in time
  • The information in a document can be organized in
    many ways - as text, bitmaps, print streams,
    tagged languages, etc.

12
The New Document
  • Per this definition, the document:
    • does not depend on which organization of the
      information is used (so long as author and
      recipient agree)
    • does not depend on the medium (paper, film,
      optical, magnetic or even parchment are all fine)
    • does not have to have presentation information,
      because the recipient may be a machine

13
Three Parts of an XML Document
Tagged Data (in XML)
Tag Definitions (in DTD or Schema)
Presentation (in XSL or CSS)
14
The XML Document
  • Data - data values bounded by XML tags
  • Presentation
    • CSS - Cascading Style Sheets, as used for HTML
    • XSL - formatting information expressed in XML
  • Tag Definitions
    • DTD - Document Type Definition - the older SGML
      mechanism
    • Schema - definitions written in XML

15
Data In the XML Document
  • Data is the purpose of an XML document
  • Each piece of data is specifically identified by
    a tag
  • Data is organized because the tags match patterns
    in the DTD or Schema
  • An example of data in XML

16
Data Example in XML
    <AUTHOR>
      <NAME>William J. "Bill" McCalpin, EDPP, CDIA, MIT, LIT</NAME>
      <JOBTITLE>Principal</JOBTITLE>
      <AFFILIATION>MHE</AFFILIATION>
      <ADDRESS>
        <STREET>1400 Cheyenne Dr.</STREET>
        <CITY>Richardson</CITY>
        <STATE>Texas</STATE>
        <ZIPCODE>75080</ZIPCODE>
        <EMAIL>mccalpin_at_mhe-consulting.com</EMAIL>
      </ADDRESS>
    </AUTHOR>
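
Because every value in the record is named by its tag, a program can ask for a field by name. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the `read_author` helper and the choice of fields are assumptions made for illustration, not part of the original presentation.

```python
import xml.etree.ElementTree as ET

# The author record from the slide, held as a string.
AUTHOR_XML = """
<AUTHOR>
  <NAME>William J. "Bill" McCalpin, EDPP, CDIA, MIT, LIT</NAME>
  <JOBTITLE>Principal</JOBTITLE>
  <AFFILIATION>MHE</AFFILIATION>
  <ADDRESS>
    <STREET>1400 Cheyenne Dr.</STREET>
    <CITY>Richardson</CITY>
    <STATE>Texas</STATE>
    <ZIPCODE>75080</ZIPCODE>
    <EMAIL>mccalpin_at_mhe-consulting.com</EMAIL>
  </ADDRESS>
</AUTHOR>
"""


def read_author(xml_text):
    """Hypothetical helper: pull a few named fields out of the record."""
    root = ET.fromstring(xml_text.strip())
    return {
        "job_title": root.findtext("JOBTITLE"),
        # Nested elements are addressed by a simple path.
        "city": root.findtext("ADDRESS/CITY"),
        "zipcode": root.findtext("ADDRESS/ZIPCODE"),
    }


author = read_author(AUTHOR_XML)
print(author["city"], author["zipcode"])
```

The code never depends on where a value sits on a page, only on the tag that names it.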

17
Presentation in XML
  • Tags in XML don't have natural formatting (unlike
    HTML), so if presentation is needed, it must be
    explicitly defined
  • CSS can be used for HTML and XML
  • XSL can be parsed by an XML parser, and it can be
    used by XML and XSLT
  • XSL example

18
Presentation Example
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
      <xsl:template match="author">
        <TABLE WIDTH="100" BORDER="1" CELLSPACING="0...
          <TR>
            <TD COLSPAN="2">
              <TABLE WIDTH="100" BORDER="1" CELLSPACING="0...
                <FONT COLOR="000000"><xsl:value-of select="name"/></FONT>
            </TD>
            ...
      </xsl:template>
    </xsl:stylesheet>

19
Why Two Style Sheet Languages?
20
DTD/Schema in XML
  • The DTD is the old (SGML) way of defining not
    only what tags are valid, but their relative
    order, number, mandatory/optional attributes, and
    so on
  • The Schema is a total rewrite - written in XML
    itself - which defines all of the above as well
    as possible legal values for a tag (e.g.,
    integer, date, days of the week, etc.)

21
Schema Example
    <?xml version="1.0"?>
    <Schema name="sample_schema" ...>
      ...
      <!-- Element Types -->
      <!-- data -->
      <ElementType name="author">
        <element type="name" minOccurs="1" maxOccurs="1"/>
      </ElementType>
      ...
    </Schema>
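
To make the `minOccurs`/`maxOccurs` idea concrete, here is a rough sketch of the kind of check a schema-aware parser performs. It is not a real schema validator: the `RULES` table and the `check_occurs` function are hypothetical stand-ins that hard-code the one rule from the slide (an `author` must contain exactly one `name`).

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table mirroring the slide:
# inside <author>, <name> must occur at least once and at most once.
RULES = {"author": {"name": (1, 1)}}


def check_occurs(xml_text, rules=RULES):
    """Return a list of occurrence violations for the root element."""
    root = ET.fromstring(xml_text)
    errors = []
    for child_tag, (min_occurs, max_occurs) in rules.get(root.tag, {}).items():
        count = len(root.findall(child_tag))
        if count < min_occurs or count > max_occurs:
            errors.append(
                f"<{child_tag}> occurs {count} times, "
                f"expected {min_occurs}..{max_occurs}"
            )
    return errors


ok = check_occurs("<author><name>Bill McCalpin</name></author>")
missing = check_occurs("<author></author>")
doubled = check_occurs("<author><name>A</name><name>B</name></author>")
print(ok, missing, doubled)
```

All three inputs are syntactically correct XML; only the first satisfies the rule, which is the extra guarantee a DTD or Schema adds.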

22
What do we do in e-Business?
23
What is e-Business?
  • Of course, e-Business is really just doing
    business using 100% electronic methods such as
    the Internet
  • In e-Business, we do transactions or exchange
    information using electronic media rather than
    the usual paper media
  • e-Business can be broken down into two parts:
    • B2C
    • B2B

24
B2C
  • B2C is Business to Consumer
  • Your business generates the information, and a
    consumer receives it
  • The consumer is normally interested only in the
    data and its presentation
  • Thus, in this scenario, the consumer needs only
    an XML document and CSS/XSL - which is more or
    less the same as HTML!

25
Important Fact 1
  • When you are engaged in B2C, and the recipient is
    a consumer with a thin client, then HTML is
    usually sufficient
  • Supplying the data in XML is usually a waste of
    time, because the recipient gets no additional
    value from the XML over HTML
  • XHTML is just HTML which is XML compliant

26
B2B
  • B2B is Business to Business
  • Your business generates the information, and
    another business receives it
  • Frequently, the recipient is not a person, but a
    software process in the business
  • Thus, in this scenario, the recipient often needs
    only the XML data and the reference to the DTD or
    Schema - no presentation may be needed!

27
Important Fact 2
  • When you are engaged in B2B, and the recipient is
    a software process, then XML is often the most
    appropriate format
  • Binary data formats may be smaller, but will
    require more work and more maintenance
  • Don't send presentation information unless the
    recipient actually wants your presentation
    information!
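
The size-versus-maintenance trade-off can be sketched in a few lines. The record below (an account number and a balance in cents) and its binary layout are invented for illustration; the point is only that the binary reader must know the exact layout in advance, while the XML reader asks for values by name.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: an account number and a balance in cents.
account_no, balance_cents = 12345, 9950

# Binary: compact, but sender and receiver must agree on the exact
# layout ("<IQ" = little-endian 4-byte unsigned int + 8-byte unsigned int).
binary = struct.pack("<IQ", account_no, balance_cents)

# XML: larger, but each value is named by its tag.
xml_text = (
    "<account>"
    f"<account_no>{account_no}</account_no>"
    f"<balance_cents>{balance_cents}</balance_cents>"
    "</account>"
)

decoded_binary = struct.unpack("<IQ", binary)
root = ET.fromstring(xml_text)
decoded_xml = (
    int(root.findtext("account_no")),
    int(root.findtext("balance_cents")),
)

print(len(binary), len(xml_text))
```

If a field is later added to the record, the binary layout string has to change on both sides at once, whereas an XML reader that looks up `account_no` by name keeps working.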

28
When do you want to use XML?
29
When Do I Use XML?
  • As we have seen, XML is best suited for the
    preservation of the author's content
  • And (X)HTML is best suited for presentation of
    information to an end user
  • And this leads us to...

30
Important Fact 3
  • In today's market:
    • XML is better utilized when communicating with
      a thick client - that is, most B2B in which a
      software process is the recipient
    • (X)HTML is better utilized when communicating
      with a thin client - that is, most B2C in which
      an Internet browser is the recipient
  • And when is this not true?

31
Exceptions to Fact 3
  • XML can be used in B2C when the browser is used
    with so much Java and other local applications
    that the overall process resembles a thick client
  • (X)HTML can be used in B2B if the recipient is
    just a human being rather than a software
    process, e.g., when information is transmitted
    only to be viewed

32
The Right Way And The Wrong Way To Use XML
33
CML: Chemical Markup Language
  • One of the early vertical implementations of
    XML
  • The official site is http://www.xml-cml.org/
  • A better site is http://www.ch.ic.ac.uk/chimeral/
  • CML uses the trio of tagged data, Schema, and XSL

34
A CML XML Document
    <molecule title="caffeine" id="mol_caffeine">
      <formula>C8 H10 N4 O2</formula>
      <string title="CAS">58-08-2</string>
      ...
    </molecule>

35
The CML Schema
    <?xml version="1.0"?>
    <Schema name="cml_dev_karne"
            xmlns="urn:schemas-microsoft-com:xml-data"
            xmlns:dt="urn:schemas-microsoft-com:datatypes">
      ...
      <!-- Element Types -->
      <!-- data -->
      <ElementType name="molecule" content="eltOnly"
                   model="open" order="many">
        <element type="formula" minOccurs="0" maxOccurs="*"/>
        ...

36
A CML Stylesheet
    <xsl:template match="molecule">
      <TABLE WIDTH="100" BORDER="1" CELLSPACING="0"
             CELLPADDING="3" BORDERCOLOR="CCCCFF"
             BGCOLOR="EEEEFF">
        <TR>
          <TD COLSPAN="2">
            <FONT COLOR="0000AA">Formula
            <FONT COLOR="000000"><xsl:value-of select="formula"/></FONT></TD><TD>
            ...

37
The CML Document
  • Note that each data item is tagged
  • Note that each tag matches the standard Schema
  • Note that the data is used to create a complex
    image in the browser - but not the only possible
    image!

38
A Print to XML/HTML Conversion
  • Print stream does not contain any metadata, only
    data and presentation information
  • Tags cannot be meaningful unless they are
    reverse-engineered
  • The result might be only the tagged data and the
    stylesheet
  • Too often, the XML looks like

39
Bad XML Example
    /* text positioning information */
    .ps0 {position:absolute;top:533px;left:29px;width:40px;}
    .ps1 {position:absolute;top:533px;left:317px;width:38px;}
    .ps2 {position:absolute;top:533px;left:454px;width:90px;}
    ...
    /* font properties information */
    .ft1 {font-weight:bold;font-size:22px;}
    .ft2 {font-size:17px;}
    .ft3 {font-size:11px;}
    <!-- text starts here -->
    <SPAN CLASS="ps0"><NOBR>Account Number</NOBR></SPAN>
    <SPAN CLASS="ps1"><NOBR>12345</NOBR></SPAN>
    <SPAN CLASS="ps2"><NOBR>Name</NOBR></SPAN>
    ...

40
An Image to XML Example
  • Most information may not be tagged
    <invoice>
      <account_no>12345</account_no>
      <name>Bill McCalpin</name>
      <data>70 02 02 02 02 FE A7 47 47 48 03 F9 A7 42 27 4A 74.</data>
    </invoice>
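
The difference between print-derived markup and tagged data shows up as soon as a program tries to read the account number. The sketch below is illustrative only: the `<page>` wrapper element is an assumption added so the span fragment parses as a single document, and the "value follows its label" rule is exactly the kind of guess that reverse-engineering forces.

```python
import xml.etree.ElementTree as ET

# Presentation-only markup in the style of the print conversion,
# wrapped in a hypothetical <page> root so that it parses.
PRINT_DERIVED = """<page>
<SPAN CLASS="ps0"><NOBR>Account Number</NOBR></SPAN>
<SPAN CLASS="ps1"><NOBR>12345</NOBR></SPAN>
<SPAN CLASS="ps2"><NOBR>Name</NOBR></SPAN>
</page>"""

# Tagged data in the style of the invoice example.
TAGGED = (
    "<invoice>"
    "<account_no>12345</account_no>"
    "<name>Bill McCalpin</name>"
    "</invoice>"
)

# Tagged data: ask for the value by name.
account_from_tags = ET.fromstring(TAGGED).findtext("account_no")

# Print-derived data: all we get is strings in page order,
# so the meaning has to be guessed from adjacency.
texts = [nobr.text for nobr in ET.fromstring(PRINT_DERIVED).iter("NOBR")]
label_index = texts.index("Account Number")
account_from_layout = texts[label_index + 1]  # assumes value follows label

print(account_from_tags, account_from_layout)
```

Both routes find the number here, but the second breaks as soon as the page layout changes, because nothing in the markup says which string is the account number.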

41
The Flow of Information
42
The Flow of Information
  • E-Business is about the flow of information
    between parties as well as within the enterprise
  • Traditionally, as information moves through the
    business process, we lose as much information as
    we add
  • Look at how we used to treat information

43
As Information Flow Used to Be
44
As Information Flow Used To Be
[Diagram. Labels: Data; Data awareness (metadata); Presentation information; Composer; Toner on paper; Scan; X010101 (bits); Archive; Zap!]
45
As Information Flow Is Today
46
As Information Flow Is Today
[Diagram. Labels: Data; Data awareness (metadata); Presentation information; Composer; Web page, emails, etc.; Transform; Text and graphics; PDF; Zap!]
47
As Information Flow Should Be
48
As Information Flow Should Be
[Diagram. Labels: Data; Data awareness (metadata); Complete XML documents; Presentation information; email; WAP; Web page; archive; paper; User]
49
Or, As In The XML Bubble...
[Diagram. Labels: Data metadata; Process; Add presentation; Web page; email; Cell phones; B2B applications; Archive]
50
Important Fact 4
  • Use XML to delay the loss of important
    information
  • Don't throw away information until you commit the
    document to a final format which can't support it
  • In other words, keep the information in XML as
    long as possible
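
One way to picture "keep it in XML as long as possible" is a single XML record that is turned into a presentation format only at the last step, once per channel. The sketch below is an illustration under assumptions: the invoice record and the `to_html` and `to_text` helpers are hypothetical, and a real system would more likely use XSL stylesheets than hand-written Python.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical record kept in XML until the moment of delivery.
INVOICE_XML = (
    "<invoice>"
    "<account_no>12345</account_no>"
    "<name>Bill McCalpin</name>"
    "</invoice>"
)


def to_html(root):
    """Render each child element as a table row for a Web page."""
    rows = "".join(
        f"<tr><td>{escape(child.tag)}</td>"
        f"<td>{escape(child.text or '')}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"


def to_text(root):
    """Render each child element as a 'tag: value' line for plain email."""
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)


root = ET.fromstring(INVOICE_XML)
html_page = to_html(root)
email_body = to_text(root)
print(email_body)
```

Neither output can be turned back into the tagged record reliably, which is why the conversion belongs at the end of the flow rather than the beginning.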

51
The XML Bubble
52
[Diagram. Labels: Compliance; Archive; New Sales; Reprints; Policy Print; Reports; Notices; CRM; 1:1 Mark.; Campaign Manage.; Billing; HR; Pol. Proc.; EDI]
53
[Diagram. Labels: Compliance; Archive; New Sales; Reprints; Policy Print; Reports; Notices; CRM; 1:1 Mark.; Campaign Manage.; Billing; HR; Pol. Proc.; EDI; XML; EBPP]
54-58
[Diagram, repeated across five build slides. Labels: Compliance; Archive; New Sales; Reprints; Policy Print; Reports; Notices; CRM; 1:1 Mark.; Campaign Manage.; XML Bubble; Billing; HR; Pol. Proc.; EDI; EBPP]
59
Today's Billing Process XML
[Diagram. Labels: Billing Extract; Post Process; Print/Format; Data Base; XML App.]
60
[Diagram. Labels: XML Applications with business rules; Driver (four instances); Email]
61
Remember the Question, "Why Should I Support
XML?"
62
Why Should I Support XML?
  • I should support XML in B2B, unless the recipient
    wants only to view my presentation
  • I should support (X)HTML in B2C, unless the
    recipient has a thick client which can utilize
    the XML (cf. Quicken and OFX)

63
How Should I Use XML?
  • Once information is in XML, I should keep it
    there as long as possible
  • I should use industry-accepted DTDs and Schemas
  • I shouldn't even think of well-formed XML
    (syntactically correct but no DTD/Schema) as real
    XML, to avoid confusion
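
The distinction between well-formed and valid can be seen with a parser that checks syntax only. The sketch below uses Python's standard `xml.etree.ElementTree`, which does not validate against a DTD or Schema; the `is_well_formed` helper and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML (syntax check only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Syntactically correct, with tags an industry might agree on.
agreed = is_well_formed("<invoice><account_no>12345</account_no></invoice>")

# Also syntactically correct, but the tags mean nothing to anyone else.
invented = is_well_formed("<thing><stuff>12345</stuff></thing>")

# Not even well-formed: the closing tag is missing its final '>'.
broken = is_well_formed("<invoice><account_no>12345</account_no></invoice")

print(agreed, invented, broken)
```

The first two documents pass equally, so a syntax check alone cannot tell a recipient whether the tags follow any agreed DTD or Schema.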

64
A Final Note
  • The World Wide Web Consortium (www.w3c.org) is the
    standards body for the generic protocols of XML,
    such as XML syntax itself, XSL, RDF, etc.
  • Most domain or vertically centric XML
    definitions are supported by the verticals
    themselves, e.g., CML, GEML (Gene Expression
    Markup Language), etc.

65
A Final Note, Part Deux
  • At www.xml.org, there are nearly 100 Schema/DTDs
    listed from 31 different industries, from AIML
    (Astronomical Instrument Markup Language) to
    RecipeML (Recipe Markup Language) - yes, XML for
    the kitchen.
  • Also see Robin Cover's excellent work at
    xml.coverpages.org/sgml-xml.html

66
Contact Information
  • William J. "Bill" McCalpin
  • MIT, LIT, CDIA, EDP
  • Principal
  • MHE
  • 1400 Cheyenne Dr.
  • Richardson, Texas 75080-3921 USA
  • (972) 231-3660 (v) (972) 690-4521 (f)
  • mccalpin_at_mhe-consulting.com
  • www.mhe-consulting.com