

1
The Document In The 21st Century
  • William J. "Bill" McCalpin
  • MIT, LIT, CDIA, EDP
  • Principal, MHE

2
Who MHE Is...
  • MHE is the consulting firm which specializes in
    the transition of information both within and
    between the electronic printing, imaging, and
    Internet environments.

3
Introduction
  • The Hegelian Dialectic

4
Thesis, Antithesis, Synthesis
  • In the philosophy of Hegel, these words show the
    inevitable transition of thought, by
    contradiction and reconciliation, from
    an initial conviction to its opposite and then to
    a new, higher conception that involves but
    transcends both of them

5
The Hegelian Dialectic
  • Thesis: Most businesses have well-established,
    productive legacy systems
  • Antithesis: XML is springing forth everywhere and
    will replace most legacy systems
  • Synthesis: XML will be integrated with legacy
    systems - enhancing some processes, changing many
    others, and eliminating some altogether
  • In short, XML will change - not destroy - what
    you do

6
The Document In The 21st Century
7
What Is A Document?
  • The American Heritage Dictionary defines a
    document as information in writing placed on a
    medium such as paper, often used as a record.
  • Documents have been placed on clay tablets, gold
    leaf, animal skins, all types of paper,
    microfilm, optical storage, and so on

8
Information And Presentation
  • In every case, the document represents a
    fundamental union of information and presentation
  • But presentation presumes that the primary
    audience for the document is a human being
  • With the coming of the Internet, this is no
    longer the case

9
The Curse Of Presentation
  • Composition products require that you specify a
    printer, even before you know where the document
    will print

10
Why Are Print, Image, And Presentation Formats Incompatible?
11
Printing And Imaging Formats
  • Many printing formats: AFP, Metacode, DJDE, XES
    (UDK), PostScript, PCL, etc.
  • All formats use external resources like fonts,
    forms, graphics, etc., although sometimes
    inconsistently
  • Most are escape-sequence based, some are formal
    data architectures, and some are almost
    programming languages

12
Printing And Imaging Formats
  • Many imaging formats - while most use CCITT Group
    4 for image compression, most also have
    proprietary data wrappers
  • Later systems adopted text-based formats such as
    PDF, although storing other print streams is not
    unknown
  • Systems which store text-based formats must
    wrestle with resource issues

13
Different Print Formats
  • Why do printers have different formats? Because
    of physical constraints imposed by the hardware:
  • Resources reduce the amount of data sent through
    the pipeline to the printer
  • Pages must be imaged in a fraction of a second
  • Complex graphics can be developed on the printer,
    but this needs a special language

14
Different Imaging Formats
  • Why do imaging systems have different formats?
    Because of physical constraints imposed by the
    hardware:
  • Mass storage was expensive
  • Indexing schemes were too close to the
    application
  • Text was sometimes avoided because of resource
    issues
  • Interoperability with other products was an issue

15
Result
  • In each case, data architecture decisions were
    made in order to enhance some aspect of
    legibility of the stored objects.
  • If there were no requirement to present the
    information (to a human reader), then the
    requirement for custom data formats for each
    vendor would probably disappear!

16
Information Exchanges
  • B2C - business to consumer
  • B2B - business to business
  • B2B2C - business to business to consumer
  • 2C requires presentation information
  • B2B requires no presentation information, if the
    recipient is a process, not a person

17
Why B2B?
  • NYSE (New York Stock Exchange)
  • Formerly, 100 million trades in a day was
    considered very heavy
  • Now 1 billion trades a day is considered very
    heavy
  • The difference is automation; the same multiplier
    applies to B2B
  • The #1 effect of XML is the separation of
    information from presentation

18
The Nature Of XML
19
XML And SGML
  • XML is eXtensible Markup Language
  • XML is a subset (application profile) of SGML,
    Standard Generalized Markup Language, an ISO
    standard (ISO 8879)
  • XML is extensible because people and
    enterprises with common interests get together to
    define the tags which describe their data

20
XML And Print Formats
  • In most print formats, something like an account
    number would be:
  • AMB 200 AMI 300 SCFL 01 STO 0, 90 TRN 12345-67890
  • In XML, the same information is:
  • <account_number>12345-67890</account_number>
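
The contrast can be sketched in a few lines of Python using only the standard library. The print-stream line is treated as plain text here, and the rule that the account number is whatever follows TRN is an assumption made for this illustration; the XML lookup needs only the tag name.

```python
import re
import xml.etree.ElementTree as ET

# Print-stream version: the value carries no name, so the reader must know
# (assumption for this sketch) that the account number is the text after "TRN".
print_stream = "AMB 200 AMI 300 SCFL 01 STO 0, 90 TRN 12345-67890"
match = re.search(r"TRN\s+(\S+)", print_stream)
account_from_print = match.group(1) if match else None

# XML version: the value is labeled by what it is, so it is found by name.
xml_text = "<account_number>12345-67890</account_number>"
account_from_xml = ET.fromstring(xml_text).text

print(account_from_print, account_from_xml)
```

Both lookups return the same string, but only the second one survives a change in page layout.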

21
XML and Image Formats
  • Raster-based image formats contain only bitmaps
  • To read the text data within the bitmap requires
    an OCR/ICR process, which can fail
  • Most usable data is extracted from the document
    and placed in the index

22
XML And Electronic Formats
  • The nature of all electronic presentation formats
    is to be focused on the presentation of the
    information.
  • The nature of XML is focused on the author's
    content, that is, information is described as
    what it is, not how it looks.

23
Separating Information From Presentation
  • XML enables the total separation of information
    from presentation
  • Thus, some XML objects have only tagged
    information, while others have content and
    presentation information
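
As a rough illustration of this separation, the sketch below keeps one tagged record and produces two presentations from it. In practice a stylesheet language such as XSL would play this role; the plain Python functions are stand-ins, and the element names (bill, customer, amount_due) are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Information only: tagged content with no layout hints.
# The <bill> element and its children are hypothetical names for this sketch.
bill = ET.fromstring(
    "<bill><customer>A. Smith</customer><amount_due>42.50</amount_due></bill>"
)

def as_text(doc):
    """Presentation 1: a plain-text line, e.g. for a print stream."""
    return f"{doc.findtext('customer')} owes {doc.findtext('amount_due')}"

def as_html(doc):
    """Presentation 2: an HTML fragment, e.g. for a browser."""
    return (
        f"<p><b>{escape(doc.findtext('customer'))}</b>: "
        f"{escape(doc.findtext('amount_due'))}</p>"
    )

print(as_text(bill))
print(as_html(bill))
```

The record itself never changes; only the function applied to it does.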

[Diagram: boxes labeled XML, XML, and XSL]

24
How To Relate XML to Everyman
  • You might think that XML is too esoteric for most
    people to understand
  • But XML is based on a basic human need:
    exchanging information
  • XML couples the communication skills we have used
    over the last several thousand years to modern,
    Internet technology
  • So how can you understand it?

25
Communication Difficulty #1
  • In order for any communication to take place,
    both parties must share the same fundamental
    mechanism which carries information
  • For example, in writing, if a boy and girl don't
    even share the same writing scheme, they can't
    possibly understand...

26
Chinese Characters vs Latin Alphabet
  • I Love You

27
Underlying Structure of XML
  • Text characters
  • Tags are delimited by < and >, e.g., <xml>
  • Ending tags have /, e.g., </xml>
  • Parameters are indicated by double quotes, e.g.,
    <PAPER track="Application">
  • XML is a series of tags and data, e.g.,
    <STATE>Texas</STATE>
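
A short sketch of how a parser sees these pieces, using Python's standard library. The closing </PAPER> tag is added here because a parser needs a complete element; it is not part of the slide's example.

```python
import xml.etree.ElementTree as ET

# A tag pair with data between the start and end tags.
state = ET.fromstring("<STATE>Texas</STATE>")
print(state.tag, state.text)

# A tag with a parameter (attribute) in double quotes.
paper = ET.fromstring('<PAPER track="Application"></PAPER>')
print(paper.tag, paper.attrib["track"])
```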

28
Communication Difficulty #2
  • Once both parties agree to the fundamental
    syntax, then both parties must next agree to the
    words to be used
  • In the case of XML, how do both parties know that
    <STATE> means a political subdivision and not one
    of gas, liquid, or solid?
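
One common way to keep two meanings of the same tag apart is XML Namespaces. The sketch below is only an illustration of that idea, not necessarily the answer this deck has in mind, and both namespace URIs are made up.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are invented for this sketch.
doc = ET.fromstring(
    '<record xmlns:geo="http://example.com/geography" '
    'xmlns:phys="http://example.com/physics">'
    "<geo:STATE>Texas</geo:STATE>"
    "<phys:STATE>liquid</phys:STATE>"
    "</record>"
)

ns = {
    "geo": "http://example.com/geography",
    "phys": "http://example.com/physics",
}

# The same local name, STATE, resolves to two different vocabularies.
political = doc.findtext("geo:STATE", namespaces=ns)
matter = doc.findtext("phys:STATE", namespaces=ns)
print(political, matter)
```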

29
A Date Gone Bad
  • One evening in the hotel lobby bar, two young
    Italian men spend a while talking to an
    attractive Venezuelan girl...and her aunt
  • They spoke Italian and she spoke Spanish, but
    they communicated passably

30
A Date Still Going Bad
  • However, the aunt wanted to go up to her room
    with her niece
  • The Italians wanted to take the young lady out
    dancing...
  • So they asked her

31
Oops
  • What the boys said:
  • Vuoi andare con noi sta sera?
  • What the young lady needed to hear:
  • Quisieras ir con nosotros esta tarde?

32
Miscommunication
  • Even though Italian and Spanish use the same
    sounds, the same grammar, and have a common
    ancestry in Latin, some words are different
  • Unfortunately, the most common words in both
    languages are likely to be the most different

33
The Cost Of Data Differences
  • "NASA lost a $125 million Mars orbiter because
    one engineering team used metric units while
    another used English units for a key spacecraft
    operation..." - CNN, 9/30/99

34
XML Words
  • HTML has a certain number of fixed tags -
    everyone knows what they are, but they can't be
    augmented
  • In XML, everyone can make up their own tags to
    suit their needs - but how do we avoid a Tower of
    CyberBabel?

35
Communication Difficulty #3
  • Even when you agree to common tags, you still
    need to agree to a common understanding
  • In XML, the Schema (now replacing the DTD)
    defines what tags are allowed to describe a
    particular collection of data
  • For example, in the field of human relations,
    what is a date?

36
One DTD For A Date
  • A woman thinks:
  • Invitation - formal
  • Dress-up - nicely
  • Eat out - dinner with wine at nice restaurant
  • Entertainment - see a movie
  • Private moment - good night kiss
  • <!DOCTYPE Date [
  • <!ELEMENT Date (Invitation, Dress, Meal,
    Entertainment, Intimacy)>
  • <!ELEMENT Invitation (#PCDATA)>
  • <!ELEMENT Dress (#PCDATA)>
  • <!ELEMENT Meal (#PCDATA)>
  • <!ELEMENT Entertainment (#PCDATA)>
  • <!ELEMENT Intimacy (#PCDATA)>
  • ]>

37
A Woman's View Of A Date
  • <Date>
  • <Invitation>Telephone call</Invitation>
  • <Dress>Long dress</Dress>
  • <Meal>4-star restaurant</Meal>
  • <Entertainment>the theatre</Entertainment>
  • <Intimacy>A passionate, romantic kiss</Intimacy>
  • </Date>

38
Another DTD For A Date
  • A man thinks:
  • Eat out - six-pack of beer
  • Private moment - necking
  • <!DOCTYPE Date [
  • <!ELEMENT Date (Meal, Intimacy)>
  • <!ELEMENT Meal (#PCDATA)>
  • <!ELEMENT Intimacy (#PCDATA)>
  • ]>

39
A Man's View Of A Date
  • <Date>
  • <Meal>six-pack of beer</Meal>
  • <Intimacy>necking</Intimacy>
  • </Date>
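
The two content models can be compared mechanically. The following is a minimal sketch, not a real DTD validator: it only checks that the root element's children appear in the order a DTD lists them, and it uses the capitalized element names from the DTD declarations. The function name matches_model is hypothetical.

```python
import xml.etree.ElementTree as ET

# Content models copied from the two DTDs: a required sequence of child elements.
WOMAN_MODEL = ["Invitation", "Dress", "Meal", "Entertainment", "Intimacy"]
MAN_MODEL = ["Meal", "Intimacy"]

def matches_model(xml_text, root_name, model):
    """Return True if the root is root_name and its children match model exactly."""
    root = ET.fromstring(xml_text)
    return root.tag == root_name and [child.tag for child in root] == model

mans_date = (
    "<Date><Meal>six-pack of beer</Meal><Intimacy>necking</Intimacy></Date>"
)

fits_his_dtd = matches_model(mans_date, "Date", MAN_MODEL)
fits_her_dtd = matches_model(mans_date, "Date", WOMAN_MODEL)
print(fits_his_dtd, fits_her_dtd)
```

The same document is acceptable under one definition and rejected under the other, which is the point of agreeing on a shared DTD or Schema before exchanging data.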

40
When Men And Women Agree
  • <Date>
  • <Invitation>Telephone call</Invitation>
  • <Dress>Long dress</Dress>
  • <Meal>4-star restaurant</Meal>
  • <Entertainment>the theatre</Entertainment>
  • <Intimacy>A passionate, romantic kiss</Intimacy>
  • </Date>
  • <Date>
  • <Invitation>Honking</Invitation>
  • <Dress>Not the shirt he changed the oil
    in</Dress>
  • <Meal>food and beer</Meal>
  • <Entertainment>rent a video</Entertainment>
  • <Intimacy>A passionate, romantic kiss while
    necking</Intimacy>
  • </Date>

41
The Four Stages Of XML Evolution
42
The Evolution Of Technology
  • Creation of basic technology
  • Growth of technical tools
  • Conversion of technology into business
    applications - the penetration into verticals
  • Reduction to commodity

43
1 - Creation Of The Basic Technology Of XML
44
Creation Of Basic Technology
  • In 1998, the World Wide Web Consortium declared
    XML to be a recommendation, that is, a
    world-wide standard
  • This phase began in 1990 with the creation of the
    Web and browsers, and is now substantially
    complete

45
2 - The Growth Of Technical Tools
46
Growth Of Technical Tools
  • Once the underlying technology has been created,
    tools and utilities are built to use this
    technology
  • These tools are often somewhat primitive and are
    not focused on the business problem
  • This phase has been going furiously since 1998

47
The World Wide Web Consortium and XML
48
World Wide Web Consortium
  • The World Wide Web Consortium was created in
    October 1994 to develop common protocols that
    promote the Web's evolution and ensure its
    interoperability
  • The W3C has more than 500 Member organizations
    from around the world
  • The W3C has many roles

49
The Roles of the W3C
  • Standards Body (XML and others)
  • Software and Services
  • Working Groups
  • Initiatives
  • Activities with other standards bodies

50
W3C and Standards
  • XML
  • XSL
  • CSS1, CSS2
  • DOM
  • HTML
  • MathML
  • PICS
  • PNG
  • RDF
  • SMIL
  • SVG
  • XHTML
  • XPath, XPointer, XML Base, XLink
  • XML Schema

51
Standards
  • XML (eXtensible Markup Language) is the universal
    format for structured documents and data on the
    Web. The base specifications are XML 1.0 (Feb '98)
    and Namespaces (Jan '99).

52
Standards (Cont.)
  • XSL (eXtensible Stylesheet Language)
  • XSL is a language (in XML) for expressing
    stylesheets. It consists of two parts:
  • XSL Transformations (XSLT), a language for
    transforming XML documents
  • An XML vocabulary for specifying formatting
    semantics (XSL Formatting Objects)

53
Standards (Cont.)
  • CSS (Cascading Style Sheets) CSS1 and CSS2
    describe how documents are presented on screens,
    in print, or perhaps how they are pronounced
  • Authors and readers can influence the
    presentation of documents without sacrificing
    device-independence or adding new HTML tags

54
Standards (Cont.)
  • CSS3 is now a Working Draft
  • The main purpose of CSS3 is to modularize the
    specification, so that dozens of changes don't
    have to be "shove(d) ... into a single monolithic
    specification"
  • Devices which are constrained (such as an aural
    browser) can choose to support only certain
    modules instead of all of CSS.

55
Why Two Style Sheet Languages?
56
Standards (Cont.)
  • DOM (Document Object Model)
  • A standard API to the document structure that aims
    to make it easy for programmers to access
    components of a document and delete, add, or edit
    their content, attributes, and style.
  • HTML (HyperText Markup Language)
  • The current language of the Internet, which is
    being redefined as XHTML 1.0
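
The access, add, edit, and delete operations described for the DOM can be sketched with Python's built-in minidom module, which implements a subset of the W3C DOM API. The bill document and its element names are hypothetical.

```python
from xml.dom.minidom import parseString

# A small hypothetical document to operate on.
dom = parseString(
    "<bill><customer>A. Smith</customer><amount>42.50</amount></bill>"
)
root = dom.documentElement

# Access: read a component of the document.
customer = root.getElementsByTagName("customer")[0]
original_name = customer.firstChild.data

# Edit: change content and set an attribute.
customer.firstChild.data = "B. Jones"
root.setAttribute("currency", "USD")

# Add: create a new element with text and append it.
note = dom.createElement("note")
note.appendChild(dom.createTextNode("paid"))
root.appendChild(note)

# Delete: remove an existing element.
amount = root.getElementsByTagName("amount")[0]
root.removeChild(amount)

result = root.toxml()
print(original_name)
print(result)
```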

57
Standards (Cont.)
  • MathML (Mathematical Markup Language)
  • Provides a much needed foundation for the
    inclusion of mathematical expressions in Web
    pages.
  • PICS (Platform for Internet Content Selection)
  • The PICS specification enables labels (metadata)
    to be associated with Internet content. It was
    originally designed to help parents and teachers
    control what children access on the Internet.

58
Standards (Cont.)
  • PNG (Portable Network Graphics)
  • A patent-free replacement for GIF and many common
    uses of TIFF
  • RDF (Resource Description Framework)
  • Provides a lightweight metadata system to support
    the exchange of knowledge on the Web.

59
Standards (Cont.)
  • SMIL (Synchronized Multimedia Integration
    Language)
  • For television-like multimedia on the Web
  • SVG (Scalable Vector Graphics)
  • SVG is a language for describing two-dimensional
    graphics in XML

60
Standards (Cont.)
  • XHTML (eXtensible HyperText Markup Language)
  • What is the difference between XHTML 1.0, XHTML
    Basic, and XHTML 1.1?
  • XHTML 1.0 - HTML 4.01 reformulated in XML
  • XHTML Basic - subset for mobile apps
  • XHTML 1.1 - modularized tags to help support
    other applications

61
Standards (Cont.)
  • XPath, XPointer, XML Base, XLink
  • Define linking, pointers, base URIs, etc.
  • XML Schema
  • offers facilities for describing the structure
    and constraining the contents of XML 1.0
    documents
  • The major difference between DTDs and Schemas is
    that Schemas allow better data typing (and
    Schemas are in XML)
  • Became a recommendation on May 2, 2001
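
To give a feel for what "better data typing" buys, the sketch below checks element text against declared types. It is plain Python standing in for a schema processor, not XML Schema itself; the field names and the FIELD_TYPES table are hypothetical. A DTD could only declare each of these fields as #PCDATA, that is, any text.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical typed rules, standing in for what a schema would declare.
FIELD_TYPES = {
    "policy_number": int,
    "effective": date.fromisoformat,
    "premium": float,
}

def type_errors(xml_text):
    """Return the names of child elements whose text does not fit its type."""
    errors = []
    for child in ET.fromstring(xml_text):
        convert = FIELD_TYPES.get(child.tag)
        if convert is None:
            continue  # no type declared for this element
        try:
            convert(child.text or "")
        except ValueError:
            errors.append(child.tag)
    return errors

good = (
    "<policy><policy_number>1001</policy_number>"
    "<effective>2001-05-02</effective><premium>250.00</premium></policy>"
)
bad = (
    "<policy><policy_number>1001</policy_number>"
    "<effective>next Tuesday</effective><premium>lots</premium></policy>"
)

print(type_errors(good))
print(type_errors(bad))
```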

62
Software and Services
  • Amaya - W3C's Editor/Browser
  • Amaya is a browser/authoring tool that allows you
    to publish documents on the Web.
  • From http://www.w3.org/Amaya/
  • CSS Validator - W3C CSS Validation Service
  • At http://jigsaw.w3.org/css-validator/

63
Software and Services (cont.)
  • HTML Tidy
  • Tidy is a utility which is able to fix up a wide
    range of HTML problems.
  • From http://www.w3.org/People/Raggett/tidy/
  • HTML Validator
  • It checks HTML documents for conformance to W3C
    HTML and XHTML Recommendations and other HTML
    standards.
  • From http://validator.w3.org/

64
Software and Services (cont.)
  • Jigsaw - W3C's Java Server
  • Jigsaw is W3C's leading-edge Web server platform,
    providing a sample HTTP 1.1 implementation on top
    of an advanced architecture implemented in Java.
    From http://www.w3.org/Jigsaw/
  • Libwww
  • Libwww is a highly modular, general-purpose
    client side Web API written in C for Unix and
    Windows (Win32). From http://www.w3.org/Library/

65
Working Groups
  • CC/PP (Composite Capabilities/Preference
    Profiles)
  • Automating the way in which your agent (PC, cell
    phone, PDA) identifies its capabilities and
    preferences
  • Device Independence Activity
  • These Groups are working towards making the
    information of the World Wide Web accessible to
    various devices and achieving Web device
    independent authoring.

66
Working Groups (cont.)
  • Internationalization Working Group and
    Internationalization Interest Group
  • These groups promote the use of Unicode in other
    recommendations and activities
  • Micropayments
  • The Internet enables commerce in intangibles
    (like information), but conventional payment
    methods are too expensive for this

67
Working Groups (cont.)
  • XForms - Interactive forms in XML
  • XML Encryption - encrypting/decrypting XML
    documents and their contents
  • XML Protocol - using XML as an encapsulation
    language in communications
  • XML Query - enabling collections of XML files to
    be accessed like databases
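
The idea of querying XML like a database can be sketched today with the limited XPath subset in Python's ElementTree. This is an illustration only and is not the XML Query language the working group is defining; the policy records are hypothetical.

```python
import xml.etree.ElementTree as ET

# A small collection of hypothetical policy records.
policies = ET.fromstring(
    "<policies>"
    '<policy state="TX"><holder>Smith</holder><premium>250</premium></policy>'
    '<policy state="OK"><holder>Jones</holder><premium>400</premium></policy>'
    '<policy state="TX"><holder>Garcia</holder><premium>310</premium></policy>'
    "</policies>"
)

# Roughly: SELECT holder FROM policies WHERE state = 'TX'
texas_holders = [
    p.findtext("holder") for p in policies.findall("policy[@state='TX']")
]
print(texas_holders)
```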

68
Working Groups (cont.)
  • Voice Browser Activity
  • This group has created a number of working
    drafts, such as on a Speech Recognition Grammar
    and a Speech Synthesis Markup Language
  • The W3C working group is basing its proposal for
    Dialog Markup Language on VoiceXML, from the
    VoiceXML Forum (www.voicexml.org), which is an
    IEEE group

69
Initiatives
  • Web Accessibility Initiative (WAI)
  • These guidelines explain how to make Web content
    accessible to people with disabilities
  • P3P - Platform for Privacy Preferences
  • P3P is an industry standard providing a simple,
    automated way for users to gain more control over
    the use of personal information on Web sites they
    visit.

70
Where Can I find...?
  • Each of the preceding items can be found (today)
    at www.w3c.org
  • Everyone should check here periodically to obtain
    updates
  • Members can participate in projects and setting
    standards
  • www.xml.com is a commercial site with a
    newsletter and a huge amount of educational
    material

71
3 - Conversion Of Technology Into Business Applications
72
XML In The Verticals
  • The next step in the evolution of XML is the
    integration of XML objects into the processes of
    verticals, e.g., insurance, telecommunications,
    banking, finance, etc.
  • In each vertical, groups will come together to
    create standards for that vertical
  • This phase is just beginning in most verticals

73
The Insurance Vertical
  • ACORD (www.acord.org) is a well-known body in the
    insurance vertical
  • ACORD, the Association for Cooperative Operations
    Research and Development, describes itself as
    the insurance industry's nonprofit standards
    developer
  • ACORD initially developed standard forms to
    enable information sharing in the vertical

74
ACORD And P&C
  • In the Property and Casualty business, the main
    driver to the Internet is the real-time exchange
    of data between producers, carriers, rating
    bureaus, service providers, and more.
  • The ACORD XML standard is designed to address
    the real-time requirement by defining P&C
    transactions that include both a request and a
    response message.
  • From http://www.acord.org/xml_frame.htm

75
4 - Reduction To Commodity
76
Reduction To Commodity
  • In the last phase, the technology disappears
    from the view of the user
  • Older technologies are invisibly replaced with
    the newer technology, e.g. EDI by XML
  • Users perform business-oriented tasks without
    being aware of underlying technology

77
Past Progressions - Example 1
  • 1 - Computer chips
  • 2 - assembler
  • 3 - COBOL, Fortran, PL/I, C, and a host of 3rd
    generation languages
  • 4 - GUI-based code generators
  • We are now well into phase 4

78
Past Progressions - Example 2
  • 1 - Laser printer
  • 2 - FDL (Xerox), PPFA (IBM), etc.
  • 3 - Business-user friendly composition and
    formatting tools
  • 4 - GUI-based products with multiple,
    transparent drivers
  • We are now in phase 4

79
The Growth Of The XML Bubble
80
[Diagram: business functions - Compliance, Archive, New Sales, Reprints, Policy Print, Reports, Notices, CRM, 1:1 Mark., Campaign Manage., Billing, HR, Pol. Proc., EDI]

81
[Diagram: the same functions, with XML and EBPP added]

82-86
[Diagram sequence: the same functions plus EBPP, with a region labeled XML Bubble]

87
Today's Billing Process
[Diagram: Data Base, Billing Extract, Post Process, Print/Format]

88
Today's Billing Process + XML
[Diagram: Data Base, Billing Extract, Post Process, Print/Format, XML App.]

89
As the Bubble Grows
[Diagram: Data Base, Billing Extract, Post Process, Print/Format, XML App.]

90
[Diagram: XML Applications with business rules, four boxes labeled Driver, Email]

91
Composition Systems Before XML - 1
[Diagram: Business Rules, Composition, Data base]

92
Composition Systems Before XML - 2
[Diagram: Business Rules, Composition, Data base]

93
[Diagram: Composition, Business Rules, XML Applications with business rules, four boxes labeled Driver, Email]

94
The Effect on Complex Systems
  • Over time, simple tools became complex systems
  • Due to competition, these systems added
    functionality beyond the core product
  • The XML Bubble will cause these systems to split
    again
  • Much of the added functionality was and will be
    vertically specific, and fall into the XML Bubble

95
Reference
  • www.w3c.org - the official World Wide Web
    Consortium site (you'll find links to the XML
    spec here)
  • http://www.w3.org/XML/ - a long but not
    exhaustive list of XML sites, software, and
    information
  • "Taming The Web With XML" - an entry level
    article describing XML at
    http://www.mhe-consulting.com/writep1.html

96
William J. "Bill" McCalpin
  • MIT, LIT, CDIA, EDP
  • Principal, MHE
  • 1400 Cheyenne Dr.
  • Richardson, Texas 75080-3921
  • 972-231-3660 (v) 972-690-4521 (f)
  • mccalpin_at_mhe-consulting.com