Title: Folie 1
1XML Technology
15th Meeting of the European Working Group on
Operational Workstations (EGOWS)
sdm AG software design management, Carl-Wery-Str. 42, 81739 München, phone 089 63812-0, www.sdm.de
Martin Lehmann, sdm AG
Potsdam, June 14th 2004
Version 1.0
A Company of
2<speech title="XML Technology"
  subtitle="15th Meeting of the European Working Group on Operational Workstations (EGOWS)"/>
<author name="Martin Lehmann" company="sdm AG"/>
<conference location="Potsdam" date="June 14th 2004"/>
<version no="1.0"/>
3Agenda
Agenda
4Documents consist of several Parts
An Overview of XML
- Data: pure information
- Structure: relationship of data
XML
- Format / Layout: makes the structure visible
5XML Basics
An Overview of XML
- Definition: XML is a cross-platform, software- and hardware-independent tool for transmitting information
- XML describes data and its structure
- XML does not do anything by itself
- XML is platform independent
- XML is (more or less) human readable
- XML is not designed for the minimization of storage size
- XML has been defined by the World Wide Web Consortium (W3C)
The W3C forum develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential.
6XML is the eXtensible Markup Language
An Overview of XML
- XML stands for eXtensible Markup Language
- XML is a markup language (much like HTML but better)
- XML was designed to describe the structure of data.
- Advantages of structuring data
  - Structured data can be interpreted by machines much more effectively (for example in search engines)
  - Data can be separated from the application much more easily.
7XML vs. HTML: Where is the difference?
An Overview of XML
- XML is not a replacement for HTML.
- XML and HTML were designed with different goals
  - XML describes what a document means
  - HTML describes what a document looks like
- Therefore XML can be used to store, exchange and share any kind of data.
- XML Document Structure Definition
8History
An Overview of XML
- IBM GML (Goldfarb, Mosher, Lorie)
- SGML (Standard Generalized Markup Language) as an important predecessor
9Agenda
Agenda
10The M in XML: Markup of Data
XML Markup of Data
- Markup
  - XML tags are not predefined (i.e. the X in XML). You must define your own tags.
  - XML is therefore a meta language. Each structure definition defines a new XML language.
- Structure
  - XML uses a Document Type Definition (DTD) or an XML Schema (XSD) to describe the data structure
  - XML with a DTD or XML Schema is designed to be self-descriptive
11An XML Document is divided into three Parts
XML Markup of Data
Header: Prolog
Structure: DTD or XSD
- Section defining the underlying structure of the XML document
- Document Type Definition (.dtd) or XML Schema Definition (.xsd)
Content: Data
- Data section itself
- XML file (.xml)
We will focus on Data first.
12XML Data Description in Elements ...
XML Markup of Data
- Data is described with the markup of element tags:
  <element></element> or <element/>
- Example:
  <Person>Friedrich der Große</Person>
- Nesting of elements is possible. Elements therefore form a tree (with parents, children etc.)
- Example:
  <Person>
    <firstName>Friedrich</firstName>
    <lastName>der Große</lastName>
  </Person>
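The tree formed by nested elements can be walked in code. A minimal sketch, assuming Python's standard xml.etree.ElementTree module (not part of the original slides); the helper name describe is illustrative only:

```python
import xml.etree.ElementTree as ET

# The nested Person example from this slide.
PERSON_XML = """
<Person>
  <firstName>Friedrich</firstName>
  <lastName>der Große</lastName>
</Person>
"""

def describe(xml_text):
    """Return the root tag and a dict mapping each child tag to its text."""
    root = ET.fromstring(xml_text)
    children = {child.tag: child.text for child in root}
    return root.tag, children

root_tag, person_fields = describe(PERSON_XML)
print(root_tag, person_fields)
```

The root element is the parent; iterating over it yields its child elements in document order.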
13XML Data Description in Elements ... and
Attributes
XML Markup of Data
- You can add an unlimited number of attributes to each element
- Example:
  <Person profession="king" born="Jan 24th 1712" died="Aug 17th 1786"/>
14XML Data Description in Elements ... and
Attributes
XML Markup of Data
- Data can be stored in elements, attributes or both
- Attributes only:
  <Person firstName="Friedrich" lastName="der Große" profession="king"/>
- Elements only:
  <Person>
    <firstName>Friedrich</firstName>
    <lastName>der Große</lastName>
    <profession>king</profession>
  </Person>
- Both:
  <Person profession="king">
    <firstName>Friedrich</firstName>
    <lastName>der Große</lastName>
  </Person>
15Elements vs. Attributes
XML Markup of Data
- Some of the problems using attributes
  - Attributes cannot describe structures
  - Attributes cannot contain multiple values
  - Attributes are not easily expandable
  - Attributes are more difficult to manipulate (by code)
  - Attribute values are not easy to test against a DTD
- If you use attributes as data containers, you end up with documents that are difficult to read and maintain.
- Use elements to describe data.
- Use attributes only to provide additional information about the data.
16XML Syntax Overview in brief
XML Markup of Data
- A closing tag is always necessary
- XML tag names are case sensitive
- XML elements must be nested properly
- An XML document must have a root element
- White space is not truncated
- CR/LF is converted to LF
- XML comments are like in HTML: <!-- This is an annotation -->
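Most of these syntax rules can be checked by simply trying to parse a document. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name is_well_formed and the sample strings are illustrative only:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One sample per syntax rule from the list above.
samples = {
    "closed and nested": "<a><b>text</b></a>",
    "missing closing tag": "<a><b>text</a>",
    "case mismatch": "<Person>x</person>",
    "improper nesting": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "comment": "<a><!-- This is an annotation --></a>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, ok)
```

Only the first and the last sample parse; every other sample violates one of the rules and is rejected by the parser.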
17XML Markup: Looking back to the XMLized Title Slide
XML Markup of Data
18Agenda
Agenda
- XML Structure DTD and XML Schema
19XML Structure
XML Structure DTD and XML Schema
- Structure
  - XML uses a Document Type Definition (DTD) or an XML Schema (XSD) to describe the data structure
  - XML with a DTD or XML Schema is designed to be self-descriptive
- An XML document with correct syntax is called well-formed.
- XML validated against a DTD is called valid.
- A DTD is optional, i.e. not strictly necessary, but recommended for checking the structure. If one knows what to do, the DTD can be omitted.
By the way: HTML does not even have to be well-formed.
20Document Type Definition DTD
XML Structure DTD and XML Schema
- A DTD defines the underlying document structure.
- It names and qualifies all XML elements and their attributes.
DTD:
  <?xml version="1.0" encoding="UTF-8"?>
  <!ELEMENT Person (firstName, lastName)>
  <!ATTLIST Person profession CDATA #REQUIRED>
  <!ELEMENT firstName (#PCDATA)>
  <!ELEMENT lastName (#PCDATA)>
XML:
  <Person profession="king">
    <firstName>Friedrich</firstName>
    <lastName>der Große</lastName>
  </Person>
21Document Type Definition DTD
XML Structure DTD and XML Schema
- All elements have to be named
- The structure has to be defined with regular expressions to define the nesting of elements
- Element defining child elements in a sequence:
  <!ELEMENT Person (firstName, lastName)>
- Element defining child elements, one of them optional:
  <!ELEMENT Person (firstName?, lastName)>
- Element defining a series of child elements:
  <!ELEMENT Person (firstName*, lastName+)>
  * means 0 or more, + means at least 1
- Element defining a choice:
  <!ELEMENT Person (firstName | lastName)>
- EMPTY element:
  <!ELEMENT firstName EMPTY>
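The link between DTD content models and regular expressions can be made concrete. The sketch below is an illustration only, not a DTD validator: the regular expressions are translated by hand from the content models on this slide, the placement of * and + in the series model is an assumption, and the names CONTENT_MODELS, child_sequence and matches are hypothetical. It uses Python's standard re and xml.etree.ElementTree modules.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated regular expressions for the content models on this slide.
# In the encoded sequence every child tag name is followed by one space.
CONTENT_MODELS = {
    "sequence": r"firstName lastName ",        # (firstName, lastName)
    "optional": r"(firstName )?lastName ",     # (firstName?, lastName)
    "series": r"(firstName )*(lastName )+",    # (firstName*, lastName+), assumed
    "choice": r"(firstName |lastName )",       # (firstName | lastName)
}

def child_sequence(xml_text):
    """Encode the child tags of the root element as one string."""
    root = ET.fromstring(xml_text)
    return "".join(child.tag + " " for child in root)

def matches(model, xml_text):
    """Check the root's children against one of the content models."""
    pattern = CONTENT_MODELS[model]
    return re.fullmatch(pattern, child_sequence(xml_text)) is not None

only_last = "<Person><lastName>der Große</lastName></Person>"
print(matches("sequence", only_last))  # firstName is missing
print(matches("optional", only_last))  # firstName may be omitted
```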
22XML Schema Better than a DTD
XML Structure DTD and XML Schema
- DTD is pretty old and has several disadvantages. XML Schema (XSD) has been defined to overcome DTD limitations.
- An XSD uses XML syntax, i.e. only one language is used. Any application which can work with XML can work with XSD, too.
- XSD has support for data types and restrictions on data content. This makes it easier
  - to describe permissible document content
  - to validate the correctness of data
  - to define restrictions on data and data patterns
  - to convert data
23XML Schema
XML Structure DTD and XML Schema
- Predefined data types
  - String
  - Various numerical types
  - Time and date
  - Boolean
  - Binary (hex- or Base64-coded)
- The type hierarchy consists of base and derived types
- Own data types can be defined as
  - Restrictions of existing types
  - Lists of existing types
  - Unions (of the value ranges) of existing types
- Facets (aspects) can be defined for each data type (like setting min and max, defining a pattern, enumerations etc.)
24XML Header Information
XML Structure DTD and XML Schema
- XML version information is required:
  <?xml version="1.0"?>
- More attributes are optional
  - standalone
  - encoding
- Example:
  <?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
25Putting it all together ...
XML Structure DTD and XML Schema
Header
  <?xml version="1.0" encoding="UTF-8"?>
Structure as DTD: Elements, Attributes and Entities (Shortcuts)
  <!DOCTYPE employee [
    <!ENTITY auml "&#228;">
    <!ELEMENT employee (name, phone+)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT phone EMPTY>
    <!ATTLIST phone
      country (D|CH|DK|CDN|unknown) "D"
      predial CDATA #IMPLIED
      number  CDATA #REQUIRED>
  ]>
Data
  <employee>
    <name>Erich Kuchik&auml;schtli</name>
    <phone predial="123" number="4567"/>
    <phone country="CH" predial="041" number="987654"/>
  </employee>
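A document with an internal DTD like this one can be read with a standard parser. A minimal sketch, assuming Python's xml.etree.ElementTree; the phone+ content model and the restored punctuation in the DTD are a reconstruction, not confirmed by the slide. Note that this parser does not validate against the DTD; it only expands the declared entity.

```python
import xml.etree.ElementTree as ET

# Header, structure (internal DTD) and data in one document.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE employee [
  <!ENTITY auml "&#228;">
  <!ELEMENT employee (name, phone+)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone EMPTY>
  <!ATTLIST phone
    country (D|CH|DK|CDN|unknown) "D"
    predial CDATA #IMPLIED
    number  CDATA #REQUIRED>
]>
<employee>
  <name>Erich Kuchik&auml;schtli</name>
  <phone predial="123" number="4567"/>
  <phone country="CH" predial="041" number="987654"/>
</employee>
"""

root = ET.fromstring(DOCUMENT)

# The entity &auml; declared in the DTD is expanded while parsing.
employee_name = root.findtext("name")

# Attributes written in the document are available per element.
phone_numbers = [phone.get("number") for phone in root.findall("phone")]

print(employee_name, phone_numbers)
```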
26Agenda
Agenda
- XML Usage in Applications
27Using XML in Applications
XML Usage in Applications
- XML can be used to ...
  - ... transfer data between applications (used as self-describing data)
  - ... store data
- Steps when using the data of an XML document
  - The data has to be read and interpreted
  - The data has to be transformed to other formats (like Java objects, database content, web pages etc.)
28DOM and SAX
XML Usage in Applications
- Two standard programming APIs are defined: DOM and SAX
- DOM: Document Object Model for XML documents
  - Defined by the W3C
  - The XML document is represented as a tree structure.
  - Creation of documents, navigation through the structure and modification of elements are easy, as they are based on the document tree.
- SAX: event-based callback interface
  - Defined by the XML-DEV community (SAX 1.0 since May 1998)
  - Open source, but a de facto standard
  - Events are generated whenever the parser reads a special type of element or other content
29DOM Example
XML Usage in Applications
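A minimal DOM sketch, assuming Python's standard xml.dom.minidom module rather than the Java API the talk is based on. The company/employee document mirrors the one used in the SAX call trace; the room element and its value are hypothetical.

```python
from xml.dom import minidom

COMPANY_XML = (
    "<company><employee>"
    "<name>Dilbert</name><phone>1234</phone>"
    "</employee></company>"
)

# The whole document is parsed into a tree held in memory.
doc = minidom.parseString(COMPANY_XML)

# Navigation: go from the root element down to the text of <name>.
employee = doc.documentElement.getElementsByTagName("employee")[0]
name_text = employee.getElementsByTagName("name")[0].firstChild.data

# Modification: change the phone number in place.
phone = employee.getElementsByTagName("phone")[0]
phone.firstChild.data = "5678"

# Creation: append a new element with a text node (hypothetical data).
room = doc.createElement("room")
room.appendChild(doc.createTextNode("B42"))
employee.appendChild(room)

dom_result = doc.documentElement.toxml()
print(name_text)
print(dom_result)
```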
30SAX Example
XML Usage in Applications
- The interface ContentHandler defines events
  - Opening a document
  - Opening an element
  - Closing an element
  - CDATA sections
  - ...
- MyContentHandler has to implement the callback functions
31SAX Example
XML Usage in Applications
SAX Parser Method Calls
1 startDocument()
2 startElement("company", attribs)
3 startElement("employee", attribs)
4 startElement("name", attribs)
5 characters(char[], start, length) means "Dilbert"
6 endElement("name")
7 startElement("phone", attribs)
8 characters(char[], start, length) means "1234"
9 endElement("phone")
10 endElement("employee")
11 endElement("company")
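The call sequence above can be reproduced with a small handler. A minimal sketch, assuming Python's standard xml.sax module instead of the Java SAX API: here characters() receives the text directly rather than a char array with start and length, and the parser additionally reports endDocument(). The class name RecordingHandler is illustrative only.

```python
import xml.sax

class RecordingHandler(xml.sax.ContentHandler):
    """Collects the callbacks made by the parser, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def endDocument(self):
        self.events.append(("endDocument",))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        # The parser may deliver text in several chunks; merge adjacent ones.
        if self.events and self.events[-1][0] == "characters":
            self.events[-1] = ("characters", self.events[-1][1] + content)
        else:
            self.events.append(("characters", content))

COMPANY_XML = (
    b"<company><employee>"
    b"<name>Dilbert</name><phone>1234</phone>"
    b"</employee></company>"
)

handler = RecordingHandler()
xml.sax.parseString(COMPANY_XML, handler)
for number, event in enumerate(handler.events, start=1):
    print(number, event)
```

The handler never sees the whole document at once; it only reacts to each event as the parser reads the input.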
32DOM vs. SAX
XML Usage in Applications
- DOM
  - Easy navigation in the document
  - Modification and creation of documents is possible
  - Parsing process is quite transparent
  - Lots of resources needed
  - Should be used if the whole document needs to be read anyway
- SAX
  - Faster
  - Fewer resources needed
  - API is easy
  - Lower-level access
  - Should be used if only special parts of the document have to be read (fast)
33Agenda
Agenda
- Example NinJo Configuration using XML
34Automatic Data Binding
Example NinJo Configuration using XML
- Disadvantage of DOM and SAX: application programmers should not have to know about the parsing process; it should be kept transparent
- Idea: Automatic Data Binding
  - The structure (DTD or XSD) of the document is normally known in advance. One can use this structure to generate other representations from it, for example source code for Java classes
  - The transformation from XML to Java can then be done automatically; the programmer does not need to care about parsing.
- The standard Java API JAXB implements this data binding idea.
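The data binding idea can be sketched in a few lines. This is a Python analogue for illustration, not JAXB: the Person class is written by hand here, whereas a binding tool would generate it from the DTD or XSD, and the function names unmarshal and marshal are assumptions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

@dataclass
class Person:
    # In real data binding this class would be generated from the schema.
    firstName: str
    lastName: str
    profession: str

def unmarshal(cls, xml_text):
    """Build an instance of cls from XML. Each field is read from an
    attribute or, failing that, from a child element of the same name."""
    root = ET.fromstring(xml_text)
    values = {}
    for field in fields(cls):
        raw = root.get(field.name)
        if raw is None:
            raw = root.findtext(field.name)
        values[field.name] = raw
    return cls(**values)

def marshal(obj):
    """Write the object back as XML, one child element per field."""
    root = ET.Element(type(obj).__name__)
    for field in fields(obj):
        ET.SubElement(root, field.name).text = getattr(obj, field.name)
    return ET.tostring(root, encoding="unicode")

person = unmarshal(
    Person,
    '<Person profession="king">'
    "<firstName>Friedrich</firstName><lastName>der Große</lastName>"
    "</Person>",
)
print(person)
print(marshal(person))
```

The calling code works with a Person object only; no parsing details leak into it.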
35Automatic Data Binding
Example NinJo Configuration using XML
XML Structure
Java Structure
XML Content
Java Content
36NinJo Configuration Framework
Example NinJo Configuration using XML
- The NinJo Configuration Framework takes care of (un-)marshalling automatically, i.e. data binding technology is used in the background.
- Configuration files can be read from and written to a local directory tree. An extension to a Configuration Server is currently in progress.
- For this purpose NinJo has developed its own XML language (NML: NinJo Modeling Language) to define the structure and to simplify the automatic mapping.
  - The structure is defined in a so-called XBD file
  - The content is defined in an XML file
- The XBD structure defines the configuration structure
  - Configuration contents
  - Configuration references (links to or reuse of other modules)
37NML NinJo Modeling Language
Example NinJo Configuration using XML
38NinJo Configuration Framework ConfigDesigner
Example NinJo Configuration using XML
39NinJo Configuration Framework Config Context
Example NinJo Configuration using XML
- Each NinJo module has its own Configuration Context (i.e. its own location where its config files are stored).
- The Configuration Context is stored in a hierarchy: global, site-specific and user-specific configurations can be stored and retrieved.
SystemContext
SiteContext
UserContext
40Example NinJo ColorTable Configuration
(Temperature)
Example NinJo Configuration using XML
<?xml version="1.0" encoding="UTF-8"?>
<colorTableCfg defGreen="204" unit="C/10">
  <unitClass configName="temperature"/>
  <colors blue="200" green="0" red="200" value="-350.0"/>
  <colors blue="255" green="150" red="255" value="-300.0"/>
  <colors blue="255" green="150" red="0" value="-200.0"/>
  <colors blue="255" green="255" red="255" value="-100.0"/>
  <colors blue="255" green="255" red="0" value="0.0"/>
  <colors blue="200" green="200" red="0" value="50.0"/>
  <colors blue="50" green="200" red="50" value="100.0"/>
  <colors blue="50" green="255" red="50" value="150.0"/>
  <colors blue="0" green="240" red="255" value="200.0"/>
  <colors blue="50" green="150" red="255" value="250.0"/>
  <colors blue="0" green="0" red="255" value="300.0"/>
  <colors blue="0" green="0" red="200" value="350.0"/>
</colorTableCfg>
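A configuration like this could be consumed as follows. This is a sketch under assumptions, not the NinJo implementation: it uses Python's standard library on an excerpt of the table, and the lookup rule (use the entry with the largest value not above the input) is a guess at how such a table might be applied. The function names are hypothetical.

```python
import bisect
import xml.etree.ElementTree as ET

# Excerpt of the color table above; values are in tenths of a degree (C/10).
COLOR_TABLE_XML = """
<colorTableCfg defGreen="204" unit="C/10">
  <unitClass configName="temperature"/>
  <colors blue="255" green="255" red="255" value="-100.0"/>
  <colors blue="255" green="255" red="0" value="0.0"/>
  <colors blue="200" green="200" red="0" value="50.0"/>
  <colors blue="50" green="200" red="50" value="100.0"/>
</colorTableCfg>
"""

def load_color_table(xml_text):
    """Return (thresholds, colors), both sorted by threshold value."""
    root = ET.fromstring(xml_text)
    entries = sorted(
        (float(c.get("value")),
         (int(c.get("red")), int(c.get("green")), int(c.get("blue"))))
        for c in root.findall("colors")
    )
    thresholds = [value for value, _ in entries]
    colors = [rgb for _, rgb in entries]
    return thresholds, colors

def color_for(thresholds, colors, value):
    """Assumed rule: use the entry with the largest threshold <= value;
    below the table's range, fall back to the first entry."""
    index = bisect.bisect_right(thresholds, value) - 1
    return colors[max(index, 0)]

thresholds, colors = load_color_table(COLOR_TABLE_XML)
print(color_for(thresholds, colors, 25.0))  # 2.5 degrees Celsius
```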
41Agenda
Agenda
- An Overview of Advanced XML Technologies
42Advanced XML Techniques
An Overview of Advanced XML Technologies
Namespaces avoid name conflicts
XSL is used for applying style sheets
XHTML and XForms are replacements for or extensions to HTML
SOAP is the basic protocol for Web Services
43XSL, XSLT and XSL-FO
An Overview of Advanced XML Technologies
- XSL: eXtensible Stylesheet Language
- The W3C started to develop XSL because there was a need for an XML-based style sheet language.
- Style sheets describe how documents are presented on screens, in print or published somewhere else
- XSL consists of three parts
  - XSLT is a language for transforming XML documents
  - XPath is needed as an underlying language for addressing parts of an XML document
  - XSL-FO is a language for formatting XML documents
44What can be done with XSL ?
An Overview of Advanced XML Technologies
- Think of XSL as a set of languages that can
  - transform one XML language into another,
  - filter and sort XML data,
  - address parts of an XML document,
  - format XML data based on the data value (for example displaying negative numbers in red),
  - output XML data to different media (like screens, paper, or voice).
45What can be done with XSL ?
An Overview of Advanced XML Technologies
46XSLT used for Transformations
An Overview of Advanced XML Technologies
- XSLT is a language for transforming XML documents into other XML documents (hence the name).
- Often, this is used to transform XML to a type of document that is recognized by a browser, like HTML. Normally XSLT does this by transforming each XML element into an HTML element.
- XSLT can test and make decisions about which elements to display, like ...
  - ... adding new elements to the output,
  - ... removing elements,
  - ... rearranging or sorting elements.
- A common way to describe the transformation process is to say that XSLT transforms an XML source tree into an XML result tree.
47XPath
An Overview of Advanced XML Technologies
- XPath is the basic addressing language XSL works on.
- It defines a node tree (quite similar to DOM). The nodes are ordered (depending on their appearance in the document).
- One can navigate through the tree. Some relationships are shown below (not all; 11 are defined in total).
parent
ancestor
children
descendants
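Some of these relationships can be tried out directly. A minimal sketch, assuming Python's xml.etree.ElementTree, which supports only a limited subset of XPath; the second cd entry is made-up sample data.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """
<catalog>
  <cd><title>The Joshua Tree</title><artist>U2</artist></cd>
  <cd><title>Example Album</title><artist>Example Artist</artist></cd>
</catalog>
"""

root = ET.fromstring(CATALOG_XML)

# Children: direct cd elements below the catalog root.
child_tags = [element.tag for element in root.findall("./cd")]

# Descendants: all title elements at any depth, in document order.
titles = [element.text for element in root.findall(".//title")]

# Parent: step from the first title back up to its enclosing element.
parent_tag = root.find(".//title/..").tag

# A predicate selects nodes by content.
u2_title = root.find("./cd[artist='U2']/title").text

print(child_tags, titles, parent_tag, u2_title)
```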
48An Example for XSL: Publish a CD Catalog on the Web
An Overview of Advanced XML Technologies
1 CD Catalog XML Document
2 XSL Document
3 Apply XSL to XML
4 Output is an HTML page with the CD Catalog
49An Example for XSL - Step 1 XML Document
An Overview of Advanced XML Technologies
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
  <cd>
    <title>The Joshua Tree</title>
    <artist>U2</artist>
    <country>Ireland</country>
    <company>Island Records</company>
    <price>16.99</price>
    <year>1995</year>
  </cd>
  ...
</catalog>
50An Example for XSL - Step 2 XSL Stylesheet
An Overview of Advanced XML Technologies
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" ...>
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th align="left">Title</th>
        <th align="left">Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
HTML (blue) is written to the output as shown
XSL Transformations
51An Example for XSL - Step 3/4 Apply XSL to XML, Browser Output
An Overview of Advanced XML Technologies
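What the stylesheet does to the catalog can be imitated by hand. Python's standard library has no XSLT processor, so the sketch below is not an XSLT run; it performs the equivalent tree-to-tree transformation with xml.etree.ElementTree. The function name catalog_to_html is illustrative only.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd><title>The Joshua Tree</title><artist>U2</artist></cd>
</catalog>"""

def catalog_to_html(xml_text):
    """Build the HTML result tree from the XML source tree."""
    catalog = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h2").text = "My CD Collection"
    table = ET.SubElement(body, "table", border="1")
    header = ET.SubElement(table, "tr", bgcolor="#9acd32")
    for label in ("Title", "Artist"):
        ET.SubElement(header, "th", align="left").text = label
    # Counterpart of <xsl:for-each select="catalog/cd">
    for cd in catalog.findall("cd"):
        row = ET.SubElement(table, "tr")
        # Counterpart of <xsl:value-of select="title"/> and "artist"
        ET.SubElement(row, "td").text = cd.findtext("title")
        ET.SubElement(row, "td").text = cd.findtext("artist")
    return ET.tostring(html, encoding="unicode", method="html")

page = catalog_to_html(CATALOG_XML)
print(page)
```

The fixed HTML parts are copied to the output as they are; only the loop and the two value lookups depend on the source document.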
52Advanced XML Techniques
An Overview of Advanced XML Technologies
Namespaces avoid name conflicts
XSL is used for applying style sheets
XHTML and XForms are replacements for or extensions to HTML
SOAP is the basic protocol for Web Services
53Namespaces
An Overview of Advanced XML Technologies
- Since element names in XML are not fixed, a name conflict will often occur when two different documents use the same names to describe two different types of elements.
- This problem is solved with unique prefixes.
Table uses the HTML4 namespace:
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
Table uses a furniture namespace:
  <f:table xmlns:f="http://www.ikea.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
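A parser keeps the two kinds of table apart by their namespace URI, not by the prefix. A minimal sketch, assuming Python's xml.etree.ElementTree; the enclosing root element is added only to hold both tables in one document.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/html4/"
FURNITURE_NS = "http://www.ikea.com/furniture"

# Both tables in one document; <root> is a hypothetical wrapper.
DOC = f"""
<root xmlns:h="{HTML_NS}" xmlns:f="{FURNITURE_NS}">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width><f:length>120</f:length>
  </f:table>
</root>
"""

root = ET.fromstring(DOC)

# Internally each tag is stored as {namespace-uri}local-name.
table_tags = [child.tag for child in root]

# Searches map prefixes to URIs; the prefixes used here are local choices.
ns = {"h": HTML_NS, "f": FURNITURE_NS}
html_cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture_name = root.findtext("f:table/f:name", namespaces=ns)

print(table_tags, html_cells, furniture_name)
```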
54Advanced XML Techniques
An Overview of Advanced XML Technologies
Namespaces avoid name conflicts
XSL is used for applying style sheets
XHTML and XForms are replacements for or extensions to HTML
SOAP is the basic protocol for Web Services
55XHTML eXtensible HyperText Markup Language
An Overview of Advanced XML Technologies
- XHTML is HTML defined in XML.
- Why XHTML?
  - The Web has reached a point where many pages contain "bad" HTML.
  - HTML code might work fine if you view it in a browser, even if it does not follow the HTML rules.
- XHTML is almost identical to HTML 4.01. It aims to replace HTML, but it is stricter and cleaner.
  - XHTML elements must be properly nested
  - XHTML documents must be well-formed
  - Tag names must be in lowercase
  - All XHTML elements must be closed
56XForms
An Overview of Advanced XML Technologies
- XForms are the next generation of HTML forms
- XForms are richer and more flexible than HTML forms
- They are platform and device independent
- They are designed to handle interactive transactions
- Data and logic are separated from presentation
- XForms contain features like calculations and validations of forms (less scripting is required)
- XForms forms can be routed to multiple users and locations
57Advanced XML Techniques
An Overview of Advanced XML Technologies
Namespaces avoid name conflicts
XSL is used for applying style sheets
XHTML and XForms are replacements for or extensions to HTML
SOAP is the basic protocol for Web Services
58SOAP Simple Object Access Protocol
An Overview of Advanced XML Technologies
- SOAP will be developed as a W3C standard
- SOAP is a protocol for accessing Web Services
- SOAP is a simple communication protocol to let applications exchange information via Internet protocols
- SOAP is platform and language independent
- SOAP is based on XML
- SOAP is simple and extensible
59SOAP
An Overview of Advanced XML Technologies
- SOAP messages are ordinary XML documents
  <?xml version="1.0"?>
  <soap:Envelope xmlns:soap="...">
    <soap:Header>
      ...
    </soap:Header>
    <soap:Body>
      ...
      <soap:Fault>
        ...
      </soap:Fault>
    </soap:Body>
  </soap:Envelope>
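Because a SOAP message is ordinary XML, it can be built with any XML library. A minimal sketch, assuming Python's xml.etree.ElementTree. The slide elides the namespace URI; the SOAP 1.2 envelope namespace is used here as an assumption. The GetTemperature request and its station attribute are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: SOAP 1.2 envelope namespace (the slide elides the URI).
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
ET.register_namespace("soap", SOAP_NS)

def build_envelope(payload):
    """Wrap a payload element in an Envelope with Header and Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical application request carried in the Body.
request = ET.Element("GetTemperature")
request.set("station", "Potsdam")

message = build_envelope(request)
print(message)

# The receiver parses the message like any other XML document.
received = ET.fromstring(message)
received_body = received.find(f"{{{SOAP_NS}}}Body")
received_request = received_body[0]
```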
60Usage of SOAP in Web Services
An Overview of Advanced XML Technologies
- SOAP is the protocol for Web Services.
- Web Services are loosely coupled software components based on standard Internet technologies (a current hype, not a very new idea).
- Two of the key problems solved by Web Services (over earlier distributed systems such as CORBA, DCOM, RPC ...) are
  - Interoperability: usage of a standardized format for distributed messaging.
  - Firewall traversal: usage of non-standard ports is a problem for collaboration across corporations. But most firewalls allow access through port 80 (for HTTP).
61Usage of SOAP in Web Services
An Overview of Advanced XML Technologies
- SOAP communication can be based on several underlying protocols: HTTP, FTP, TCP, SMTP, POP3, MQSeries, ...
- Architecture: the Registry knows about Providers. A Consumer asks the Registry for a Provider and then calls that Provider.
62Usage of SOAP in Web Services
An Overview of Advanced XML Technologies
- Each application can be a combination of
Providers, Consumers or Registries.
63Agenda
Agenda
64Literature
Literature
- XML: http://www.w3.org/TR/REC-xml, http://www.xml.org
- XML Software: http://www.xmlsoftware.com
- DOM: http://www.w3.org/DOM/
- SAX: http://sax.sourceforge.net/
- XSL: http://www.w3.org/Style/XSL/
- XPath: http://www.w3.org/TR/xpath
- SOAP: http://www.w3.org/TR/soap/
- Data Binding: http://java.sun.com/xml/jaxb/
- SGML: http://www.irb.hr/cern/WWW/publications/sgmlen/sgmlen.html
- Tutorials: http://www.xmlfiles.com/xml/, http://www.w3schools.com/default.asp, http://java.sun.com/xml/tutorial_intro.html
65Any Questions (about the Content ;-) ?
Literature