Title: Introduction to the Logical Structure of XML Documents
1. Introduction to the Logical Structure of XML Documents
- Web Engineering, SS 2007
- Tomáš Pitner, Michael Derntl
2. Example of an XML Document
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
3. XML Document Prolog/Heading
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
XML prolog: the first line of the example, <?xml version='1.0' encoding='UTF-8'?>
Version: typically 1.0
Character encoding: UTF-8, UTF-16, and US-ASCII always work!
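To make the encoding remark concrete, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It serializes one small document in the three encodings named above and parses each byte string. The one-element documents are illustrative assumptions, not part of the slides; the US-ASCII variant uses a character reference because ASCII cannot represent "Ö" directly.

```python
import xml.etree.ElementTree as ET

# The same one-element document, serialized in three encodings.
# The encoding named in the XML declaration must match the actual bytes.
utf8_doc = "<?xml version='1.0' encoding='UTF-8'?><party>SPÖ</party>".encode("utf-8")
utf16_doc = "<?xml version='1.0' encoding='UTF-16'?><party>SPÖ</party>".encode("utf-16")
# US-ASCII has no 'Ö', so the character reference &#214; (U+00D6) is used instead.
ascii_doc = "<?xml version='1.0' encoding='US-ASCII'?><party>SP&#214;</party>".encode("ascii")

# Parsing yields the same text content regardless of the byte encoding.
texts = [ET.fromstring(doc).text for doc in (utf8_doc, utf16_doc, ascii_doc)]
print(texts)
```

The parser reads the declaration (and, for UTF-16, the byte order mark) to decide how to decode the bytes, so the application sees the same characters in all three cases.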
4. XML Document - Root Element
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
The root element (here: staff) contains most of the information in the
document. Every document must have exactly one root element!
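The "exactly one root element" rule can be checked with a short Python sketch, again assuming the standard `xml.etree.ElementTree` parser. The shortened document and the two-root counterexample are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

doc = """<staff organization="Bundesregierung">
  <person id="agu"><name>Alfred Gusenbauer</name></person>
</staff>"""

# fromstring() returns the root element of the parsed document.
root = ET.fromstring(doc)
print(root.tag)

# A second top-level element makes the document ill-formed,
# so the parser rejects it with a ParseError.
try:
    ET.fromstring("<staff/><staff/>")
    two_roots_accepted = True
except ET.ParseError:
    two_roots_accepted = False
print(two_roots_accepted)
```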
5. Elements and Tags
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
Element: everything from the start tag through the end tag, e.g. <name>Alfred Gusenbauer</name>
Element name: the name given in the tags, e.g. name
Start tag of the element: e.g. <name>
End tag of the element: e.g. </name>
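The relation between element name, start tag, and end tag can be sketched in Python with `xml.etree.ElementTree`. The helper `is_well_formed` is a hypothetical name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Build an element from its name and content, then serialize it:
# the serializer emits the start tag, the content, and the end tag.
name = ET.Element("name")
name.text = "Alfred Gusenbauer"
serialized = ET.tostring(name, encoding="unicode")
print(serialized)


def is_well_formed(markup):
    """Return True if the markup parses, False if the parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The end tag must carry the same name as the start tag.
matched = is_well_formed("<name>Alfred Gusenbauer</name>")
mismatched = is_well_formed("<name>Alfred Gusenbauer</person>")
print(matched, mismatched)
```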
6. Attributes
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
Attribute: placed in the element's start tag, e.g. organization="Bundesregierung"
Attribute value: in single or double quotes!
Attribute name: unique within an element!
Note: multiple attributes are separated by whitespace.
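The three attribute rules (either quote style, whitespace separation, unique names) can be tried out with a small Python sketch based on `xml.etree.ElementTree`. The `role` attribute is a hypothetical addition for illustration; it does not appear in the example document.

```python
import xml.etree.ElementTree as ET

# Single and double quotes around the value are equivalent.
double = ET.fromstring('<party url="http://www.spoe.at">SPÖ</party>')
single = ET.fromstring("<party url='http://www.spoe.at'>SPÖ</party>")
same_value = double.get("url") == single.get("url")

# Multiple attributes are separated by whitespace (any amount of it).
# 'role' is a made-up attribute used only for this sketch.
person = ET.fromstring('<person id="agu"   role="member"/>')
attributes = dict(person.attrib)

# An attribute name may occur only once per element;
# a repeated name is a well-formedness error.
try:
    ET.fromstring('<person id="agu" id="wmo"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False

print(same_value, attributes, duplicate_accepted)
```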
7. Element Content: Text Nodes
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
Element: e.g. <name>Alfred Gusenbauer</name>
Element content (text node): the text between the tags, e.g. Alfred Gusenbauer
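A brief Python sketch of how an application might read text content, assuming `xml.etree.ElementTree`, which exposes an element's leading text as `.text` rather than as a separate node object. The variable `STAFF_XML` is a name chosen for this illustration.

```python
import xml.etree.ElementTree as ET

STAFF_XML = """<staff organization="Bundesregierung">
  <person id="agu">
    <name>Alfred Gusenbauer</name>
    <party url="http://www.spoe.at">SPÖ</party>
  </person>
  <person id="wmo">
    <name>Wilhelm Molterer</name>
    <party url="http://www.oevp.at">ÖVP</party>
  </person>
</staff>"""

root = ET.fromstring(STAFF_XML)

# findtext() returns the text content of the first matching child element.
names = [person.findtext("name") for person in root.findall("person")]
parties = [person.findtext("party") for person in root.findall("person")]
print(names)
print(parties)
```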
8. Element Content: Child Elements
- <?xml version='1.0' encoding='UTF-8'?>
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
(Parent) element: e.g. person
Element content (child elements): the elements nested inside the parent, e.g. name and party
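The parent/child nesting can be made visible with a short recursive walk. This is a sketch using `xml.etree.ElementTree`; the function name `outline` is hypothetical and introduced only here.

```python
import xml.etree.ElementTree as ET

STAFF_XML = """<staff organization="Bundesregierung">
  <person id="agu"><name>Alfred Gusenbauer</name><party url="http://www.spoe.at">SPÖ</party></person>
  <person id="wmo"><name>Wilhelm Molterer</name><party url="http://www.oevp.at">ÖVP</party></person>
</staff>"""


def outline(element, depth=0):
    """Return one indented line per element, children below their parent."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its child elements
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(STAFF_XML))
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one parent/child step in the document.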
9. XML Document with DTD
- <?xml version='1.0' encoding='UTF-8'?>
- <!DOCTYPE staff SYSTEM "staff.dtd">
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
Document type declaration: <!DOCTYPE staff SYSTEM "staff.dtd">
Root element name: staff
System identifier (URI) of the entity/file containing the document type
definition (DTD): staff.dtd
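The two parts of the document type declaration can be read back programmatically. The following Python sketch assumes the standard library's `xml.dom.minidom`, which records the declaration but neither loads `staff.dtd` nor validates against it; the shortened one-element document is an illustrative assumption.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<!DOCTYPE staff SYSTEM "staff.dtd">'
    '<staff organization="Bundesregierung"/>'
)

# The document type declaration is exposed as the document's doctype node.
doctype = doc.doctype
declared_root = doctype.name          # root element name from the declaration
system_id = doctype.systemId          # system identifier (URI) of the DTD
print(declared_root, system_id)

# The declared name is expected to match the actual root element.
matches_root = declared_root == doc.documentElement.tagName
print(matches_root)
```

Checking the document against the DTD itself would require a validating parser, which is outside the scope of this sketch.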
10. XML Document with Comment
- <?xml version='1.0' encoding='UTF-8'?>
- <!-- Comment content -->
- <!DOCTYPE staff SYSTEM "staff.dtd">
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
Comment content: the text between <!-- and -->
Comment (node): usually not interpreted by the application
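How a comment shows up for an application depends on the parser and API in use. As a sketch under that caveat: Python's `xml.dom.minidom` keeps comments as nodes, while `xml.etree.ElementTree` with its default settings drops them. The small documents below are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom, Node

# DOM view: the comment is a node of its own, next to the root element.
doc = minidom.parseString("<!-- Comment content --><staff/>")
comments = [
    node.data for node in doc.childNodes
    if node.nodeType == Node.COMMENT_NODE
]
print(comments)

# ElementTree view (default parser): comments are discarded while parsing,
# so the application only sees the element children.
root = ET.fromstring("<staff><!-- Comment content --><person id='agu'/></staff>")
child_tags = [child.tag for child in root]
print(child_tags)
```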
11. Processing Instructions
- <?xml version='1.0' encoding='UTF-8'?>
- <?xml-stylesheet href="style.css" type="text/css"?>
- <!DOCTYPE staff SYSTEM "staff.dtd">
- <staff organization="Bundesregierung">
- <person id="agu">
- <name>Alfred Gusenbauer</name>
- <party url="http://www.spoe.at">SPÖ</party>
- </person>
- <person id="wmo">
- <name>Wilhelm Molterer</name>
- <party url="http://www.oevp.at">ÖVP</party>
- </person>
- </staff>
PI target: xml-stylesheet
PI data (no attributes!): href="style.css" type="text/css"
The interpretation of a processing instruction depends on the application.
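The "no attributes" remark means that the parser hands the PI data to the application as one opaque string. A minimal Python sketch with `xml.dom.minidom` illustrates this. Splitting the data into name/value pairs with a regular expression is an application-level assumption made here for the stylesheet case, not something the XML parser does.

```python
import re
from xml.dom import minidom, Node

doc = minidom.parseString(
    '<?xml-stylesheet href="style.css" type="text/css"?><staff/>'
)

# The processing instruction is the first child node of the document.
pi = doc.firstChild
is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE
target = pi.target   # PI target
data = pi.data       # PI data: a single string, not parsed attributes
print(target)
print(data)

# The application decides how to interpret the data. Here it is read
# as name="value" pairs, which is this sketch's own convention.
pseudo_attributes = dict(re.findall(r'(\w+)="([^"]*)"', data))
print(pseudo_attributes)
```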