Title: XML: Basics
1 XML Basics
- Paul V. Biron
- Permanente Clinical Systems Development
- Kaiser Permanente, Southern California
- Paul.V.Biron_at_kp.org
2 Outline
- HTML, SGML and XML
- The World Wide Web Consortium (W3C)
- HTML as an application of SGML
- XML: define your own tag set
- XML vs. "standard" HL7 encoding
- Document Type Definitions (DTDs)
- Well-formed vs. valid
- Elements
- Attributes
- Entities
3 Outline (cont.)
- The "XML Family" of Standards
- XSL
- XLL
- Namespaces
- Next generation schema definition language
- Benefits to be gained
- Representational expansion
- Wide availability of toolsets
- Wide availability of trained personnel
4 HTML, SGML, XML
- HTML: HyperText Markup Language
- Recommendation of the World Wide Web Consortium (W3C)
- An application of SGML (the HTML DTD)
- SGML: Standard Generalized Markup Language
- ISO 8879:1986(E)
- XML is a proper subset of SGML
- XML and SGML are metalanguages
- Language for defining other languages
- Recommendation of the W3C
- Formally Adopted on 10-February-1998
5 HTML is SGML (although not XML)
- Radiology Report - Chest X-Ray
- Patient Information
- Name Henry Levin, the 7th
- MRN 123456789
- DOB May 13, 1923
- Clinical Data
- History of smoking for 40 years.
- Procedure
- Chest X-Ray
- Findings
- Comparison is made with a chest x-ray ...
- Impressions
- RLL nodule, suggestive of malignancy. Compared
with a prior CXR from 6 months ago, nodule size
has increased.
<h2>Patient Information</h2>
<ul>
  <li><b>Name</b> Henry Levin, the 7th</li>
  <li><b>MRN</b> 123456789</li>
  <li><b>DOB</b> May 13, 1923</li>
</ul>
<h2>Clinical Data</h2>
<p>History of smoking for 40 years.</p>
<h2>Procedure</h2>
<p>Chest X-ray</p>
<h2>Findings</h2>
<p>Comparison is made with a chest-x-ray ...</p>
<h2>Impressions</h2>
<p>RLL nodule, suggestive of malignancy. Compared with a prior CXR from 6 months ago, nodule size has increased.</p>
<h2>Recommendations</h2>
<p>I notified the ordering physician of this finding by phone.</p>
6 XML - Define your own tags
- Radiology Report - Chest X-Ray
- Patient Information
- Name Henry Levin, the 7th
- MRN 123456789
- DOB May 13, 1923
- Clinical Data
- History of smoking for 40 years.
- Procedure
- Chest X-Ray
- Findings
- Comparison is made with a chest x-ray ...
- Impressions
- RLL nodule, suggestive of malignancy. Compared
with a prior CXR from 6 months ago, nodule size
has increased.
<RadiologyReport>
  <PatientInfo>
    <Name>Henry Levin, the 7th</Name>
    <MRN>123456789</MRN>
    <DOB>May 13, 1923</DOB>
  </PatientInfo>
  <ClinicalData>History of smoking for 40 years.</ClinicalData>
  <Procedure>Chest X-ray</Procedure>
  <Findings>Comparison is made with a chest-x-ray ...</Findings>
  <Impressions>RLL nodule, suggestive of malignancy. Compared with a prior CXR from 6 months ago, nodule size has increased.</Impressions>
  <Recommendations>I notified the ordering physician of this finding by phone.</Recommendations>
</RadiologyReport>
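Because the tags name the data they hold, a program can pull fields out of the report by name. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the shortened report and the variable names are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the report on this slide (illustrative).
REPORT_XML = """
<RadiologyReport>
  <PatientInfo>
    <Name>Henry Levin, the 7th</Name>
    <MRN>123456789</MRN>
    <DOB>May 13, 1923</DOB>
  </PatientInfo>
  <Procedure>Chest X-ray</Procedure>
</RadiologyReport>
"""

root = ET.fromstring(REPORT_XML)

# Navigate by tag name instead of by position or presentation markup.
patient = root.find("PatientInfo")
name = patient.findtext("Name")
mrn = patient.findtext("MRN")
procedure = root.findtext("Procedure")

print(name, mrn, procedure)
```

With the HTML version of the same report, the program would have to guess that the text after a bold "MRN" label is the record number; here the element name says so directly.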
7 XML - Define your tag arrangement
<!ELEMENT RadiologyReport (PatientInfo, ClinicalData, Procedure, Findings, Impressions, Recommendations)>
<!ELEMENT PatientInfo (Name, MRN, DOB)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT MRN (#PCDATA)>
<!ELEMENT DOB (#PCDATA)>
<!ELEMENT ClinicalData (#PCDATA)>
<!ELEMENT Procedure (#PCDATA)>
<!ELEMENT Findings (#PCDATA)>
<!ELEMENT Impressions (#PCDATA)>
<!ELEMENT Recommendations (#PCDATA)>
<RadiologyReport>
  <PatientInfo>
    <Name>Henry Levin, the 7th</Name>
    <MRN>123456789</MRN>
    <DOB>May 13, 1923</DOB>
  </PatientInfo>
  <ClinicalData>History of smoking for 40 years.</ClinicalData>
  <Procedure>Chest X-ray</Procedure>
  <Findings>Comparison is made with a chest-x-ray ...</Findings>
  <Impressions>RLL nodule, suggestive of malignancy. Compared with a prior CXR from 6 months ago, nodule size has increased.</Impressions>
  <Recommendations>I notified the ordering physician of this finding by phone.</Recommendations>
</RadiologyReport>
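The element declarations fix which children each element has and in what order. Python's standard library does not include a validating parser, so the sketch below hand-codes the two sequence content models from this slide and checks a document against them. The `CONTENT_MODEL` table and the `follows_model` helper are hypothetical names for illustration; this is not a general DTD validator.

```python
import xml.etree.ElementTree as ET

# Child sequences copied from the element declarations on this slide.
# Elements declared as character data are not listed and are not checked.
CONTENT_MODEL = {
    "RadiologyReport": ["PatientInfo", "ClinicalData", "Procedure",
                        "Findings", "Impressions", "Recommendations"],
    "PatientInfo": ["Name", "MRN", "DOB"],
}

def follows_model(element):
    """Return True if every modeled element has exactly the declared children, in order."""
    expected = CONTENT_MODEL.get(element.tag)
    if expected is not None and [child.tag for child in element] != expected:
        return False
    return all(follows_model(child) for child in element)

good = ET.fromstring(
    "<PatientInfo><Name>Henry Levin, the 7th</Name>"
    "<MRN>123456789</MRN><DOB>May 13, 1923</DOB></PatientInfo>"
)
# Same children, but MRN and Name are swapped.
bad = ET.fromstring(
    "<PatientInfo><MRN>123456789</MRN>"
    "<Name>Henry Levin, the 7th</Name><DOB>May 13, 1923</DOB></PatientInfo>"
)

print(follows_model(good), follows_model(bad))
```

A validating parser would derive these checks from the DTD itself rather than from a hand-written table.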
8 XML vs. Standard Encoding
<RadiologyReport>
  <PatientInfo>
    <Name>Henry Levin, the 7th</Name>
    <MRN>123456789</MRN>
    <DOB>May 13, 1923</DOB>
  </PatientInfo>
  <ClinicalData>History of smoking for 40 years.</ClinicalData>
  <Procedure>Chest X-ray</Procedure>
  <Findings>Comparison is made with a chest-x-ray ...</Findings>
  <Impressions>RLL nodule, suggestive of malignancy. Compared with a prior CXR from 6 months ago, nodule size has increased.</Impressions>
  <Recommendations>I notified the ordering physician of this finding by phone.</Recommendations>
</RadiologyReport>
HL7 Message
MSH<cr>
PID|1||123456789||Levin^Henry^the 7th||19230513<cr>
OBR|1|||^Chest X-ray<cr>
OBX|1|TX|71020&GDT||Clinical data History of smoking for 40 years.<cr>
OBX|2|TX|71020&GDT||Findings Comparison is made with a chest-x-ray ...<cr>
OBX|3|CE|71020&IMP||^RLL nodule, suggestive of malignancy. Compared with a prior CXR from 6 months ago, nodule size has increased.<cr>
OBX|4|CE|71020&REC||^I notified the ordering physician of this finding by phone.<cr>
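In the standard encoding, meaning comes from position rather than from names. The sketch below splits a message into segments, fields, and components, assuming the usual HL7 v2 delimiters (`|` between fields, `^` between components, carriage return between segments). The sample message and the `parse_segments` helper are illustrative assumptions.

```python
# Assumed delimiters: "|" for fields, "^" for components, "\r" for segments.
MESSAGE = (
    "PID|1||123456789||Levin^Henry^the 7th||19230513\r"
    "OBR|1|||^Chest X-ray\r"
)

def parse_segments(message):
    """Return a list of segments; each segment is a list of fields,
    and each field is a list of components."""
    segments = []
    for line in message.split("\r"):
        if not line:
            continue  # skip the empty piece after the final carriage return
        # The segment name lands at index 0, so fields[n] is field number n.
        segments.append([field.split("^") for field in line.split("|")])
    return segments

segments = parse_segments(MESSAGE)
pid = next(seg for seg in segments if seg[0] == ["PID"])

record_number = pid[3][0]   # PID field 3, first component
name_parts = pid[5]         # PID field 5, all components
birth_date = pid[7][0]      # PID field 7, first component

print(record_number, name_parts, birth_date)
```

The reader has to know that field 5 of a PID segment is the patient name; nothing in the message says so, which is the contrast with the XML form on this slide.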
9 XML vs. Standard Encoding
<?xml version="1.0" ?>
<!DOCTYPE ORU.R01 SYSTEM "hl7_v23.dtd">
<ORU.R01>
  <MSH></MSH>
  <PID PID.1="1">
    <PID.3 CX.1="123456789"/>
    <PID.5 XPN.1="Levin" XPN.2="Henry" XPN.3="the 7th"/>
    <PID.7 TS.1="19230513"/>
  </PID>
  <OBR OBR.1="1">
    <OBR.4 CE.2="Chest X-ray"/>
  </OBR>
  <OBX OBX.1="1" OBX.2="TX">
    <OBX.3 CE.1="71020" SUB="GDT"/>
    <OBX.5 TX.1="Clinical data History of..."/>
  </OBX>
  <OBX OBX.1="2" OBX.2="TX">
    <OBX.3 CE.1="71020" SUB="GDT"/>
    <OBX.5 TX.1="Findings Comparison is made"/>
  </OBX>
  <OBX OBX.1="3" OBX.2="CE">
    <OBX.3 CE.1="71020" SUB="IMP"/>
    <OBX.5 CE.2="RLL nodule, suggestive of ..."/>
  </OBX>
  <OBX OBX.1="4" OBX.2="CE">
    <OBX.3 CE.1="71020" SUB="REC"/>
    <OBX.5 CE.2="I notified the ordering physician..."/>
  </OBX>
</ORU.R01>
HL7 Message
MSH<cr>
PID|1||123456789||Levin^Henry^the 7th||19230513<cr>
OBR|1|||^Chest X-ray<cr>
OBX|1|TX|71020&GDT||Clinical data History of...<cr>
OBX|2|TX|71020&GDT||Findings Comparison is made<cr>
OBX|3|CE|71020&IMP||^RLL nodule, suggestive of...<cr>
OBX|4|CE|71020&REC||^I notified the ordering physician...<cr>
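The XML encoding on this slide turns each populated field into a child element and each component into an attribute named after its datatype. A minimal sketch of that mapping for the PID segment follows. The `PID_TYPES` table is read off the slide's example, the `|` and `^` delimiters are assumed, and `pid_to_xml` is a hypothetical helper, not part of any HL7 toolkit.

```python
import xml.etree.ElementTree as ET

# Field number -> datatype prefix, taken from the attribute names in the
# slide's example (PID.3 uses CX, PID.5 uses XPN, PID.7 uses TS).
PID_TYPES = {3: "CX", 5: "XPN", 7: "TS"}

def pid_to_xml(segment_line):
    """Build a <PID> element from one delimiter-encoded PID segment."""
    fields = segment_line.split("|")
    pid = ET.Element("PID", {"PID.1": fields[1]})
    for number, datatype in PID_TYPES.items():
        components = fields[number].split("^")
        # Component i of the field becomes attribute "<datatype>.<i>".
        attributes = {
            f"{datatype}.{i}": value
            for i, value in enumerate(components, start=1)
            if value
        }
        ET.SubElement(pid, f"PID.{number}", attributes)
    return pid

pid = pid_to_xml("PID|1||123456789||Levin^Henry^the 7th||19230513")
children = {child.tag: dict(child.attrib) for child in pid}

print(ET.tostring(pid, encoding="unicode"))
```

The same information is carried in both forms; the XML form spells out the field and component numbers that the delimiter form leaves implicit.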
10 Document Type Definitions
- Document Type Definitions
- Also referred to as DTDs
- Provide a "schema" for a class of documents
- A "schema" defines certain semantic and structural constraints, including:
- A set of element declarations
- Documentation
- Optional supporting specifications, such as style sheets
- DTDs are one language for writing schemas for both SGML and XML documents
- Other schema languages have recently been proposed (e.g., Document Content Description, a submission to the W3C)
11 Well-Formed vs. Valid
- A well-formed document must match the production labeled "document"
- In common usage, this often translates to: the elements, delimited by start- and end-tags, nest properly within each other
- An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it
XML Document: prolog, element, Misc
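Well-formedness can be checked by any XML parser, with no DTD required. The sketch below uses Python's `xml.etree.ElementTree`, which is a non-validating parser: it reports nesting and tag-matching errors but does not check a document against its DTD. The `is_well_formed` helper and the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Start- and end-tags nest properly.
nested = "<Report><Findings>nodule</Findings></Report>"
# Findings is still open when Report is closed.
overlapping = "<Report><Findings>nodule</Report></Findings>"
# The end-tag name does not match the start-tag name.
mismatched = "<RadiologyReport></RadiologyReports>"

print(is_well_formed(nested), is_well_formed(overlapping), is_well_formed(mismatched))
```

Validity is a stronger property: a document can pass this check and still break the constraints in its document type declaration, for example by putting child elements in the wrong order.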
12 Overview of XML
13 Elements
Element-only content: only other elements may be present in the element's content. You can control the order of these sub-elements.
XML Document
<!ELEMENT ORU (OBR, OBX)>
...
<ORU>
  <OBR OBR.1="1">
    <OBR.4 CE.1="80004" CE.2="ELECTROLYTES"/>
  </OBR>
  <OBX OBX.1="1" OBX.5="150" OBX.11="F">
    <OBX.3 CE.1="84295" CE.2="NA"/>
    <OBX.6 CE.1="mmol/l"/>
  </OBX>
</ORU>
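To make "element-only content" concrete, the sketch below tests whether an element contains child elements and no character data other than whitespace. The `has_element_only_content` helper is a hypothetical name, and the two tiny documents are illustrative.

```python
import xml.etree.ElementTree as ET

def has_element_only_content(element):
    """True if the element holds child elements and no non-whitespace text."""
    # Character data can sit before the first child (element.text)
    # or after any child (child.tail).
    chunks = [element.text] + [child.tail for child in element]
    has_text = any(chunk and chunk.strip() for chunk in chunks)
    return len(element) > 0 and not has_text

oru = ET.fromstring('<ORU> <OBR OBR.1="1"/> <OBX OBX.1="1"/> </ORU>')
mixed = ET.fromstring('<ORU>stray text<OBR OBR.1="1"/></ORU>')

child_order = [child.tag for child in oru]
print(has_element_only_content(oru), has_element_only_content(mixed), child_order)
```

The `child_order` list is what a content model such as `(OBR, OBX)` constrains: an OBR first, then an OBX.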
14 Attributes
String (CDATA): String (or CDATA, "character data") attributes may take any literal string as a value, including whitespace characters.
XML Document
<!ATTLIST OBX.3
    CE.1 CDATA #IMPLIED
    CE.2 CDATA #IMPLIED
    CE.3 CDATA #IMPLIED>
...
<ORU>
  <OBR OBR.1="1">
    <OBR.4 CE.1="80004" CE.2="ELECTROLYTES"/>
  </OBR>
  <OBX OBX.1="1" OBX.5="150" OBX.11="F">
    <OBX.3 CE.1="84295" CE.2="NA"/>
    <OBX.6 CE.1="mmol/l"/>
  </OBX>
</ORU>
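From a program's point of view, CDATA attributes are plain strings, and an optional attribute that is omitted is simply absent. A minimal sketch with `xml.etree.ElementTree` follows; the second sample element, with its padded value, is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

obx3 = ET.fromstring('<OBX.3 CE.1="84295" CE.2="NA"/>')

code = obx3.get("CE.1")
text = obx3.get("CE.2")
# CE.3 is optional in the declaration and not given here, so there is no value.
alternate = obx3.get("CE.3")

# Space characters inside a string attribute value are kept as written.
padded = ET.fromstring('<OBX.5 TX.1="  two  spaces  "/>').get("TX.1")

print(code, text, alternate, repr(padded))
```

Note that every value comes back as a string, including the numeric-looking code; the DTD attribute types described here do not distinguish numbers from other text.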
15 Entities
XML Document: prolog, element
16 The "XML Family" of Standards
- XSL: Extensible Style Language
- Declarative language for "screen rendering" and transformation
- Will allow a single XML file to be "rendered" in multiple formats
- Viewing HL7 messages in a standard web browser
- W3C V1.0 "working draft" released August 18, 1998
- XLL: Extensible Link Language
- Inter/intra-document linking specifications
- HTML's <a href=""> on steroids
- XLink: W3C working draft released March 3, 1998
- XPointer: W3C working draft released March 3, 1998
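The idea of rendering one XML file in another format can be illustrated without XSL. The sketch below is a stand-in, not an XSL stylesheet: it is a small Python function that turns each section of a report into an HTML heading and paragraph. The `render_html` helper and the shortened report are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

def render_html(report_xml):
    """Render each top-level section as an <h2> heading plus a <p> paragraph."""
    root = ET.fromstring(report_xml)
    parts = []
    for section in root:
        parts.append(f"<h2>{escape(section.tag)}</h2>")
        parts.append(f"<p>{escape(section.text or '')}</p>")
    return "\n".join(parts)

page = render_html(
    "<RadiologyReport>"
    "<Procedure>Chest X-ray</Procedure>"
    "<Findings>RLL nodule</Findings>"
    "</RadiologyReport>"
)
print(page)
```

A second function could render the same input as plain text or as a different HTML layout, which is the "single file, multiple formats" point. A stylesheet language expresses such rules declaratively instead of in program code.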
17 The "XML Family" of Standards
- Namespaces
- Mechanism for avoiding element/attribute name collisions
- W3C working draft released August 2, 1998
- To be incorporated into a future revision of the XML Specification
- Next generation "schema" definition languages
- DTDs on steroids
- More extensive datatyping
- Inheritance mechanism(s)
- W3C working group recently formed
- Working draft(s)/recommendation(s) at least 1 year away
- "Document Content Description" W3C submission acknowledged August 10, 1998
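A name collision arises when two vocabularies both use the same element name, for example `Name` for a patient and `Name` for a physician. The sketch below shows how namespace prefixes keep them apart, using the `xmlns` attribute syntax that current parsers implement; the working draft cited above may differ in detail. The `urn:example:` identifiers and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both define an element called Name.
DOC = """
<report xmlns:pt="urn:example:patient" xmlns:dr="urn:example:physician">
  <pt:Name>Henry Levin, the 7th</pt:Name>
  <dr:Name>Ordering Physician</dr:Name>
</report>
"""

# Prefix-to-identifier map used for lookups; the prefixes are local shorthand.
NS = {"pt": "urn:example:patient", "dr": "urn:example:physician"}

root = ET.fromstring(DOC)
patient_name = root.findtext("pt:Name", namespaces=NS)
physician_name = root.findtext("dr:Name", namespaces=NS)

# Internally each tag is qualified by its namespace identifier.
qualified_tags = [child.tag for child in root]

print(patient_name, physician_name, qualified_tags)
```

The two `Name` elements are distinct to the parser because each carries its namespace identifier, so documents that combine vocabularies remain unambiguous.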
18 The Benefits to be Gained
- Representational expansion
- Recursive (segment-less) encodings
- "self-documenting" structures
- Wide availability of toolsets
- XML parsers and parser SDKs
- Stylesheet editors and processors
- Transformation (conversion) engines
- Wide availability of trained personnel
- XML programmers
19 Questions
XML Document: prolog, element
<?QUESTIONS ?>