Title: Basic XML
1. Basic XML
2. Outline
- Introduction
- The XML tutorial
- XML Progress and Projects
- Demonstrations
3. Introduction
- Current Trends
- Current Issues in Data Management
- The XML Contribution
4. Current Trends
- Moore's Law: computing power doubles every 18 months
Source: http://www.intel.com/intel/museum/25anniv/hof/moore.htm
5. Current Trends
- Metcalfe's Law: the value of a network increases with the square of its size
6. Current Trends
Source: http://www.snia.org/Robert_Gray/index.htm
7. Current Issues in Data Management
- Increasing number of nodes: applications, databases, etc.
- Increasing number of layers: better definition of functional responsibilities
- Meta-data: seeking to pre-answer significant queries and retain context
- Data distribution and synchronization are becoming very expensive tasks
8. Current Issues in Data Management
- Establishing the Value of Data
  - When acquired, expected value should be greater than cost of acquisition
  - The use of data determines the realized value
  - Data has a current value based on potential use
  - Re-use increases value of data
  - Unusable data has no value
- Data quality should measure fitness-for-purpose
  - Fitness-for-purpose depends upon the purpose
  - Changes to business processes alter the value of data
9. Better Data Management
- Attach intended meaning to data
- Retain context (acquisition and use) of each value
  - Who, what, when, how, etc.
- Human and machine readable/writeable
- Easy and inexpensive to implement
- Interpret when written, not as being read
- Increase scope of consistency
  - Databases, Applications and Documents
10. The XML Contribution
- XML provides significant benefits to data management if it is
  - Tagged: meaningful
  - Hierarchical: contextual
  - Commercial: available
  - Distributed: consistent
- Need to agree on how to use XML to get the greatest benefits
11. Two XML Uses
- Exchange of data
  - The data will end up in a database
  - The data is potentially read by someone
- Use by another application
  - Also requires interface specifications
XML is the Language of the Internet
12. The XML Vision
13. Strengths of XML
- Useful across many end-user technologies
- Consistency of use decreases data translation and data rot
- Cheap and Easy
14. Shortcomings of XML
- XML data file sizes are significantly bigger than non-XML formats
  - > 1.2x the size of a text file
  - > 3-25x the size of application files
  - Compression to within 10% of original file size
- External specification file (the optional DTD) may not always be accessible
- The Splintering Avalanche
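To get a feel for the size overhead mentioned on this slide, the following sketch encodes the same made-up records as comma-separated text and as XML, then compresses both with zlib. The record layout and the tag names (`wells`, `well`, `name`, `depth`) are hypothetical, and the ratios it prints depend entirely on the data, so they should not be read as confirming the figures quoted above.

```python
import zlib

# Hypothetical sample records: (well name, depth in feet).
records = [(f"WELL-{i:03d}", 3000 + i) for i in range(200)]

# Plain comma-separated text, one record per line.
csv_text = "\n".join(f"{name},{depth}" for name, depth in records)

# The same records with XML tags around every value (tag names are illustrative).
xml_text = "<wells>" + "".join(
    f"<well><name>{name}</name><depth>{depth}</depth></well>"
    for name, depth in records
) + "</wells>"

csv_bytes = csv_text.encode("utf-8")
xml_bytes = xml_text.encode("utf-8")
csv_packed = zlib.compress(csv_bytes)
xml_packed = zlib.compress(xml_bytes)

print("text size:       ", len(csv_bytes))
print("XML size:        ", len(xml_bytes))
print("XML / text ratio:", round(len(xml_bytes) / len(csv_bytes), 2))
print("compressed text: ", len(csv_packed))
print("compressed XML:  ", len(xml_packed))
```

Repeated tag names are what make the XML larger, and they are also what a general-purpose compressor removes most easily.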
15. The XML Tutorial
- Building an XML file
  - Exercise 1: Business Cards
- Building an XML DTD
  - Exercise 2: BusinessCardML
16. 101 - Tags
- A tag is text surrounded with < and >
  - A-z (case sensitive), 0-9, _, etc. (16-bit Unicode)
  - Must start with a letter or _; do not start with a number or with xml (reserved)
  - Cannot contain blanks ( )
- Each tag has a beginning <startDate> and an ending </startDate> form; the two names must agree
  - Only one exception: <startDate/>
- Data values are placed between tags
  - <startDate>20 January 1999</startDate>
  - If no data, can use <startDate/> instead
  - Blanks are retained in data
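The tag rules on this slide can be tried out with a short script. This is a minimal sketch that uses Python's standard `xml.etree.ElementTree` parser (a choice made here for illustration, not something the tutorial prescribes) to read the slide's own `startDate` examples.

```python
import xml.etree.ElementTree as ET

# A start tag, a data value, and the matching end tag.
element = ET.fromstring("<startDate>20 January 1999</startDate>")
print(element.tag)
print(element.text)

# The empty-element form carries no data.
empty = ET.fromstring("<startDate/>")
print(empty.text)

# Blanks inside the data value are kept as written.
padded = ET.fromstring("<startDate>  20 January 1999  </startDate>")
print(repr(padded.text))

# Tag names are case sensitive, so these two forms do not agree.
try:
    ET.fromstring("<startDate>20 January 1999</StartDate>")
    mismatch_accepted = True
except ET.ParseError:
    mismatch_accepted = False
print(mismatch_accepted)
```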
17. Exercise 1: Making Tags
- Trade your business card with a neighbor
- Construct a table with three columns
  - data item, element, attribute
- In one column, list all of the data items on your neighbor's business card
- In an adjacent column, construct an XML tag set for each data item
18. 102 - Attributes
- Tags may have attributes
  - <startDate type="estimated">20 January 1999</startDate>
  - Attribute data always enclosed in " or '
- Use attributes for
  - Populating enumerated lists
  - Defining default values
  - Limiting extensibility
  - Defining links (references, ids, relationships)
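As a small illustration of the quoting rule, the sketch below reads the slide's `startDate` example with Python's standard `xml.etree.ElementTree` parser (an assumption of this example, not part of the tutorial) and shows that both quote styles give the same attribute while an unquoted value is rejected.

```python
import xml.etree.ElementTree as ET

# Attribute values may be wrapped in double or single quotes.
double_quoted = ET.fromstring('<startDate type="estimated">20 January 1999</startDate>')
single_quoted = ET.fromstring("<startDate type='estimated'>20 January 1999</startDate>")

print(double_quoted.get("type"))
print(double_quoted.attrib == single_quoted.attrib)

# An unquoted attribute value is not well-formed XML.
try:
    ET.fromstring("<startDate type=estimated>20 January 1999</startDate>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False
print(unquoted_accepted)
```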
19. Exercise 1: Attributes
- Circle any information on your neighbor's business card which MAY be represented by an XML attribute
  - Type of address? phone number?
  - Link to logo?
- Construct an attribute label for each data item which is a potential attribute
20. 103 Tag Hierarchy
- Tags may also contain other tags
  <StartDate type="estimated">
    <year>2000</year>
    <month>January</month>
    <day>20</day>
  </StartDate>
- Nested tags must be closed before container tags can be closed!
- First (root) tag opened must be last tag closed!
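The nesting rules can be checked mechanically. The following sketch, again assuming Python's standard `xml.etree.ElementTree` parser, walks the `StartDate` hierarchy from this slide and then shows that closing a container before its nested tag is refused.

```python
import xml.etree.ElementTree as ET

nested = """
<StartDate type="estimated">
  <year>2000</year>
  <month>January</month>
  <day>20</day>
</StartDate>
"""
root = ET.fromstring(nested)
child_tags = [child.tag for child in root]
print(root.tag, child_tags)
print(root.findtext("month"))

# Closing the container before its nested tag is rejected by the parser.
badly_nested = "<StartDate><year>2000</StartDate></year>"
try:
    ET.fromstring(badly_nested)
    accepted = True
except ET.ParseError:
    accepted = False
print(accepted)
```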
21. Exercise 1: Design the XML file content
- Construct a root tag for your XML document
- Identify possible parent-child tags
- Using your tags and attributes, put all of the data items into a hierarchy within the root tag set
  - Place data values between opening and closing forms
  - Nest children tags within parent tags
22. 104 XML Instructions
- XML files should start with this declaration
  - <?xml version="1.0" ?>
- <? and ?> denote a processing instruction; these have no closing tag
- Provides instructions to parsers
  - character sets
  - stylesheets
  - multimedia instructions
  - Etc.
23. 105 XML Extras
- Comments
  - Use pre-defined tags <!-- and -->
  - The comment may not contain --
- Non-XML character data (CDATA) may be passed through the parser
  - Use pre-defined tags <![CDATA[ and ]]>
  - No markup inside, cannot nest
  - Use for source code, raw data, examples, etc.
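A short sketch of both extras, assuming Python's standard `xml.etree.ElementTree` parser; the `note` and `example` tag names are made up for the illustration. It shows that markup inside a CDATA section reaches the application as plain text, and that `--` inside a comment is refused.

```python
import xml.etree.ElementTree as ET

document = """<note>
  <!-- Purpose: show a comment and a CDATA section -->
  <example><![CDATA[<startDate>20 January 1999</startDate>]]></example>
</note>"""

root = ET.fromstring(document)
example = root.find("example")

# The CDATA content arrives as plain text, not as a nested element.
print(example.text)
print(len(example))

# By default ElementTree drops comments while parsing.
top_level_tags = [child.tag for child in root]
print(top_level_tags)

# A comment that contains "--" is not well-formed.
try:
    ET.fromstring("<note><!-- a -- b --></note>")
    double_hyphen_accepted = True
except ET.ParseError:
    double_hyphen_accepted = False
print(double_hyphen_accepted)
```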
24. Exercise 1: Complete the XML file
- Add XML declaration to top of your XML file
- Add comments
  - Tell the file reader the purpose of the file
  - Explain your rationale for tags and attributes
  - Other information as appropriate
25. Specifying the Content of an XML File
- The XML tags and their hierarchy are defined in a separate file
  - DTD (Document Type Definition) is the current standard: document oriented
  - XML Schema is next year's standard (early implementations are available): data oriented
26. Specifying the Content of an XML File
The specification file defines every tag name, the nested tags each contains (order, optionality and cardinality), how data values are represented, and all attributes of the tag.
- The XML data file
  <StartDate type="estimated">
    <year>2000</year>
    <month>January</month>
    <day>20</day>
  </StartDate>
- The XML Specification (DTD) file
  <!ELEMENT StartDate (year, month?, day?) >
  <!ATTLIST StartDate type (estimated | actual) #REQUIRED >
  <!ELEMENT year (#PCDATA) >
  <!ELEMENT month (#PCDATA) >
  <!ELEMENT day (#PCDATA) >
27. Specifying the Content of an XML File
- XML Specifications are easier to build and maintain
  - End-user terms and structures
  - Build sample data file then convert
- Publish Specifications, allowing everyone to know exactly how this data file is constructed
  - URL declared at top of XML data file
- Commercial tools enforce rules in the specification when creating data
  - Ensures valid XML data files
28. 201 DTD Building Blocks
- DTDs define the format of XML, but are not written in XML themselves
- The components used in DTDs include
  - Element: a tag, the basic unit of XML definition
  - Attribute: a characteristic of an element
  - Entity: a re-usable definition that can be used within elements or attributes
29. 202 DTD Elements
- Element Syntax: <!ELEMENT tagName tagContent>
- tagName gives the name of the element
  - Follow rules for naming tags
  - Unique within the DTD
- tagContent declares how data is associated with the element
  - content model
  - grouping operators
  - occurrence indicators
30. 202 DTD Elements (Cont'd)
- tagContent
  - Either describes how the tag holds data or how it contains other tags
- If tag contains data
  - (#PCDATA): parsed character data, text of any length; it can also be mixed with tags
  - ANY: tag may contain any declared element or (#PCDATA)
  - EMPTY: tag is always empty
  - CDATA is an attribute type and is not a valid element content model
  - <!ELEMENT startDate (#PCDATA)>
31. 202 DTD Elements (Cont'd)
- tagContent
- If tag contains other (child) tags
  - Place one or more tags within the ( and ) group indicators
  - Separate with
    - | if selecting one from set (i.e., OR operator)
    - , if a sequence (i.e., THEN operator)
  - <!ELEMENT startDate ((year, month, day) | textDate)>
32. 202 DTD Elements (Cont'd)
- tagContent
- If tag contains other (child) tags
  - Add occurrence indicators for each tag
    - Nothing = one and only one
    - + = one or more
    - ? = zero or only one
    - * = zero or more
  - <!ELEMENT startDate ((year, month, day?) | textDate)>
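A content model reads much like a regular expression over the sequence of child tag names. The sketch below makes that analogy concrete for ((year, month, day?) | textDate). It is a hand translation for illustration only, not a DTD validator: Python's standard library does not validate against DTDs, and the helper name `children_match` is made up here.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of ((year, month, day?) | textDate): "," becomes
# concatenation, "|" stays alternation, "?" stays optional. Each child tag
# name is followed by one space so that names cannot run together.
CONTENT_MODEL = re.compile(r"(?:year month (?:day )?|textDate )")


def children_match(xml_text):
    """Return True if the root's child tags fit the hand-written content model."""
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in root)
    return CONTENT_MODEL.fullmatch(child_names) is not None


full_date = "<startDate><year>2000</year><month>January</month><day>20</day></startDate>"
no_day = "<startDate><year>2000</year><month>January</month></startDate>"
text_date = "<startDate><textDate>20 January 2000</textDate></startDate>"
year_only = "<startDate><year>2000</year></startDate>"
wrong_order = "<startDate><month>January</month><year>2000</year></startDate>"

for sample in (full_date, no_day, text_date, year_only, wrong_order):
    print(children_match(sample))
```

The first three samples fit the model; the last two do not, because month is mandatory after year and the sequence order is fixed.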
33. Exercise 2: Putting Elements in the DTD
- For each tag in your business card XML file, write out the ELEMENT declarations
  - If they contain data, use (#PCDATA)
  - If they contain elements, use grouping and occurrence indicators
34. 203 DTD Attributes
- Attribute Syntax: <!ATTLIST tagName attName attContent>
- tagName gives the name of the element
- attName defines the attribute name
  - Follow rules for naming tags
  - Unique within the element
- attContent declares how data is associated with the attribute
  - defines the allowed values
  - declares the default value
- attName and attContent may be repeated to define additional attributes for an element
35. 203 DTD Attributes (Cont'd)
- attContent = attValue attDefault
  - Describes how the attribute holds data
- attValue
  - CDATA: text of any length
  - (value1 | value2 | ...): an enumerated list
  - ID (surrogate key) and IDREF(S) (pointer(s) to ID)
- attDefault
  - "value1": selected value is default (quotes required)
  - #REQUIRED: must be specified (mandatory)
  - #IMPLIED: may be omitted (optional)
36. Exercise 2: Putting Attributes in the DTD
- For any attributes you defined in your XML file, add ATTLIST declarations next to the matching ELEMENT declarations
  - Use CDATA for text and numerical values
  - Use #REQUIRED for mandatory attributes
  - Use #IMPLIED for optional attributes
37. 204 DTD Entities
- Entities define re-usable sections of text
- In DTD
  <!ENTITY % lengthUoM "ft | m" >
  <!ATTLIST casingString length (%lengthUoM;) "m">
- In XML
  <!ENTITY author "POSC">
  <specificationAuthor>&author;</specificationAuthor>
- Can point to other DTDs
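The "In XML" case can be tried directly. This sketch declares the `author` entity from the slide in an internal DTD subset and lets Python's standard `xml.etree.ElementTree` parser expand it; the `specification` root tag is a hypothetical wrapper added for the example.

```python
import xml.etree.ElementTree as ET

document = """<!DOCTYPE specification [
  <!ENTITY author "POSC">
]>
<specification>
  <specificationAuthor>&author;</specificationAuthor>
</specification>"""

root = ET.fromstring(document)
expanded = root.findtext("specificationAuthor")
print(expanded)

# Without the declaration, the same reference is an error.
try:
    ET.fromstring("<specificationAuthor>&author;</specificationAuthor>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False
print(undeclared_accepted)
```

Changing the one ENTITY declaration changes every place the reference is used, which is the re-use the slide describes.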
38. 205 DTD Extras
- Comments
  - Use pre-defined tags <!-- and -->
  - The comment may not contain --
- Notations
  - Identifies non-XML data
  - Locates an application that can process the non-XML data
39. Exercise 2: Complete the DTD file
- Add comments
- Tell the reader the purpose of the file
- Document your Elements and Attributes
- Consolidate re-used text into Entities
- Generalize the structure if necessary
40. 301 Validated XML
- Well Formed XML contains properly constructed, structured tags
- Validated XML includes a link to a DTD which defines its structure
- Requires a statement at the top of the XML file that declares the DTD utilized
  - <!DOCTYPE RootTag SYSTEM "name.dtd">
- Entities may also be locally defined
41. Exercise 3: Add the DTD to the XML file
- Add the DOCTYPE statement to your XML file
- Attempt to open your XML file in IE4 (if available)
- Debug, Publish, Use and Improve
42. The XML Family
- XSL Transformation (XSLT)
  - Transform XML into HTML
  - Transform one XML into another XML
- SVG: display of complex 2D pictures
- SMIL: provides multimedia capabilities
- DOM (tree) and SAX (event) provide APIs for XML documents
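The difference between the two API styles can be seen side by side. This sketch uses Python's standard `xml.dom.minidom` and `xml.sax` modules as stand-ins for DOM and SAX implementations; the `TagCollector` class is a made-up helper for the illustration.

```python
import xml.dom.minidom
import xml.sax

document = "<StartDate><year>2000</year><month>January</month><day>20</day></StartDate>"

# DOM: the whole document is loaded into a tree that can be navigated freely.
dom = xml.dom.minidom.parseString(document)
month_node = dom.getElementsByTagName("month")[0]
dom_month = month_node.firstChild.data
print(dom_month)


# SAX: the parser reports events in document order and builds no tree.
class TagCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.started = []

    def startElement(self, name, attrs):
        # Called once for every start tag, as the parser reaches it.
        self.started.append(name)


collector = TagCollector()
xml.sax.parseString(document.encode("utf-8"), collector)
print(collector.started)
```

A tree is convenient for random access and editing, while events suit large files that are read once from start to end.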
43. The XML Family
- New Parts of XML are being added to provide new functionality
  - XLink: define hypertext addresses
  - XPointer: addressing parts of XML data
  - XForms: better processing of forms
  - SOAP: a simple object API
  - XQL: XML query language
  - And a host of others . . .
44. Projects and Progress
- The XML Standards
  - W3C
- Standards for Using XML
  - ebXML and OpenGIS
  - Repositories
  - E&P Industry Activity
- Implementations
45. XML Standards
- World Wide Web Consortium (W3C)
  - A 400-member international consortium
  - Publishes Recommendations and Technical Reports rather than specifications
  - Vendor neutral
- Goals
  - Universal web access
  - Best use of WWW resources (tools)
  - Guide technical development of the WWW
46. Standards for Using XML
- ebXML
  - UN/EDIFACT (EDI, trade and eBusiness) and OASIS (SGML, structured data interchange)
  - An open XML-based infrastructure: global, interoperable, secure and consistent
- OpenGIS Consortium
  - Standard interface for GIS objects
  - Geographical ML (GML)
47. Standards for Using XML
- XML Repositories: organizations which collect and publish XML document specifications (DTDs, etc.)
  - BizTalk: Microsoft
  - XML.ORG Registry: OASIS members
- BizCodes: a proposal to share XML terms (elements, attributes) across XML documents and files
48. Standards for Using XML
- E&P Industry
  - PIDX
    - PIDD: dictionary of common E&P terms used in EDI
  - DTI (UK)
    - PPRD: regulatory reporting of monthly production data
  - POSC
    - WellLogML: well log header and trace data
    - LogGraphicsML: display of well logs
    - ProductionML: monthly and annual production reporting
    - WellSchematicML: surface and downhole equipment
49. Implementations
- XML.ORG has a list of 140 XML producers
  - (see http://xml.org/xmlorg_registry/index.shtml)
- XML is appearing in E&P products
  - well logs, drilling, wellbore configuration, production, invoicing, procurement
- BizTech4Energy: working on DTDs for Business-to-Technical data interchange
  - Landmark, Microsoft, PWC, SAP, Schlumberger, others
50. XML Development Attitudes
- "I am first, so follow me": vendors
- "It's faster/better/cheaper to do it myself": other vendors
- "It's my turf, so follow me": standards organizations
- "You have to do it my way": regulators
- Cooperation is Absolutely Essential!
51. Demonstrations
- Using XML for regulatory forms
  - BLM APD
- Using XML for well logs
  - Well log header and trace display using SVG
- Using XML to complete, edit and submit regulatory forms
  - MMS: Web-based submission of Weekly Activity Reports (WARs) and Item 17 forms
52. BLM Application for Permit to Drill
- Original Form 3160-3
- Form 3160-3 XML DTD
- Sample XML file
- XSL Transformation file
- The result of XML + XSLT
53. WellLogML Data in XML
</WellInformation>
<CurveInformation>
  <Curve name="RILD" uom="OHMM" mnemonic="RILD">
    <nullValue>-99999.0000</nullValue>
    <curveDescription>Deep Induction Resistivity</curveDescription>
  </Curve>
<ParameterInformation>
  <ParameterBlock>
    <parameter description="Company Name" uom="" mnemonic="CN ">WARREN PETR. CORP.</parameter>
    <parameter description="Run Number" uom="" mnemonic="RUN">1</parameter>
    <parameter description="API Logging Company Code" uom="" mnemonic="LCC">440</parameter>
    <parameter description="Annual Mean Surface Temp." uom="DEGF" mnemonic="ST">80.0000</parameter>
    <parameter description="Loggers Total Depth" uom="FT" mnemonic="TDL">5764.0000</parameter>
  </ParameterBlock>
</ParameterInformation>
<CurveData format="READABLE" curve="RILD">
  <data>
    6.2800 6.3760 6.4720 6.5680 6.6640 6.8370 7.0100 7.1830 7.2900 7.3960
    7.3140 7.2310 7.0410 6.8520 6.6620 6.4730 6.2830 6.1640 6.0450 5.9260
54. WellLogML Data in HTML
<table width="100" cellspacing="0" cellpadding="1" border="2">
  <tr xmlns=""> <td colspan="6"> <center> <b>Curve Information</b> </center> </td> </tr>
  <tr xmlns=""> <td><b>Start Index</b></td>
    <td><b>End Index</b></td>
    <td><b>Index Spacing</b></td>
    <td width="60" colspan="3"><b>Curves</b></td> </tr>
  <tr xmlns=""> <td>3000.0000 F </td>
    <td>6457.0000 F </td>
    <td>1.0000 F </td>
    <td width="60" colspan="3">DEPT, RILD, RSN, SP</td> </tr>
</table>
<br> <a name="svg"></a>
<embed src="/ebiz/xmlLive/WellLogViewer/svg/962821424_45.svg.svgz" height="3360" width="729.76" type="image/svg-xml">
</embed>
</BODY>
</HTML>
55. WellLogML Data in SVG
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20000303 Stylable//EN" "http://www.w3.org/TR/2000/03/WD-SVG-20000303/DTD/svg-20000303-stylable.dtd">
<svg width="6.56in" height="35.6in" enableZoomAndPanControls="false">
  <text style="fill:#00f;font-family:arial;font-size:10pt" x="0.1in" y="0.2in">BOREHOLE
  </text>
  <g transform="translate(0.0in,0.6in)" id="BoreholeEvents">
    <svg width="1.0in" height="35in">
      <g id="Borehole" style="fill:#00f;stroke:#00f;stroke-width:1pt">
        <line x1="30" y1="0in" x2="30" y2="34.57in"/>
        <line x1="70" y1="0in" x2="70" y2="34.57in"/>
      </g>
    </svg>
  </g>
56. Demonstrations using XML
- POSC Live!
- Well Log Viewer
- Monthly Production Reporting
- MMS Demonstration
57. More Information
- Read about POSC at http://www.posc.org
- Most files and demonstrations are available at http://www.posc.org/ebiz/
- Contact us at +1 (713) 784-1880 or info@posc.org
- Visit our XML Activity in E&P web page and sign up with the public (mail) list server at xmlActivity@posc.org