What XML Schema Designers Need to Know About Measurement Units

Title: What XML Schema Designers Need to Know About Measurement Units


1
What XML Schema Designers Need to Know About
Measurement Units
  • Frank Olken and John McCarthy
  • Lawrence Berkeley National Laboratory
  • Presented to XTECH 2000, San Jose, CA
  • February 29, 2000

2
Content of Talk
  • Syntax issues - markup
  • Semantics - units, dimensionality
  • Architecture - references to shared registries
  • Units registry operations - who???

3
What is the issue?
  • XML is used for data exchange
  • Often used to encode measured quantities
  • e.g., for e-commerce, engineering, medicine
  • How do we encode measurement units?

4
Why do we need measurement units?
  • Quantities without units are meaningless!!!
  • Misunderstandings about units cause
  • loss of spacecraft (Mars Climate Orbiter)
  • contractual disputes
  • potential loss of life (in medicine)
  • Common error
  • Delusions of shared assumptions about measurement
    units.

5
Actually two issues
  • What do we need to specify?
  • Units
  • Dimensionality
  • Property ?
  • How do we write it in XML?

6
XML encodings of units
  • Implicitly structured strings
  • <height> 5 inches </height>
  • bad style (requires special parser)

7
XML encodings of units
  • Explicit markup is better, e.g.,
      <speed>
        <value> 5 </value>
        <units> km/hr </units>
      </speed>
  • Advantage: easy to identify units info.
  • Problem: parsing units values still requires a
    special-purpose parser
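
The contrast between the two encodings above can be sketched in Python. This is a minimal illustration rather than anything from the talk: the regular expression and the variable names are assumptions introduced here.

```python
import re
import xml.etree.ElementTree as ET

# Implicitly structured string: value and units share one text node,
# so an application-specific pattern is needed to pull them apart.
implicit = ET.fromstring("<height> 5 inches </height>")
match = re.fullmatch(r"\s*([-+]?\d+(?:\.\d+)?)\s+(\S.*?)\s*", implicit.text)
implicit_value, implicit_units = float(match.group(1)), match.group(2)

# Explicit markup: the XML parser already separates value from units.
explicit = ET.fromstring(
    "<speed><value> 5 </value><units> km/hr </units></speed>"
)
explicit_value = float(explicit.findtext("value"))
explicit_units = explicit.findtext("units").strip()

# The units string itself is still opaque to XML: splitting "km/hr"
# is one more ad hoc parsing step.
numerator, denominator = explicit_units.split("/")
```

Even with explicit markup, the last step shows why the units value still needs its own parser.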

8
XML encodings of units
  • Explicit markup with namespaces/URIs
      <x xmlns:isoUnits="http://www.iso.org/units">
        <speed>
          <value> 5 </value>
          <units> <xlink type="simple" REF="isoUnits:kmPerHour" /> </units>
        </speed>
      </x>
  • Advantages
  • exploits existing lookup mechanism
  • standard units designators, mechanism for
    checking dimensional consistency
  • Disadvantage
  • nonstandard units names (XML prohibits slashes
    in IDs)

9
Architecture of XML Units Encoding
  • Use URI references from schemas/instances into
    standard units registries
  • Units/Dimensionality registries have detailed
    fully marked up descriptions of derived/composite
    units/dimensions in terms of
  • Basis units/dimensions

10
Basis units declaration (XML)
      <unitdecl ID="meter">
        <name> meter </name>
        <UnitType> Base </UnitType>
        <symbol> m </symbol>
        <dimensionality> <A xml:link="simple" REF="length" /> </dimensionality>
        <definition>
          Length of path traveled by light in a time interval of
          1/299,792,458 second.
        </definition>
        <cite> <A xml:link="simple" ref.. /> </cite>
      </unitdecl>

11
Derived units declaration (XML)
      <unitdecl ID="inch">
        <name> Inch </name>
        <symbol> in </symbol>
        <unitType> Derived </unitType>
        <dimensionality> <A xml:link="simple" REF="length" /> </dimensionality>
        <conversionfactor> 0.0254 </conversionfactor>
        <units REF="meter" />
      </unitdecl>
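
A processor could use such a declaration to convert values mechanically. The sketch below is a hedged illustration: the declaration is trimmed to the elements the conversion needs, and the function name `to_reference_unit` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the inch declaration (dimensionality link omitted).
INCH_DECL = """
<unitdecl ID="inch">
  <name> Inch </name>
  <symbol> in </symbol>
  <unitType> Derived </unitType>
  <conversionfactor> 0.0254 </conversionfactor>
  <units REF="meter" />
</unitdecl>
"""

def to_reference_unit(value, decl_xml):
    """Return (converted value, ID of the unit the declaration refers to)."""
    decl = ET.fromstring(decl_xml)
    factor = float(decl.findtext("conversionfactor"))
    target = decl.find("units").get("REF")
    return value * factor, target

# Twelve inches expressed in the referenced unit.
meters, unit = to_reference_unit(12.0, INCH_DECL)
```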

12
Composite units decl. (XML)
      <unitdecl ID="metersPerSecond">
        <name> MetersPerSecond </name>
        <UnitType> Composite </UnitType>
        <dimensionality> <A xml:link="simple" REF="DimSpeed" /> </dimensionality>
        <units>
          <numerator>
            <unit>
              <radix> <A xml:link="simple" REF="meter" /> </radix>
              <exponent> 1 </exponent>
            </unit>
          </numerator>
          <denominator>
            <unit>
              <radix> <A xml:link="simple" REF="second" /> </radix>
              <exponent> 1 </exponent>
            </unit>
          </denominator>
        </units>
      </unitdecl>
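
Because the composite declaration is fully marked up, a generic XML parser is enough to recover its structure. The following sketch assumes a simplified copy of the declaration (link attributes other than `REF` dropped); the function name `signed_exponents` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified copy of the composite declaration: only <units> is kept.
DECL = """
<unitdecl ID="metersPerSecond">
  <units>
    <numerator>
      <unit><radix><A REF="meter" /></radix><exponent> 1 </exponent></unit>
    </numerator>
    <denominator>
      <unit><radix><A REF="second" /></radix><exponent> 1 </exponent></unit>
    </denominator>
  </units>
</unitdecl>
"""

def signed_exponents(decl_xml):
    """Map each base unit ID to a signed exponent (denominator counts negative)."""
    units = ET.fromstring(decl_xml).find("units")
    exponents = Counter()
    for part, sign in (("numerator", 1), ("denominator", -1)):
        for unit in units.find(part).findall("unit"):
            ref = unit.find("radix/A").get("REF")
            exponents[ref] += sign * int(unit.findtext("exponent"))
    return dict(exponents)

result = signed_exponents(DECL)
```

No units-specific string parsing is involved, which is the payoff of the verbose markup.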

13
Units vs. Dimensionality
  • Units
  • representational issue
  • e.g., feet, meters, centimeters
  • Dimensionality
  • semantic concept
  • e.g., mass, length, time
  • speed = length / time

14
Units not sufficient, need dimensionality
  • Example: tons
  • unit of mass
  • unit of power (refrigeration)
  • unit of energy (megatons)
  • These are all homonyms that need to be
    distinguished, hence we
  • also need to specify dimensionality

15
Dimensional consistency
  • Units with same dimensionality are said to be
    dimensionally consistent
  • Examples
  • Dimensionality: time
  • units: seconds, minutes, hours, days
  • Dimensionality: length
  • units: feet, inches, meters, kilometers
  • Dimensionality: length / time
  • units: kilometers/hour, meters/second,
    miles/hour

16
Significance of dimensional consistency
  • Usually IMPLIES
  • Comparability
  • Additivity
  • Automated unit conversion
  • Some exceptions to these rules
  • (see below)

17
Classic representation of dimensionality
  • Product of powers of basis dimensions
  • Basis dimensions
  • mass, length, time, amount of substance (moles), current,
    luminous intensity, temperature
  • Exponents
  • integers, range from -4 to 4
  • Example
  • energy = Mass * Length^2 * Time^(-2)
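
The product-of-powers representation maps naturally onto a dictionary of exponents. The sketch below is an illustration under that assumption; the names `multiply` and `power` are introduced here and are not from the talk.

```python
from collections import Counter

def multiply(a, b):
    """Multiply two dimensionalities by adding exponents of each basis dimension."""
    total = Counter(a)
    for dimension, exponent in b.items():
        total[dimension] += exponent
    # Drop basis dimensions whose exponents cancelled to zero.
    return {d: e for d, e in total.items() if e != 0}

def power(a, n):
    """Raise a dimensionality to an integer power by scaling every exponent."""
    return {d: e * n for d, e in a.items() if e * n != 0}

MASS, LENGTH, TIME = {"mass": 1}, {"length": 1}, {"time": 1}
SPEED = multiply(LENGTH, power(TIME, -1))
ENERGY = multiply(MASS, power(SPEED, 2))
# A ratio of like dimensions collapses to the empty (dimensionless) product.
DIMENSIONLESS = multiply(MASS, power(MASS, -1))
```

Here `ENERGY` comes out as mass 1, length 2, time -2, matching the example on the slide.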

18
Dimensionless quantities: concentration
  • mass/mass, moles/moles, or volume/volume
  • Classic theory says these are dimensionally
    consistent, hence comparable, additive.
  • NO!!!
  • Need to distinguish dimensionless mass, molar,
    volumetric ratios

19
Revised dimensionality
  • Numerator
  • product of non-negative powers of basis
    dimensions
  • Denominator
  • product of non-negative powers of basis
    dimensions
  • Can distinguish
  • mass/mass vs. moles / moles
  • Problem: breaks the dimensional algebra
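
One way to picture the trade-off on this slide is to keep numerator and denominator separately and compare that with the classic cancelled form. This is a sketch under assumed names (`ratio`, `classic`), not the authors' representation.

```python
from collections import Counter

def ratio(numerator, denominator):
    """Keep numerator and denominator exponents separately, with no cancellation."""
    return (frozenset(numerator.items()), frozenset(denominator.items()))

def classic(dim):
    """Collapse to the classic signed-exponent form by cancelling."""
    num, den = dim
    out = Counter(dict(num))
    out.subtract(dict(den))
    return {d: e for d, e in out.items() if e != 0}

# The revised form distinguishes the two concentrations...
mass_fraction = ratio({"mass": 1}, {"mass": 1})
mole_fraction = ratio({"amount": 1}, {"amount": 1})

# ...but it also distinguishes expressions the classic algebra treats as equal.
speed = ratio({"length": 1}, {"time": 1})
same_speed = ratio({"length": 1, "mass": 1}, {"time": 1, "mass": 1})
```

The second pair shows the cost: without cancellation, equality of dimensionalities no longer follows the usual algebra.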

20
When to specify units/dimensionality?
  • Specify in schema? Or in instance?
  • Preferred
  • specify both dimensionality and units in schema
  • homogeneous units in document instance produce
    fewer errors
  • easier to search

21
Second best choice
  • Specify dimensionality in schema
  • Specify units in document instance
  • e.g., per element instance
  • e.g., dimensionality: length (in schema)
  • units: feet or meters (in instance)
  • Advantage: can check for plausible units
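
The plausibility check mentioned above amounts to a lookup. The tables below are hypothetical stand-ins for a units registry and a schema annotation; only the idea of comparing the two comes from the slide.

```python
# Hypothetical registry: which dimensionality each unit measures.
UNIT_DIMENSION = {"feet": "length", "meters": "length", "seconds": "time"}

# Hypothetical schema annotation: the dimensionality declared per element.
SCHEMA_DIMENSION = {"height": "length"}

def units_plausible(element, units):
    """True if the instance's units match the dimensionality fixed in the schema."""
    return UNIT_DIMENSION.get(units) == SCHEMA_DIMENSION[element]

ok_feet = units_plausible("height", "feet")
ok_seconds = units_plausible("height", "seconds")
ok_unknown = units_plausible("height", "furlongs")  # not in the registry
```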

22
Worst Choice
  • Specify both units and dimensionality in document
    instances.
  • Checking of dimensions is impossible.
  • Can check for units consistent with
    dimensionality.
  • Necessary for heterogeneous catalogs ...

23
Ideal XML Encoding of Units and Dimensionalities
  • Extend XML Schema Basic Datatypes
  • Add facets on types to encode units and
    dimensionality
  • Hence, schema/query processor can check units and
    dimensionality
  • Problems: huge type lattice, complex type
    checking / unit conversion

24
XML Encoding Considerations
  • Use detailed markup to specify units and
    dimensionality.
  • This simplifies design of processor for checking
    units compatibility and automatic units
    conversion.
  • Result is very verbose - see above.

25
Practical XML Encoding Solution
  • Store detailed units/dimensionality info at well
    known site
  • Use URI reference to point to full
    units/dimensionality specification
  • Use namespaces to shorten URI reference in
    instances
  • Need canonical encodings of units for URI
    references
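
Shortening a URI reference with a namespace prefix means a processor has to expand the prefix before looking anything up. The sketch below assumes a simplified instance in which `units` carries the `REF` attribute directly, and assumes that prefix and local name are joined with `#`; neither detail is specified in the talk.

```python
import io
import xml.etree.ElementTree as ET

# Simplified instance document (assumed shape, for illustration only).
DOC = (
    '<x xmlns:isoUnits="http://www.iso.org/units">'
    '<speed><value>5</value><units REF="isoUnits:kmPerHour"/></speed>'
    "</x>"
)

def expand_unit_refs(xml_text):
    """Expand prefix:name unit references into full URIs."""
    prefixes = {}
    expanded = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload  # namespace declaration seen by the parser
            prefixes[prefix] = uri
        elif payload.tag == "units":
            prefix, _, local = payload.get("REF").partition(":")
            expanded.append(prefixes[prefix] + "#" + local)  # "#" is an assumption
    return expanded

refs = expand_unit_refs(DOC)
```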

26
Corollaries
  • Someone needs to maintain units/dimensionality
    repositories
  • Separate applications to check units
    compatibility are needed
  • Can automate units conversion
  • Implies standard XML query language will not
    check units.

27
Basis Dimension Declaration (XML)
      <dimensiondecl ID="length">
        <name> length </name>
        <DimensionType> Base </DimensionType>
        <definition>
          A measurement of distance.
        </definition>
        <cite> <A xml:link="simple" REF="...." /> </cite>
        <exampleUnits>
          Meters, ....
        </exampleUnits>
      </dimensiondecl>

28
Composite Dimensionality Declaration (XML)
      <dimensiondecl ID="DimSpeed">
        <name> Speed </name>
        <DimensionType> Composite </DimensionType>
        <dimensionality>
          <numerator>
            <dimension>
              <radix> <A xml:link="simple" REF="length" /> </radix>
              <exponent> 1 </exponent>
            </dimension>
          </numerator>
          <denominator>
            <dimension>
              <radix> <A xml:link="simple" REF="time" /> </radix>
              <exponent> 1 </exponent>
            </dimension>
          </denominator>
        </dimensionality>
      </dimensiondecl>

29
Who will maintain the XML units repository?
  • W3C? No, lacks units expertise
  • OASIS (xml.org)?
  • Intl. scientific societies: IUPAC? ICSU?
  • NIST?
  • Engineering societies: IEEE? ASME?
  • ASTM?
  • American Physical Society? American Chemical
    Society?
  • UN/CEFACT (ebXML.org)?
  • ISO? IEC?

30
Measures vs. Coordinates
  • Measures
  • length
  • temperature difference
  • time
  • subtended angle
  • Coordinates
  • position
  • absolute temperature
  • datetimestamp
  • latitude/longitude
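
Temperature gives a compact illustration of the measure/coordinate distinction: a difference converts with a scale factor alone, while a point on the scale also needs an offset. The function names below are hypothetical; the conversion constants are the standard Fahrenheit-to-kelvin ones.

```python
def fahrenheit_difference_to_kelvin(delta_f):
    """Measure: a temperature difference needs only the scale factor."""
    return delta_f * 5.0 / 9.0

def fahrenheit_to_kelvin(temp_f):
    """Coordinate: a point on the scale needs the scale factor and an offset."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15

rise = fahrenheit_difference_to_kelvin(18.0)   # an 18 degree F rise
freezing = fahrenheit_to_kelvin(32.0)          # the point 32 degrees F
```

Applying the coordinate formula to a difference (or the reverse) gives a wrong answer, which is why the two kinds of quantity need to be told apart.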

31
Automatic unit conversion
  • Only for dimensionally consistent units
  • Convert to/from canonical (SI) units
  • hence O(n) conversion factors, vs. O(n^2)
    conversion factors
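
Routing every conversion through a canonical unit can be sketched as follows. The table holds one factor per unit; the unit names and the `convert` helper are assumptions for illustration, and the table is restricted to units of length so that all entries are dimensionally consistent.

```python
# One factor per unit: how many meters (the canonical SI unit) it contains.
TO_METERS = {
    "meter": 1.0,
    "kilometer": 1000.0,
    "inch": 0.0254,
    "foot": 0.3048,
}

def convert(value, source, target):
    """Convert via the canonical unit: source -> meters -> target."""
    return value * TO_METERS[source] / TO_METERS[target]

inches_per_foot = convert(1.0, "foot", "inch")

# n stored factors cover all n * (n - 1) ordered pairs of distinct units.
stored_factors = len(TO_METERS)
pairwise_factors = len(TO_METERS) * (len(TO_METERS) - 1)
```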

32
Dimensionally inconsistent unit conversions
  • Example: mass to/from volume
  • wheat in bushels or in tons
  • oil in barrels or tons
  • Very common in commerce
  • Requires knowledge of material density
  • Should be done explicitly (user/application)
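
Making the density an explicit argument is one way to keep such conversions visible to the caller. In this sketch the density value is a made-up illustrative figure, not a property of any particular crude oil; the barrel volume is the standard 42 US gallons.

```python
# Volume of one oil barrel (42 US gallons) in cubic meters.
BARREL_M3 = 0.158987294928

def volume_to_mass(volume_m3, density_kg_per_m3):
    """Mass in kilograms; the caller must supply the material density."""
    return volume_m3 * density_kg_per_m3

def barrels_to_tonnes(barrels, density_kg_per_m3):
    """Convert barrels to metric tons for a material of the given density."""
    return volume_to_mass(barrels * BARREL_M3, density_kg_per_m3) / 1000.0

# 850 kg/m^3 is an assumed, illustrative density.
tonnes = barrels_to_tonnes(1000.0, 850.0)
```

There is deliberately no default density: the conversion cannot be done without knowing the material.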

33
Additional complexities
  • Dimensionality is not sufficient to specify type
    lattice
  • Example
  • torque and work (energy)
  • same dimensionality
  • Mass * Length^2 / Time^2
  • but these are incommensurate
  • torque: cross product; work: dot product
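
A type check that looks only at dimensionality would accept adding torque to work. The sketch below tags each quantity with a property name as well; the `Quantity` class and its `prop` field are hypothetical, introduced only to illustrate the point.

```python
from dataclasses import dataclass

# Shared dimensionality: Mass * Length^2 / Time^2.
ML2_PER_T2 = (("length", 2), ("mass", 1), ("time", -2))

@dataclass(frozen=True)
class Quantity:
    value: float
    dimension: tuple  # sorted (basis dimension, exponent) pairs
    prop: str         # the property measured, e.g. "torque" or "work"

def add(a, b):
    """Add two quantities only if dimensionality and property both match."""
    if a.dimension != b.dimension or a.prop != b.prop:
        raise TypeError(f"cannot add {a.prop} to {b.prop}")
    return Quantity(a.value + b.value, a.dimension, a.prop)

torque = Quantity(3.0, ML2_PER_T2, "torque")
work = Quantity(7.0, ML2_PER_T2, "work")
total_work = add(work, Quantity(3.0, ML2_PER_T2, "work"))
```

Calling `add(torque, work)` raises `TypeError` even though the two dimensionalities are identical.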

34
Implications
  • Need to subtype dimensionality by differing
    properties
  • May need rules on comparability, additivity of
    properties (subtypes) of common dimensionality

35
Conclusions
  • Must specify both units and dimensionality
  • either in schema or instances
  • Use product of basis dimensions (units)
  • Use URI references to detailed specs
  • Dimensional analysis theory is incomplete
  • Custodian organization for units repository?

36
Acknowledgements
  • This work was supported by the U.S. Environmental
    Protection Agency, Superfund Office
  • Program manager: Bruce Bargmeyer
  • Also thank Peter Murray-Rust, Malcolm Panthaki, and
    Max Sherman for discussions, suggestions, etc.

37
Contact Information
  • Frank Olken, olken_at_lbl.gov, http://pueblo.lbl.gov/olken
    Lawrence Berkeley National Lab, Mailstop
    50B-3238, 1 Cyclotron Road, Berkeley, CA
    94720, Tel: 510-486-5891,
    Pager: 510-442-7361
  • John L. McCarthy, jlmccarthy_at_lbl.gov,
    http://www.lbl.gov/mccarthy Lawrence Berkeley
    National Lab, Mailstop 50C, 1 Cyclotron Road,
    Berkeley, CA 94720, Tel: 510-486-5307

38
Bibliography
  • Olken, F., McCarthy, J. Measurement Units in XML
    Datatypes,
    http://www.lbl.gov/olken/mendel/w3c/xml.schema.wg/units/syntax.htm,
    June 1999
  • Olken, F., McCarthy, J. Simplified Measurement
    Units for XML Datatypes,
    http://www.lbl.gov/olken/mendel/w3c/xml.schema.wg/units/simplesyntax.htm,
    June 1999
  • Hart, George W. Multidimensional Analysis:
    Algebras and Systems for Science and Engineering.
    New York: Springer-Verlag, c1995.
  • Schadow, G., McDonald, C.J., Suico, J.G., Föhring, U.,
    Tolxdorff, T. "Units of measure in clinical
    information systems", Journal of the American
    Medical Informatics Association, 1999 Mar-Apr,
    vol. 6, no. 2, pages 151-62.

39
Bibliography (cont.)
  • Taylor, Barry. Guide for the Use of the
    International System of Units (SI), NIST Special
    Publication 811, 1995 Edition, U.S. National
    Institute of Standards and Technology.
  • ISO TC-12. ISO 31:1992 Parts 0-13, Quantities and
    Units, ISO Standards Handbook, International
    Organization for Standardization, 345 pages, 3rd
    edition, Geneva, 1993, ISBN 92-67-10185-4.
    (Available in the United States from ANSI.)
    (Contains multiple ISO standards.)
  • Gruber, T.R., Olsen, G.R. An ontology for
    engineering mathematics. In (Doyle, J.,
    Sandewall, E., Torasso, P., eds.) Proc. of 4th
    International Conference on Principles of
    Knowledge Representation and Reasoning (KR'94),
    Bonn, Germany, 24-27 May 1994. San Francisco,
    CA, USA: Morgan Kaufmann Publishers, 1994,
    p. 258-69. (See also
    http://www-ksl.stanford.edu/knowledge-sharing/papers/engmath.html)

40
Bibliography (cont.)
  • Monica Gayle Funston, Walter Gerstle, and Malcolm
    Panthaki, "Quantity, Revisited: An
    Object-Oriented Reusable Class",
    http://www.arc.unm.edu/CoMeT/publication/quantity.html,
    1998(?)
  • Greene, Stephan. Metadata for units of measure
    in social science databases, International
    Journal of Digital Libraries, (1997), vol. 1, pg.
    161-175
  • de Boer, J. On the History of Quantity
    Calculus, Metrologia, 1994/1995, vol. 32, pg.
    405-429
  • Gehani, Narain H. Databases and Units of
    Measure, IEEE Trans. on Software Engineering,
    vol. SE-8, no. 6, Nov. 1982, pg. 605-611.
  • Karr, M. and Loveman, D.B. Incorporation of
    Units into Programming Languages, Comm. ACM, May
    1978, vol. 21, no. 5, pg. 385-390
  • Many, many other papers/standards. Send
    citations to olken_at_lbl.gov; I will post them to my
    web site.