Title: VOTable: an International Virtual Observatory data exchange format
1 VOTable: an International Virtual Observatory data exchange format
Roy Williams, François Ochsenbein, Clive Davenhall, Daniel Durand, Pierre Fernique, David Giaretta, Robert Hanisch, Tom McGlynn, Alex Szalay, Andreas Wicenec
2 XML Structured Information
    <From>Antonio Stradivarius</From>
    <To>Domenico Scarlatti</To>
    <Date>
      <Day>13</Day>
      <Month>4</Month>
      <Year>1723</Year>
    </Date>
    <Body>Io bisogno una appartamento acoglienti a Cremona</Body>
Separation of structure from presentation:
    4/13/23    April 13, 1723    13.iv.1723
The computer can read the document: "Find all memos from April 1723"
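The query on this slide can be sketched in a few lines. This is a minimal illustration, assuming the memos are wrapped in hypothetical `<Memos>` and `<Memo>` elements (the slide shows only the inner fields) and that a second, invented memo exists to give the filter something to reject.

```python
import xml.etree.ElementTree as ET

# <Memos>/<Memo> wrappers and the second memo are assumptions for illustration.
MEMOS = """
<Memos>
  <Memo>
    <From>Antonio Stradivarius</From>
    <To>Domenico Scarlatti</To>
    <Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>
    <Body>Io bisogno una appartamento acoglienti a Cremona</Body>
  </Memo>
  <Memo>
    <From>Domenico Scarlatti</From>
    <To>Antonio Stradivarius</To>
    <Date><Day>2</Day><Month>5</Month><Year>1723</Year></Date>
    <Body>(invented reply, for illustration only)</Body>
  </Memo>
</Memos>
"""

def memos_from(xml_text, year, month):
    """Return the <Memo> elements whose <Date> matches the given year and month."""
    root = ET.fromstring(xml_text)
    hits = []
    for memo in root.findall("Memo"):
        date = memo.find("Date")
        if int(date.findtext("Year")) == year and int(date.findtext("Month")) == month:
            hits.append(memo)
    return hits

def us_style(memo):
    """One possible presentation of the structured date: month/day/two-digit year."""
    date = memo.find("Date")
    month = int(date.findtext("Month"))
    day = int(date.findtext("Day"))
    year = int(date.findtext("Year"))
    return f"{month}/{day}/{year % 100}"

april = memos_from(MEMOS, 1723, 4)
print(len(april), april[0].findtext("To"), us_style(april[0]))
```

Because day, month and year are separate elements, the filter never has to guess which textual date style was used; formatting is applied only when the memo is displayed.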
3 VOTable
- Full metadata representation
- Hierarchy of RESOURCEs
  - containing PARAMs and TABLEs
- UCD (unified content descriptor)
  - a has unit meter
  - a has UCD ORBIT_SIZE_SMAJ (Semi-major axis of the orbit)
- Can reference remote and/or binary streams
- Table can be
  - Pure XML
  - "Simple Binary"
  - FITS Binary Table
4 VOTable Parentage
- Astrores
  - XML format for tables
  - Developed at CDS Strasbourg
  - Presented at ADASS 1999
  - Vizier implementation
- XSIL
  - XML format for Tables and Arrays
  - Developed at LIGO Caltech 2000
  - Extensible through Type-Class dynamic loading
  - Java parsing, browsing, editing
  - Matlab interface
5 Sample VOTable
    <?xml version="1.0"?>
    <!DOCTYPE VOTABLE SYSTEM "http://us-vo.org/xml/VOTable.dtd">
    <VOTABLE version="1.0">
      <DEFINITIONS>
        <COOSYS ID="myJ2000" equinox="2000." epoch="2000." system="eq_FK5"/>
      </DEFINITIONS>
      <RESOURCE>
        <PARAM name="Observer" datatype="char" arraysize="*" value="William Herschel">
          <DESCRIPTION>This parameter is designed to store the observer's name</DESCRIPTION>
        </PARAM>
        <TABLE name="Stars">
          <DESCRIPTION>Some bright stars</DESCRIPTION>
          <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
          <FIELD name="RA" ucd="POS_EQ_RA" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
          <FIELD name="Dec" ucd="POS_EQ_DEC" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
          <FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
          <DATA>
            <TABLEDATA>
              <TR>
                <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
                <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
              </TR>
              <TR>
                <TD>Vega</TD><TD>279.234</TD><TD>38.782</TD>
                <TD>8 7 8 6 8 6</TD>
              </TR>
            </TABLEDATA>
          </DATA>
        </TABLE>
      </RESOURCE>
    </VOTABLE>
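As a rough illustration of how little code is needed to read such a file, the sketch below parses a trimmed copy of the sample (the Counts column, PARAM and COOSYS are left out for brevity) with Python's standard library. The helper name `read_table` is hypothetical, and the sketch assumes a single RESOURCE holding a single TABLE in TABLEDATA form.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample VOTable: scalar columns only.
VOTABLE = """<VOTABLE version="1.0">
 <RESOURCE>
  <TABLE name="Stars">
   <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
   <FIELD name="RA" ucd="POS_EQ_RA" unit="deg" datatype="float"/>
   <FIELD name="Dec" ucd="POS_EQ_DEC" unit="deg" datatype="float"/>
   <DATA><TABLEDATA>
    <TR><TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD></TR>
    <TR><TD>Vega</TD><TD>279.234</TD><TD>38.782</TD></TR>
   </TABLEDATA></DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>"""

def read_table(xml_text):
    """Return (fields, rows): FIELD attribute dicts and one dict per TR."""
    table = ET.fromstring(xml_text).find("RESOURCE/TABLE")
    fields = [dict(f.attrib) for f in table.findall("FIELD")]
    rows = []
    for tr in table.findall("DATA/TABLEDATA/TR"):
        row = {}
        for field, td in zip(fields, tr.findall("TD")):
            # Convert only scalar floating-point cells; everything else stays text.
            scalar_float = (field["datatype"] in ("float", "double")
                            and "arraysize" not in field)
            row[field["name"]] = float(td.text) if scalar_float else td.text
        rows.append(row)
    return fields, rows

fields, rows = read_table(VOTABLE)
for row in rows:
    print(row["Star-Name"], row["RA"], row["Dec"])
```

The metadata (FIELD elements) is read first and then drives how each TD is interpreted, which is the same order in which the file presents them.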
6 Table Cell
- Primitives: boolean, bit, unsignedByte, short, int, long, char, unicodeChar, float, double, floatComplex, doubleComplex
- scalar
- arrays
- variable length arrays
- etc
7 VOTable is Flexy
- eg Table of images
  - UCD="JPEG_IMAGE" datatype="unsignedByte" arraysize="*"
- eg Table of URL links
  - UCD="DATA_LINK" datatype="char" arraysize="*"
8 VOTable Schema (xsd)
9 Table Data Model
- Metadata
  - Class definition for Row
  - FIELD
    - data type
    - semantic type
- Data
  - Each Row is a list of Cells
  - Each Cell is an array of Primitives
    - may be variable length
10 Table Data Layout
- All metadata first
  - small, complex, XML
  - Class definition for table record
  - params, description, etc
- Then data
  - (may be) large, remote
  - XML, binary, or FITS
  - Instantiations of table record
  - All records MUST have same format
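The "class definition first, instances second" idea can be sketched as follows. This is only an illustration: the FIELD list is hand-written rather than parsed, and the helper names `make_record_class` and `make_record` and the small converter table are hypothetical.

```python
from collections import namedtuple

# FIELD metadata as it might look after parsing (names from the sample table).
fields = [
    {"name": "Star-Name", "datatype": "char"},
    {"name": "RA", "datatype": "float"},
    {"name": "Dec", "datatype": "float"},
]

# Text-to-value converters for a few scalar datatypes (not the full list).
CONVERTERS = {"char": str, "short": int, "int": int, "long": int,
              "float": float, "double": float}

def make_record_class(table_name, fields):
    """Build the record class from the metadata; '-' is not legal in attribute names."""
    names = [f["name"].replace("-", "_") for f in fields]
    return namedtuple(table_name, names)

def make_record(cls, fields, cells):
    """Instantiate one record, enforcing that every row has the same format."""
    if len(cells) != len(fields):
        raise ValueError(f"expected {len(fields)} cells, got {len(cells)}")
    values = [CONVERTERS[f["datatype"]](c) for f, c in zip(fields, cells)]
    return cls(*values)

Star = make_record_class("Stars", fields)
vega = make_record(Star, fields, ["Vega", "279.234", "38.782"])
print(vega.Star_Name, vega.RA, vega.Dec)
```

Since the record class is built once from the metadata, a row with the wrong number of cells is rejected before any data is stored.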
11 Param Data Model
- A PARAM is a Table with one cell
- Like a FIELD
- But with a value attribute
12 Primitives
datatype          Meaning             FITS   Bytes
"boolean"         Logical             "L"    1
"bit"             Bit                 "X"
"unsignedByte"    Byte (0 to 255)     "B"    1
"short"           Short Integer       "I"    2
"int"             Integer             "J"    4
"long"            Long integer        "K"    8
"char"            ASCII Character     "A"    1
"unicodeChar"     Unicode Character          2
"float"           Floating point      "E"    4
"double"          Double              "D"    8
"floatComplex"    Float Complex       "C"    8
"doubleComplex"   Double Complex      "M"    16
- All have fixed binary length
- Same as FITS primitives
  - Except Unicode
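Because every primitive has a fixed length, the size of a binary record follows directly from its datatypes. The sketch below maps the table onto Python `struct` codes. The mapping itself is an assumption made for illustration: big-endian byte order (as in FITS), complex numbers as two floats, `unicodeChar` as one 16-bit unit, and `bit` left out because bits are packed rather than stored one per byte.

```python
import struct

# Assumed mapping from VOTable datatype to a struct code; chosen so that the
# sizes agree with the Bytes column of the table above.
STRUCT_CODE = {
    "boolean": "c",
    "unsignedByte": "B",
    "short": "h",
    "int": "i",
    "long": "q",
    "char": "c",
    "unicodeChar": "H",
    "float": "f",
    "double": "d",
    "floatComplex": "2f",
    "doubleComplex": "2d",
}

def byte_length(datatype):
    """Size in bytes of one primitive, using standard (unpadded) big-endian sizes."""
    return struct.calcsize(">" + STRUCT_CODE[datatype])

def record_length(datatypes):
    """Size in bytes of one record made of scalar cells of the given datatypes."""
    return sum(byte_length(t) for t in datatypes)

for name in STRUCT_CODE:
    print(name, byte_length(name))
print(record_length(["short", "int", "double"]))
```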
13 Multidimensional Array Cell
- A table cell can have lots of Primitives
- Example: WCS parameters are arrays
  - <FIELD name="CRVAL" datatype="double" arraysize="2"/>
- Example: up to 10 images, each 64x64
  - <FIELD name="thumbs" datatype="unsignedByte" arraysize="64x64x10*"/>
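A small parser makes the arraysize notation concrete. This sketch assumes the VOTable convention that dimensions are separated by `x` and that a trailing `*` on the last dimension marks it as variable length, optionally with an upper bound (so `64x64x10*` reads as "up to 10 planes of 64x64"). The function names are hypothetical.

```python
def parse_arraysize(spec):
    """Split an arraysize string into (fixed_dims, variable, bound).

    fixed_dims: tuple of the fixed dimensions
    variable:   True if the last dimension is variable length
    bound:      upper limit of the variable dimension, or None if unbounded/fixed
    """
    parts = spec.split("x")
    last = parts[-1]
    fixed = [int(p) for p in parts[:-1]]
    variable = last.endswith("*")
    bound = None
    if variable:
        if last[:-1]:
            bound = int(last[:-1])
    else:
        fixed.append(int(last))
    return tuple(fixed), variable, bound

def variable_length(spec, n_values):
    """Infer the size of the variable last dimension from the number of values in a cell."""
    fixed, variable, bound = parse_arraysize(spec)
    if not variable:
        raise ValueError("arraysize has no variable dimension")
    per_slice = 1
    for d in fixed:
        per_slice *= d
    if n_values % per_slice != 0:
        raise ValueError("cell does not hold a whole number of slices")
    n = n_values // per_slice
    if bound is not None and n > bound:
        raise ValueError("cell exceeds the declared upper bound")
    return n

print(parse_arraysize("2"))
print(parse_arraysize("64x64x10*"))
print(variable_length("2x3x*", 12), variable_length("2x3x*", 6))
```

The last line uses the Counts column of the sample table, where Procyon carries 12 values and Vega carries 6 in the same column.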
14 Hierarchy
- A VOTable contains RESOURCEs
- RESOURCE can contain
  - TABLE
  - RESOURCE
  - etc
- Usage example
  - Many observations in the file,
  - each is a RESOURCE
  - Each observation is
    - Parameters
    - Calibration table
    - Raw data table
15 Hierarchy
    <TABLE name="Nutation and Aberration">
      <GROUP name="Nutation">
        <FIELD name="Longitude"/>
        <FIELD name="Obliquity"/>
      </GROUP>
      <GROUP name="Aberration">
        <GROUP name="Equinox 1950.0">
          <FIELD name="C"/>
          <FIELD name="D"/>
        </GROUP>
        <GROUP name="Equinox 1955.0">
          <FIELD name="C"/>
          <FIELD name="D"/>
        </GROUP>
      </GROUP>
    </TABLE>
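Nested GROUPs give each column a path, which resolves the repeated names C and D. A minimal sketch of walking that hierarchy, with a hypothetical helper `field_paths` and `/` chosen arbitrarily as the separator:

```python
import xml.etree.ElementTree as ET

TABLE = """<TABLE name="Nutation and Aberration">
  <GROUP name="Nutation">
    <FIELD name="Longitude"/>
    <FIELD name="Obliquity"/>
  </GROUP>
  <GROUP name="Aberration">
    <GROUP name="Equinox 1950.0">
      <FIELD name="C"/><FIELD name="D"/>
    </GROUP>
    <GROUP name="Equinox 1955.0">
      <FIELD name="C"/><FIELD name="D"/>
    </GROUP>
  </GROUP>
</TABLE>"""

def field_paths(element, prefix=()):
    """Return the FIELDs in document order, each qualified by its enclosing GROUP names."""
    paths = []
    for child in element:
        if child.tag == "GROUP":
            paths.extend(field_paths(child, prefix + (child.get("name"),)))
        elif child.tag == "FIELD":
            paths.append("/".join(prefix + (child.get("name"),)))
    return paths

paths = field_paths(ET.fromstring(TABLE))
for column, path in enumerate(paths):
    print(column, path)
```

The position in the returned list is the column index, so the grouping adds structure to the metadata without changing the flat layout of each row.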
16 Unified Content Descriptors
PHOT.INT-MAG.B       Integrated total blue magnitude
ORBIT.ECCENTRICITY   Orbital eccentricity
STAT.MEDIAN          Statistics Median Value
INST.QE              Detector's Quantum Efficiency
- Can be resolved by web service
  - to description, examples, etc
- Base Specifiers
  - eg error in default right ascension
  - POS.EQ.RA, MAIN, ERROR
17 VOTable Friends
Some self-describing file formats
XML Binary Streaming Table Datacube Semantics
VOTable v v v v
BinX v v v v
MS Dataset v v
HDF v v
XDF v v
18 XML Parsing
SAX: Event-Based. Handlers for StartElement, Text, EndElement, etc.
    Found element BookCatalogue
    Found element Book
    Found element Title
    Found text The Cambridge Star Atlas
    Found end element Title
    ...
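The event trace above can be reproduced with Python's built-in SAX parser. This is a sketch, not the parser used in the talk: the one-book document is a guess at what produced the trace, and the class name `TraceHandler` is hypothetical.

```python
import xml.sax

class TraceHandler(xml.sax.ContentHandler):
    """Record one line per SAX event instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f"Found element {name}")

    def characters(self, content):
        # Whitespace between elements also arrives here; skip it.
        if content.strip():
            self.events.append(f"Found text {content.strip()}")

    def endElement(self, name):
        self.events.append(f"Found end element {name}")

# Assumed input document, reconstructed from the trace.
DOC = b"<BookCatalogue><Book><Title>The Cambridge Star Atlas</Title></Book></BookCatalogue>"

handler = TraceHandler()
xml.sax.parseString(DOC, handler)
print("\n".join(handler.events))
```

Nothing is kept in memory beyond what the handler chooses to store, which is why event-based parsing suits large tables.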
19 Parsing
DOM: Document Object Model. Returns a tree-like Document object with data attached.
    BookCatalogue
      Book
        Title: Cambridge Star Atlas
        Author: Wil Tirion
        ISBN
      Book
        Title: Parallel Computing Works!
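For comparison with the event-based approach, the same kind of catalogue can be loaded as a tree with Python's `xml.dom.minidom`. The document below is an assumed arrangement of the node labels on the slide (which book owns which title and author is a reading of the diagram, and the ISBN is left empty because the slide gives no value).

```python
from xml.dom import minidom

# Assumed arrangement of the nodes shown in the slide's tree diagram.
DOC = """<BookCatalogue>
  <Book>
    <Title>Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
    <ISBN/>
  </Book>
  <Book>
    <Title>Parallel Computing Works!</Title>
  </Book>
</BookCatalogue>"""

dom = minidom.parseString(DOC)

# The whole document is in memory, so it can be navigated in any order.
books = dom.documentElement.getElementsByTagName("Book")
titles = [b.getElementsByTagName("Title")[0].firstChild.data for b in books]
author = books[0].getElementsByTagName("Author")[0].firstChild.data

print(dom.documentElement.tagName, len(books))
print(titles, author)
```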
20 Binding to make a Parser
From the Schema, an API and library are generated: JAXB, Breeze, Castor
This is JAVOT (Caltech):
    // Report every column whose UCD marks it as the main right ascension.
    for (int i = 0; i < table.getFieldCount(); i++) {
        Field field = (Field) table.getFieldAt(i);
        String u = field.getUcd();
        if (u != null && u.equals("POS_EQ_RA_MAIN"))
            System.out.println("Field " + i + " is for RA");
    }
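The idea of a binding is that XML elements become typed objects with named accessors. The sketch below imitates that by hand in Python. It is not generated from the VOTable schema and is not the JAVOT API: the `Field` and `Table` classes and the two-column table are hypothetical stand-ins for what a binding tool would produce.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Field:
    """Hand-written stand-in for a generated FIELD class."""
    name: str
    ucd: Optional[str] = None
    datatype: Optional[str] = None

@dataclass
class Table:
    """Hand-written stand-in for a generated TABLE class."""
    name: str
    fields: List[Field]

    @classmethod
    def from_element(cls, element):
        fields = [Field(f.get("name"), f.get("ucd"), f.get("datatype"))
                  for f in element.findall("FIELD")]
        return cls(element.get("name"), fields)

# Illustrative table; the first FIELD deliberately has no ucd attribute.
table = Table.from_element(ET.fromstring(
    '<TABLE name="Stars">'
    '<FIELD name="Star-Name" datatype="char"/>'
    '<FIELD name="RA" ucd="POS_EQ_RA_MAIN" datatype="float"/>'
    '</TABLE>'
))

# Same search as the loop above: which columns carry the main right ascension?
ra_columns = [i for i, f in enumerate(table.fields) if f.ucd == "POS_EQ_RA_MAIN"]
for i in ra_columns:
    print(f"Field {i} is for RA")
```

A missing `ucd` attribute simply becomes `None`, which plays the role of the null check in the Java loop.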
21 VOTable Software
Treeview from UK-VO
22 VOTable Software
VOPlot from India-VO
23 VOTable Software
VOTool from US-VO
24 VOTable Software
Mirage from Bell Labs