Title: VOTable Format, I/O Libraries
1. VOTable Format, I/O Libraries and Tools
- Participants
  - François Ochsenbein (CDS, Strasbourg)
  - Mark Taylor (Starlink, UK)
  - Pallavi Kulkarni (IUCAA, India)
  - Sonali Kale (Persistent Systems, India)
2. Agenda
- VOTable format
- I/O Libraries
- Plotting tools for VOTables
  - Mirage
  - VOPlot
  - TOPCAT
- VOTable browsers
  - Treeview
3. Understanding VOTable
François Ochsenbein (francois_at_simbad.u-strasbg.fr), on behalf of Roy Williams, Clive Davenhall, Daniel Durand, Pierre Fernique, David Giaretta, Robert Hanisch, Tom McGlynn, Alex Szalay, Andreas Wicenec
4. Overview
- The VOTable format is a proposed XML standard for representing tabular data in the context of the Virtual Observatory.
- VOTable was designed to be compatible with FITS Binary Tables (the data part can be addressed directly).
- VOTable is designed as a flexible storage and exchange format for tabular data, with particular emphasis on astronomical tables.
5. Why move to VOTable?
- Interoperability is encouraged through the use of standard data structures and descriptions (metadata).
- FITS is a widely accepted structure, but FITS keywords are rarely shared beyond the basic ones.
- XML built-ins:
  - easy validation of the input document
  - easy transformations and displays via an XSLT engine
- VOTable can cope with very large datasets:
  - does not require the huge XML overhead
  - direct access to binary files and existing FITS files
6. Data Model
- VOTable = hierarchy of Metadata + Tables
- Metadata = Infos + Descriptions + Parameters + Links + Fields
- Resource = set of Tables
- Table = list of Fields/Parameters + Data
- Data = stream of Rows (or binary stream (remote))
- Row = list of Cells
- Cell = Primitive
  - or variable-length list of Primitives
  - or multidimensional array of Primitives
- Primitive = integer, character, float, floatComplex
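The hierarchy above can be sketched as plain Java types. This is only an illustration: the type names (Field, Row, Table, Resource, VOTableDoc) are hypothetical and do not come from any VOTable library, and a cell is simplified to an Object that may be a primitive wrapper or an array.

```java
import java.util.List;

class DataModelSketch {
    // Hypothetical types mirroring the slide's hierarchy; not a real library API.
    record Field(String name, String datatype) {}
    record Row(List<Object> cells) {}                      // a cell is a primitive or an array
    record Table(List<Field> fields, List<Row> rows) {}    // metadata plus data
    record Resource(List<Table> tables) {}
    record VOTableDoc(List<Resource> resources) {}

    // Build a document with one resource, one table, one int column and two rows.
    static VOTableDoc sample() {
        Field area = new Field("area", "int");
        Table table = new Table(
                List.of(area),
                List.of(new Row(List.of(2000)), new Row(List.of(35467))));
        return new VOTableDoc(List.of(new Resource(List.of(table))));
    }

    // Walk the hierarchy and count the rows of every table in every resource.
    static int countRows(VOTableDoc doc) {
        int n = 0;
        for (Resource r : doc.resources())
            for (Table t : r.tables())
                n += t.rows().size();
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countRows(sample()));
    }
}
```

Keeping the field list separate from the rows reflects the split between metadata and data that the model describes.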
7. A VOTable document
Contains the following elements:
- DEFINITIONS: typically about the coordinate system(s) used
- RESOURCE: contains a DESCRIPTION and a list of tables, and possibly (recursively) other RESOURCEs
- TABLE contains:
  - a textual DESCRIPTION of the table
  - a list of FIELDs (columns) which describe the table layout, and possibly PARAMETERs which may specify some constants
  - the DATA part, which can be present in the document, remotely accessible in several formats including FITS, or absent (description of the data structure only)
  - possibly LINKs for getting details, explanations, related data, ...
- Example
8. Table Element in Detail
- Written as a DESCRIPTION and a collection of FIELD elements, plus possible PARAMETER and LINK elements, all representing the metadata, followed by the DATA.
- FIELDs describe the nature of the table columns.
- Table data start with the DATA element, followed by the actual rows containing the values of the fields, in the same order as their description.
9. FIELD in detail
- Has sub-elements such as DESCRIPTION and LINK, and possibly a VALUES element.
- Provides information about the corresponding cells in the DATA element.
- For identification, the FIELD has a name and also an ID:
  - name is the field designation (column header)
  - ID is an identifier which follows the XML rules (restricted character set, unique within an XML document)
- The FIELD must contain a datatype attribute.
- A table cell may hold an array of values of the specified datatype; the array shape is given by the arraysize attribute, which can be variable-length and multidimensional (e.g. 64x64x*).
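As a rough illustration of how an arraysize value might be interpreted, the sketch below splits a string such as 64x64x* into its dimensions. The class name, the method name and the use of -1 to mark a variable-length dimension are assumptions of this sketch, and it ignores other forms the format may allow (for instance an upper bound attached to the asterisk).

```java
import java.util.Arrays;

class ArraySizeSketch {
    // Marker for a variable-length ("*") dimension; a convention of this sketch only.
    static final int VARIABLE = -1;

    // Split an arraysize string such as "64x64x*" into one entry per dimension.
    static int[] parse(String arraysize) {
        String[] parts = arraysize.trim().split("x");
        int[] dims = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            String p = parts[i].trim();
            if (p.equals("*")) {
                // This sketch accepts a variable dimension only in the last position.
                if (i != parts.length - 1)
                    throw new IllegalArgumentException("'*' must come last: " + arraysize);
                dims[i] = VARIABLE;
            } else {
                dims[i] = Integer.parseInt(p);
            }
        }
        return dims;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("64x64x*")));
    }
}
```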
10. Available Datatypes
11. FIELD (cont.)
- The unit attribute specifies the units in a controlled vocabulary.
- ucd (Unified Content Descriptor)
  - Provides information about the semantics of the field, expressed as a standardized string; UCDs were originally created at CDS and are currently being reviewed.
- precision indicates the number of significant digits for display purposes.
- VALUES element
  - Holds the domain information (min, max, null, set of available values).
- LINK element
  - Provides pointers to other documents or data servers through a URL.
12. Parameter definitions
- A PARAMETER is similar to a FIELD: it may contain a DESCRIPTION and the attributes unit, ucd, name, ID, ..., plus a value attribute.
- It may be considered as a constant column.
- Typical examples:
  - frequency or wavelength of flux measurements
  - statistical error of results
13. Data Content
- The data content of the table is in a single DATA element and is organized in reading order.
- Data can be stored or accessed in several formats:
  - TABLEDATA introduces an XML coding of the table
  - FITS refers to an external FITS file
  - BINARY indicates a binary coding of the data, used for its efficiency
  - STREAM introduces remote or encoded data
- Data can be in the input stream or stored separately.
- Example
14. VOTable Additions
- Several new features are being considered:
  - GROUP of fields introduces a view of the fields as a structure
  - utype attribute to characterize the role of a field in the context of a data model
  - encoding attribute of a cell element, in order to store e.g. images in table cells
15. VOTable I/O libraries
Pallavi Kulkarni (pck_at_iucaa.ernet.in)
16. I/O Libraries
- Why are parsers required?
- Why are writers required?
- Tree Structured Approach.
- Event Driven Approach.
- Tree Structured vs. Event Driven Approach.
- VOTable Parsers and Writers.
17. Why are parsers required?
- Provide a library for API-based access to VOTable files.
- These APIs can be used directly to develop VOTable applications without having to do raw VOTable processing.
18. Why are writers required?
- A writer provides APIs for generating syntactically correct documents.
- The user doesn't need to be aware of low-level APIs.
- Simplifies the process of writing a document.
19. Tree Structured Approach
- The tree structured approach is a two-step process:
  - The entire XML data is loaded in memory.
  - Operations are performed on the loaded data.
[Diagram: a tree with a VOTable root, two Resource nodes, and four Table nodes]
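A minimal sketch of the two-step tree approach, using the standard JDK DOM API (javax.xml.parsers) rather than any of the VOTable libraries discussed in this talk. The class name, method names and the small sample document are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class TreeParseSketch {
    // Small VOTable-like document used only for this illustration.
    static final String SAMPLE =
        "<VOTABLE><RESOURCE><TABLE>"
      + "<FIELD name=\"area\" datatype=\"int\"/>"
      + "<DATA><TABLEDATA>"
      + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
      + "</TABLEDATA></DATA>"
      + "</TABLE></RESOURCE></VOTABLE>";

    // Step 1: load the entire document into memory as a tree.
    static Document load(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 2: operate on the loaded tree; here, overwrite the first cell and read it back.
    static String editFirstCell(Document doc, String newValue) {
        NodeList cells = doc.getElementsByTagName("TD");
        Element first = (Element) cells.item(0);
        first.setTextContent(newValue);
        return first.getTextContent();
    }

    public static void main(String[] args) {
        Document doc = load(SAMPLE);
        System.out.println(doc.getElementsByTagName("TR").getLength() + " rows");
        System.out.println(editFirstCell(doc, "2001"));
    }
}
```

Because the whole tree stays in memory, the code can revisit and modify any node after loading, at the cost of holding the full document.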
20. Event Driven Approach
- Does not create a tree structure in memory.
- Single step process.
- Data is passed from XML document to the
application on the fly.
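A minimal sketch of the event driven approach, using the standard JDK SAX API instead of any VOTable-specific parser. The class and method names are assumptions for illustration. No tree is built: the handler receives each element as the parser encounters it and keeps only the cell values.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class EventParseSketch {
    // Collect the text of every TD cell as the parser reports it.
    static List<String> cells(String xml) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text;   // non-null only while inside a TD element

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (qName.equals("TD")) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (qName.equals("TD")) {
                    out.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<VOTABLE><RESOURCE><TABLE><DATA><TABLEDATA>"
                   + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
                   + "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>";
        System.out.println(cells(xml));
    }
}
```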
21. Tree Structured vs. Event Driven
- The event driven approach is faster.
- The event driven approach is good for large documents because it takes comparatively less memory.
- With the event driven approach, one can access the data but not modify it. With the tree structured approach, one can modify data.
- Tree structured parsing allows back-and-forth traversal of the XML data.
- Event driven parsing can be stopped once the desired XML data has been extracted.
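The last point, stopping as soon as the wanted data has been seen, can be sketched with the JDK's StAX pull parser (javax.xml.stream). StAX is a different event-style API from callback-based SAX and is used here only because it makes the early exit easy to show; the class and method names are assumptions.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class EarlyStopSketch {
    // Return the text of the first TD cell, or null if the document has none.
    // The loop returns as soon as the value is found, so later rows are not parsed.
    static String firstCell(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (r.hasNext()) {
                    if (r.next() == XMLStreamConstants.START_ELEMENT
                            && r.getLocalName().equals("TD")) {
                        return r.getElementText();
                    }
                }
                return null;
            } finally {
                r.close();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstCell(
            "<VOTABLE><TR><TD>2000</TD></TR><TR><TD>35467</TD></TR></VOTABLE>"));
    }
}
```

A tree-based parser would have to load both rows before the first could be read.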
22. VOTable Parsers and Writers
- JAVOT (NVO)
- SAVOT (European VO)
- VOTable Java Streaming Writer (VO-India)
- C++ Parser (VO-India)
- VOTable Perl Modules (NVO)
- VOTableDOM (NVO)
23. JAVOT
- Developed at Caltech.
- Written in Java.
- Creates a tree structure in memory.
- Current version supports reading of VOTables.
- Editing and writing of tables are not yet supported.
- More information can be found at http://www.us-vo.org/VOTable/JAVOT/
24. SAVOT
- Developed at CDS.
- Written in Java.
- Supports reading, writing and editing of VOTables.
- Can work in both full (tree structured) and sequential (event driven) modes for parsing the data.
- Suitable for large VOTable files as well.
- More information can be found at http://simbad.u-strasbg.fr/public/cdsjava.gml
25. Sample VOTable
<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="area" datatype="int"/>
      <DATA><TABLEDATA>
        <TR><TD>2000</TD></TR>
        <TR><TD>35467</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
26. Sample Code (Sequential mode)
// sb is the SAVOT parser object, created earlier (not shown on the slide)
// Get resource element
SavotResource currentResource = sb.getNextResource();
// Fetch all the rows from the first table
TRSet tr = currentResource.getTRSet(0);
// Fetch the first row from the set of rows
TDSet theTDs = tr.getTDSet(0);
// Print the data field-wise
for (int k = 0; k < theTDs.getItemCount(); k++)
    System.out.println(theTDs.getContent(k));
27. VOTable Java Streaming Writer
- Developed at IUCAA and Persistent Systems.
- Written in Java.
- It supports only writing of data.
- Takes a streaming (i.e. event-based) approach to write the data.
- More information can be found at http://vo.iucaa.ernet.in/voi/votableStreamWriter.htm
28. VOTable to be generated
<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="Planet" datatype="char"/>
      <DATA><TABLEDATA>
        <TR><TD>Mercury</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
29. Sample Code
// Initialize the streaming writer (prnStream is the output stream, created earlier)
VOTableStreamWriter voWrite = new VOTableStreamWriter(prnStream);
// Create and write the VOTable element
VOTable vot = new VOTable();
voWrite.writeVOTable(vot);
// Create and write the Resource element
VOTableResource voResource = new VOTableResource();
voWrite.writeResource(voResource);
// Create a table element
VOTableTable voTab = new VOTableTable();
// Create a field and add it to the table
VOTableField voField = new VOTableField();
voField.setName("Planet");
voField.setDataType("char");
voTab.addField(voField);
30. Sample code (contd.)
// Write the table to the output stream
voWrite.writeTable(voTab);
// Create a row and write it to the output stream
String firstRow = "Mercury";
voWrite.addRow(firstRow, 1);
// End the respective elements
voWrite.endTable();
voWrite.endResource();
voWrite.endVOTable();
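For comparison, the same start, write and end pattern can be sketched with the JDK's own streaming XML writer (javax.xml.stream.XMLStreamWriter). This is a stand-in for illustration, not the VO-India library: the class name and the write method are assumptions, and only the element names are taken from the VOTable examples in this talk.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class StreamWriteSketch {
    // Write a one-column VOTable, emitting each element as soon as it is known.
    static String write(String fieldName, String datatype, String... values) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("VOTABLE");
            w.writeStartElement("RESOURCE");
            w.writeStartElement("TABLE");
            w.writeEmptyElement("FIELD");          // column metadata comes before the data
            w.writeAttribute("name", fieldName);
            w.writeAttribute("datatype", datatype);
            w.writeStartElement("DATA");
            w.writeStartElement("TABLEDATA");
            for (String v : values) {              // one TR per row, written immediately
                w.writeStartElement("TR");
                w.writeStartElement("TD");
                w.writeCharacters(v);
                w.writeEndElement();               // TD
                w.writeEndElement();               // TR
            }
            w.writeEndElement();                   // TABLEDATA
            w.writeEndElement();                   // DATA
            w.writeEndElement();                   // TABLE
            w.writeEndElement();                   // RESOURCE
            w.writeEndElement();                   // VOTABLE
            w.close();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("Planet", "char", "Mercury"));
    }
}
```

Rows are written as they are produced, so the writer never needs the whole table in memory.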
31. C++ Parser
- Developed at Persistent Systems and IUCAA.
- Written in C++.
- Available as both a streaming and a non-streaming parser.
- It runs on Windows NT 4.0, Windows 2000, and Red Hat Linux 7.1.
32. C++ Parser (cont.)
- Currently supports reading of VOTables and pure-XML table data.
- Currently does not support reading of binary or FITS data, nor writing of VOTables.
- More information can be found at http://vo.iucaa.ernet.in/voi/cplusparser.htm
33. VOTable Perl modules
- Developed by the ClassX project at GSFC (NVO).
- Written in Perl.
- Builds a tree structure in memory.
- Includes optimizations to handle large TABLEDATA elements.
- Can be used for parsing existing VOTables and creating new ones from scratch.
- More information can be found at http://heasarc.gsfc.nasa.gov/classx/pub/votable/
34. Perl for formatting and printing
- Developed at NCSA (NVO).
- Written in Perl.
- It supports writing of VOTable documents.
- Takes a streaming (i.e. event driven) approach to write data.
- More information can be found at http://monet.ncsa.uiuc.edu/rplante/VO/VOTable-DOM.pm
35. VOTable Applications: VOPlot and Mirage
Sonali Kale (sonali_at_persistent.co.in)
36. Plotting Tools for VOTables
- VOPlot
  - Developed under the Virtual Observatory India initiative.
- Mirage
  - Developed by Lucent Technologies, Bell Labs.
- TOPCAT
  - Developed by Starlink, UK.
37. VOPlot: Introduction
- Visualization toolkit developed by Persistent Systems and IUCAA in collaboration with CDS.
- Motivation: allow web-based visualization of astronomy data.
- Available as a web-based version as well as a standalone desktop application.
- The web-based version is successfully integrated with the VizieR Catalogue Service.
38. VOPlot: Visualization
- Has all the features of a versatile graphical tool:
  - Scatter plots
  - Connected plots
  - Histograms
  - Logarithmic axes
  - Overlaying
  - Auto-ranging
39. VOPlot: Features
- Column transformations based on expressions.
- Data subset creation based on filter conditions.
- Save graph in EPS format.
- Ability to select points on the graph.
- View metadata and data in tabular and VOTable format.
40. VOPlot: Features
- Interoperable with Aladin, developed by CDS.
- This enables simultaneous visualization of a graph in VOPlot together with a representation as a sky plot in Aladin.
- Selecting a region in the graph highlights the corresponding points in the sky plot, and vice versa.
41. VOPlot: Features
- VOPlot can be used for basic statistical analysis. Basic univariate and multivariate functions are supported.
- Displays box plots.
- A box plot provides a visual summary of important aspects of the data distribution.
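The values a box plot typically summarizes are the minimum, lower quartile, median, upper quartile and maximum. The sketch below computes them for one column; it is not VOPlot's implementation, the class and method names are assumptions, and the linear-interpolation quantile rule is just one of several common conventions.

```java
import java.util.Arrays;

class BoxPlotSketch {
    // Quantile by linear interpolation between neighbouring sorted values.
    // Other conventions exist and give slightly different quartiles.
    static double quantile(double[] sorted, double q) {
        double pos = q * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    // Returns {min, lower quartile, median, upper quartile, max}.
    static double[] fiveNumberSummary(double[] data) {
        if (data.length == 0) throw new IllegalArgumentException("no data");
        double[] s = data.clone();   // leave the caller's array untouched
        Arrays.sort(s);
        return new double[] {
            s[0], quantile(s, 0.25), quantile(s, 0.5), quantile(s, 0.75), s[s.length - 1]
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fiveNumberSummary(new double[] {5, 1, 4, 2, 3})));
    }
}
```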
42. VOPlot: Features
- Allows overlaying from multiple catalogues.
- Allows drawing of error bars.
- Can be integrated with any web-based catalogue service that creates output in VOTable format.
- Example: successfully integrated with LEDAS (UK).
43. Launch VOPlot from VizieR
44. Launch VOPlot from VizieR (cont.)
45. Launch VOPlot from VizieR (cont.)
Choose Y column
46. VOPlot: Inter-operation with Aladin
47. Inter-operation with Aladin (cont.)
Object selected in Aladin.
Point highlighted in VOPlot.
48. VOPlot: References
- VOPlot: http://vo.iucaa.ernet.in/voi/voplot.htm
- VizieR Catalogue Service: http://vizier.u-strasbg.fr/viz-bin/VizieR
- Aladin: http://aladin.u-strasbg.fr/
- Virtual Observatory India: http://vo.iucaa.ernet.in/voi/
49. Mirage: Introduction
- Mirage is a Java-based tool for data visualization and exploratory analysis.
- Developed at Bell Laboratories, Lucent Technologies.
- Support for VOTables was added in collaboration with Johns Hopkins University.
- Ray Plante produced an XSL stylesheet to convert from VOTable to Mirage format.
50. Mirage: Data Visualization Features
- Mirage provides multiple simultaneous views of data.
- Data visualization is provided through:
  - Data matrix view
  - Scatter plots
  - Histogram
  - Feature vector plot
51. Mirage: Operations On Graph
- Points on a plot can be selected using different selection methods (box, Bezier curve or freehand).
- A selection can be broadcast to other views.
- Automatic walk-through of graphs.
- A grid of plots can be created.
- Coloring of plot points.
52. Mirage: Command Interpreter
- The command line interpreter supports:
  - Creation of new columns based on arithmetic expressions.
  - Addition of new columns from another data file.
  - Selection of points using a logical expression.
  - Color selection of points.
53. Mirage: Data Matrix View
Data read from file.
One row for each entry and one column for each attribute.
54. Mirage: Scatter Plot
Walking through a plot.
Choose a column for plotting.
55. Mirage: Histogram
Change bin width.
56. Mirage: Multiple Simultaneous Views
Make a selection.
Broadcast selection.
57. Mirage: Grid of plots
Zoom in and out.
Command to add column.
58. Mirage: References
- Mirage: http://www.bell-labs.com/project/mirage/
- VO Enabled Mirage: http://skyservice.pha.jhu.edu/develop/vo/mirage/
- XSL Stylesheet to convert VOTable to Mirage format: http://www.euro-vo.org/pub/articles/ScienceWithProtoVOtools/vot2mirage.xml
59. VOTable Applications: TOPCAT and Treeview
Mark Taylor (m.b.taylor_at_bristol.ac.uk)
60. TOPCAT: Tool for OPerations on Catalogues And Tables
- Generic table read/view/edit/analyse/plot/write
- Format-neutral: VOTable, FITS, SQL and others
- Pure Java
- Full online help
- Powerful expression language
- Supports large tables
- Under active development
61. TOPCAT capabilities
- View/edit table data
- View/edit table and column metadata
  - Units, UCDs, type, array shape, format-specific items...
- Create new columns using algebraic expressions
  - Powerful language: conditionals, nulls, string handling...
- Move/delete columns
- Sort rows
- Create row subsets in various ways
  - Algebraic, graphical, boolean columns, selected rows
- Plot columns against each other
  - Distinguish different defined subsets
- Calculate per-column statistics
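62. Table I/O
[Diagram: table formats on either side of a central "format-neutral view"]
- Formats listed on one side: VOTable, FITS, SQL, ASCII, LaTeX, Mirage, others
- Formats listed on the other side:
  - VOTable (TABLEDATA, FITS, STREAM?)
  - FITS (TABLE, BINTABLE)
  - SQL
  - ...others
- Software built on the format-neutral view:
  - TOPCAT (view/edit GUI)
  - Tablecopy (command line)
  - Tables class library (your code here)
- Example command: tablecopy -ofmt fits my-votable.xml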
64. Treeview
- Find tables in deep hierarchies
  - Directory trees
  - VOTable hierarchical structure
  - Multi-extension FITS files
  - Compression: .gz, .bz2, .Z
  - Tar, zip, jar files
  - FTP, HTTP
- Quick preview of VOTables
  - XML, table data, table/column metadata
- Launch TOPCAT
  - Recent versions of TOPCAT (v0.4) have embedded Treeview
65. References
- TOPCAT: http://www.starlink.ac.uk/topcat/
- Treeview: http://www.starlink.ac.uk/treeview/
- Starlink: http://www.starlink.rl.ac.uk/