1
VOTable Format, I/O Libraries & Tools
  • Participants:
  • Francois Ochsenbein (CDS, Strasbourg)
  • Mark Taylor (Starlink, UK)
  • Pallavi Kulkarni (IUCAA, India)
  • Sonali Kale (Persistent Systems, India)

2
Agenda
  • VOTable format
  • I/O Libraries
  • Plotting tools for VOTables
  • Mirage
  • VOPlot
  • Topcat
  • VOTable browsers
  • Treeview

3
Understanding VOTable
François Ochsenbein
(francois_at_simbad.u-strasbg.fr), on behalf
of Roy Williams, Clive Davenhall, Daniel
Durand, Pierre Fernique, David Giaretta, Robert
Hanisch, Tom McGlynn, Alex Szalay, Andreas Wicenec
4
Overview
  • The VOTable format is a proposed XML standard for
    representing tabular data in the context of the
    Virtual Observatory
  • VOTable was designed to be compatible with FITS
    Binary Tables (the data part can be addressed
    directly)
  • VOTable is designed as a flexible storage and
    exchange format for tabular data, with particular
    emphasis on astronomical tables

5
Why move to VOTable?
  • Interoperability is encouraged through the use of
    standard data structures and descriptions
    (metadata)
  • FITS is a widely accepted structure, but FITS
    keywords are rarely shared beyond the basic ones.
  • XML built-ins: easy validation of the input
    document, and easy transformations and displays
    via an XSLT engine
  • VOTable can cope with very large datasets: it
    does not require the huge XML overhead, and
    allows direct access to binary files and
    existing FITS files

6
Data Model
  • VOTable = hierarchy of Metadata + Tables
  • Metadata = Infos + Descriptions + Parameters +
    Links + Fields
  • Resource = set of tables
  • Table = list of Fields/Parameters + Data
  • Data = stream of Rows (or binary stream (remote))
  • Row = list of Cells
  • Cell = Primitive, or variable-length list of
    Primitives, or multidimensional array of
    Primitives
  • Primitive = integer, character, float,
    floatComplex
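
The hierarchy on this slide can be pictured as nested container types. The following is a minimal Java sketch under that reading. The class names (DataModelSketch, Field, Table, Resource) are hypothetical and chosen only for illustration; they are not taken from any VOTable library.

```java
import java.util.List;

// Hypothetical types mirroring the slide's hierarchy; not a real VOTable API.
final class DataModelSketch {
    // A Field describes one column: its name and primitive datatype.
    static final class Field {
        final String name;
        final String datatype;
        Field(String name, String datatype) {
            this.name = name;
            this.datatype = datatype;
        }
    }

    // A Table is a list of Fields plus its Data, held here as rows of cells.
    // A cell is typed as Object because it may be a primitive or an array.
    static final class Table {
        final List<Field> fields;
        final List<List<Object>> rows;
        Table(List<Field> fields, List<List<Object>> rows) {
            this.fields = fields;
            this.rows = rows;
        }
    }

    // A Resource is a set of tables.
    static final class Resource {
        final List<Table> tables;
        Resource(List<Table> tables) {
            this.tables = tables;
        }
    }

    // Builds a resource holding one single-column table with two integer cells.
    static Resource example() {
        Field area = new Field("area", "int");
        List<List<Object>> rows =
            List.of(List.<Object>of(2000), List.<Object>of(35467));
        return new Resource(List.of(new Table(List.of(area), rows)));
    }

    public static void main(String[] args) {
        Table table = example().tables.get(0);
        for (List<Object> row : table.rows) {
            System.out.println(table.fields.get(0).name + " = " + row.get(0));
        }
    }
}
```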

7
A VOTable document
Contains the following elements
  • DEFINITIONS typically about used coordinate
    system(s)
  • RESOURCE contains a DESCRIPTION and a list of
    tables, and possibly (recursively) other RESOURCEs
  • TABLE contains:
  • a textual DESCRIPTION of the table
  • a list of FIELDs (columns) which describe the
    table layout, and possibly PARAMETERs which
    specify constants
  • the DATA part, which can be present in the
    document, remotely accessible in several
    formats including FITS, or absent (in which case
    only the data structure is described)
  • possibly LINKs for getting details, explanations,
    related data, ...
  • Example

8
Table Element in Detail
  • Written as DESCRIPTION and a collection of FIELD
    elements, plus possible PARAMETER and LINK
    elements, all representing the metadata followed
    by the DATA
  • Fields describe the nature of table columns
  • Table data start with the DATA element, followed
    by the actual rows containing the values of the
    fields, in the same order as their description.

9
FIELD in detail
  • Has sub elements like DESCRIPTION, LINK, and
    possibly a VALUES element
  • Provides information for a corresponding cell in
    the DATA element
  • For identification, the FIELD has a name, and
    also an ID
  • name is the field designation (column header)
  • ID is an identifier which follows the XML rules
    (restricted character set, uniqueness within an
    XML document)
  • The FIELD must contain a datatype attribute
  • Each table cell may contain more than one value
    of the specified datatype; the number of values
    is given by arraysize, which can be variable and
    multidimensional
    (64x64x)

10
Available Datatypes
11
FIELD (cont.)
  • unit attribute: specifies the units, using a
    controlled vocabulary
  • ucd (Unified Content Descriptor) attribute:
    describes the semantics of the field as a
    standardized string; UCDs were originally
    created at CDS and are currently being reviewed
  • precision attribute: the number of significant
    digits to use when formatting values for output
  • VALUES element: holds the domain information
    (min, max, null, set of available values)
  • LINK element: provides pointers to other
    documents or data servers through a URL

12
Parameter definitions
  • A PARAMETER is similar to a FIELD: it may
    contain a DESCRIPTION and the attributes unit,
    ucd, name, ID, ..., plus a value attribute
  • It may be considered as a constant column
  • Typical examples
  • frequency or wavelength of flux measurements
  • statistical error of results

13
Data Content
  • The data content of the table is in a single DATA
    element and is organized in a reading order.
  • Data can be stored or accessed in several
    formats
  • TABLEDATA introduces an XML coding of the table
  • FITS refers to an external FITS file
  • BINARY indicates a binary coding of the data used
    for its efficiency
  • STREAM introduces remote or encoded data
  • Data can be in the input stream or stored
    separately
  • Example

14
VOTable Additions
  • Several new features are being considered
  • GROUP of fields introduces a view of the fields
    as a structure
  • utype attribute to characterize the role of a
    field in the context of a data model
  • encoding attribute of a cell element in order to
    store e.g. images in table cells

15
VOTable I/O libraries
Pallavi Kulkarni (pck_at_iucaa.ernet.in)
16
I/O Libraries
  • Why are parsers required?
  • Why are writers required?
  • Tree Structured Approach.
  • Event Driven Approach.
  • Tree Structured vs. Event Driven Approach.
  • VOTable Parsers & Writers.

17
Why are parsers required?
  • Provide a library for API based access to
    VOTable files.
  • These APIs can be directly used to develop
    VOTable applications without having to do raw
    VOTable processing.

18
Why are writers required?
  • A writer provides APIs for generating
    syntactically correct documents.
  • The user doesn't need to be aware of low-level
    APIs.
  • Simplifies the process of writing a document.

19
Tree Structured Approach
  • Tree structured approach is a two step process.
  • The entire XML data is loaded in memory.
  • Operations are performed on the loaded data.

[Figure: tree diagram with a VOTable root node, two Resource nodes and four Table nodes]
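
The two steps above can be sketched with the DOM parser that ships with the JDK. This is only an illustration of the tree-structured approach on a tiny VOTable-like string; it is not the API of any of the VOTable libraries named in this presentation, and the class and method names (TreeParseSketch, readCells) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class TreeParseSketch {
    // A small VOTable-like document with one column and two rows.
    static final String SAMPLE =
        "<VOTABLE><RESOURCE><TABLE>"
        + "<FIELD name=\"area\" datatype=\"int\"/>"
        + "<DATA><TABLEDATA>"
        + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
        + "</TABLEDATA></DATA>"
        + "</TABLE></RESOURCE></VOTABLE>";

    static List<String> readCells(String xml) {
        try {
            // Step 1: load the entire XML document into memory as a tree.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // Step 2: perform operations on the loaded tree.
            NodeList cells = doc.getElementsByTagName("TD");
            List<String> values = new ArrayList<>();
            for (int i = 0; i < cells.getLength(); i++) {
                values.add(cells.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readCells(SAMPLE)); // prints [2000, 35467]
    }
}
```

Because the whole tree stays in memory, the same Document could also be traversed again or modified before being written back out.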
20
Event Driven Approach
  • Does not create a tree structure in memory.
  • Single step process.
  • Data is passed from XML document to the
    application on the fly.
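
As a rough illustration of the event-driven approach, the sketch below uses the JDK's SAX parser: the parser calls back into a handler as it meets each element, and no tree is built. The class names (EventParseSketch, CellHandler) are hypothetical, and this is not the API of any VOTable library mentioned here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class EventParseSketch {
    // Receives parser events and keeps only the text of TD cells.
    static final class CellHandler extends DefaultHandler {
        final List<String> cells = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a TD element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("TD".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("TD".equals(qName) && current != null) {
                cells.add(current.toString());
                current = null;
            }
        }
    }

    static List<String> readCells(String xml) {
        try {
            CellHandler handler = new CellHandler();
            // Single pass: data is handed to the handler while the input is read.
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.cells;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<VOTABLE><RESOURCE><TABLE><DATA><TABLEDATA>"
            + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
            + "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>";
        System.out.println(readCells(xml)); // prints [2000, 35467]
    }
}
```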

21
Tree Structured vs. Event Driven
  • Event driven approach is faster.
  • Event driven approach is good for large documents
    because it takes comparatively less memory.
  • With event driven approach, one can access the
    data but not modify it. With tree structured
    approach, one can modify data.
  • Tree structured parsing allows back and forth
    traversal in the XML data.
  • Event driven parsing can be stopped once desired
    XML data has been extracted.
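
The last point, stopping once the desired data has been found, can be sketched with the JDK's StAX reader. StAX is a pull-style streaming API rather than a callback one, so it stands in here for the event-driven family as an assumption of this example. The names EarlyStopSketch and firstCell are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class EarlyStopSketch {
    // Returns the text of the first TD cell, or null if there is none.
    // The method returns as soon as that cell is read; later rows are not visited.
    static String firstCell(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "TD".equals(reader.getLocalName())) {
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<VOTABLE><RESOURCE><TABLE><DATA><TABLEDATA>"
            + "<TR><TD>2000</TD></TR><TR><TD>35467</TD></TR>"
            + "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>";
        System.out.println(firstCell(xml)); // prints 2000
    }
}
```

A tree-structured parser would have had to load both rows before the first one could be inspected.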

22
VOTable Parsers & Writers
  • JAVOT (NVO)
  • SAVOT (European VO)
  • VOTable Java Streaming Writer (VO-India)
  • C++ Parser (VO-India)
  • VOTable Perl Modules (NVO)
  • VOTableDOM (NVO)

23
JAVOT
  • Developed at Caltech.
  • Written in JAVA.
  • Creates a tree structure in memory.
  • Current version supports reading of VOTables.
  • Editing and writing of tables not yet supported.
  • More information can be found at
    http://www.us-vo.org/VOTable/JAVOT/

24
SAVOT
  • Developed at CDS.
  • Written in JAVA.
  • Supports reading, writing and editing of
    VOTables.
  • Can work in both full (tree structured) and
    sequential (event driven) modes for parsing the
    data.
  • Suitable for large VOTable files as well.
  • More information can be found at
    http://simbad.u-strasbg.fr/public/cdsjava.gml

25
Sample VOTable
  • <VOTABLE>
  • <RESOURCE>
  • <TABLE>
  • <FIELD name="area" datatype="int"/>
  • <DATA><TABLEDATA>
  • <TR><TD>2000</TD></TR>
  • <TR><TD>35467</TD></TR>
  • </TABLEDATA></DATA>
  • </TABLE>
  • </RESOURCE>
  • </VOTABLE>

26
Sample Code (Sequential mode)
  • // sb is the SAVOT parser, opened beforehand in
    // sequential mode
  • // Get the next resource element
  • SavotResource currentResource =
    sb.getNextResource();
  • // Fetch all the rows from the first table
  • TRSet tr = currentResource.getTRSet(0);
  • // Fetch the first row from the set of rows
  • TDSet theTDs = tr.getTDSet(0);
  • // Print the data field by field
  • for (int k = 0; k < theTDs.getItemCount(); k++)
  • System.out.println(theTDs.getContent(k));
27
VOTable JAVA Streaming Writer
  • Developed at IUCAA and Persistent Systems.
  • Written in JAVA.
  • It supports only writing of data.
  • Takes a streaming (i.e. event-based) approach
    to write the data.
  • More information can be found at
    http://vo.iucaa.ernet.in/voi/votableStreamWriter.htm

28
VOTable to be generated
  • <VOTABLE>
  • <RESOURCE>
  • <TABLE>
  • <FIELD name="Planet" datatype="char"/>
  • <DATA><TABLEDATA>
  • <TR><TD>Mercury</TD></TR>
  • </TABLEDATA></DATA>
  • </TABLE>
  • </RESOURCE>
  • </VOTABLE>

29
Sample Code
  • // Initialize the streaming writer
  • VOTableStreamWriter voWrite = new
    VOTableStreamWriter(prnStream);
  • // Create and write the VOTable element
  • VOTable vot = new VOTable();
  • voWrite.writeVOTable(vot);
  • // Create and write the Resource element
  • VOTableResource voResource = new
    VOTableResource();
  • voWrite.writeResource(voResource);
  • // Create a table element
  • VOTableTable voTab = new VOTableTable();
  • // Create a field and add it to the table
  • VOTableField voField = new VOTableField();
  • voField.setName("Planet");
  • voField.setDataType("char");
  • voTab.addField(voField);
30
Sample code contd.
  • // Write the table to the output stream
  • voWrite.writeTable(voTab);
  • // Create a row and write it to the output stream
  • String[] firstRow = {"Mercury"};
  • voWrite.addRow(firstRow, 1);
  • // End the respective elements
  • voWrite.endTable();
  • voWrite.endResource();
  • voWrite.endVOTable();
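
For comparison, the same start-element, write-rows, end-element pattern can be sketched with the JDK's own XMLStreamWriter. This is a stand-in that uses only the standard library; it is not the VOTable Java Streaming Writer API, and the names StreamWriteSketch and writePlanetTable are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

final class StreamWriteSketch {
    // Writes a one-column "Planet" table containing a single row of cells.
    // Each piece of markup is emitted as soon as it is requested; no tree is built.
    static String writePlanetTable(String[] row) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("VOTABLE");
            w.writeStartElement("RESOURCE");
            w.writeStartElement("TABLE");
            w.writeEmptyElement("FIELD");
            w.writeAttribute("name", "Planet");
            w.writeAttribute("datatype", "char");
            w.writeStartElement("DATA");
            w.writeStartElement("TABLEDATA");
            w.writeStartElement("TR");
            for (String cell : row) {
                w.writeStartElement("TD");
                w.writeCharacters(cell);
                w.writeEndElement(); // TD
            }
            w.writeEndElement(); // TR
            w.writeEndElement(); // TABLEDATA
            w.writeEndElement(); // DATA
            w.writeEndElement(); // TABLE
            w.writeEndElement(); // RESOURCE
            w.writeEndElement(); // VOTABLE
            w.flush();
            w.close();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not write the XML output", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writePlanetTable(new String[] {"Mercury"}));
    }
}
```

In a real application the writer would usually wrap a file or network stream rather than a StringWriter, so that large tables never need to be held in memory.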
31
C++ Parser
  • Developed at Persistent Systems and IUCAA.
  • Written in C++.
  • Available as both streaming and non-streaming
    parser.
  • It runs on Windows NT 4.0, Windows 2000, and
    Redhat Linux 7.1.

32
C++ Parser
  • Currently supports reading of VOTables and
    pure-XML table data.
  • Does not yet support reading of binary or FITS
    data, or writing of VOTables.
  • More information can be found at
    http://vo.iucaa.ernet.in/voi/cplusparser.htm

33
VOTable PERL modules
  • Developed by the ClassX project at GSFC (NVO).
  • Written in PERL.
  • Builds a tree structure in memory.
  • Optimizations to handle large TABLEDATA elements.
  • Can be used for parsing existing VOTables and
    creating new ones from scratch.
  • More information can be found at
    http://heasarc.gsfc.nasa.gov/classx/pub/votable/

34
PERL for formatting & printing
  • Developed at NCSA (NVO).
  • Written in PERL.
  • It supports writing of VOTable documents.
  • Takes a streaming (i.e. event-driven) approach
    to write data.
  • More information can be found at
    http://monet.ncsa.uiuc.edu/rplante/VO/VOTable-DOM.pm

35
VOTable Applications: VOPLOT & MIRAGE
Sonali Kale (sonali_at_persistent.co.in)
36
Plotting Tools for VOTables
  • VOPlot: developed under the Virtual Observatory
    India initiative.
  • Mirage: developed by Lucent Technologies, Bell
    Labs.
  • Topcat: developed by Starlink, UK.

37
VOPlot Introduction
  • Visualization toolkit developed by Persistent
    Systems and IUCAA in collaboration with CDS.
  • Motivation: allow web-based visualization of
    astronomy data.
  • Available as web-based version as well as a
    standalone desktop application.
  • Web-based version is successfully integrated with
    Vizier Catalogue Service.

38
VOPlot Visualization
  • Has the features of a versatile graphical tool:
  • Scatter plots
  • Connected plots
  • Histograms
  • Logarithmic axes
  • Overlaying
  • Auto-ranging

39
VOPlot Features
  • Column transformations based on expression.
  • Data subset creation based on filter condition.
  • Save graph in EPS format.
  • Ability to select points on graph.
  • View metadata and data in tabular and VOTable
    format.
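
The first two features, a column derived from an expression and a subset selected by a filter, can be sketched generically as follows. This is not VOPlot's implementation or expression syntax; the names (ColumnOpsSketch, transform, subset) are hypothetical, and the parallax-to-distance conversion is just an example expression.

```java
import java.util.Arrays;
import java.util.function.DoublePredicate;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.IntStream;

final class ColumnOpsSketch {
    // Derives a new column by applying an expression to every value of a column.
    static double[] transform(double[] column, DoubleUnaryOperator expression) {
        return Arrays.stream(column).map(expression).toArray();
    }

    // Returns the indices of the rows whose value satisfies the filter condition.
    static int[] subset(double[] column, DoublePredicate filter) {
        return IntStream.range(0, column.length)
            .filter(i -> filter.test(column[i]))
            .toArray();
    }

    public static void main(String[] args) {
        // Example column: parallaxes in milliarcseconds.
        double[] parallaxMas = {100.0, 50.0, 10.0};
        // Derived column: distance in parsecs = 1000 / parallax in mas.
        double[] distancePc = transform(parallaxMas, p -> 1000.0 / p);
        // Subset: rows closer than 50 parsecs.
        int[] nearby = subset(distancePc, d -> d < 50.0);
        System.out.println(Arrays.toString(distancePc)); // prints [10.0, 20.0, 100.0]
        System.out.println(Arrays.toString(nearby));     // prints [0, 1]
    }
}
```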

40
VOPlot Features
  • Inter-operable with Aladin developed by CDS.
  • This enables simultaneous visualization of some
    graph in VOPlot together with a representation as
    a sky plot in Aladin.
  • Selecting some region in the graph highlights the
    corresponding points in the sky plot and vice
    versa.

41
VOPlot Features
  • VOPlot can be used for basic statistical
    analysis. Basic univariate and multivariate
    functions are supported.
  • Displays box plots.
  • A box plot provides a visual summary of important
    aspects of the data distribution.
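
A box plot is commonly drawn from a five-number summary: minimum, lower quartile, median, upper quartile and maximum. The sketch below computes one, assuming the "median of each half" quartile convention; other conventions give slightly different quartiles, and nothing here describes how VOPlot itself computes its box plots. The names BoxPlotSketch and fiveNumberSummary are hypothetical.

```java
import java.util.Arrays;

final class BoxPlotSketch {
    // Median of the sorted values in the half-open index range [from, to).
    private static double median(double[] sorted, int from, int to) {
        int n = to - from;
        int mid = from + n / 2;
        return (n % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Returns {min, lower quartile, median, upper quartile, max}.
    // Quartiles are the medians of the lower and upper halves; when the number
    // of values is odd, the middle value is left out of both halves.
    static double[] fiveNumberSummary(double[] data) {
        if (data.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double[] s = data.clone();
        Arrays.sort(s);
        int n = s.length;
        double q1 = median(s, 0, n / 2);
        double med = median(s, 0, n);
        double q3 = median(s, (n + 1) / 2, n);
        return new double[] {s[0], q1, med, q3, s[n - 1]};
    }

    public static void main(String[] args) {
        double[] summary = fiveNumberSummary(new double[] {7, 1, 3, 5, 2, 6, 4});
        System.out.println(Arrays.toString(summary)); // prints [1.0, 2.0, 4.0, 6.0, 7.0]
    }
}
```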

42
VOPlot Features
  • Allows overlaying from multiple catalogues.
  • Allows drawing of error bars.
  • Can be integrated with any web-based catalogue
    service that creates output in VOTable format.
  • Example: successfully integrated with LEDAS (UK).

43
Launch VOPlot from Vizier
44
Launch VOPlot from Vizier (cont.)
45
Launch VOPlot from Vizier (cont.)
Choose Y column
46
VOPlot Inter-operation with Aladin
47
Inter-operation with Aladin (cont.)
Object selected in Aladin.
Point highlighted in VOPlot
48
VOPlot References
  • VOPlot: http://vo.iucaa.ernet.in/voi/voplot.htm
  • Vizier Catalogue Service:
    http://vizier.u-strasbg.fr/viz-bin/VizieR
  • Aladin: http://aladin.u-strasbg.fr/
  • Virtual Observatory India:
    http://vo.iucaa.ernet.in/voi/

49
Mirage Introduction
  • Mirage is a Java-based tool for data
    visualization and exploratory analysis.
  • Developed at Bell Laboratories, Lucent
    Technologies.
  • Support for VOTables added in collaboration with
    Johns Hopkins University.
  • Ray Plante produced an XSL stylesheet to convert
    from VOTable to Mirage format.

50
Mirage Data Visualization Features
  • Mirage provides multiple simultaneous views of
    data.
  • Data visualization is provided through
  • Data matrix view
  • Scatter Plots
  • Histogram.
  • Feature vector plot.

51
Mirage Operations On Graph
  • Points on plot can be selected using different
    selection methods (box, Bezier curve or free
    hand).
  • Selection can be broadcast to other views.
  • Automatic walk-through of graphs.
  • Grid of plots can be created.
  • Coloring of plot points.

52
Mirage Command Interpreter
  • Command line interpreter supports
  • Creation of new columns based on arithmetic
    expressions.
  • Addition of new columns from other data file.
  • Selection of points using a logical expression.
  • Color selection of points.

53
Mirage Data Matrix View
Data read from file
One row for each entry and one column for each
attribute.
54
Mirage Scatter Plot
Walking through a plot.
Choose a column for plotting
55
Mirage Histogram
Change bin width
56
Mirage Multiple Simultaneous Views
Make a selection
Broadcast selection
57
Mirage Grid of plots
Zoom in and out
Command to add column
58
Mirage References
  • Mirage: http://www.bell-labs.com/project/mirage/
  • VO Enabled Mirage:
    http://skyservice.pha.jhu.edu/develop/vo/mirage/
  • XSL Stylesheet to convert VOTable to Mirage
    format:
    http://www.euro-vo.org/pub/articles/ScienceWithProtoVOtools/vot2mirage.xml

59
VOTable Applications: TOPCAT & Treeview
Mark Taylor (m.b.taylor_at_bristol.ac.uk)
60
TOPCAT: Tool for OPerations on Catalogues And
Tables
  • Generic table read/view/edit/analyse/plot/write
  • Format-neutral: VOTable, FITS, SQL and others
  • Pure Java
  • Full online help
  • Powerful expression language
  • Supports large tables
  • Under active development

61
TOPCAT capabilities
  • View/edit table data
  • View/edit table and column metadata: units,
    UCDs, type, array shape, format-specific items...
  • Create new columns using algebraic expressions,
    with a powerful language: conditionals, nulls,
    string handling...
  • Move/delete columns
  • Sort rows
  • Create row subsets in various ways: algebraic,
    graphical, boolean columns, selected rows
  • Plot columns against each other, distinguishing
    different defined subsets
  • Calculate per-column statistics

62
Table I/O
format-neutral view
  • VOTable
  • FITS
  • SQL
  • ASCII
  • LaTeX
  • Mirage
  • others

TOPCAT (view/edit GUI)
  • VOTable (TABLEDATA, FITS, STREAM?)
  • FITS (TABLE, BINTABLE)
  • SQL
  • ...others

Tablecopy (command line)
Tables class library (your code here)
tablecopy -ofmt fits my-votable.xml
64
Treeview
  • Find tables in deep hierarchies
  • Directory trees
  • VOTable hierarchical structure
  • Multi-extension FITS files
  • Compression: .gz, .bz2, .Z
  • Tar, zip, jar files
  • FTP, HTTP
  • Quick preview of VOTables: XML, table data,
    table/column metadata
  • Launch TOPCAT
  • Recent versions of TOPCAT (v0.4) have embedded
    Treeview

65
References
  • TOPCAT: http://www.starlink.ac.uk/topcat/
  • Treeview: http://www.starlink.ac.uk/treeview/
  • Starlink: http://www.starlink.rl.ac.uk/