Title: IAGA2005_IPY VOs
1. IPY Cluster Project 63 "Heliosphere Impact on Geospace" Kick-off Workshop
Finnish Meteorological Institute, Helsinki, Finland, 5-9 February 2007
IPY Data Management: How Can One Deploy a Virtual Observatory in Cyberspace?
Vladimir Papitashvili, Department of Atmospheric, Oceanic and Space Sciences, University of Michigan
2. Search for "Observatory" at Google: about 16,800,000 links!
What makes an observatory active and productive?
What is an observatory? (definition by Wikipedia)
An observatory is a location used for observing
terrestrial and/or celestial events. Astronomy,
astrology, climatology, geology, meteorology,
oceanography and volcanology are examples of
disciplines for which observatories have been
constructed. Historically, observatories were as
simple as containing a sextant (for measuring the
distance between stars) or Stonehenge (which has
some alignments on astronomical phenomena).
An observatory is active and productive because of the very existence of physical objects (stars, galaxies, etc.) to observe or of various fields (temperature, winds, salinity, geomagnetic, etc.) to measure.
3. Data Collection and Dissemination: Since the IGY and Nowadays
1957 to 1990s
- Get data (strip charts and tables) from physical observatories to World Data Centers via snail/air mail
- Get hard copies of received analog data (or sometimes digital media) from WDCs via snail/air mail
- Do eye-ball analysis of these hard copies (or printouts) and get some scientific results, OR do some magic with analog charts and tables, converting them to digital data, then process the latter and get scientific results
1990s - Present
- Send digital data from observatories to research institutions and WDCs on CD-ROMs or via the Internet
- Get digital data from WDCs and other research institutions on CDs or via the Internet and World Wide Web
- Ingest received digital data into a personal workstation; process, visualize, check, correct, put together, do cross-correlation, analysis, etc.
- ... and if you were lucky with the received data, you get some science results that can be shared immediately with other researchers via the Internet and World Wide Web
What is a Virtual Observatory?
Why do we need Virtual Observatories now?
Where are Virtual Observatories in the process of science data collection and dissemination?
How to deploy a Virtual Observatory in cyberspace?
4. Search for "Definition of Virtual Observatory" at Google: about 439,000 links!
What is a Virtual Observatory?
What is the National (Astronomical) Virtual Observatory? http://www.virtualobservatory.org
The NVO is an effort to make all the astronomy data in the world easy to access, using a simple set of web interfaces.
The NVO does not collect any data of its own; instead, it provides the resources to let users search and analyze data that already exists.
5. Search for "Definition of Virtual Observatory" at Google: about 439,000 links!
What is a Virtual Observatory in space physics?
NASA Research Opportunities in Space and Earth Sciences (ROSES) in 2005 and 2006 defines a Virtual Observatory as: a suite of software applications on a set of computers that allows users to uniformly find, access, and use resources (data, software, document, and image products and services using these) from a collection of distributed product repositories and service providers.
A VO is a service that unites services and/or
multiple repositories. (Aaron Roberts, NASA/GSFC)
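The definition above can be illustrated with a minimal sketch: one object that offers a uniform "find" and "access" over several repositories. All class and method names here (`Resource`, `InMemoryRepository`, `VirtualObservatory`, `find`, `access`) are hypothetical and chosen for illustration only; they do not come from ROSES or from any existing VO software, and the in-memory repository merely stands in for a remote product repository.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Resource:
    """A pointer to one product held by one repository (hypothetical type)."""
    repository: str
    name: str


class InMemoryRepository:
    """Stand-in for a remote product repository (assumption: real ones are remote)."""

    def __init__(self, name: str, items: Dict[str, bytes]) -> None:
        self.name = name
        self._items = items

    def find(self, keyword: str) -> List[Resource]:
        # Case-insensitive substring match on product names.
        return [Resource(self.name, n) for n in sorted(self._items)
                if keyword.lower() in n.lower()]

    def get(self, resource_name: str) -> bytes:
        return self._items[resource_name]


class VirtualObservatory:
    """Unites several repositories behind one uniform find/access interface."""

    def __init__(self, repositories: List[InMemoryRepository]) -> None:
        self._repos = {r.name: r for r in repositories}

    def find(self, keyword: str) -> List[Resource]:
        # Query every repository and merge the results.
        hits: List[Resource] = []
        for repo in self._repos.values():
            hits.extend(repo.find(keyword))
        return hits

    def access(self, resource: Resource) -> bytes:
        # Route the request to the repository that holds the product.
        return self._repos[resource.repository].get(resource.name)
```

In this sketch the user never needs to know which repository holds a product: `find` returns pointers, and `access` resolves them.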
6. 20th Century Paradigm of Sharing Data: Data Were Submitted to Data Centers
Why didn't Virtual Observatories exist in the past?
- Data submissions to World Data Centers were and remain voluntary.
- World Data Centers require significant and continuous support (financial and manpower) for data acquisition and storage.
- Many types of collected scientific data are often not suitable for World Data Centers; e.g., the quality of geomagnetic variation data does not satisfy the WDC criteria, set mainly for the standard magnetic observatory data.
Courtesy of the RAND Corporation
- Although at present the World Data Centers
provide most of their data online, they still
constitute a quasi-centralized system of data
collection, storage, and dissemination.
Push Data Concept
7. 21st Century Paradigm: Data Are Published, Visualized, and Shared via the World Wide Web
Why do we need Virtual Observatories now?
- Sharing data via multiple Virtual Observatories allows data providers to achieve greater visibility among scientific user communities.
- This eliminates the need for voluntary data submission to World Data Centers; the latter can pull data from the data providers' Web sites.
- A Fabric of interconnected data nodes (providers or secondary archives) is a new vision for distributed, self-populating data repositories.
- Being integrated in this Data Fabric, World Data Centers will play an even more important role: as clearinghouses, they would need to watch the ever-evolving Data Fabric and preserve at least 2-3 copies of a particular dataset across the global network of data.
Courtesy of the RAND Corporation
Pull Data Concept
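The clearinghouse role described above (watching the Data Fabric and keeping at least 2-3 copies of each dataset) could be sketched as a simple replica check. This is only an illustration under stated assumptions: the function name `under_replicated`, the `holdings` mapping of node name to dataset names, and the example node and dataset names are all hypothetical.

```python
from typing import Dict, List, Set


def under_replicated(holdings: Dict[str, Set[str]], min_copies: int = 2) -> List[str]:
    """Return the datasets held by fewer than `min_copies` nodes, sorted by name.

    `holdings` maps a node name to the set of dataset names it currently serves
    (assumed to come from a periodic survey of the Data Fabric).
    """
    counts: Dict[str, int] = {}
    for datasets in holdings.values():
        for dataset in datasets:
            counts[dataset] = counts.get(dataset, 0) + 1
    return sorted(name for name, n in counts.items() if n < min_copies)


# Hypothetical snapshot of the fabric: two provider nodes and one data center.
snapshot = {
    "node_a": {"thl_2007", "nrd_2007"},
    "node_b": {"thl_2007"},
    "wdc": {"thl_2007"},
}
needs_copy = under_replicated(snapshot, min_copies=2)
```

A clearinghouse would then pull each dataset in `needs_copy` from the node that still holds it and archive a further copy.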
8. Conditions Required for Deployment of Physical and Virtual Observatories
Where are Virtual Observatories in the process of
science data collection and dissemination?
For Physical Observatories: The very existence of a physical fabric of objects allows us to observe them; a similar fabric of fields allows us to measure them.
For Virtual Observatories: As providers put data on the Web, they weave a fabric of data in cyberspace, which allows us to collect, visualize, and interpolate/manipulate (or model) virtual data over the areas where NO physical observations are available.
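As a rough illustration of producing "virtual data" where no physical observation exists, the sketch below interpolates a value between stations by inverse-distance weighting. The slides do not name an interpolation method, so this choice is an assumption, as are the function name `idw`, the planar (x, y) coordinates, and the default `power` parameter.

```python
import math
from typing import List, Tuple


def idw(stations: List[Tuple[float, float, float]],
        x: float, y: float, power: float = 2.0) -> float:
    """Estimate a value at (x, y) from (x, y, value) station triples.

    Each station is weighted by 1 / distance**power; planar coordinates are
    assumed, which is a simplification for real geographic data.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The point coincides with a station: return its measurement.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Here a point midway between two stations receives the mean of their two values, and a point at a station returns that station's own measurement.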
9. Where are Virtual Observatories in the process of science data collection and dissemination?
Present (1990s - Present)
- Send digital data from observatories to research institutions and WDCs on CD-ROMs or via the Internet
- Get digital data from WDCs and other research institutions on CDs or via the Internet and World Wide Web
- Ingest received digital data into a personal workstation; process, visualize, check, correct, put together, do cross-correlation, analysis, etc.
- ... and if you were lucky with the received data, you get some science results that can be shared immediately with other researchers, again via the Internet and World Wide Web
Future
- Internet/World Wide Web-based Data Location and Discovery
- Web-Based Data Acquisition
- Format Conversion (if needed)
- Immediate Data Visualization
- Data Management and Integration
- Data Virtualization and Modeling
Global Data Fabric makes Virtual Observatories real
10. Virtual Observatory as a Tool to Access the IPY Global Data Fabric
A Virtual Observatory Template
Layers, from highest to lowest:
- Integrated Visualization Layer (highest level of data analysis): IDL, MATLAB, Simulink
- Flat File Manager
- FORMAT CONVERSION (A2F): ASCII to Flat File Format Layer, to ingest downloaded data into the Web-based Portal database or into individual Data Nodes
- DATA ACQUISITION via FTP, SSL, XML, HTTP, OPeNDAP: Data Acquisition Layer retrieves data from World Wide Web sites
- LOCATION DISCOVERY (Look-up Tables and Web Crawler): lowest layer, look-up and discovery modules
This template can manage data time series, data
profiles, and sets of images
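The lower layers of the template can be sketched as a small pipeline: look-up, acquisition, and ASCII-to-flat-file conversion. This is a minimal illustration, not the portal's actual code. The look-up table contents, the example URL, the function names, and the binary record layout (two little-endian doubles per record) are all assumptions. The network fetch is passed in as a function so the sketch runs without a connection.

```python
import struct
from typing import Callable, Dict, List, Tuple

# Hypothetical look-up table: station code -> URL of an ASCII data file.
LOOKUP_TABLE: Dict[str, str] = {
    "THL": "http://example.org/data/thl.txt",
}


def discover(station: str) -> str:
    """Location discovery layer: resolve a station code to a data URL."""
    return LOOKUP_TABLE[station]


def acquire(url: str, fetch: Callable[[str], str]) -> str:
    """Data acquisition layer: retrieve raw ASCII text using the given fetcher."""
    return fetch(url)


def ascii_to_flat(text: str) -> bytes:
    """Format conversion layer: pack 'time value' lines into fixed-size records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        time_str, value_str = line.split()
        records.append(struct.pack("<dd", float(time_str), float(value_str)))
    return b"".join(records)


def read_flat(blob: bytes) -> List[Tuple[float, float]]:
    """Flat file manager: unpack the binary records into (time, value) pairs."""
    return list(struct.iter_unpack("<dd", blob))
```

A real fetcher would wrap an HTTP or FTP client. For a dry run, any function that maps a URL to text will do, for example `lambda url: "0 10.5\n60 11.0\n"`.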
11. Virtual Observatory Framework (see examples at http://mist.engin.umich.edu)
- A Web Portal (two-component middleware): secure, scalable, platform-independent, and user-friendly Java-based middleware provides remote access to the Portal's Flat File Manager, which accesses local (portal's) and remote databases.
Users
Portal
- A standalone, self-organizing Data Node (single-component): a complementary software package to create, populate, and manage users' Data Nodes, the building elements of the Global Data Fabric.
12. Working Example: Virtual Global Magnetic Observatory
http://mist.engin.umich.edu
13. Summary
- Existing World Data Centers continue to serve the worldwide scientific community in providing free access to global geophysical databases.
- Recently many digital datasets have been placed on the World Wide Web, often in near-real time, but some of these data will never be submitted to any of the existing data centers.
- Within the framework of the Electronic Geophysical Year, we formulated a concept of the Global Data Fabric, which is populated by independent Data Nodes accessed via discipline-oriented Virtual Observatories deployed in cyberspace.
14. Summary (contd)
- We postulate that a Virtual Observatory can be deployed in cyberspace only if a discipline-specific data infrastructure (primitive or sophisticated) is made available electronically.
- Thus, if the discipline-specific Data Fabric makes itself available in cyberspace via FTP, SSL, or HTTP ports for search and retrieval, then access to collected scientific data becomes trivial through developing software/middleware packages, installed either at a single server (data portal) or at a number of personal computers (data nodes).
- Generally speaking, the VO/DF coupled pair described here constitutes a very simple and easy-to-implement approach. Data accessed via that approach need not be scientific; the proposed concept would work well for Virtual Corporation or Virtual Retailer data fabrics.