Title: Technology Drivers for the Virtual Observatory
1 Technology Drivers for the Virtual Observatory
Robert J. Hanisch
Space Telescope Science Institute
Baltimore, Maryland
2 Virtual Observatory Paradigm
- The Digital Universe requires a new kind of observatory
- The Virtual Sky comprising the distributed, networked collection of digital data archives and catalogs
- A Virtual Telescope that gathers the data (identifies data providers, distributes user queries, assembles and organizes retrievals)
- Virtual Instruments that process the data (statistical and analytical algorithms, cross-correlation, visualization tools)
3 What is the VO? Content
4 What is the VO? Components
5 VO Information Technology Components
- Data management system
- Common data access layer
- Supports pipeline processing, archiving, retrieval
- Information management system
- General query and updating of attributes of billions of objects
- Metadata management
- Knowledge support system
- Correlation, visualization, statistical comparison
- Catalog data and original image pixel data
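The correlation step named for the knowledge support system can be illustrated with a positional cross-match between two catalogs. The following is a minimal sketch, not the VO's method: the function names, the dictionary layout (`id`, `ra`, `dec` in degrees), the match radius, and the sample coordinates are all assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_arcsec=1.0):
    """Pair each source in catalog_a with its nearest catalog_b source within the radius.

    Brute force, O(N*M): adequate for a demonstration only.
    """
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for a in catalog_a:
        best, best_sep = None, radius_deg
        for b in catalog_b:
            sep = angular_separation_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= best_sep:
                best, best_sep = b, sep
        if best is not None:
            # Report the separation in arcseconds alongside the paired identifiers.
            matches.append((a["id"], best["id"], best_sep * 3600.0))
    return matches


# Hypothetical sample catalogs (made-up coordinates).
optical = [{"id": "opt-1", "ra": 10.0000, "dec": -5.0000},
           {"id": "opt-2", "ra": 150.0000, "dec": 2.0000}]
xray = [{"id": "x-1", "ra": 10.0002, "dec": -5.0001},
        {"id": "x-2", "ra": 200.0000, "dec": 40.0000}]

matches = cross_match(optical, xray, radius_arcsec=2.0)
for a_id, b_id, sep in matches:
    print(f"{a_id} <-> {b_id}: {sep:.2f} arcsec")
```

A brute-force loop like this does not scale to billions of objects; at that size the match would need a spatial index, which is the efficient-indexing challenge the deck lists.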
6 VO Information Technology Challenges
- Data federation and fusion
- Caching and data replication services
- Efficient indexing
- Software agents
- Metadata standards and protocols
- Metadata semantics
- Syntax for metadata exchange (XML)
- Translation, conversion, common API
- Integration with FITS
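As an illustration of XML as a syntax for metadata exchange, the sketch below serializes a small metadata record to XML and parses it back with Python's standard library. The element names (`dataset`, `archive`, `waveband`, `ra_deg`, `dec_deg`) and the record contents are hypothetical and do not represent any agreed VO schema.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record):
    """Serialize a flat metadata dictionary as child elements of <dataset>."""
    root = ET.Element("dataset")
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(text):
    """Parse the XML back into a dictionary; all values come back as strings."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


# Hypothetical metadata record for one archived dataset.
record = {"archive": "ExampleArchive", "waveband": "optical",
          "ra_deg": 10.5, "dec_deg": -5.25}

xml_text = record_to_xml(record)
round_trip = xml_to_record(xml_text)
print(xml_text)
print(round_trip)
```

The round trip returns every value as a string, which hints at why metadata semantics appear as a separate challenge from syntax: the exchange format alone does not say that `ra_deg` is a number in degrees.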
7 VO Information Technology Challenges
- Scalability
- Huge and growing datasets
- Federations and heterogeneity of huge datasets
- Accommodation for various network bandwidths
- Accommodation for computationally intensive applications
- Support for high dynamic range in usage profiles
- Data Understanding
- Data mining and statistical analysis
- Self-classification, visualization-assisted classification
- Correlation studies, non-parametric estimation
8 VO Conceptual Architecture
[Diagram: Astronomer, User Interface, and five Data sources]
9 VO Conceptual Architecture
[Diagram: Astronomer, User Interface, and five Data sources. Response shown: "There are 125,852 galaxies which satisfy these criteria. What now?"]
10 Initiating VO: Present Context
- A strong foundation for building VO is in place
- Existing data centers and supercomputer centers
- HEASARC, IPAC/IRSA, STScI/HST/MAST, CXC, CADC
- NPACI, NCSA
- Active community efforts on frontier IT problems
- Astronomical information services
- ADS, NED, SIMBAD
- Data analysis software
- AIPS, IRAF, IDL, FTOOLS, Skyview, SExtractor
11 NASA's Astrophysics Data Services
... Infrared, Optical, Ultraviolet, EUV, X-ray, Gamma-ray
15 Initiating VO: Key Elements
- Links with information systems research
- VO pushes technical limits in many key areas
- storage technology
- information management
- data handling
- distributed and parallel computing
- high speed networking
- data visualization and data mining
- Challenges similar in other sciences and the private sector
- Partnerships among universities, NSF, NASA, foreign observatories and agencies, and industry essential
16 Advanced Hardware
- Large distributed database engines with Gbyte/sec aggregate I/O speed
- High speed (>10 Gbits/s) backbones cross-connecting the major archives and databases
- Scalable computing environment with hundreds of CPUs for statistical analysis and discovery
- Petabyte mass storage capability
17 Data Volume and Bandwidth
[Figure: typical query response volume and minimum required interconnect bandwidth]
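The link between query response volume and required interconnect bandwidth comes down to simple arithmetic: bandwidth is volume divided by the acceptable delivery time. The sketch below works through that relation; the function names and the sample volumes and times are illustrative assumptions, not figures from the presentation.

```python
def required_bandwidth_gbits_per_s(volume_gbytes, response_time_s):
    """Minimum sustained bandwidth needed to deliver a response within the given time."""
    if response_time_s <= 0:
        raise ValueError("response_time_s must be positive")
    return volume_gbytes * 8 / response_time_s  # 8 bits per byte


def transfer_time_s(volume_gbytes, bandwidth_gbits_per_s):
    """Time to move a response of the given size over a link of the given speed."""
    if bandwidth_gbits_per_s <= 0:
        raise ValueError("bandwidth_gbits_per_s must be positive")
    return volume_gbytes * 8 / bandwidth_gbits_per_s


# Assumed example values: a 100 Gbyte response delivered in 80 seconds,
# and a 1000 Gbyte (1 Tbyte) response over a 10 Gbits/s backbone.
needed = required_bandwidth_gbits_per_s(100, 80)
duration = transfer_time_s(1000, 10)
print(f"Needed bandwidth: {needed} Gbits/s")
print(f"Transfer time: {duration} s")
```

These figures ignore protocol overhead and contention, so real links would need headroom above the computed minimum.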
18 New Analysis Tools
- Discover new patterns through advanced statistical methods and visualization techniques
- in catalogs
- in image databases
- by comparison of catalogs and image databases
- Confront catalogs and image databases with numerical simulations of astrophysical systems
- Uploadable application packages to minimize data transfer
19 VO Education and Public Outreach
- VO can enable broad public access to panchromatic images and digital movies of the changing sky
- The compelling nature of these images can be used to capture public interest and advance science literacy
- Students can take and analyze data using applet software developed for web browsers
- Superb resource for designing hands-on curriculum modules
20 Summary
- The Virtual Observatory initiative
- Ensures an evolutionary, cost-effective exploitation of the discovery potential of large, multi-wavelength databases
- Develops the protocols, software tools and management approaches necessary to carry out surveys at acceptable cost
- Maximizes the impact of survey efforts through broad public access
- Coordinates development of tools to query and analyze databases
- Ensures that the astronomical community has the proper network and hardware infrastructure to carry out its science
21 A Virtual Observatory for Data Exploration and Discovery can be the catalyst for a New Astronomy