Space Science Archives at NASA - PowerPoint PPT Presentation

Slides: 30
Provided by: osti

Transcript and Presenter's Notes

1
Space Science Archives at NASA
  • Jeffrey Hayes
  • Science Mission Directorate, NASA HQ
  • July 15, 2004

2
The funding agencies and muses
How we really work! (Note where Urania is)
3
Space Science Archives
  • The idea of archiving data is old in the NASA
    science community. Data are the only legacy a
    mission has once it has ended.
  • Because data was being archived in a haphazard
    manner early on, it was decided in the late
    1960s to form a central archive to capture all
    Space Science data -- the National Space Science
    Data Center (NSSDC) at Goddard Space Flight
    Center (GSFC). This is still the deep archive
    for all NASA Space Science missions.

4
Space Science Archives
  • Over the years the NSSDC evolved to include not
    only Space Science data, but large ground-based
    Astronomical catalogues needed by both the NASA
    and the general science communities.
  • All data were available on request.
  • All data were maintained on (by today's
    standards) very primitive media (e.g. punch
    cards, 7-track magnetic tape).
  • This led to questions of accessibility and
    usefulness.

5
Standards
  • For data to be useful, it must conform to a form
    that most of the community agrees to, and can
    understand.
  • Catalogue data was standardized on the
    80-character line of data (a holdover from punch
    cards). There were limitations, because some
    records were longer than 80 characters. Image
    data was essentially photographic prints.
    Spectroscopy was in lists of lines and
    wavelengths.
  • All of these formats had severe limitations for
    transport and further analysis.

6
Standards
  • The Flexible Image Transport System (FITS)
    standard was developed in the late 1970s and
    published in 1981 (Wells et al.). This was
    a self-describing data structure that allowed for
    the storage of data on a computer as a file with
    an embedded header.
  • Quickly became the standard in the astronomical
    and parts of the Solar physics communities, and
    with substantial modifications, is still the
    standard.
  • All FITS data can be read by all FITS readers
    (with provisos).
  • Became a NASA standard for data in 1999.
  • New standards are coming on-line (XML, VOTable).
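
To make "self-describing" concrete, here is a minimal sketch in Python of how a FITS-style embedded header can be built and read back. It relies only on the basic layout of the standard: 80-character ASCII header records ("cards"), an END card, and padding to 2880-byte blocks. The helper names (make_card, build_header, parse_header) and the demo values are assumptions made for illustration, and the parser handles only integer and logical values, not the full FITS grammar.

```python
CARD_LEN = 80      # every FITS header record ("card") is 80 ASCII characters
BLOCK_LEN = 2880   # headers and data are padded to multiples of 2880 bytes


def make_card(keyword, value, comment=""):
    """Format one simplified fixed-format header card (hypothetical helper)."""
    if isinstance(value, bool):
        text = "T" if value else "F"
    else:
        text = str(value)
    # Keyword in columns 1-8, "= " in columns 9-10, value right-justified.
    card = f"{keyword:<8}= {text:>20}"
    if comment:
        card += f" / {comment}"
    return card.ljust(CARD_LEN)[:CARD_LEN]


def build_header(cards):
    """Join cards, append END, and pad with spaces to a full 2880-byte block."""
    raw = "".join(cards) + "END".ljust(CARD_LEN)
    padding = (-len(raw)) % BLOCK_LEN
    return (raw + " " * padding).encode("ascii")


def parse_header(raw):
    """Return {keyword: value} for the value cards that precede END."""
    header = {}
    for start in range(0, len(raw), CARD_LEN):
        card = raw[start:start + CARD_LEN].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY, or blank card: no value to read
        value = card[10:].split("/", 1)[0].strip()
        if value in ("T", "F"):
            header[keyword] = value == "T"
        else:
            header[keyword] = int(value)  # this sketch handles integers only
    return header


# Demo values are made up; a real file would follow the header with the data.
demo_header = build_header([
    make_card("SIMPLE", True, "conforms to FITS"),
    make_card("BITPIX", 16, "bits per pixel"),
    make_card("NAXIS", 2, "number of axes"),
    make_card("NAXIS1", 512),
    make_card("NAXIS2", 256),
])
parsed = parse_header(demo_header)
n_pixels = parsed["NAXIS1"] * parsed["NAXIS2"]
data_bytes = n_pixels * abs(parsed["BITPIX"]) // 8
print(parsed, data_bytes)
```

Because the header states the pixel size and array dimensions, a reader can work out how many data bytes follow without any outside documentation. In practice an established FITS library would be used rather than hand-written parsing.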

7
Archival Evolution
  • The NSSDC was a great idea, but the advances in
    computer science made it out of date.
    Astronomers wanted access to digital data that
    was straightforward and did not require
    translation from one data format to another.
    There was also a desire for rapid dissemination
    of that data other than by US Mail.
  • The growth in the Internet and the decline of
    computer hardware prices made this possible.

8
Archival Evolution
  • In the mid 1980s NASA tried to develop the
    Astrophysics Data System (ADS), which would
    combine all astronomical data. It was a failure
    because there was not enough compute power or
    network capacity for such a system. (A sideline to
    the ADS work was digitizing all the astronomical
    literature, and the ADS is now the premier site
    for such data in the world. All scanned papers
    are in PDF or JPEG formats.)
  • In 1990 NASA started the HEASARC with a more
    limited view of systematically archiving all
    High Energy Astrophysics data. This was very
    successful and became the model for other space
    science archives.

9
Active Archives
  • HEASARC was the first of a series of active
    archives that allowed the community to interact
    with the data by allowing downloads of
    post-Level 1 data sets. The archive model now
    consists of scientists who maintain the integrity
    of the data, and also develop new and better
    tools for the manipulation and analysis of these
    data.
  • The model has been generalized across wavelength
    regimes: HEA -- HEASARC; UV/Optical -- MAST;
    IR/sub-mm -- IRSA; CMB -- LAMBDA; etc.
  • The planetary sciences have the Planetary Data
    System (PDS), which parallels the Astronomy
    archives, but with differences (e.g. it includes
    FITS as well as other standards, and its nodes
    are based on science discipline type).

10
Philosophy
  • The last decade has seen phenomenal growth in
    the power and capability of computers. This has
    allowed for the rapid evolution of archives and
    their ability to respond to the community's
    needs.
  • We have moved from a main-frame, static archive
    philosophy, to one that is more mobile and
    dynamic and evolves through feedback with more
    sophisticated data products.
  • We have moved away from simple curation to
    managing the data. Data must now be migrated
    from one medium to another with a reasonable plan
    on how to do it.
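
One ingredient of a migration plan is checking that each copy is bit-identical to the original before the old medium is retired. The following is a minimal sketch in Python, using a checksum comparison; the function names, file names, and the choice of SHA-256 are assumptions for illustration, not a description of how any NASA archive actually migrates data.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large data sets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, destination):
    """Copy source to destination and confirm the copy is bit-identical."""
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"checksum mismatch migrating {source}")
    return expected  # keep this with the archive's records for later audits


# Stand-in files in a temporary directory play the role of the two media.
with tempfile.TemporaryDirectory() as workdir:
    old_medium = Path(workdir) / "old_medium.dat"
    new_medium = Path(workdir) / "new_medium.dat"
    old_medium.write_bytes(b"example science data\n" * 1000)
    recorded_checksum = migrate(old_medium, new_medium)
    copy_matches = new_medium.read_bytes() == old_medium.read_bytes()
```

Storing the returned checksum alongside the data lets the same check be repeated at the next migration.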

11
Philosophy
  • It is NASA policy that all science data come into
    the public domain as soon as possible. The
    researcher signs an agreement that all data taken
    by a NASA mission will be archived and after a
    suitable length of time (within 6 to 12 months
    from the date of observation/data acquisition),
    the data becomes publicly accessible. There are
    very few exceptions to this rule.

12
Hardware
  • Hardware is now cheap. Both memory and disks are
    to the point where it is logical and practical to
    spin all data possible and make it accessible via
    the Web to all users.
  • Less than 10 years ago, we were still using
    9-track tapes, some Exabytes, some DAT, and M/O
    disks. Little data was spinning. Now with the
    advent of RAID disks on the TB scale, flash
    drives on the GB scale, and laptops with G-flop
    compute power, the only problem is bandwidth.
    That too is getting cheaper and more accessible.
  • The NASA archives are trying to keep abreast of
    all these developments.

13
Software
  • The archiving tools used are usually COTS
    products. It is not cost effective to develop
    entire stand-alone SQL systems for archives. The
    customization is in adapting their use in an
    astronomical context. However as the evolution
    of database management continues, we are seeing a
    tremendous flexibility in how these data can be
    managed.
  • The compute power also allows for the development
    of very sophisticated data imaging, manipulation,
    and analysis tools. This is now considered to be
    within the purview of the active science archives.

14
Management
  • Day-to-day operations are managed at the active
    archive level by a scientist responsible to NASA
    HQ for the assigned activities.
  • There are biannual meetings of a coordinating
    committee (ADEC) with NASA HQ having ex-officio
    membership.
  • The active Astronomy archives are peer-reviewed
    every 4 years in the Astronomy Senior Review
    process. On-going and new activities are
    proposed and judged. Funding can be reallocated
    as needed.

15
Operational Archives
  • The suite of NASA Space Science archives
    currently consists of:
  • 1 deep archive: NSSDC
  • 7 active archives: HEASARC, MAST, IRSA, LAMBDA,
    MSC, PDS, SSDC
  • 2 data services: ADS, NED
  • 3 on-going Great Observatory missions have
    stand-alone archives which are associated with
    the above active archives: HST, CXO, SIRTF

16
Interoperability
  • There is a movement in the astronomical community
    to have access to data from multiple wavelength
    regimes in order to cross-correlate them. The
    Space Science archives are now working on a plan
    to implement such interoperability (the NVO by
    another name). In addition, we want to
    incorporate both theory and modeling data in the
    infrastructure.
  • This promises a huge leap in science by using the
    Internet, Grid technologies, and very fast
    computing techniques.
  • NASA is working on a response to the white paper
    produced by the archives.
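
The core operation behind cross-correlating data from different wavelength regimes is a positional cross-match: pairing sources from two catalogues that lie within a small angular distance of each other on the sky. Below is a minimal sketch in Python; the catalogue entries are made-up positions, the function names are hypothetical, and the brute-force loop is only suitable for small lists (large archives would use a spatial index instead).

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_arcsec):
    """Pair each source in catalog_a with its nearest catalog_b source in range."""
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for name_a, ra_a, dec_a in catalog_a:
        best = None
        for name_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            # Report the separation in arcseconds.
            matches.append((name_a, best[0], best[1] * 3600.0))
    return matches


# Made-up positions (name, RA in degrees, Dec in degrees), for illustration only.
xray_sources = [("X1", 10.0000, -5.0000), ("X2", 150.2500, 2.2000)]
infrared_sources = [("IR1", 10.0003, -5.0002), ("IR2", 200.0000, 45.0000)]
pairs = cross_match(xray_sources, infrared_sources, radius_arcsec=2.0)
print(pairs)
```

Here only X1 and IR1 fall within the 2 arcsecond radius, so they form the single matched pair; X2 has no counterpart in the second list.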

17
Historical Data
  • One last point: What to do with old, pre-digital
    data?
  • Photographic plates exist in huge numbers and are
    both fragile and of finite lifetime. Harvard and
    Caltech are digitizing their plate collections,
    but most other institutions cannot because of the
    cost.
  • Do we accept the loss of such historical data, or
    do we collect only those collections at large or
    national observatories which have a uniform
    pedigree?
  • Should data have expiration dates? (Like milk.)

18
NASA Astronomy Archives
  • Backup Material

19
Data Archive Centers: LAMBDA
  • Legacy Archive for Microwave Background Data
    Analysis (LAMBDA)
  • http://lambda.gsfc.nasa.gov/
  • One Stop Shopping for CMB Researchers
  • Contains Cosmic Microwave Background data and
    data products from WMAP, COBE, IRAS, SWAS
    missions; related software (CMBFAST, HEALPix,
    etc.); and archives of news and science papers.

20
Data Archive Centers: IRSA
  • NASA/IPAC InfraRed Science Archive (IRSA)
  • http://irsa.ipac.caltech.edu/
  • Archive node for scientific data sets from
    NASA's infrared and sub-millimeter astronomy
    projects and missions
  • Contains data from 2MASS, IRAS, MSX, SWAS, ISO,
    Spitzer, and related inventory, software, and
    data exploration services.

21
Data Archive Centers: MAST
  • Multimission Archive at Space Telescope (MAST)
  • http://archive.stsci.edu/
  • Supports a variety of astronomical data
    archives, with the primary focus on
    scientifically related data sets in the optical,
    ultraviolet, and near-infrared parts of the
    spectrum
  • Contains data and data products from HST, FUSE,
    IUE, EUVE, ASTRO, HUT, UIT, WUPPE, and others
  • Also, catalogues and surveys from GALEX, SDSS,
    GSC, DSS, VLA-FIRST, relevant software (STSDAS),
    etc.

22
Data Archive Centers: HEASARC
  • High Energy Astrophysics Science Archive Research
    Center (HEASARC)
  • http://heasarc.gsfc.nasa.gov/
  • An archive of astronomy data from extreme
    ultraviolet, X-ray, and gamma-ray observatories

  • Contains data from ASCA, BeppoSAX, CGRO, Chandra,
    EUVE, HETE-2, Integral, ROSAT, RXTE, XMM-Newton,
    and others. In the future will serve data from
    Astro-E2 and Swift. Also multi-mission software
    and analysis tools, and information for educators
    and the public

23
Data Archive Centers: CXC
  • Chandra X-ray Center (CXC)
  • http://chandra.harvard.edu/
  • Center for Chandra science and calibration data,
    proposer information, data analysis software
    assistance, public information and education
    resources.

24
Services: ADS and NED
  • NASA Astrophysics Data System (ADS)
  • http://adswww.harvard.edu/
  • The main body of data in the ADS consists of
    bibliographic records searchable through database
    queries, and full-text scans of much of the
    astronomical literature.
  • NASA/IPAC Extragalactic Database (NED)
  • http://nedwww.ipac.caltech.edu/
  • Database built around a master list of
    extragalactic objects: bibliographic references,
    photometry, position and redshift data, etc.

25
Data Archive Centers: MSC
  • Michelson Science Center (MSC)
  • http://msc.caltech.edu/
  • Science operations and analysis service
    organization for selected NASA Origins Theme
    projects - software infrastructure, science ops
    and consulting to Navigator Program projects and
    their user communities
  • Up-and-coming archive for data from the Palomar
    Testbed Interferometer, Keck Interferometer, SIM,
    and TPF.

26
Data Archive Centers: NSSDC
  • National Space Science Data Center (NSSDC)
  • http://nssdc.gsfc.nasa.gov/
  • The NSSDC is responsible for the long term
    archiving and preservation of all space science
    data - provides a permanent archive for OSS data
    (for space physics, solar physics and
    planetary/lunar, as well as astrophysics)
  • Relatively recent data is held on CD-ROMs; older
    astrophysics datasets are available on offline
    media.

27
Data Archive Centers: SSDC
  • Solar Science Data Center (SSDC)
  • http://ssdc.gsfc.nasa.gov/
  • Provides a permanent archive for Solar data (for
    space physics, solar physics as well as upper
    atmospheric physics)
  • Relatively recent data is held on CD-ROMs; older
    astrophysics datasets are available on offline
    media.
  • Data is in mixed formats: mainly FITS for
    imaging, but HDF used for spectra.
  • Colocated with NSSDC at GSFC.

28
Data Archive Centers: PDS
  • Planetary Data System (PDS)
  • http://pds.jpl.nasa.gov/
  • Provides active archive for various aspects of
    planetary mission data. Unlike other centers, it
    is discipline specific, not wavelength specific
    (e.g. rings, small bodies, satellites, etc.)
  • Relatively recent data is held on CD-ROMs; older
    datasets are available on RAID disks or on
    offline media.
  • Data is in mixed formats: some FITS for imaging,
    but mainly HDF of various flavors for other
    data.
  • Various discipline nodes across the country with
    central coordination at JPL Central Node.

29
Data Archives Funding Levels
  • NASA archive funding levels in FY04 (in $M)