CTIO Dark Energy Camera: Scientific Data Management - PowerPoint PPT Presentation

Provided by: raypl
Learn more at: https://decam.fnal.gov

1
CTIO Dark Energy Camera Scientific Data
Management
Ray Plante (Applications Technology, NCSA)
2
Scientific Data Management at NCSA
  • Integrating hardware and software technologies with
    scientific expertise into a system that meets
    scientific goals
  • Requires the collaboration of computer scientists
    and astronomers
  • Astronomy data laboratory at NCSA
  • Build on our experience in data management
    and processing
  • Environment where HP technologies and expertise
    can be readily assembled to tackle challenging
    data problems
  • Center for cultivating innovation from the
    scientific community

3
Astronomical Data at NCSA: BIMA Data Archive and
Pipeline
  • Archive has been operating and delivering data
    over the web since 1994.
  • Architecture
  • Data transferred in near real-time
  • Long-term storage: NCSA Mass Storage System
  • Local disk cache of most recently produced and
    accessed data
  • PostgreSQL, XML for managing metadata
  • Organized into PI-oriented collections
    (reflecting GO model)
  • Current size > 1 TB
  • BIMA system is the basis for CARMA (2004-5)
  • Automated Pipeline Processing
  • As data arrives from telescope
  • Uses NCSA resources to process data
  • Products returned to the archive for distribution
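
The architecture above pairs long-term mass storage with a local disk cache of the most recently produced and accessed data. A minimal sketch of that caching idea follows, assuming a least-recently-used eviction policy (the slides do not say which policy the BIMA archive uses). The class name `DiskCache` and the `fetch_from_mass_storage` callable are hypothetical, introduced only for illustration.

```python
from collections import OrderedDict


class DiskCache:
    """Toy model of a local cache in front of a mass storage system.

    Assumption: least-recently-used eviction. Names are hypothetical.
    """

    def __init__(self, capacity_bytes, fetch_from_mass_storage):
        self.capacity_bytes = capacity_bytes
        # Hypothetical callable: dataset name -> bytes from long-term storage.
        self.fetch = fetch_from_mass_storage
        self.entries = OrderedDict()  # name -> bytes, least recent first
        self.used_bytes = 0

    def _store(self, name, data):
        if name in self.entries:
            self.used_bytes -= len(self.entries.pop(name))
        self.entries[name] = data
        self.used_bytes += len(data)
        # Evict least recently used entries until the cache fits again,
        # always keeping the entry that was just stored.
        while self.used_bytes > self.capacity_bytes and len(self.entries) > 1:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)

    def put(self, name, data):
        """Register newly produced data (e.g. a fresh pipeline product)."""
        self._store(name, data)

    def get(self, name):
        """Return data from the cache, falling back to mass storage."""
        if name in self.entries:
            self.entries.move_to_end(name)  # mark as most recently accessed
            return self.entries[name]
        data = self.fetch(name)
        self._store(name, data)
        return data
```

In this sketch a cache miss is the only path that touches long-term storage, so recently produced and recently requested data are both served from local disk.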

4
Astronomical Data at NCSA: The Data Laboratory
  • Other projects
  • CARMA Archive and Pipeline
  • Astronomical Digital Image Library (ADIL)
  • A public repository for FITS Images since 1995
  • Archive mirrors and value-added processing
  • VLA Archive
  • Optical surveys: DSS, Quest2, Sloan
  • NOAO pilot project for mirroring and processing
  • NCSA/UIUC is partnering on the LSST proposal
  • The Data Laboratory
  • An environment for cultivating innovation from
    the community

5
CTIO Dark Energy Camera
  • Observing Modes
    • Primarily survey mode
      • Uniform products
      • Automated processing
    • GO mode
      • Non-uniform products
      • User-guided processing
      • Some processing shared from survey processing
        pipeline?

6
Scientific Data Management Issues for the Dark
Energy Camera
  • Long-term preservation of data
  • Strategic deployment of primary and secondary
    storage
  • Metadata management and curation
  • Data access for pipelines
  • Data distribution to users
  • VO integration

7
Issues in Data Management: Preservation
  • How do you ensure usability beyond the funded
    life of the project?
  • Continuously evolving storage architecture
  • NCSA Mass Storage System: 3rd generation since
    1994
  • Databases: importance of employing standards
  • Low-maintenance curation
  • Careful use of standard data formats
  • Software
  • OTS solutions vs. custom solutions
  • Commercial vs. open

8
Issues in Data Management: Metadata Management
  • Traditional problem: sufficient description to
    • drive processing
    • support searching and browsing
  • Large collections can't afford human
    intervention
  • Greater detail needed
  • Key to future reprocessing
  • Automated generation of metadata can be expensive
    in development cost (more for GO mode)
  • Automated Processing requires unified approach
  • Common methods for accessing
  • Maximizing the value of a large, uniform
    collection
  • Capturing errors, quality, astrometry, filter
    characteristics
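
As a rough illustration of the "sufficient description" problem above, the sketch below models one metadata record and checks it against two purposes: driving processing and supporting search. Every field name and both required-field lists are assumptions made for this example; the slides do not define a schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExposureMetadata:
    """Hypothetical per-exposure record; field names are illustrative only."""
    exposure_id: str
    filter_name: Optional[str] = None
    ra_deg: Optional[float] = None
    dec_deg: Optional[float] = None
    astrometric_rms_arcsec: Optional[float] = None
    quality_flag: Optional[str] = None


# Assumed minimum descriptions for each purpose (not taken from the slides).
REQUIRED_FOR_PROCESSING = ("filter_name", "ra_deg", "dec_deg")
REQUIRED_FOR_SEARCH = REQUIRED_FOR_PROCESSING + (
    "astrometric_rms_arcsec",
    "quality_flag",
)


def missing_fields(record, required):
    """List the required fields that the record has not filled in."""
    return [name for name in required if getattr(record, name) is None]


record = ExposureMetadata(exposure_id="exp-0001", filter_name="r",
                          ra_deg=52.5, dec_deg=-28.1)
print(missing_fields(record, REQUIRED_FOR_PROCESSING))
print(missing_fields(record, REQUIRED_FOR_SEARCH))
```

A check like this can run automatically at ingest, which matters when the collection is too large for human inspection of each record.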

9
Metadata Management and Curation: Opportunity for
Collaboration
  • Distributed Expertise
  • NCSA: broad experience in managing astronomical
    archives
  • Illinois and Fermilab: strong interests in science
  • Fermilab: strong pipeline development and
    management
  • NOAO: interest in long-term curation and service
    to larger community
  • Technical side
  • Databases, web-based browsing, data distribution,
    metadata-driven processing, VO services
  • Science side
  • Data product planning, catalog schemas, error
    characterization
  • Community side
  • Collection schemas, data services development, VO
    exposure

10
Collaboration Models
  • NVO/IVOA
  • Task-oriented working groups with distributed
    participation
  • Tasks led by individual institutions
  • CARMA
  • WBS centered at consortium institution
  • Coordinated design across consortium
  • Shared WBS packages

11
Issues in Data Management: Data Access and
Distribution
  • Access by Processing Pipeline
  • Keep archive close to processing
  • Grid-based solutions will be key
  • Data Distribution to Users
  • What products to make available
  • What data services to provide
  • Image cutouts, SEDs, light curves, data mining
  • Massive collection: data vs. services
  • The Portal Challenge: browsing heterogeneous
    products
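
Image cutouts are one of the data services listed above. A minimal sketch of the idea follows, assuming the image is a plain list of pixel rows and the request is given in pixel coordinates. A real service would work on FITS files and sky coordinates; the function name `cutout` is hypothetical.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def cutout(image, center_row, center_col, half_size):
    """Return the square region around a pixel, clipped at the image edges.

    `image` is a list of equal-length rows. The result is at most
    (2 * half_size + 1) pixels on a side.
    """
    n_rows = len(image)
    n_cols = len(image[0]) if n_rows else 0
    row_lo = clamp(center_row - half_size, 0, n_rows)
    row_hi = clamp(center_row + half_size + 1, 0, n_rows)
    col_lo = clamp(center_col - half_size, 0, n_cols)
    col_hi = clamp(center_col + half_size + 1, 0, n_cols)
    return [row[col_lo:col_hi] for row in image[row_lo:row_hi]]


# A 5x5 test image whose pixel value encodes its position (row * 10 + col).
image = [[r * 10 + c for c in range(5)] for r in range(5)]
print(cutout(image, 2, 2, 1))
```

Serving a small region like this, rather than the full frame, is one side of the "data vs. services" trade-off for a massive collection.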

12
Issues in Data Management: Virtual Observatory
Visibility
  • Value of cross-correlation with other collections
  • Importance increases with time
  • Requires that detailed metadata propagate to VO
    environment
  • Likely to include information not required
    internally (e.g. by pipeline)
  • Importance of standards
  • NCSA's two-pronged strategy
  • Strategic deployment of standard and custom
    services
  • SkyNode: standard enabling efficient distributed
    joins
  • Provide multiple views of data
  • Mirroring of other large collections
  • 2MASS, DSS, FIRST,
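
Cross-correlation with other collections usually comes down to a positional join between catalogs. The sketch below shows a naive local version of such a join, assuming each catalog is a list of `(id, ra_deg, dec_deg)` tuples. It does not implement the SkyNode protocol or any distributed execution, and the function names are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_deg):
    """Pair each source in catalog_a with its nearest neighbour in catalog_b.

    Only pairs closer than radius_deg are kept. This brute-force loop
    costs O(len(a) * len(b)); large surveys need a spatial index.
    """
    matches = []
    for id_a, ra_a, dec_a in catalog_a:
        best = None
        for id_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches
```

The join needs only positions and a match radius, but interpreting the matches depends on the astrometric errors of both catalogs, which is one reason detailed metadata has to propagate to the VO environment.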