Astrophysics on the OSG (LIGO, SDSS, DES), Kent Blackburn, LIGO Laboratory, California Institute of Technology



1
Astrophysics on the OSG (LIGO, SDSS, DES)
Kent Blackburn
LIGO Laboratory
California Institute of Technology
  • Open Science Grid Consortium Meeting
  • University of Florida
  • January 23, 2006

2
Outline and Contributors
  • LIGO on the OSG
  • Kent Blackburn, Duncan Brown, Albert Lazzarini,
    David Meyers
  • SDSS, NEO and DES on the OSG
  • Nickolai Kuropatkin, Neha Sharma, Chris
    Stoughton, James Annis, Steve Kent

3
Gravitational Wave Physics on the OSG
  • Laser Interferometer Gravitational-Wave
    Observatory (LIGO)
  • LIGO Scientific Collaboration (LSC)

4
LIGO on the Open Science Grid
  • Search for Gravitational Waves
  • Hanford, WA
  • Livingston, LA
  • Plus GEO, TAMA and VIRGO
  • LIGO Scientific Collaboration
  • 40 Institutions worldwide
  • 400 individuals contributing
  • LIGO Data Grid (LDG)
  • Nine Grid Sites
  • Over 2000 CPUs
  • Multi-Petabyte Data Archive at Caltech
  • Scientific Data Collection grouped into temporal
    Science Runs
  • Currently In Science Run 5
  • Goal to collect one year plus of design
    sensitivity data
  • One Terabyte of data each day
  • Analysis carried out primarily on the LIGO Data
    Grid (LDG)
  • Stepping out onto the OSG
  • http://www.ligo.caltech.edu

5
LIGO Data Analysis Classifications
  • Principal Classifications of Searches
  • Binary Inspiral (Neutron Stars and Black Holes)
  • Consumes bulk of LIGO Data Grid resources
  • Burst (Supernovae and other Unmodeled Events)
  • Coincidence between different data streams
    necessary
  • Stochastic Background (Similar to the CMB)
  • Computationally least demanding but requires
    cross correlation
  • Periodic (Pulsars, Rotating Neutron Stars)
  • Signal sinusoidal in reference frame of source
  • All Sky Survey could promote Global Warming
    (Order 10^20 FLOPS)
  • Binary Inspiral Search selected for initial
    adoption onto the OSG
  • Workflow well suited to Open Science Grid
  • Already using a similar set of Grid Technologies
    within LIGO Data Grid
  • Simple parametric parallelization of algorithms
  • Optimal filtering of data against tens of
    thousands of waveforms
  • Computationally demanding but interesting on the
    scale of the OSG
  • Expect other searches to follow once OSG
    trailblazing work done
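
The "optimal filtering of data against tens of thousands of waveforms" and its "simple parametric parallelization" can be illustrated with a toy matched filter. This is only a sketch under stated assumptions, not the LIGO inspiral code: the chirp generator, the five-template bank, and the noise-free data are all made up for illustration. Each template is filtered independently of the others, which is what makes the search easy to split into separate grid jobs.

```python
import math

def make_chirp(n, f0, f1):
    """Toy waveform: a sinusoid sweeping linearly from f0 to f1 (cycles/sample)."""
    return [math.sin(2 * math.pi * (f0 + (f1 - f0) * t / (2 * n)) * t)
            for t in range(n)]

def best_match(data, template):
    """Slide one template along the data; return (peak normalized score, offset)."""
    norm = math.sqrt(sum(h * h for h in template))
    best = (0.0, 0)
    for offset in range(len(data) - len(template) + 1):
        segment = data[offset:offset + len(template)]
        score = abs(sum(d * h for d, h in zip(segment, template))) / norm
        if score > best[0]:
            best = (score, offset)
    return best

def search(data, bank):
    """Filter the data against every template in the bank.

    Each call to best_match depends only on (data, template), so on a grid
    every template (or group of templates) could run as its own job.
    """
    results = {name: best_match(data, tmpl) for name, tmpl in bank.items()}
    winner = max(results, key=lambda name: results[name][0])
    return winner, results[winner]

# Hypothetical five-template bank and noise-free data with one injected signal.
bank = {f"chirp_{i}": make_chirp(64, 0.02 + 0.01 * i, 0.10 + 0.02 * i)
        for i in range(5)}
data = [0.0] * 200
for k, value in enumerate(bank["chirp_3"]):
    data[100 + k] += value

winner, (score, offset) = search(data, bank)
print(winner, offset)
```

In this noise-free toy the search recovers the injected template and its start offset; real data would add detector noise, frequency-domain filtering, and noise weighting, none of which is modeled here.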

6
Binary Inspiral Search Experiences on the Open
Science Grid
  • First attempt at the July 2005 OSG Consortium
    Meeting in Milwaukee, Wisconsin
  • Unsuccessful at submitting a binary inspiral
    workflow at any OSG site
  • Authentication was primary reason for failures
    (LIGO VO not part of 0.2.1)
  • Other issues discovered with the version of VDS
    distributed in 0.2.1
  • First successful completion of a binary inspiral
    workflow on October 1st, 2005 on LIGO's OSG
    Integration Testbed Cluster at Caltech
  • Eight Node Dual CPU cluster with two terabytes of
    disk space
  • Running a patched version of VDS on top of OSG
    0.2.1
  • Used a test workflow involving 38 GB of LIGO
    data and about 700 DAG nodes
  • Followed up by running at LIGO's OSG production
    sites at PSU (PBS) and UWM (Condor) (once the VDS
    patch was applied at each)
  • Collaborated with several CMS resources to
    further test outside LIGO's VO
  • Worked with clusters at San Diego, Nebraska and
    Caltech
  • All clusters added LIGO's VOMS to allow
    authentication
  • Updated OSG 0.2.1 with VDS patches
  • Mixed results due to the size of the LIGO data
    sets transferred for this test workflow
  • Worked with Deployment and Integration Teams to
    ensure LIGO's functional requirements appeared in
    the OSG 0.4 software stack (just announced!)

7
Greatly Simplified LIGO DAG

8
LIGO's Next Move on the OSG
  • The OSG 0.4.0 release should greatly improve the
    OSG for LIGO's Binary Inspiral Workflow
  • A workflow geared toward actually conducting a
    scientific study would involve at least 16000 DAG
    nodes and close to two terabytes of data.
  • Recent OSG-motivated activities in LIGO have
    produced a nearly 10:1 reduction in data through
    improved data selection and compression
  • Need to develop more flexible workflows that
    don't challenge the limited data storage
    resources typical of a present-day OSG site
  • Pegasus is used to construct concrete DAGs from
    abstract DAX workflows
  • Flexibility here to recognize and adapt to OSG
    site specifics could facilitate greater
    utilization of the OSG as an abstract Grid
  • Develop ability to benefit from Storage Resource
    Management
  • Typical LIGO data analyses benefit from being
    able to repeat the analysis on the same data set
    with improved calibration and selection criteria
  • LIGO is currently bringing up an SE on our local
    ITB cluster at Caltech to experiment with SRM

9
Astronomy on the OSG
  • Sloan Digital Sky Survey (SDSS)
  • Experimental Astrophysics Group (EAG)
  • Fermi National Accelerator Laboratory

10
Near Earth Objects
  • Near Earth Objects (NEOs)
  • Comets and Asteroids nudged by the gravitational
    attraction of planets into orbits that pass by
    the Earth's neighborhood
  • Composed of water ice and dust, formed early in
    the history of the Solar System
  • The scientific interest in comets and asteroids
    is due to their being remnants of the early solar
    system; the interest in NEOs is their potential
    for hitting the Earth
  • 37 Near Earth Object candidates are identified in
    the SDSS imaging data
  • Apparent magnitudes r = 19 to 21 and proper
    motions of 1.3 to 18 degrees per day
  • The earth collision rate for this population
    (size greater than 20 m) is estimated to be one
    per century

11
How to find Near Earth Objects

12
NEO Workflow

13
NEO Job Statistics
  • Total Jobs: 180
  • Total Input Data: 9 × 180 = 1620 GB
  • Total Output Data: 12 × 180 = 2160 KB

14
Quasar Spectra Fitting using SDSS
  • Quasars are supermassive black holes. Swirling
    clouds of gas and plasma falling into the black
    hole glow at many different wavelengths. We
    measure the spectrum of the light to determine
    the properties of each quasar.
  • The SDSS provides us with 50,000 quasar spectra.
    We make fits to these spectra that include the
    following components
  • Power-law continuum, decreasing as e^(-λ)
  • A Balmer continuum due to ionized Hydrogen, with
    a characteristic bump from 2000 to 4000 Angstroms
  • Strong emission lines from ionized gas, such as
    Hydrogen, Nitrogen, Oxygen, and Magnesium
  • Many faint emission lines from Iron
  • Starlight from the galaxy that surrounds the
    quasar
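
A fit of this kind treats the spectrum as a sum of components and scores it against the data. The sketch below covers only two of the listed components, and everything in it is an assumption for illustration: the continuum is written as a generic power law A·(λ/λ₀)^(-α), emission lines are modeled as Gaussians, and the parameter values are made up. The Balmer continuum, iron lines, and host-galaxy starlight are left out, and this is not the authors' fitting code.

```python
import math

def power_law(wavelength, amplitude, alpha, pivot=3000.0):
    """Assumed continuum form: amplitude * (wavelength / pivot) ** (-alpha)."""
    return amplitude * (wavelength / pivot) ** (-alpha)

def gaussian_line(wavelength, center, height, sigma):
    """Assumed emission-line shape: a Gaussian centered on the line wavelength."""
    return height * math.exp(-0.5 * ((wavelength - center) / sigma) ** 2)

def model_flux(wavelength, continuum, lines):
    """Sum the continuum and every emission line at one wavelength (Angstroms)."""
    flux = power_law(wavelength, **continuum)
    for line in lines:
        flux += gaussian_line(wavelength, **line)
    return flux

def chi_square(wavelengths, observed, errors, continuum, lines):
    """Goodness of fit that a minimizer (of any kind) would try to reduce."""
    return sum(((obs - model_flux(w, continuum, lines)) / err) ** 2
               for w, obs, err in zip(wavelengths, observed, errors))

# Made-up parameters; 2798 Angstroms is the Mg II emission line.
continuum = {"amplitude": 10.0, "alpha": 1.5}
lines = [{"center": 2798.0, "height": 5.0, "sigma": 20.0}]

wavelengths = [2500.0 + 10.0 * i for i in range(100)]
observed = [model_flux(w, continuum, lines) for w in wavelengths]
errors = [1.0] * len(wavelengths)
chi2 = chi_square(wavelengths, observed, errors, continuum, lines)
```

Because each spectrum is fitted independently, 50,000 spectra map naturally onto 50,000 separate jobs.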

15
Example Quasar Spectrum with Fit

16
Quasar Fit Production Science using the Generic
Grid Gofer (GGG)
  • All jobs are stored in a jobs table
  • Available grid sites are stored in a pool table
  • Job Manager takes jobs from the database, creates
    Condor DAG files and submits them to sites from
    the pool automatically
  • Two main parts: Job Manager and DAG Creator
  • All completed stages of a job are recorded in the
    database together with submission time and
    execution time

17
Workflow in Generic Grid Gofer
Nickolai Kuropatkin
18
Astronomy Experiences on the Grid
                               Spectra              NEO
                               (CPU Intensive)      (Data and CPU Intensive)
  Grid Match                   Ideal for Grid       Grid not very happy
  Total No. of Jobs            50000                180
  Data Input/Job               1 Megabyte           9 Gigabytes
  Data Output/Job              2 Megabytes          12 Kilobytes
  Avg. Rate of Job Completion  800-1200 per day     10-15 per day ?
  • Experience tells us that the Grid is more
    suitable for CPU Intensive Jobs
  • achieve parallelism
  • more jobs
  • finish sooner
  • Running locally would limit the number of jobs
    run simultaneously
  • On the OSG, several run-reruns, and camcols
    within a run-rerun, can run in parallel
  • The current workflow will also facilitate
    further analysis

19
Future Grid Projects in Astronomy
  • In the coming year (2005-2006), the Experimental
    Astrophysics Group (EAG) has 4 projects planned
    for the Open Science Grid
  • The Simulation effort for the Dark Energy Survey
    (DES)
  • Genetic algorithm fitting of Sloan Digital Sky
    Survey (SDSS) Quasar Spectra
  • Search for Near Earth Asteroids (NEOs) in the
    SDSS Imaging data
  • The Co-addition of the SDSS Southern Stripe