Astronomy Applications in the TeraGrid Environment

1
Astronomy Applications in the TeraGrid
Environment
  • Roy Williams, Caltech
  • with thanks for material to Sandra Bittner, ANL;
    Sharon Brunett, Caltech; Derek Simmel, PSC;
    John Towns, NCSA; Nancy Wilkins-Diehr, SDSC

2
The TeraGrid Vision: Distributing the resources is
better than putting them at one site
  • Build new, extensible, grid-based infrastructure
    to support grid-enabled scientific applications
  • New hardware, new networks, new software, new
    practices, new policies
  • Expand centers to support cyberinfrastructure
  • Distributed, coordinated operations center
  • Exploit unique partner expertise and resources to
    make whole greater than the sum of its parts
  • Leverage homogeneity to make the distributed
    computing easier and simplify initial development
    and standardization
  • Run single job across entire TeraGrid
  • Move executables between sites

3
What is the Grid, Really?
  • The Grid is
  • A set of powerful Beowulf clusters
  • Lots of disk storage
  • Fast interconnection
  • Unified account management
  • Interesting software
  • The Grid is not
  • Magic
  • Infinite
  • Simple
  • A universal panacea
  • The hype that you have read

4
Grid as Federation
  • Teragrid as a federation
  • independent centers → flexibility
  • unified interface → power and strength
  • Large/small state compromise

5
TeraGrid Wide Area Network
6
Grid Astronomy
7
Quasar Science: An NVO-Teragrid project (Penn State, CMU, Caltech)
  • 60,000 quasar spectra from Sloan Sky Survey
  • Each is 1 CPU-hour; submitted to the grid queue
  • Fits a complex model (173 parameters)
  • derives black hole mass from line widths

[Diagram: a manager uses globusrun to drive jobs on the clusters, drawing spectra from NVO data services]
8
N-point galaxy correlation: An NVO-Teragrid project (Pitt, CMU)
Finding the triple correlation in the 3D SDSS galaxy catalog (RA/Dec/z)
Lots of large parallel jobs
kd-tree algorithms
9
Palomar-Quest Survey: Caltech, NCSA, Yale
Transient pipeline: computing reservation at sunrise for immediate followup of transients
Synoptic survey: massive resampling (Atlasmaker) for ultrafaint detection
[Diagram: P48 Telescope → 50 GB/night → Caltech, Yale, NCSA (5 TB), with an ALERT path to the TeraGrid]
NCSA, Caltech, and Yale run different pipelines on the same data
10
Transient from PQ
from catalog pipeline
11
PQ stacked images
from image pipeline
12
Wide-area Mosaicking (Hyperatlas): An NVO-Teragrid project (Caltech)
DPOSS 15°
High-quality: flux-preserving, spatial accuracy
Stackable Hyperatlas
Edge-free pyramid weight
Mining AND Outreach
Griffith Observatory "Big Picture"
13
2MASS Mosaicking portal: An NVO-Teragrid project (Caltech, IPAC)
14
Teragrid hardware
15
TeraGrid Components
  • Compute hardware
  • Intel/Linux Clusters, Alpha SMP clusters, POWER4
    cluster,
  • Large-scale storage systems
  • hundreds of terabytes for secondary storage
  • Very high-speed network backbone
  • bandwidth for rich interaction and tight
    coupling
  • Grid middleware
  • Globus, data management,
  • Next-generation applications

16
Overview of Distributed TeraGrid Resources
[Site diagram: NCSA/PACI (10.3 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, and Argonne, each with site resources, external network connections, and archival storage (HPSS; UniTree at NCSA)]
17
Compute Resources: NCSA, 2.6 TF → 10.6 TF w/ 230 TB

[Cluster diagram: 30 Gbps to the TeraGrid network; GbE and Myrinet fabrics; 8 TF of Madison in 667 nodes plus 2.6 TF of Madison in 256 nodes (2p Madison nodes, 4 GB memory, 2x73 GB disk; 2p 1.3 GHz nodes, 4 or 12 GB memory, 73 GB scratch); storage I/O over Myrinet and/or GbE at 250 MB/s per node; Brocade 12000 FC switches to 230 TB of disk; 8 4p Madison interactive/spare nodes for login and FTP]
18
Compute Resources: SDSC, 1.3 TF → 4.3 TF w/ 500 TB

[Cluster diagram: 30 Gbps to the TeraGrid network; GbE and Myrinet fabrics; 3 TF of Madison in 256 nodes plus 1.3 TF of Madison in 128 nodes (2p Madison nodes, 4 GB memory, 2x73 GB disk; 2p 1.3 GHz nodes, 4 GB memory, 73 GB scratch); storage I/O at 250 MB/s per node; Brocade 12000 FC switches to 500 TB of disk; 6 4p Madison interactive/spare nodes for login and FTP]
19
Compute Resources: Caltech, 100 GF w/ 100 TB

[Cluster diagram: 30 Gbps to the TeraGrid network; GbE and Myrinet fabrics; 72 GF of Madison in 36 IBM/Intel nodes plus 34 GF of Madison in 17 HP/Intel nodes (2p Madison nodes, 6 GB memory, 73 GB scratch); 33 IA32 storage nodes serving 100 TB /pvfs (Datawulf); 6 Opteron nodes (4p, 8 GB memory) with 66 TB RAID5 and HPSS; storage I/O at 250 MB/s per node; 13 FC-attached tape drives, 1.2 PB raw silo capacity; a 2p IBM Madison interactive node for login and FTP]
20
Using Teragrid
21
Wide Variety of Usage Scenarios
  • Tightly coupled jobs storing vast amounts of
    data, performing visualization remotely as well
    as making data available through online
    collections (ENZO)
  • Thousands of independent jobs using data from a
    distributed data collection (NVO)
  • Science Gateways: "not a Unix prompt"!
  • from a web browser, with security
  • from an application, e.g. IRAF, IDL

22
Traditional Parallel Processing
  • Single executable to be run on a single remote
    machine
  • big assumptions
  • runtime necessities (e.g. executables, input
    files, shared objects) available on remote
    system!
  • login to a head node, choose a submission
    mechanism
  • Direct, interactive execution
  • mpirun -np 16 ./a.out
  • Through a batch job manager
  • qsub my_script
  • where my_script describes executable location,
    runtime duration, redirection of stdout/err,
    mpirun specification

23
Traditional Parallel Processing II
  • Through globus
  • globusrun -r some-teragrid-head-node.teragrid.org/jobmanager -f my_rsl_script
  • where my_rsl_script describes the same details as
    in the qsub my_script! (a sketch follows below)
  • Through Condor-G
  • condor_submit my_condor_script
  • where my_condor_script describes the same details
    as the globus my_rsl_script!

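For reference, GT2 RSL is a list of (attribute=value) pairs. Below is a minimal sketch, written as a small Python driver in the style used later in this talk; the attribute values, paths, and host name are illustrative assumptions, not taken from the slides.

import os

# illustrative RSL equivalent of a simple qsub script:
# 16-way MPI job, one-hour wall clock, stdout/stderr redirected
rsl = """&(executable=/home/roy/a.out)
  (count=16)
  (jobType=mpi)
  (maxWallTime=60)
  (stdout=/home/roy/run.out)
  (stderr=/home/roy/run.err)"""

open("my_rsl_script", "w").write(rsl)
os.system("globusrun -r some-teragrid-head-node.teragrid.org/jobmanager"
          " -f my_rsl_script")
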
24
Distributed Parallel Processing
  • Decompose application over geographically
    distributed resources
  • functional or domain decomposition fits well
  • take advantage of load balancing opportunities
  • think about latency impact
  • Improved utilization of a many resources
  • Flexible job management

25
Pipelined/dataflow processing
  • Suited for problems which can be divided into a
    series of sequential tasks where
  • multiple instances of problem need executing
  • series of data needs processing with multiple
    operations on each series
  • information from one processing phase can be
    passed to the next phase before the current phase
    is complete (see the sketch below)

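A minimal Python sketch of that last point (the two phases and all names are invented for illustration): phase 1 hands each item to phase 2 through a queue as soon as it is ready, so phase 2 starts working before phase 1 finishes.

from threading import Thread
from Queue import Queue          # Python 2 module name, as elsewhere in this talk

work = Queue()

def phase2():
    # downstream phase: consumes items as they arrive
    while True:
        item = work.get()
        if item is None:         # sentinel means phase 1 is finished
            break
        print "phase 2 processing", item

consumer = Thread(target=phase2)
consumer.start()

for frame in range(5):           # phase 1: e.g. calibrate each frame
    result = frame * frame       # stand-in for the real per-frame work
    work.put(result)             # hand off immediately, not at the end

work.put(None)
consumer.join()
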
26
Security
  • ssh with password
  • Too much password-typing
  • Not very secure-- big break-in at TG April 04
  • One failure is a big failure: all of TG!
  • Caltech and Argonne no longer allow this
  • SDSC does not allow password change

27
Security
  • ssh with public key: single sign-on!
  • use ssh-keygen on Unix or PuTTYgen on Windows
  • public key file (eg id_rsa.pub) AND
  • private key file (eg id_rsa) AND
  • passphrase
  • on remote machine, put public key in .ssh/authorized_keys
  • on local machine, combine private key and passphrase
  • ATM card model
  • On TG, can put public key on application form
  • immediate login, no snailmail

28
Security
  • X.509 certificates: single sign-on!
  • from a Certificate Authority (eg verisign, US
    navy, DOE, etc etc)
  • It consists of
  • Distinguished Name (DN) AND
  • /C=US/O=National Center for Supercomputing Applications/CN=Roy Williams
  • Private file (usercert.p12) AND
  • passphrase
  • Remote machine needs entry in gridmap file (maps
    DN to account)
  • use gx-map command
  • Can create certificate with ncsa-cert-request etc
  • Certificates can be lodged in web browser

29
3 Ways to Submit a Job
  • 1. Directly to PBS Batch Scheduler
  • Simple, scripts are portable among PBS TeraGrid
    clusters
  • 2. Globus common batch script syntax
  • Scripts are portable among other grids using
    Globus
  • 3. Condor-G
  • Nice interface atop Globus, monitoring of all
    jobs submitted via Condor-G
  • Higher-level tools like DAGMan

30
PBS Batch Submission
  • ssh tg-login.[caltech|ncsa|sdsc|uc].teragrid.org
  • qsub flatten.sh -v "FILE=f544"
  • qstat or showq
  • ls *.dat
  • pbs.out, pbs.err files

31
globus-job-submit
  • For running of batch/offline jobs
  • globus-job-submit: submit a job
  • same interface as globus-job-run
  • returns immediately
  • globus-job-status: check job status
  • globus-job-cancel: cancel a job
  • globus-job-get-output: get job stdout/err
  • globus-job-clean: clean up after a job (see the sketch below)

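A minimal sketch of the submit/poll/fetch cycle with these commands, driven from Python in the same spirit as the queue driver shown later in this talk; the host name, script path, and one-minute polling interval are illustrative assumptions.

import os, time, commands        # Python 2, as elsewhere in this talk

# submit: returns a job contact URL used by the other commands
contact = commands.getoutput(
    "globus-job-submit tg-login.sdsc.teragrid.org/jobmanager-pbs "
    "/home/roy/flat.sh").strip()

# poll until the job leaves the queue (states include PENDING, ACTIVE, DONE;
# a real driver would also check for FAILED)
while commands.getoutput("globus-job-status " + contact).strip() != "DONE":
    time.sleep(60)

print commands.getoutput("globus-job-get-output " + contact)   # job stdout
os.system("globus-job-clean " + contact)                       # tidy up
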
32
Condor-G Job Submission
[Diagram: Condor-G on a workstation (mickey.disney.edu) talks through the Globus API to the Globus job manager on tg-login.sdsc.teragrid.org, which hands the job to PBS]

executable = /wd/doit
universe = globus
globusscheduler = tg-login.sdsc.teragrid.org/jobmanager-pbs
globusrsl = (maxtime=10)
queue
33
Condor-G
  • Combines the strengths of Condor and the Globus Toolkit
  • Advantages when managing grid jobs
  • full featured queuing service
  • credential management
  • fault-tolerance
  • DAGMan (pipelines)

34
Condor DAGMan
  • Manages workflow interdependencies
  • Each task is a Condor description file
  • A DAG file controls the order in which the Condor
    jobs are run (a sketch follows below)

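A minimal sketch of that workflow from Python (the file names, merge step, and scheduler contact are illustrative assumptions): one Condor-G description file per input, plus a DAG that makes a final merge job wait for all of them.

import os

files = ["f544", "f545", "f546"]             # hypothetical input list
dag = open("flatten.dag", "w")
for f in files:
    sub = "flat_%s.sub" % f
    open(sub, "w").write(
        "universe = globus\n"
        "globusscheduler = tg-login.sdsc.teragrid.org/jobmanager-pbs\n"
        "executable = /home/roy/dposs-flat/flat\n"
        "arguments = -infile %s.fits\n"
        "queue\n" % f)
    dag.write("JOB %s %s\n" % (f, sub))      # one DAG node per Condor file
dag.write("JOB merge merge.sub\n")           # merge.sub assumed written elsewhere
dag.write("PARENT %s CHILD merge\n" % " ".join(files))
dag.close()

os.system("condor_submit_dag flatten.dag")   # DAGMan enforces the ordering
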
35
Where's the disk?
  • Home directory
  • TG_CLUSTER_HOME
  • example: /home/roy
  • Shared writeable global areas
  • TG_CLUSTER_PFS
  • example: /pvfs/MCA04N009/roy (usage sketch below)

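A minimal usage sketch (the subdirectory name is invented): a job script resolves its directories from these environment variables rather than hard-coding paths.

import os

home = os.environ["TG_CLUSTER_HOME"]           # e.g. /home/roy
scratch = os.environ["TG_CLUSTER_PFS"]         # e.g. /pvfs/MCA04N009/roy

workdir = os.path.join(scratch, "dposs-run")   # large files go to the shared area
if not os.path.isdir(workdir):
    os.makedirs(workdir)
print "home is", home, "; writing data under", workdir
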
36
GridFtp
  • Moving a Test File
  • globus-url-copy -s "`grid-cert-info -subject`" \
    gsiftp://localhost:5678/tmp/file1 \
    file:///tmp/file2
  • Also uberftp and scp

37
Storage Resource Broker (SRB)
  • Single logical namespace while accessing
    distributed archival storage resources
  • Effectively infinite storage (first to 1TB wins a
    t-shirt)
  • Data replication
  • Parallel Transfers
  • Interfaces: command-line, API, web/portal (see the sketch below)

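A minimal command-line sketch, driven from Python; it assumes the SRB Scommand clients (Sinit, Sput, Sls, Sget, Sexit) are installed and configured, which the slide itself does not state, and the file and collection names are invented.

import os

os.system("Sinit")                                 # open an SRB session
os.system("Sput bigimage.fits survey/")            # store a local file in SRB space
os.system("Sls survey")                            # list the logical collection
os.system("Sget survey/bigimage.fits copy.fits")   # retrieve it (any replica)
os.system("Sexit")                                 # close the session
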
38
Storage Resource Broker (SRB): Virtual Resources, Replication

[Diagram: an SRB client (command line or API) reads and writes replicated data at NCSA and SDSC]

39
Allocations Policies
  • TG resources allocated via the PACI allocations
    and review process
  • modeled after NSF process
  • TG considered as single resource for grid
    allocations
  • Different levels of review for different size
    allocation requests
  • DAC: up to 10,000 SUs (minimal review, fast turnaround)
  • PRAC/AAB: < 200,000 SUs/year
  • NRAC: ≥ 200,000 SUs/year
  • Policies/procedures posted at
  • http://www.paci.org/Allocations.html
  • Proposal submission through the PACI On-Line
    Proposal System (POPS)
  • https://pops-submit.paci.org/

40
Requesting a TeraGrid Allocation
  • http://www.paci.org

41
24/7 Consulting Support
  • help@teragrid.org
  • advanced ticketing system for cross-site support
  • staffed 24/7
  • 866-336-2357, 9-5 Pacific Time
  • http://news.teragrid.org/
  • Extensive experience solving problems for early
    access users
  • Networking, compute resources, extensible
    TeraGrid resources

42
Links
  • www.teragrid.org/userinfo
  • getting an account
  • help@teragrid.org
  • news.teragrid.org
  • site monitors

43
Demo: Data-intensive computing with NVO services
44
DPOSS flattening
Source → Target
2650 x 1.1 GB files
Cropping borders
Quadratic fit and subtract (toy sketch below)
Virtual data
45
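The real flattening is done by the compiled flat program driven by the scripts on the next two slides; purely as an illustration of the "quadratic fit and subtract" idea, here is a hypothetical numpy sketch (function name and numpy usage are assumptions, not the pipeline's code) that fits a 2-D quadratic surface by least squares and subtracts it.

import numpy

def flatten(img):
    # fit a 2-D quadratic surface to the image background and subtract it
    ny, nx = img.shape
    y, x = numpy.mgrid[0:ny, 0:nx]
    A = numpy.column_stack([numpy.ones(img.size), x.ravel(), y.ravel(),
                            (x * x).ravel(), (x * y).ravel(), (y * y).ravel()])
    coeffs = numpy.linalg.lstsq(A, img.ravel().astype(float))[0]
    return img - A.dot(coeffs).reshape(ny, nx)

print flatten(numpy.arange(16.0).reshape(4, 4))   # a pure gradient flattens to ~zero
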
Driving the Queues
  • Here is the driver that makes and submits jobs

import os

for f in os.listdir(inputDirectory):
    # if the file exists, with the right size and age, then we keep it
    ofile = outputDirectory + "/" + f
    if os.path.exists(ofile):
        osize = os.path.getsize(ofile)
        if osize != 1109404800:
            print " -- wrong target size, remaking", osize
        else:
            time_tgt = filetime(ofile)                   # filetime() is a helper defined elsewhere
            time_src = filetime(inputDirectory + "/" + f)
            if time_tgt < time_src:
                print " -- target too old or nonexistant, making"
            else:
                print " -- already have target file "
                continue
    cmd = "qsub flat.sh -v \"FILE=" + f + "\""
    print " -- submitting batch job ", cmd
    os.system(cmd)

46
PBS script
  • A PBS script. Can do: qsub script.sh -v "FILE=f345"

#!/bin/sh
#PBS -N dposs
#PBS -V
#PBS -l nodes=1
#PBS -l walltime=10000
cd /home/roy/dposs-flat/flat
./flat \
  -infile /pvfs/mydata/source/$FILE.fits \
  -outfile /pvfs/mydata/target/$FILE.fits \
  -chop 0 0 1500 23552 \
  -chop 0 0 23552 1500 \
  -chop 0 22052 23552 23552 \
  -chop 22052 0 23552 23552 \
  -chop 18052 0 23552 4000
47
Atlasmaker: a service-oriented application on Teragrid
Federated Images: wavelength, time, ...

[Diagram: VO Registry and SIAP services feed SWarp into the Hyperatlas; downstream steps include source detection, average/max, and subtraction]
48
Hyperatlas
Standard Naming for atlases and pages: TM-5-SIN-20, Page 1589
Standard Scales: scale s means 2^(20-s) arcseconds per pixel (worked example below)
Standard Layout: TM-5 layout
Standard Projections: HV-4 layout
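A quick check of the scale rule (the loop values are just examples): scale 20 gives 1 arcsec/pixel, i.e. the 2.77777778E-4 degrees/pixel that appears in the page listings on the next slide.

for s in (20, 15, 10):
    arcsec_per_pixel = 2.0 ** (20 - s)
    print "scale", s, "->", arcsec_per_pixel, "arcsec/pixel,", \
        arcsec_per_pixel / 3600.0, "deg/pixel"
# scale 20 -> 1.0 arcsec/pixel = 0.000277777... deg/pixel
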
49
Hyperatlas is a Service
  • All Pages: <baseURL>/getChart?atlas=TM-5-SIN-20
  • 0 2.77777778E-4 'RA---SIN' 'DEC--SIN' 0.0 -90.0
  • 1 2.77777778E-4 'RA---SIN' 'DEC--SIN' 0.0 -85.0
  • 2 2.77777778E-4 'RA---SIN' 'DEC--SIN' 36.0 -85.0
  • ...
  • 1731 2.77777778E-4 'RA---SIN' 'DEC--SIN' 288.0 85.0
  • 1732 2.77777778E-4 'RA---SIN' 'DEC--SIN' 324.0 85.0
  • 1733 2.77777778E-4 'RA---SIN' 'DEC--SIN' 0.0 90.0
  • Best Page: <baseURL>/getChart?atlas=TM-5-SIN-20&RA=182&Dec=62
  • 1604 2.77777778E-4 'RA---SIN' 'DEC--SIN' 184.61538 60.0
  • Numbered Page: <baseURL>/getChart?atlas=TM-5-SIN-20&page=1604
  • 1604 2.77777778E-4 'RA---SIN' 'DEC--SIN' 184.61538 60.0
  • Replicated Implementations
  • baseURL = http://mercury.cacr.caltech.edu:8080/hyperatlas (try services)
  • baseURL = http://virtualsky.org/servlet

50
GET services from Python
  • This code uses a service to find the best
    hyperatlas page for a given sky location

# urllib is imported at module level; this runs inside a class method
import urllib

hyperatlasURL = self.hyperatlasServer + "/getChart?atlas=" + atlas \
    + "&RA=" + str(center1) + "&Dec=" + str(center2)
stream = urllib.urlopen(hyperatlasURL)
# result is a tab-separated line, so use split() to tokenize
tokens = stream.readline().split('\t')
print "Using page ", tokens[0], " of atlas ", atlas
self.scale = float(tokens[1])
self.CTYPE1 = tokens[2]
self.CTYPE2 = tokens[3]
rval1 = float(tokens[4])
rval2 = float(tokens[5])
51
VOTable parser in Python
  • From a SIAP URL, we get the XML, and extract the
    columns that have the image references, image
    format, and image RA/Dec

import urllib
import xml.dom.minidom

stream = urllib.urlopen(SIAP_URL)
doc = xml.dom.minidom.parse(stream)

# Make a dictionary for the columns
col_ucd_dict = {}
col_counter = 0
for XML_TABLE in doc.getElementsByTagName("TABLE"):
    for XML_FIELD in XML_TABLE.getElementsByTagName("FIELD"):
        col_ucd = XML_FIELD.getAttribute("ucd")
        col_ucd_dict[col_ucd] = col_counter
        col_counter += 1

urlColumn = col_ucd_dict["VOX:Image_AccessReference"]
formatColumn = col_ucd_dict["VOX:Image_Format"]
raColumn = col_ucd_dict["POS_EQ_RA_MAIN"]
deColumn = col_ucd_dict["POS_EQ_DEC_MAIN"]
# (need exception catching here)
52
VOTable parser in Python
  • Table is a list of rows, and each row is a list
    of table cells

table = []
for XML_TABLE in doc.getElementsByTagName("TABLE"):
    for XML_DATA in XML_TABLE.getElementsByTagName("DATA"):
        for XML_TABLEDATA in XML_DATA.getElementsByTagName("TABLEDATA"):
            for XML_TR in XML_TABLEDATA.getElementsByTagName("TR"):
                row = []
                for XML_TD in XML_TR.getElementsByTagName("TD"):
                    data = ""
                    for child in XML_TD.childNodes:
                        data = data + child.data
                    row.append(data)
                table.append(row)
53
SOAP client in Python
  • WCSTools (xy2sky and sky2xy) as web services

from SOAPpy import SOAPProxy

# get fitsheader string as FITS header
# get x1, x2 as coordinates on image
server = SOAPProxy("http://mercury.cacr.caltech.edu:9091")
wcsR = server.xy2sky(fitsheader, x1, x2)
ra = wcsR["c1"]
dec = wcsR["c2"]
status = wcsR["status"]
message = wcsR["message"]
print "Sky coordinates are", ra, dec
print "status is ", status
print "Message is ", message
54
Future Science Gateways
55
Teragrid Impediments

Write proposal → Wait 3 months for account → Get logged in → Get certificate → Port code to Itanium → Learn PBS → Learn MPI → Learn Globus → ... and now do some science
56
A better way: Graduated Security for Science Gateways

Web form - anonymous → some science
Register - logging and reporting → more science
Authenticate (X.509 - browser or cmd line) → big-iron computing
Write proposal - own account → power user
57
Secure Web services for Teragrid Access

[Diagram: clients reach a secure web-service layer (Clarens, BOSS, PBS, Gridport, Xforms) that distributes jobs on the grid]
  • web form (browser has certificate)
  • embedded in an existing client application (Root, IRAF, IDL, ...)
  • auto-generated client API for scripted submission (certificate in .globus/)
  • embedded as part of another service (proxy agent)
58
Secure Web services for Teragrid Access
Shell command
List files, get files
Submit job to TG queue (Condor / DAGMan / globusrun)
Monitor running jobs
59
Teragrid Wants YOU!
  • Your astronomy applications
  • Your science gateway projects
  • Teragrid has 100's of processors and 100's of
    terabytes
  • Talk To Me!