Grid Technology and Multidisciplinary Science - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
Grid Technology and Multidisciplinary Science
Ian Foster
Computation Institute, Argonne National Lab & University of Chicago

Virtual Observatory Science in Prague, August 21,
2006
2
Technology Exponentials
  • Internet: 100M hosts, 10 Gb/s
  • Collaboration & sharing the norm
  • Moore's law: x10^3 per 10 yrs
  • Sensors as well as computers
  • Petascale data tsunami
  • Analysis gating step
  • our old technology?

3
New Methodologies & Organizational Structures
4
Grid: A Unifying Concept & Technology
  • Grid enables the federation of resources to
    support applications & communities
  • Distributed computers, storage, data, people,
  • Networks provide connectivity
  • Software standards provide the glue
  • Infrastructure services facilitate operation

3D Model of the Node of Ranvier
VO: Virtual Organization (distributed community,
trust, policies)
5
Software, Infrastructure, Standards
Database
Specialized resource
Computers
Storage
6
Network for Earthquake Engineering Simulation
System-Level Problem
Grid technology
www.nees.org
7
Earth System Grid
  • On-demand access to climate simulation data
  • Multiple archives
  • Interactive query
  • Per-collection control
  • Server-side processing
  • Major scientific impact
  • >2000 users
  • >100 TB downloaded
  • >300 scientific papers

www.earthsystemgrid.org DOE OASCR
8
Under the Covers
9
Science Gateways: Biology
Public: PUMA Knowledge Base. Information about proteins analyzed against 2 million gene sequences
Back Office: Analysis on Grid. Millions of BLAST, BLOCKS, etc., on OSG and TeraGrid
Natalia Maltsev et al., http://compbio.mcs.anl.gov/puma2
10
600-1000 CPUs
Genome Analysis DB Update (GADU)
11
The Globus-Based LIGO Data Grid
LIGO Gravitational Wave Observatory
Birmingham
Replicating >1 Terabyte/day to 8 sites
>120 million replicas so far
MTBF: 1 month
www.globus.org/solutions
12
Data Replication Service
  • Pull missing files to a storage system

[Diagram: a list of required files is given to the Data Replication Service. Data Location: Local Replica Catalog, Replica Location Index. Data Movement: Reliable File Transfer Service, GridFTP.]
Design and Implementation of a Data Replication
Service Based on the Lightweight Data Replicator
System, Chervenak et al., 2005
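
The "pull missing files to a storage system" step can be sketched as follows. This is only an illustration under stated assumptions, not the Data Replication Service's actual interface: `plan_replication`, the catalog set, the index dictionary, and the file and site names are hypothetical stand-ins for the Local Replica Catalog, the Replica Location Index, and the requests handed to a transfer service.

```python
def plan_replication(required, local_catalog, replica_index):
    """Return (transfers, unresolved) for files missing from local storage.

    required:       list of logical file names the site wants (hypothetical)
    local_catalog:  set of file names already stored locally
    replica_index:  dict mapping file name -> list of remote sites holding it
    """
    transfers = []   # (file name, source site) pairs to hand to a transfer service
    unresolved = []  # files that no known site holds
    for name in required:
        if name in local_catalog:
            continue  # already present, nothing to pull
        sources = replica_index.get(name, [])
        if sources:
            transfers.append((name, sources[0]))  # naive choice: first listed site
        else:
            unresolved.append(name)
    return transfers, unresolved


# Hypothetical inputs, for illustration only.
required = ["frame-001", "frame-002", "frame-003"]
local_catalog = {"frame-001"}
replica_index = {"frame-002": ["site-a", "site-b"]}

transfers, unresolved = plan_replication(required, local_catalog, replica_index)
print(transfers)   # [('frame-002', 'site-a')]
print(unresolved)  # ['frame-003']
```

A real service would also have to pick among sources, retry failed transfers, and register the new copies; the sketch leaves all of that out.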
13
And Virtual Observatories?
  • Isn't a VO just about remote access to data?
  • No: a VO is a VO!
  • (A virtual observatory is a virtual organization)
  • → A virtual observatory must federate
  • Data, computers, storage, computers
  • securely, reliably, efficiently, scalably
  • And thus requires federation technologies

Web browser
14
Virtual Observatory Grid
[Diagram labels: web service (repeated), publish WSDL, job, results, application, grid connected, standard semantics, anything; Registry, Workflow, GLUE, Authentication, MySpace]
15
Federating Computers: Complex, Popular Services
Tell me about this star
Tell me something complex about 20K stars
Support 1000s of users
E.g., Sloan Digital Sky Survey, 10 TB; others
much bigger
16
Analyzing Large Data: Move Computation to the
Data
  • But
  • Amount of computation can be enormous
  • Load can vary tremendously
  • Users want to compose distributed services → data
    must sometimes be moved, anyway
  • Fortunately
  • Networks are getting much faster (in parts)
  • Workloads can have significant locality of
    reference
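
A back-of-the-envelope estimate makes the trade-off concrete. The sketch below uses the 10 TB survey size and the 30 Gb/s core link rate quoted elsewhere in this talk; the 100 Mb/s "periphery" rate, decimal units, and full link utilization are assumptions made only for illustration, and `transfer_seconds` is a hypothetical helper.

```python
def transfer_seconds(size_bytes, link_bits_per_s):
    """Idealized time to move size_bytes over a link (no protocol overhead)."""
    return size_bytes * 8 / link_bits_per_s


TEN_TB = 10 * 10**12          # 10 TB, assuming decimal terabytes
CORE_LINK = 30 * 10**9        # 30 Gb/s, the core rate quoted in the talk
PERIPHERY_LINK = 100 * 10**6  # 100 Mb/s, an assumed poorly connected site

core_minutes = transfer_seconds(TEN_TB, CORE_LINK) / 60
periphery_days = transfer_seconds(TEN_TB, PERIPHERY_LINK) / 86_400

print(f"core link: about {core_minutes:.0f} minutes")        # about 44 minutes
print(f"periphery link: about {periphery_days:.1f} days")    # about 9.3 days
```

Under these assumptions, moving the whole dataset is tolerable inside the core and impractical at the periphery, which is one reading of why computation tends to go to the data.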

17
Move Computation to the Core
Poorly connected periphery
Highly connected core
18
(No Transcript)
19
Highly Connected Core: E.g., TeraGrid
75 Teraflops (trillion calculations/s): 12,500
times faster than all 6 billion humans on earth,
each doing one calculation per second
  • 16 Supercomputers - 9 different types, multiple
    sizes
  • World's fastest network
  • Globus Toolkit and other middleware providing
    single login, application management, data
    movement, web services

30 Gigabits/s to large sites: 20-30 times major
university links, 30,000 times my home broadband,
1 full-length feature film per second
ANL
Starlight
LA
Atlanta
SDSC
TACC
PU
IU
ORNL
NCSA
PSC
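
The comparisons on this slide are simple ratios, and they can be checked directly. The sketch below only re-derives the slide's own figures; the variable names are illustrative, and decimal prefixes are assumed.

```python
TERAGRID_OPS = 75 * 10**12    # 75 Teraflops, from the slide
HUMANS = 6 * 10**9            # 6 billion people, one calculation/s each
CORE_BITS_PER_S = 30 * 10**9  # 30 Gigabits/s to large sites

speedup = TERAGRID_OPS // HUMANS             # how many times faster than humanity
home_bits_per_s = CORE_BITS_PER_S // 30_000  # implied home broadband rate
bytes_per_s = CORE_BITS_PER_S // 8           # bytes moved each second

print(speedup)              # 12500
print(home_bits_per_s)      # 1000000, i.e. 1 Mb/s
print(bytes_per_s / 10**9)  # 3.75, i.e. 3.75 GB each second
```

The 3.75 GB/s figure is consistent with "one feature film per second" only if a film is taken to be a few gigabytes, which is an assumption about film size rather than a number from the talk.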
20
Using Grid Infrastructure to Respond to Popularity
  • Purpose
  • On-demand stacks of random locations within
    10TB dataset
  • Challenge
  • Rapid access to 10-10K random files
  • Time-varying load
  • Solution
  • Dynamic acquisition of compute, storage
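
One way to picture "dynamic acquisition" under a time-varying load is a rule that sizes the worker pool to the current backlog. This is a minimal sketch under assumed numbers, not the system described in the talk: `workers_needed`, the per-worker capacity of 200 requests, and the pool limits are all hypothetical.

```python
import math


def workers_needed(pending_requests, requests_per_worker,
                   min_workers=1, max_workers=64):
    """Workers to hold so the backlog fits, clamped to [min_workers, max_workers]."""
    wanted = math.ceil(pending_requests / requests_per_worker)
    return max(min_workers, min(max_workers, wanted))


# Hypothetical load samples spanning the 10 to 10K file range on the slide.
load_over_time = [10, 250, 10_000, 40]
plan = [workers_needed(n, requests_per_worker=200) for n in load_over_time]
print(plan)  # [1, 2, 50, 1]
```

Resources are acquired when requests pile up and released when the load falls, so the service does not have to be provisioned for its peak at all times.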

Sloan Data
S4
Web page or Web Service
Joint work with Ioan Raicu & Alex Szalay
21
Preliminary Performance (TeraGrid, LAN GPFS)
We reduce solution time from 3 months to 3
minutes
22
Mosaic of M42 created on TeraGrid
B. Berriman, J. Good (Caltech); J. Jacob, D. Katz (JPL); G. Singh, M. Su, E. Deelman (ISI)
Ewa Deelman, deelman_at_isi.edu, www.isi.edu/deelman, pegasus.isi.edu
23
A Small Example Workflow
24
Montage Service
[Diagram labels: User Portal; Abstract Workflow Service (mDAGFiles); 2MASS Image List Service (m2MASSList); User Notification Service (mNotify); Grid Scheduling and Execution Service (mGridExec) with Pegasus, Concrete Workflow, Condor DAGMan; TeraGrid Clusters; Condor Pool. Input: Region Name, Degrees. Sites shown: JPL, ISI, IPAC, SDSC, NCSA.]
  • Initial prototype implemented and tested on
    TeraGrid
  • Production service open to the community this year

25
Service Oriented Science
  • People create services (data or functions)
  • which I discover
  • maybe compose to create a new function ...
  • and then publish as a new service.
  • → I find someone else to host services, so I
    don't have to become an expert in operating
    services & computers!
  • → I hope that this someone else can manage
    security, reliability, scalability!

Service-Oriented Science, Science, 2005
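
The create, discover, compose, publish cycle can be illustrated with a toy in-memory registry. This is a sketch only: `publish`, `discover`, `compose`, and the two example services are hypothetical, and a real deployment would involve networked services rather than local functions.

```python
registry = {}  # toy stand-in for a service registry


def publish(name, func):
    """Make a service available under a name."""
    registry[name] = func


def discover(name):
    """Look up a previously published service."""
    return registry[name]


def compose(*names):
    """Build a new function that applies the named services in order."""
    funcs = [discover(n) for n in names]

    def composed(value):
        for f in funcs:
            value = f(value)
        return value

    return composed


# Two existing services (hypothetical examples).
publish("to_degrees", lambda arcsec: arcsec / 3600)
publish("round3", lambda x: round(x, 3))

# Compose them and publish the result as a new service.
publish("arcsec_to_rounded_degrees", compose("to_degrees", "round3"))
result = discover("arcsec_to_rounded_degrees")(5400)
print(result)  # 1.5
```

The composed function is itself published, so other users can discover and build on it in the same way.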
26
Summary
  • Grid is a unifying concept & technology for
    distributed applications & communities
  • Federation on-demand access to resources
  • Virtual organizations
  • Extremely widely applied in many domains
  • Virtual observatory is also about federation
  • of data, people, computers
  • Increasingly relevant as VO scales
  • Promising early applications