Title: The Grid: Reality, Technology, Applications
1 The Grid: Reality, Technology, Applications
Ian Foster
Argonne National Laboratory, University of Chicago
Globus Alliance, Univa Corporation
2 Overview
- eScience
- Grid technology
- Application case studies
- Next steps
3 Context (1): Technological Evolution
- Internet: 100M hosts, 10 Gb/s
- Collaboration and sharing the norm
- Universal Moore's law: x10^3 per 10 yrs
- Sensors as well as computers
- Petascale data tsunami
- Analysis is the gating step
- And our old infrastructure?
4 Context (2): Global Knowledge Communities
5 Context (3): A Powerful New Three-way Alliance
Requires much engineering and innovation
Changes culture, mores, and behaviours
CS as the new mathematics (George Djorgovski)
6 Cross-Cutting Requirement: Assemble Expertise and Resources
Transform resources into on-demand services accessible to any individual or team
7 Overview
- eScience
- Grid technology
- Application case studies
- Next steps
8 The Application-Infrastructure Gap
- Dynamic and/or Distributed Applications
9 Bridging the Gap: Grid Technology
- Service-oriented applications
  - Wrap applications as services
  - Compose applications into workflows
[Diagram labels: Users; Composition; Workflows; Invocation; Appln Service (x2)]
10 Bridging the Gap: Grid Technology
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
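The two application-side steps on these slides, wrapping applications as services and composing them into workflows, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `SERVICES` registry, the `as_service` decorator, and the `simulate` and `summarize` applications are hypothetical and are not part of the Globus Toolkit or any other Grid API.

```python
from typing import Callable, Dict, List

# Hypothetical in-process registry standing in for a service layer.
SERVICES: Dict[str, Callable[[dict], dict]] = {}


def as_service(name: str):
    """Wrap a plain function (the 'application') as a named service."""
    def register(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        SERVICES[name] = func
        return func
    return register


@as_service("simulate")
def simulate(request: dict) -> dict:
    # Stand-in application: produce a small series from one parameter.
    return {"series": [i * i for i in range(request["steps"])]}


@as_service("summarize")
def summarize(request: dict) -> dict:
    # Stand-in application: reduce the series to two summary values.
    series = request["series"]
    return {"count": len(series), "total": sum(series)}


def run_workflow(steps: List[str], request: dict) -> dict:
    """Compose services: each result becomes the next service's request."""
    message = request
    for name in steps:
        message = SERVICES[name](message)
    return message


result = run_workflow(["simulate", "summarize"], {"steps": 4})
```

The workflow only knows service names and message shapes, which is the point of the wrapping step: the applications behind the names can be replaced or relocated without changing the composition.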
11 Grid Technology: Service-Oriented Infrastructure
[Diagram labels: User Application (x3); Database; Specialized resource; Computers; Storage]
12 Globus Toolkit version 4 (GT4)
www.globus.org
[Component diagram. Legend: Core; Contrib/Preview; Deprecated. Layers: Web Services Components; Non-WS Components. Components grouped by column:]
- Security: Authentication and Authorization; Pre-WS Authentication and Authorization; Community Authorization; Delegation; Credential Mgmt
- Data Mgmt: GridFTP; Reliable File Transfer; Replica Location; Data Replication; Data Access and Integration
- Execution Mgmt: Grid Resource Allocation and Management; Pre-WS Grid Resource Alloc. and Mgmt; Community Scheduling Framework; Workspace Management; Grid Telecontrol Protocol
- Info Services: Index; Trigger; WebMDS; Pre-WS Monitoring and Discovery
- Common Runtime: Java WS Core; C WS Core; Python WS Core; C Common Libraries; eXtensible IO (XIO)
13 Global Community
14 Overview
- eScience
- Grid technology
- Application case studies
  - Earth System Grid
  - SCEC Community Modeling Environment
  - Open Science Grid
  - Network for Earthquake Eng. Simulation
  - LIGO gravitational wave observatory
- Next steps
15 DOE Earth System Grid
- Goal: address technical obstacles to the sharing and analysis of high-volume data from advanced earth system models
www.earthsystemgrid.org
16 Earth System Grid
- Facilitate the movement, management, and publication of data
- Publish and catalog past, current, and future climate datasets onto the Earth System Grid
- Create an easy-to-use Google-like web portal to locate climate data
- Provide convenient file-based download and delivery mechanisms
- Allow a user to request a subset of a remote dataset
17 Earth System Grid: Topology
[Topology diagram. Sites: LBNL, ANL, NCAR, ORNL, LLNL, ISI. Components: gridFTP servers, HRM, HPSS, MSS, disk, RLI, LRC, CAS, LAS server, GRAM gatekeeper, ESG web portal (Tomcat/Struts), OGSA-DAI, MySQL RDBMS, MyProxy. Labeled interactions: query, submit, authenticate, execute, visualize, cross-update.]
18 The Many Dimensions of Scaling
GridFTP: high-performance, reliable data movement
Disk-to-disk on TeraGrid: >2 Gbyte/s
Replica location: LIGO, 40M files, 10 sites
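Replica location means resolving a logical file name to the physical copies held at different sites. A minimal sketch of one way to structure this, assuming a two-level design in which each site keeps a local catalog and an index records which catalogs know about each logical name. The class names, host names, and file names are hypothetical; this is not the Globus RLS API.

```python
from collections import defaultdict


class LocalReplicaCatalog:
    """Per-site catalog: logical file name -> physical URLs at that site."""

    def __init__(self, site):
        self.site = site
        self.mappings = defaultdict(set)

    def add(self, logical_name, physical_url):
        self.mappings[logical_name].add(physical_url)

    def lookup(self, logical_name):
        return sorted(self.mappings.get(logical_name, ()))


class ReplicaLocationIndex:
    """Index: logical file name -> catalogs that hold a mapping for it."""

    def __init__(self):
        self.index = defaultdict(set)

    def update_from(self, catalog):
        # A catalog periodically reports the logical names it knows.
        for logical_name in catalog.mappings:
            self.index[logical_name].add(catalog)

    def locate(self, logical_name):
        # Ask only the catalogs the index points to, then merge results.
        urls = []
        for catalog in self.index.get(logical_name, ()):
            urls.extend(catalog.lookup(logical_name))
        return sorted(urls)


site_a = LocalReplicaCatalog("site-a")
site_b = LocalReplicaCatalog("site-b")
site_a.add("run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
site_b.add("run42.dat", "gsiftp://site-b.example.org/archive/run42.dat")

index = ReplicaLocationIndex()
index.update_from(site_a)
index.update_from(site_b)
replicas = index.locate("run42.dat")
```

Keeping the index separate from the per-site catalogs means a query touches only the sites that actually hold a copy, which matters when tens of millions of files are spread over many sites.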
19 ESG Portal
20 SCEC Community Modeling Environment
A collaboratory for system-level earthquake science
[Architecture diagram, by block:]
- KNOWLEDGE REPRESENTATION AND REASONING
  - Knowledge Server: knowledge base access, inference
  - Translation Services: syntactic and semantic translation
- Knowledge Base
  - Ontologies: curated taxonomies, relations and constraints
  - Pathway Models: pathway templates, models of simulation codes
- DIGITAL LIBRARIES
  - Navigation and Queries: versioning, replication
  - Mediated Collections: federated access
- KNOWLEDGE ACQUISITION
  - Acquisition Interfaces: dialog planning, pathway construction strategies
  - Pathway Assembly: template instantiation, resource selection, constraint checking
- GRID
  - Pathway Execution: policy, data ingest, repository access
  - Grid Services: compute and storage management, security
- Other labels: Code Repositories (FSM, RDM, AWM, SRM); Users; Data and Simulation Products; Data Collections; Pathway Instantiations; Storage; Computing
21 Seismic Hazard Analysis
- Definition: maximum intensity of shaking expected at a site during a fixed time interval
- Example: national seismic hazard maps
  - Intensity measure: peak ground acceleration
  - Interval: 50 yrs
  - Probability of exceedance: 2%
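The example parameters can be turned into a more familiar number. Reading the probability of exceedance as 2% over the 50-year interval, and assuming exceedances follow a Poisson (memoryless) process, which is a common convention but is not stated on the slide, the equivalent mean return period follows from P = 1 - exp(-t / T). A short Python sketch of that arithmetic:

```python
import math


def annual_rate(prob_exceedance: float, interval_years: float) -> float:
    """Annual exceedance rate under the assumed Poisson model.

    P = 1 - exp(-rate * t)  =>  rate = -ln(1 - P) / t
    """
    return -math.log(1.0 - prob_exceedance) / interval_years


def return_period(prob_exceedance: float, interval_years: float) -> float:
    """Mean time between exceedances, in years."""
    return 1.0 / annual_rate(prob_exceedance, interval_years)


# 2% probability of exceedance in 50 years (assumed reading of the slide).
period = return_period(0.02, 50.0)
```

Under these assumptions the shaking level on such a map is the one exceeded on average roughly once every 2,475 years.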
22 SHA Computational Pathways
Pathway 1: Standardized Seismic Hazard Analysis
Pathway 2: Ground motion simulation
Pathway 3: Physics-based earthquake forecasting
Pathway 4: Ground-motion inverse problem
[Diagram labels: Other Data (Geology, Geodesy); Unified Structural Representation; Invert; Faults, Motions, Stresses; Anelastic model; Ground Motions; AWM; SRM; RDM; FSM; Intensity Measures; Earthquake Forecast Model; Attenuation Relationship]
AWP: Anelastic Wave Propagation
SRM: Site Response Model
FSM: Fault System Model
RDM: Rupture Dynamics Model
23 SCEC Computations Grid
- Prepare input to Pathway2 wave propagation code
- Pathway2PGV converts output into hazard map
- Map is visualized
24 Grid2003 -> Open Science Grid
- 30 sites (2100-2800 CPUs) and growing
- 400-1300 concurrent jobs
- 8 substantial applications and CS experiments
- Running since October 2003
[Site map; label: Korea]
http://www.ivdgl.org/grid2003
25 Open Science Grid Components
- Computers and storage at 28 sites (to date)
  - 2800 CPUs
- Uniform service environment at each site
  - Globus Toolkit provides basic authentication, execution management, data movement
  - Pacman installation system enables installation of numerous other VDT and application services
- Global virtual organization services
  - Certification and registration authorities, VO membership services, monitoring services
- Client-side tools for data access and analysis
  - Virtual data, execution planning, DAG management, execution management, monitoring
- IGOC: iVDGL Grid Operations Center
26 Grid2003 Applications To Date
- CMS: proton-proton collision simulation
- ATLAS: proton-proton collision simulation
- LIGO: gravitational wave search
- SDSS: galaxy cluster detection
- ATLAS: interactive analysis
- BTeV: proton-antiproton collision simulation
- SnB: biomolecular analysis
- GADU/Gnare: genome analysis
- Various computer science experiments
www.ivdgl.org/grid2003/applications
27 Scaling Grid2003 Workflows
Genome sequence analysis
Sloan digital sky survey
Physics data analysis
28 Example Grid3 Application: NVO Mosaic Construction
Construct custom mosaics on demand from multiple data sources
User specifies projection, coordinates, size, rotation, spatial sampling
NVO/NASA Montage: a small (1200 node) workflow
Work by Ewa Deelman et al., USC/ISI and Caltech
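A workflow like this is a directed acyclic graph (DAG) of tasks, and planning its execution amounts to finding which tasks can run at the same time. A minimal sketch using Python's standard `graphlib` module (Python 3.9 or later). The seven task names below are hypothetical and only loosely modeled on a mosaic pipeline; they are not the actual Montage workflow or its tools.

```python
from graphlib import TopologicalSorter

# Hypothetical mosaic workflow: each task maps to the tasks it depends on.
workflow = {
    "project_1": set(),
    "project_2": set(),
    "project_3": set(),
    "diff_1_2": {"project_1", "project_2"},
    "diff_2_3": {"project_2", "project_3"},
    "fit_background": {"diff_1_2", "diff_2_3"},
    "add_mosaic": {"fit_background"},
}


def execution_waves(dag):
    """Group tasks into waves; tasks within a wave have no mutual dependencies."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock successors
    return waves


waves = execution_waves(workflow)
```

Each wave could be submitted to Grid resources in parallel; a real planner would also account for data placement and site availability, which this sketch ignores.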
29 NEES: Network for Earthquake Engineering Simulation
Links instruments, data, computers, people
30 Sub-structured Computational and Experimental Simulation
- Use unique experimental facilities of distributed NEES sites
- Incorporate various static analysis modules, such as ABAQUS, FedeasLab, Matlab, OpenSees, Zeus-NL, etc., into virtual experiment
- Flexible combinations of modules are possible: all experiments, combination of experiment and computation, all computation
[Diagram labels: Time Integrator; Tested Structure; Static Analysis Module; Experiment Module]
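The diagram suggests a coordinator in which a time integrator imposes displacements on interchangeable modules and collects their restoring forces. A minimal Python sketch of that idea for a single degree of freedom, assuming an undamped, unforced system and an explicit central-difference integrator. This is an illustrative scheme, not necessarily the one used in the NEES experiments, and both "modules" here are linear-spring stand-ins with hypothetical stiffness values.

```python
import math
from typing import Callable, List

# A substructure maps an imposed displacement to a restoring force.
# In a real test, one of these would command a physical rig and read a load cell.
Substructure = Callable[[float], float]


def linear_spring(stiffness: float) -> Substructure:
    return lambda displacement: stiffness * displacement


def restoring_force(modules: List[Substructure], displacement: float) -> float:
    """Coordinator step: impose one displacement on every module, sum the forces."""
    return sum(module(displacement) for module in modules)


def integrate(modules, mass, u0, dt, steps):
    """Central-difference integration of m*u'' + r(u) = 0 with zero initial velocity."""
    accel0 = -restoring_force(modules, u0) / mass
    previous = u0 + 0.5 * dt * dt * accel0  # fictitious displacement at t = -dt
    current = u0
    history = [current]
    for _ in range(steps):
        force = restoring_force(modules, current)
        following = 2.0 * current - previous - dt * dt * force / mass
        previous, current = current, following
        history.append(current)
    return history


experiment_module = linear_spring(3.0)  # stand-in for a physical specimen
analysis_module = linear_spring(1.0)    # stand-in for a numerical model
history = integrate([experiment_module, analysis_module],
                    mass=1.0, u0=1.0, dt=0.001, steps=3142)
```

Because the integrator only sees displacement-in, force-out callables, the same loop covers all three combinations named on the slide: all experiments, a mix of experiment and computation, or all computation.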
31 Simulation Model
Sub-Structuring Technique
[Diagram labels: Deck; Pier 1; Pier 3; Soil 1; Soil 3; Dummy-Time: 1 sec.]
32 July MOST Experiment
UIUC Experimental Model
U. Colorado Experimental Model
SIMULATION COORDINATOR
NCSA Computational Model
33 MOST Column Test Specimens
Illinois Test Specimen
Colorado Test Specimen
34 The MOST Substructures
Slide courtesy of Bill Spencer and Narutoshi Nakata, UIUC
35 Experimental Results
36 The Santa Monica Freeway Ramp Structure
[Plan of Collector-Distributor 36; labels: N Venice Blvd.; S Venice Blvd.; La Cienega Blvd.; Piers]
37 Structural Failure
38 Preliminary Pier Test at UIUC
39 LIGO Gravitational Wave Observatory: Reliable Wide Area Data Replication
- 6 US sites, 3 EU sites
- Data produced at 1 Terabyte/day
- LDR: GridFTP, RLS, GSI, pyGlobus
- Replicating > 1 Terabyte/day (GridFTP)
- > 30 million replicas so far (RLS)
- MTBF: 1 month
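To put the 1 Terabyte/day figure in network terms, the short Python calculation below converts it to a sustained transfer rate. It assumes decimal units (1 Terabyte = 10^12 bytes) and perfectly even transfer over 24 hours; real replication is bursty, so peak rates would need to be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mbit_per_s(bytes_per_day: float) -> float:
    """Average rate in megabits per second needed to move a daily volume."""
    bits_per_day = bytes_per_day * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


# 1 Terabyte/day, assuming 1 Terabyte = 1e12 bytes.
rate = sustained_rate_mbit_per_s(1e12)
```

Under these assumptions the replication system has to sustain a little under 100 Mb/s around the clock to each destination that receives the full data stream.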
40 Overview
- eScience
- Grid technology
- Application case studies
- Next steps
41 Summary
- eScience methods no longer optional but now vital to scientific competitiveness
- Particularly true of environmental sciences
- We've demonstrated feasibility
- Grid technology is available and ready to use
- Considerable infrastructure exists
- Now time for large-scale deployment
- NMI GRIDS Center can help with outreach, training, community engagement, support
- You need to create sustained, multidisciplinary efforts to transform scientific methods
42 Lessons Learned
- Application communities should be ready to participate from the beginning
- Leadership: domain expert needed
- Policy issues must be considered up front
- Test-beds help to overcome cultural and language differences (e.g., MOST, EBD)
- Social engineering will be at least as important as software engineering
- Well-defined user interfaces will be critical for successful software development
43 NSF-Supported Cyberinfrastructure
- NSF Middleware Initiative GRIDS Center
  - Training and support
  - Focused engagements to design domain-specific integrated solutions
  - Community Engagement Workshops (next one scheduled for April 2005)
  - Open build/test tools and facility
- Other NMI-funded projects
  - Wealth of infrastructure software
- TeraGrid, Open Science Grid
  - On-demand access to computing and storage
44 GRIDS Center Distributed Test Facility
45 Scaling Up: Service-Oriented Science
[Layered diagram:]
- Content: Simulation code (x2); Expt design; Expt output
- Services: Certificate authority; Electronic notebook; Telepresence monitor; Simulation server; Portal server; Data archive; Metadata catalog
- Resources: Servers, storage, networks; Experimental apparatus
46 For More Information
- Globus Alliance
- www.globus.org
- Global Grid Forum
- www.ggf.org
- Open Science Grid
- www.opensciencegrid.org
- NSF Middleware Initiative
- www.grids-center.org
- www.nsf-middleware.org
2nd Edition: www.mkp.com/grid2