Title: The Anatomy of the Grid: Enabling Scalable Virtual Organizations
1 The Anatomy of the Grid: Enabling Scalable Virtual Organizations
- Ian Foster
- Mathematics & Computer Science Division
- Argonne National Laboratory
- and
- Dept. of Computer Science
- The University of Chicago
- http://www.mcs.anl.gov/foster
- David S. Angulo
- Dept. of Computer Science
- The University of Chicago
- and
- Mathematics & Computer Science Division
- Argonne National Laboratory
- http://www.cs.uchicago.edu/dangulo
2nd US-Hungarian Workshop on Cluster and Grid
Computing, February 6, 2002
2 Abstract
- "Grid" computing has emerged as an important new field
- Distinguished from conventional distributed computing by focus on
- Large-scale resource sharing
- Innovative applications
- High-performance orientation (in some cases)
- In this talk, this new field is defined
- First, the "Grid problem" is reviewed, which Ian Foster defines as
- flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (referred to as virtual organizations)
- Challenges in such settings
- authentication
- authorization
- resource access
- resource discovery
- and other challenges
3 Abstract (Cont.)
- This class of problem is addressed by Grid technologies
- Major Grid projects worldwide are reviewed
- Describe their contributions to the realization of this architecture
- Future Architecture Overview
- Open Grid Services Architecture is presented
4 Partial Acknowledgements
- Globus Toolkit™
- R&D involves
- many fine scientists & engineers at ANL/UofC, USC/ISI, and elsewhere (see www.globus.org)
- Led by
- Ian Foster @ Argonne/UofC
- Carl Kesselman @ USC/ISI
- Open Grid Services Architecture work performed by
- Ian Foster, Globus Co-PI @ Argonne/UofC
- Carl Kesselman, Globus Co-PI @ USC/ISI
- Steve Tuecke, Globus Toolkit Architect @ ANL
- Jeff Nick, Steve Graham, Jeff Frey @ IBM
- Strong collaborations with many outstanding EU, UK, US Grid projects
- Support from DOE, NASA, NSF, Microsoft, IBM
5 Grid Computing
6 The Grid Problem
- Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
7 Why Grids?
- A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- 1,000 physicists worldwide pool resources for petaflop analyses of petabytes of data
- Civil engineers collaborate to design, execute, & analyze shake table experiments
- Climate scientists visualize, annotate, & analyze terabyte simulation datasets
- A home user invokes architectural design functions at an application service provider
- An application service provider purchases cycles from compute cycle providers
8 Elements of the Problem
- Resource sharing
- Computers, storage, sensors, networks, ...
- Sharing always conditional: issues of trust, policy, payment, ...
- Coordinated problem solving
- Beyond client-server: distributed data analysis, computation, ...
- Dynamic, multi-institutional virtual orgs
- Community overlays on classic org structures
- Large or small, static or dynamic
9 Grids: Why Now?
- Moore's law improvements in computing produce highly functional end systems
- The Internet and burgeoning wired and wireless provide universal connectivity
- Network exponentials produce dramatic changes in geometry and geography
10 Grids: Why Now?
- Moore's law improvements in computing produce highly functional end systems
- The Internet and burgeoning wired and wireless provide universal connectivity
- Network exponentials produce dramatic changes in geometry and geography
- 9-month doubling: double Moore's law!
- 1986-2001: x340,000; 2001-2010: x4000?
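The growth factors on this slide can be checked against the "9-month doubling" claim with a little arithmetic. The sketch below is only an illustration: the function name is made up here, and the 18-month figure used for Moore's law is an assumption, since the slide does not state one.

```python
import math

def doubling_time_months(growth_factor: float, years: float) -> float:
    """Months per doubling implied by a total growth factor over a period."""
    return years * 12 / math.log2(growth_factor)

# Growth factors and year ranges as quoted on the slide.
past = doubling_time_months(340_000, 2001 - 1986)
projected = doubling_time_months(4_000, 2010 - 2001)

MOORE_MONTHS = 18  # assumed doubling period for Moore's law (not from the slide)

print(f"1986-2001: {past:.1f} months per doubling")
print(f"2001-2010: {projected:.1f} months per doubling")
print(f"Rate relative to Moore's law: {MOORE_MONTHS / projected:.1f}x")
```

Both periods come out between 9 and 10 months per doubling, which is consistent with the slide's "double Moore's law" remark under the 18-month assumption.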
11 A Little History
- Early 90s
- Gigabit testbeds, metacomputing
- Mid to late 90s
- Early experiments (e.g., I-WAY), software projects (e.g., Globus), application experiments
- 2002
- Major application communities emerging
- Major infrastructure deployments are underway
- Rich technology base has been constructed
- Global Grid Forum: >1000 people on mailing lists, 192 orgs at last meeting, 28 countries
12 The Grid World: Current Status
- Dozens of major Grid projects in scientific & technical computing/research & education
- Deployment, application, technology
- Considerable consensus on key concepts and technologies
- Globus Toolkit has emerged as de facto standard for major protocols & services
- Global Grid Forum has emerged as a significant force
- And first Grid proposals at IETF
13 Selected Major Grid Projects
Name | URL | Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid | DOE, NSF | Create & deploy group collaboration systems using commodity technologies
BlueGrid | - | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom | DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org | DOE Office of Science | Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG) | earthsystemgrid.org | DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org | European Union | Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics
14 Selected Major Grid Projects
Name | URL | Sponsors | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org | European Union | Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org | DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org | DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab | gridlab.org | European Union | Grid technologies and applications
GridPP | gridpp.ac.uk | U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org | NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
15 Selected Major Grid Projects
Name | URL | Sponsors | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads | NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org | NSF | Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov | NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org | NSF | Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org | NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net | DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments
16 Selected Major Grid Projects
Name | URL | Sponsors | Focus
TeraGrid | teragrid.org | NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK eScience Grid | grid-support.ac.uk | U.K. eScience | Support center for Grid projects within the U.K.
Unicore | - | BMBFT | Technologies for remote access to supercomputers
Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS
See also www.gridforum.org
17 Grid Communities & Applications: Data Grids for High Energy Physics
www.griphyn.org
www.ppdg.net
www.eu-datagrid.org
18 Grid Communities and Applications: Mathematicians Solve NUG30
- Community: an informal collaboration of mathematicians and computer scientists
- Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
- Solves NUG30 quadratic assignment problem
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
www.mcs.anl.gov/metaneos
Argonne, Iowa, NWU, Wisconsin
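As a rough sanity check on the NUG30 figures, the delivered CPU time can be converted into an average processor count and compared with the quoted peak. The sketch below also confirms that the listed solution is a permutation of 1..30, as a quadratic assignment solution must be. Variable names are illustrative, and the calculation assumes the 7 days are wall-clock time.

```python
# Figures quoted on the slide.
CPU_SECONDS = 3.46e8
WALL_DAYS = 7
PEAK_PROCESSORS = 1009

# Average number of processors busy over the run, assuming 7 wall-clock days.
wall_seconds = WALL_DAYS * 24 * 3600
avg_processors = CPU_SECONDS / wall_seconds

# The assignment printed on the slide, one entry per position.
solution = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22,
            13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]
is_permutation = sorted(solution) == list(range(1, 31))

print(f"Average processors: {avg_processors:.0f}")
print(f"Average / peak: {avg_processors / PEAK_PROCESSORS:.0%}")
print(f"Valid permutation of 1..30: {is_permutation}")
```

Under that assumption the run averaged a little over half of its peak processor count, which fits the picture of opportunistic resources joining and leaving the pool.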
19 Grid Communities and Applications: Network for Earthquake Eng. Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
- On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
www.neesgrid.org
20 The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Figure: TeraGrid site diagram. Sites: Caltech, Argonne, NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB). Each site is drawn with site resources, archival storage (HPSS or UniTree), and links to external networks.]
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
21 Intl. Virtual Data Grid Lab.
www.ivdgl.org
22 Access Grid
- Collaborative work among large groups
- 50 sites worldwide
- Use Grid services for discovery, security
- www.scglobal.org
Access Grid: Argonne, others
www.accessgrid.org
23 Grid Architecture & Globus Toolkit
- The question
- What is needed for resource sharing & coordinated problem solving in dynamic virtual organizations (VOs)?
- The answer
- Major issues identified: membership, resource discovery & access, ...
- Grid architecture captures core elements, emphasizing pre-eminent role of protocols
- Globus Toolkit has emerged as de facto standard for major protocols & services
24 The Critical Role of Protocols
- Need for interoperability when different groups want to share resources
- E.g., IP lets me talk to your computer, but how do we establish & maintain sharing?
- How do I discover, authenticate, authorize, describe what I want to do, etc., etc.?
- Need for shared infrastructure services to avoid repeated development, installation, e.g.
- One port/service for remote access to computing, not one per tool/application
- X.509 enables sharing of Certificate Authorities
25 Grid Architecture
For more info: www.globus.org/research/papers/anatomy.pdf
26 Globus Project and Toolkit
- Globus Project
- R&D project at ANL, U.Chicago, USC/ISI
- Emphasis on identifying and defining core protocols and services
- O(40) researchers & developers
- Globus Toolkit
- A major product of the Globus Project
- Open source software: reference implementation of core protocols & services
- Growing open source developer community
27 Globus Architecture (1): Fabric Layer
- Diverse resources that may be shared
- Computers, clusters, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc., etc.
- Speak connectivity, resource protocols
- The neck of the protocol hourglass
- May implement standard behaviors
- Reservation, pre-emption, virtualization
- Grid operation can have profound implications for resource behavior
[Figure: a Grid resource exposing registration, enquiry, management, and access protocol(s)]
28 Globus Architecture (2): Connectivity Layer Protocols & Services
- Communication
- Internet protocols: IP, DNS, routing, etc.
- Security: Grid Security Infrastructure (GSI)
- Uniform authentication & authorization mechanisms in multi-institutional setting
- Single sign-on, delegation, identity mapping
- Public key technology, SSL, X.509, GSS-API (several Internet drafts document extensions)
- Supporting infrastructure: Certificate Authorities, key management, etc.
29 GSI in Action: Create Processes at A and B that Communicate & Access Files at C
[Figure: a user, a computer at Site A (Kerberos), a computer at Site B (Unix), and a storage system at Site C (Kerberos)]
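The identity-mapping idea behind single sign-on can be illustrated with a toy lookup: one global subject name is translated into whatever local account each site uses. This is only a sketch under stated assumptions, not the GSI API. The subject name, the local account names, and the `map_identity` helper are all hypothetical; only the site labels come from the figure.

```python
from typing import Optional

# Hypothetical per-site tables: global subject name -> local account.
# Real deployments keep such mappings in site-controlled configuration.
SITE_MAPS = {
    "Site A (Kerberos)": {"/O=Grid/CN=Jane Doe": "jdoe@SITEA.EXAMPLE"},
    "Site B (Unix)": {"/O=Grid/CN=Jane Doe": "jane"},
}

def map_identity(site: str, subject: str) -> Optional[str]:
    """Return the local account for a global subject, or None if unmapped."""
    return SITE_MAPS.get(site, {}).get(subject)

subject = "/O=Grid/CN=Jane Doe"  # hypothetical certificate subject
for site in ("Site A (Kerberos)", "Site B (Unix)", "Site C (Kerberos)"):
    local = map_identity(site, subject)
    print(f"{site}: {local if local else 'access denied (no mapping)'}")
```

The point of the design is that the user authenticates once with the global identity, while each site keeps control of its own local accounts and security mechanism.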
30 Globus Architecture (3): Resource Layer Protocols & Services
- Resource management: GRAM
- Remote allocation, reservation, monitoring, control of compute resources
- Data access: GridFTP
- High-performance data access & transport
- Information: MDS (GRRP, GRIP)
- Access to structure & state information
- & others emerging: database access, code repository access, accounting, ...
- All integrated with GSI
31 GRAM Resource Management Protocol
- Grid Resource Allocation & Management
- Allocation, monitoring, control of computations
- Secure remote access to diverse schedulers
- Current evolution
- Immediate and advance reservation
- Multiple resource types: manage anything
- Recoverable requests, timeout, etc.
- Evolve to Web Services
- Policy evaluation points for restricted proxies
Karl Czajkowski, Steve Tuecke, others
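The "allocation, monitoring, control" cycle can be pictured as a small job state machine. The sketch below is a toy model for illustration only: the `Job` class, its methods, and the state names are assumptions made here and do not reproduce the GRAM protocol or any Globus Toolkit API.

```python
from enum import Enum

class JobState(Enum):
    PENDING = "pending"   # request accepted, waiting for the local scheduler
    ACTIVE = "active"     # running on the allocated resource
    DONE = "done"
    FAILED = "failed"

# Transitions allowed in this toy model.
TRANSITIONS = {
    JobState.PENDING: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.DONE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}

class Job:
    """A remote computation whose state a client can monitor and control."""

    def __init__(self, executable: str, count: int = 1):
        self.executable = executable
        self.count = count
        self.state = JobState.PENDING
        self.history = [self.state]

    def advance(self, new_state: JobState) -> None:
        """Monitoring: record a state change reported by the resource."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state
        self.history.append(new_state)

    def cancel(self) -> None:
        """Control: stop a job that has not finished yet."""
        if self.state in (JobState.PENDING, JobState.ACTIVE):
            self.advance(JobState.FAILED)

job = Job("/bin/hostname", count=4)
job.advance(JobState.ACTIVE)
job.advance(JobState.DONE)
print([s.name for s in job.history])
```

Keeping the client-visible states independent of any particular scheduler is one way to read the slide's "secure remote access to diverse schedulers".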
32 Data Access & Transfer
- GridFTP: extended version of popular FTP protocol for Grid data access and transfer
- Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.
- Third-party data transfers, partial file transfers
- Parallelism, striping (e.g., on PVFS)
- Reliable, recoverable data transfers
- Reference implementations
- Existing clients and servers: wuftpd, nicftp
- Flexible, extensible libraries
Bill Allcock, Joe Bester, John Bresnahan, Steve Tuecke, others
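Partial file transfers and parallelism combine naturally: a file is split into byte ranges, each range moves on its own stream, and the receiver writes each range at its offset. The sketch below imitates this in memory with threads. It is an assumption-laden illustration, not GridFTP itself: `byte_ranges`, `fetch_range`, and `parallel_copy` are hypothetical helpers, and no network or FTP commands are involved.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

def byte_ranges(size: int, streams: int) -> List[Tuple[int, int]]:
    """Split [0, size) into at most `streams` contiguous (offset, length) ranges."""
    if size == 0:
        return []
    chunk = -(-size // streams)  # ceiling division
    return [(off, min(chunk, size - off)) for off in range(0, size, chunk)]

def fetch_range(source: bytes, offset: int, length: int) -> Tuple[int, bytes]:
    """Stand-in for a partial file transfer of one byte range."""
    return offset, source[offset:offset + length]

def parallel_copy(source: bytes, streams: int = 4) -> bytes:
    """Copy `source` using several concurrent range transfers."""
    dest = bytearray(len(source))
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(fetch_range, source, off, n)
                   for off, n in byte_ranges(len(source), streams)]
        for future in futures:
            off, data = future.result()
            dest[off:off + len(data)] = data  # place each range at its offset
    return bytes(dest)

payload = bytes(range(256)) * 40
assert parallel_copy(payload, streams=4) == payload
print(byte_ranges(10, 4))
```

Because every range carries its own offset, a failed range can be retried on its own, which is also the basis for recoverable transfers.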
33 Grid Services Architecture (4): Collective Layer Protocols & Services
- Community membership & policy
- E.g., Community Authorization Service
- Index/metadirectory/brokering services
- E.g., Globus GIIS, Condor Matchmaker
- Replica management and replica selection
- Optimize aggregate data access performance
- Co-reservation and co-allocation services
- End-to-end performance
- Middle tier services
- MyProxy credential repository, portal services
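Replica selection can be sketched as picking the copy with the lowest estimated transfer time. The example below is a minimal illustration under assumed inputs: the `Replica` fields, the site names, and the simple cost model (startup delay plus size over bandwidth) are hypothetical and not taken from any Globus service.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Replica:
    site: str
    bandwidth_mbps: float  # estimated throughput to the client (megabits/s)
    latency_s: float       # estimated startup delay (seconds)

def estimated_seconds(replica: Replica, size_mb: float) -> float:
    """Startup delay plus transfer time for `size_mb` megabytes."""
    return replica.latency_s + size_mb * 8 / replica.bandwidth_mbps

def select_replica(replicas: Iterable[Replica], size_mb: float) -> Replica:
    """Pick the replica with the smallest estimated access time."""
    return min(replicas, key=lambda r: estimated_seconds(r, size_mb))

replicas = [
    Replica("site-a.example", bandwidth_mbps=100.0, latency_s=0.5),
    Replica("site-b.example", bandwidth_mbps=800.0, latency_s=5.0),
]

print(select_replica(replicas, size_mb=10).site)      # small file
print(select_replica(replicas, size_mb=10_000).site)  # large file
```

With these made-up numbers the nearby low-latency copy wins for the small file and the high-bandwidth copy wins for the large one, which is why selection depends on the request and not only on the replica.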
34 Data Grids
- Grid infrastructures, tools, and applications focused on enabling distributed access to, & analysis of, large amounts of data
- A specialization and extension of standard Grid technologies
- Current application domains include high energy & nuclear physics, climate data analysis, astronomy, bioinformatics
35 Grid Physics Network (GriPhyN)
- Enabling R&D for advanced data grid systems, focusing in particular on Virtual Data concept
ATLAS, CMS, LIGO, SDSS
Paul Avery, Ian Foster, Co-PIs
www.griphyn.org
36 Future Directions
- Initial exploration (1996-1999; Globus 1.0)
- Extensive appln experiments; core protocols
- Data Grids (1999-??; Globus 2.0)
- Large-scale data management and analysis
- Open Grid Services Architecture (2001-??; Globus 3.0)
- Integration w/ Web services, hosting envs.
- Integration with databases
- Integrated set of higher-level services
- Scalable systems (2003-??)
- Sensors, wireless, ubiquitous computing
37 Summary
- The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
- Grid architecture: Protocol, service definition for interoperability & resource sharing
- Globus Toolkit: a source of protocol and API definitions and reference implementations
- And many projects applying Grid concepts (& Globus technologies) to important problems
- Timely to start applying technologies to industrial problems, within & outside S&TC
38 For More Information
- The Globus Project
- www.globus.org
- Global Grid Forum
- www.gridforum.org
- Grid architecture
- www.globus.org/research/papers/anatomy.pdf
- Open Grid Services Architecture (soon)
- www.globus.org/research/papers/ogsa.pdf
- www.globus.org/research/papers/gsspec.pdf