Title: Grid Computing and the Globus Toolkit
1Grid Computing and the Globus Toolkit
- Jennifer M. Schopf
- Argonne National Lab
- National eScience Centre
2- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
- Some Other Resources
3What is a Grid
- Resource sharing
- Computers, storage, sensors, networks, ...
- Sharing always conditional: issues of trust, policy, negotiation, payment, ...
- Coordinated problem solving
- Beyond client-server: distributed data analysis, computation, collaboration, ...
- Dynamic, multi-institutional virtual orgs
- Community overlays on classic org structures
- Large or small, static or dynamic
4Not A New Idea
- Late 70s: Networked operating systems
- Late 80s: Distributed operating systems
- Early 90s: Heterogeneous computing
- Mid 90s: Metacomputing
- Then the Grid (Foster and Kesselman, 1999)
- Also called parallel distributed computing
5Why is this hard/different?
- Lack of central control
- Where things run
- When they run
- Shared resources
- Contention, variability
- Communication
- Different sites implies different sys admins,
users, institutional goals, and often strong
personalities
6So why do it?
- Computations that need to be done within a time limit
- Data that can't fit on one site
- Data owned by multiple sites
- Applications that need to be run bigger, faster, more
7History
- In the early 90s, Ian Foster (ANL, U-C) and Carl Kesselman (USC-ISI) enjoyed helping scientists apply distributed computing.
- Opportunities seemed ripe for the picking.
- Application of technology always uncovers new and interesting requirements.
- Science is cool!
- Big/Innovative science is even cooler!
8History (continued)
- While helping to build/integrate a diverse range of applications, the same problems kept showing up over and over again.
- Too many different security systems
- Too many different scheduling/execution mechanisms
- Too many different storage systems
- Too many different monitoring/status/event systems
9What Kinds of Applications?
- Computation intensive
- Interactive simulation (climate modeling)
- Very large-scale simulation and analysis (galaxy formation, gravity waves, battlefield simulation)
- Engineering (parameter studies, linked component models)
- Data intensive
- Experimental data analysis (high-energy physics)
- Image and sensor analysis (astronomy, climate study, ecology)
- Distributed collaboration
- Online instrumentation (microscopes, x-ray devices, etc.)
- Remote visualization (climate studies, biology)
- Engineering (large-scale structural testing, chemical engineering)
- In all cases, the problems were big enough that they required people in several organizations to collaborate and share computing resources, data, instruments.
10What Types of Problems?
- Too hard to keep track of authentication data (ID/password) across institutions
- Too hard to monitor system and application status across institutions
- Too many ways to submit jobs
- Too many ways to store and access files and data
- Too many ways to keep track of data
- Too easy to leave dangling resources lying around (robustness)
11Getting Started
- A bit of background
- Grid Architecture Overview
- Working with applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an example application
- Globus Toolkit 4.0 and Futures
12Evolution of the Grid
[Figure: stages of Grid evolution, plotted as increased functionality and standardization over time. Labels, from earliest to latest:]
- Custom solutions
- X.509, LDAP, FTP, ...
- Globus Toolkit: de facto standards; GGF: GridFTP, GSI (leveraging IETF)
- Web services
- Open Grid Services Arch: GGF: OGSI, WSRF, ... (leveraging OASIS, W3C, IETF); multiple implementations, including Globus Toolkit
- App-specific Services
13With Grid Computing: Forget Homogeneity!
- Trying to force homogeneity on users is futile. Everyone has their own preferences, sometimes even dogma.
- The Internet provides the model
14Service-Oriented Architecture
- Idea is simple (and old)
- Define remote activities in terms of interface and behavior, not implementation
- Devil is in the details
- How to describe, discover, access various types of service (semantically and practically)
- Latest instantiation: Web services
- Broad adoption, flexible XML-based model
- WSDL, SOAP, WS-Security
- Interfaces still being defined to date
- Performance challenges
15Open Grid Services Architecture
- Define a service-oriented architecture
- the key to effective virtualization
- to address vital Grid requirements
- AKA utility, on-demand, system management, collaborative computing, etc.
- building on Web service standards.
- extending those standards when needed
16Grid and Web Services Convergence
- The definition of WSRF means that the Grid and
Web services communities can move forward on a
common base.
17WS Core Enables Frameworks: E.g., Resource Management
[Figure: layered stack, top to bottom:]
- Applications of the framework (compute, network, storage provisioning, job reservation and submission, data management, application service QoS, ...)
- WS-Agreement (agreement negotiation)
- WS Distributed Management (lifecycle, monitoring, ...)
- WS-Resource Framework and WS-Notification (resource identity, lifetime, inspection, subscription, ...)
- Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, ...)
WS-Resource Framework and WS-Notification: an evolution of Open Grid Services Infrastructure (OGSI)
18WSRF and WS-Notification
- Naming and bindings (basis for virtualization)
- Every resource can be uniquely referenced, and has one or more associated services for interacting with it
- Lifecycle (basis for fault resilient state management)
- Resources created by services following factory pattern
- Resources destroyed immediately or scheduled
- Information model (basis for monitoring and discovery)
- Resource properties associated with resources
- Operations for querying and setting this info
- Asynchronous notification of changes to properties
- Service Groups (basis for registries and collective svcs)
- Group membership rules and membership management
- Base Fault type
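The resource pattern listed above (factory creation, unique references, resource properties, change notification, immediate or scheduled destruction) can be pictured with a small in-memory sketch. This is not the WSRF or Globus API; the class `ResourceHome` and all of its method names are hypothetical and chosen only for illustration.

```python
import itertools
import time


class ResourceHome:
    """Hypothetical factory that creates and tracks stateful resources."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._properties = {}   # resource key -> dict of resource properties
        self._expiry = {}       # resource key -> termination time, or None
        self._listeners = {}    # resource key -> callbacks for property changes

    def create(self, **properties):
        """Factory pattern: return a unique key naming the new resource."""
        key = f"resource-{next(self._ids)}"
        self._properties[key] = dict(properties)
        self._expiry[key] = None
        self._listeners[key] = []
        return key

    def get_property(self, key, name):
        return self._properties[key][name]

    def set_property(self, key, name, value):
        """Update a property and notify every subscriber of the change."""
        self._properties[key][name] = value
        for callback in self._listeners[key]:
            callback(key, name, value)

    def subscribe(self, key, callback):
        self._listeners[key].append(callback)

    def destroy(self, key):
        """Immediate destruction."""
        for table in (self._properties, self._expiry, self._listeners):
            table.pop(key, None)

    def set_termination_time(self, key, when):
        """Scheduled destruction: the resource is removed by sweep() after `when`."""
        self._expiry[key] = when

    def sweep(self, now=None):
        """Destroy every resource whose termination time has passed."""
        now = time.time() if now is None else now
        expired = [k for k, t in self._expiry.items() if t is not None and t <= now]
        for key in expired:
            self.destroy(key)
        return expired
```

A client would call `create(state="Pending")`, subscribe to changes, and either call `destroy` or set a termination time and let a periodic `sweep` clean up, which is what keeps abandoned resources from lingering.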
19Theory -> Practice
20- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
- Some Other Resources
21Methodology
- Building a Grid system or application is currently an exercise in software integration.
- Define user requirements
- Derive system requirements or features
- Survey existing components
- Identify useful components
- Develop components to fit into the gaps
- Integrate the system
- Deploy and test the system
- Maintain the system during its operation
- This should be done iteratively, with many loops and eddies in the flow.
22Who Is the Grid For?
- Any Grid (distributed/collaborative) application or system will involve several classes of people.
- End users (e.g., Scientists, Engineers, Customers)
- Application/Product Developers
- System Administrators
- System Architects and Integrators
- Each user class has unique skills and unique requirements.
- The user class whose needs are met varies from tool to tool (even within the Globus Toolkit).
23How it Really Happens
- Implementations are provided by a mix of
- Application-specific code
- Off the shelf tools and services
- Tools and services from the Globus Toolkit
- Tools and services from the Grid community (compatible with GT)
- Glued together by
- Application development
- System integration
24How it Really Happens
[Figure: example deployment. Components shown: Compute Server (x2), Simulation Tool, Web Browser, Web Portal, Registration Service, Camera (x2), Telepresence Monitor, Data Viewer Tool, Chat Tool, Data Catalog, Credential Repository, Database service (x3), Certificate authority.]
- Users work with client applications
- Application services organize VOs and enable access to other services
- Collective services aggregate and/or virtualize resources
- Resources implement standard access and management interfaces
25How it Really Happens (without the Grid)
[Figure: the same deployment, with the back-end resources labeled A to E: Compute Server A, Compute Server B, Database service C, Database service D, Database service E. Other components shown: Simulation Tool, Web Browser, Web Portal, Registration Service, Camera (x2), Telepresence Monitor, Data Viewer Tool, Chat Tool, Data Catalog, Credential Repository, Certificate authority.]
- Users work with client applications
- Application services organize VOs and enable access to other services
- Collective services aggregate and/or virtualize resources
- Resources implement standard access and management interfaces
26How it Really Happens (with the Grid)
[Figure: the same deployment using Grid components. Components shown: Compute Server with Globus GRAM (x2), Simulation Tool, Web Browser, CHEF, Globus Index Service, Camera (x2), Telepresence Monitor, Data Viewer Tool, CHEF Chat Teamlet, Globus MCS/RLS, MyProxy, Database service with Globus DAI (x3), Certificate Authority.]
- Users work with client applications
- Application services organize VOs and enable access to other services
- Collective services aggregate and/or virtualize resources
- Resources implement standard access and management interfaces
27- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
- Some Other Resources
28Globus Is Standard Plumbing for the Grid
- Not turnkey solutions, but building blocks and tools for application developers and system integrators.
- Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance.
- Since these solutions exist and others are already using them (and they're free), it's easier to reuse than to reinvent.
- And compatibility with other Grid systems comes for free!
29Leveraging Existing and Proposed Standards
- SSL/TLS v1 (from OpenSSL) (IETF)
- LDAP v3 (from OpenLDAP) (IETF)
- X.509 Proxy Certificates (IETF)
- GridFTP v1.0 (GGF)
- OGSI v1.0 (GGF)
- And others on the road to standardization
- WSRF (GGF, OASIS), DAI, WS-Agreement, WSDL 2.0,
WSDM, SAML, XACML
30What Is the Globus Toolkit?
- The Globus Toolkit is a collection of solutions to problems that frequently come up when trying to build collaborative distributed applications.
- Heterogeneity
- To date (v1.0 - v4.0), the Toolkit has focused on simplifying heterogeneity for application developers.
- We aspire to include more vertical solutions in future versions.
- Standards
- Our goal has been to capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF).
- The Toolkit also includes reference implementations of new/proposed standards in these organizations.
31What Does the Globus Toolkit Cover?
32Areas of Competence
- Connectivity Layer Solutions
- Service Management (WSRF)
- Monitoring/Discovery (WSRF and MDS)
- Security (GSI and WS-Security)
- Communication (XIO)
- Resource Layer Solutions
- Computing / Processing Power (GRAM)
- Data Access/Movement (GridFTP, OGSA-DAI)
- Collective Layer Solutions
- Data Management (RLS, MCS, OGSA-DAI)
- Monitoring/Discovery (MDS)
- Security (CAS)
33What Is the Globus Toolkit?
- A Grid development environment
- Develop new OGSA-compliant Web Services
- Develop applications using Java or C/C++ Grid APIs
- Secure applications using basic security mechanisms
- A set of basic Grid services
- Job submission/management
- File transfer (individual, queued)
- Database access
- Data management (replication, metadata)
- Monitoring/Indexing system information
- Tools and Examples
- The prerequisites for many Grid community tools
- Note: GT3 and GT4 releases include both WS and pre-WS components!
35How To Use the Globus Toolkit
- By itself, the Toolkit has surprisingly limited end user value.
- There's very little user interface material there.
- You can't just give it to end users (scientists, engineers, marketing specialists) and tell them to do something useful!
- The Globus Toolkit is useful to application developers and system integrators.
- You'll need to have a specific application or system in mind.
- You'll need to have the right expertise.
- You'll need to set up prerequisite hardware/software.
- You'll need to have a plan.
36Easy to Use But Few Applications are Easy
- The uses that the Toolkit has been aimed at are not easy challenges!
- The Globus Toolkit makes them easier.
- Providing solutions to the most common problems and promoting standard solutions
- A well-designed implementation that allows many things to be built on it (lots of happy developers!)
- 6 years of providing support to Grid builders
- Ever-improving documentation, installation, configuration, training
37Global Community
38 100,000 Computers: A Healthy Computing Pyramid
[Figure: the computing pyramid today, with tiers Supercomputer, Cluster, Desktop.]
39- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
- Some Other Resources
40Review: How it Really Happens
- Implementations are provided by a mix of
- Application-specific code
- Off the shelf tools and services
- Tools and services from the Globus Toolkit
- Tools and services from the Grid community (compatible with GT)
- Glued together by
- Application development
- System integration
41Iterative Design
- Ideal for cutting-edge activities where detailed needs and the final goal aren't fully known ahead of time.
- Provides maximum adaptability, course correction.
- Produces useful results early.
42Grid2003: An Operational Grid
- 28 sites (2100-2800 CPUs) and growing
- 400-1300 concurrent jobs
- 8 substantial applications plus CS experiments
- Running since October 2003
[Figure: map of Grid2003 sites, including one in Korea.]
http://www.ivdgl.org/grid2003
43Computation-Intensive Science: Grid2003
- GriPhyN - Grid Physics Network (NSF)
- iVDGL - International Virtual Data Grid Laboratory (NSF)
- LCG - LHC Computing Grid (EU)
- PPDG - Particle Physics Data Grid (DOE)
44Grid2003 Project Goals
- Ramp up U.S. Grid capabilities in anticipation of LHC experiment needs in 2005.
- Build, deploy, and operate a working Grid.
- Include all U.S. LHC institutions.
- Run real scientific applications on the Grid.
- Provide state-of-the-art monitoring services.
- Cover non-technical issues (e.g., SLAs) as well as technical ones.
- Unite the U.S. CS and Physics projects that are aimed at support for LHC.
- Common infrastructure
- Joint (collaborative) work
45Grid2003 Requirements
- General Infrastructure
- Support Multiple Virtual Organizations
- Production Infrastructure
- Standard Grid Services
- Interoperability with European LHC Sites
- Easily Deployable
- Meaningful Performance Measurements
46Grid2003 Applications
- 6 VOs, 11 Apps
- CMS proton-proton collision simulation
- ATLAS proton-proton collision simulation
- LIGO gravitational wave search
- SDSS galaxy cluster detection
- ATLAS interactive analysis
- BTeV proton-antiproton collision simulation
- SnB biomolecular analysis
- GADU/Gnare genome analysis
- Various computer science experiments
47Example Grid2003 Workflows
Genome sequence analysis
Sloan digital sky survey
Physics data analysis
48Grid2003 Components
- Security
- GT GSI, CAS, GSI-OpenSSH
- Monitoring
- GT MDS, MonALISA, Ganglia
- Job Submission
- GT GRAM, Condor-G, Chimera and Pegasus
- Data Tools
- GT GridFTP, GT RLS, GT MCS
49Grid2003 Components
- Computers and storage at 28 sites (to date)
- 2800 CPUs
- Uniform service environment at each site
- Globus Toolkit provides basic authentication, execution management, data movement
- Pacman installation system enables installation of numerous other VDT and application services
- Global and virtual organization services
- Certification and registration authorities, VO membership services, monitoring services
- Client-side tools for data access and analysis
- Virtual data, execution planning, DAG management, execution management, monitoring
- IGOC: iVDGL Grid Operations Center
50System Overview
51Grid2003 Operation
- All software to be deployed is integrated in the Virtual Data Toolkit (VDT) distribution.
- Each participating institution deploys the VDT on their systems, which provides a standard set of software and configuration.
- A core software team (GriPhyN, iVDGL) is responsible for integration and development.
- A set of centralized services (e.g., directory services) is maintained Grid-wide.
- Applications are developed with VDT capabilities, architecture, and services directly in mind.
52Grid2003 Deployment
- Software installed at more than 25 U.S. LHC institutions, plus one Korean site.
- More than 2000 CPUs in total.
- More than 100 individuals authorized to use the Grid.
- Peak throughput of 500-900 jobs running concurrently, completion efficiency of 75%.
53Grid2003 Interesting Points
- Each virtual organization includes its own set of system resources (compute nodes, storage, etc.) and people. VO membership info is managed system-wide, but policies are enforced at each site.
- Throughput is a key metric for success, and monitoring tools are used to measure it and generate reports for each VO.
54Grid2003 Metrics
55Grid2003 Summary
- Working Grid for wide set of applications
- Joint effort between application scientists, computer scientists
- Globus software as a starting point, additions from other communities as needed
56- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
- Some Other Resources
57The Globus Toolkit Ecosystem
- Pieces of the Grid world-
- Globus Toolkit and associated software
- Security
- Monitoring
- Resource management
- Portals
- Packaging
58Why Grid Security is Hard
- Resources being used may be valuable and the problems being solved sensitive
- Resources are often located in distinct administrative domains
- Each resource has own policies and procedures
- Set of resources used by a single computation may be large, dynamic, and unpredictable
- Not just client/server, requires delegation
- It must be broadly available and applicable
- Standard, well-tested, well-understood protocols integrated with wide variety of tools
59Security Tools
- Basic Grid Security Mechanisms
- Certificate Generation Tools
- Certificate Management Tools
- Getting users registered to use a Grid
- Getting Grid credentials to wherever they're needed in the system
- Authorization/Access Control Tools
- Storing and providing access to system-wide authorization information
60Basic Grid Security Mechanisms
- Basic Grid authentication and authorization mechanisms come in two flavors.
- Pre-Web services
- Web services
- Both are included in the Globus Toolkit, and both provide vital security features.
- Grid-wide identities implemented as PKI certificates
- Transport-level and message-level authentication
- Ability to delegate credentials to agents
- Ability to map between Grid and local identities
- Local security administration and enforcement
- Single sign-on support implemented as proxies
- A plug-in framework for authorization decisions
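The mapping between Grid and local identities is commonly kept in a grid-mapfile, where each line pairs a quoted certificate subject with one or more local account names. The sketch below assumes that simple layout; the subjects, account names, and the helper names `parse_gridmap` and `local_account` are invented for illustration and are not part of any Globus library.

```python
import shlex


def parse_gridmap(text):
    """Parse grid-mapfile style text into {certificate subject: [local accounts]}.

    Assumed layout per line: "<quoted subject DN>" account[,account...]
    Blank lines and lines starting with '#' are ignored.
    """
    mapping = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected a subject and an account list")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


def local_account(mapping, subject):
    """Return the first local account mapped to a subject, or None if unmapped."""
    accounts = mapping.get(subject)
    return accounts[0] if accounts else None


if __name__ == "__main__":
    example = '''
    # hypothetical entries
    "/O=Grid/OU=Example/CN=Jane Doe" jdoe
    "/O=Grid/OU=Example/CN=Build Service" build,ci
    '''
    print(local_account(parse_gridmap(example), "/O=Grid/OU=Example/CN=Jane Doe"))
```

Because the file lives at each site, the site administrator keeps the final say over which Grid identities reach which local accounts, which matches the "local security administration and enforcement" point above.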
61Basic Grid Security Mechanisms
- Basic security mechanisms are provided as libraries/classes and APIs.
- Integrated with other GT tools and services
- Integrated with many Grid community tools and services (and applications and systems)
- A few stand-alone tools are also included.
62A Cautionary Note
- Grid security mechanisms are tedious to set up.
- If exposed to users, hand-holding is usually required.
- These mechanisms can be hidden entirely from end users, but still used behind the scenes.
- These mechanisms exist for good reasons.
- Many useful things can be done without Grid security.
- It is unlikely that an ambitious project could go into production operation without security like this.
- Most successful projects end up using Grid security, but using it in ways that end users don't see much.
63Globus Certificate Service
- An online service that issues low-quality GSI certificates
- Intended for people who want to experiment with Grid components that require certificates but do not have any other means of acquiring certificates.
- These certificates are not to be used on production systems.
- Not a true Certificate Authority (CA)
- No revoking or reissuing certificates
- No verification of identities
- The service itself is not especially secure.
64Simple CA
- A convenient method of setting up a certificate authority (CA).
- The Certificate Authority can then be used to issue certificates for users and services that work with GSI and WS-Security.
- Simple CA is intended for operators of small Grid testing environments and users who are not part of a larger Grid.
- Most production Grids will not accept certificates that are not signed by a well-known CA, so the certificates generated by Simple CA will usually not be sufficient to gain access to production services.
65MyProxy
- MyProxy is a remote service that stores user credentials.
- Users can request proxies for local use on any system on the network.
- Web Portals can request user proxies for use with back-end Grid services.
- Grid administrators can pre-load credentials in the server for users to retrieve when needed.
- Greatly simplifies certificate management!
66CAS: Community Authorization Service
- CAS allows resource providers to specify coarse-grained access control policies in terms of communities as a whole.
- Fine-grained access control is delegated to the community.
- Resource providers maintain ultimate authority over their resources (including per-user control and auditing) but are spared most day-to-day policy administration tasks.
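The two-level policy split can be illustrated with a short sketch: a request is allowed only when the resource provider's coarse-grained policy for the community and the community's fine-grained policy for the user both permit it. The policy layout (path prefixes with action sets), the function names, and the example data are assumptions made for illustration, not the CAS policy format.

```python
def _allows(rules, action, path):
    """True if any (path prefix, allowed actions) rule covers the request."""
    return any(action in actions and path.startswith(prefix)
               for prefix, actions in rules)


def authorize(site_policy, community_policy, community, user, action, path):
    """Combine coarse-grained site policy with fine-grained community policy.

    site_policy:      {community: [(path_prefix, {actions}), ...]}  set by the resource provider
    community_policy: {user:      [(path_prefix, {actions}), ...]}  set by the community
    The effective right is the intersection of the two levels.
    """
    coarse = _allows(site_policy.get(community, []), action, path)
    fine = _allows(community_policy.get(user, []), action, path)
    return coarse and fine


if __name__ == "__main__":
    # Hypothetical policies for a made-up community.
    site_policy = {"climate-vo": [("/data/climate/", {"read", "write"})]}
    community_policy = {
        "alice": [("/data/climate/", {"read", "write"})],
        "bob": [("/data/climate/public/", {"read"})],
    }
    print(authorize(site_policy, community_policy,
                    "climate-vo", "bob", "read", "/data/climate/public/a.nc"))
```

The resource provider only maintains one entry per community, while day-to-day changes (adding a user, narrowing a user's rights) happen in the community's own policy.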
67VOMS
- A community-level group membership system
- Database of user roles
- Administrative tools
- Client interface
- voms-proxy-init
- Uses client interface to produce an attribute certificate (instead of proxy) that includes roles and capabilities signed by VOMS server
- Works with non-VOMS services, but gives more info to VOMS-aware services
- Allows VOs to centrally manage user roles
68Monitoring and Discovery Challenges
- Grid Information Service
- Requirements and characteristics
- Uniform, flexible access to information
- Scalable, efficient access to dynamic data
- Access to multiple information sources
- Decentralized maintenance
- Secure information provision
69Monitoring/Discovery Tools
- Basic WSRF Infrastructure Components
- Specialized Monitoring/Discovery Components
- Specialized collection/monitoring agents
- Viewing and display tools for showing system
information for a variety of specialized purposes
70WSRF Infrastructure Elements
- WS Core Monitoring Features
- Every service produces Resource Properties so monitoring is baked right in to WSRF
- Non-WSRF services can also provide information from wrappers
- Index Service
- Collection point for a set of data (registry)
- Also has last value of data in cache
- Indexes can be set up for a variety of uses, projects
71Monitoring and Discovery Service in GT4 (MDS4)
- WS-RF compatible
- Monitoring of basic service data
- Primary use case is discovery of services
- Starting to be used for up/down statistics
72MDS4 Information Providers
- Code that generates resource property information
- Were called service data providers in GT3
- XML Based not LDAP
- Basic cluster data
- Interface to Ganglia
- GLUE schema
- Some service data from GT4 services
- Start, timeout, etc
- Soft-state registration
- Push and pull data models
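Soft-state registration means an entry stays valid only for a limited lifetime and disappears unless the provider refreshes it, so crashed providers drop out without explicit cleanup. A minimal sketch of that idea follows; `SoftStateRegistry` and its methods are hypothetical names, and time is passed in explicitly to keep the behavior easy to follow.

```python
class SoftStateRegistry:
    """Hypothetical registry whose entries expire unless they are re-registered."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # provider name -> (data, expiry time)

    def register(self, provider, data, now):
        """Register or refresh a provider; the lifetime restarts from `now`."""
        self._entries[provider] = (data, now + self.ttl)

    def lookup(self, provider, now):
        """Return the last registered data, or None if missing or expired."""
        entry = self._entries.get(provider)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]

    def purge(self, now):
        """Drop expired entries and return the names that were removed."""
        stale = [p for p, (_, expires) in self._entries.items() if expires <= now]
        for provider in stale:
            del self._entries[provider]
        return stale
```

A provider that registers at time 0 with a 60-second lifetime must refresh before time 60; otherwise `lookup` stops returning its data and the next `purge` removes it.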
73Ganglia Cluster Toolkit
- Ganglia is a toolkit for monitoring clusters and aggregations of clusters (hierarchically).
- Ganglia collects system status information and makes it available via a web interface.
- Ganglia status can be subscribed to and aggregated across multiple systems.
- Integrating Ganglia with MDS services results in status information provided in the proposed standard GLUE schema, popular in international Grid collaborations.
74MDS4 Index Service
- Index Service is both registry and cache
- Subscribes to information providers
- Data, datatype, data provider information
- Caches last value of all data
- In memory default approach
75MDS4 Trigger Service
- Compound consumer-producer service
- Subscribe to a set of resource properties
- Set of tests on incoming data streams to evaluate trigger conditions
- When a condition matches, email is sent to pre-defined address
- GT3 tech-preview version in use by ESG
- GT4 version alpha is in GT4 alpha release currently available
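The consumer-producer behavior of a trigger service (receive property updates, test them against conditions, send a message when a condition matches) can be sketched as below. This is an illustration only: the class, its method names, and the `notify` callable that stands in for sending email are all hypothetical and do not reflect the MDS4 interfaces.

```python
class TriggerService:
    """Hypothetical trigger evaluator fed by resource-property updates."""

    def __init__(self, notify):
        # notify(address, message) stands in for sending an email.
        self._notify = notify
        self._triggers = []

    def add_trigger(self, name, property_name, condition, address):
        """Register a test on one resource property and where to report matches."""
        self._triggers.append((name, property_name, condition, address))

    def on_update(self, source, property_name, value):
        """Evaluate all triggers for an incoming update; return names that fired."""
        fired = []
        for name, watched, condition, address in self._triggers:
            if watched == property_name and condition(value):
                self._notify(address, f"{name}: {source} reported {property_name}={value}")
                fired.append(name)
        return fired


if __name__ == "__main__":
    service = TriggerService(notify=lambda address, message: print(address, message))
    service.add_trigger("disk-nearly-full", "free_disk_gb",
                        lambda gb: gb < 10, "admin@example.org")
    service.on_update("storage-1", "free_disk_gb", 4)
```

In a real deployment the updates would arrive through subscriptions to resource properties rather than direct calls to `on_update`.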
76MDS4 Archive Service
- Compound consumer-producer service
- Subscribe to a set of resource properties
- Data put into database (Xindice)
- Other consumers can contact database archive interface
- Will be Tech Preview in GT4 Final release
77Computing/Processing Tools
- Workflow Managers
- Organize and coordinate task execution within a complicated application
- Often coordinates data movement and task execution
- Metaschedulers
- Optimize use of distributed compute pools
- Virtual Data Tools
- Manage the trade-off between data storage and processing power
78The Resource Management Challenge
- Enabling secure, controlled remote access to heterogeneous computational resources and management of remote computation
- Authentication and authorization
- Resource discovery and characterization
- Reservation and allocation
- Computation monitoring and control
- Addressed by a set of protocols and services
- GRAM protocol as a basic building block
- Resource brokering and co-allocation services
- GSI for security, MDS for discovery
79GRAM - Basic Job Submission and Control Service
- A uniform service interface for remote job submission and control
- Includes file staging and I/O management
- Includes reliability features
- Supports basic Grid security mechanisms
- Available in Pre-WS and WS
- GRAM is not a scheduler.
- No scheduling
- No metascheduling/brokering
- Often used as a front-end to schedulers, and often used to simplify metaschedulers/brokers
80Condor-G
- The Condor project has produced a helper front-end to GRAM
- Managing sets of subtasks
- Reliable front-end to GRAM to manage computational resources
- Note: this is not Condor itself, which promotes high-throughput computing and use of idle resources
81Chimera Virtual Data
- Captures both logical and physical steps in a data analysis process.
- Transformations (logical)
- Derivations (physical)
- Builds a catalog.
- Results can be used to replay analysis.
- Generation of DAG (via Pegasus)
- Execution on Grid
- Catalog allows introspection of analysis process.
[Figure: Sloan Survey data; galaxy cluster size distribution.]
82Pegasus Workflow Transformation
- Converts Abstract Workflow (AW) into Concrete Workflow (CW).
- Uses Metadata to convert user request to logical data sources
- Obtains AW from Chimera
- Uses replication data to locate physical files
- Delivers CW to DAGman
- Executes using Condor
- Publishes new replication and derivation data in RLS and Chimera (optional)
[Figure labels: Chimera Virtual Data Catalog, Metadata Catalog, Replica Location Service, DAGman, Condor, Compute Servers, Storage Systems.]
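The step from an abstract workflow to a concrete one can be sketched roughly as follows: starting from the requested logical file, reuse any file that already has a replica, and otherwise schedule the derivation that produces it after its inputs. This is a simplified illustration, not the Chimera or Pegasus algorithm; the function `plan`, the data layout, and the example file and transformation names are assumptions, and the derivation graph is assumed to be acyclic.

```python
def plan(target, derivations, replicas):
    """Turn a request for a logical file into an ordered list of concrete steps.

    derivations: {output logical name: (transformation name, [input logical names])}
    replicas:    {logical name: [physical locations]}
    Files that already have a replica are reused rather than recomputed.
    """
    steps = []
    planned = set()

    def visit(name):
        if name in replicas or name in planned:
            return
        if name not in derivations:
            raise LookupError(f"no replica and no derivation for {name!r}")
        transformation, inputs = derivations[name]
        for needed in inputs:
            visit(needed)  # inputs are planned before the step that uses them
        steps.append({
            "run": transformation,
            # Use a physical location when one exists, else the logical name
            # of a file produced by an earlier step in this plan.
            "inputs": [replicas[i][0] if i in replicas else i for i in inputs],
            "output": name,
        })
        planned.add(name)

    visit(target)
    return steps


if __name__ == "__main__":
    # Hypothetical catalog contents.
    derivations = {
        "clusters.dat": ("findClusters", ["fields.dat"]),
        "fields.dat": ("extractFields", ["raw.fits"]),
    }
    replicas = {"raw.fits": ["gsiftp://storage.example.org/sdss/raw.fits"]}
    for step in plan("clusters.dat", derivations, replicas):
        print(step)
```

The resulting step list is ordered so that each step's inputs exist before it runs, which is the property a DAG executor relies on.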
83Data Tools
- Virtual Data Tools
- Manage the trade-off between data storage and processing power (already covered)
- Movement/Transfer Tools
- Interfaces that meet specialized application or user needs
- Last mile integration to specialized storage systems
- Optimization Tools
- Help optimize the use of storage systems for specialized user communities
84A Model Architecture for Data Grids
[Figure: model data grid architecture. Labels: Application; Attribute Specification; Metadata Catalog; Logical Collection and Logical File Name; Replica Catalog; Multiple Locations; Replica Selection; Selected Replica; Performance Information and Predictions; MDS; NWS; GridFTP Control Channel; GridFTP Data Channel; Replica Location 1, 2, 3; Disk Cache (x2); Tape Library; Disk Array.]
85GridFTP
- A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks
- FTP with well-defined extensions
- Uses basic Grid security (control and data channels)
- Multiple data channels for parallel transfers
- Partial file transfers
- Third-party (direct server-to-server) transfers
- Reusable data channels
- Command pipelining
- GGF recommendation GFD.20
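Parallel and partial transfers both rest on the same idea: a file is addressed as byte ranges, and each data channel moves one range independently. The sketch below shows only that idea on an in-memory buffer, with threads standing in for data channels; it does not implement the GridFTP protocol, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, channels):
    """Split `size` bytes into at most `channels` contiguous (offset, length) blocks."""
    if channels < 1:
        raise ValueError("need at least one channel")
    base, extra = divmod(size, channels)
    ranges, offset = [], 0
    for index in range(channels):
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source, channels=4):
    """Copy a bytes object block by block, one worker thread per block."""
    destination = bytearray(len(source))

    def move(block):
        offset, length = block
        # Each worker writes a disjoint slice, so no locking is needed.
        destination[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=channels) as pool:
        list(pool.map(move, split_ranges(len(source), channels)))
    return bytes(destination)
```

A partial transfer is the special case of moving a single `(offset, length)` block, which is also what makes restarting an interrupted transfer from a known offset possible.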
86Striped GridFTP Service
- A distributed GridFTP service that runs on a storage cluster
- Every node of the cluster is used to transfer data into/out of the cluster
- Head node coordinates transfers
- Multiple NICs/internal busses lead to very high performance
- Maximizes use of Gbit WANs
87UberFTP
- UberFTP is an interactive (text prompt) client for GridFTP.
- Supports
- Parallelism
- Third-party transfer
88RFT - File Transfer Queuing
- A WSRF service for queuing file transfer requests
- Server-to-server transfers
- Checkpointing for restarts
- Database back-end for failovers
- Allows clients to request transfers and then disappear
- No need to manage the transfer
- Status monitoring available if desired
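The combination of queuing, checkpointing, and a database back-end can be pictured with a small persistent queue: requests and their progress are written to a database, so a restarted service can resume each transfer from its last checkpoint while the client is long gone. This is a sketch under stated assumptions, using SQLite as a stand-in database; the `TransferQueue` class, its schema, and its method names are hypothetical and unrelated to the actual RFT implementation.

```python
import sqlite3


class TransferQueue:
    """Hypothetical persistent queue of server-to-server transfer requests."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS transfers ("
            " id INTEGER PRIMARY KEY,"
            " source TEXT NOT NULL,"
            " dest TEXT NOT NULL,"
            " size INTEGER NOT NULL,"
            " bytes_done INTEGER NOT NULL DEFAULT 0,"
            " status TEXT NOT NULL DEFAULT 'pending')"
        )
        self.db.commit()

    def submit(self, source, dest, size):
        """Record a request; the client may disconnect after this returns."""
        cursor = self.db.execute(
            "INSERT INTO transfers (source, dest, size) VALUES (?, ?, ?)",
            (source, dest, size),
        )
        self.db.commit()
        return cursor.lastrowid

    def checkpoint(self, transfer_id, bytes_done):
        """Persist progress so a restart can resume from this offset."""
        (size,) = self.db.execute(
            "SELECT size FROM transfers WHERE id = ?", (transfer_id,)
        ).fetchone()
        status = "done" if bytes_done >= size else "active"
        self.db.execute(
            "UPDATE transfers SET bytes_done = ?, status = ? WHERE id = ?",
            (bytes_done, status, transfer_id),
        )
        self.db.commit()

    def status(self, transfer_id):
        """Optional monitoring: return (status, bytes_done) for a request."""
        return self.db.execute(
            "SELECT status, bytes_done FROM transfers WHERE id = ?", (transfer_id,)
        ).fetchone()

    def unfinished(self):
        """After a restart: every transfer to resume, with its restart offset."""
        return self.db.execute(
            "SELECT id, source, dest, bytes_done FROM transfers"
            " WHERE status != 'done' ORDER BY id"
        ).fetchall()
```

Pointing `path` at a file instead of `:memory:` is what would let the queue survive a service restart.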
89Example: Reliable File Transfer Service
[Figure: Reliable File Transfer service. Labels: Clients (x3); Request and manage file transfer operations; Query and/or subscribe to service data; Grid Service, Notfn Source, File Transfer, and Pending interfaces; Performance, Policy, and Faults service data elements; Internal State; Policy; Fault Monitor; Perf. Monitor; Data transfer operations.]
90OGSA-DAI
- OGSA interface for accessing XML and relational data stores
- Implements the GGF DAIS WG standard (in progress)
Figure courtesy of Malcolm Atkinson and Rob Baxter, UK eScience Center
91MCS - Metadata Catalog Service
- A stand-alone metadata catalog service
- WSRF service interface
- Stores system-defined and user-defined attributes for logical files/objects
- Supports manipulation and query
- Integrated with OGSA-DAI
- OGSA-DAI provides metadata storage
- When run with OGSA-DAI, basic Grid authentication mechanisms are available
92RLS - Replica Location Service
- A distributed system for tracking replicated data
- Consistent local state maintained in Local Replica Catalogs (LRCs)
- Collective state with relaxed consistency maintained in Replica Location Indices (RLIs)
- Performance features
- Soft state maintenance of RLI state
- Compression of state updates
- Membership and partitioning information maintenance
- Note
- RLS (developed by Globus Alliance and the DataGrid Project) replaces earlier components in the Globus Toolkit 2.x.
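The division of labor between the two catalog levels can be sketched as follows: each Local Replica Catalog holds exact logical-to-physical mappings for its site, while a Replica Location Index holds only periodically refreshed summaries of which site knows which logical name. The classes and methods below are hypothetical, the logical and physical names are made up, and state compression is left out.

```python
class LocalReplicaCatalog:
    """Exact, locally consistent mappings for one site."""

    def __init__(self, site):
        self.site = site
        self._replicas = {}  # logical file name -> set of physical file names

    def add(self, lfn, pfn):
        self._replicas.setdefault(lfn, set()).add(pfn)

    def lookup(self, lfn):
        return sorted(self._replicas.get(lfn, ()))

    def summary(self):
        """The soft-state update sent to an index: just the known logical names."""
        return set(self._replicas)


class ReplicaLocationIndex:
    """Relaxed-consistency index: which sites recently claimed to know a name."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._known = {}  # site -> (logical names, expiry time)

    def update(self, catalog, now):
        """Accept a soft-state update; it expires unless refreshed within ttl."""
        self._known[catalog.site] = (catalog.summary(), now + self.ttl)

    def sites_for(self, lfn, now):
        """Sites whose unexpired summary lists the name; query their LRC for details."""
        return sorted(site for site, (names, expires) in self._known.items()
                      if expires > now and lfn in names)
```

A client first asks the index which sites may hold a file, then asks those sites' local catalogs for the physical locations; a stale index answer costs one extra lookup rather than a wrong result.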
93Web Portals
- Tools for building web interfaces that provide
access to system/application capabilities
94CHEF/Sakai
- The CompreHensive collaborativE Framework (CHEF) is a flexible environment for supporting distributed learning and collaborative work.
- CHEF is rapidly evolving into Sakai, with emphasis on JSR-168 and localization.
- CHEF is highly extensible with support for JetSpeed, Velocity, and other portal interfaces.
95Open Grid Computing Environment (OGCE)
- Extends CHEF/Sakai to include support for Grid services
- MyProxy
- GridPort
- GT services (GRAM, GridFTP, MDS, etc.)
- Java CoG
- Provides a quick start for building Grid-enabled portals.
96System Packaging/Distribution
- Distribution and Packaging Tools
- Getting software distributed and installed uniformly throughout a broad collaboration
- Tools that help create integrated distributions that work on a wide variety of systems
- Integrated Distributions
- Customized distributions of common Grid software
97Grid Packaging Tools (GPT)
- GPT is the packaging used for the Globus Toolkit, but it exists independently.
- Adds metadata to tar.gz files, putting more intelligence into build/install/config
- Tools for developers and users
- Focus is multiplatform, tricky builds
- Works on most Unix systems
- Source and binary packages
- Dependency management
- Relocatable installations (multiple installs)
- Setup (config) awareness
- Bundles (aggregations of packages)
98Virtual Data Toolkit (VDT)
- VDT is a grid middleware distribution focused on the needs of the NSF-funded GriPhyN and iVDGL projects, both of which are focused on Physics and Astronomy applications.
- Ease of use (and installation) is key.
- Contents
- Globus Toolkit, Condor, Condor-G
- Virtual Data Tools (Chimera, Pegasus, RLS)
- Utilities (GSI-OpenSSH, UberFTP, MonALISA, MyProxy, KX.509, etc.)
- Uses Pacman for distribution, install, configuration.
- Deployed on Grid3 (28 major U.S. sites)
99GT2 Evolution To GT4
- ALL of GT2 functionality is in GT4
- What happened to the GT2 key protocols?
- Security: adapting X.509 proxy certs to integrate
with emerging WS standards
- GRIP/LDAP: abstractions integrated into WSRF as
resource properties
- GRAM: ManagedJobFactory and related service
definitions
- GridFTP: server updated, but not WSRF-compliant;
RFT fills that role
- Also rendering collective services in terms of
WSRF: RFT, RLS, CAS, etc.
100- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
102Apache Axis Web Services Container
- Good news for Java WS developers: GT4.0 works
with standard Axis and Tomcat
- GT provides Axis-loadable libraries, handlers
- Includes useful behaviors such as inspection,
notification, lifetime mgmt (WSRF)
- Others implement GRAM, etc.
- Major Globus contributions to Apache
- 50% of WS-Addressing code
- 15% of WS-Security code
- Many bug fixes
- WSRF code a possible next contribution
[Diagram labels: GT bits, App bits, Security, Addressing, Axis]
Note: modulo Axis and Tomcat release cycle issues
103WS Core Enables Frameworks: E.g., Resource
Management
[Layered diagram, top to bottom]
- Applications of the framework (compute, network,
storage provisioning, job reservation and
submission, data management, application service
QoS, ...)
- WS-Agreement (agreement negotiation)
- WS Distributed Management (lifecycle, monitoring,
...)
- WS-Resource Framework and WS-Notification*
(resource identity, lifetime, inspection,
subscription, ...)
- Web services (WSDL, SOAP, WS-Security,
WS-ReliableMessaging, ...)
* An evolution of Open Grid Services
Infrastructure (OGSI)
104WSRF and WS-Notification
- Naming and bindings (basis for virtualization)
- Every resource can be uniquely referenced, and
has one or more associated services for
interacting with it
- Lifecycle (basis for fault-resilient state
management)
- Resources created by services following factory
pattern
- Resources destroyed immediately or scheduled
- Information model (basis for monitoring and
discovery)
- Resource properties associated with resources
- Operations for querying and setting this info
- Asynchronous notification of changes to
properties
- Service Groups (basis for registries and
collective svcs)
- Group membership rules and membership management
- Base Fault type
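The lifecycle and information-model bullets above can be sketched as a toy, in-memory model. This is not the GT4 or WSRF API: the class and method names (Resource, ResourceFactory, sweep, and so on) are hypothetical, there is no SOAP or WSDL involved, and subscribers are called synchronously here, whereas WS-Notification delivers changes asynchronously.

```python
import itertools
import time


class Resource:
    """A stateful resource with properties, an optional termination time, and subscribers."""

    def __init__(self, key, properties, lifetime=None):
        self.key = key
        self.properties = dict(properties)
        # Scheduled destruction: None means "no scheduled termination".
        self.termination_time = None if lifetime is None else time.time() + lifetime
        self.subscribers = []

    def subscribe(self, callback):
        """Register a callable invoked as callback(key, name, old, new) on each change."""
        self.subscribers.append(callback)

    def get_property(self, name):
        return self.properties[name]

    def set_property(self, name, value):
        old = self.properties.get(name)
        self.properties[name] = value
        # Simplification: notify in-line rather than asynchronously.
        for callback in self.subscribers:
            callback(self.key, name, old, value)


class ResourceFactory:
    """Creates resources (factory pattern) and hands back a unique key for each."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._resources = {}

    def create(self, properties, lifetime=None):
        key = f"resource-{next(self._ids)}"
        self._resources[key] = Resource(key, properties, lifetime)
        return key

    def lookup(self, key):
        return self._resources[key]

    def destroy(self, key):
        """Immediate destruction."""
        del self._resources[key]

    def sweep(self, now=None):
        """Remove resources whose termination time has passed; return their keys."""
        now = time.time() if now is None else now
        expired = [key for key, res in self._resources.items()
                   if res.termination_time is not None and res.termination_time <= now]
        for key in expired:
            del self._resources[key]
        return expired


if __name__ == "__main__":
    factory = ResourceFactory()
    job = factory.create({"state": "Pending"}, lifetime=60)
    factory.lookup(job).subscribe(lambda *event: print("changed:", event))
    factory.lookup(job).set_property("state", "Active")
```

The key returned by create() plays the role of the unique resource reference; a client holds the key and asks a service to act on the resource it names.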
105Globus 4.0 Structure
[Diagram]
CLIENT: Your C Client, Your Java Client, Your Python Client
Interoperable WS-I-compliant SOAP messaging
X.509 credentials for common authentication
SERVER: RFT, GRAM, Delegation, Index, Trigger, Archiver, CAS,
OGSA-DAI, GTCP, RLS, Pre-WS MDS, SimpleCA, MyProxy, GridFTP,
Pre-WS GRAM, Your C Service, Your Python Service, Your Java Service
Hosting environments: C WS Core; pyGlobus WS Core;
Java Services in Apache Axis Plus GT Libraries and Handlers;
C Services using GT Libraries and Handlers;
Python hosting, GT Libraries
106Globus Ecosystem (Just a Few Examples Listed Here)
- Tools provide higher-level functionality
- Nimrod-G, MPICH-G2, Condor-G, Ninf-G
- NTCP telecontrol
- GT4IDE Eclipse IDE
- Packages integrate GT with other s/w
- VDT, NMI, CTSS, NEESgrid, ESG
- Solutions package a set of functionality
- VO management, monitoring, replica mgmt
- Documentation, e.g.
- Borja Sotomayor's tutorial
107What's New in GT 4.0 (January 31, 2005)
- For all
- Additions: data, security, execution, XIO, ...
- Improved packaging, testing, performance,
usability, doc, standards compliance (phew)
- WS components ready for broader use
- For the end user
- More complementary tools and solutions
- C, Java, Python APIs; command-line tools
- For the developer
- Java (Axis/Tomcat) hosting greatly improved
- Python (pyGlobus) hosting for the first time
108GT4.0 Release Schedule
109We're Getting a Lot of Help, But Could Do with a
Lot More
- Testing and feedback
- Users, developers, deployers: plan to use the
software now and provide feedback
- Tell us what is missing, what performance you
need, what interfaces and platforms, ...
- Ideally, also offer to help meet needs
- Related software, solutions, documentation
- Adapt your tools to use GT4
- Develop new GT4-based components
- Develop GT4-based solutions
- Develop documentation components
110Documentation Overview
- Current document drafts are publicly accessible
- http://www-unix.globus.org/toolkit/docs/development/docmap.html
- We need reviewers!
- Suggestions for ways we might improve our
documentation are appreciated
- We need contributors!
- We are happy to collaborate to write new documents
111Testing Overview
- Nightly builds and tests
- Calls for community testing; current calls
include
- Delegation Service, CAS, RFT, GridFTP, RLS, WS
GRAM, WS MDS, Java WS Core
- TestGrid at USC/ISI
- Stand up services for several weeks
- Perform stress tests
- TestGrid at LBNL
- Focus on WS Core performance and interoperability
tests
- Performance and reliability testing is a major
focus
- Bill Allcock (allcock_at_mcs.anl.gov) is
coordinating this effort
- We welcome new testing collaborations!
112How to Get Involved
- Become a GT4 Friend!
- Open group of people from various organizations
working with GT4 pre-release code and documents
- Reporting problems in code and documents
- Contributing ideas, tests, documentation
- Building GT4-enabled applications
- Weekly telephone calls
- Discussion list
- To subscribe to the GT4 friends list, send an
email to majordomo_at_globus.org with the words
"subscribe gt4-friends" in the message body
113What's This About a Globus Company?
- Univa was announced yesterday (Dec 13, 2004)
- http://biz.yahoo.com/prnews/041213/nym040_1.html
- http://www.univa.com
- Steve Tuecke is CEO
- Both Carl Kesselman and Ian Foster are in
advisory roles
- Basic concept: Red Hat Linux for Globus
- This will NOT affect the GT open source policy
- This WILL allow greater industrial involvement
and investment in Grids
114- A Bit of Background
- Grid Architecture Overview
- Working With Applications
- Role of Globus
- Pieces of the Globus Toolkit
- And an Example Application
- Globus Toolkit 4.0 and Futures
- Some Other Resources
115GRIDS Center (NMI)
- GRIDS Center
- GRIDS: Grid Research Integration Development and
Support
- Partnership of leading teams in Grid computing
- Funded by NSF Middleware Initiative (NMI)
- Goal: Design, Develop, Deploy and Support
- Define an integrated, modular architecture that
addresses current and projected middleware
requirements for the S&E communities
- Create robust, tested, packaged, documented, and
well-supported middleware solutions that are
extensible within and beyond S&E
116GRIDS Center Software Suite
- Globus Toolkit
- Condor-G
- Enhanced version of the core Condor software
optimized to work with GT for managing Grid jobs.
- Network Weather Service (NWS)
- Monitors and dynamically forecasts performance of
network and computational resources.
- Grid Packaging Tools (GPT)
- XML-based packaging data format defines complex
dependencies between components.
- GSI-OpenSSH
- Modified version adds support for Grid Security
Infrastructure (GSI) authentication and single
sign-on capability
117GRIDS Center Software Suite (cont.)
- MyProxy
- Repository lets users retrieve a proxy credential
on demand, without managing private key and
certificate files across sites and applications.
- MPICH-G2
- Grid-enabled implementation of the Message
Passing Interface (MPI) standard, based on the
popular MPICH library.
- GridConfig
- Manages the configuration of GRIDS components,
letting users regenerate configuration files in
native formats and ensure consistency.
- KX.509 and KCA
- A tool from EDIT that bridges Kerberos and PKI
infrastructure.
118Global Grid Forum (GGF)
- An Open Process for Development of Standards
- Grid Recommendations process modeled after
Internet Standards Process (IETF)
- Persistent, Reviewed Document Series (similar to
RFC)
- A Forum for Information Exchange
- Experiences, patterns, structures
- Useful even if every application Grid were
completely separate and not interoperable, but
ideally will result in interoperability!
- A Regular Gathering to Encourage Shared Effort
- In code development: libraries, tools
- Via resource sharing: shared Grids
- In infrastructure: consensus standards
- http://www.ggf.org
119OASIS
- Not-for-profit business consortium that drives
the development, convergence and adoption of
eBusiness standards
- Large space of standards
- Web Services, e-Commerce, Security, Law &
Government, Supply Chain, Computing Mgmt,
Application Focus, Document-Centric, XML
Processing, Conformance/Interop, Industry Domains
- Web Services Resource Framework (WSRF) is here
- http://www.oasis-open.org/home/index.php
120Conclusions
121Overall, We are Doing Well
- Communities and individuals are, increasingly,
using the Grid to advance their science
- Broad consensus on many key architecture
concepts, if not always their implementation
- Significant base of open source software, widely
used in applications and infrastructure
- Service-oriented arch facilitates cooperation on
software development and code reuse
- Grid standards are making a difference on a daily
basis, e.g., GSI, GridFTP
122Overall, We are Doing Well (2)
- A real understanding of how to operate Grid
infrastructures is emerging
- Production infrastructures are appearing and are
being relied upon for real science
- Productive international cooperation is occurring
at many levels
- A vibrant community has formed and shows no signs
of slowing down
- Real connections have been formed between
computer science and applications
123Problem-Driven, Collaborative R&D Methodology
[Diagram labels: Design, Build, Deploy, Apply, Analyze; Global Community]
124Lessons Learned
- The Globus Toolkit consists of the basic building
blocks needed
- But to meet application needs, more should be
examined
- The Grid community (collectively) has many useful
tools that can be reused!
- System integration expertise is mandatory.
- OGSA, WSRF, and community standards (GGF, OASIS,
W3C, IETF) are extremely important in getting all
of this to work together.
- There's much more to be done!
125Standard Plumbing for the Grid
- Not turnkey solutions, but building blocks and
tools for application developers and system
integrators.
- Some components (e.g., file transfer) go farther
than others (e.g., remote job submission) toward
end-user relevance.
- Since these solutions exist and others are
already using them (and they're free), it's
easier to reuse than to reinvent.
- And compatibility with other Grid systems comes
for free!
126We're Getting a Lot of Help, But Could Do with More
- Testing and feedback
- Users, developers, deployers: plan to use the
software now and provide feedback
- Tell us what is missing, what performance you
need, what interfaces and platforms, ...
- Ideally, also offer to help meet needs
- Related software, solutions, documentation
- Adapt your tools to use GT4
- Develop new GT4-based components
- Develop GT4-based solutions
- Develop documentation components
127Summary
- Things that are working
- Key standards are emerging
- Open source infrastructure appearing
- Success stories and experience gained
- Challenges that remain
- Complexity of some WS infrastructure
- Missing specifications
- Limited practical experience
- Progress being made on all fronts
128Thanks to
- Ian Foster, Carl Kesselman and Steve Tuecke
- Bill Allcock, Kate Keahey, Lee Liming, Gregor von
Laszewski, Mike Wilde at Argonne
- Globus Alliance members at Argonne, U.Chicago,
USC/ISI, Edinburgh, PDC, NCSA
- Other partners in Grid technology, application,
infrastructure projects
- And thanks to DOE, NSF (esp. NMI and TeraGrid
programs), NASA, IBM, and the UK eScience Program
for generous support
129General Globus Help and Support
- Globus-discuss list
- discuss_at_globus.org
- http://globus.org/about/contacts.html
- Bugzilla
- bugzilla.globus.org
- GT4 Information
- gt4-friends_at_globus.org
- Weekly telecons for early testers
130For More Information
- Jennifer Schopf
- jms_at_mcs.anl.gov
- www.mcs.anl.gov/jms
- Globus Alliance
- www.globus.org
- Global Grid Forum
- www.ggf.org
- GlobusWORLD 2005
- Feb 7-11, Boston
2nd Edition www.mkp.com/grid2