Title: Observations on Architecture, Protocols, Services, APIs, SDKs, and the Role of the Grid Forum
1 Observations on Architecture, Protocols, Services, APIs, SDKs, and the Role of the Grid Forum
- Ian Foster
- With Carl Kesselman, Steven Tuecke
- Thanks also to Bill Johnston, Marty Humphrey, Rusty Lusk, Reagan Moore, and others
2 Overview
- The Grid problem: controlled resource sharing in multi-institutional settings
- Standards as a means of enabling sharing of code, resources, services
- Aside: definition, role, and importance of protocols, services, SDKs, APIs, etc.
- A Grid architecture: a categorization of protocols, services, SDKs, and APIs
- Questions for the Grid Forum
3 The Grid Problem
- Grid R&D has its origins in high-end computing and metacomputing, but...
- In practice, the Grid problem is about resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations
  - Lack of central control, omniscience, trust
- Primary challenge: to enable, maintain, and control the sharing of resources to achieve a common goal
4 Examples of Virtual Organizations
- Members of a scientific collaboration
  - E.g., NSF PACIs, IPG, NEESgrid, GriPhyN
  - Sharing computers, storage, software, ...
- Application service provider customers
  - Sharing ASP computers
- Participants in a peer-to-peer network
  - E.g., Gnutella, Napster, Entropia, ...
  - Sharing resources on individual PCs
- Tremendous variety in scope, timescale, types of sharing, etc.
5 Universal Nature of the Grid Problem
- Sharing is fundamental in many settings
  - Application Service Providers, Storage Service Providers, etc.
  - Peer-to-peer computing
  - Distributed computing
  - Business to business
- Sharing issues are not adequately addressed by existing technologies
  - Sharing at a deep level, across broad ranges of resources and in a general way
  - E.g., a user provides an ASP with controlled access to their data on an SSP: how?
- The Grid community has unique experience
6 Creating Usable Grids: What are the Challenges?
- Approaches to problem solving
  - Data Grids, distributed computing, peer-to-peer, collaboration grids, ...
- Structuring and writing programs
  - Abstractions, tools
- Enabling resource sharing across distinct institutions
  - Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification
7 What is the Role of Grid Forum in Enabling Grid Computing?
- Information exchange, of course
  - Experiences, patterns, structures
  - Useful even if every application Grid is a vertical stovepipe
- Advocacy
- Enabler of shared effort
  - In code development: libraries, tools, ...
  - Via resource sharing: shared Grids
  - In infrastructure
- Opinion: long term, only the third role (enabler of shared effort) is sufficiently compelling to justify GF
8 Q: How do we Enable Shared Effort? A: Standards are Required
- To enable portability/sharing of code
  - E.g., MPI lets me write portable parallel programs
- To enable resource sharing
  - E.g., IP lets my computer speak to yours
- To enable shared infrastructure
  - E.g., X.509 lets me share Certificate Authorities
- But what sorts of standards?
  - Variously, APIs/SDKs, protocols, syntax, ...
- Observe that these are sometimes confused, so let's spend some time on definitions
9 Some Important Definitions
- Resource
- Network protocol
- Network enabled service
- Application Programmer Interface (API)
- Software Development Kit (SDK)
- Syntax
- Not discussed, but important: policies
10 Resource
- An entity that is to be shared
  - E.g., computers, storage, data, software
- Does not have to be a physical entity
  - E.g., Condor pool, distributed file system, ...
- Defined in terms of interfaces, not devices
  - E.g., schedulers such as LSF and PBS define a compute resource
  - Open/close/read/write define access to a distributed file system, e.g., NFS, AFS, DFS
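The point that a resource is defined by its interface, not by a device, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `FileResource` interface, both store classes, and the `round_trip` helper are hypothetical names, not part of any Grid toolkit.

```python
from typing import Protocol


class FileResource(Protocol):
    """Hypothetical interface: anything offering these calls is a file resource."""

    def write(self, path: str, data: bytes) -> None: ...

    def read(self, path: str) -> bytes: ...


class InMemoryStore:
    """One possible backing 'device': a plain dictionary."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        self._files[path] = data

    def read(self, path):
        return self._files[path]


class ReplicatedStore:
    """A non-physical resource: two stores presented as a single one."""

    def __init__(self, first, second):
        self._replicas = (first, second)

    def write(self, path, data):
        # Every replica receives the same write.
        for replica in self._replicas:
            replica.write(path, data)

    def read(self, path):
        # Reads are served by the first replica.
        return self._replicas[0].read(path)


def round_trip(resource: FileResource, payload: bytes) -> bytes:
    """Client code depends only on the interface, not on what sits behind it."""
    resource.write("/demo.txt", payload)
    return resource.read("/demo.txt")
```

`round_trip` accepts either store unchanged, which is the sense in which the interface, rather than the thing behind it, defines the resource.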
11 Network Protocol
- A formal description of message formats and a set of rules for message exchange
  - Rules may define sequence of message exchanges
  - Protocol may define state-change in endpoint, e.g., file system state change
- Good protocols are designed to do one thing
  - Protocols can be layered
- Examples of protocols
  - IP, TCP, TLS (was SSL), HTTP, Kerberos
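As a rough illustration of "message formats plus rules for exchange", here is a toy protocol in Python. It is an invented example, not any real standard: the message types, the 3-byte header layout, and the rule that HELLO must precede PUT are all assumptions made for the sketch.

```python
import struct

# Hypothetical message types for a toy protocol (not a real standard).
HELLO, PUT, OK, ERROR = 1, 2, 3, 4

# Message format: 1-byte type, 2-byte payload length, network byte order.
HEADER = struct.Struct("!BH")


def encode(msg_type, payload=b""):
    """Build one message: fixed header followed by the payload."""
    return HEADER.pack(msg_type, len(payload)) + payload


def decode(message):
    """Split one message into (type, payload)."""
    msg_type, length = HEADER.unpack(message[:HEADER.size])
    payload = message[HEADER.size:HEADER.size + length]
    return msg_type, payload


class Endpoint:
    """Enforces the exchange rule: HELLO must be seen before any PUT."""

    def __init__(self):
        self.greeted = False
        self.stored = []  # endpoint state that PUT messages change

    def handle(self, message):
        msg_type, payload = decode(message)
        if msg_type == HELLO:
            self.greeted = True
            return encode(OK)
        if msg_type == PUT and self.greeted:
            self.stored.append(payload)
            return encode(OK)
        return encode(ERROR, b"out of sequence or unknown message")
```

The format lives in `encode`/`decode`, the sequencing rule and the endpoint state change live in `Endpoint.handle`; a protocol specification has to pin down both.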
12 Network Enabled Services
- Implementation of a protocol that defines a set of capabilities
  - Protocol defines interaction with service
  - All services require protocols
  - Not all protocols are used to provide services (e.g., IP, TLS)
- Examples: FTP and Web servers
13 Application Programmer Interface
- A specification for a set of routines to facilitate application development
  - Refers to definition, not implementation
  - E.g., there are many implementations of MPI
- Spec often language-specific (or IDL)
  - Routine name, number, order and type of arguments; mapping to language constructs
  - Behavior or function of routine
- Examples
  - GSS API (security), MPI (message passing)
14 Software Development Kit
- A particular instantiation of an API
- SDK consists of libraries and tools
  - Provides implementation of API specification
- Can have multiple SDKs for an API
- Examples of SDKs
  - MPICH, Motif Widgets
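The API/SDK distinction can be sketched in Python with an abstract class standing in for the API and two concrete classes standing in for SDKs. This is only an analogy under stated assumptions: `MessagePassing`, `ListSDK`, `DequeSDK`, and `ping` are hypothetical names and bear no relation to the real MPI routines.

```python
from abc import ABC, abstractmethod
from collections import deque


class MessagePassing(ABC):
    """The 'API': routine names, arguments, and intended behavior only."""

    @abstractmethod
    def send(self, dest: int, data: str) -> None:
        """Queue data for delivery to rank dest."""

    @abstractmethod
    def recv(self, rank: int) -> str:
        """Return the oldest undelivered message for rank."""


class ListSDK(MessagePassing):
    """One 'SDK': keeps messages in per-rank lists."""

    def __init__(self):
        self._queues = {}

    def send(self, dest, data):
        self._queues.setdefault(dest, []).append(data)

    def recv(self, rank):
        return self._queues[rank].pop(0)


class DequeSDK(MessagePassing):
    """A second 'SDK' for the same API, with a different internal design."""

    def __init__(self):
        self._queues = {}

    def send(self, dest, data):
        self._queues.setdefault(dest, deque()).append(data)

    def recv(self, rank):
        return self._queues[rank].popleft()


def ping(mp: MessagePassing) -> str:
    """Application code written against the API runs on either SDK."""
    mp.send(1, "ping")
    return mp.recv(1)
```

The abstract class cannot be instantiated by itself, which mirrors the slide's point that an API is a definition and an SDK supplies the implementation.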
15 Syntax
- Rules for encoding information, e.g.
  - XML, Condor ClassAds, Globus RSL
  - X.509 certificate format (RFC 2459)
  - Cryptographic Message Syntax (RFC 2630)
- Distinct from protocols
  - One syntax may be used by many protocols (e.g., XML) and be useful for other purposes
- Syntaxes may be layered
  - E.g., Condor ClassAds -> XML -> ASCII
  - Important to understand layerings when comparing or evaluating syntaxes
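A small Python sketch of syntax layering, in the spirit of the ClassAds -> XML -> ASCII example. The record, the `ad`/`attr` element names, and the helper functions are assumptions made for illustration; this is not the real ClassAd XML encoding.

```python
import xml.etree.ElementTree as ET

# A made-up attribute list standing in for a resource description.
record = {"Arch": "INTEL", "Memory": "512"}


def to_xml(attrs):
    """Top layer to middle layer: attribute/value pairs as XML elements."""
    root = ET.Element("ad")
    for name, value in attrs.items():
        ET.SubElement(root, "attr", name=name).text = value
    return root


def to_ascii(element):
    """Middle layer to bottom layer: the XML document as ASCII bytes."""
    return ET.tostring(element, encoding="us-ascii")


def from_ascii(raw):
    """Undo both layers to recover the original attribute/value pairs."""
    root = ET.fromstring(raw)
    return {child.get("name"): child.text for child in root}
```

Each layer can be swapped independently (a different character encoding, or a different markup), which is why comparing two syntaxes only makes sense layer by layer.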
16 A Protocol can have Multiple APIs: E.g., TCP/IP
- TCP/IP APIs include BSD sockets, Winsock, System V streams, ...
- The protocol provides interoperability: programs using different APIs can exchange information
- I don't need to know the remote user's API
[Figure: two applications, one using the WinSock API and the other the Berkeley Sockets API, both layered over the TCP/IP protocol (reliable byte streams)]
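The same idea in miniature: two differently shaped programming interfaces that agree on one wire format can exchange data without knowing about each other. This is a sketch under assumptions; the length-prefixed framing, the in-memory `bytearray` standing in for a byte stream, and all function and class names are invented for the example.

```python
import struct

# Shared wire protocol (assumed for this sketch): a 4-byte big-endian
# length, followed by that many bytes of UTF-8 text.


# API style A: plain functions operating on a byte buffer.
def send_text(wire, text):
    body = text.encode("utf-8")
    wire.extend(struct.pack("!I", len(body)) + body)


def recv_text(wire):
    (length,) = struct.unpack("!I", bytes(wire[:4]))
    body = bytes(wire[4:4 + length])
    del wire[:4 + length]  # consume the frame from the stream
    return body.decode("utf-8")


# API style B: an object with differently named methods, same protocol.
class Mailbox:
    def __init__(self, wire):
        self._wire = wire

    def post(self, text):
        body = text.encode("utf-8")
        self._wire.extend(len(body).to_bytes(4, "big") + body)

    def collect(self):
        length = int.from_bytes(self._wire[:4], "big")
        body = bytes(self._wire[4:4 + length])
        del self._wire[:4 + length]  # consume the frame from the stream
        return body.decode("utf-8")
```

A message written with `send_text` can be read with `Mailbox.collect`, and the reverse, because only the bytes on the wire are shared.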
17 An API can have Multiple Protocols: E.g., Message Passing Interface
- MPI provides portability: any correct program compiles and runs on a platform
- Does not provide interoperability: all processes must link against same SDK
  - E.g., MPICH and LAM versions of MPI
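The converse case can be sketched the same way: one API, two SDKs that put different bytes on the wire. The classes and wire formats below are hypothetical and have nothing to do with how MPICH or LAM actually encode messages.

```python
import json
import struct


class JsonSDK:
    """Hypothetical SDK 1: encodes a list of integers as JSON text."""

    def pack(self, values):
        return json.dumps(values).encode("ascii")

    def unpack(self, raw):
        return json.loads(raw.decode("ascii"))


class BinarySDK:
    """Hypothetical SDK 2: same API, but a count followed by 32-bit integers."""

    def pack(self, values):
        return struct.pack(f"!I{len(values)}i", len(values), *values)

    def unpack(self, raw):
        (count,) = struct.unpack_from("!I", raw)
        return list(struct.unpack_from(f"!{count}i", raw, 4))


def relay(sender, receiver, values):
    """Portable program text: unchanged whichever SDK is plugged in."""
    return receiver.unpack(sender.pack(values))
```

`relay` works when both ends use the same SDK, which is portability. Mixing the two (for example a `JsonSDK` sender with a `BinarySDK` receiver) raises an error, because nothing in the shared API says what travels on the wire; that is the missing interoperability.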
18 Back to Grids: The Programming and Systems Problems
- Approaches to problem solving
  - Data Grids, distributed computing, peer-to-peer, collaboration grids, ...
- Structuring and writing programs
  - Abstractions, tools
- Enabling resource sharing across distinct institutions
  - Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification
19 Back to Grids: The Programming and Systems Problems
- The programming problem
  - Facilitate development of sophisticated applications
  - Facilitate code sharing
  - Requires programming environments: APIs, SDKs, tools
- The systems problem
  - Facilitate coordinated use of diverse resources
  - Facilitate infrastructure sharing, e.g., certificate authorities, info services
  - Requires systems: protocols, services
  - E.g., port/service/protocol for accessing information, allocating resources
20 Aspects of the Programming Problem
- Need for abstractions and models to add to speed/robustness/etc. of development
  - E.g., OO abstractions, MPI for messaging
- Need for code/tool sharing to allow reuse of code components developed by others
  - E.g., MPI allows reuse of message passing
  - E.g., standard profilers, debuggers
- Primary need is for standard programming environments: APIs and SDKs
21 Aspects of the Systems Problem
- Need for interoperability when different groups want to share resources
  - Diverse components, policies, mechanisms
  - E.g., standard notions of identity, means of communication, resource descriptions
- Need for shared infrastructure services to avoid repeated development, installation
  - E.g., one port/service for remote access to computing, not one per tool/application
  - E.g., Certificate Authorities: expensive to run
- Need standard protocols, services, syntax
22 I.e., Standard APIs and Protocols are Both Important, For Different Reasons
- Standard APIs/SDKs are important
  - They enable application portability
  - But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)
- Standard protocols are important
  - Enable cross-site interoperability
  - Enable shared infrastructure
  - But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)
23 Grid Architecture
- We now proceed to analyze Grid systems with respect to standards
  - Identify key areas where protocols, services, APIs, and SDKs can occur
- Result is a layered protocol architecture
  - We assert this can be useful as a means of describing and structuring Grid Forum activities
24 Layered Grid Architecture (By Analogy to Internet Architecture)
25 Protocols, Services, and Interfaces Occur at Each Level
Applications
Languages/Frameworks
User Service APIs and SDKs
User Service Protocols
User Services
Collective Service APIs and SDKs
Collective Service Protocols
Collective Services
Resource APIs and SDKs
Resource Service Protocols
Resource Services
Connectivity APIs
Connectivity Protocols
Local Access APIs and Protocols
Fabric Layer
26 An Aside on Terminology
- Is this an architecture or just a categorization or taxonomy?
  - A matter of opinion (cf. IAB: "Many members of the Internet community would argue that there is no architecture")
  - Our opinion: it is somewhere in between, but is useful regardless
  - Becomes more architectural if/as we define necessary pieces at each level
- Note that the protocol architecture says nothing about the SDK/API architecture (and vice versa)
27 Important Points
- We build on Internet protocols
  - Communication, routing, name resolution, etc.
- Layering here is conceptual, does not imply constraints on who can call what
  - Protocols/services/APIs/SDKs will, ideally, be largely self-contained
  - But some things are fundamental, e.g., communication and security
  - But, advantageous for higher-level functions to use common lower-level functions
28 Example: User Portal
- Appln: Web Portal
- User: Source code discovery, application configuration
- Collective: Brokering, co-allocation, certificate authorities
- Resource: Access to data, access to computers, access to network performance data
- Connect: Communication, service discovery (DNS), authentication, authorization, delegation
- Fabric: Storage systems, schedulers
29 Example: High-Throughput Computing System
- Appln: High Throughput Computing System
- User: Dynamic checkpoint, job management, failover, staging
- Collective: Brokering, certificate authorities
- Resource: Access to data, access to computers, access to network performance data
- Connect: Communication, service discovery (DNS), authentication, authorization, delegation
- Fabric: Storage systems, schedulers
30 Standards, Again: Intergrid Protocols and Grid APIs
- One or many protocols?
  - No one right protocol for any one function
  - But interoperability requires that we define and commit to core Intergrid protocols
  - Definition: A resource is Grid-enabled if it speaks Intergrid protocols
- One or many APIs and SDKs?
  - Many APIs, SDKs, programming models can target Intergrid protocols
  - But code sharing requires standards
  - So, e.g., standard Grid collaboration APIs
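The definition of "Grid-enabled" can be read as a simple set test. The sketch below makes two assumptions that the slides do not state: that a resource must speak every core protocol, and that the core set looks like the placeholder names used here.

```python
# Placeholder protocol names; the slides do not say which protocols are core.
CORE_INTERGRID_PROTOCOLS = frozenset({"core-auth", "core-info", "core-alloc"})


def is_grid_enabled(spoken_protocols, core=CORE_INTERGRID_PROTOCOLS):
    """True if the resource speaks every core protocol; extras are allowed."""
    return core <= frozenset(spoken_protocols)
```

Under this reading, a resource may offer any number of additional protocols and APIs; only the agreed core set determines whether others can interoperate with it.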
31 Questions for the Grid Forum
- Is the Grid architecture described here a useful framework?
  - Could it be made more useful?
  - Are there things that it fails to capture or misrepresents?
- Would it be a useful discipline for us to try to place GF efforts in this context?
  - E.g., be clear whether we are defining a protocol, service, API, SDK, syntax (or something else, which is fine, too)
  - E.g., explain (and argue about) where in the stack different pieces fit
32 Questions for the Grid Forum
- Are some things easier, or more important, to standardize than others?
  - Protocols vs. APIs vs. syntax
  - Connectivity vs. resource vs. collective vs. user layer protocols/services/APIs/SDKs
- I would suggest that
  - Items lower in the stack tend to have broader impact, but standards are useful at all levels
  - Size of community affected (e.g., number of adopters) is the key figure of merit
  - We should ask explicitly for such an analysis as part of a WG charter
33 Questions for the Grid Forum
- Can we define core intergrid protocols?
  - I.e., instantiate (lower) layers in the diagram
  - We have avoided it until now (implies choice)
  - Until we do, interoperability is difficult
- Possible approaches
  - Avoid seeking consensus; instead standardize where it makes sense and where we can rely on a sense of best practice emerging
  - Or, create an architecture WG, charged with defining requirements for core protocols?
  - I think the latter is better, but am unsure if it can work
34 Summary
- Grids are about large-scale sharing
  - Hence require standard protocols to enable interoperability and shared infrastructure
  - And, of course, standard APIs and SDKs to enable portability and code sharing
  - Both important but very different
- Well-defined architecture can help understanding and progress
  - Provides a framework for figuring out where the pieces fit
  - Facilitates asking questions such as "where are standards particularly important?"
35 Questions?