Title: Grid Middleware and prospects on high level of standardization
1. Grid Middleware and prospects on high level of standardization
2. Why is Grid Middleware Such a Challenge? (the road from CHEP2000 to CHEP2003)
3. Because developing good software is not easy and Distributed Computing is a hard problem! (do not try it at home!)
4. Claims for benefits provided by Distributed Processing Systems
P.H. Enslow, "What is a Distributed Data Processing System?", Computer, January 1978
- High Availability and Reliability
- High System Performance
- Ease of Modular and Incremental Growth
- Automatic Load and Resource Sharing
- Good Response to Temporary Overloads
- Easy Expansion in Capacity and/or Function
5. Benefits to Science
- Democratization of Computing: you do not have to be a SUPER person to do SUPER computing.
- Speculative Science: since the resources are there, let's run it and see what we get.
- Data Mining: let's do a pair-wise comparison of these 120,000 proteins.
- Function shipping: find the image in this 3 TB collection that has a red car.
6. CERN 92
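7. The '94 Worldwide Condor Flock
[Figure: map of flocked Condor pools at Madison, Amsterdam, Delft, Geneva, Warsaw, and Dubna/Berlin. The numeric labels in the figure (3, 30, 10, 200, 3, 3, 3, 10, 10) cannot be matched to individual sites or links from the extracted text.]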
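8. The NUGn Quadratic Assignment Problem (QAP)
$$\min_{p \in \Pi} \; \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} \, b_{p(i)p(j)}$$
where $\Pi$ is the set of permutations of $\{1, \dots, n\}$.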
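As a minimal sketch of the QAP objective on this slide (choose a permutation p that minimizes the sum over i, j of a[i][j] * b[p(i)][p(j)]), the following Python enumerates every permutation of a tiny instance. The matrices `flow` and `dist` are made-up illustration data, not NUG data, and the function names are hypothetical. Exhaustive search is only workable for very small n; it is shown here to make the objective concrete, not as the method used for NUG30.

```python
from itertools import permutations

def qap_cost(a, b, p):
    """Cost of assignment p: sum over all i, j of a[i][j] * b[p[i]][p[j]]."""
    n = len(p)
    return sum(a[i][j] * b[p[i]][p[j]] for i in range(n) for j in range(n))

def solve_qap_brute_force(a, b):
    """Try all n! permutations and return (best permutation, best cost)."""
    n = len(a)
    best = min(permutations(range(n)), key=lambda p: qap_cost(a, b, p))
    return best, qap_cost(a, b, best)

# Illustrative 3x3 instance (assumed data): flows between facilities
# and distances between locations, both symmetric.
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
dist = [[0, 1, 4],
        [1, 0, 2],
        [4, 2, 0]]

best_p, best_cost = solve_qap_brute_force(flow, dist)
print(best_p, best_cost)
```

The search space grows as n!, which is why an instance with n = 30 needs a far smarter search and a very large pool of workers.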
9. Solution Characteristics (n=30)
Scientists: 4
Wall Clock Time: 6:22:04:31
Avg. workers: 653
Max. workers: 1007
CPU Time: Approx. 11 years
Nodes: 11,892,208,412
LAPs: 574,254,156,532
Parallel Efficiency: 92%
10. CMS Integration Grid Testbed: Managed by ONE Linux box at Fermi
11. Why is this so hard?
- Declarative rather than procedural definition of requests.
- Submit and forget.
- Effective planning of request execution.
- Trustworthy execution of requests.
- Distributed ownership of resources.
- Physical distribution of resources.
12. The Ethernet Protocol
- IEEE 802.3 CSMA/CD: a truly distributed (and very effective) access control protocol to a shared service.
- Client responsible for access control
- Client responsible for error detection
- Client responsible for fairness
13. Client Responsibilities
- Use algorithms that can generate very large numbers of independent tasks: use pleasantly parallel algorithms.
- Implement self-contained portable workers: this code can run anywhere!
- Detect failures and react gracefully: use exponential back off!
- Be well informed and opportunistic: get your work done and out of the way!
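The "detect failures and react gracefully" advice can be sketched as a retry loop with randomized exponential back off, in the spirit of the Ethernet analogy from the previous slide. This is a minimal illustration, not Condor code: the function names, the `base` and `cap` parameters, and the choice of a uniformly random delay under a doubling ceiling are all assumptions.

```python
import random
import time

def backoff_delays(base=1.0, cap=60.0, attempts=6, rng=random.random):
    """Yield one delay per attempt: a random fraction of a ceiling that
    doubles each time (base, 2*base, 4*base, ...) and is clamped at cap."""
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng() * ceiling

def call_with_backoff(task, attempts=6, base=1.0, cap=60.0, sleep=time.sleep):
    """Run task(); on failure wait for the next back-off delay and retry.
    Raises RuntimeError once all attempts have failed."""
    last_error = None
    delays = list(backoff_delays(base, cap, attempts))
    for attempt, delay in enumerate(delays):
        try:
            return task()
        except Exception as err:  # the client, not the service, detects the failure
            last_error = err
            if attempt < attempts - 1:
                sleep(delay)      # back off before competing for the resource again
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

Randomizing the delay keeps many clients that failed at the same moment from retrying in lockstep, which is the fairness property the client is responsible for.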
14. The Layers of Condor
Application
Submit (client)
Application Agent
Customer Agent
Matchmaker
Owner Agent
Execute (service)
Remote Execution Agent
Local Resource Manager
Resource
15. Grid
WWW
Master
Worker
16. Being a Master
- Customer deposits task(s) with the master, which is responsible for:
- Obtaining resources and/or workers
- Deploying and managing workers on obtained resources
- Assigning and delivering tasks to obtained/deployed workers
- Receiving and processing results
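The master's responsibilities listed above can be sketched in a few lines of Python. This is a toy illustration only: a local thread pool from the standard library stands in for remotely obtained workers, and the `Master` class and its method names are hypothetical, not the API of the MW library or Condor.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

class Master:
    """Toy master: accepts tasks from a customer and farms them out to workers."""

    def __init__(self, worker_fn, max_workers=4):
        self.worker_fn = worker_fn      # code each worker runs on one task
        self.max_workers = max_workers  # how many workers we try to obtain
        self.results = {}

    def run(self, tasks):
        # Obtain resources and deploy workers (here: threads in a local pool).
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            # Assign and deliver each task to some worker.
            pending = {pool.submit(self.worker_fn, task): task_id
                       for task_id, task in enumerate(tasks)}
            # Receive and process results in whatever order they finish.
            for future in as_completed(pending):
                self.results[pending[future]] = future.result()
        return self.results

# Usage: the customer deposits four independent tasks with the master.
master = Master(worker_fn=lambda x: x * x, max_workers=3)
print(master.run([1, 2, 3, 4]))
```

Results are keyed by task index because workers may finish in any order; a real master would also have to handle workers that disappear mid-task.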
17. Client
Master
MW Library (master)
Condor-G
Communication Library
GT-Gate Keeper
StartD
18. Planning and Executing
19. Customer orders agent: Place y = F(x) at L! Grid delivers.
20. A simple plan for y = F(x) -> L
- Allocate (size(x) + size(y) + size(F)) at SE(i)
- Move x from SE(j) to SE(i)
- Place F on CE(k)
- Compute F(x) at CE(k)
- Move y to L
- Release allocated space
Storage Element (SE); Compute Element (CE)
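The six-step plan above can be sketched as ordinary sequential code. This is a toy in-memory model under stated assumptions: `StorageElement` and `execute_plan` are hypothetical names, storage is a Python dict, the compute element is just a function call, and the sizes of F and y are supplied by the caller because the plan needs them before F runs.

```python
class StorageElement:
    """Toy SE: tracks reserved space and holds named files in memory."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.reserved = 0
        self.files = {}

    def allocate(self, nbytes):
        if self.reserved + nbytes > self.capacity:
            raise RuntimeError(f"{self.name}: not enough space")
        self.reserved += nbytes

    def release(self, nbytes):
        self.reserved -= nbytes

def execute_plan(F, f_size, y_size, se_j, se_i, location):
    """Place y = F(x) at `location`, staging through SE(i)."""
    x = se_j.files["x"]
    need = len(x) + y_size + f_size
    se_i.allocate(need)                 # 1. allocate space at SE(i)
    try:
        se_i.files["x"] = x             # 2. move x from SE(j) to SE(i)
        compute = F                     # 3. place F on CE(k)
        y = compute(se_i.files["x"])    # 4. compute F(x) at CE(k)
        se_i.files["y"] = y
        location["y"] = y               # 5. move y to L
    finally:
        se_i.files.clear()              # 6. release allocated space,
        se_i.release(need)              #    even if an earlier step failed
    return location["y"]

# Usage with made-up data: x is 3 bytes, F upper-cases it.
se_j = StorageElement("SE(j)", capacity=100)
se_j.files["x"] = b"abc"
se_i = StorageElement("SE(i)", capacity=10)
L = {}
print(execute_plan(bytes.upper, f_size=2, y_size=3, se_j=se_j, se_i=se_i, location=L))
```

Putting the release step in `finally` reflects the point that space management is part of the plan itself: an aborted computation must not leave the reservation behind.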
21. Data Placement (DaP) is an integral part of end-to-end functionality
Space management and Data transfer
22. Client
Application
DAGMan
Planning
Stork
Condor-G
GateKeeper
RFT
NeST
StartD
23. How to deliver a common (best practice) middleware suite to the community?
24. Best practice
- Requires a true collaboration between all parties
- Requires extensive testing
- Requires deployment and adoption
- Requires evaluation metrics
- Requires open minds
25. Common Middleware
- Requires robust implementation.
- Requires professional support.
- Requires continuity and longevity.
- Requires willingness to resolve show stoppers.
- Requires commitment to the concept.
- Requires an exit strategy.
26. The Virtual Data Toolkit (VDT) is an attempt to deliver such a middleware suite
27. What is VDT?
- Part of the GriPhyN and VDT projects
- Contributions from many middleware providers (Condor, Globus, PACMAN, EDG, PPDG, ...)
- Extensive testing in semi-production conditions
- Close interaction with the experiments
- Support infrastructure
- Adopted by key players
28. Standard middleware?
adjective
1. Serving as or conforming to a standard of measurement or value.
2. Widely recognized as a model of authority or excellence: a standard reference work.
3. Acceptable but of less than top quality: a standard grade of beef.
4. Normal, familiar, or usual: the standard excuse.
5. Commonly used or supplied: standard car equipment.
6. Linguistics. Conforming to established educated usage in speech or writing.