Title: gLite, the next generation middleware for Grid computing
1. gLite, the next generation middleware for Grid computing
- Oxana Smirnova (Lund/CERN)
- Nordic Grid Neighborhood Meeting
- Linköping, October 20, 2004
- Uses material from E. Laure and F. Hemmer
2. gLite
- What is gLite?
  - The next generation middleware for grid computing
  - A collaborative effort of more than 80 people in 10 different academic and industrial research centers
  - Part of the EGEE project (http://www.eu-egee.org)
  - A "bleeding-edge, best-of-breed framework for building grid applications tapping into the power of distributed computing and storage resources across the Internet" (quoted from http://www.glite.org)
[Figure: EGEE Activity Areas]
- Nordic contributors: HIP, PDC, UiB
3. Architecture guiding principles
- Lightweight services
  - Easily and quickly deployable
  - Use existing services where possible as a basis for re-engineering
  - Lightweight does not mean fewer services or non-intrusiveness; it means modularity
- Interoperability
  - Allow for multiple implementations
- Performance/scalability, resilience/fault tolerance
  - Large-scale deployment and continuous usage
- Portability
  - Being built on Scientific Linux and Windows
- Co-existence with deployed infrastructure
  - Reduce requirements on participating sites
  - Flexible service deployment: multiple services running on the same physical machine (if possible)
  - Co-existence with LCG-2 and OSG (US) is essential for the EGEE Grid service
- Service-oriented approach
- 60 external dependencies
4. Service-oriented approach
- By adopting the Open Grid Services Architecture, with components that are:
  - Loosely coupled (communicating via messages)
  - Accessible across the network; modular and self-contained; with clean modes of failure
  - Able to change implementation without changing interfaces (see the sketch after this list)
  - Able to be developed in anticipation of new use cases
- Follow WSRF standardization
  - No mature WSRF implementations exist to date, so start with plain WS
  - WSRF compliance is not an immediate goal, but the WSRF evolution is followed
  - WS-I compliance is important
5. gLite vs LCG-2
- Intended to replace LCG-2
- Starts with existing components
- Aims to address LCG-2 shortcomings and advanced needs from applications (in particular feedback from the data challenges)
- Prototyping: short development cycles for fast user feedback
- Initial web-services based prototypes are being tested with representatives from the application groups
6. Approach
- Exploit experience and components from existing projects
  - AliEn, VDT, EDG, LCG, and others
- Design team works out architecture and design
  - Architecture: https://edms.cern.ch/document/476451
  - Design: https://edms.cern.ch/document/487871/
  - Feedback and guidance from EGEE PTF, EGEE NA4, LCG GAG, LCG Operations, LCG ARDA
- Components are initially deployed on a prototype infrastructure
  - Small scale (CERN, Univ. Wisconsin)
  - Get user feedback on service semantics and interfaces
- After internal integration and testing, components are deployed on the pre-production service
7. Subsystems/components

LCG-2 components              gLite services
----------------              --------------
User Interface                User Interface (AliEn shell)
Computing Element             Computing Element
Worker Node
Workload Management System    Workload Management System
                              Package Management
                              Job Provenance
Logging and Bookkeeping       Logging and Bookkeeping
Data Management               Data Management
Information and Monitoring    Information and Monitoring
                              Job Monitoring
                              Accounting
                              Site Proxy
Security                      Security
Fabric management
8. Workload Management System
9. Computing Element
- Works in push or pull mode (see the sketch after this list)
- Site policy enforcement
- Exploits the new Globus GK and Condor-C (close interaction with the Globus and Condor teams)
Legend: CEA = Computing Element Acceptance; JC = Job Controller; MON = Monitoring; LRMS = Local Resource Management System
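As a rough illustration of the two modes (a toy Python sketch, not the actual CE implementation; all names are invented), push mode has the WMS hand a job directly to the CE, while pull mode has the CE fetch work from a central task queue when it has free slots:

    import queue

    task_queue = queue.Queue()              # stands in for the central task queue

    class ComputingElement:
        def __init__(self, free_slots):
            self.free_slots = free_slots

        def accept(self, job):              # push mode: the WMS calls this
            if self.free_slots > 0:
                self.free_slots -= 1
                print("running pushed job", job)

        def pull(self):                     # pull mode: the CE polls for work
            while self.free_slots > 0 and not task_queue.empty():
                job = task_queue.get()
                self.free_slots -= 1
                print("running pulled job", job)

    ce = ComputingElement(free_slots=2)
    ce.accept("job-1")                      # pushed by the WMS
    task_queue.put("job-2")                 # queued centrally, fetched by the CE
    ce.pull()

Supporting both modes in one CE is what lets the same site serve a classic push-style WMS and an AliEn-style pull scheduler.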
10. Data Management
- Scheduled data transfers (treated like jobs)
- Reliable file transfer (a retry sketch follows this list)
- Site self-consistency
- SRM-based storage
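A hedged sketch of what "reliable file transfer" means in practice: transfers are queued and retried with back-off until they succeed or a retry budget is exhausted. transfer_once() below is a hypothetical stand-in for a real mover such as GridFTP; the failure injection exists only to exercise the retry loop.

    import random
    import time

    def transfer_once(source, dest):
        """Hypothetical mover (a real GridFTP call would go here);
        fails randomly so the retry logic has something to do."""
        if random.random() < 0.5:
            raise IOError("transfer failed")

    def reliable_transfer(source, dest, max_retries=3, backoff=0.1):
        for attempt in range(1, max_retries + 1):
            try:
                transfer_once(source, dest)
                print(source, "->", dest, "ok on attempt", attempt)
                return True
            except IOError as err:
                print("attempt", attempt, "failed:", err)
                time.sleep(backoff * attempt)   # simple linear back-off
        return False                            # caller may re-queue the transfer

    reliable_transfer("srm://se1.example.org/f", "srm://se2.example.org/f")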
11. Storage Element Interfaces
- SRM interface
  - Management and control
  - SRM 1.1 (with possible evolution)
- POSIX-like File I/O
  - File access: open, read, write
  - Not real POSIX (like rfio); a toy sketch follows the diagram note below
[Diagram: the user reaches the SE through the SRM control interface and a POSIX-like File I/O API, which maps onto access protocols (rfio, dcap, chirp, aio) in front of storage back-ends (dCache, NeST, Castor, plain disk)]
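The following toy Python class suggests what a "POSIX-like but not real POSIX" file API looks like: a narrow open/read/write/close surface in the spirit of rfio or gLite-I/O. The class name and use of the local filesystem are illustrative assumptions; a real SE would resolve an SURL through SRM first.

    class GridFile:
        """Narrow open/read/write/close surface; deliberately not full
        POSIX (no seek, stat, permissions, etc.)."""
        def __init__(self, path, mode="rb"):
            self._f = open(path, mode)   # a real SE would resolve an SURL via SRM

        def read(self, n=-1):
            return self._f.read(n)

        def write(self, data):
            return self._f.write(data)

        def close(self):
            self._f.close()

    f = GridFile("demo.dat", "wb")       # local file standing in for grid storage
    f.write(b"hello grid")
    f.close()
    f = GridFile("demo.dat", "rb")
    print(f.read())
    f.close()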
12. Catalogs
- File Catalog
  - Filesystem-like view on logical file names
  - Keeps track of the sites where data is stored (see the resolution sketch below)
  - Conflict resolution
- Replica Catalog
  - Keeps replica information at a site
- (Metadata Catalog)
  - Attributes of files on the logical level
  - Boundary between generic middleware and the application layer
[Diagram: the Metadata Catalog attaches metadata to LFNs; the File Catalog maps each LFN to a GUID and to the Site IDs holding replicas]
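A minimal sketch of the catalog split described above, with entirely invented data: the File Catalog maps an LFN to a GUID plus the sites holding replicas, and per-site Replica Catalogs yield the physical locations.

    # All identifiers, sites, and URLs below are invented for illustration.
    file_catalog = {
        "lfn:/grid/demo/raw001": {"guid": "guid-1234", "sites": ["CERN", "Lund"]},
    }
    replica_catalogs = {                 # one replica catalog per site
        "CERN": {"guid-1234": "srm://se.cern.example/raw001"},
        "Lund": {"guid-1234": "srm://se.lund.example/raw001"},
    }

    def resolve(lfn):
        """LFN -> GUID -> physical replicas, via the two catalog levels."""
        entry = file_catalog[lfn]
        return [replica_catalogs[site][entry["guid"]] for site in entry["sites"]]

    print(resolve("lfn:/grid/demo/raw001"))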
13. Information and Monitoring
- R-GMA for:
  - the information system and system monitoring
  - application monitoring (a toy producer/consumer sketch follows below)
- No major changes in architecture
  - But re-engineer and harden the system
- Co-existence and interoperability with other systems is a goal
  - e.g. MonALISA
[Figure: e.g. D0 application monitoring. Legend: MPP = Memory Primary Producer; DbSP = Database Secondary Producer]
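To illustrate the relational producer/consumer model behind R-GMA (a toy Python sketch only; the real system's registry, mediator, and SQL interface are omitted, and the table layout is invented): producers publish tuples into a virtual table, and consumers run WHERE-style selections over it.

    table = []                           # one "virtual table" of monitoring tuples

    def publish(site, service, status):  # a primary producer inserts tuples
        table.append({"site": site, "service": service, "status": status})

    def select(predicate):               # a consumer runs a WHERE-style query
        return [row for row in table if predicate(row)]

    publish("CERN", "CE", "up")
    publish("Lund", "SE", "down")
    print(select(lambda row: row["status"] == "down"))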
14. Security
[Diagram: pseudonymity workflow.
 1. Joe obtains Grid (X.509) credentials from credential storage (myProxy).
 2. An optional Pseudonymity Service maps Joe to the pseudonym Zyx (details tbd).
 3. An Attribute Authority (VOMS) issues Joe's privileges to Zyx.
 4. The user Zyx (certificate issuer: pseudo-CA) accesses the Grid through GSI and LCAS/LCMAPS.]
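A very simplified Python rendering of the four-step flow in the diagram; every function here is a placeholder of my own naming, not a real myProxy or VOMS API.

    def get_credentials(user):           # 1. fetch X.509 credentials (myProxy role)
        return "x509-credentials-for-" + user

    def pseudonymize(user):              # 2. optional pseudonymity service
        return "Zyx" if user == "Joe" else user

    def issue_attributes(subject, privileges):  # 3. attribute authority (VOMS role)
        return {"subject": subject, "privileges": privileges}

    cred = get_credentials("Joe")
    alias = pseudonymize("Joe")
    proxy = issue_attributes(alias, ["vo-member"])
    print(cred, proxy)                   # 4. Zyx presents this to the Grid (GSI)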
15. GAS, Package Manager
- Grid Access Service (GAS)
  - Discovers and manages services on behalf of the user (a facade sketch follows this list)
  - File and metadata catalogs are already integrated
- Package Manager
  - Provides application software at the execution site
  - Based upon existing solutions
  - Details being worked out together with the experiments and operations
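As a hypothetical sketch of the GAS idea (the class name, registry layout, and endpoints below are all invented): a single entry point discovers services on the user's behalf and hands back something the client can use.

    class GridAccessService:
        """Single entry point that discovers services for the user."""
        def __init__(self):
            self._registry = {           # discovered endpoints (all invented)
                "file_catalog": "https://catalog.example.org/ws",
                "metadata_catalog": "https://metadata.example.org/ws",
            }

        def get_service(self, name):
            endpoint = self._registry.get(name)
            if endpoint is None:
                raise KeyError("no such service: " + name)
            return endpoint              # a real GAS would return a service proxy

    gas = GridAccessService()
    print(gas.get_service("file_catalog"))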
16. Current Prototype
- WMS
  - AliEn TaskQueue, EDG WMS, EDG LB (CNAF)
- CE (CERN, Wisconsin)
  - Globus Gatekeeper, Condor-C, PBS/LSF, pull component (AliEn CE)
- WN
  - 23 at CERN, 1 at Wisconsin
- SE (CERN, Wisconsin)
  - External SRM implementations (dCache, Castor), gLite-I/O
- Catalogs (CERN)
  - AliEn FileCatalog, RLS (EDG), gLite Replica Catalog
- Data Scheduling (CERN)
  - File Transfer Service (Stork)
- Data Transfer (CERN, Wisconsin)
  - GridFTP
- Metadata Catalog (CERN)
  - Simple interface defined
- Information and Monitoring (CERN, Wisconsin)
  - R-GMA
- Security
  - VOMS (CERN), myProxy, grid-mapfile and GSI security
- User Interface (CERN, Wisconsin)
  - AliEn shell, CLIs and APIs, GAS
- Package Manager
  - Prototype based on the AliEn PM
17. Summary, plans
- Most Grid systems (including LCG-2) are oriented towards batch-job production; gLite also addresses distributed analysis
- The two will most likely co-exist, at least for a while
- A prototype exists; new services are being added
  - Dynamic accounts, gLite CEMon, Globus RLS, File Placement Service, Data Scheduler, fine-grained authorization, accounting
- A Pre-Production Testbed is being set up
  - More sites, tested/stable services
- First release due end of March 2005
  - Functionality freeze at Christmas
  - Intense integration and testing period from January to March 2005
- Second release candidate: November 2005
  - Revised architecture document in May, revised design document in June