Title: The gLite middleware distribution
1. The gLite middleware distribution
- OSG Consortium Meeting
- Seattle, 21-23 August 2006
2. Outline
- Background and approach adopted
- Architecture
- Software process
- Status
- Summary
3. Background and Approach
- gLite
  - Exploit experience and existing components from VDT (Condor, Globus), EDG/LCG, AliEn, and others
  - Develop a lightweight stack of generic middleware useful to EGEE applications (HEP and biomedicine are pilot applications)
  - Pluggable components cater for different implementations
  - Follow SOA approach, WS-I compliant where possible
  - Focus is on re-engineering and hardening
- Business friendly open source license
  - Plan to switch to Apache-2
4. Service Oriented Architecture
- gLite follows a Service Oriented Architecture
  - Facilitate interoperability among Grid services
  - Allow easier compliance with upcoming standards
- The services work together in a concerted way but can also be deployed and used independently, allowing their exploitation in different contexts
- Services communicate through the exchange of messages
- Slowly moving to WS-* interfaces
  - Still missing a real standard. Many WS-* specifications
  - Activity inside GGF-GIN
5. Middleware structure
- Applications have access both to Higher-Level Grid Services and to Foundation Grid Middleware
- Higher-Level Grid Services are supposed to help users build their computing infrastructure but should not be mandatory
- Foundation Grid Middleware will be deployed on the EGEE infrastructure
  - Must be complete and robust
  - Should allow interoperation with other major grid infrastructures
  - Should not assume the use of Higher-Level Grid Services
[Figure: layered stack. Applications sit on top of the Higher-Level Grid Services (Workload Management, Replica Management, Visualization, Workflow, Grid Economies, ...) and the Foundation Grid Middleware (security model and infrastructure, Computing (CE) and Storage Elements (SE), Accounting, Information and Monitoring).]
6. gLite Grid Middleware Services
[Figure: map of the gLite services. Service areas: Access, Security, Information and Monitoring, Workload Management, Data Management. Components shown: API, CLI, Authentication, Authorization, Auditing, Information and Monitoring, Application Monitoring, Metadata Catalog, File and Replica Catalog, Storage Element, Data Movement, Job Provenance, Package Manager, Accounting, Computing Element, Workload Management, Site Proxy.]
Overview paper: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf
7. Grid foundation: Accounting
- Resource usage by VO, group or single user
- Resource metering: sensors running on resources to determine usage
- Pricing policies: associate a cost to resource usage
  - if enabled, allows market-based resource brokering
- Privacy: access to accounting data granted only to authorized people (user, provider, VO manager)
- Basic functionality in APEL, full functionality in DGAS
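The metering and pricing bullets above can be illustrated with a minimal sketch. It assumes a simple per-CPU-hour price table; the record fields, host names, prices and function names are hypothetical and are not APEL or DGAS data structures.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One metering sample reported by a sensor (hypothetical fields)."""
    vo: str
    group: str
    user: str
    resource: str
    cpu_hours: float


# Hypothetical pricing policy: cost units per CPU hour on each resource.
PRICE_PER_CPU_HOUR = {"ce01.example.org": 2.0, "ce02.example.org": 0.5}


def cost(record: UsageRecord) -> float:
    """Apply the pricing policy to a single usage record."""
    return record.cpu_hours * PRICE_PER_CPU_HOUR[record.resource]


def usage_by(records, level):
    """Aggregate CPU hours and cost by 'vo', 'group' or 'user'."""
    totals = defaultdict(lambda: {"cpu_hours": 0.0, "cost": 0.0})
    for r in records:
        key = getattr(r, level)
        totals[key]["cpu_hours"] += r.cpu_hours
        totals[key]["cost"] += cost(r)
    return dict(totals)


records = [
    UsageRecord("atlas", "prod", "alice", "ce01.example.org", 10.0),
    UsageRecord("atlas", "users", "bob", "ce02.example.org", 4.0),
    UsageRecord("biomed", "imaging", "carol", "ce02.example.org", 6.0),
]
```

Calling `usage_by(records, "vo")` groups the same records per VO, and `usage_by(records, "user")` per single user. A market-based broker could compare the entries of the price table when choosing a resource.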
8. Grid foundation: Computing Element
- The CE software
  - accepts batch jobs (and job control requests) through a gatekeeper
    - LCG-CE (GT2 GRAM GSI-enabled Condor)
    - gLite-CE (GSI-enabled Condor-C)
    - CREAM (WS-I based interface)
  - performs the necessary AAA operations and maps to a local user
    - through LCAS/LCMAPS and the GRAM or glexec
  - passes the job to a layer that interacts with the local resource manager
    - BLAH
  - monitors the status of the jobs and reports it to the client
    - Condor
    - CEMon (in CREAM)
      - Web service interface to the CE info
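[Figure: the Computing Element at a site, placed between the Grid client and the LRMS. Labels: Client, Grid, Monitoring, Computing Element, Site, AAA and local mapping, Job Controller, LRMS.]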
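The four CE steps listed on this slide (accept, authorize and map, hand over to the LRMS, report status) can be sketched as a toy pipeline. Everything here is a hypothetical stand-in: the mapping table plays the role the slide gives to LCAS/LCMAPS, and `FakeLRMS` replaces a real batch system. None of it is the gLite-CE or CREAM API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical grid-identity -> local-account table.
LOCAL_ACCOUNTS = {"/DC=org/DC=example/CN=Alice": "grid001"}


@dataclass
class Job:
    owner_dn: str
    executable: str
    local_user: Optional[str] = None
    status: str = "RECEIVED"


class FakeLRMS:
    """Stand-in for the local resource manager (a batch system)."""

    def __init__(self):
        self.queue = []

    def submit(self, job: Job) -> int:
        self.queue.append(job)
        job.status = "QUEUED"
        return len(self.queue) - 1  # local job id


class ComputingElement:
    def __init__(self, lrms: FakeLRMS):
        self.lrms = lrms
        self.jobs = {}

    def accept(self, job: Job) -> int:
        # 1. Authorization and mapping to a local user.
        local_user = LOCAL_ACCOUNTS.get(job.owner_dn)
        if local_user is None:
            job.status = "REJECTED"
            raise PermissionError(f"no local mapping for {job.owner_dn}")
        job.local_user = local_user
        # 2. Hand the job to the layer in front of the LRMS.
        job_id = self.lrms.submit(job)
        self.jobs[job_id] = job
        return job_id

    def status(self, job_id: int) -> str:
        # 3. Report the job status back to the client.
        return self.jobs[job_id].status
```

A client would call `accept()` once and then poll `status()`. In the sketch a job from an unknown identity is rejected before it ever reaches the batch system.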
9. Grid foundation: Storage Element
- Site File Name (SFN): identifies a Storage Element and the logical name of the file inside it
- Physical File Name (PFN): argument of file open
- Storage Resource Manager (SRM)
  - hides the storage system implementation (disk or active tape)
  - checks the access rights to the storage system and the files
  - translates SFNs to PFNs
  - disk-based: DPM, dCache; tape-based: Castor, dCache
- File I/O: posix-like access from local nodes or the grid
  - GFAL
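The SRM duties above (check access rights, translate SFNs to PFNs) can be sketched as a single lookup function. The `sfn://host/path` form, the storage roots and the access-control table are assumptions made for illustration; this is not the SRM interface.

```python
from urllib.parse import urlparse

# Hypothetical per-Storage-Element configuration: where each SE keeps
# its files on the underlying storage system.
STORAGE_ROOTS = {"se01.example.org": "/storage/pool1"}

# Hypothetical access-control list: file name inside the SE -> allowed users.
ACL = {"/vo/data/run1.root": {"alice"}}


def sfn_to_pfn(sfn: str, user: str) -> str:
    """Check access rights, then translate an SFN into a PFN."""
    parsed = urlparse(sfn)
    host, name = parsed.hostname, parsed.path
    if host not in STORAGE_ROOTS:
        raise LookupError(f"unknown Storage Element: {host}")
    if user not in ACL.get(name, set()):
        raise PermissionError(f"{user} may not access {name}")
    # The PFN is what would finally be passed to a file open call.
    return STORAGE_ROOTS[host] + name
```

The caller only ever sees the SFN and the returned PFN, so the storage system behind `STORAGE_ROOTS` could be swapped without changing client code.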
10. High Level Services: EDS
- Encrypted Data Storage
  - encrypt and decrypt data on-the-fly
- Key-store: Hydra
  - N instances: at least M (<N) need to be available for decryption
  - fault tolerance and security
- Demonstrated with the SRM-DICOM demo at EGEE Pisa conference (Oct 2005)
[Figure annotations: "Will be LFC", "Will be DPM (now d-Cache)", "Will be GFAL"]
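The "at least M of N instances" property of the key store can be illustrated with a threshold scheme. The slide does not say how the key is split, so the sketch below uses Shamir's secret sharing as one well-known M-of-N technique; the function names and the choice of prime are assumptions, and this is not the Hydra code.

```python
import secrets

# A Mersenne prime large enough to hold a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, n: int, m: int):
    """Split `secret` into n shares; any m of them recover it."""
    if not (0 < m <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 0 < m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation
            acc = (acc * x + c) % PRIME
        return acc

    # One share (x, f(x)) per key-store instance.
    return [(x, poly(x)) for x in range(1, n + 1)]


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With N = 5 and M = 3, any three reachable instances are enough to rebuild the key (fault tolerance), while fewer than three shares do not determine it (security).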
11. High Level Services: Workload Management
- Resource brokering, workflow management, I/O data management
- Web Service interface: WMProxy
- Task Queue: keeps non-matched jobs
- Information SuperMarket: optimized cache of the information system
- Match Maker: assigns jobs to resources according to user requirements
- Job submission and monitoring
  - Condor-G
  - ICE (to CREAM)
- External interactions
  - Information System
  - Data Catalogs
  - Logging and Bookkeeping
  - Policy Management system (G-PBox)
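How the Task Queue, Information SuperMarket and Match Maker fit together can be shown with a toy model. It assumes exact-match requirements and first-fit selection, and all class and attribute names are hypothetical. The sketch also leaves out ranking, policies and data location, which a production broker would have to weigh.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    free_cpus: int
    attributes: dict = field(default_factory=dict)


@dataclass
class Job:
    name: str
    requirements: dict  # attribute -> required value


class WorkloadManager:
    def __init__(self, resources):
        # Cached snapshot of the information system (Information SuperMarket role).
        self.supermarket = list(resources)
        # Jobs that could not be matched yet (Task Queue role).
        self.task_queue = deque()

    def match(self, job):
        """Return the first resource satisfying the job requirements, or None."""
        for res in self.supermarket:
            if res.free_cpus > 0 and all(
                res.attributes.get(k) == v for k, v in job.requirements.items()
            ):
                return res
        return None

    def submit(self, job):
        res = self.match(job)
        if res is None:
            self.task_queue.append(job)  # keep the job for a later attempt
            return None
        res.free_cpus -= 1
        return res.name

    def retry_queued(self):
        """Re-run matchmaking for queued jobs, e.g. after the cache is refreshed."""
        placed = []
        for _ in range(len(self.task_queue)):
            job = self.task_queue.popleft()
            target = self.submit(job)  # re-queues the job if still unmatched
            if target is not None:
                placed.append((job.name, target))
        return placed
```

Keeping unmatched jobs in a queue means a submission does not fail just because no resource is free at that moment; it is retried when the cached resource information changes.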
12. High Level Services: Job Information
- Logging and Bookkeeping service
  - Tracks jobs during their lifetime (in terms of events)
- Job Provenance: stores long term job information
  - Supports job rerun
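Tracking a job "in terms of events" means the current job state is derived from an append-only event log rather than stored directly. A minimal sketch of that idea follows; the event kinds, state names and class names are hypothetical and do not reproduce the Logging and Bookkeeping event model.

```python
from dataclasses import dataclass

# Hypothetical event kinds and the job state each one leads to.
STATE_AFTER = {
    "Submitted": "SUBMITTED",
    "Matched": "READY",
    "Transferred": "SCHEDULED",
    "Running": "RUNNING",
    "Done": "DONE",
    "Aborted": "ABORTED",
}


@dataclass(frozen=True)
class Event:
    job_id: str
    timestamp: float
    kind: str


class Bookkeeping:
    def __init__(self):
        self.events = []  # append-only event log

    def log(self, event: Event) -> None:
        self.events.append(event)

    def history(self, job_id: str):
        """All events of one job, ordered by timestamp."""
        return sorted(
            (e for e in self.events if e.job_id == job_id),
            key=lambda e: e.timestamp,
        )

    def state(self, job_id: str) -> str:
        """Current state = state implied by the latest event."""
        hist = self.history(job_id)
        return STATE_AFTER[hist[-1].kind] if hist else "UNKNOWN"
```

Because the full history is kept, events may arrive out of order from different components and the state is still computed consistently. A long-term store could archive the same history once the job has finished.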
13. gLite Software Process
[Figure: software process flowchart. JRA1 Development delivers software to SA3 Integration, which produces Deployment Packages. SA3 Testing and Certification covers Testbed Deployment, Integration Tests and Functional Tests. SA1 Pre-Production covers Pre-Production Deployment and Scalability Tests. A pass moves the software to the next stage and finally to Release on the SA1 Production Infrastructure, together with the Installation Guide, Release Notes, etc. A failure or problem goes back to JRA1 as directives for error fixing.]
14. gLite Software Process
- Technical Coordination Group (TCG)
  - gathers and prioritizes user requirements
    - from HEP, biomed, (industry), sites
  - gLite development is client-driven!
- Software from EGEE-JRA1 and other projects
  - JRA1 preview testbed (currently being set up)
    - early exposure to users of uncertified components
- SA3 Integration Team
  - Ensures components are deployable and work
  - Deployment Modules implement high-level gLite node types
    - (WMS, CE, R-GMA Server, VOMS Server, FTS, etc)
  - Build system now spun off into the ETICS project (Jan 2006)
- SA3 Certification Team
  - Merge of the JRA1 testing and SA1 certification teams
  - Dedicated testbed: test release candidates and patches
  - Develop test suites
- SA1 Pre-Production System
  - Scale tests by users
15. gLite status
- Convergence of LCG 2.7.0 and gLite 1.5.0 in spring 2006
  - continuity on the production infrastructure ensured usability by experiments
  - new features from gLite 1.5.0
- Current activities
  - Improve usability, efficiency and performance
  - Migration to VDT 1.3.11 (GT4 pre-WS)
  - Support for Scientific Linux 4 and 64-bit
    - Support for other platforms will follow
  - New data management components for Biomed applications on the production infrastructure
  - Certify new components (CREAM, Job Provenance, G-PBox, ...)
  - Interoperation with other projects and adherence to standards
  - Open source (Apache) license
16. Summary
- gLite 3 is an important milestone in the EGEE program
  - New components from gLite 1.5 being deployed for the first time on the Production Infrastructure
  - Address requirements in terms of functionality and scalability
  - Components deployed for the first time need extensive testing!
- New organization in EGEE II
  - more controlled software process
  - development is client driven (TCG)
- Development is continuing to provide increased robustness, usability and functionality
- Collaboration with other projects for interoperability and definition/adoption of international standards
17. www.glite.org