Title: Open Grid Forum HPC Interoperability Demonstration: WMProxy (Alessandro Maraschini)
1 Open Grid Forum HPC Interoperability Demonstration: WMProxy
Alessandro Maraschini
alessandro.maraschini_at_datamat.it
SC06, Tampa, FL, USA, 11-17 November 2006
2 Contents
- Enabling Grids for E-sciencE (EGEE) project
- gLite middleware
- Workload Manager Proxy (WMProxy)
- Job Submission Languages
- Basic Execution Service (BES)
- BES and JSDL
- WMProxy and BES
- Interoperability Demo
3 gLite Middleware
- Evolution of the EDG/LCG experience
- New features
  - Better security
  - More effective job submission
- Grid services based on a Service Oriented Architecture
  - Easy to connect to other Grid services
  - Facilitates compliance with upcoming Grid standards
4 The EGEE Project
- EGEE: the biggest multi-science Grid infrastructure in the world
- Aim of EGEE
  - to establish a seamless European Grid infrastructure for the support of the European Research Area (ERA)
- EGEE
  - 1 April 2004 to 31 March 2006
  - 71 partners in 27 countries, federated in regional Grids
- EGEE-II
  - 1 April 2006 to 31 March 2008
  - Expanded consortium
  - 91 partners
5 Background and Approach
- gLite
  - Exploit experience and existing components from VDT (Condor, Globus), EDG/LCG, and others
  - gLite is a distribution that combines components from many different providers!
  - Develop a lightweight stack of generic middleware useful to EGEE applications
  - Pluggable components
  - Follow the SOA approach, WS-I compliant where possible
  - Focus is on re-engineering and hardening
  - Business-friendly open source license
    - Plan to switch to Apache-2
6 Service Oriented Architecture
- gLite follows a Service Oriented Architecture
  - Facilitates interoperability among Grid services
  - Allows easier compliance with upcoming standards
- The services work together in a concerted way but can also be deployed and used independently, allowing their exploitation in different contexts
- Services communicate through the exchange of messages
- Moving to WS-* interfaces
  - Still missing a real standard: many WS-* specifications
  - Activity inside OGF-GIN
7 gLite Middleware Stack
- Job management services
  - Workload Management
  - Computing Element
  - Logging and Bookkeeping
  - Job Provenance
- Data management services
  - File and Replica catalog
  - File Transfer and Placement Services
- Information services
  - R-GMA
  - Service Discovery
- Security
- Deployment Modules
  - Distribution available as RPMs, binary tarballs, source tarballs and APT cache
[Figure: gLite service architecture. Service groups: Access Services, Security Services, Information and Monitoring Services, Data Services, Job Management Services. Components shown: Grid Access Service, API, Authorization, Authentication, Auditing, Application Monitoring, Information and Monitoring, Metadata Catalog, File and Replica Catalog, Storage Element, Data Management, Job Provenance, Logging and Bookkeeping, Accounting, Workload Management, Computing Element, Site Proxy]
8 WMProxy overview
- Workload Manager Proxy (WMProxy)
- Web Service interface
  - Simplicity/Extensibility
  - Accessibility: greater community of users
  - Integration: eases interoperability among different middlewares
- Stepping towards interoperability
  - SOA conformance: WSDL service description
  - WS-I compliance: support for multi-language, multi-platform clients
- Features
  - Provide access to the gLite Workload Management System
  - Serve a large number of requests
  - Improve performance
  - Improve usability
  - Provide new functionalities
9 WMProxy Architecture (1/4)
- Service container
  - Apache, GridSite, FastCGI
- SOAP tooling
  - Stub generation is performed through gSOAP (C++)
- Security
  - SSL authentication
  - DN / FQAN-based authorization
- Operations
  - Job Submission and Control services implementation (C++)
  - GridSite Delegation service implementation (C++)
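As an illustration of DN / FQAN-based authorization, here is a minimal Python sketch. The slides do not describe the policy format, so the allow-list, the pattern syntax and the function name are all assumptions made for this example; it only shows the idea of accepting a request when either the user's DN or one of the VOMS FQANs matches a configured entry.

```python
from fnmatch import fnmatchcase

# Hypothetical allow-list, invented for this sketch (not a real WMProxy policy file).
ALLOWED_PATTERNS = [
    "/EGEE/*",                      # any group or role inside the EGEE VO
    "/C=IT/O=Example/CN=Jane Doe",  # one specific user DN
]


def is_authorized(dn, fqans, patterns=ALLOWED_PATTERNS):
    """Return True if the DN or any FQAN matches one of the allowed patterns."""
    credentials = [dn, *fqans]
    return any(fnmatchcase(c, p) for c in credentials for p in patterns)
```

A user presenting the FQAN /EGEE/Role=NULL/Capability=NULL is accepted through the first entry, whatever the DN; a user with no matching DN or FQAN is rejected.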
10 WMProxy Architecture (2/4)
- gSOAP layer
  - intermediate layer between gSOAP and the WMProxy core
- Directory Manager
  - creation and management of the job reserved area on the server file system
- LB Access
  - interaction with LB components
  - job registration
  - logging of job events
  - job information queries
- Request Delivery
  - delivers requests to the Workload Manager
11 WMProxy Architecture (3/4)
- WMProxy integration with the WMS
[Figure: components on the server host: Apache, WMProxy Server, LB Proxy, Request Queue, Local File System]
12 WMProxy Architecture (4/4)
[Figure: WMProxy Server internals; labelled region: gSOAP independent]
13 WMProxy Job Operations
- Main operations provided
- Submission
  - jobListMatch - finds the resources matching the job requirements
  - jobRegister - registers a job for submission
  - getSandboxDestURI - returns the URI of the location where the job's input sandbox files have to be uploaded
  - jobStart - starts a previously registered job
  - jobSubmit - one-shot job submission (job registration + job start)
- Control
  - jobCancel - cancels a previously submitted job
  - jobPurge - cleans up the job's reserved area on the WMS node
  - getOutputFileList - returns the URIs of the job's output files
  - getPerusalFiles - allows inspection of the job's files while the job is running
- Miscellaneous
  - getJDL - returns the description of a previously submitted job
  - getFreeQuota - gets the disk space available to the user for job sandboxes on the WMS node
  - get/add/removeACLItems - handle the entries of the job's GACL file
14 WMProxy New Features
- Bulk submission: compound jobs submitted to WMProxy
  - Directed Acyclic Graph (DAG)
    - Set of jobs where the input/output/execution of one or more jobs depends on one or more other jobs
  - Collection
    - Group of jobs with no dependencies
  - Parametric
    - Job having one or more attributes whose values vary according to parameters
- Main benefits
  - One-shot submission for (up to thousands of) jobs
  - Submission time reduction
    - Single call to the WMProxy server
    - Single AuthN and AuthZ process
    - Sharing of files between jobs
  - Single JobId to manage all jobs
- Job sandbox management
  - Shared input sandbox
  - Archived/compressed input sandbox transfer
  - External input/output sandboxes
- Job file perusal
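To make the difference between a DAG and a collection concrete, the following Python sketch orders a small set of jobs so that each one comes after the jobs it depends on. It is only an illustration of the dependency notion: the job names are invented, and the standard-library graphlib module stands in for whatever the WMS actually does when scheduling DAG nodes.

```python
from graphlib import TopologicalSorter


def submission_order(dependencies):
    """Return job names so that each job follows the jobs it depends on.

    `dependencies` maps a job name to the set of jobs it must wait for.
    A collection is simply the case where every set is empty.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical compound request: two independent producers feed one consumer.
dag = {
    "fetch_a": set(),
    "fetch_b": set(),
    "merge": {"fetch_a", "fetch_b"},
    "plot": {"merge"},
}
order = submission_order(dag)
```

Here "merge" always appears after both "fetch" jobs and "plot" comes last; a cyclic description is rejected with graphlib.CycleError, which is why the graph has to be acyclic.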
15 WMProxy Job Submission Chain
- Submission chain steps
- jobRegister
  - Submission request specified by the client
    - Job Description Language (next slide)
  - Security issues checked
    - Authentication: client SSL verification
    - Authorization: client mapping to a local (server) user with LCMAPS
  - Job identifier (job Id) generated
  - Local directories and files generated
  - LB job registration
  - Job identifier is returned to the user
- jobStart
  - WMProxy checks that the job has been previously registered and not yet started
  - Server-specific attributes needed by the WMS are inserted in the JDL
  - For compound requests
    - Sub-job registration to LB is performed
    - Sub-job local directories and files are created with appropriate ownership and permissions
  - Input sandbox archive files (if any) are uncompressed
  - Job delivery to the Workload Manager
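The register/start chain can be sketched as a small state machine. The Python class below is an in-memory stand-in written for this document, not the WMProxy implementation: the class name, the identifier format and the state names are assumptions, and security checks, LB registration and sandbox handling are left out.

```python
import uuid


class ToySubmissionService:
    """In-memory stand-in for the register/start chain; not the real WMProxy."""

    def __init__(self):
        self._jobs = {}  # job id -> {"jdl": str, "state": str}

    def job_register(self, jdl):
        """Store the description and hand back a fresh job identifier."""
        job_id = uuid.uuid4().hex  # hypothetical identifier format
        self._jobs[job_id] = {"jdl": jdl, "state": "registered"}
        return job_id

    def job_start(self, job_id):
        """Start a job only if it is registered and not yet started."""
        job = self._jobs.get(job_id)
        if job is None:
            raise KeyError(f"unknown job: {job_id}")
        if job["state"] != "registered":
            raise RuntimeError(f"job already started: {job_id}")
        job["state"] = "started"

    def job_submit(self, jdl):
        """One-shot submission: registration followed by start."""
        job_id = self.job_register(jdl)
        self.job_start(job_id)
        return job_id

    def state(self, job_id):
        """Return the current state of a known job."""
        return self._jobs[job_id]["state"]
```

Splitting registration from start is what gives the client a window to upload the input sandbox between the two calls; the one-shot variant skips that window.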
16 gLite Job Description Language
- Job Description Language (JDL)
  - gLite approach to submission description
  - ClassAd-based language
- Allows the user to specify
  - Characteristics of the application
    - Executable
    - Arguments
    - Input/Output sandbox files
  - Requirements/preferences about resources
    - computational
    - storage
  - Hints for the gLite WMS on how to handle the application
    - number of retries
    - proxy renewal
    - ...
17 Job Description Language
Executable = "my-executable.sh";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"/home/maraska/my-executable.sh"};
OutputSandbox = {"std.out", "std.err"};
requirements = other.GlueCEStateStatus == "Production";
rank = -other.GlueCEStateEstimatedResponseTime;
virtualOrganisation = "EGEE";
RetryCount = 3;
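A description like the one above is a flat list of attribute assignments, so it is easy to generate from a program. The Python sketch below renders a dictionary as JDL text; the helper names (Expr, render_value, render_jdl) are invented for this example, and the quoting rules are a simplification that assumes backslash escaping inside strings.

```python
class Expr(str):
    """Marks a value to be emitted verbatim, e.g. a requirements or rank expression."""


def render_value(value):
    """Render one Python value in JDL syntax."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(render_value(v) for v in value) + "}"
    return str(value)


def render_jdl(attributes):
    """Render a mapping of attribute names to values, one assignment per line."""
    return "\n".join(f"{name} = {render_value(v)};" for name, v in attributes.items())


text = render_jdl({
    "Executable": "my-executable.sh",
    "OutputSandbox": ["std.out", "std.err"],
    "RetryCount": 3,
    "rank": Expr("-other.GlueCEStateEstimatedResponseTime"),
})
```

Plain strings are quoted, lists become brace-enclosed lists, and values wrapped in Expr are passed through untouched so that expressions are not turned into string literals.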
18 gLite Job Description Language
- Features
  - Support for
    - Simple jobs
    - Complex jobs: MPI, Interactive, Partitionable, Checkpointable
    - Aggregates of jobs: DAGs, Collections, Parametric
  - Fully extensible
  - Not bound to any resource description schema
    - GLUE schema is the default
- Issues
  - Quite strongly gLite-specific
  - Tightly bound to the gLite WMS
- Other approaches
  - JSDL (next slide)
  - JSDL support within WMProxy through the application of an XSLT stylesheet (JSDL2JDL transformation)
19 Job Submission Description Language (JSDL)
- The need for interoperability
  - Different Grid environments, which involve
  - Different organisations, which involve
  - Different job management systems, which involve
  - Different job submission languages...
- Job Submission Description Language
  - Generic Grid environment language
  - XML-based language
  - Normative XML schema provided (v1.0 is out)
  - Normative extensions
    - POSIX-Application
  - Others to come
    - HPCProfile-Application
    - Parallel Application (MPI)
    - Parameter Sweep
20 Job Submission Description Language
- JSDL features
  - Web Service oriented (XML)
  - Not bound to a specific Grid middleware
  - Independent of language bindings
  - GGF/OGF recommendation
    - On its way to becoming a standard
  - Already adopted by many Grid projects (NAREGI, UniGrids, GridSAM, ...)
- JSDL issues
  - Currently supports only simple applications
    - POSIXApplication extension
  - No support for job aggregates (collections, DAGs, parametrics, ...)
  - Poor/static schema for expressing requirements on resources
21 Job Submission Description Language
- JSDL structure
  - JobDefinition
    - Main type, contains all attributes
  - JobIdentification
    - Specific job identifiers
  - Application (POSIX)
    - Job executable properties (arguments, environment variables, ...)
    - Job restrictions (files, core dumps, memory size limits, ...)
  - Resources
    - Actual resources needed by the job to be executed
    - Matches (partially) the JDL Requirements expression
  - DataStaging
    - Management of job input/output files
22 Job Submission Description Language
- JSDL example (1/3)
- Identification and Application
<?xml version="1.0" encoding="UTF-8"?>
<jsdl:JobDefinition xmlns="http://www.example.org/"
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>My Gnuplot invocation</jsdl:JobName>
      <jsdl:JobProject>EGEE</jsdl:JobProject>
      <jsdl:Description>Simple application invocation</jsdl:Description>
    </jsdl:JobIdentification>
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/bin/myexe.sh</jsdl-posix:Executable>
        <jsdl-posix:Argument>control1.txt</jsdl-posix:Argument>
        <jsdl-posix:Argument>control2.txt</jsdl-posix:Argument>
        <jsdl-posix:Argument>control3.txt</jsdl-posix:Argument>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
23 Job Submission Description Language
- JSDL example (2/3)
- Resources
    <jsdl:Resources>
      <jsdl:IndividualPhysicalMemory>
        <jsdl:LowerBoundedRange jsdl:exclusiveBound="true">2097152.0</jsdl:LowerBoundedRange>
      </jsdl:IndividualPhysicalMemory>
      <jsdl:TotalCPUCount>
        <jsdl:Exact>1.0</jsdl:Exact>
      </jsdl:TotalCPUCount>
      <jsdl:CandidateHosts>
        <jsdl:HostName>lxgrid01.pd.infn.it</jsdl:HostName>
        <jsdl:HostName>lxgrid02.pd.infn.it</jsdl:HostName>
      </jsdl:CandidateHosts>
      <jsdl:CPUArchitecture>
        <jsdl:CPUArchitectureName>Intel</jsdl:CPUArchitectureName>
      </jsdl:CPUArchitecture>
    </jsdl:Resources>
24 Job Submission Description Language
- JSDL example (3/3)
- DataStaging
    <jsdl:DataStaging>
      <jsdl:FileName>control.txt</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>http://foo.bar.com/me/control.txt</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>output1.png</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Target>
        <jsdl:URI>rsync://spoolmachine/userdir</jsdl:URI>
      </jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
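The slides state that WMProxy supports JSDL through an XSLT stylesheet (the JSDL2JDL transformation), but the stylesheet itself is not shown. As a rough illustration of what such a mapping involves, the Python sketch below extracts the POSIX executable and arguments from a JSDL document and emits the corresponding JDL attributes. It uses the standard-library ElementTree parser instead of XSLT, the function name is invented, and the choice of mapped attributes is an assumption.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used in the JSDL example; the prefixes are local to this sketch.
NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}


def jsdl_to_jdl(jsdl_text):
    """Map the POSIX executable and arguments of a JSDL document to JDL text."""
    root = ET.fromstring(jsdl_text)
    app = root.find(".//posix:POSIXApplication", NS)
    if app is None:
        raise ValueError("no POSIXApplication element found")
    executable = app.findtext("posix:Executable", default="", namespaces=NS).strip()
    arguments = [(a.text or "").strip() for a in app.findall("posix:Argument", NS)]
    lines = [f'Executable = "{executable}";']
    if arguments:
        lines.append('Arguments = "' + " ".join(arguments) + '";')
    return "\n".join(lines)


SAMPLE = """
<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
                    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/bin/myexe.sh</jsdl-posix:Executable>
        <jsdl-posix:Argument>control1.txt</jsdl-posix:Argument>
        <jsdl-posix:Argument>control2.txt</jsdl-posix:Argument>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""
jdl = jsdl_to_jdl(SAMPLE)
```

Resources and DataStaging are deliberately ignored here; mapping them to JDL requirements and sandboxes is the harder part, consistent with the remark that Resources matches the JDL Requirements expression only partially.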
25 Basic Execution Service
- The Basic Execution Service (OGSA-BES)
  - Interoperability issues
    - Defining the idea of execution across different OSs
  - Using JSDL as the job description language
    - JSDL High Performance Computing (HPC) Profile extension
    - Ongoing activity (attribute set still poor)
- Defined BES interfaces
  - Activity
    - Create, monitor and control a single Activity
  - Factory
    - Create, monitor and control sets of Activities
  - Management
    - Monitor and control BES details (administrators)
- SC06 demo scope: BES-Factory implementation
26 Basic Execution Service
- BES-Factory interface (1/2)
  - CreateActivity
    - Performs the submission of a job
    - INPUT: JSDL used to describe the job characteristics
    - OUTPUT: EndpointReference (EPR) containing a unique identification of the created activity
  - GetActivityStatuses
    - Monitors the activity/activities
    - INPUT: one (or more) EPR referring to a previously created Activity
    - OUTPUT: corresponding status (Pending, Running, Cancelled, Failed, Finished)
  - TerminateActivities
    - Terminates the activity/activities
    - INPUT: one (or more) EPR referring to a previously created Activity
    - OUTPUT: corresponding termination result(s) (true/false)
27 Basic Execution Service
- BES-Factory interface (2/2)
  - GetActivityDocuments
    - Retrieves the JSDL submitted when creating the activity (activities)
    - INPUT: one (or more) EPR referring to a previously created Activity
    - OUTPUT: one (or more) JSDL documents corresponding to the created Activity
  - GetFactoryAttributesDocument
    - Retrieves a set of information from the contacted endpoint
    - INPUT: no input required
    - OUTPUT: a structure with
      - Operating system, CPU details
      - Physical/virtual memory
      - Accepting new Activities
      - Number of running Activities
      - ...
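The shape of the BES-Factory operations can be shown with a small in-memory model. The Python class below is a toy written for this document: it involves no SOAP, EPRs are reduced to plain strings of an invented form, and the rule that only Pending or Running activities can be terminated is an assumption of the sketch.

```python
import itertools


class ToyBesFactory:
    """In-memory illustration of the BES-Factory operations; no SOAP involved."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._activities = {}  # EPR string -> {"jsdl": str, "status": str}

    def create_activity(self, jsdl):
        """Record the JSDL document and return an identifier for the activity."""
        epr = f"urn:toy-activity:{next(self._counter)}"  # hypothetical EPR form
        self._activities[epr] = {"jsdl": jsdl, "status": "Pending"}
        return epr

    def get_activity_statuses(self, eprs):
        """Return one status per requested activity, in the same order."""
        return [self._activities[e]["status"] for e in eprs]

    def terminate_activities(self, eprs):
        """Return one true/false termination result per requested activity."""
        results = []
        for e in eprs:
            activity = self._activities.get(e)
            can_stop = activity is not None and activity["status"] in ("Pending", "Running")
            if can_stop:
                activity["status"] = "Cancelled"
            results.append(can_stop)
        return results

    def get_activity_documents(self, eprs):
        """Return the JSDL document stored for each requested activity."""
        return [self._activities[e]["jsdl"] for e in eprs]
```

All query and control operations take a list of EPRs and answer with a list of the same length, which is what lets a client monitor or terminate many activities in one call.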
28 BES WMProxy Integration
- WMProxy
  - BES-Factory implementation
- Operation mapping (BES -> WMProxy)
  - CreateActivity -> jobSubmit
  - GetActivityStatuses -> was not present (implemented); previously delegated to another service
  - TerminateActivities -> jobCancel
  - GetActivityDocuments -> getJDL
  - GetFactoryAttributesDocument -> was not present (implemented)
- Security
  - WS-Security Username Token Profile
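The operation mapping is essentially an adapter: BES-style calls are translated into calls on the existing WMProxy operations. The Python sketch below shows that pattern under stated assumptions. Both classes are invented for the example, the backend is a fake that only stores descriptions, the JSDL conversion is passed in as a callable, and the sketch returns whatever the backend stored for the document query, since the slides do not say how the real implementation produces its answer.

```python
import itertools


class BesOverWmproxy:
    """Adapter sketch: BES-style calls forwarded to a WMProxy-like backend."""

    def __init__(self, backend, jsdl_to_jdl):
        self._backend = backend          # offers job_submit, job_cancel, get_jdl
        self._jsdl_to_jdl = jsdl_to_jdl  # callable converting JSDL text to JDL text

    def create_activity(self, jsdl):
        """CreateActivity maps to jobSubmit after converting the description."""
        return self._backend.job_submit(self._jsdl_to_jdl(jsdl))

    def terminate_activities(self, job_ids):
        """TerminateActivities maps to one jobCancel per activity."""
        results = []
        for job_id in job_ids:
            try:
                self._backend.job_cancel(job_id)
                results.append(True)
            except KeyError:
                results.append(False)
        return results

    def get_activity_documents(self, job_ids):
        """GetActivityDocuments maps to getJDL."""
        return [self._backend.get_jdl(j) for j in job_ids]


class FakeBackend:
    """Hypothetical stand-in used only to exercise the adapter."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.jobs = {}

    def job_submit(self, jdl):
        job_id = f"job-{next(self._ids)}"
        self.jobs[job_id] = jdl
        return job_id

    def job_cancel(self, job_id):
        del self.jobs[job_id]  # raises KeyError for unknown ids

    def get_jdl(self, job_id):
        return self.jobs[job_id]
```

The two operations marked as "was not present" have no counterpart to forward to, which is why they are missing from the adapter and had to be implemented from scratch.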
29 WMProxy Architecture with BES
- WMProxy integration with the WMS
[Figure: architecture diagram. Labels: Client; SOAP/HTTPS, WSS, WS-A; Apache with MOD FastCGI and MOD SSL; WMProxy Server; BES Factory; Gridsite Delegation; Job Sub/Contr.; Logging and Bookkeeping; LB Proxy; LB Data Base; Request Queue; Local File System; Workload Manager; Server Host]
30 SC06 interoperability
31 SC06 interoperability
- SuperComputing 06: the interoperability demo
  - Several projects, each one
    - Exposing a BES service interface
    - Implementing a BES client
- Demo scope
  - Try to allow all other clients to perform BES operations against the implemented services
  - Try to perform BES operations with the implemented client against all other services
32 SC06 interoperability
- Wiki pages: the results
  - http://forge.ggf.org/sf/wiki/do/viewPage/projects.ogsa-hpcp-wg/wiki/HomePage
33 SC06: The Demo
- Demo scenarios
  - Case 1
    - Create an Activity
    - Monitor the status of the Activity
    - Retrieve the Activity document
    - Terminate the Activity
  - Case 2
    - Create an Activity
    - Wait for the Activity to end successfully
    - Optionally display the document produced by the Activity (via browser)
  - Case 3
    - Create several (possibly different) Activities
    - Monitor the statuses of the Activities
    - Retrieve the JSDL documents of the Activities
    - Optionally terminate the Activities
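The "wait for the Activity to end" step of Case 2 amounts to polling the status until it becomes terminal. The Python sketch below shows one way a client could do that; the function name, the polling parameters and the choice of Cancelled, Failed and Finished as terminal states are assumptions of this example, and the status source is any callable the client provides.

```python
import time

# Statuses after which no further change is expected (assumed for this sketch).
TERMINAL = {"Cancelled", "Failed", "Finished"}


def wait_for_end(get_status, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll `get_status()` until a terminal state shows up or polls run out."""
    for attempt in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        if attempt + 1 < max_polls:
            sleep(poll_seconds)
    raise TimeoutError("activity did not reach a terminal state")
```

The caller still has to check that the returned status is Finished before displaying any output, since a Failed or Cancelled activity also ends the wait.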
34 Conclusions
- Issues
  - Security and delegation
    - Heterogeneous approaches to security (e.g. message vs transport level) in the different middlewares
    - No clear specification/support for delegation in BES
  - JSDL support: richer extensions are needed
  - WS tooling: still immature support for some WS-* specs
- Future plans
  - Include the BES interface in the official WMProxy distribution
    - Expose both the legacy (richer) WMProxy interface and the BES one
  - Complete the implementation of WS-Addressing and WS-Security (e.g. Web Services Security X.509 Certificate Token Profile)
  - Extend JSDL support to all extensions that can be mapped to gLite JDL job types
  - Delegation
    - Current approach aligned with the GT4 one
    - Explore alternative approaches (e.g. Explicit Trust Delegation model)
35 Questions
- Too shy to ask in public? Contact egee_at_datamat.it