Title: The CrossBroker Resource Management for the Grid
1. The CrossBroker Resource Management for the Grid
- Enol Fernández
- Computer Architecture and Operating Systems Department, Universitat Autònoma de Barcelona
European Condor Week Barcelona 2008
2. Outline
- Introduction
- The CrossBroker GRMS
- Interactive Job Support
- Parallel Job Support
- Conclusions
3. Batch Execution on Grids
[Diagram: batch jobs sent over the Internet to two remote sites.]
4. Parallel Interactive Job Execution
- Use of resources from different sites
- Resource-sets search
- Co-allocation synchronization
- Fast start-up
- Execution in high-occupancy situations
[Diagram: an MPI job spanning two remote sites connected over the Internet.]
5. CrossBroker
- Automatic job management on Grid environments
  - Search and selection of available resources, job conditioning, job launching, job monitoring, job retry (in case of failures) and results retrieval.
  - Sequential and parallel applications.
- Support for interactive and batch execution modes
- Best-effort approach to deal with failures/problems
6. CrossBroker - Architecture
[Diagram: CrossBroker architecture. Labels: User Interface, Information Index, Replica Manager, CrossBroker (Scheduling Agent, Resource Searcher, Application Launcher), DAGMan, Condor-G, and two EGEE/Globus sites, each showing a Computing Element, an LRMS and a worker node (WN).]
7. CrossBroker - Architecture
- Scheduling Agent
  - Receives each job and keeps it in a persistent queue
  - Contacts the Resource Searcher and gets a list of available resources
  - Selects resources and passes them to the Application Launcher
- Resource Searcher
  - Given a job description, performs the matchmaking between job needs and available resources.
  - Uses the Condor ClassAd library, originally designed for matches of a single job with a single resource.
  - A set matching has been developed to support matches of a single job to a group of resources.
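The set-matching idea can be sketched in a few lines of Python. This is an illustration only: it does not use the Condor ClassAd library, and the site names, the `free_cpus` and `rank` keys, the limit on group size and the use of the mean site rank as group rank are all assumptions made for the example.

```python
from itertools import combinations

def match_sets(sites, cpus_needed, max_group_size=2):
    """Return (rank, group) pairs for groups of sites that can host the job.

    `sites` maps a site name to a dict with hypothetical keys 'free_cpus'
    and 'rank'. The group rank is the mean of the member ranks, which is
    an assumption made for this sketch.
    """
    groups = []
    for size in range(1, max_group_size + 1):
        for combo in combinations(sorted(sites), size):
            free = [sites[name]["free_cpus"] for name in combo]
            total = sum(free)
            if total < cpus_needed:
                continue
            # Keep only minimal groups: dropping any member must leave
            # too few CPUs, otherwise a smaller group already covers it.
            if any(total - f >= cpus_needed for f in free):
                continue
            rank = sum(sites[name]["rank"] for name in combo) / size
            groups.append((rank, combo))
    # Best rank first; ties are broken by site names for determinism.
    groups.sort(key=lambda g: (-g[0], g[1]))
    return groups

# Hypothetical sites, not taken from a real information index.
example_sites = {
    "ce-a": {"free_cpus": 10, "rank": 2000},
    "ce-b": {"free_cpus": 2, "rank": 1500},
    "ce-c": {"free_cpus": 3, "rank": 1500},
    "ce-d": {"free_cpus": 2, "rank": 500},
}
for rank, group in match_sets(example_sites, cpus_needed=5):
    print(rank, group)
```

A single-resource match is just the special case of a group of size one, so the same ranking applies to both.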
8. CrossBroker Job Execution
[Diagram: job execution at the Grid level and the resource level. Labels: Application Launcher, Condor-G, Synchronization, Interactive Shadow, Computing Element, LRMS, Job Starter, User's Application, Interactive Agent.]
9. CrossBroker Job Execution
- Application Launcher
  - Responsible for providing a reliable submission service for parallel applications on the Grid.
  - Handles synchronization and monitors the application
  - Uses services of Condor-G
- Job Starter
  - Initiates applications at the Worker Nodes
  - Responsible for file staging at the remote site (executable and input/output files)
  - Handles details of the LRMS and parallel communication libraries
- Interactive Agent
  - Creates interactive sessions between application and user
  - Split execution: Shadow and Agent
10. Job Description Language
- Text file using an extended version of JDL (Job Description Language)
VirtualOrganisation = "imain";
JobType = "Normal";
Executable = "tester-app";
Arguments = "-f 23 -d";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"tester-app", "data"};
OutputSandbox = {"output-data", "std.out"};
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
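A job description like the one above is a list of `Name = value;` attributes, so it can be generated from a script. The following is a minimal sketch, assuming that strings are quoted, lists are written in braces and expressions such as Rank are emitted verbatim; the `Expr` marker class and the `to_jdl` helper are hypothetical names and not part of any CrossBroker tool. Quotes inside strings are not escaped.

```python
class Expr(str):
    """Marks a string to be emitted unquoted (a Rank/Requirements expression)."""

def _value(v):
    # Expr must be tested before str, and bool before other numbers.
    if isinstance(v, Expr):
        return str(v)
    if isinstance(v, bool):
        return "TRUE" if v else "FALSE"
    if isinstance(v, str):
        return '"%s"' % v
    if isinstance(v, (list, tuple)):
        return "{" + ", ".join(_value(x) for x in v) + "}"
    return str(v)

def to_jdl(job):
    """Render a dict as one `Name = value;` line per attribute."""
    return "\n".join("%s = %s;" % (k, _value(v)) for k, v in job.items())

job = {
    "VirtualOrganisation": "imain",
    "JobType": "Normal",
    "Executable": "tester-app",
    "Arguments": "-f 23 -d",
    "InputSandbox": ["tester-app", "data"],
    "Rank": Expr("other.GlueHostBenchmarkSI00"),
    "Requirements": Expr('other.GlueCEStateStatus == "Production"'),
}
print(to_jdl(job))
```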
11. Interactive Job Support
- Job Description Language file
  - INTERACTIVE = true/false. Indicates that the job is interactive and the broker should treat it with higher priority
  - INTERACTIVEAGENT
  - INTERACTIVEAGENTARGUMENTS
    - These attributes specify the command (and its arguments) used to communicate with the user.
12. Interactive Job Support
Type = "Job";
VirtualOrganisation = "imain";
JobType = "Normal";
Interactive = TRUE;
InteractiveAgent = "glogin";
InteractiveAgentArguments = "-r -p 195.168.105.65:23433";
Executable = "test-app";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdError = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
13. Interactive Job Support
- Scheduling priority
  - Interactive jobs are sent to sites with available machines
  - If no machines are available, use time sharing
- Support for interactivity in all kinds of jobs
  - sequential and parallel jobs
- CrossBroker injects interactive agents that enable communication between user and job
  - Transparent to the user
  - Full integration with glogin and gVid
  - Condor Bypass supported
14. Time sharing
- The idea
  - Each job is encapsulated in an agent that takes control over the WN independently of its LRMS
- Lightweight Virtual Machines
  - Each Worker Node is divided into 2 execution slots
  - Each VM can execute jobs independently (e.g. batch and interactive)
  - NOT a full virtual machine (e.g. Xen, VMware)
  - NO need for special privileges in the WN
15. Time sharing
[Diagram: the CrossBroker submits a job to a Grid resource (Computing Element, LRMS, WN).]
16. Time sharing
[Diagram: the CrossBroker submits an agent to a Grid resource (Computing Element, LRMS, WN). The WN is divided into Slot 1 and Slot 2, each holding a job.]
17. Time sharing: interactivity
- Batch jobs create execution slots in WN
  - Extra overhead due to creation of slots
- Interactive jobs only use available resources at submission time
  - Free WN
  - Available slots created by Glidein
- Priority adjustment
  - Batch job priority is decreased
  - Interactive jobs get more CPU
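The placement rule for interactive jobs can be sketched as follows. This is a toy model, not CrossBroker code: the `WorkerNode` fields, the numeric priority (lower means less CPU share), the step of 10 and the behaviour when nothing is available are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class WorkerNode:
    name: str
    free: bool = False        # node runs no job at all
    idle_slot: bool = False   # spare slot left by a batch job's agent
    batch_priority: int = 0   # hypothetical scale: lower means less CPU

def place_interactive(nodes):
    """Place an interactive job using only what is available right now."""
    # Prefer a completely free worker node.
    for node in nodes:
        if node.free:
            node.free = False
            return node.name, "free-node"
    # Otherwise share a node through a slot that already exists,
    # and decrease the priority of the batch job running beside it.
    for node in nodes:
        if node.idle_slot:
            node.idle_slot = False
            node.batch_priority -= 10  # arbitrary step for the sketch
            return node.name, "shared-slot"
    # Assumption: with nothing available the job is not placed.
    return None, "none-available"

nodes = [WorkerNode("wn1", idle_slot=True), WorkerNode("wn2", free=True)]
print(place_interactive(nodes))
print(place_interactive(nodes))
```

Because the slot already exists when the interactive job arrives, the job itself does not pay the slot-creation overhead; that cost was paid by the batch job.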
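18. Time sharing: interactivity
[Diagram: an interactive job (Int. Job) sent by the CrossBroker and a batch job occupy Slot 1 and Slot 2 of a WN in a Grid resource (Computing Element, LRMS), with priority adjustment. Timing labels: 200 s and 50 s. Caption: Startup-time reduction: only one layer involved.]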
19. Parallel Job Support
- Support for parallel jobs
  - Open MPI
  - PACX-MPI
  - MPICH-P4
  - MPICH-G2
- Takes into account site capabilities
- Ability to define user Job Starters to initiate the parallel job
  - mpi-start is configured automatically and used by default.
20. Parallel Job Support
- Job Description Language file
  - JOBTYPE
    - Normal: sequential jobs, just one CPU
    - Parallel: more than one CPU
  - SUBJOBTYPE
    - openmpi
    - pacx-mpi
    - mpich
    - mpich-g2
    - plain
  - JOBSTARTER (if not defined, mpi-start)
  - JOBSTARTERARGUMENTS
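How these attributes combine can be shown with a small check that fills in the default job starter. This is a sketch under assumptions: the `normalize_job` helper is hypothetical, and the exact validation rules (for example defaulting JobType to Normal, or rejecting a Parallel job with fewer than two nodes) are guesses based on the attribute descriptions above, not documented broker behaviour.

```python
SUBJOB_TYPES = {"openmpi", "pacx-mpi", "mpich", "mpich-g2", "plain"}

def normalize_job(job):
    """Return a copy of `job` with defaults filled in, or raise ValueError."""
    job = dict(job)
    job_type = job.get("JobType", "Normal")  # assumed default
    if job_type == "Normal":
        # Sequential job: just one CPU.
        job.setdefault("NodeNumber", 1)
        if job["NodeNumber"] != 1:
            raise ValueError("Normal jobs use exactly one CPU")
    elif job_type == "Parallel":
        if job.get("NodeNumber", 0) < 2:
            raise ValueError("Parallel jobs need more than one CPU")
        if job.get("SubJobType") not in SUBJOB_TYPES:
            raise ValueError("unknown SubJobType: %r" % job.get("SubJobType"))
        # If no job starter is defined, mpi-start is used.
        job.setdefault("JobStarter", "mpi-start")
    else:
        raise ValueError("unknown JobType: %r" % job_type)
    job["JobType"] = job_type
    return job

print(normalize_job({"JobType": "Parallel", "SubJobType": "pacx-mpi", "NodeNumber": 5}))
```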
21. Parallel Job Support
Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "pacx-mpi";
NodeNumber = 5;
Executable = "test-app";
Arguments = "-v";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdError = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
22. MPI Across Sites
Groups with 1 CEs
  Rank = 2000
    aocegrid.uab.es:2119/jobmanager-pbs-workq     freeCPUs = 10
Groups with 2 CEs
  Rank = 1500
    zeus.cyf-kr.edu.pl:2119/jobmanager-pbs-workq  freeCPUs = 2
    bee001.ific.uv.es:2119/jobmanager-pbs-workq   freeCPUs = 3
  Rank = 1000
    bee001.ific.uv.es:2119/jobmanager-pbs-workq   freeCPUs = 3
    lngrid02.lip.pt:2129/jobmanager-pbs-workq     freeCPUs = 2
23. MPI Across Sites
[Diagram: the CrossBroker, a startup server and two MPI subtasks.]
1. Launch a PACX Startup Server
2. Submit MPI Subtasks
3. mpi-start will start each of the Subtasks
4. Subtasks notify the startup server and start running
5. CrossBroker monitors the application
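The rendezvous in step 4 can be modelled in a single process with threads. This is a simplified sketch only: the real PACX-MPI startup server is contacted over the network, and the `StartupServer` class, its `notify_ready` method and the timeout value are hypothetical names and choices made for illustration.

```python
import threading

class StartupServer:
    """Collects one notification per subtask and releases them together."""

    def __init__(self, expected):
        self.expected = expected
        self.registered = []
        self._cond = threading.Condition()

    def notify_ready(self, subtask_id, timeout=5.0):
        """Register a subtask, then block until all subtasks have registered.

        Returns True if everyone showed up before the timeout.
        """
        with self._cond:
            self.registered.append(subtask_id)
            self._cond.notify_all()
            return self._cond.wait_for(
                lambda: len(self.registered) >= self.expected, timeout)

def run_subtasks(n):
    """Start n subtasks as threads and return the ids that began running."""
    server = StartupServer(expected=n)
    started = []
    lock = threading.Lock()

    def subtask(i):
        if server.notify_ready(i):
            with lock:
                started.append(i)  # the subtask would start computing here

    threads = [threading.Thread(target=subtask, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(started)

print(run_subtasks(3))
```

No subtask proceeds until the last one has registered, which is the property the broker relies on when subtasks are started by different sites at different times.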
24. MPI Across Sites
- CrossBroker searches and selects sets of resources for the jobs
- There is no guarantee that all tasks of the same job will start at the same time
  - 1st choice: select only sites with free resources. The job will run immediately. Unfortunately, free resources are not always available
  - 2nd choice: allocate a resource temporarily and wait until all other tasks show up. Time-share the resource with a backfilling policy to avoid resource idleness
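The two choices can be written as a small decision function. This is a sketch under assumptions: the `choose_strategy` helper, its input shapes and the fallback to the first candidate set are hypothetical and only illustrate the policy described above.

```python
def choose_strategy(candidate_sets, free_cpus):
    """Pick a co-allocation strategy for an MPI job.

    candidate_sets: list of dicts mapping site -> CPUs requested there,
                    in order of preference.
    free_cpus:      dict mapping site -> CPUs free right now.
    Returns (strategy, sites_ready, sites_pending).
    """
    if not candidate_sets:
        raise ValueError("no candidate resource sets")
    # 1st choice: a set in which every site has enough free CPUs now.
    for group in candidate_sets:
        if all(free_cpus.get(site, 0) >= n for site, n in group.items()):
            return "immediate", sorted(group), []
    # 2nd choice (assumed fallback: the most preferred set). Tasks on ready
    # sites hold their slot and backfill until the pending sites show up.
    group = candidate_sets[0]
    ready = sorted(s for s, n in group.items() if free_cpus.get(s, 0) >= n)
    pending = sorted(s for s in group if s not in ready)
    return "timeshare", ready, pending

free = {"site-a": 3, "site-b": 0, "site-c": 2}
print(choose_strategy([{"site-a": 3, "site-b": 2}, {"site-a": 3, "site-c": 2}], free))
print(choose_strategy([{"site-a": 3, "site-b": 2}], free))
```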
25. Time sharing: MPI Across Sites
[Diagram: a WN with Slot 1 and Slot 2 in a Grid resource (Computing Element, LRMS), fed by the CrossBroker. An MPI task and another job share the two slots. Labels: MPI task waiting; Back filling while the MPI waits; All tasks Ready!; Priority Lowered.]
26. Simulation of Fusion Devices
- Simulation of 100,000 particles
- Master/Worker (M/W) MPI application
  - workers simulate trajectories
  - master renders OpenGL video
- glogin and gVid transmit and encode video
- User can interact with the simulation through a GUI
Type = "Job";
JobType = "Parallel";
SubJobType = "openmpi";
NodeNumber = 5;
Interactive = TRUE;
InteractiveAgent = "glogin";
InteractiveAgentArguments = "-r -p 195.168.105.65:23433";
Executable = "fusion-app";
InputSandbox = {"fusion-app", "inputfile"};
27. Conclusions
- CrossBroker supports both Parallel and Interactive jobs
  - Automatically
  - Interoperable with EGEE
- Time sharing
  - Fast startup of jobs
  - Co-allocation without reservation or wasting resources
- Used in production environments
  - Used in EU CrossGrid and int.eu.grid projects (12K 55K jobs per month)
- Applications using the CrossBroker features
  - Visualization of plasma in fusion devices
  - Evolution of pollution clouds in the atmosphere
  - Ultrasound Computing Tomography Reconstruction of a 3D volume
28. Questions?