Title: Special Jobs
1Special Jobs
- Paola Celio, Claudio Cherubino
- Universita Roma TRE
- INFN Sezione Roma TRE
- INFN Catania
2Outline
- MPI jobs on gLite
- DAG
- Job Collection
- Parametric jobs
3MPI Overview
- Execution of parallel jobs is essential for modern computing and applications.
- The most widely used library for parallel job support is MPI (Message Passing Interface).
- At present, parallel jobs can run only inside a single Computing Element (CE).
- Several projects are studying the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs.
4Requirements settings
- To guarantee that an MPI job can run, the following requirement MUST be satisfied:
  - the MPICH software must be installed and placed in the PATH environment variable on every WN of the CE.
- Some MPI applications require a shared filesystem among the WNs to run.
  - The variable VO_<name_of_VO>_SW_DIR contains the name of a directory if there is a SHARED filesystem.
  - The variable VO_<name_of_VO>_SW_DIR contains "." if there is NO SHARED filesystem.
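The shared-filesystem convention above can be checked from inside a job. A minimal sketch in Python, assuming the VO name is upper-cased in the variable name (for example VO_GILDA_SW_DIR for the gilda VO); the helper name shared_sw_dir is hypothetical and not part of any gLite tool.

```python
import os

def shared_sw_dir(vo_name, environ=None):
    """Return the shared software directory for a VO, or None if there is none.

    Assumption: the variable is named VO_<NAME>_SW_DIR with the VO name
    upper-cased; a value of "." means there is no shared filesystem.
    """
    env = os.environ if environ is None else environ
    value = env.get("VO_%s_SW_DIR" % vo_name.upper())
    if value is None or value == ".":
        return None
    return value
```

Passing the environment as an argument keeps the check easy to try outside a Worker Node, e.g. shared_sw_dir("gilda", {"VO_GILDA_SW_DIR": "."}).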
5
- From the user's point of view, jobs to be run as MPI are specified by setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well.
- E.g.
    JobType = "MPICH";
    NodeNumber = 4;
6
- When the previous two attributes are included in a JDL, the User Interface (UI) automatically adds a matching expression to the JDL Requirements expression in order to find the best resource where the job can be executed.
7MPI exercise
- Create the file mpi-glite.jdl inside $HOME/EXAMPLES/gLite/Other and put the following contents in it:
    Type = "Job";
    JobType = "MPICH";
    Executable = "cpi";
    NodeNumber = 2;
    StdOutput = "cpi.out";
    StdError = "cpi.err";
    InputSandbox = {"cpi"};
    OutputSandbox = {"cpi.err","cpi.out"};
    RetryCount = 0;
8MPI submission
[sevilla25@glite-tutor specjob_exercise]$ edg-job-submit -o id mpi-glite.jdl
Selected Virtual Organisation name (from proxy certificate extension): gilda
Connecting to host glite-rb.ct.infn.it, port 7772
Logging to host glite-rb.ct.infn.it, port 9002
glite-job-submit Success
The job has been successfully submitted to the Network Server.
Use glite-job-status command to check job current status. Your job identifier is:
 - https://glite-rb.ct.infn.it:9000/bsrbbzbcXZWSzU3iUYlm6g
The job identifier has been saved in the following file:
/home/sevilla25/Paola/specjob_exercise/id
9MPI status and output
- Query the status of the job using the following command:
    glite-tutor /home/claudio > edg-job-status -i id
- When the status of the job is DONE, you can retrieve the output with the following command:
    glite-tutor /home/claudio > edg-job-get-output -i id
10MPI on the web
- LCG-2 User Guide Manuals Series
  https://edms.cern.ch/file/454439/LCG-2-UserGuide.pdf
- http://oscinfo.osc.edu/training/
- http://www.netlib.org/mpi/index.html
- http://www-unix.mcs.anl.gov/mpi/learning.html
- http://www.ncsa.uiuc.edu/
11Workload Manager Proxy
12WMProxy overview
- WMProxy (Workload Manager Proxy)
  - is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface
  - has been designed to efficiently handle a large number of requests for job submission and control to the WMS
  - has a service interface that addresses the Web Services and SOA architecture standards, in particular adhering to WS-I
  - is developed in C++ using gSOAP 2.7.6b as SOAP stub generator
13New request types
- Support for new types strongly relies on newly developed JDL converters and on the DAG submission support
  - all JDL conversions are performed on the server
  - a single submission for several jobs
- All new request types can be monitored and controlled through a single handle (the request id)
  - each sub-job can, however, be followed up and controlled independently through its own id
- Smarter WMS client commands/API
  - allow submission of DAGs, collections and parametric jobs exploiting the concept of shared sandbox
  - allow automatic generation and submission of collections and DAGs from sets of JDL files located in user-specified directories on the UI
14WMProxy submission monitoring
- In order to submit jobs with WMProxy, it is mandatory to delegate credentials
- The submission/monitoring commands are slightly different, but most of the old options are supported
    glite-wms-job-delegate-proxy -d del_ID
    glite-wms-job-submit -d del_ID collection.jdl
    glite-wms-job-status jobID
    glite-wms-job-output \
        https://glite-rb.ct.infn.it:9000/LHIIGaCVdl7Olmsz0jpI_g
15DAG job
- A DAG job is a set of jobs where the input, output, or execution of one or more jobs can depend on other jobs
- Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs and the edges identify the dependencies
16JDL structure
17Attribute Nodes
18Attribute Dependencies
19DAG jdl
[
  type = "dag";
  max_nodes_running = 4;
  nodes = [
    nodeA = [ file = "nodes/nodeA.jdl"; ];
    nodeB = [ file = "nodes/nodeB.jdl"; ];
    nodeC = [ file = "nodes/nodeC.jdl"; ];
    nodeD = [ file = "nodes/nodeD.jdl"; ];
    dependencies = {
      {nodeA, nodeB},
      {nodeA, nodeC},
      {{nodeB, nodeC}, nodeD}
    };
  ];
]
Node descriptions could also be written inline here, instead of using separate files
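The dependencies of this DAG (nodeA before nodeB and nodeC, and both of those before nodeD) determine which nodes may run at the same time. The following Python sketch groups the nodes into successive "waves" using the standard-library graphlib module (Python 3.9 or later). It only illustrates the ordering implied by the graph and is not how the WMS schedules nodes; the function name execution_waves is hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Dependencies of the example DAG, written as (parents, child) pairs.
dependencies = [
    (["nodeA"], "nodeB"),
    (["nodeA"], "nodeC"),
    (["nodeB", "nodeC"], "nodeD"),
]

def execution_waves(deps):
    """Group nodes into waves; a node appears once all its parents are done."""
    sorter = TopologicalSorter()
    for parents, child in deps:
        sorter.add(child, *parents)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

print(execution_waves(dependencies))
# [['nodeA'], ['nodeB', 'nodeC'], ['nodeD']]
```

nodeB and nodeC land in the same wave because neither depends on the other, which is where a limit such as max_nodes_running would come into play.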
21Job Collection
- A job collection is a set of independent jobs that the user wants to submit and monitor via a single request
- Jobs of a collection are submitted as DAG nodes without dependencies
- The JDL is a list of classads, each describing a subjob
    [
      Type = "collection";
      VirtualOrganisation = "gilda";
      nodes = {
        [ <job descr 1> ],
        [ <job descr 2> ],
        ...
      };
    ]
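As an illustration of this structure, the following Python sketch assembles the text of a collection JDL from a list of JDL file paths. It assumes each node is given as a file reference of the form [ file = "..."; ]; the function name collection_jdl is hypothetical and this is not the WMS client implementation.

```python
def collection_jdl(vo, job_files):
    """Build the text of a collection JDL whose nodes point to JDL files."""
    nodes = ",\n".join('    [ file = "%s"; ]' % path for path in job_files)
    return (
        "[\n"
        '  Type = "collection";\n'
        '  VirtualOrganisation = "%s";\n'
        "  nodes = {\n%s\n  };\n"
        "]\n"
    ) % (vo, nodes)

print(collection_jdl("gilda", ["jobs/job1.jdl", "jobs/job3.jdl"]))
```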
22Scattered Input Sandboxes
- Input Sandbox can contain
  - file paths on the UI machine (i.e. the usual way)
  - URIs pointing to files on a remote gridFTP/HTTPS server
    InputSandbox = {
      "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
      "https://ghemon.cnaf.infn.it:8443/data/idat_1",
      "file:///home/pacio/myconf"
    };
- A base URI to be applied to all sandbox files can also be specified
    InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var";
- Only local files (file://) are uploaded to the WMS node
- Files pointed to by URIs are downloaded directly to the WN by the JobWrapper just before the job is started
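The split between files uploaded from the UI and files fetched on the WN can be sketched in Python as follows. This is only an illustration of the rule stated above, assuming that an entry without a URI scheme is a plain local path; the function name split_sandbox is hypothetical.

```python
from urllib.parse import urlparse

def split_sandbox(entries):
    """Separate entries uploaded from the UI from those fetched on the WN."""
    uploaded, fetched_on_wn = [], []
    for entry in entries:
        scheme = urlparse(entry).scheme
        # Plain paths (no scheme) and file:// URIs are local to the UI.
        if scheme in ("", "file"):
            uploaded.append(entry)
        else:
            fetched_on_wn.append(entry)
    return uploaded, fetched_on_wn

sandbox = [
    "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
    "https://ghemon.cnaf.infn.it:8443/data/idat_1",
    "file:///home/pacio/myconf",
]
print(split_sandbox(sandbox))
```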
23Job collection example
[
  type = "collection";
  InputSandbox = {"date.sh"};
  RetryCount = 0;
  nodes = {
    [ file = "jobs/job1.jdl"; ],
    [
      Executable = "/bin/sh";
      Arguments = "date.sh";
      StdOutput = "date.out";
      StdError = "date.err";
      OutputSandbox = {"date.out", "date.err"};
    ],
    [ file = "jobs/job3.jdl"; ]
  };
]
All nodes will share the Input Sandbox declared at the top level
25Parametric Job
- A parametric job is a job where one or more of its attributes are parameterized
- Values of attributes vary according to a parameter
- Job monitoring / management is always done through a unique jobID, as if the job were a single job (see submission of collection)
    [
      JobType = "Parametric";
      Executable = "/bin/sh";
      Arguments = "md5.sh input_PARAM_.txt";
      InputSandbox = {"md5.sh", "input_PARAM_.txt"};
      StdOutput = "out_PARAM_.txt";
      StdError = "err_PARAM_.txt";
      Parameters = 4;
      ParameterStart = 1;
      ParameterStep = 1;
      OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};
    ]
26Parametric Job / 2
- Parameters can also be a list of strings
- InputSandbox (if present) has to be coherent with the parameters
    ui-test /home/giorgio/param > cat param2.jdl
    [
      JobType = "Parametric";
      Executable = "/bin/cat";
      Arguments = "input_PARAM_.txt";
      InputSandbox = "input_PARAM_.txt";
      StdOutput = "myoutput_PARAM_.txt";
      StdError = "myerror_PARAM_.txt";
      Parameters = {earth,moon,mars};
      OutputSandbox = {"myoutput_PARAM_.txt"};
    ]
    ui-test /home/giorgio/param > ls
    inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl
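The substitution of the _PARAM_ placeholder can be sketched in Python as follows. This is an illustration only. It assumes that an integer Parameters value acts as an exclusive upper bound, so that the parameter takes the values ParameterStart, ParameterStart + ParameterStep, and so on below Parameters; this bound should be checked against the JDL attributes specification. The function name expand_parametric is hypothetical.

```python
def expand_parametric(template, parameters, start=0, step=1):
    """Return the template with _PARAM_ replaced by each parameter value.

    Assumption: an integer `parameters` is an exclusive upper bound, so the
    values are start, start + step, ... below `parameters`.
    A list of strings is used as given.
    """
    if isinstance(parameters, int):
        values = range(start, parameters, step)
    else:
        values = parameters
    return [template.replace("_PARAM_", str(v)) for v in values]

# Numeric case, under the stated assumption about the upper bound.
print(expand_parametric("input_PARAM_.txt", 4, start=1, step=1))
# List-of-strings case.
print(expand_parametric("myoutput_PARAM_.txt", ["earth", "moon", "mars"]))
```

Because the substitution is purely textual, the file names listed in InputSandbox must match the expanded names exactly, including letter case.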
28References
- JDL attributes specification for WM proxy
  https://edms.cern.ch/document/590869/1
- WMProxy quickstart
  http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/wmproxy_client_quickstart.shtml
- WMS user guides
  https://edms.cern.ch/document/572489/1
29Questions