Introduction for Jobs Submission

1
Introduction for Jobs Submission
  • Giuseppe LA ROCCA
  • giuseppe.larocca_at_ct.infn.it
  • INFN Catania, ITALY

2
Outline
  • An introduction to the WMS and JDL
  • The gLite WMS architecture
  • The Command Line Interface (CLI)
  • Advanced jobs
  • References
  • Hands-on
3
Overview of gLite Middleware
4
Overview
  • The Workload Management System (WMS) is the gLite
    3 component that allows users to submit jobs, and
    performs all tasks required to execute them,
    without exposing the user to the complexity of
    the Grid.
  • It is the responsibility of the user to describe
    the jobs and their requirements, and to retrieve
    the output when the jobs are finished.
  • In the WLCG/EGEE Grid, two different workload
    management systems are deployed: the legacy LCG-2
    system and the new system from the EGEE project,
    which is an evolution of the former and therefore
    offers more functionality.
  • In the following sections, we will describe the
    basic concepts of the language used to describe a
    job and the basic command line interface to
    submit and manage simple jobs.

5
Workload Management System
  • Workload Management System (WMS) comprises a set
    of Grid middleware components responsible for
    distribution and management of tasks across Grid
    resources.
  • The Workload Manager (WM) aims to accept and
    satisfy requests for job management coming from
    its clients.
  • The WM passes the job to an appropriate Computing
    Element (CE) for execution, taking into account
    the requirements and the preferences expressed in
    the job description.
  • The decision of which resource should be used is
    the outcome of a matchmaking process.
  • The Logging and Bookkeeping service tracks jobs
    managed by the WMS. It collects events from many
    WMS components and records the status and history
    of the job.

6
Job Description Language
  • The Job Description Language (JDL) is a
    high-level language based on the Classified
    Advertisement (ClassAd) language, used to
    describe jobs and aggregates of jobs with
    arbitrary dependency relations.
  • The JDL is used in WLCG/EGEE to specify the
    desired job characteristics and constraints,
    which are taken into account by the WMS to select
    the best resource to execute the job.
  • A job description is a file (called a JDL file)
    consisting of lines having the format
    attribute = expression;
  • Expressions can span several lines, but only the
    last line must be terminated by a semicolon.

7
Job Description Language
  • The single quote character (') cannot be used in
    the JDL.
  • Comments must be preceded by a sharp character
    (#) or a double slash (//) at the beginning of
    each line.
  • Multi-line comments must be enclosed between /*
    and */.

Attention! The JDL is sensitive to blank
characters and tabs. No blank characters or tabs
should follow the semicolon at the end of a line.
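The formatting rules above can be sketched as a small Python helper that writes one "attribute = expression;" line per attribute and never leaves blanks after the semicolon. The function names jdl_value and render_jdl are hypothetical and not part of gLite; strings that contain double quotes are rejected rather than escaped, to keep the sketch short.

```python
def jdl_value(value):
    """Format a Python value as a JDL expression (strings, booleans, numbers, lists)."""
    if isinstance(value, bool):  # test bool before int: True is also an int in Python
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    text = str(value)
    if '"' in text:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return '"' + text + '"'


def render_jdl(attributes):
    """Render one 'attribute = expression;' line per entry, with no trailing blanks."""
    return "\n".join(f"{name} = {jdl_value(value)};" for name, value in attributes.items())


job = {
    "Executable": "test.sh",
    "InputSandbox": ["/home/doe/test.sh"],
    "StdOutput": "std.out",
    "StdError": "std.err",
}
print(render_jdl(job))
```

Generating the file from a dictionary is only a convenience; a hand-written JDL file that follows the same rules is equally valid.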
8
Simple JDL example
  • Executable = "/bin/hostname";
  • StdOutput = "std.out";
  • StdError = "std.err";
  • The Executable attribute specifies the command
    to be run by the job. If the command is already
    present on the WN, it must be expressed as an
    absolute path. If it has to be copied from the
    UI, only the file name must be specified, and the
    path of the command on the UI should be given in
    the InputSandbox attribute.
  • Executable = "test.sh";
  • InputSandbox = {"/home/doe/test.sh"};
  • StdOutput = "std.out";
  • StdError = "std.err";

9
  • The Arguments attribute can contain a string
    value, which is taken as the argument list for
    the executable
  • Arguments = "fileA 10";
  • In the Executable and in the Arguments attributes
    it may be necessary to use special characters,
    such as &, \, |, >, <. These characters should be
    preceded by a triple \ in the JDL, or specified
    inside quoted strings, e.g.
    Arguments = "-f file1\\\&file2";
  • The attributes StdOutput and StdError define the
    names of the files containing the standard output
    and standard error of the executable, once the
    job output is retrieved.

10
  • If files have to be copied from the UI to the
    execution node, they must be listed in the
    InputSandbox attribute
  • InputSandbox = {"test.sh", .. ,"fileN"};
  • The files to be transferred back to the UI after
    the job is finished can be specified using the
    OutputSandbox attribute
  • OutputSandbox = {"std.out","std.err"};

11
  • Wildcards are allowed only in the InputSandbox
    attribute.
  • Absolute paths cannot be specified in the
    OutputSandbox attribute.
  • The InputSandbox cannot contain two files with
    the same name, even if they have a different
    absolute path, as when transferred they would
    overwrite each other.
  • The shell environment of the job can be modified
    using the Environment attribute.
  • Environment = {"CMS_PATH=$HOME/cms",
    "CMS_DB=$CMS_PATH/cmdb"};

12
  • JobType
  • Normal (simple, sequential job), Interactive,
    MPICH, Checkpointable, Partitionable, Parametric
  • Or a combination of them
  • Checkpointable, Interactive
  • Checkpointable, MPI
  • Interactive MPI is not yet permitted
  • JobType = "Interactive";
  • JobType = {"Interactive","Checkpointable"};

13
  • The Requirements attribute can be used to express
    constraints on the resources where the job should
    run.
  • Its value is a Boolean expression that must
    evaluate to true for a job to run on that
    specific CE.
  • Note: Only one Requirements attribute can be
    specified (if there is more than one, only the
    last one is considered). If several conditions
    must be applied to the job, they must all be
    combined in a single Requirements attribute.
  • For example, suppose that the user wants to run
    on a CE using PBS as batch system, and whose WNs
    have at least two CPUs. The job description file
    would then contain
  • Requirements = other.GlueCEInfoLRMSType == "PBS"
    && other.GlueCEInfoTotalCPUs > 1;
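Because only one Requirements attribute is honoured, conditions collected in different places have to be merged before the JDL is written. A minimal Python sketch of that merge follows; combine_requirements is a hypothetical helper name, and the parentheses around each condition are added only to keep operator precedence unambiguous.

```python
def combine_requirements(conditions):
    """Join several Boolean conditions into a single Requirements attribute using &&."""
    if not conditions:
        raise ValueError("at least one condition is needed")
    joined = " && ".join(f"({condition})" for condition in conditions)
    return f"Requirements = {joined};"


conditions = [
    'other.GlueCEInfoLRMSType == "PBS"',
    "other.GlueCEInfoTotalCPUs > 1",
]
print(combine_requirements(conditions))
```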

14
  • The WMS can also be asked to send a job to a
    particular queue in a CE with the following
    expression
  • Requirements = other.GlueCEUniqueID ==
    "lxshare0286.cern.ch:2119/jobmanager-pbs-short";
  • It is also possible to use regular expressions
    when expressing a requirement.
  • Suppose, for example, that the user wants all
    jobs to run on any CE in the domain cern.ch.
    This can be achieved by putting the following
    expression in the JDL file
  • Requirements = RegExp("cern.ch",
    other.GlueCEUniqueID);
  • The opposite can be required by using
  • Requirements =
    (!RegExp("cern.ch", other.GlueCEUniqueID));

15
  • If the job must run on a CE where a particular
    experiment software is installed and this
    information is published by the CE, something
    like the following must be written
  • Requirements = Member("BLAST-1.0.3",
    other.GlueHostApplicationSoftwareRunTimeEnvironment);

Note: The Member operator is used to test if its
first argument (a scalar value) is a member of
its second argument (a list). In fact, the
GlueHostApplicationSoftwareRunTimeEnvironment
attribute is a list of strings and is used to
publish any VO-specific information relative to
the CE (typically, information on the VO software
available on that CE).
16
  • It is possible to have the WMS automatically
    resubmit jobs which, for some reason, are aborted
    by the Grid. Two kinds of resubmission are
    available for the gLite 3 WMS: the deep
    resubmission and the shallow resubmission (only
    the former is available in the LCG-2 WMS).
  • The resubmission is deep when the job fails after
    it has started running on the WN, and shallow
    otherwise.
  • The user can limit the number of times the WMS
    should resubmit a job by using the JDL attributes
    RetryCount and ShallowRetryCount for the deep and
    shallow resubmission respectively.
  • For example, to disable the deep resubmission and
    limit the attempts of shallow resubmission to 3
  • RetryCount = 0;
  • ShallowRetryCount = 3;
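The interplay of the two counters can be illustrated with a deliberately simplified Python model. It is not how the WMS is implemented; should_resubmit and its parameters are hypothetical names, and the model assumes that each kind of resubmission is counted separately against its own limit.

```python
def should_resubmit(failed_after_start, attempts_so_far, retry_count, shallow_retry_count):
    """Decide whether one more resubmission is allowed (simplified model).

    failed_after_start: True if the job failed after it started running on the WN
                        (deep resubmission), False otherwise (shallow resubmission).
    attempts_so_far:    resubmissions of that same kind already performed.
    """
    limit = retry_count if failed_after_start else shallow_retry_count
    return attempts_so_far < limit


# With RetryCount = 0 and ShallowRetryCount = 3, as in the example above:
print(should_resubmit(True, 0, retry_count=0, shallow_retry_count=3))   # deep: never retried
print(should_resubmit(False, 2, retry_count=0, shallow_retry_count=3))  # shallow: third attempt allowed
```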

17
  • The proxy renewal feature of the WMS is
    automatically enabled, as long as the user has
    stored a long-term proxy in the default MyProxy
    server (usually defined in the MYPROXY_SERVER
    environment variable). However, it is possible to
    indicate a different MyProxy server to the WMS in
    the JDL file
  • MyProxyServer = "myproxy.ct.infn.it";

18
  • The choice of the CE on which to execute the job,
    among all the ones satisfying the requirements,
    is based on the rank of the CE, a quantity
    expressed as a floating-point number. The CE with
    the highest rank is the one selected.
  • By default, the rank is equal to
    -other.GlueCEStateEstimatedResponseTime, where the
    estimated response time is an estimate of the
    time interval between the job submission and the
    beginning of the job execution.
  • The user can redefine the rank with the Rank
    attribute, for example
  • Rank = other.GlueCEStateFreeCPUs;
  • which will rank best the CE with the most free
    CPUs.
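The selection step can be sketched in Python as a filter followed by a sort. This is an illustration of the idea only, not the WMS matchmaker: the ComputingElement class, the CE names under example.org and all numeric values are made up, and the default rank is assumed to be the negated estimated response time so that shorter waits rank higher.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    ce_id: str
    lrms_type: str
    total_cpus: int
    free_cpus: int
    estimated_response_time: float  # seconds (hypothetical values below)


def default_rank(ce):
    # Assumed default: shorter estimated response time means higher rank.
    return -ce.estimated_response_time


def match(ces, requirements, rank=default_rank):
    """Return the CEs satisfying the requirements, sorted from best to worst rank."""
    eligible = [ce for ce in ces if requirements(ce)]
    return sorted(eligible, key=rank, reverse=True)


def wants_pbs_multicore(ce):
    return ce.lrms_type == "PBS" and ce.total_cpus > 1


ces = [
    ComputingElement("ce-a.example.org:2119/jobmanager-pbs-short", "PBS", 4, 1, 120.0),
    ComputingElement("ce-b.example.org:2119/jobmanager-pbs-long", "PBS", 8, 6, 600.0),
    ComputingElement("ce-c.example.org:2119/jobmanager-sge-default", "SGE", 16, 10, 5.0),
]

best_default = match(ces, wants_pbs_multicore)[0]
best_free = match(ces, wants_pbs_multicore, rank=lambda ce: ce.free_cpus)[0]
print(best_default.ce_id)
print(best_free.ce_id)
```

Changing only the rank function changes the winner among the same eligible CEs, which is the effect of setting the Rank attribute in the JDL.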

20
The WMProxy
  • The WMProxy is the service responsible for
    providing access to the WMS functionality through
    a Web Service interface
  • The gLite WMProxy Server can be accessed directly
    through the published WSDL, through the C++
    command line interface, or through the API
  • It has been designed to efficiently handle a
    large number of requests for job submission and
    control to the WMS
  • It provides additional features such as bulk
    submission and support for shared and compressed
    sandboxes for compound jobs.
  • It is the natural replacement for the Network
    Server (NS) in the move to the SOA approach.

21
gLite WMS Architecture
22
gLite WMS Architecture
Job management requests (submission,
cancellation) expressed via a Job
Description Language (JDL)
23
gLite WMS Architecture
Finds an appropriate CE for each submission
request, taking into account job requests and
preferences, Grid status, utilization policies
on resources
24
gLite WMS Architecture
Keeps submission requests. Requests are kept
for a while if no resources are immediately
available.
25
gLite WMS Architecture
Repository of resource information available to
the matchmaker. Updated via notifications and/or
active polling on resources.
26
gLite WMS Architecture
Performs the actual job submission and
monitoring
28
The Command Line Interface
  • The gLite WMS implements two different services
    to manage jobs: the Network Server and the
    WMProxy.
  • The recommended method to manage jobs is through
    the gLite WMS via WMProxy, because it gives the
    best performance and allows the use of the most
    advanced functionality
  • The WMProxy implements several features, among
    which
  • submission of job collections
  • faster authentication
  • faster match-making
  • faster response time for users
  • higher job throughput.

29
Delegating a proxy to WMProxy
  • Each job submitted to WMProxy must be associated
    with a proxy credential previously delegated by
    the owner of the job to the WMProxy server.
  • This proxy is then used any time WMProxy needs to
    interact with other services for job-related
    operations (e.g. submission to the CE, a GridFTP
    file transfer, etc.)
  • There are two possible mechanisms to ask for a
    delegation of the user credentials
  • asking for the automatic delegation of the
    credentials during the submission operation
  • asking for an explicit delegation

30
  • To explicitly delegate a user proxy to WMProxy,
    the command to use is
  • glite-wms-job-delegate-proxy -d <delegID>
  • where <delegID> is a string chosen by the user.
  • For example, to delegate a proxy
  • glite-wms-job-delegate-proxy -d mydelegID
  • Connecting to the service
  • https://rb102.cern.ch:7443/glite_wms_wmproxy_server
  • glite-wms-job-delegate-proxy Success
  • Your proxy has been successfully delegated to the
    WMProxy
  • https://rb102.cern.ch:7443/glite_wms_wmproxy_server
  • with the delegation identifier: mydelegID


31
Submitting a simple job
  • Starting from a simple JDL file, we can submit it
    via WMProxy by doing
  • glite-wms-job-submit -d mydelegID test.jdl
  • Connecting to the service
  • https://rb102.cern.ch:7443/glite_wms_wmproxy_server
  • glite-wms-job-submit Success
  • The job has been successfully submitted to the
    WMProxy
  • Your job identifier is:
  • https://rb102.cern.ch:9000/vZKKk3gdBla6RySximq_vQ
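Scripts that submit jobs usually need the job identifier for the later status, output and cancel commands. The Python sketch below pulls it out of the submit message; it assumes the wording shown on this slide and well-formed https:// URLs, and extract_job_id is a hypothetical helper, not a gLite API. The regular expression is an illustration and may need adjusting for other WMS versions.

```python
import re

# Hypothetical pattern: https://<host>:<port>/<unique string>
JOB_ID_PATTERN = re.compile(r"https://[\w.-]+:\d+/[\w-]+")


def extract_job_id(submit_output):
    """Return the URL that follows 'Your job identifier is', or None if absent."""
    _, found, tail = submit_output.partition("Your job identifier is")
    if not found:
        return None
    match = JOB_ID_PATTERN.search(tail)
    return match.group(0) if match else None


sample_output = """\
Connecting to the service https://rb102.cern.ch:7443/glite_wms_wmproxy_server
glite-wms-job-submit Success
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://rb102.cern.ch:9000/vZKKk3gdBla6RySximq_vQ
"""
print(extract_job_id(sample_output))
```

Searching only the text after the marker avoids picking up the WMProxy endpoint URL printed earlier in the same message.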

32
Troubleshooting /1
  • To submit jobs via WMProxy, it is required to
    have a valid VOMS proxy, otherwise the submission
    will fail with an error like
  • Error - Operation failed
  • Unable to delegate the credential to the
    endpoint:
  • https://rb102.cern.ch:7443/glite_wms_wmproxy_server
  • User not authorized:
  • unable to check credential permission
    (/opt/glite/etc/glite_wms_wmproxy.gacl)
  • (credential entry not found)
  • credential type: person
  • input dn: /C=CH/O=CERN/OU=GRID/CN=John Doe

33
Authorization
  • The client must be properly authorized when it
    interacts with the WMProxy service.
  • This means that either the FQAN or the DN (in
    case of globus-style proxies) of the client must
    be properly listed and authorized in the
    glite_wms_wmproxy.gacl file on the WMProxy
    machine.
  • cat glite_wms_wmproxy.gacl
  • <gacl version='0.0.1'>
  • <entry><voms><fqan>bio/Role=NULL</fqan></voms>
  • <allow><exec/></allow>
  • </entry>
  • ..
  • </gacl>
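To see what "listed and authorized" means for an FQAN, the Python sketch below reads a GACL document with the standard library XML parser and looks for an entry that names the FQAN and allows exec. It is a reading aid under strong assumptions, not the access check WMProxy performs: fqan_may_exec is a hypothetical name, and deny rules and DN-based entries are ignored.

```python
import xml.etree.ElementTree as ET


def fqan_may_exec(gacl_xml, fqan):
    """Return True if some entry lists the FQAN and contains <allow><exec/></allow>."""
    root = ET.fromstring(gacl_xml)
    for entry in root.findall("entry"):
        listed = {node.text.strip() for node in entry.findall("voms/fqan") if node.text}
        if fqan in listed and entry.find("allow/exec") is not None:
            return True
    return False


SAMPLE_GACL = """\
<gacl version='0.0.1'>
  <entry>
    <voms><fqan>bio/Role=NULL</fqan></voms>
    <allow><exec/></allow>
  </entry>
</gacl>
"""

print(fqan_may_exec(SAMPLE_GACL, "bio/Role=NULL"))
print(fqan_may_exec(SAMPLE_GACL, "cms/Role=NULL"))
```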

34
Troubleshooting /2
  • If the command returns the following error
  • Error - WMProxy Server Error
  • LCMAPS failed to map user credential
  • Method: getFreeQuota
  • Error code: 1208
  • it means that there are authentication problems
    between the UI and the WMProxy server (you may
    not be authorized to use that WMProxy server).

35
Listing the CE(s) that match a job
  • It is possible to see which CEs are eligible to
    run a job described by a given JDL using
  • glite-wms-job-list-match -d mydelegID --rank
    test.jdl
  • Connecting to the service
  • https://rb102.cern.ch:7443/glite_wms_wmproxy_server

  • COMPUTING ELEMENT IDs LIST
  • The following CE(s) matching your job
    requirements have been found:
  • CEId    Rank
  • - CE.pakgrid.org.pk:2119/jobmanager-lcgpbs-cms    0
  • - grid-ce0.desy.de:2119/jobmanager-lcgpbs-cms    -10
  • - gw-2.ccc.ucl.ac.uk:2119/jobmanager-sge-default    -56
  • - grid-ce2.desy.de:2119/jobmanager-lcgpbs-cms    -107


36
Retrieving the status of a job
  • glite-wms-job-status
    https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg

  • BOOKKEEPING INFORMATION
  • Status info for the Job :
    https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg
  • Current Status: Done (Success)
  • Exit code: 0
  • Status Reason: Job terminated successfully
  • Destination:
    ce1.inrne.bas.bg:2119/jobmanager-lcgpbs-cms
  • Submitted: Mon Dec 4 15:05:43 2006 CET
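When the status is checked from a script, the report can be turned into a dictionary of fields. The Python sketch below assumes each field sits on one line in "name: value" form, as in the report on this slide; parse_status_report is a hypothetical helper and the sample text is typed in by hand rather than captured from a live WMS.

```python
def parse_status_report(report):
    """Map each 'name: value' line of a status report to a dictionary entry."""
    fields = {}
    for line in report.splitlines():
        # Split on the first colon only, so URLs and times in the value stay intact.
        name, separator, value = line.partition(":")
        if separator and name.strip() and value.strip():
            fields[name.strip()] = value.strip()
    return fields


sample_report = """\
BOOKKEEPING INFORMATION
Status info for the Job : https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: ce1.inrne.bas.bg:2119/jobmanager-lcgpbs-cms
Submitted: Mon Dec 4 15:05:43 2006 CET
"""

fields = parse_status_report(sample_report)
print(fields["Current Status"])
print(fields["Exit code"])
```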

  • The verbosity level controls the amount of
    information provided. The value of the -v option
    ranges from 0 to 3.
  • The commands to get the job status can have
    several jobIDs as arguments, i.e.
    glite-wms-job-status <jobID1> ... or, more
    conveniently, the -i <file path> option can be
    used to read the jobIDs from a file.

37
Retrieving the output(s)
  • glite-wms-job-output
  • https://rb102.cern.ch:9000/yabp72aERhofLA6W2-LrJw
  • Connecting to the service
  • https://128.142.160.93:7443/glite_wms_wmproxy_server

  • JOB GET OUTPUT OUTCOME
  • Output sandbox files for the job:
  • https://rb102.cern.ch:9000/yabp72aERhofLA6W2-LrJw
  • have been successfully retrieved and stored in
    the directory:
  • /tmp/doe_yabp72aERhofLA6W2-LrJw

  • The default location for storing the outputs
    (normally /tmp) is defined in the UI
    configuration, but it is possible to specify in
    which directory to save the output using the
    --dir <path name> option.

38
Cancelling a job
  • glite-wms-job-cancel
    https://rb102.cern.ch:9000/P1c60RFsrIZ9mnBALa7yZA
  • Are you sure you want to remove specified job(s)
    [y/n]y : y
  • Connecting to the service
  • https://128.142.160.93:7443/glite_wms_wmproxy_server
  • glite-wms-job-cancel Success
  • The cancellation request has been successfully
    submitted for the following job(s):
  • - https://rb102.cern.ch:9000/P1c60RFsrIZ9mnBALa7yZA

  • If the cancellation is successful, the job will
    terminate in status CANCELLED

39
Real Time Output Retrieval /1
  • The user can enable job perusal by setting the
    attribute PerusalFileEnable to true in the job
    JDL.
  • This makes the WN upload, at regular time
    intervals (defined by the PerusalTimeInterval
    attribute and expressed in seconds), a copy of
    the output files specified using the
    glite-wms-job-perusal command to the WMS machine
    (by default), or to a GridFTP server specified by
    the attribute PerusalFilesDestURI

Executable = "job.sh";
StdOutput = "stdout.log";
StdError = "stderr.log";
InputSandbox = {"job.sh"};
OutputSandbox = {"stdout.log","stderr.log","testfile.txt"};
PerusalFileEnable = true;
PerusalTimeInterval = 30;
RetryCount = 0;
40
Real Time Output Retrieval /2
  • After the job has been submitted with
    glite-wms-job-submit, the user can choose which
    output files should be inspected
  • glite-wms-job-perusal --set -f testfile.txt \
  • https://wms104.cern.ch:9000/B02xR3EQg9ZHHoRc-1nJkQ
  • Connecting to the service
  • https://128.142.160.93:7443/glite_wms_wmproxy_server
  • Connecting to the service
  • https://128.142.160.93:7443/glite_wms_wmproxy_server
  • glite-wms-job-perusal Success
  • Files perusal has been successfully enabled for
    the job:
  • https://wms104.cern.ch:9000/B02xR3EQg9ZHHoRc-1nJkQ


41
Real Time Output Retrieval /3
  • .. and, when the job starts, the user can see one
    output file
  • glite-wms-job-perusal --get -f testfile.txt \
  • https://wms104.cern.ch:9000/B02xR3EQg9ZHHoRc-1nJkQ
  • Connecting to the service
  • https://137.138.45.79:7443/glite_wms_wmproxy_server
  • Connecting to the service
  • https://137.138.45.79:7443/glite_wms_wmproxy_server
  • glite-wms-job-perusal Success
  • The retrieved files have been successfully stored
    in:
  • /tmp/doe_OoDVmWCAnhx_HiSPvASGsg


43
DAG job
  • A DAG (directed acyclic graph) job is a set of
    jobs where the input, output, or execution of one
    or more jobs depends on one or more other jobs
  • The jobs are nodes (vertices) in the graph
  • The edges (arcs) identify the dependencies
  • Their management has been improved with
  • Shared sandboxes
  • Attribute inheritance
  • Attribute references between nodes and with the
    parent

44
[
  Type = "dag";
  InputSandbox = {
    "/tmp/foo/*.exe", "/home/larocca/bar",
    "gsiftp://neo.datamat.it:5678/tmp/cms_sim.exe",
    "file:///tmp/myconf"
  };
  nodes = [
    nodeA = [
      description = [
        JobType = "Normal";
        Executable = "a.exe";
        InputSandbox = {"/home/larocca/myfile.txt", root.InputSandbox};
      ];
    ];
    nodeF = [
      description = [
        JobType = "Normal";
        Executable = "b.exe";
        Arguments = "1 2 3";
        OutputSandbox = {"myoutput.txt", "myerror.txt"};
      ];
    ];
    nodeD = [
      description = [
        JobType = "Checkpointable";
        Executable = "b.exe";
        Arguments = "1 2 3";
        InputSandbox = {
          "file:///home/larocca/data.txt",
          root.nodes.nodeF.description.OutputSandbox[0]
        };
      ];
    ];
    nodeC = [ file = "/home/larocca/nodec.jdl"; ];
    nodeB = [ file = "foo.jdl"; ];
  ];
  dependencies = {
    {nodeA, nodeB}, {nodeA, nodeC}, {nodeA, nodeF},
    { {nodeB, nodeC, nodeF}, nodeD }
  };
]
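The dependencies attribute of this example defines a partial order on the nodes. Reading it as the pairs {nodeA, nodeB}, {nodeA, nodeC}, {nodeA, nodeF} and {{nodeB, nodeC, nodeF}, nodeD}, one valid execution order can be computed with a topological sort. The Python sketch below uses the standard library graphlib module (Python 3.9 or later); it only illustrates the ordering and says nothing about how the WMS schedules DAG nodes internally.

```python
from graphlib import TopologicalSorter

# Each entry is (parents, child): the child may start only after all parents finish.
dependencies = [
    (["nodeA"], "nodeB"),
    (["nodeA"], "nodeC"),
    (["nodeA"], "nodeF"),
    (["nodeB", "nodeC", "nodeF"], "nodeD"),
]


def execution_order(deps):
    """Return one ordering of the nodes that respects every dependency."""
    sorter = TopologicalSorter()
    for parents, child in deps:
        sorter.add(child, *parents)
    return list(sorter.static_order())


order = execution_order(dependencies)
print(order)
```

nodeA always comes first and nodeD last; nodeB, nodeC and nodeF are independent of each other, so they may appear in any order and could run in parallel.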

45
Job Collection
  • A job collection is a set of independent jobs
    that the user can submit and monitor as if it
    were a single job
  • Jobs of a collection are submitted as DAG nodes,
    without dependencies
  • The JDL is a list of ClassAds which describe the
    subjobs
  • Type = "collection";
  • nodes = {
  • [ <job descr 1> ],
  • [ <job descr 2> ],
  • ...
  • };

46
  • Type = "collection";
  • InputSandbox = {"input_common1.txt","input_common2.txt"};
  • nodes = {
  • [
  • JobType = "Normal";
  • NodeName = "node1";
  • Executable = "/bin/sh";
  • Arguments = "script_node1.sh";
  • InputSandbox = {"script_node1.sh",
    root.InputSandbox[0]};
  • StdOutput = "myoutput1";
  • StdError = "myerror1";
  • OutputSandbox = {"myoutput1","myerror1"};
  • ShallowRetryCount = 1;
  • ],
  • [
  • JobType = "Normal";
  • NodeName = "node2";
  • Executable = "/bin/sh";
  • ...

[Slide callouts: Collection; 1st sub-job; 2nd sub-job; 3rd sub-job]
47
Parametric jobs /1
  • A parametric job is a job where one or more of
    its attributes are parametric
  • The value of these attributes varies according to
    a parameter
  • Job monitoring and management is always done
    through a unique jobID, as if it were a single
    job

JobType = "Parametric";
Executable = "/bin/echo";
Arguments = "_PARAM_";
StdOutput = "myoutput_PARAM_.txt";
StdError = "myerror_PARAM_.txt";
Parameters = 3;
ParameterStep = 1;
ParameterStart = 1;
OutputSandbox = {"myoutput_PARAM_.txt"};
48
Parametric jobs /2
Executable = "/bin/cat";
Arguments = "inputMOON.txt";
InputSandbox = {"inputMOON.txt"};
StdOutput = "myoutputMOON.txt";
StdError = "myerrorMOON.txt";
OutputSandbox = {"myoutputMOON.txt"};

Executable = "/bin/cat";
Arguments = "inputMARS.txt";
InputSandbox = {"inputMARS.txt"};
StdOutput = "myoutputMARS.txt";
StdError = "myerrorMARS.txt";
OutputSandbox = {"myoutputMARS.txt"};

Executable = "/bin/cat";
Arguments = "inputEARTH.txt";
InputSandbox = {"inputEARTH.txt"};
StdOutput = "myoutputEARTH.txt";
StdError = "myerrorEARTH.txt";
OutputSandbox = {"myoutputEARTH.txt"};
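The three descriptions above look like the result of replacing a placeholder with each value of a parameter list. The Python sketch below reproduces that substitution for illustration only. The template, including the name "input_PARAM_.txt", is an assumption inferred from the generated file names (the slide does not show the parametric JDL itself), and expand_parametric is a hypothetical helper, not part of the WMS.

```python
def expand_parametric(template, parameters, placeholder="_PARAM_"):
    """Return one attribute dictionary per parameter value, with the placeholder replaced."""
    return [
        {name: value.replace(placeholder, str(parameter)) for name, value in template.items()}
        for parameter in parameters
    ]


# Assumed template, inferred from the generated jobs shown on the slide.
template = {
    "Executable": "/bin/cat",
    "Arguments": "input_PARAM_.txt",
    "StdOutput": "myoutput_PARAM_.txt",
    "StdError": "myerror_PARAM_.txt",
}

jobs = expand_parametric(template, ["MOON", "MARS", "EARTH"])
for job in jobs:
    print(job["Arguments"], job["StdOutput"], job["StdError"])
```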
50
References
  • WMProxy Users guide
  • https://edms.cern.ch/file/674643/1/EGEE-JRA1-TEC-674643-WMPROXY-guide-v0-3.pdf
  • JDL Attributes Specification
  • https://edms.cern.ch/file/555796/1/EGEE-JRA1-TEC-555796-JDL-Attributes-v0-8.pdf
  • https://edms.cern.ch/file/590869/1/EGEE-JRA1-TEC-590869-JDL-Attributes-v0-9.pdf
  • gLite 3.1 users guide
  • https://edms.cern.ch/file/722398/1.2/gLite-3-UserGuide.pdf
  • Complex jobs
  • https://grid.ct.infn.it/twiki/bin/view/GILDA/WmProxyUse
  • WMProxy API usage
  • https://grid.ct.infn.it/twiki/bin/view/GILDA/ApiJavaWMProxy
  • https://grid.ct.infn.it/twiki/bin/view/GILDA/WMProxyCPPAPI

51
Hands-on
https://grid.ct.infn.it/twiki/bin/view/GILDA/AuthenticationAuthorization
https://grid.ct.infn.it/twiki/bin/view/GILDA/SimpleJobSubmission
https://grid.ct.infn.it/twiki/bin/view/GILDA/WmProxyUse
Connect to the gLite User Interface
  • ssh taipeiXX@glite-tutor.ct.infn.it
  • OS password: GridTAIXX
  • PassPhrase: TAIPEI
  • where XX = 01,..,60