Sylvain Reynaud, Pascal Calvat - PowerPoint PPT Presentation

1
Grid interoperability using
  • Sylvain Reynaud, Pascal Calvat
  • CC-IN2P3

2
Plan
  • demo of JJS
  • overview of JSAGA
  • demo of JUX
  • summary and perspectives

JSAGA is an API for uniform access to grids. JJS and JUX are tools using JSAGA.
3
JJS Overview
  • JJS was developed by Pascal Calvat (CC-IN2P3) in
    2003 to submit jobs to the DATAGRID
    infrastructure
  • it has since evolved to submit jobs to the EGEE
    infrastructure
  • JJS is designed to ease job submission from web
    servers hosted in laboratories
  • it is an alternative to the User Interface and
    Resource Broker (or to gLite-UI and gLite-WMS)
  • JJS is optimized for submitting short-life jobs
      • based on the observed QoS of sites, JJS gives a
        score to the selected sites and uses it for
        subsequent match-making
      • but it can also be used with long-life jobs
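
The QoS-based scoring mentioned above could look roughly like the following. This is a minimal sketch, not the actual JJS algorithm: the class name `SiteScorer`, the exponential moving average, the 60-second scale and the neutral default score are all assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical QoS-based site scoring (illustrative only, not the JJS implementation). */
final class SiteScorer {
    /** Score given to sites that were never observed, so that they still get tried. */
    static final double NEUTRAL = 0.5;

    private final double alpha; // weight of the newest observation, in (0, 1]
    private final Map<String, Double> scores = new HashMap<>();

    SiteScorer(double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Records one observed job: failures count as 0, fast successes approach 1. */
    void record(String site, boolean succeeded, double turnaroundSeconds) {
        double observation = succeeded ? 1.0 / (1.0 + turnaroundSeconds / 60.0) : 0.0;
        // Exponential moving average: recent observations weigh more than old ones.
        scores.merge(site, observation, (old, obs) -> (1 - alpha) * old + alpha * obs);
    }

    double score(String site) {
        return scores.getOrDefault(site, NEUTRAL);
    }

    /** Picks the candidate site with the highest current score. */
    Optional<String> best(List<String> candidates) {
        return candidates.stream().max(Comparator.comparingDouble(this::score));
    }
}
```

Each finished or failed job would feed `record(...)`, and the next match-making step would call `best(...)` on the sites that satisfy the job requirements.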

11/11/2009
4
JJS Demo
Overall performance for short-life jobs (install POV-Ray on the fly, then generate part of the image)
5
JJS Overview
  • JJS was initially developed on top of the
    cog-jglobus API
  • cog-jglobus is being replaced with JSAGA for
      • security (done)
      • data management (done)
      • execution management (in the near future)
      • job collection management (in the near future)
  • Using JSAGA lets JJS become independent of changes
    in the gLite middleware
      • from Globus proxy to VOMS proxy (done)
      • from GSIFTP to SRM (work in progress)
      • from LCG-CE to gLite-CREAM (in the near future)

6
JSAGA targeted use cases
  • Motivations for using several grid
    infrastructures
      • increasing the number of computing resources
        available to the user
      • need for resources with specific constraints
          • super-computer
          • confidentiality
          • small overhead (e.g. consolidation)
          • interactivity
      • availability, on a given grid, of
          • the data
          • the software

7
  • Ready-to-use software, adapted to targeted
    scientific field
  • Hide heterogeneity between grid infrastructures
  • Hide heterogeneity between middlewares
  • As many interfaces as ways to implement each
    functionality
  • As many interfaces as used technologies

8
SAGA code example
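    import java.util.List;
    import org.ogf.saga.namespace.NSDirectory;
    import org.ogf.saga.namespace.NSFactory;
    import org.ogf.saga.session.Session;
    import org.ogf.saga.session.SessionFactory;
    import org.ogf.saga.url.URL;
    import org.ogf.saga.url.URLFactory;

    // use factories to create SAGA objects
    Session session = SessionFactory.createSession();
    URL url = URLFactory.createURL("gsiftp://cclcgseli01.in2p3.fr/tmp/");
    NSDirectory dir = NSFactory.createNSDirectory(session, url);
    // use SAGA objects (list the entries of the remote directory)
    List<URL> result = dir.list();
    for (URL r : result) {
        System.out.println(r);
    }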

9
  • Ready-to-use software, adapted to targeted
    scientific field
  • Hide heterogeneity between grid infrastructures
  • Hide heterogeneity between middlewares
  • As many interfaces as ways to implement each
    functionality
  • As many interfaces as used technologies

end user
application developer
plug-ins developer
10
Plug-ins interfaces
  • SAGA user interface
      • close to application developer needs
          • object-oriented
          • high-level
          • uniform interface to all the supported
            technologies
      • design objectives
          • easy to use
          • but "certainly not simple to implement" (T.
            Kielmann)
          • engine code 2 x plug-ins code
  • Plug-in interfaces
      • close to existing middleware APIs
          • service-oriented
          • low-level
          • as many interfaces as ways to implement each
            functionality
          • optional interfaces
      • design objectives
          • easy to implement
          • enable efficient usage of middleware APIs

11
Plug-ins execution management
  • Streaming plug-in interfaces: direct / buffered / redirected streams, used before / during / after execution
      • set stream for interactive
      • set stream for non-interactive
      • get stream for interactive
      • exposed through the SAGA user interface as getInput / getOutput / getError
  • Monitoring plug-in interfaces: querying / listening, for an individual job / a list of jobs / filtered jobs
      • query status for individual job
      • listen status for individual job
      • query status for filtered jobs
      • exposed through the SAGA user interface as getState / waitFor
  • Job control plug-ins: gatekeeper, gLite-WMS, wsgram, unicore6, ssh, fork, cream, PBS, remote, naregi
  • Job monitoring plug-ins: gatekeeper, gLite-LB, wsgram, unicore6, ssh, fork, cream
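
The idea of several optional low-level plug-in interfaces behind one uniform user-level call can be sketched as follows. The interface and class names here (`QueryIndividualJob`, `QueryFilteredJobs`, `UniformJobMonitor`) are hypothetical and chosen for illustration; they are not the JSAGA plug-in interfaces.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical low-level plug-in interface: one request per job. */
interface QueryIndividualJob {
    String getStatus(String jobId);
}

/** Hypothetical low-level plug-in interface: one request for many jobs. */
interface QueryFilteredJobs {
    Map<String, String> getStatuses(List<String> jobIds);
}

/** Engine-side monitor offering one uniform API over whichever interface a plug-in implements. */
final class UniformJobMonitor {
    private final Object plugin;

    UniformJobMonitor(Object plugin) {
        if (!(plugin instanceof QueryIndividualJob) && !(plugin instanceof QueryFilteredJobs)) {
            throw new IllegalArgumentException("plug-in implements no known monitoring interface");
        }
        this.plugin = plugin;
    }

    /** Uniform single-job call, whatever the plug-in supports. */
    String getState(String jobId) {
        if (plugin instanceof QueryIndividualJob) {
            return ((QueryIndividualJob) plugin).getStatus(jobId);
        }
        return ((QueryFilteredJobs) plugin).getStatuses(List.of(jobId)).get(jobId);
    }

    /** Uniform multi-job call: uses a single bulk request when the plug-in allows it. */
    Map<String, String> getStates(List<String> jobIds) {
        if (plugin instanceof QueryFilteredJobs) {
            return ((QueryFilteredJobs) plugin).getStatuses(jobIds);
        }
        Map<String, String> result = new HashMap<>();
        for (String id : jobIds) {
            result.put(id, ((QueryIndividualJob) plugin).getStatus(id));
        }
        return result;
    }
}
```

The plug-in developer implements only the interface that matches the middleware, and the engine carries the adaptation logic. This is one way to read the remark that the engine code is larger than the plug-in code.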

12
Plug-ins provided
  • Security: InMemCred, Globus, G. Legacy, G. RFC820, MyProxy, VOMS, X509, SSH, Login / pwd, JKS
  • Data: catalog, rns, lfn, srb / irods, http, https, sftp, rbyteio, file, zip, gsiftp, tar, ftp, mail, cache, srm
  • Exec. (control): gatekeeper, gLite-WMS, wsgram, unicore6, ssh, fork, cream, PBS, remote, naregi
  • Exec. (monitor): gatekeeper, gLite-LB, wsgram, unicore6, ssh, fork, cream
  • Expression: basic, default, JEP, BeanShell
  • Language: JSDLext., SAGA, JDL, RSL-2, RSL-4
13
This is still not enough
[Diagram labels: job desc., JSAGA, gLite plug-ins, Globus plug-ins]
14
This is still not enough
[Diagram labels: job desc., JSAGA, staging graph, JDL (gLite plug-ins), RSL (Globus plug-ins)]
16
Description of infrastructures
Example: execution management
  • Middleware heterogeneity
      • e.g. CREAM, WMS, SSH, GK
  • Infrastructures heterogeneity
      • Grid/site policy
          • e.g. network filtering, shared FS
      • Environment variables
          • e.g. VO_?_SW_DIR, /usr/local
      • Configuration attributes (client)
          • e.g. monitor service URL, shell path on cygwin,
            default SE URL
      • Command line interfaces (worker)
          • e.g. globus-url-copy, srmcp, scp, wget, tar

[Diagram labels: World, Grid, CC-IN2P3, EGEE, OpenPlast, localhost; execution services WMS, gatekeeper, wsgram; data protocols srb://, srm://, lfn://, gsiftp://, http://, tar://]
17
Transfer path depends on
  • When using a single grid infrastructure
      • all files can be transported to/from the worker
        nodes through a single storage node
  • When using several grid infrastructures
      • need to dynamically build a more complex transfer
        graph, according to the criteria on the next slide

18
Transfer path depends on
  • grid or site
      • network filtering policy
      • commands available on workers
      • services available from workers (close Storage
        Element, shared FS)
      • supported context instances
  • data to stage
      • shared by several jobs
      • installed on some worker nodes
      • file size
      • required data protection level
  • execution service
      • protocols supported for staging
  • transfer protocol
      • access mode (RO, WO, RW)
      • third-party transfer
      • supported data protection level

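
To make the notion of a dynamically built transfer path more concrete, here is a minimal sketch that searches a route between two nodes while only using protocols permitted by the constraints above. It is an illustration under stated assumptions: `StagingGraph`, its methods and the breadth-first search are hypothetical and say nothing about how JSAGA actually builds its staging graph.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical staging graph: nodes are storage or worker locations, links carry a protocol. */
final class StagingGraph {
    // from node -> (to node -> protocol usable on that link)
    private final Map<String, Map<String, String>> links = new LinkedHashMap<>();

    void addLink(String from, String to, String protocol) {
        links.computeIfAbsent(from, k -> new LinkedHashMap<>()).put(to, protocol);
    }

    /** Returns the route with the fewest hops using only allowed protocols, or an empty list. */
    List<String> route(String source, String target, Set<String> allowedProtocols) {
        Map<String, String> previous = new HashMap<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(source);
        visited.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(target)) {
                break;
            }
            for (Map.Entry<String, String> link : links.getOrDefault(node, Map.of()).entrySet()) {
                // Skip links whose protocol is filtered out (network policy, missing command, ...).
                if (allowedProtocols.contains(link.getValue()) && visited.add(link.getKey())) {
                    previous.put(link.getKey(), node);
                    queue.add(link.getKey());
                }
            }
        }
        if (!visited.contains(target)) {
            return List.of();
        }
        LinkedList<String> path = new LinkedList<>();
        for (String n = target; n != null; n = previous.get(n)) {
            path.addFirst(n);
        }
        return path;
    }
}
```

With a single grid, the route typically has one intermediate storage node. With several grids, the allowed protocol set differs per site, which is why the path has to be computed per job.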
21
Example of generated graph
[Graph nodes: C, C', C'', common, result, R1, std-error, E1]
Data flow: example with several protocols used, but only 3 jobs submitted on 1 grid
23
Command line interfaces
  • JSAGA provides command line interfaces for
      • security
          • jsaga-context-init
          • jsaga-context-info
          • jsaga-context-destroy
      • execution management
          • jsaga-job-run
          • jsaga-job-status
          • jsaga-job-cancel
      • data management
          • jsaga-cat
          • jsaga-cp
          • jsaga-ls
          • jsaga-mkdir
          • jsaga-mv
          • jsaga-rm
          • jsaga-rmdir
          • jsaga-stat
          • jsaga-test
          • jsaga-logical

24
Related projects
  • JSAGA is used by
      • Elis@
          • a web portal for submitting jobs to industrial
            and research grid infrastructures
      • JJS (Java Job Submission)
          • a tool for submitting jobs to EGEE
          • optimized for short-life jobs (resource selection
            based on QoS observed while submitting jobs)
      • JUX (Java Universal eXplorer)
          • a multi-protocol file browser

25
JUX Overview
  • JUX is a file explorer designed to be independent
    of
      • Operating System
          • tested on Windows, Scientific Linux, Ubuntu, Mac
      • Data management protocol
          • tested with gsiftp, srb, irods, http, https,
            sftp, zip, (srm)
      • Security mechanism
          • tested with GSI, VOMS, Login/Password, X509, SSH
      • File content viewer
          • provided viewers: text file viewer, image viewer
            (png, gif, jpg, bmp, tiff, dicom), audio player
            (mp3, wav)
          • can use local applications (only for protocol
            "file://" on OS "Windows")

26
JUX Overview
  • Data management and security
      • JUX does not only use the SAGA API
      • it also uses the JSAGA introspection API to
        discover
          • the list of available protocols
          • the list of configured security contexts
          • the list of supported security context types,
            for each protocol
      • this allows JUX to be completely independent of
        the technologies used
      • just copy your own JSAGA plug-in into the JUX
        "lib/" directory to add support for a new
        technology!

27
  • Demo of JUX
  • and then conclusion about JSAGA

28
Software quality
  • Build process fully automated, including
      • build tools installation
      • code generation
      • testing
          • unit tests
          • integration tests
      • project web site generation
          • http://grid.in2p3.fr/jsaga/
      • installer GUI generation (see next slide)
  • Plug-ins
      • external dependencies reduced
          • e.g. gLite-UI not needed
      • most plug-ins supports
      • a maven 'archetype' generates the skeleton of a new
        plug-in project
      • plug-ins are automatically validated with a
        reusable SAGA test suite

SAGA protocols test-suite configuration:
    gsiftp.base=gsiftp://ccrugceli01.in2p3.fr/tmp/
    gsiftp.base2=gsiftp://agena.c-s.fr/grid/tmp/
    gsiftp.context=OpenPlast_proxy
    https.base=http://grid.in2p3.fr/html/Private/
    https.context=Web_X509
    file.base=file:///c/tmp/
    file.base2=file:///c/
29
Installer GUI
30
License(s)
  • LGPL license
      • for the core engine and most plug-ins
  • Optional licenses
      • for plug-ins having external dependencies whose
        license is not compatible with LGPL
      • then, the end user must
          • either accept the terms of the license agreement
          • or uncheck these plug-ins (see previous slide)

31
Summary: Main assets of JSAGA
  • Implement standard specifications from
      • SAGA
      • JSDL
  • Provide a high-level abstraction layer with no
    sacrifice on efficiency or scalability
      • thanks to design (definition of plug-ins
        interface)
      • thanks to cache mechanisms
  • Use grid infrastructures as they are (i.e. no
    pre-requisite)
      • thanks to
  • Hide heterogeneity
      • of middlewares
      • of grid infrastructures

32
Perspectives
  • Support new technologies
      • develop plug-ins
          • gLite-CREAM
          • French research grid middleware?
      • integrate plug-ins developed by partners
  • Implement new specifications
      • SAGA Extension: Service Discovery API
          • discussions on the candidate spec. have just
            finished, the final spec. should be available soon
          • JSAGA
              • has no equivalent for this
              • plug-in based implementation
      • JSDL Extension: Parameter Sweep Job
          • proposed for public comments
          • JSAGA does this in a non-standard way

33
  • Backup slides

34
Plan
  • overview
  • summary and perspectives
  • overview
  • summary and perspectives
  • overview
  • summary and perspectives

JUX
35
JJS Performance
  • For short-life jobs, grid overhead is not
    negligible → need to optimize each step of job
    submission
      • job submission: multi-threaded
      • data staging: input/output files are grouped
        in tarballs
      • monitoring: get all job statuses with a single
        request
      • job life-time: waiting and running jobs have a
        timeout limit

And last but not least: select the execution
sites that are the most efficient for
short-life jobs (based on observed QoS)
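
Multi-threaded submission with an overall time limit can be sketched with the standard `java.util.concurrent` API. The `Submission` interface and the `ParallelSubmitter` class are hypothetical stand-ins for a real grid submission call; this is an illustration, not the JJS code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class ParallelSubmitter {
    /** Hypothetical submission call: returns a job identifier or throws on failure. */
    interface Submission {
        String submit(String jobName) throws Exception;
    }

    /**
     * Submits all jobs concurrently. The result list follows the order of the input list
     * and holds null for every job that failed or did not finish before the timeout.
     */
    static List<String> submitAll(List<String> jobs, Submission backend,
                                  int threads, long timeoutMillis) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String job : jobs) {
                tasks.add(() -> backend.submit(job));
            }
            List<String> ids = new ArrayList<>();
            // invokeAll cancels the tasks that are still pending when the timeout expires.
            for (Future<String> f : pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS)) {
                try {
                    ids.add(f.get());
                } catch (CancellationException | ExecutionException e) {
                    ids.add(null);
                }
            }
            return ids;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A caller would pass a lambda wrapping the real submission and resubmit, or give up on, the entries that come back as null.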
36
JJS Performance (submission)
Time elapsed before entering state WAITING (i.e.
time for transferring the input sandboxes and
submitting the jobs)
37
JJS Performance (monitoring)
Use a naming convention on the GSIFTP server instead
of Globus monitoring (detecting job failure is not
needed because all jobs time out shortly)
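
Monitoring through a naming convention can be sketched as follows: one directory listing is fetched, and each job's state is derived from the file names it contains. The suffix `.out.tar.gz` and the `ListingMonitor` class are assumptions for illustration; the slides do not state the convention JJS actually uses.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class ListingMonitor {
    enum State { RUNNING, DONE, TIMED_OUT }

    /** Hypothetical convention: a finished job uploads "<jobId>.out.tar.gz". */
    static final String RESULT_SUFFIX = ".out.tar.gz";

    /**
     * Derives the state of every submitted job from a single directory listing.
     * Failures are not detected explicitly: a job with no result file simply times out.
     */
    static Map<String, State> states(Map<String, Long> submittedAtMillis,
                                     Collection<String> listing,
                                     long nowMillis, long timeoutMillis) {
        Set<String> present = new HashSet<>(listing);
        Map<String, State> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> job : submittedAtMillis.entrySet()) {
            if (present.contains(job.getKey() + RESULT_SUFFIX)) {
                result.put(job.getKey(), State.DONE);
            } else if (nowMillis - job.getValue() > timeoutMillis) {
                result.put(job.getKey(), State.TIMED_OUT);
            } else {
                result.put(job.getKey(), State.RUNNING);
            }
        }
        return result;
    }
}
```

The listing itself would come from one request to the storage server, which is what keeps the monitoring cost independent of the number of jobs.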
38
JJS Summary
  • Optimized for short-life jobs
      • QoS-based selection of execution sites
      • pragmatic usage of deployed grid technologies
  • Easy to install, configure and use
  • Robust
      • designed not to be sensitive to grid middleware
        failures
      • because it was developed when the grid was not
        mature (DATAGRID)

http://cc.in2p3.fr/docenligne/269
39
JJS - Perspectives
  • Finish integration of JSAGA
      • for job submission (SAGA)
      • for job collection management (JSDL Parameter
        Sweep Job Extension)
      • job description independent of language
      • data staging independent of protocols and
        infrastructure constraints
  • JJS is also waiting
      • for the SRM data management JSAGA plug-in
      • for Service Discovery API (SAGA Extension)
        support in JSAGA
      • in order to enable efficient usage of SRM with
        short-life jobs (by discovering GSIFTP servers
        through the SRM web service)

41
JUX Screenshots
The connection manager lets the user create
connection profiles with a URL and a security
context. Only the security contexts compatible
with the selected protocol appear in the popup list.
42
JUX Screenshots
The connection is kept open until the nodes are
collapsed (left side). Several files can be copied
with a single drag-and-drop.
43
JUX Related work
  • Similar tools exist
      • HERMES (Australia)
      • VBrowser (Holland)
  • Using JSAGA for JUX makes it possible
      • to share development effort with JJS (for
        data staging)
      • to manage logical files through a common
        interface (SAGA)
      • to apply protocol-specific optimizations
          • e.g. third-party transfer, filtered file list
      • to automatically recover from some errors
          • e.g. create the parent directory if missing,
            retry if the error is IncorrectState

based on Apache Commons VFS
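
The "create parent directory if missing, then retry" recovery can be illustrated on the local file system with `java.nio.file`. This is only an analogy under stated assumptions: `RecoveringCopy` is a hypothetical helper, it does not use the SAGA API, and it does not cover the IncorrectState case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RecoveringCopy {
    /**
     * Copies source to target. If the copy fails because the target's parent
     * directory is missing, the parent is created and the copy is retried.
     */
    static void copy(Path source, Path target, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                return;
            } catch (NoSuchFileException e) {
                Path parent = target.toAbsolutePath().getParent();
                // Only recover when the missing piece is the target's parent directory.
                if (!Files.exists(source) || parent == null || Files.exists(parent)) {
                    throw e;
                }
                Files.createDirectories(parent);
                last = e;
            }
        }
        throw last;
    }
}
```

The same pattern (catch a specific error, repair the precondition, retry a bounded number of times) would apply to remote protocols, with the protocol's own error types in place of `NoSuchFileException`.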
44
JUX Summary
  • JUX can work with potentially any
      • protocol
      • security mechanism
      • file content
  • JUX is easy to use
      • targeted users are scientists
  • JUX is lightweight
      • currently 11 MB with all plug-ins

you can develop the plug-ins missing for your
use-case
http://cc.in2p3.fr/docenligne/821
45
JUX Perspectives (meta-data)
46
JUX Perspectives (meta-data)
[Mock-up of a metadata search form: criteria on entry name (.txt), Study Date, Patient's Name (John S), Patient's Sex (M), Patient's Age and size, combined with "and"; a Search button and a Recursive option]