NorduGrid ARC Middleware - PowerPoint PPT Presentation


1
NorduGrid ARC Middleware
Introduction to ARC
Ivan Degtyarenko, ivan.degtyarenko dog csc dot fi
CSC, the Finnish IT Center for Science
June 3rd, 2008
2
Contents
  • Part 1: Introduction to Grids and NorduGrid
  • What is the Grid?
  • Part 2: NorduGrid middleware ARC
  • ARC services
  • getting started on NorduGrid
  • certificates and logging in
  • writing job description files
  • submitting jobs in NorduGrid
  • fetching results
  • monitoring jobs / resources with the Grid Monitor
  • using storage elements
  • running real-life applications

3
NorduGrid ARC Middleware
http://www.nordugrid.org
  • NorduGrid middleware (or Advanced Resource
    Connector, ARC)
  • open source, out-of-the-box Grid software that
    enables production-quality computational and
    data Grids (released in May 2002)
  • development is coordinated by the NorduGrid
    collaboration
  • emphasis is put on scalability, stability,
    reliability and performance
  • builds upon standard open source solutions:
    OpenLDAP, OpenSSL, SASL and Globus Toolkit
  • adds services not provided by Globus such as
    scheduling
  • extends or completely replaces some Globus
    components

4
NorduGrid ARC Middleware (cont.)
http://www.nordugrid.org
  • provides a reliable implementation of the
    fundamental Grid services: information
    services, resource discovery and monitoring, job
    submission and management, brokering, data
    management and resource management
  • integrates computing resources and storage
    elements via a secure Grid layer
  • provides a light-weight standalone client, the
    User Interface (UI), which lets users submit,
    manage and monitor jobs on the Grid, move data
    around and query resource information
  • the UI's built-in broker selects the
    best-matching resource for a job
  • Grid job requirements are expressed via the
    extended Resource Specification Language (xRSL)

5
KnowARC
  • 3-year EU-funded project
  • goals:
    • to create a novel, powerful Next Generation Grid
      middleware based on NorduGrid's ARC, widely
      respected for its simplicity, non-invasiveness
      and cost-efficiency
    • to promote Grid standardization and
      interoperability
    • to contribute to the take-up of Grid technologies,
      bridging the gaps between business and academia
      in Grid development
  • http://www.knowarc.eu/

6
NDGF
  • collaboration between Denmark, Finland, Norway
    and Sweden
  • purpose: to help Nordic researchers create and
    participate in computational challenges of a
    scope and size unreachable for national
    research groups alone
  • NDGF currently serves:
    • the Nordic High Energy Physics community: the
      ALICE, ATLAS and CMS Virtual Organizations
    • the BioGRID and CO2 sequestration projects
  • coordinates the Nordic Tier-1 for CERN
  • http://www.ndgf.org/

7
Steps to start using NorduGrid
  • get an account for a system with the NorduGrid
    client installed (or install it on your own PC)
  • request a certificate from a Certificate
    Authority (CA)
  • install the certificate
  • log in to the Grid
  • write a job description in the xRSL language
  • submit the job
  • monitor the progress of the job
  • fetch the results

8
Installing the NorduGrid Client
  • required to submit jobs to NorduGrid
  • download from http://ftp.nordugrid.org/download/
  • binaries for various Linux distributions, source
    code also available
  • the easiest way to install the client is to use
    the standalone version
  • uncompress in a directory (no root privileges
    required):
    tar zxvf nordugrid-standalone-<latest>.i386.tgz
  • run the environment setup script:
    cd nordugrid-standalone-<latest>
    . ./setup.sh
  • RPM packages are recommended for multi-user
    installations

9
Requesting and Installing the Certificate
  • create a certificate request:
    grid-cert-request -int
  • this generates the .globus subdirectory with a key
    (userkey.pem) and the request
    (usercert_request.pem)
  • identity string, e.g.
    /O=Grid/O=NorduGrid/OU=bccs.uib.no/CN=Per Hansen
  • remember to select a good passphrase and keep the
    key secret!
  • send the file ~/.globus/usercert_request.pem to a
    Certification Authority (CA)
  • check the instructions of your local site or
    country to find out which CA to contact
  • wait for an answer from the CA
  • the signed certificate returned by the CA should
    be saved as the file ~/.globus/usercert.pem

10
Security Policies
  • policies vary between grids and virtual
    organizations (VOs)
  • you will need to accept these terms to use the
    resources

11
Logging in to the Grid
  • "Log in": grid-proxy-init
    • the command does not actually log in anywhere;
      it decrypts the private key and uses it to
      create a time-limited proxy
    • the proxy is used for authenticating to the
      resources
  • "Log out": grid-proxy-destroy
    • destroys the proxy
  • "whoami": grid-proxy-info
    • shows information about the validity of the proxy
    • subject  : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis/CN=413289378
    • issuer   : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis
    • identity : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis
    • type     : Proxy draft (pre-RFC) compliant
      impersonation proxy
    • strength : 512 bits
    • path     : /tmp/x509up_u7060
    • timeleft : 11:59:39
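
Because the proxy is time-limited, a script may want to check how long it has left before submitting a long job. The following is a minimal Python sketch, assuming the grid-proxy-info output uses the "name : value" layout and an hours:minutes:seconds timeleft field as on the slide. The function name timeleft_seconds is hypothetical and not part of any ARC or Globus tool.

```python
def timeleft_seconds(proxy_info: str) -> int:
    """Return the remaining proxy lifetime in seconds.

    Assumes one "name : value" pair per line and a timeleft value
    written as hours:minutes:seconds (an assumption based on the slide).
    """
    for line in proxy_info.splitlines():
        # Split at the first colon only, so the value keeps its own colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "timeleft":
            hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
            return hours * 3600 + minutes * 60 + seconds
    raise ValueError("no timeleft field found")


sample = """subject  : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis/CN=413289378
path     : /tmp/x509up_u7060
timeleft : 11:59:39"""

print(timeleft_seconds(sample))
```

In practice the text would come from running grid-proxy-info and capturing its standard output; the sample string here only mirrors the fields shown on the slide.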

12
Writing a job description file
  • Resource Specification Language (RSL) files are
    used to specify job requirements and parameters
    for submission
  • NorduGrid uses an extended language (xRSL) based
    on the Globus RSL
  • xRSL files are similar to scripts for local
    queueing systems, but include some additional
    attributes:
  • job name
  • executable location and parameters
  • location of input and output files of the job
  • architecture, memory, disk and CPU time
    requirements
  • runtime environment requirements

13
xRSL example
  • hellogrid.sh:
    #!/bin/sh
    echo Hello Grid!
  • hellogrid.xrsl:
    &(executable=hellogrid.sh)
    (jobname=hellogrid)
    (stdout=hello.out)
    (stderr=hello.err)
    (gmlog=gridlog)
    (cputime=10)
    (memory=32)
    (disk=1)
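
When many similar jobs are needed, the job description can be generated rather than typed. The following is a minimal Python sketch that renders a flat set of attributes as one xRSL string. The helper name to_xrsl is hypothetical, and the sketch assumes simple values that need no quoting (no spaces or special characters).

```python
def to_xrsl(attributes: dict) -> str:
    """Render attribute/value pairs as a single xRSL conjunction.

    Values are written verbatim, so this sketch only suits simple
    values such as file names and numbers.
    """
    relations = "".join(f"({name}={value})" for name, value in attributes.items())
    return "&" + relations


job = {
    "executable": "hellogrid.sh",
    "jobname": "hellogrid",
    "stdout": "hello.out",
    "stderr": "hello.err",
    "gmlog": "gridlog",
    "cputime": 10,
    "memory": 32,
    "disk": 1,
}

print(to_xrsl(job))
```

The printed string could be written to hellogrid.xrsl and submitted in the usual way.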

14
Submitting the job
  • submit the job:
    ngsub -d 1 -f hellogrid.xrsl
  • a job id is returned:
    > Job submitted with jobid:
      gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307

15
Grid Monitor on the NorduGrid Website
  • shows currently connected resources
  • almost all elements "clickable"
  • browse queues and job states by cluster
  • list jobs belonging to a certain user
  • no authentication: anyone can browse the
    information, which raises privacy issues

16
Monitoring the Job
  • query the status using the command line:
    ngstat hellogrid
    > Job gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
        Jobname: hellogrid
        Status: INLRMS:Q
  • the most common status values are ACCEPTED,
    PREPARING, INLRMS:Q, INLRMS:R, FINISHING and
    FINISHED
  • or use the Grid Monitor
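
A script that waits for a job needs to read the status out of the ngstat text. The following is a minimal Python sketch, assuming the output has a "Job <id>" line followed by "Name: value" lines, as reconstructed from the slide. The helper names parse_ngstat and is_finished are hypothetical, and only the FINISHED state named above is treated as final.

```python
def parse_ngstat(text: str) -> dict:
    """Collect the fields of one ngstat job report into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Job "):
            # The job id is a URL with its own colons, so keep it whole.
            fields["Job"] = line[len("Job "):]
        elif ":" in line:
            # Split at the first colon so a status like INLRMS:Q stays intact.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def is_finished(fields: dict) -> bool:
    """Return True once the job has reached the FINISHED state."""
    return fields.get("Status") == "FINISHED"


sample = """Job gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
  Jobname: hellogrid
  Status: INLRMS:Q"""

report = parse_ngstat(sample)
print(report["Status"], is_finished(report))
```

A polling loop would rerun ngstat at intervals and stop once is_finished returns True, at which point the results can be fetched.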

17
Fetching the results
  • print the job output:
    ngcat hellogrid
    • shows the standard output of the job
    • this can also be done while the job is running
  • download the result files:
    ngget hellogrid
    > ngget: downloading files to
      /home/ajt/455611239779372141331307
      ngget: download successful - deleting job from
      gatekeeper.

18
Using a storage element
  • Storage Elements are disk servers accessible via
    the Grid
  • can be used to store job output while the user is
    logged out and the client machine is disconnected
    from the Grid
  • make it possible to store input files close to
    the cluster where the program is executed, on a
    high-bandwidth network
  • some files can be local and some remote:
    (inputFiles=("input1" "/home/user/myexperiment")
                ("input2" "gsiftp://se.example.com/files/data"))
    (outputFiles=("output" "gsiftp://se.example.com/mydir/result1")
                 ("prog.out" "gsiftp://se.example.com/mydir/stdout"))
    (stdout="prog.out")
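
The file lists above follow a regular pattern: each entry pairs a file name with a location. The following is a minimal Python sketch that renders such pairs, assuming the whitespace-separated, double-quoted form of xRSL lists. The helper name file_relation is hypothetical, and se.example.com is the placeholder host from the slide.

```python
def file_relation(attribute: str, pairs: list) -> str:
    """Render (name, location) pairs as an inputFiles/outputFiles relation."""
    items = "".join(f'("{name}" "{location}")' for name, location in pairs)
    return f"({attribute}={items})"


inputs = [
    ("input1", "/home/user/myexperiment"),
    ("input2", "gsiftp://se.example.com/files/data"),
]
outputs = [
    ("output", "gsiftp://se.example.com/mydir/result1"),
    ("prog.out", "gsiftp://se.example.com/mydir/stdout"),
]

print(file_relation("inputFiles", inputs))
print(file_relation("outputFiles", outputs))
```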

19
Does one need to change existing applications?
  • three different approaches:
  • using the application as is: grid middleware will
    move the executable and the data to the target
    system
  • library dependencies often need to be resolved by
    linking statically or packaging the libraries
    with the application
  • installing the application on the target system
    and using it via the Grid interface
  • batch processing type applications normally work
    without changes, interactive applications are
    more difficult
  • with ARC middleware this is facilitated by
    runtime environments (RE)
  • modifying the application to fully exploit a
    distributed environment
  • using ARC libraries
  • distributing over a large geographical area is
    not practical unless the computation can be split
    to independent subtasks

20
Runtime environments
  • software packages which are preinstalled on a
    computing resource and made available through
    the Grid
  • just send the data to be processed
  • useful if there are many users of the same
    software or if the same program is used
    frequently
  • allow local platform-specific optimizations
  • required runtime environments can be specified in
    the job description file, for example
    (runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
  • Runtime Environment Registry:
    http://www.csc.fi/grid/rer/

21
Real life applications
  • it's common to send several smaller jobs to the
    Grid to solve a larger problem
  • parallel MPI jobs to a single cluster are
    supported (if the correct runtime environment is
    installed), but there is no MPI between clusters
  • splitting the job into suitable parts and
    gathering the parts together is left to the user
  • the environment is more error-prone than
    traditional local systems, so error checking and
    recovery are important
  • fault reporting and debugging have room for
    improvement
  • new ARClib API available in the development
    version
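
Since splitting the work is left to the user, a small generator script is a common starting point. The following is a minimal Python sketch that divides a range of work items into near-equal chunks and writes one xRSL description per chunk. The script name analyse.sh, the job naming scheme and both helper names are hypothetical. The sketch assumes the executable takes a start and a stop index as arguments.

```python
def split_range(total: int, parts: int) -> list:
    """Split range(total) into `parts` contiguous (start, stop) chunks."""
    base, extra = divmod(total, parts)
    chunks, start = [], 0
    for index in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + base + (1 if index < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def subjob_xrsl(index: int, start: int, stop: int) -> str:
    """Describe one sub-job that processes items start..stop-1."""
    return (
        f'&(executable=analyse.sh)(arguments="{start}" "{stop}")'
        f"(jobname=analysis-{index})(stdout=part-{index}.out)"
    )


for index, (start, stop) in enumerate(split_range(10, 3)):
    print(subjob_xrsl(index, start, stop))
```

Each generated description would be submitted separately, and the per-part output files gathered afterwards. Error checking for failed parts still has to be added by the user.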

22
Information resources and support
  • lots of documentation, presentations and
    tutorials on the NorduGrid web site:
    http://www.nordugrid.org
  • user guide:
    http://www.nordugrid.org/documents/userguide.pdf
  • NorduGrid user support mailing list:
    nordugrid-support_at_nordugrid.org
  • NorduGrid technical discussion mailing list:
    nordugrid-discuss_at_nordugrid.org
    (the main communication channel between
    developers)

23
From the user's point of view
It should be as simple as that,
but it is not
24
From the user's point of view
It probably looks more like that:
still beta
25
Welcome to the Grid
Your attitude is important!
Do not think about what the Grid can do for you;
think about what you can do for the Grid!
26
Welcome to the Grid
Let us practice
27
Grid in action
  • a scientist wants to run an analysis consisting
    of
  • local data and parameters
  • remote data (e.g. a very large data set)
  • code (an executable program that analyses the
    data)
  • what does he need to do to use the grid?
  • here is an ARC specific example

28
Grid in action (cont.)
  • one needs to describe the problem; with ARC
    middleware this is done with xRSL (eXtended
    Resource Specification Language)
  • xRSL is used to specify job requirements: memory,
    time, disk space, input, output and executable
    files
  • hellogrid.sh:
    #!/bin/sh
    echo Hello Grid!
  • hellogrid.xrsl:
    &(executable=hellogrid.sh)
    (jobname=hellogrid)
    (stdout=hello.out)
    (stderr=hello.err)
    (gmlog=gridlog)
    (walltime=10)
    (memory=32)
    (disk=1)

29
Grid in action (cont.)
  • one needs to authenticate with a valid
    certificate
  • a time-limited proxy certificate is used to act
    on the user's behalf on the Grid

grid-proxy-init
Your identity: /O=Grid/O=NorduGrid/OU=csc.fi/CN=Grid User
Enter GRID pass phrase for this identity:
Creating proxy .............................................. Done
Your proxy is valid until: Wed Apr 2 01:24:27 2008
30
Grid in action
(diagram labels: localhost, index server)
ngsub myanalysis.xrsl
  • the scientist's analysis is now ready to be
    submitted to the Grid
  • the middleware client makes an LDAP query to an
    index server (GIIS) that has a list of the
    available resources

31
Grid in action
  • the middleware client makes more detailed LDAP
    queries to the information servers (GRIS) of the
    available resources
  • the middleware client chooses resources based on
    information from the clusters:
    • authorization to use
    • available memory, disk space
    • queue length, etc.

(diagram labels: localhost; ametisti.hip.helsinki.fi,
sepeli.csc.fi, se1.ndgf.csc.fi, magnum.uio.no)
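
The selection step can be pictured as filter-then-rank. The following is a minimal Python sketch under stated assumptions: the Cluster record, its field names and all numbers are made up for illustration, and picking the shortest queue is only one plausible ranking, not a description of how the ARC broker actually decides.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Cluster:
    """Illustrative summary of what an information server might report."""
    name: str
    authorized: bool
    free_memory_mb: int
    free_disk_mb: int
    queued_jobs: int


def choose_cluster(clusters: Sequence[Cluster], memory_mb: int, disk_mb: int) -> Optional[Cluster]:
    """Keep clusters that satisfy the job, then pick the shortest queue."""
    candidates = [
        c for c in clusters
        if c.authorized and c.free_memory_mb >= memory_mb and c.free_disk_mb >= disk_mb
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.queued_jobs)


# Host names come from the slide; every value below is invented.
clusters = [
    Cluster("ametisti.hip.helsinki.fi", True, 2048, 10000, 12),
    Cluster("sepeli.csc.fi", True, 4096, 20000, 3),
    Cluster("magnum.uio.no", False, 8192, 50000, 0),
]

best = choose_cluster(clusters, memory_mb=32, disk_mb=1)
print(best.name if best else "no matching cluster")
```

Here the unauthorized cluster is dropped first, even though its queue is empty, and the remaining cluster with fewer queued jobs is chosen.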
32
Grid in action
(diagram labels: local data, code, proxy and result at
localhost; ametisti.hip.helsinki.fi; remote data on
se1.ndgf.csc.fi)
  • the middleware client uploads the local data,
    code and user proxy to the selected resource via
    GridFTP
  • a job id is returned:
    > Job submitted with jobid:
      gsiftp://ametisti.grid.helsinki.fi:2811/jobs/4556
  • with the user proxy, the middleware downloads
    remote data from storage via GridFTP
  • the analysis queues, runs and returns a result
  • after processing, the scientist downloads the
    results of the analysis via GridFTP
33
Local Jobs vs. Grid Jobs
  • Local batch jobs
    • batch queue system options specifying job
      requirements are usually written in small
      scripts, which also define directory paths etc.
    • qsub, llsubmit, ...
  • Grid jobs
    • described using the (extended) Resource
      Specification Language (xRSL)
    • ngsub
    • Runtime Environments
    • file transfers from the submitting machine or
      from separate file servers on the Grid, Storage
      Elements (SE)
    • Grid middleware transforms the Grid job into a
      local batch job