OMII distribution evaluation activity at KIAM¹ and JINR²

1
OMII distribution evaluation activity at KIAM¹
and JINR²
  • Viktor Pose² et al.
  • EGEE-3
  • 18.04-22.04.2005

¹ Keldysh Institute for Applied Mathematics,
Russian Academy of Sciences
² Joint Institute for Nuclear Research, Dubna,
Russia
2
Contents
  • Evaluation team
  • OMII functionality (based on our experience and
    OMII User Guide)
  • General description
  • Architecture
  • Account management
  • User operation
  • Resource Allocation
  • Data Staging
  • Security
  • Authorization
  • Administration
  • Installation
  • Tests
  • Job Service
  • Data Service
  • Performance and concurrency of dummy services
  • Security
  • Adding an application
  • Adding a new service

3
Evaluation team
  • Keldysh Institute for Applied Mathematics,
    Russian Academy of Sciences
  • P. Berezovsky
  • E. Huhlaev
  • V. Kovalenko
  • D. Semyachkin
  • Joint Institute for Nuclear Research, Dubna,
    Russia
  • Y. Bugaenko
  • V. Galaktionov
  • N. Kutovskiy
  • V. Pose
  • I. Tkachev

4
General description
  • OMII distribution (OMII) is a middleware aimed at
    enabling access to remote distributed resources
    in an environment with service architecture
  • Two kinds of user operation are primarily
    supported
  • Job execution
  • File transfer
  • Job execution in OMII means running
    applications that are preinstalled on some
    servers or clusters
  • Executing programs submitted by users as part
    of a job submission is not possible
  • A distributed environment based on OMII is highly
    decentralized
  • All interactions occur directly between the
    client-server pair without participation of
    intermediate servers, such as a resource broker,
    information service or replica catalog
  • OMII does not include any community services, and
    is analogous to Globus Toolkit rather than
    Workload Management System of EU DataGrid.

5
Architecture
  • OMII is centered around web services and grid
    services standards
  • Web services provide server-client connectivity,
    utilizing SOAP for communication
  • Web services are provided by Jakarta's Apache
    Axis and hosted by Jakarta's Apache Tomcat web
    server
  • OMII innovates by suggesting a conversation
    mechanism for keeping state (context) over a
    series of interactions
  • Similar functionality is provided by WSRF
  • This mechanism is used in particular for
    authorization
  • OMII server computer
  • Accomplishes hosting functions for services
  • Serves as a gateway to a resource pool of
    computing nodes
  • The resource pool is supposed to be managed by a
    batch system
  • Interfaces to PBS and Condor are provided in form
    of platform scripts

6
Account management
  • To join an OMII-based Grid and perform any
    operations, the user needs
  • A certificate
  • An account
  • Account creation requires a relation to be
    defined for each (user, server) pair
  • This does not scale well
  • A user applies for an account at each OMII
    Service site
  • User runs the ogre_client open command against
    each OMII Server site
  • Service provider grants the application (or
    declines it) using a web-based tool
  • The scheme improves slightly with the
    introduction of an intermediary, a budget holder
  • Step 1. The budget holder applies for an account
  • Budget holder runs the ogre_client open command
    against an OMII Server site
  • Service provider grants the application (or
    declines it) using a web-based tool
  • After this step the new account can be used by
    its owner only
  • Step 2. The budget holder gives users access to
    the account
  • User sends his certificate to the budget holder
  • Budget holder stores the user's certificate in
    the Java Keystore of his client
  • Budget holder enables a user to access an account
    using the graphical interface of the ogre_client
    browse command
  • The budget holder is responsible for paying for
    usage of the account
  • Under the conditions of a Grid with many Resource
    Centers and big virtual organizations
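
The scaling concern on this slide can be made concrete with a little arithmetic. The sketch below is only an illustration under stated assumptions: it counts how many applications a service provider must approve with and without a budget holder, and how many per-account enablements remain. The function names and the sample sizes (100 users, 10 servers) are hypothetical and do not come from the evaluation.

```python
def direct_applications(num_users: int, num_servers: int) -> int:
    """Each user applies for an account at each server site."""
    return num_users * num_servers


def budget_holder_scheme(num_users: int, num_servers: int) -> dict:
    """One budget holder opens one account per server, then enables users.

    Assumption: every user needs access to every account.
    """
    return {
        "provider_approvals": num_servers,             # one account per server
        "keystore_imports": num_users,                 # one certificate per user
        "enable_actions": num_users * num_servers,     # one per (user, account)
    }


users, servers = 100, 10  # hypothetical sizes
direct = direct_applications(users, servers)
shared = budget_holder_scheme(users, servers)
print(direct, shared)
```

Under these assumptions provider approvals drop from 1000 to 10, but the budget holder still performs one enablement per (user, account) pair, which fits the observation that the scheme improves only slightly.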

7
User operation
  • A user's operations, in particular job execution
    and file transfer, are carried out by means of
    the OMII client part, ogre_client
  • ogre_client is a script which starts a Java
    program
  • Part of ogre_client's functionality is
    provided to the user through interactive
    graphical panels
  • A Java library is provided to run jobs from your
    own applications
  • On a client computer there must be a separate
    client installation for each user
  • The ogre_client configuration file explicitly
    points to the Java Keystore, where the user's
    key is kept, and to some other personal files
  • This is a fairly uncommon method of deployment
    for a client
  • There is no single command for job execution;
    the procedure is divided into four steps
  • Resource allocation
  • Uploading the input data
  • Running the job
  • Downloading the output data
  • Such a separation
  • Is partly a forced measure, as far as resource
    allocation is concerned
  • May be useful from the point of view of
    granularity
  • Places an additional burden on the user
  • Each step generates some intermediate files,
    which must be passed to the following steps
  • Accounts.xml stores a user's account
    information
  • client.state stores state data related to a
    user's operations
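
The four-step procedure can be sketched as a small driver that calls the client once per step and stops at the first failure. This is a minimal sketch, not the OMII client's own logic: the subcommand names (tender, upload, run, download) are taken from the slides, while run_job, fake_runner and the exit-code convention are hypothetical. Real invocations also need arguments and the intermediate files mentioned above, which the slides do not spell out.

```python
from typing import Callable, List, Sequence

# Subcommand names as they appear in the slides.
STEPS = ("tender", "upload", "run", "download")


def run_job(runner: Callable[[Sequence[str]], int],
            client: str = "./ogre_client") -> List[str]:
    """Run the four steps in order; stop at the first non-zero exit code."""
    completed: List[str] = []
    for step in STEPS:
        if runner([client, step]) != 0:
            break
        completed.append(step)
    return completed


def fake_runner(cmd: Sequence[str]) -> int:
    """Stand-in for a real process launcher: pretend 'upload' fails."""
    return 1 if cmd[1] == "upload" else 0


all_ok = run_job(lambda cmd: 0)
stopped = run_job(fake_runner)
print(all_ok, stopped)
```

Passing the launcher in as a parameter keeps the sketch testable without an OMII installation; a real driver could pass a wrapper around subprocess.call instead.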

8
Resource Allocation
  • During resource allocation an execution server is
    selected
  • Two files are used by the ogre_client tender
    command: Accounts.xml and Resources.xml
  • Accounts.xml contains all accounts of a user
  • Resources.xml describes the resources requested
    for the job(s) to be executed
  • Name of the application suite
  • Performance of the executing computer
  • Memory and storage volumes
  • Processor time
  • Time boundary of the allocation
  • As a result, the user gets the list of service
    providers, which could grant the requested
    resources
  • Final decision is made interactively by the user
  • OMII has no information service
  • The process of getting allocations (tender) is
    carried out by querying all the servers on which
    the user has an account
  • This approach wastes time and network bandwidth
    in a grid with many users and RCs
  • Resource consumption in OMII is payable
  • The budget holder is responsible for paying for
    usage of the account
  • Price per uploaded or downloaded byte
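
Because there is no information service, a tender amounts to asking each known server in turn whether it can satisfy the request. The sketch below illustrates that idea only; the Server and Request classes, their fields and the site names are hypothetical and do not reflect the actual Resources.xml schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    suites: frozenset
    memory_mb: int
    storage_mb: int


@dataclass
class Request:
    suite: str
    memory_mb: int
    storage_mb: int


def tender(request: Request, servers: List[Server]) -> List[str]:
    """Ask every server in turn; keep those able to grant the request."""
    offers = []
    for server in servers:  # one query per server the user has an account on
        if (request.suite in server.suites
                and server.memory_mb >= request.memory_mb
                and server.storage_mb >= request.storage_mb):
            offers.append(server.name)
    return offers


servers = [
    Server("site-a", frozenset({"suite-x"}), 512, 1000),
    Server("site-b", frozenset({"suite-y"}), 2048, 5000),
    Server("site-c", frozenset({"suite-x"}), 1024, 200),
]
offers = tender(Request("suite-x", 256, 500), servers)
print(offers)
```

The loop makes the cost visible: the number of queries grows with the number of servers for every tender, which is the bandwidth concern raised on the slide.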

9
Data staging
  • User uploads the input file(s) to the Data
    Service via the ogre_client upload command
  • Currently the preferred option is to put all
    input files the job needs in one zip file
  • User submits job
  • Input file is staged in from the Data Service to
    the job workspace - an extra directory created
    for each job
  • Input data files are moved into the working
    directory of the job - a subdirectory of the
    workspace directory
  • including unpacking any that are compressed
    archives containing multiple inputs
  • If the job produces output files, they are
    created in the working directory
  • Output files are copied from the working
    directory into the specified positions in the
    workspace
  • Including packing multiple outputs into
    compressed archive files where necessary
  • Output file is staged out to the Data Service
  • User downloads the output file(s) via the
    ogre_client download command
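
The packing and unpacking steps can be reproduced with Python's standard zipfile module. This is a minimal sketch of the idea, not OMII's staging code: the helper names, file names and directory layout are assumptions made for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def pack_inputs(files, archive):
    """Pack input files into one zip archive, stored by base name."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=Path(f).name)
    return archive


def unpack_inputs(archive, workdir):
    """Unpack the archive into the job's working directory."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(workdir)
        return sorted(zf.namelist())


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "a.txt").write_text("alpha")
    (tmp / "b.txt").write_text("beta")
    archive = pack_inputs([tmp / "a.txt", tmp / "b.txt"], tmp / "input.zip")
    # Hypothetical layout: working directory inside a per-job workspace.
    workdir = tmp / "workspace" / "work"
    workdir.mkdir(parents=True)
    unpacked = unpack_inputs(archive, workdir)
    restored = (workdir / "b.txt").read_text()
print(unpacked, restored)
```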

10
Security
  • OMII addresses two aspects of security:
    authentication and authorization
  • Authentication mechanism is based on X.509
    certificates and public key technology
  • The current OMII distribution is configured to
    work with "temporary" certificates signed by the
    Certification Authority (CA) on the OMII security
    server
  • Work with non-OMII certificates is allowed
  • There are no special instruments except standard
    Java tools
  • In the future, it will be possible to set up a
    grid where different client and service
    certificates are signed by different CAs but this
    has not been implemented at present. (OMII 1.2.0
    User Guide)
  • OMII Extensions component (GridServIT)
  • Initially supports message-level integrity based
    on WS-Security and X.509 PKI
  • Provides authentication
  • The identity of the sender of the message is
    taken from the WS-Security SOAP header after
    checking the message's integrity
  • Provides authorization
  • Process-based access control (PBAC) useful for
    enforcing business process
  • No proxy certificate support
  • Unlike Grid Security Infrastructure, OMII does
    not support proxy credentials, which allow a
    computation (e.g. a service) to securely delegate
    user rights to another computation
  • This seriously limits the ability of secure
    service-to-service interactions
  • According to OMII support - OMII is committed to
    achieving interoperability with the Globus
    toolkit and an interoperable authentication model
    is under discussion
  • Java Keystore management
  • Currently Keystore password and Private key
    password are stored in clear text in a file
  • One feature of the keytool utility used to manage
    the Java Keystore needs to be improved

11
Authorization
  • All services and applications run under a
    single system account
  • This makes the basic authorization mechanism on
    the server side primitive
  • Authorization module - Process Based Access
    Control (PBAC)
  • May optionally be used in service development to
    enforce a workflow (business process)
  • E.g. commercial service workflow: payment -> run
    job
  • Applicable in situations where the sequence of
    service operations is essential
  • In PBAC the term "conversation" is used to
    represent the identity of a particular dialogue
    between a client and a service provider
  • On the server side PBAC records the state of a
    conversation as the set of operations that are
    accessible
  • Possible authorizations are described by
  • Identity of the user
  • Conversation ID
  • Operation name
  • When a user calls a service, an operation will be
    executed for a particular user in a particular
    conversation only if an authorization matching
    this specification is present
  • If a service developer wants to provide such an
    enforcement, he must wrap each operation of his
    service with PBAC API calls
  • These calls must check if an authorization is
    present for the calling user and the conversation
    ID
  • Finally the operation may open authorizations for
    following operations, and close authorizations
    for others
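
The PBAC idea described on this slide can be sketched as a set of (user, conversation ID, operation) triples that each operation checks and then updates. This is an illustration of the concept only and not the GridServIT PBAC API: the class, method and operation names are hypothetical.

```python
class ConversationAuthorizations:
    """Set of (user, conversation ID, operation) triples that are open."""

    def __init__(self):
        self._open = set()

    def open(self, user, conversation_id, operation):
        self._open.add((user, conversation_id, operation))

    def close(self, user, conversation_id, operation):
        self._open.discard((user, conversation_id, operation))

    def check(self, user, conversation_id, operation):
        if (user, conversation_id, operation) not in self._open:
            raise PermissionError(f"{operation} not authorized")


def pay(auth, user, conv):
    auth.check(user, conv, "pay")
    # ... the payment itself would happen here ...
    auth.close(user, conv, "pay")      # the same payment cannot be repeated
    auth.open(user, conv, "runJob")    # payment unlocks job execution
    return "paid"


def run_job(auth, user, conv):
    auth.check(user, conv, "runJob")
    auth.close(user, conv, "runJob")
    return "job started"


auth = ConversationAuthorizations()
auth.open("CN=alice", "conv-1", "pay")
try:
    run_job(auth, "CN=alice", "conv-1")   # attempted before payment
    early = "allowed"
except PermissionError:
    early = "denied"
results = [early, pay(auth, "CN=alice", "conv-1"),
           run_job(auth, "CN=alice", "conv-1")]
print(results)
```

Each operation is wrapped by a check at entry and by open/close calls at exit, which mirrors the burden the slide places on the service developer.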

12
Administration
  • Administration in OMII refers to the
    administration of a single server site
  • A web-based tool ra_admin may be used to
    configure several parameters
  • Data service
  • Limits on storage capacities
  • Upload and download bandwidths (network provider
    limits)
  • Costs per uploaded/downloaded Byte
  • Name and capabilities of each machine in the
    resource pool
  • Relative performance
  • Memory
  • Number of processors
  • Machines can be added or deleted to/from the
    resource pool
  • Application suites and the applications within
    those suites
  • Both described by their URI
  • An application suite is a collection of
    applications normally used together
  • An administrator can specify which machines can
    run which application suites
  • Work unit costs are set per application suite
  • Apparently all this information is used
  • At the tender step of resource allocation
  • For calculating the cost of a user's job

13
Installation
  • OMII 1.0.0 and 1.1.1 client installation
  • Easy and fast
  • Successful on
  • Windows XP (with flag CheckDiskSpace ->
    failonerror=false), small issue
  • SuSe Linux 9.0
  • CERN Linux 7.3
  • On a computer with multiple users one client
    installation per user is needed
  • OMII 1.0.0 and 1.1.1 server installation
  • Successful only on SuSe Linux 9.0 (small issue)
  • OMII 1.2.0 released on 20.03.2005
  • Server and client support
  • Redhat Enterprise Linux 3.0 ES/WS
  • SuSE 9.0.
  • Client installation successful on SuSe Linux 9.0
  • Server not tested in JINR and KIAM

14
Setup for Job Service tests
  • This setup was used for the tests in the next 2
    slides
  • OMII 1.1.1 Server node
  • P4 2.4GHz, 512MB RAM
  • Preinstalled application GRIATestApp
  • 2 client nodes, each having
  • 10 OMII 1.1.1 client installations
  • P4 3GHz, 1.5GB RAM
  • Amount of server node CPU used to run the
    application is negligible
  • Work.xml: the line containing the -cputime30
    setting was removed
  • Input file is the zipped test.txt file which
    contains only one letter
  • Rough profiling between Java and Postgres CPU
    consumption on the server node was made using ps
    utility
  • Total CPU consumption on the server node was
    measured via vmstat

15
Job Service Concurrency Robustness
  • Test Description
  • Running 20 clients simultaneously
  • Each client sequentially submits 20 jobs to the
    Job Service, 400 jobs in total
  • Duration of the test: 1h 18min
  • Results
  • The server remained stable during and after this
    test and all job submissions were successful
  • The Job Service throughput was 5.1 jobs/minute
  • Average server node CPU consumption: 79%
  • The only issue that was noticed
  • A few times the built-in monitoring of the
    running jobs failed with the message
  • Status: Status is now submitted
    (STATUS-SCRIPT-ERROR)
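
The reported throughput follows directly from the test parameters on this slide. The short calculation below uses only those figures; the variable names are ours.

```python
clients = 20
jobs_per_client = 20
duration_min = 1 * 60 + 18  # 1h 18min

total_jobs = clients * jobs_per_client
throughput = total_jobs / duration_min            # jobs per minute
seconds_per_job = duration_min * 60 / total_jobs  # mean time per completed job

print(total_jobs, round(throughput, 1), round(seconds_per_job, 1))
```

That is 400 jobs in 78 minutes, or about 5.1 jobs per minute, which corresponds to one completed job roughly every 11.7 seconds.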

16
Job Service Performance
  • Test Description
  • Different clients each using a separate account
    or
  • Different clients using the same account (e.g.
    one account per VO)
  • Each client submits 1 job using the ogre_client
    run command
  • Results
  • Average CPU consumption at server node: 79-84%
    for 20 clients
  • Job submission throughput cannot grow
    considerably with a further increase in the
    number of concurrent clients
  • The maximum job submission rate to an OMII Job
    Service is about 6 jobs per minute for a 2.4GHz
    P4 CPU server node
  • A client consumes on average 4.3s of a 3GHz P4
    CPU per job submission
  • Share of server CPU consumed by PostgreSQL:
    10..20% - 23..26%; the rest is consumed by Java
17
Job Service Stability
  • Test Description
  • A user needs to submit multiple jobs to a batch
    queue via one account
  • Account creation
  • Resource allocation
  • Uploading the input file (150 Bytes)
  • The batch mode job submission command
    ogre_client start is run in a loop to submit
    multiple jobs sequentially
  • Each job executes the GRIATestAPP application
  • The ogre_client monitor command can be used to
    monitor the status of submitted jobs
  • Server node P4 2.4GHz, 512MB RAM
  • Client node P4 3GHz, 1.5GB RAM
  • Results
  • Successful execution until an error occurred near
    the 180th start
  • SEVERE: Problem creating input stream for
    reading SOAPMessage
  • After this failure, subsequent starts of the
    application from the command line were successful
  • About 16s client execution time per job
    submission
  • This setup is convenient for submitting and
    monitoring up to about 10-30 jobs
  • Bulk job submission is not supported in OMII

18
Monitoring job status
  • OMIICLIENT> ./ogre_client monitor
  • OGRE Client
  • Contacting http://omii01.jinr.ru:18080/axis/services/JobService2273
  • Contacting http://omii01.jinr.ru:18080/axis/services/JobService2271
  • http://jinr.ru/sjob
  • URL: http://omii01.jinr.ru:18080/axis/services/JobService2273
  • Status: Status is now submitted (RUNNING)
  • > JOB_STATUS: RUNNING
  • > Current status: RUNNING
  • >
  • > Details:
  • > GRIA Test App running
  • > Current time: Mon Mar 7 13:41:41 2005
  • http://jinr.ru/sjob
  • URL: http://omii01.jinr.ru:18080/axis/services/JobService2271
  • Status: Status is now output-staging-complete
    (FINISHED)
  • > JOB_STATUS: FINISHED

Example: a user submitted 2 jobs
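
When many jobs are monitored, output like the above is easier to handle once the status lines are extracted programmatically. The sketch below is a hedged illustration: the regular expression assumes the "Status is now <state> (<CODE>)" wording seen in the transcript, and the exact spacing and punctuation of the real client output are an assumption, since the transcript lost some characters.

```python
import re

# Matches e.g. "Status is now submitted (RUNNING)", also across a line wrap.
STATUS_RE = re.compile(
    r"Status is now\s+(?P<state>[\w-]+)\s+\((?P<code>[A-Z-]+)\)"
)


def parse_statuses(text):
    """Return (state, code) pairs in the order they appear."""
    return [(m["state"], m["code"]) for m in STATUS_RE.finditer(text)]


sample = """\
Status: Status is now submitted (RUNNING)
Status: Status is now output-staging-complete
  (FINISHED)
Status: Status is now submitted (STATUS-SCRIPT-ERROR)
"""
statuses = parse_statuses(sample)
failed = [s for s in statuses if s[1] == "STATUS-SCRIPT-ERROR"]
print(statuses, failed)
```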
19
Data Service Reliability and Performance
  • Test Description
  • Single client runs 1000 sequential
    upload/download cycles with files of about 10MB
    to/from the OMII Data Service
  • Comparison of the uploaded and downloaded files
  • P4 3GHz, 1.5GB RAM client and server nodes
  • Results
  • All uploads and downloads were successful
  • One upload/download cycle consumed about 2.9s
    CPU time on the OMII Server node
  • Average client execution times
  • Upload: 11.4s
  • Overwriting: 5.7s
  • Download: 5.5s
  • About 3.5s CPU consumption by client per upload
    or download
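
The slides do not say how the uploaded and downloaded files were compared. One common way, shown here purely as an assumption, is to compare SHA-256 digests computed in chunks so that large files need not fit in memory. The helper names and the temporary files are hypothetical.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(a, b):
    return sha256_of(a) == sha256_of(b)


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "original.bin"
    downloaded = Path(tmp) / "downloaded.bin"
    corrupted = Path(tmp) / "corrupted.bin"
    payload = os.urandom(1024)
    original.write_bytes(payload)
    downloaded.write_bytes(payload)
    # Flip the last byte to simulate a damaged transfer.
    corrupted.write_bytes(payload[:-1] + bytes([payload[-1] ^ 0xFF]))
    intact = same_content(original, downloaded)
    damaged = same_content(original, corrupted)
print(intact, damaged)
```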

20
Data Service Concurrency
  • Test Description
  • Simultaneous upload or upload+download+compare of
    files to/from the OMII Data Service with up to 5
    parallel clients
  • Each client runs 10 sequential uploads or
    upload+download+compare cycles
  • Server node (OMII 1.0.0), P4 2.4GHz, 512MB RAM
  • Client node with 5 OMII 1.0.0 client
    installations, Celeron 1.3 GHz, 425MB RAM
  • Results
  • All uploads and downloads were successful
  • Client execution time grows approximately
    linearly with an increasing number of clients
    (may be partly influenced by the slow client node)

21
Performance of Dummy Services
  • Test Description
  • Estimate the overhead of the OMII+Axis
    infrastructure
  • Dummy services (Non-PBAC Test Service and
    ExampleGridServIT PBAC Service) were tested using
    the clients from the OMII distribution
  • Client node
  • OMII 1.0.0 - P4 3 GHz, 1GB RAM, SUSE 9.0
  • Server nodes
  • Server at OMII
  • Server at KIAM OMII 1.0.0 - P4 3 GHz, 1GB RAM,
    SUSE 9.0
  • Time measurements were carried out with the time
    command
  • Results
  • Below we list response times of the tested
    services from a single client
  • Server response time (server wall time + network
    delay) was estimated as (client wall time) -
    (client CPU time)
  • Response time for the server at KIAM is less than
    response time for the server at OMII because the
    network delay for the KIAM server is very small
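
The estimate used here is a simple subtraction over the figures reported by the time command. In the sketch below, client CPU time is assumed to be the sum of the user and sys values, and the sample numbers are hypothetical, not measurements from the evaluation.

```python
def server_response_time(real_s: float, user_s: float, sys_s: float) -> float:
    """Estimate server wall time plus network delay from `time` output.

    Assumption: client CPU time = user + sys.
    """
    client_cpu = user_s + sys_s
    return real_s - client_cpu


# Hypothetical figures for one client run.
estimate = server_response_time(real_s=2.0, user_s=1.0, sys_s=0.25)
print(estimate)
```

The result also contains any time the client spent waiting off-CPU for other reasons (disk I/O, for example), so it is better read as an upper bound on the server and network share.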

[Response-time tables: non-PBAC Test Service; ExampleGridServIT PBAC Service]
22
Concurrency with dummy services
  • Test description
  • Dummy service ExampleGridServIT PBAC Service was
    tested with the regular OMII client
  • It was impossible to run client utilities
    concurrently under a single user
  • A minor change to the utilities' code made it
    possible to start any number of them in parallel
  • It was not possible to create sufficient load on
    the server from a single client host
  • Only one client node was available for this test
  • Server wall time was much less than the client
    CPU time
  • Client node
  • OMII 1.0.0 - P4 3 GHz, 1GB RAM, SUSE 9.0
  • Server nodes
  • Server at OMII
  • Server at KIAM OMII 1.0.0 - P4 3 GHz, 1GB RAM,
    SUSE 9.0
  • Results
  • 100 parallel clients finished without any
    errors
  • Client CPU time for all starts was approximately
    the same
  • Total client wall time is equal to 100 × (single
    client wall time)

23
Security Tests
  • Test OMII Services with Russian DataGrid CA user
    certificates, accepted by LCG/EGEE communities
  • OMII Services can't be used with Russian DataGrid
    CA user certificates - presumably there is no
    support for certificates signed with a 4096 bit
    CA public key
  • According to OMII support
  • It is possible that OMII will use other security
    providers, e.g. IBM JSSE or Bouncy Castle, in the
    future
  • It's still not clear whether they support
    certificates signed with a 4096 bit CA public key
  • Test authorization features
  • Enable access of multiple users to one account -
    OK
  • Account owner imports certificates of the
    relevant users into the Java Keystore of his
    client
  • Account owner enables access for each user
    certificate manually via a graphical UI
  • User manually adds the account to the accounts
    file (uses XML format) of his client
  • This was used in the Job Service tests
  • A possible VO <-> OMII account mapping could not
    be effectively managed this way for big dynamic
    VOs
  • One user accessing multiple accounts from the
    same client installation - OK
  • Accounts are automatically added to a client's
    accounts file during account creation
  • User chooses the account he wants to use during
    resource allocation for the task with the
    graphical UI of the ogre_client tender command

24
Adding an application
  • According to the OMII User Guide, to add a new
    application one has to
  • Create an application startup wrapper script to
    start the application
  • Simple applications can use the provided test
    startup wrapper script
  • Optionally create additional wrapper scripts for
    application-specific status monitoring and/or job
    termination
  • Simple applications can use the provided test
    status and job termination scripts
  • Deploy the application and wrapper script(s) on
    all execution platform nodes assigned to that
    application suite
  • Append application parameters to the job service
    configuration so the job service can find and use
    the new application
  • Add the application to a new or existing
    application suite in the resource model, using
    the resource model admin web interface
  • Using the provided
  • Documentation
  • Application wrapper and status scripts
  • we successfully added simple applications
  • This was easily done and no issues were noticed

25
Adding a new service
  • OMII Extensions component (GridServIT)
  • Built on the native Apache Axis SOAP container
    without changing it in any way
  • Provides a service context API for eScientists
    wishing to deploy Grid Services
  • Services may use this API to retrieve contextual
    information associated with a service
  • common data, such as the distinguished name of an
    authenticated remote user
  • headers from the SOAP messages
  • access to basic infrastructure security services
    (the only one available at present is PBAC
    authorization module)
  • The alternative to the OMII API-based approach:
    an extension of the container functionality
  • Seems to be considerably more productive for
    enforcement of common grid policies
  • Would free a service developer from programming
    grid-related code
  • All OMII services are primarily web services
  • Common Tomcat and Axis tools can be used for
    service development and deployment
  • OMII suggests the GEMSS Transport and Messaging
    framework as an invocation and messaging
    framework
  • It enables client applications to make
    invocations against message based services
  • The proposed method is low-level, though
    flexible: a client developer must write an
    invocation manually, without a stub
  • OMII documentation describes a difficult way
    of creating services and clients that does not
    use well-known Axis tools, Java2WSDL and
    WSDL2Java for example
  • A simple example service was eventually
    deployed
  • There was a problem with deployment, which was
    caused by a small inaccuracy in the documentation
26
Interoperability with WMS
  • Interoperability with WMS was evaluated based on
    a paper study.
  • Can we use the OMII client to interface to the
    WMS?
  • Different job management interfaces and
    architecture
  • WMS UI
  • Designed to contact WMS Network Server and LB
  • Functionally richer than OMII UI
  • Based on JDL
  • OMII UI
  • Designed to contact OMII Job Service, Data
    Service, Resource Allocation Service
  • Aimed at submission of a job to a certain site
  • No support for VO and VOMS
  • No support for passing job executables along with
    the job submission
  • Security
  • No support for user proxy certificates and VOMS
    in OMII
  • WMS uses information added by VOMS into the proxy
    certificate - the VO of the user
  • Conclusion
  • OMII UI can't be used to submit jobs to WMS

27
Interoperability with WMS
  • Is the 3-tier architecture OMII UI <-> OMII Server
    <-> WMS an effective way to provide OMII users with
    the possibility to submit their jobs through WMS
    to the underlying CEs?
  • Different job management interfaces and
    architecture
  • Currently there are no ready-to-use means to make
    the 3-tier architecture OMII UI <-> OMII Server <->
    WMS support the necessary amount of the
    functionality provided by WMS and WMS UI
  • Security
  • No support for user proxy certificates and VOMS
    in OMII
  • Additional latency
  • The 3-tier architecture OMII Java UI <-> OMII Java
    Services <-> WMS introduces additional latency
  • Conclusion
  • OMII server can't be used to interface to the WMS
    without additional development efforts

28
Interoperability with WMS
  • Can we use the WMS to submit to OMII servers?
  • Different interfaces to computing resource and
    different job management architecture
  • WMS submits jobs to CE via Condor-C (web service
    interface)
  • OMII Job Service uses its own web service
    interface
  • Does OMII Job Service produce asynchronous job
    status notifications like a CE?
  • Information services
  • OMII has no means to publish CE and SE
    information for OMII Job Service and OMII Data
    Service to an Information System or WMS
  • Authorization
  • No VOMS based authentication and authorization
    support in OMII
  • VOMS based authentication and authorization
    support has to be added to OMII to work
    effectively with big dynamic VOs and be compatible
    with WMS
  • Conclusion
  • WMS can't be used to submit jobs to OMII Job
    Service without additional development efforts

29
Summary of tests
30
OMII Support and Documentation
  • OMII support was contacted to resolve several
    issues and problems encountered
  • Was mostly responsive
  • The answers mostly came the same day or the
    next day
  • A couple of problems were discussed and
    developers made corresponding changes in the
    documentation
  • The provided documentation clearly covers the
    main topics, but at some points (e.g. budget
    management, service creation) it is unclear and
    unfinished

31
Summary
  • OMII has several interesting features and
    abilities
  • Web services architecture
  • Account management
  • Resource consumption accounting
  • Conversation mechanism and authorization module
    PBAC - Process Based Access Control
  • Easy and compact installation (but restricted to
    certain OSes)
  • OMII is oriented towards web services, rather
    than Grid architecture
  • No community services
  • Does not support Grid Security Infrastructure and
    proxy credentials
  • No support for Virtual Organisations and VOMS
  • Does not support the execution of users' programs
  • No interoperability with Globus Toolkit and WMS
  • Management and administrative techniques are
    intended for servicing individual servers, not VO
    and resource infrastructure
  • User operation needs improvements in the
    implementation as well as enhanced
    functionality, especially in the case of larger
    grids
  • more powerful resource selection language
  • account and resource allocation management