TROPIX Technology Survey

Provided by: benjami91

1
TROPIX Technology Survey
  • B. Lynch

2
Overarching Aspects of Technologies Considered
  • Security
      • Identity management
      • Authorization
      • System security
  • Communication
      • Protocols, synchronous/asynchronous
      • Callbacks, workflow
  • Performance
      • Language restrictions, bulky protocols
  • Pre-made or DIY
      • What do we get from an existing framework?
      • How much time would it take to write it
        ourselves?
      • What constraints are imposed if we use existing
        code?

3
Central Authority
  • Simple to implement
  • Difficult or impossible to administer across
    sites
  • Each site has different security policies
  • Each site has different resource use policies

4
Federated Authority
  • More complicated to implement
  • No out-of-the-box solutions exist
  • Each site could use its own existing
    authentication mechanisms
  • Each site could enforce its own resource use
    policies
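
The contrast between a central and a federated authority can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the design of any framework in this survey: the `Site` and `Federation` classes, the `user@site` identity format, and the sample sites and passwords are all hypothetical. Each site keeps its own authentication mechanism and its own resource-use policy; the federation only routes the request.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Site:
    """One participating site (hypothetical model)."""
    name: str
    # The site's existing authentication mechanism: (user, secret) -> bool.
    authenticate: Callable[[str, str], bool]
    # The site's own resource-use policy: resource -> identities allowed to use it.
    policy: Dict[str, Set[str]] = field(default_factory=dict)

    def authorize(self, identity: str, resource: str) -> bool:
        return identity in self.policy.get(resource, set())


class Federation:
    """Routes requests between sites; holds no user database of its own."""

    def __init__(self) -> None:
        self.sites: Dict[str, Site] = {}

    def register(self, site: Site) -> None:
        self.sites[site.name] = site

    def request(self, home: str, user: str, secret: str,
                target: str, resource: str) -> bool:
        home_site = self.sites.get(home)
        target_site = self.sites.get(target)
        if home_site is None or target_site is None:
            return False
        # Step 1: the user's home site verifies identity with its own mechanism.
        if not home_site.authenticate(user, secret):
            return False
        # Step 2: the target site applies its own policy to the federated identity.
        return target_site.authorize(f"{user}@{home}", resource)


# Hypothetical sample data for illustration only.
site_a = Site("site_a", lambda u, s: (u, s) == ("alice", "pw-a"),
              {"cluster": {"alice@site_a"}})
site_b = Site("site_b", lambda u, s: (u, s) == ("bob", "pw-b"),
              {"cluster": {"bob@site_b", "alice@site_a"},
               "archive": {"bob@site_b"}})

fed = Federation()
fed.register(site_a)
fed.register(site_b)
```

With this sample data, `fed.request("site_a", "alice", "pw-a", "site_b", "cluster")` succeeds because site_a vouches for alice and site_b's policy admits her, while the same request for `"archive"` is refused by site_b's policy alone.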

5
Some of the Relevant Technologies Considered
  • PBS
  • Sun Grid Engine
  • Condor
  • PVM
  • MPI
  • Open Source Workflow Engines
  • myGrid
  • Globus
  • caGrid
  • BOINC
  • Tomcat

6
Message Passing for Parallel Job Execution
  • MPI
      • Message passing for parallel execution
      • Uses a central authority for job execution
  • PVM
      • Message passing for parallel execution
      • Uses a central authority for job execution
      • Lower performance compared to MPI
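
The message-passing pattern named on this slide can be sketched in a few lines. This is a toy stand-in, not the MPI or PVM API: the `Comm` class, its `send`/`recv` methods, and `parallel_sum` are hypothetical names, and Python threads with in-memory queues stand in for the separate processes a real MPI job would run. Rank 0 plays the coordinating role: it sends each worker a chunk of the data and collects the partial results.

```python
import queue
import threading


class Comm:
    """Toy communicator with one mailbox per rank (not the MPI API)."""

    def __init__(self, size):
        self.size = size
        self._mailboxes = [queue.Queue() for _ in range(size)]

    def send(self, dest, message):
        self._mailboxes[dest].put(message)

    def recv(self, rank):
        # Blocks until a message addressed to this rank arrives.
        return self._mailboxes[rank].get()


def worker(comm, rank):
    # Receive a chunk from rank 0, compute a partial sum, send it back.
    chunk = comm.recv(rank)
    comm.send(0, (rank, sum(chunk)))


def parallel_sum(data, workers=3):
    comm = Comm(workers + 1)  # rank 0 is the coordinator
    threads = [threading.Thread(target=worker, args=(comm, rank))
               for rank in range(1, workers + 1)]
    for t in threads:
        t.start()
    # Scatter: worker r gets every `workers`-th element starting at index r - 1.
    for rank in range(1, workers + 1):
        comm.send(rank, data[rank - 1::workers])
    # Gather: collect one partial result per worker, in arrival order.
    partials = [comm.recv(0) for _ in range(workers)]
    for t in threads:
        t.join()
    return sum(value for _, value in partials)
```

For example, `parallel_sum(list(range(1, 101)))` splits the numbers 1 to 100 across three workers and combines their partial sums.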

7
Queues with a Central Authority
  • Condor
      • Job submission for non-dedicated machines
  • PBS
      • Batch job submission, handles parallel jobs
      • A central authority executes jobs on 1 or more
        machines
  • Sun Grid Engine
      • Similar to PBS capabilities
      • Java API
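
The idea of a queue with a central authority can be sketched as follows. This is a minimal illustration, not how Condor, PBS, or Sun Grid Engine are implemented: `run_batch`, the job tuple layout, and the machine names are hypothetical, and threads stand in for real machines. A single queue holds all submitted jobs, and each machine pulls the next job until the queue is empty.

```python
import queue
import threading


def run_batch(jobs, machines):
    """Run (job_id, func, args) jobs from one central queue.

    Returns a dict mapping job_id to (machine_name, result).
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = {}
    lock = threading.Lock()

    def machine_loop(machine):
        # Each machine keeps taking jobs from the central queue until none remain.
        while True:
            try:
                job_id, func, args = pending.get_nowait()
            except queue.Empty:
                return
            outcome = func(*args)
            with lock:
                results[job_id] = (machine, outcome)

    threads = [threading.Thread(target=machine_loop, args=(name,))
               for name in machines]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical batch: square the numbers 0..4 on two machines.
batch = [(i, pow, (i, 2)) for i in range(5)]
batch_results = run_batch(batch, ["node1", "node2"])
```

Which machine runs which job depends on timing, so only the computed values are deterministic. Because every job passes through the one queue, a single site's policy governs all of them, which is the limitation the survey raises for multi-site use.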

8
  • PBS
  • Sun Grid Engine
  • Condor
  • PVM
  • MPI

These technologies can be used for queuing and parallel
job execution. They all use a central authority to
distribute jobs. They can be used for individual
services but are insufficient for the high-level
architecture.

9
Open Source Workflow Engines
  • Dozens of engines exist (36 projects briefly
    scanned)
  • Many written in Java and may be suitable if we
    need to incorporate this component
  • Security may be a problem with any of these

10
BOINC
  • Designed for cycle-stealing and volunteer grids
    (like SETI@home)
  • No workflow framework
  • Designed for a single center of authority
  • Very low security

11
Globus
  • The Globus toolkit provides a framework for
    security, job submission, and file-movement for
    grid services
  • We would need to look elsewhere for information
    models, identity management, authorization, and
    workflow

12
Tomcat
  • Tomcat is a servlet container that can be used
    for web services
  • There is a large community of Tomcat users
  • Has tools for service deployment and security
  • We would need to look elsewhere for information
    models, identity management, authorization, and
    workflow

13
myGrid
  • Started and maintained by UK Universities
  • Focuses on bioinformatics
  • Started 6 years ago; progress has been slow
  • Support has been limited to 7 institutions in the
    UK
  • Taverna Workbench (graphical interface for
    workflows with myGrid services)
  • Limited security (authentication and
    authorization) was introduced in Taverna
    version 1.5, less than 1 year ago

14
caGrid
  • Broad support from NIH/NCI and many Universities
  • Supports data and analysis from many biological
    fields
  • Built on Globus technologies (also used for NSF
    TeraGrid); services can also be deployed to
    Tomcat
  • Has tools (or plans) for workflow, federated
    security, and information models

15
Home-Grown
  • New code could be written from scratch
  • Pros: no dependence on the weaknesses of existing
    frameworks
  • Cons: security, workflow, and information
    modeling would start at the ground level

16
caGrid
  • Broad support from NIH and many Universities
  • Dozens of new services are being developed
  • NIH is funding support of core framework as well
    as some of the new services
  • Community has developed and continues to develop
    information models for data and analysis from
    many biological fields
  • Built on standard technologies (Java, Globus,
    Tomcat, X.509, etc.)
  • Supports workflows, federated queries, and a
    federated authority