e-BioFlow: Different perspectives on scientific workflows

1
e-BioFlow: Different perspectives on scientific
workflows
  • Ingo Wassink (1), Paul van der Vet, Anton Nijholt
  • Human Media Interaction Group, University of
    Twente
  • Han Rauwerda, Timo Breit
  • Micro Array Department, University of Amsterdam
  • (1) i.wassink_at_ewi.utwente.nl

2
Overview
  • Scientific workflow systems
  • Problems using workflow systems
  • e-BioFlow
  • Control flow perspective
  • Data flow perspective
  • Resource perspective
  • Relations between perspectives
  • Conclusion
  • Current state
  • Future work

3
Scientific workflow systems
  • A lot of data is stored in online databases
    (SwissProt, PDB)
  • Many services are accessible in a uniform manner
    (BioMOBY, WSDL, SoapLab)
  • A graphical workflow tool is used to connect
    these services and databases
  • Share experiments and experimental results with
    others
  • Discuss, improve and reuse experiments:
    myExperiment (www.myexperiment.org)
  • Automatically store provenance data

4
Problems using workflow systems
  • Workflow systems are often difficult to use due
    to a complex user interface
  • It is not possible to model advanced control
    structures (loops, choices)
  • A priori knowledge about services is required
    (function, input and output)
  • Data produced by one service is often
    incompatible with data consumed by others
  • If a service is not available, the workflow needs
    to be modified

5
e-BioFlow (I)
  • Graphical tool: the user interacts with the
    workflow directly
  • Develop templates for experiments: abstract from
    web services until workflow execution
  • Enable complex workflow structures: sequential,
    parallel, iteration and choices
  • Shows limited information at a time to prevent
    information overload
  • But does not restrict the user in modeling
    workflows
  • Distributes information using a tabbed user
    interface
  • Ultimate goal: improve usability of workflow
    systems

6
e-BioFlow (II): Screenshots
[Screenshots: control flow perspective, data flow perspective, resource perspective]

7
Control flow perspective (I)
  • Defines the order of task execution
  • Uses dependencies: a task can depend on prior
    tasks
  • Advanced control structures define the order
    of task execution
  • Sequential: a task needs to wait for completion
    of prior task(s)
  • Parallel: tasks can be executed at the same time
  • Iterative: a task needs to be repeated until some
    criterion is met
  • Branching: the execution of a task depends on a
    certain criterion
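
The dependency-based ordering on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not how e-BioFlow itself is implemented: the task names are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) stands in for a workflow engine. It covers the sequential and parallel cases; iteration and branching are left out.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each task maps to the set of prior tasks it depends on.
dependencies = {
    "fetch_sequence": set(),
    "fetch_structure": set(),
    "align": {"fetch_sequence"},
    "report": {"align", "fetch_structure"},
}


def execution_batches(deps):
    """Return a list of batches; tasks within one batch may run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # All tasks whose prior tasks have completed are ready now.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(dependencies)
print(batches)
# [['fetch_sequence', 'fetch_structure'], ['align'], ['report']]
```

The two fetch tasks have no dependencies and form one parallel batch; `align` and `report` each wait for their prior tasks.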

8
Control flow perspective (II)
  • AND: execute next tasks in parallel
  • XOR: execute one of the next tasks, depending on
    conditions
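
As a rough sketch of the difference between the two split types, the functions below return which of the next tasks become enabled. The function names, the task names and the use of a default branch for the XOR case are assumptions made for this illustration; they are not taken from e-BioFlow or from the workflow engine it uses.

```python
from typing import Callable, Dict, List, Tuple

Condition = Callable[[Dict], bool]


def and_split(next_tasks: List[str]) -> List[str]:
    """AND split: every next task is enabled, so they can run in parallel."""
    return list(next_tasks)


def xor_split(branches: List[Tuple[Condition, str]], default: str, data: Dict) -> List[str]:
    """XOR split: exactly one next task is enabled, the first whose condition holds."""
    for condition, task in branches:
        if condition(data):
            return [task]
    return [default]  # no condition held, fall back to the default branch


# Hypothetical branching on the number of search hits.
branches = [
    (lambda d: d["hits"] == 0, "relax_search"),
    (lambda d: d["hits"] > 100, "filter_hits"),
]

print(and_split(["blast", "fetch_annotation"]))        # ['blast', 'fetch_annotation']
print(xor_split(branches, "align_hits", {"hits": 0}))   # ['relax_search']
print(xor_split(branches, "align_hits", {"hits": 12}))  # ['align_hits']
```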

9
Data flow perspective
  • A task can require information from prior tasks
  • Tasks have input and output ports to consume and
    produce data
  • Pipes define which outputs of prior tasks serve
    as inputs for next tasks
  • Type restrictions on data are checked
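
One way to picture ports, pipes and the type check is the sketch below. The `Task`, `Workflow` and `Sequence` classes are hypothetical and use Python types as a stand-in for whatever type system the tool really applies; the point is only that a pipe is refused when the producing port's type does not fit the consuming port.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Task:
    name: str
    inputs: Dict[str, type] = field(default_factory=dict)   # input port -> expected type
    outputs: Dict[str, type] = field(default_factory=dict)  # output port -> produced type


@dataclass
class Workflow:
    pipes: List[Tuple[Task, str, Task, str]] = field(default_factory=list)

    def connect(self, src: Task, out_port: str, dst: Task, in_port: str) -> None:
        """Add a pipe from src.out_port to dst.in_port if the port types are compatible."""
        produced = src.outputs[out_port]
        expected = dst.inputs[in_port]
        if not issubclass(produced, expected):
            raise TypeError(
                f"{src.name}.{out_port} produces {produced.__name__}, "
                f"but {dst.name}.{in_port} expects {expected.__name__}"
            )
        self.pipes.append((src, out_port, dst, in_port))


class Sequence(str):
    """Hypothetical data type for a biological sequence."""


fetch = Task("fetch", outputs={"sequence": Sequence})
blast = Task("blast", inputs={"query": Sequence}, outputs={"hit_count": int})
report = Task("report", inputs={"text": str})

wf = Workflow()
wf.connect(fetch, "sequence", blast, "query")        # accepted: Sequence fits Sequence
try:
    wf.connect(blast, "hit_count", report, "text")   # rejected: int does not fit str
    rejected = False
except TypeError as err:
    rejected = True
    print(err)
```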

10
Resource perspective
  • Defines what type of action (service) needs to be
    executed, rather than which actor (web service,
    tool, user) to invoke
  • Actor is chosen at runtime
  • Uses roles to describe constraints on actors
  • A role defines the required capabilities of an
    actor: the service type, and the input and output
    it should consume and produce
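
The idea of choosing an actor at runtime from a role can be sketched as follows. The `Role` and `Actor` classes, the `bind` function, the exact-match rule and the service names are all assumptions for illustration; how e-BioFlow actually matches actors to roles is not described in these slides.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Role:
    """Required capabilities: service type plus the data consumed and produced."""
    service_type: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


@dataclass
class Actor:
    """A concrete web service, tool or user that may fulfil a role."""
    name: str
    service_type: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    available: bool
    invoke: Callable[..., object]

    def fulfils(self, role: Role) -> bool:
        return (
            self.service_type == role.service_type
            and self.inputs == role.inputs
            and self.outputs == role.outputs
        )


def bind(role: Role, registry: List[Actor]) -> Actor:
    """Choose an actor at run time: the first available one that fulfils the role."""
    for actor in registry:
        if actor.available and actor.fulfils(role):
            return actor
    raise LookupError(f"no available actor for role {role.service_type!r}")


# The workflow only names the role; the two services below are made up.
alignment = Role("alignment", ("sequence", "sequence"), ("alignment",))
registry = [
    Actor("service_a", "alignment", ("sequence", "sequence"), ("alignment",),
          available=False, invoke=lambda a, b: f"A({a},{b})"),
    Actor("service_b", "alignment", ("sequence", "sequence"), ("alignment",),
          available=True, invoke=lambda a, b: f"B({a},{b})"),
]

chosen = bind(alignment, registry)
print(chosen.name, chosen.invoke("ACGT", "ACGA"))  # service_b B(ACGT,ACGA)
```

Because the workflow refers to the role only, an unavailable service leads to a different binding instead of a change to the workflow.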

11
Relations between perspectives (I)
  • A central workflow specification is shared and
    edited by the different perspectives
  • Changes in one perspective are propagated to the
    other perspectives, wherever applicable
  • Visual effects to ease switching between
    perspectives: task positions and sizes, zoom
    level
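
A shared specification with change propagation resembles the observer pattern. The sketch below is an assumption about how such a design could look, with hypothetical class and method names; it is not e-BioFlow's code. Each perspective subscribes to the central specification and is notified of every change, and task positions are stored centrally so that every perspective reads the same layout.

```python
from typing import Callable, Dict, List, Tuple


class WorkflowSpec:
    """Central specification shared by all perspectives (hypothetical model)."""

    def __init__(self) -> None:
        self.tasks: Dict[str, Tuple[int, int]] = {}  # task name -> (x, y) position
        self._listeners: List[Callable[[str, str], None]] = []

    def subscribe(self, listener: Callable[[str, str], None]) -> None:
        self._listeners.append(listener)

    def add_task(self, name: str, position: Tuple[int, int]) -> None:
        self.tasks[name] = position
        self._notify("task_added", name)

    def _notify(self, event: str, task: str) -> None:
        for listener in self._listeners:
            listener(event, task)


class Perspective:
    """A view on the shared specification; it records the changes it is told about."""

    def __init__(self, label: str, spec: WorkflowSpec) -> None:
        self.label = label
        self.spec = spec
        self.log: List[Tuple[str, str]] = []
        spec.subscribe(self.on_change)

    def on_change(self, event: str, task: str) -> None:
        # A real view would redraw here; the sketch only records the event.
        self.log.append((event, task))

    def position_of(self, task: str) -> Tuple[int, int]:
        # Positions come from the shared specification, not from the view.
        return self.spec.tasks[task]


spec = WorkflowSpec()
control = Perspective("control flow", spec)
data = Perspective("data flow", spec)

spec.add_task("blast", (120, 80))
print(control.log, data.log)
print(control.position_of("blast") == data.position_of("blast"))  # True
```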

12
Relations between perspectives (II)
  • If data is transferred between tasks, this
    implies the existence of a dependency between
    these tasks

[Figure: two examples. In one, a task requires information from one of the prior tasks; in the other, a task requires information from both prior tasks.]
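
The rule that a data transfer implies a dependency can be written down directly: every pipe from one task to another makes the consuming task depend on the producing task. The pipe tuples and task names below are hypothetical, and the sketch only derives direct dependencies.

```python
from collections import defaultdict

# Hypothetical pipes: (producing task, output port, consuming task, input port)
pipes = [
    ("fetch_a", "sequence", "align", "first"),
    ("fetch_b", "sequence", "align", "second"),
    ("align", "alignment", "report", "alignment"),
]


def implied_dependencies(pipes):
    """Map each consuming task to the set of tasks it receives data from."""
    deps = defaultdict(set)
    for src, _out_port, dst, _in_port in pipes:
        deps[dst].add(src)
    return dict(deps)


deps = implied_dependencies(pipes)
print(deps["align"] == {"fetch_a", "fetch_b"})  # True: align needs both prior tasks
print(deps["report"] == {"align"})              # True
```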

13
Relations between perspectives (III)
  • If a task requires input and output, this puts
    constraints on the suitability of actors, and
    vice versa

[Figure: an alignment task requires two input sequences; a BLAST task requires only one input.]

14
Conclusion
  • e-BioFlow enables one to create advanced control
    structures
  • By abstracting from services, it is possible to
    design experiment templates
  • Providing different perspectives limits the
    amount of information presented to the user at
    a time
  • However, an execution environment is required to
    test the usability of this approach

15
Current state
  • Integration of a workflow engine
  • A control flow workflow engine: YAWL
    (www.yawlfoundation.org)
  • Late binding of services
  • Support for different types of services
  • Support for WSDL/SOAP and BioMOBY services
  • Support for scripting tasks (R, Perl, BeanShell)
  • User interaction tasks
  • New types of services can easily be created using
    a plugin structure
  • Framework for storing provenance data
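
A plugin structure for service types could look roughly like the registry below. Everything here is hypothetical (the registry, the decorator and the two example service types) and is written in Python for brevity; the slides do not show e-BioFlow's actual plugin interface.

```python
from typing import Callable, Dict

# Registry of service types; its name and structure are assumptions of this sketch.
SERVICE_TYPES: Dict[str, type] = {}


def service_type(name: str) -> Callable[[type], type]:
    """Class decorator that registers a new service type under the given name."""
    def register(cls: type) -> type:
        SERVICE_TYPES[name] = cls
        return cls
    return register


@service_type("user_interaction")
class UserInteraction:
    """Stand-in for a task that asks the user; the answer is supplied up front."""

    def __init__(self, answer: str) -> None:
        self.answer = answer

    def run(self, question: str) -> str:
        return self.answer


@service_type("function")
class FunctionService:
    """Wraps a plain callable, standing in for a script or a web service call."""

    def __init__(self, func: Callable[..., object]) -> None:
        self.func = func

    def run(self, *args: object) -> object:
        return self.func(*args)


def create_service(type_name: str, *args: object):
    """Look up a registered service type by name and instantiate it."""
    return SERVICE_TYPES[type_name](*args)


reverse = create_service("function", lambda seq: seq[::-1])
confirm = create_service("user_interaction", "yes")
print(reverse.run("ACGT"))                    # TGCA
print(confirm.run("Continue with 12 hits?"))  # yes
```

Adding a new kind of service then means registering one more class, without touching the code that creates services.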

16
Future work
  • Creating workflows ad-hoc
  • Directly execute tasks during workflow
    construction
  • Redo steps, take alternative steps
  • Store and browse provenance data
  • Provide a user interface closely related to the
    workflow model
  • Improve mapping between actors and roles
  • Mapping between different ontology structures
    is required

17
Acknowledgement
  • Pieter Neerincx (Laboratory of Bioinformatics,
    Wageningen University)
  • Wim de Leeuw (MAD, University of Amsterdam)
  • Matthijs Ooms (HMI, University of Twente)
  • This work was part of the BioRange programme of
    the Netherlands Bioinformatics Centre (NBIC),
    which is supported by a BSIK grant through the
    Netherlands Genomics Initiative (NGI).

18
Thanks
  • Questions, advice, remarks, ideas?
  • More information: e-BioFlow is open source,
    http://ewi.utwente.nl/biorange/ebioflow