Title: Workflows within Taverna
1Workflows within Taverna
- Stuart Owen
- University of Manchester, UK
- stuart.owen_at_manchester.ac.uk
2What is a workflow?
- Origins stem from the business world of the 1970s.
- Coordinate units of work and the flow of documents according to some procedural rules, to describe and carry out a complex process within an organisation.
- Adopted within the scientific world over the past decade.
- Coordinate a series of computational tasks according to some procedural rules, to describe and execute a complex process within an experiment.
3What is a workflow
- Data workflows
- A task is invoked once its expected data has been received, and when complete passes any resulting data downstream.
- B starts when it receives data from A.
- C and D run in parallel when they receive data from B.
- E starts once it has received data from both C and D.
- Control workflows
- A task is invoked once the tasks it depends on have completed.
- B starts when A has completed.
- C and D run in parallel once B has completed.
- E starts once both C and D have completed.
[Diagram: example workflow with tasks A, B, C, D, E and F]
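The data-driven ordering described above can be sketched in Python. This is only an illustration under stated assumptions, not how Taverna's enactor is implemented: the function `run_dataflow`, the dictionary layout and the example task bodies are all made up for this sketch, and task F from the diagram is left out because the text does not say how it is connected.

```python
def run_dataflow(tasks, links, initial):
    """Fire each task once all of its upstream data has arrived.

    tasks:   name -> function taking a list of input values (hypothetical layout)
    links:   name -> list of upstream task names that feed it
    initial: value given to any task that has no upstream links
    Returns (order in which tasks fired, name -> output value).
    """
    outputs, order = {}, []
    waiting = list(tasks)
    while waiting:
        # A task is ready when every upstream task has produced its data.
        ready = [t for t in waiting
                 if all(u in outputs for u in links.get(t, []))]
        if not ready:
            raise ValueError("tasks can never receive their data: %s" % waiting)
        # Tasks that become ready in the same round (C and D here) are
        # independent, so a real engine could run them in parallel.
        for name in ready:
            inputs = [outputs[u] for u in links.get(name, [])] or [initial]
            outputs[name] = tasks[name](inputs)
            order.append(name)
        waiting = [t for t in waiting if t not in outputs]
    return order, outputs


# Example wiring: A -> B -> (C, D) -> E, with arbitrary arithmetic as the "work".
tasks = {
    "A": lambda xs: xs[0] + 1,
    "B": lambda xs: xs[0] * 2,
    "C": lambda xs: xs[0] + 10,
    "D": lambda xs: xs[0] + 100,
    "E": lambda xs: sum(xs),
}
links = {"B": ["A"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}
order, outputs = run_dataflow(tasks, links, initial=1)
```

In a control workflow the same ordering would result, but the trigger would be completion of the upstream task rather than arrival of its data.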
4Advantages of workflows
5Advantages to workflows
- High-level abstraction
- Easier to understand and modify.
- Easier to describe and discuss with others.
- Describes what you want to do, not how to do it.
- Automation
- Sharing and re-use
- Either on its own, or within other workflows!
6Workflows within Taverna
- A hybrid between data and control workflows.
- Predominantly based around the flow of data.
- Service oriented workflows. Services may or may not be grid enabled.
- High-level GUI approach separated from lower level coding; you don't have to be a coder to build a workflow.
- Enactment can take place separately from the GUI, allowing workflows to be executed from the command line or within other systems.
8Taverna 1.4 Workbench
- Integral part of the myGrid project
- Java based, runs on Windows, Mac OS, Linux, Solaris.
- Open source and user driven development
- 1000 downloads of current version over past month
- Over 3000 downloads of version 1.3.1
- Over 10000 downloads in total
- http://taverna.sourceforge.net
- Taverna in OMII-UK
- Dedicated team of developers focused on design, implementation, testing and support leading to production quality software.
- Development of Taverna 2.0
9Taverna 1.4 workbench
11SCUFL
Scufl (Simple Conceptual Unified Flow Language)
[Diagram: layered architecture, from top to bottom]
- Taverna Workbench
- Application data flow layer: Scufl graph, service introspection
- Scufl Workflow Object Model
- Workflow Execution
- Execution flow layer: list management, implicit iteration mechanism, MIME and semantic type decoration, fault management, service alternates
- Freefluo Workflow enactor
- Enactor
- Processor invocation layer
- Processors: BioMoby, plain web service, Soaplab, local application, and a further unspecified type (shown as "?")
12Taverna Processor
- Primary component of a Scufl workflow.
- Represents a unit of work: a task.
- Data flows between processors.
- Most are associated with some sort of external resource, for example a WSDL based web service.
- Also includes basic local widgets, most commonly used for data format transformations (shims).
- Follow a standard architecture pattern and are extendable plugins: you can create your own for your specific needs. (But the plugin then needs to be shared in order to share your workflow.)
13Nested workflows
- A processor can be a workflow itself.
- Encourages the reuse of workflows within a more complex scenario.
- Greater abstraction of an overall process, making it more manageable.
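The idea that a workflow can itself act as a processor can be sketched as follows. This is a simplified illustration, not Scufl: it assumes each processor is a plain Python callable with one input and one output, and the helper name `make_workflow` is hypothetical.

```python
def make_workflow(*processors):
    """Chain processors into a workflow that is itself usable as a processor."""
    def workflow(data):
        # Pass the data through each processor in turn.
        for processor in processors:
            data = processor(data)
        return data
    return workflow


# An inner workflow built from two simple processors...
clean = make_workflow(str.strip, str.lower)
# ...is reused as a single processor inside a larger workflow.
tokenise = make_workflow(clean, str.split)
```

Because `clean` has the same shape as any other processor, the outer workflow does not need to know that it is nested.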
15Iterations
- Scufl handles iterations implicitly
- i.e. Taverna handles it automagically; there's no need for the user to indicate that an iteration is required.
- Taverna recognises the data mismatch and repeatedly runs the task over each data element in the list.
- Iteration strategy with multiple inputs can be configured.
- Cross product: all against all
- Dot product: first against first, second against second, etc.
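The two iteration strategies can be illustrated with a short Python sketch. The function names are assumptions made for this example and are not Taverna's API. Note also that `zip` silently stops at the shortest list; how Taverna treats lists of unequal length is not stated here.

```python
from itertools import product


def cross_product(*input_lists):
    """All against all: every combination of one element from each input."""
    return list(product(*input_lists))


def dot_product(*input_lists):
    """First against first, second against second, and so on."""
    # Assumption: stop at the shortest list, as Python's zip does.
    return list(zip(*input_lists))


def iterate(task, strategy, *input_lists):
    """Run a single-item task once per combination picked by the strategy."""
    return [task(*args) for args in strategy(*input_lists)]


def join(a, b):
    """A task written for single items, unaware of any iteration."""
    return a + b


crossed = iterate(join, cross_product, ["x", "y"], ["1", "2"])
dotted = iterate(join, dot_product, ["x", "y"], ["1", "2"])
```

Here `crossed` holds four results and `dotted` holds two, while the task itself only ever sees single items.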
16What about when a service fails?
- Most services are owned by other people
- No control over service failure
- Some are research level
- Workflows are only as good as the services they connect!
- To help, Taverna can
- Notify failures
- Instigate retries
- Set criticality
- Substitute alternative services
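A minimal sketch of how notification, retries, criticality and alternates could fit together is shown below. It is an assumption-laden illustration, not Taverna's fault-handling code: the function `invoke_with_fallback` and its parameters are hypothetical, and services are modelled as plain Python callables.

```python
def invoke_with_fallback(services, data, retries=2, critical=True, notify=print):
    """Try the primary service, then each alternate, retrying each one.

    services: callables, primary first, then alternates (hypothetical model)
    retries:  extra attempts per service after the first failure
    critical: if True, total failure raises; if False, None is returned
    notify:   called with a message for every failed attempt
    """
    for service in services:
        for attempt in range(1, retries + 2):
            try:
                return service(data)
            except Exception as error:
                name = getattr(service, "__name__", "service")
                notify("%s failed (attempt %d): %s" % (name, attempt, error))
    if critical:
        raise RuntimeError("all services failed")
    return None
```

With `critical=False` a failed step yields no result but lets the rest of the workflow continue, which is one way to read "set criticality".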
17Taverna Processor Task State Transition Diagram
18Provenance Data?
- Supports scientific method and best practice
- Metadata about the origin of a resource (workflow, service, data, experiment hypothesis, etc.) and the process of how a resource was generated.
- The Who?, What?, When?, Where? and Why? about resources.
- Stored as RDF triples
- Also available as OWL, opening it up to complex reasoning
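To make "stored as RDF triples" concrete, the sketch below models triples as plain (subject, predicate, object) tuples and answers simple who/what questions by pattern matching. The `ex:` identifiers are invented for illustration, the direction of each property is an assumption, and no real RDF library or Taverna API is used.

```python
# Each triple is (subject, predicate, object). All identifiers are made up.
triples = [
    ("ex:run1", "launchedBy", "ex:alice"),
    ("ex:alice", "belongsTo", "ex:lab"),
    ("ex:run1", "runs", "ex:workflow1"),
    ("ex:run1", "executed", "ex:step1"),
    ("ex:run1", "executed", "ex:step2"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that match the given fields; None is a wildcard."""
    return [t for t in triples
            if subject in (None, t[0])
            and predicate in (None, t[1])
            and obj in (None, t[2])]


# Who launched the run?
who = [o for _, _, o in match(triples, "ex:run1", "launchedBy")]
# What did the run execute?
steps = [o for _, _, o in match(triples, "ex:run1", "executed")]
```

A real triple store adds namespaces, typing and reasoning on top, but the underlying shape of the data is the same.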
19Typed Workflow Run
[Diagram: provenance ontology classes Experimenter, Organization, ProcessRun, WorkflowRun and Workflow, linked by the properties launchedBy, executed, belongsTo and runs. Example instances are identified by LSIDs: urnlsidworkflow6, urnlsidorgHY7, urnlsid..wfInstance8, urnlsidperson4, urnlsidprocessRun84 and urnlsidprocessRun51.]
20Provenance Browser
21New plans for Taverna 2.0
22Evolving challenges
- Long running data intensive workflows
- Manipulation of confidential or otherwise protected information
- Use with classical grid systems
- Publishing and sharing of workflows
- Better use of provenance
23Runtime Service Binding
- Service definition consists of an abstract description
- Resolved at workflow runtime to one or more concrete resources by a broker
- Allows load balancing or economic model based service selection over grid environments
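The broker step can be sketched as a lookup from an abstract service name to whichever concrete endpoint currently looks cheapest. Everything here is hypothetical: the `bind` function, the registry layout, the endpoint names and the use of a simple load figure as the selection criterion are assumptions for illustration, not the Taverna 2.0 design.

```python
def bind(abstract_name, registry, load):
    """Resolve an abstract service name to the least-loaded concrete endpoint.

    registry: abstract name -> list of concrete endpoints (hypothetical)
    load:     endpoint -> current load; unknown endpoints count as idle
    """
    candidates = registry.get(abstract_name, [])
    if not candidates:
        raise LookupError("no concrete service for %r" % abstract_name)
    return min(candidates, key=lambda endpoint: load.get(endpoint, 0))


# Made-up registry and load figures.
registry = {"sequence-alignment": ["node-a", "node-b", "node-c"]}
load = {"node-a": 0.9, "node-b": 0.2, "node-c": 0.5}
chosen = bind("sequence-alignment", registry, load)
```

Swapping the `key` function for a price lookup would give the economic-model selection mentioned above, since the workflow only ever names the abstract service.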
24Processor Dispatch Stack
25Third-party data transfers
- Allows in place referencing of data
- Large data sets no longer round-trip between workflow engine and data provider
- Allows restricted access to sensitive data
- Automatic de-reference when a reference type is linked to a value type within a workflow.
- Connecting a grid service to a web service
26Streaming Data
- Allow execution of downstream workflow stages on partially complete results from upstream.
[Diagram: Service 1, Service 2 and Service 3 in sequence]
- Non-streaming (Taverna 1): the entire iteration must complete at each stage.
- Streamed data: Service 2 starts operating on partial results from Service 1.
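The difference between the two modes can be shown with Python generators. This is an analogy only, assuming two made-up services rather than the three in the diagram; it says nothing about how Taverna 2.0 implements streaming. The shared log records the order in which items are processed.

```python
def service1(items, log):
    """Hypothetical upstream service: doubles each item as it arrives."""
    for item in items:
        log.append("s1:%s" % item)
        yield item * 2


def service2(items, log):
    """Hypothetical downstream service: adds one to each item as it arrives."""
    for item in items:
        log.append("s2:%s" % item)
        yield item + 1


def run_staged(items):
    """Non-streaming: service1 finishes the whole list before service2 starts."""
    log = []
    stage1 = list(service1(items, log))
    results = list(service2(stage1, log))
    return results, log


def run_streamed(items):
    """Streaming: service2 consumes each partial result as soon as it exists."""
    log = []
    results = list(service2(service1(items, log), log))
    return results, log
```

Both functions return the same results; only the interleaving in the log differs, with the streamed run alternating between the two services.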
27Conclusions
- Taverna and its source code are free to download.
- http://taverna.sourceforge.net
- Taverna is being adopted by a number of different disciplines outside its bio-science origins, including chemoinformatics, social science and astronomy.
- Open architecture and support for plugins to cope with an open world allows expansion into other areas
- User driven development
- Taverna users mailing list
- Taverna hackers mailing list
- Production quality software within OMII-UK
28Acknowledgements
- The myGrid group, past and present.
- OMII-UK
- All our users
- Carole Goble
- Katy Wolstencroft
- Daniele Turi
- Matthew Gamble