Title: Taverna and BioMOBY
1Taverna and BioMOBY
- Savita Shrivastava
- Lab presentation
- September 16th, 2005
2A Life Science Scenario
[Diagram labels: GeneID, Over-expressed genes, DNA Sequence, Microarray data, Protein Sequence]
- "To answer most interesting biological problems, you need to combine data from many data sources."
3A Life Science Scenario contd.
- Repeat the discovery process
- The same chain of services to be applied to different data sets
- Publish the methodology
- Preserve as much information as possible about how services were used
- Trace the data to support a conclusion
4The Problem
- Finding the available resources on the web
- What and who?
- How to use it?
- What are the input and output data types?
- How to create an analysis pipeline (take the
output from one analysis and send it to another
resource/website for further analysis)?
5The Problem contd.
- Diversity and complexity of information
- Data-intensive
- Compute-intensive
- Diversity and autonomy of the biological community
- Currently 5000 public data repositories and applications
- Different formats, interfaces, structures, and coverage
- Reliability
6The Problem contd.
Code for translation of a DNA sequence
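The code shown on this slide is not preserved in the transcript. As a stand-in, here is a minimal Python sketch of what a hand-written translation step might look like, assuming the standard genetic code and reading frame 1. The function name `translate` and the use of `X` for unrecognized codons are choices made for this illustration only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in reading frame 1; '*' marks a stop codon."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    protein = []
    for i in range(0, usable, 3):
        # 'X' stands in for any codon not in the table (e.g. containing N)
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


print(translate("ATGGCCTAA"))
```

Even this small step needs its own code, which is the point of the slide: each further analysis adds another script to write and maintain.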
7The Problem contd.
Code for getting Secondary Structure
8The Problem contd.
Code for Signal peptide and Transmembrane region
prediction
9The Problem contd.
Code for creating analysis pipeline
10The Solution - Workflow Techniques
- What is a workflow?
- A set of components/services and the relations between them
- Defines a complex process from simple building blocks
- Transfers information from the output of one component to the input of another
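The definition above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Taverna represents workflows: the helper `run_workflow` and the three toy components are hypothetical names invented for the example.

```python
from typing import Any, Callable, List, Tuple

Component = Callable[[Any], Any]


def run_workflow(steps: List[Tuple[str, Component]], data: Any) -> Any:
    """Feed the output of each component into the input of the next one."""
    for _name, component in steps:
        data = component(data)
    return data


# Three simple building blocks (hypothetical components for illustration).
def clean(seq: str) -> str:
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)


steps = [
    ("clean", clean),
    ("reverse_complement", reverse_complement),
    ("gc_fraction", gc_fraction),
]
result = run_workflow(steps, "atg c\n")
```

The same `steps` list can be rerun unchanged on a different input, which is what makes the process repeatable.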
11Terminology
- What is a Component?
- Any command line analysis tool or
- A script written in any programming language
- Consumes an input and produces an output
- What are Services?
- All services are also components
- Components hosted on a remote machine and accessible through the internet
12Advantage of Workflow Technology
- Analysis processes in a structured, repeatable and verifiable way
- Applications are centralized to a common interface
- Automate high-throughput data analysis to accelerate the pace and reduce overall costs
13Advantage of Workflow Technology
- Users are freed from component management
- No high-performance computing hardware needed
- Provenance
- Who, what, where, why, when, how?
- The traceability of knowledge as it evolves and as it is derived
- Implications for recording which services were invoked on what data, when, and with what parameters
- No programming or command-line work, just mouse clicks and drag and drop
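To make the provenance point concrete, here is a small Python sketch of recording which service was invoked, on what data, when, and with what parameters. It is an assumption-laden illustration, not Taverna's provenance format: `invoke`, `provenance_log`, and the toy service `subsequence` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional

# One record per service invocation (hypothetical in-memory log).
provenance_log: List[Dict[str, Any]] = []


def invoke(service: Callable[..., Any], data: Any, **params: Any) -> Any:
    """Call a service and record what was run, on what, when, and how."""
    started = datetime.now(timezone.utc)
    result = service(data, **params)
    provenance_log.append({
        "service": service.__name__,
        "input": data,
        "parameters": params,
        "started": started.isoformat(),
        "output": result,
    })
    return result


def subsequence(seq: str, start: int = 0, end: Optional[int] = None) -> str:
    """Toy service: return a slice of the sequence."""
    return seq[start:end]


region = invoke(subsequence, "ATGGCCTAA", start=0, end=6)
```

With every call routed through `invoke`, a result can be traced back to the inputs and parameters that produced it.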
14Taverna Workbench
- Designed for biologists
- Allows users to construct complex analysis workflows
- Components can be located on both remote and local machines
- Implicit iteration support
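Implicit iteration, as it is commonly described, means that a component written for a single item is applied to each element automatically when it receives a list. The Python sketch below illustrates that idea only; it is a simplification and not Taverna's implementation, and `call_with_implicit_iteration` is a hypothetical helper name.

```python
from typing import Any, Callable


def call_with_implicit_iteration(component: Callable[[Any], Any], value: Any) -> Any:
    """Apply a single-item component to one item, or to each item of a (nested) list."""
    if isinstance(value, (list, tuple)):
        return [call_with_implicit_iteration(component, item) for item in value]
    return component(value)


# A component that expects one sequence...
single = call_with_implicit_iteration(len, "ATG")
# ...is mapped over a list of sequences without any explicit loop in the workflow.
many = call_with_implicit_iteration(len, ["ATG", "ATGGCC"])
```

The workflow author connects a list output to a single-item input and the looping is handled for them.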
15Taverna Workbench
- Result browsing and data encapsulation
- Provenance recording
- Fault tolerance features
16Availability
- http://taverna.sourceforge.net/
- Available for Windows, Linux, Mac and Sun
- Download and unzip the file
- Requires Java 1.4 or above to run
- A tutorial is available on the website
17Workbench Window
18Advanced Model Explorer Window
19Available Services
20Workflow Diagram
21Run Workflow
22Enactor Invocation
23Enactor Invocation-Result
24Enactor Invocation-Report
25BioMOBY Services in Taverna
- http://www.biomoby.org
- >230 services registered and growing!
- 70 independent service providers
- List and details of all the MOBY services:
- http://mobycentral.icapture.ubc.ca/cgi-bin/list.services.cgi
26What is BioMOBY?
27- Information discovery system
- Discovers and retrieves related pieces of data from multiple hosts and services
- Provides a common format for representing the retrieved data, regardless of its origin
28Three Components of MOBY
- MOBY-Central
- MOBY-Servers
- MOBY-Clients
29MOBY-Central (A central registry of the services)
- Responsible for cataloguing the services provided by all MOBY-Servers
- Responds to queries about these services from MOBY-Clients
- MOBY-Central stores only the information about the services available, and the URL of the service specification
30MOBY-Servers (Provide one or more services)
- Create a service, e.g. database lookup and retrieval, sequence alignment, BLAST homology search, etc.
- MOBY-Servers register themselves with MOBY-Central as providing service X using input(s) Y and generating output(s) Z
31MOBY-Clients (The end user's computer)
- Identifies the type of data
- Generates the query to MOBY-Central
- Receives the list of potential services and returned data types
- Selects the desired service
- Initiates contact with the service, submits a properly formatted query, and receives/interprets the service response
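The interplay of the three components can be sketched as a toy in-memory registry in Python. This is not the BioMOBY API: `ServiceRecord`, `Registry`, `find_by_input`, the service names, and the example.org URLs are all hypothetical, chosen only to mirror "service X using input Y and generating output Z".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServiceRecord:
    """What a server tells the registry about one service (hypothetical fields)."""
    name: str
    service_type: str
    input_type: str
    output_type: str
    provider_uri: str
    url: str
    description: str = ""


class Registry:
    """Stands in for MOBY-Central: stores descriptions only, never the services."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceRecord] = {}

    def register(self, record: ServiceRecord) -> None:
        self._services[record.name] = record

    def find_by_input(self, input_type: str) -> List[ServiceRecord]:
        return [r for r in self._services.values() if r.input_type == input_type]


central = Registry()
# Server side: register service X with input Y and output Z.
central.register(ServiceRecord(
    "translateDNA", "Analysis", "DNASequence", "GenericSequence",
    "example.org", "http://example.org/cgi-bin/translate"))
central.register(ServiceRecord(
    "keywordLookup", "Retrieval", "Keyword", "FASTA",
    "example.org", "http://example.org/cgi-bin/lookup"))

# Client side: "I hold a DNASequence; which services accept it?"
candidates = central.find_by_input("DNASequence")
```

The client would then pick one candidate and contact its `url` directly; the registry is not involved in the actual service call.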
33MOBY uses ontology
- Object ontology: types of input/output objects
- Service ontology: types of services
- - Analysis
- - Parsing
- - Registration
- - Retrieval
- - Resolution
- Service relationships
- Service <-- Analysis <-- Alignment <-- BLAST <-- WU BLAST, NCBI BLAST
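The service relationships above form an is-a hierarchy, which can be sketched in Python as a child-to-parent map. Placing Parsing, Registration, Retrieval, and Resolution directly under Service is an assumption made for this example, and `ancestors` and `is_a` are hypothetical helper names rather than MOBY calls.

```python
from typing import Dict, List

# Child -> parent links, read from "Service <-- Analysis <-- Alignment <-- BLAST <-- ...".
PARENT: Dict[str, str] = {
    "Analysis": "Service",
    "Parsing": "Service",        # assumed to sit directly under Service
    "Registration": "Service",   # assumed
    "Retrieval": "Service",      # assumed
    "Resolution": "Service",     # assumed
    "Alignment": "Analysis",
    "BLAST": "Alignment",
    "WU BLAST": "BLAST",
    "NCBI BLAST": "BLAST",
}


def ancestors(service_type: str) -> List[str]:
    """Return the chain of parent types, nearest first."""
    chain = []
    while service_type in PARENT:
        service_type = PARENT[service_type]
        chain.append(service_type)
    return chain


def is_a(service_type: str, ancestor: str) -> bool:
    """True if service_type equals ancestor or descends from it."""
    return service_type == ancestor or ancestor in ancestors(service_type)
```

With such a hierarchy, a search for "Alignment" services can also return services registered under the more specific BLAST types.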
34What does MOBY provide?
- A library that makes it easy to write new web services
- A library of calls to discover existing services according to:
- The name of the service
- What input is required
- What output is produced
- Keywords in the description of the service
- Who wrote the service
- Calls to register new data types; existing types include:
- GenericSequence
- DNASequence
- Keyword
- FASTA
- Calls to register new services
- You do not need to install your own MOBY-Central
35How the service providers register with MOBY Central
- A service name
- A service type, by name from the Service-type hierarchy
- Input/Output object(s), by name from the Object-type hierarchy
- A URI identifying the service provider
- The URL to the service script
- A human-readable description of the service
- There are many tools available on the BioMOBY homepage to create services
36Summary of MOBY's capabilities
- Easy to write web services using the MOBY library
- Services can be written by anyone, anywhere
- Anyone can register their services with MOBY Central
- Anyone can query MOBY Central to discover and learn about any service
37MOBY clients
- Gbrowse_moby
- http://mobycentral.icapture.ubc.ca/cgi-bin/gbrowse_moby
- Taverna
- http://taverna.sourceforge.net/
38A Simple Workflow