Title: Provenance in Taverna


1
Provenance in Taverna
  • Daniele Turi
  • University of Manchester
  • Chimatica Meeting, Manchester, 24/3/06

2
Components
  • Identifiers: LSIDs
  • Data: JDBC data store
  • Metadata: RDF Provenance Plugin
  • Browsing: Provenance Browser Plugin
  • Security: under development

3
LSID
4
LSID: Life Science Identifier
  • URN specification in progress
  • 5 part identifier (with optional version id)
  • urn:lsid:www.mygrid.org.uk:lsdocument:X1234
  • urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:7717376
  • protocol for retrieving data and metadata about
    an object
  • commitment by the provider to always return the
    same data for an ID
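
The five-part structure above can be illustrated with a small parser. This is a minimal sketch, assuming the usual colon-separated LSID layout (urn, lsid, authority, namespace, object, optional revision); the `LSID` type and `parse_lsid` function are hypothetical names, not part of Taverna or any LSID library.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Hypothetical container for the parts that follow the 'urn:lsid:' prefix."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its parts; raise ValueError if malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    urn, scheme, authority, namespace, object_id = parts[:5]
    if urn.lower() != "urn" or scheme.lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if not (authority and namespace and object_id):
        raise ValueError(f"empty component in: {text!r}")
    # The sixth part, when present, is the optional version (revision) id.
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


if __name__ == "__main__":
    print(parse_lsid("urn:lsid:www.mygrid.org.uk:lsdocument:X1234"))
```

Keeping the revision as a separate optional field fits the provider's commitment to always return the same data for an ID: changed data would get a new revision or a new identifier rather than replacing the old one.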

5
LSID (ctd)
  • Issue: LSID Authorities
  • Resolution: LSID Resolvers
  • abstract, lightweight
  • independent from actual storage implementation
  • database, file system, application
  • both for private and public data sources
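
The idea of a resolver that is independent of the storage behind it can be sketched as an abstract interface with interchangeable back ends. Everything named here (`LSIDResolver`, `InMemoryResolver`, `resolve`) is a hypothetical illustration, not the actual LSID resolution protocol or any Taverna class.

```python
from abc import ABC, abstractmethod
from typing import Dict


class LSIDResolver(ABC):
    """Hypothetical resolver interface: callers never see the storage behind it."""

    @abstractmethod
    def get_data(self, lsid: str) -> bytes:
        ...

    @abstractmethod
    def get_metadata(self, lsid: str) -> Dict[str, str]:
        ...


class InMemoryResolver(LSIDResolver):
    """One possible back end; a database or file system could replace it."""

    def __init__(self, data: Dict[str, bytes], metadata: Dict[str, Dict[str, str]]):
        self._data = dict(data)
        self._metadata = dict(metadata)

    def get_data(self, lsid: str) -> bytes:
        return self._data[lsid]

    def get_metadata(self, lsid: str) -> Dict[str, str]:
        return dict(self._metadata.get(lsid, {}))


def authority_of(lsid: str) -> str:
    """Return the authority part, assuming the form urn:lsid:authority:..."""
    return lsid.split(":")[2]


def resolve(lsid: str, resolvers: Dict[str, LSIDResolver]) -> bytes:
    """Pick the resolver registered for the LSID's authority and fetch the data."""
    return resolvers[authority_of(lsid)].get_data(lsid)
```

Because callers depend only on the interface, the same lookup code would work for private and public data sources.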

6
Data
7
Data Storage
  • Taverna can persist inputs, outputs and
    intermediate results in an SQL database via JDBC
  • Optional: enabled by configuring a Baclava
    Data Store
  • Allows the LSIDs of data items to be resolved
    against the actual data
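
A rough sketch of how persisted data items might be keyed by LSID so that an identifier can later be resolved to the stored value. It uses Python's built-in `sqlite3` as a stand-in for a JDBC-accessed SQL database; the table and column names are assumptions and do not reflect the real Baclava Data Store schema.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) data_item table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data_item ("
        " lsid TEXT PRIMARY KEY,"
        " role TEXT NOT NULL,"      # 'input', 'output' or 'intermediate'
        " content BLOB NOT NULL)"
    )
    return conn


def store_item(conn: sqlite3.Connection, lsid: str, role: str, content: bytes) -> None:
    """Persist one data item; the primary key rejects a second value for an LSID."""
    with conn:
        conn.execute(
            "INSERT INTO data_item (lsid, role, content) VALUES (?, ?, ?)",
            (lsid, role, content),
        )


def resolve_data(conn: sqlite3.Connection, lsid: str) -> bytes:
    """Resolve an LSID against the stored data; raise KeyError if unknown."""
    row = conn.execute(
        "SELECT content FROM data_item WHERE lsid = ?", (lsid,)
    ).fetchone()
    if row is None:
        raise KeyError(lsid)
    return row[0]
```

Making the LSID the primary key means a stored item cannot be silently overwritten, so the same ID keeps resolving to the same data.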

8
Metadata
9
Metadata Generation
  • Taverna Provenance Plugin
  • Listen to Taverna Events
  • WorkflowEventListener
  • Faithfully record them as ontological instance
    data
  • RDF graphs (one for each Taverna run)
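
The listener idea can be sketched as follows: each workflow event is turned into one or more triples, and the triples are collected in one graph per run. The class, its method names and the direction of each property are assumptions made for illustration; they are not the real WorkflowEventListener interface.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


class ProvenanceRecorder:
    """Hypothetical event listener that records events as triples, one graph per run."""

    def __init__(self) -> None:
        # Maps a run identifier to the set of triples recorded for that run.
        self.graphs: Dict[str, Set[Triple]] = defaultdict(set)

    def workflow_started(self, run_id: str, workflow_id: str, user_id: str) -> None:
        graph = self.graphs[run_id]
        graph.add((run_id, "rdf:type", "WorkflowRun"))
        graph.add((run_id, "runs", workflow_id))       # assumed direction
        graph.add((run_id, "launchedBy", user_id))     # assumed direction

    def process_completed(self, run_id: str, process_run_id: str) -> None:
        graph = self.graphs[run_id]
        graph.add((process_run_id, "rdf:type", "ProcessRun"))
        graph.add((run_id, "executed", process_run_id))  # assumed direction
```

Recording every event as it arrives, without filtering, is one way to read "faithfully": decisions about what matters are deferred to query time.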

10
Metadata
  • Representation
  • Schema (Ontology)
  • Storage
  • Query

11
Representation
  • RDF
  • triples
  • subject, predicate, object
  • semantic web language
  • XML serialization
  • flexible, powerful
  • sets of triples give rise to graphs
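
To make the step from triples to graphs concrete, here is a small sketch that stores triples as plain tuples and groups them into an adjacency structure. The identifiers and the direction of each property are illustrative assumptions, and the sketch uses no RDF library.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RUN = "urn:lsid:example.org:wfInstance:8"        # hypothetical identifiers
PERSON = "urn:lsid:example.org:person:4"
PROC_A = "urn:lsid:example.org:processRun:84"
PROC_B = "urn:lsid:example.org:processRun:51"

TRIPLES: List[Triple] = [
    (RUN, "launchedBy", PERSON),
    (RUN, "executed", PROC_A),
    (RUN, "executed", PROC_B),
]


def to_graph(triples: Iterable[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group triples by subject: each subject node maps to its labelled edges."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


def objects(triples: Iterable[Triple], subject: str, predicate: str) -> List[str]:
    """Return all objects reachable from subject over edges labelled predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Each triple is one labelled edge, so a set of triples sharing nodes forms a directed, labelled graph without further structure.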

12
Workflow Run
(Diagram: RDF graph of a workflow run. Nodes: urn:lsid:workflow:6, urn:lsid:org:HY7, urn:lsid:..:wfInstance:8, urn:lsid:person:4, urn:lsid:processRun:84, urn:lsid:processRun:51. Edge labels: runs, belongsTo, launchedBy, executed (twice).)
13
Schema
  • Ontology
  • Classes and Properties
  • RDF schema
  • Taxonomic inferences
  • also available as OWL
  • opens it up to complex reasoning
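
A taxonomic inference of the kind RDF Schema supports can be sketched as a transitive walk over subclass links: an instance typed with a class also belongs to every superclass. The hierarchy below is invented for illustration (`Run` and `Event` are hypothetical superclasses, not taken from the provenance ontology).

```python
from typing import Dict, Set

# Hypothetical subclass links: class -> direct superclasses.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "WorkflowRun": {"Run"},
    "ProcessRun": {"Run"},
    "Run": {"Event"},
}


def superclasses(cls: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return every direct or indirect superclass of cls."""
    found: Set[str] = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


def inferred_types(asserted: Set[str], subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Asserted types plus all types that follow from the class hierarchy."""
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls, subclass_of)
    return result
```

With this hierarchy, a query for all instances of `Run` would also find process runs and workflow runs, although neither was explicitly typed as `Run`.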

14
(No Transcript)
15
Workflow Run
(Diagram: the same workflow-run RDF graph as on slide 12.)
16
Typed Workflow Run
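(Diagram: the workflow-run RDF graph with its nodes typed against the Provenance Ontology. Ontology classes: Experimenter, Organization, ProcessRun, WorkflowRun, Workflow. Ontology properties: launchedBy, executed, belongsTo, runs. Instance nodes: urn:lsid:workflow:6, urn:lsid:org:HY7, urn:lsid:..:wfInstance:8, urn:lsid:person:4, urn:lsid:processRun:84, urn:lsid:processRun:51. Instance edge labels: runs, belongsTo, launchedBy, executed (twice).)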
17
Storage
  • Named RDF graphs
  • retrieve whole graphs (e.g. workflows)
  • implementation in NG4J (Jena, MySQL)
  • scalability issues
  • new implementation almost ready using Sesame2
    native store
  • scalable
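
The point of named graphs is that every triple carries the name of the graph it belongs to, so a whole graph can be fetched in one step. The in-memory class below is a toy sketch of that idea only; it is not the NG4J or Sesame API, and all names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]  # (graph name, subject, predicate, object)


class NamedGraphStore:
    """Toy store: triples are grouped under the name of their graph."""

    def __init__(self) -> None:
        self._graphs: Dict[str, Set[Triple]] = {}

    def add(self, graph_name: str, subject: str, predicate: str, obj: str) -> None:
        self._graphs.setdefault(graph_name, set()).add((subject, predicate, obj))

    def graph(self, graph_name: str) -> FrozenSet[Triple]:
        """Retrieve a whole graph by name (empty if the name is unknown)."""
        return frozenset(self._graphs.get(graph_name, ()))

    def quads(self, predicate: Optional[str] = None) -> Iterator[Quad]:
        """Iterate over all graphs, optionally filtering by predicate."""
        for name, triples in self._graphs.items():
            for s, p, o in triples:
                if predicate is None or p == predicate:
                    yield (name, s, p, o)
```

Using one graph per workflow run keeps each run's provenance retrievable as a unit while still allowing queries across runs.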

18
Query
  • RDF query languages
  • TriQL, SeRQL, SPARQL
  • Ontology inspection/reasoning
  • Canned Queries
  • workflows with failed processes
  • input/output of past process runs
  • workflows with data changed by user
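
A canned query such as "workflows with failed processes" can be sketched as a fixed pattern over triples. In practice it would be written in one of the RDF query languages listed above; the plain Python below only illustrates the join involved. The `hasStatus` property, the `failed` value and the direction of `runs` and `executed` are assumptions.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def workflows_with_failed_processes(triples: Iterable[Triple]) -> List[str]:
    """Return the workflows that have at least one run with a failed process run."""
    triples = list(triples)
    # Step 1: process runs marked as failed (hypothetical 'hasStatus' property).
    failed = {s for s, p, o in triples if p == "hasStatus" and o == "failed"}
    # Step 2: workflow runs that executed one of those process runs.
    bad_runs = {s for s, p, o in triples if p == "executed" and o in failed}
    # Step 3: the workflows those runs were runs of.
    return sorted({o for s, p, o in triples if p == "runs" and s in bad_runs})


if __name__ == "__main__":
    example = [
        ("run1", "runs", "workflowA"),
        ("run1", "executed", "proc1"),
        ("proc1", "hasStatus", "failed"),
        ("run2", "runs", "workflowB"),
        ("run2", "executed", "proc2"),
        ("proc2", "hasStatus", "completed"),
    ]
    print(workflows_with_failed_processes(example))
```

Packaging the pattern as a named function mirrors the idea of a canned query: users pick a question from a fixed list instead of writing the query themselves.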

19
(No Transcript)
20
Browsing
21
Provenance Browsing
  • Provenance Browser Plugin
  • reusing Taverna GUI components
  • Matthew Gamble

22
(No Transcript)
23
Analysis
24
Provenance Analysis
  • Comparison
  • Aggregation
  • etc
  • see work by Jun Zhao

25
Security
26
(No Transcript)