Personal Data Management - PowerPoint PPT Presentation

Provided by: Chris547
Transcript and Presenter's Notes

1
Personal Data Management
  • Why is this such an issue? Data provenance
  • Representing links vs representing data
  • Identifying resources: Life Science Identifiers (LSIDs)
  • Different types of provenance
  • Provenance generation
  • Provenance storage
  • Provenance retrieval

2
Problem
  • Automated workflows produce lots of heterogeneous
    data
  • These are just some of the results from one
    workflow run for Williams Disease

3
Amplification of results
One input
Many outputs

4
Link vs Data Representation
  • Data management questions refer to relationships
    rather than internal content
  • What are the origins of this data?
  • Which service produced this data?
  • Which data is this derived from?
  • Who was this data produced for?
  • What is this data telling me?
  • Data analysis questions delegated to external
    services.
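
The relationship questions above can be answered by following typed links between items, without opening the data itself. The sketch below is only an illustration under stated assumptions: the identifiers, the link names (`derived_from`, `produced_by`, `produced_for`) and the helper functions are hypothetical, and the derivation links are assumed to contain no cycles.

```python
# Hypothetical link records: (subject, link_type, object).
# Identifiers and link names are illustrative, not taken from the slides.
LINKS = [
    ("data:report", "derived_from", "data:alignment"),
    ("data:alignment", "derived_from", "data:query_sequence"),
    ("data:alignment", "produced_by", "service:blast"),
    ("data:report", "produced_for", "person:alice"),
]


def related(links, subject, link_type):
    """Return every object linked from `subject` by `link_type`."""
    return [o for s, t, o in links if s == subject and t == link_type]


def origins(links, item):
    """Follow derived_from links back to the items that have no parent.

    Assumes the derivation links are acyclic.
    """
    parents = related(links, item, "derived_from")
    if not parents:
        return {item}
    found = set()
    for parent in parents:
        found |= origins(links, parent)
    return found


# "Which service produced this data?"
print(related(LINKS, "data:alignment", "produced_by"))
# "What are the origins of this data?"
print(origins(LINKS, "data:report"))
```

Each question becomes a lookup on one link type, which is why the link representation matters more here than the internal content of the data.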

5
Representing links
urn:lsid:taverna.sf.net:datathing:45fg6
urn:lsid:taverna.sf.net:datathing:23ty3
  • Identify each resource
  • Life Science Identifier (LSID): a URI with associated
    data and metadata retrieval protocols
  • Understanding that underlying data will not change
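
An LSID is a URN of the form `urn:lsid:authority:namespace:object`, with an optional trailing revision. As a minimal sketch, assuming the identifiers on this slide follow that form, the parts can be split out as below; the `LSID` tuple and `parse_lsid` function are hypothetical helpers written for this example, not part of Taverna or any LSID library.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of a Life Science Identifier (illustrative helper type)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


lsid = parse_lsid("urn:lsid:taverna.sf.net:datathing:45fg6")
print(lsid.authority, lsid.namespace, lsid.object_id)
```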

6
Representing links II
http://www.mygrid.org.uk/ontology#derived_from
urn:lsid:taverna.sf.net:datathing:45fg6
urn:lsid:taverna.sf.net:datathing:23ty3
  • Identify link type
  • Again use URI
  • Allows us to use RDF infrastructure
  • Repositories
  • Ontologies
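
Because both the resources and the link type are URIs, one link is simply an RDF triple. The sketch below writes such a triple in N-Triples syntax using plain string formatting. Two things are assumptions: the link-type URI is taken to contain a `#` before `derived_from`, and the direction of the link (which item is derived from which) is chosen only for illustration.

```python
# Assumed form of the link-type URI shown on the slide.
DERIVED_FROM = "http://www.mygrid.org.uk/ontology#derived_from"


def to_ntriples(subject: str, predicate: str, obj: str) -> str:
    """Serialise one URI-to-URI statement as a single N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Direction chosen for illustration only: 45fg6 derived from 23ty3.
line = to_ntriples(
    "urn:lsid:taverna.sf.net:datathing:45fg6",
    DERIVED_FROM,
    "urn:lsid:taverna.sf.net:datathing:23ty3",
)
print(line)
```

Statements in this form can be loaded into an RDF repository and combined with an ontology that defines the link types.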

7
Provenance (1)
  • Levels shown: organisation level provenance, process
    level provenance, data/knowledge level provenance
  • Diagram nodes: Person, Organisation, Project,
    Experiment design, Service, Process, Workflow design,
    Workflow run, Event, Data item
  • Link types:
  • runBy, e.g. BLAST @ NCBI
  • componentProcess, e.g. web service invocation of
    BLAST @ NCBI
  • componentEvent, e.g. completion of a web service
    invocation at 12.04pm
  • partOf
  • instanceOf
  • run for
  • knowledge statements, e.g. "similar protein sequence to"
  • data derivation, e.g. output data derived from
    input data
  • User can add templates to each workflow process
    to determine links between data items.

8
Storing management metadata
  • Automated generation of this web of links is
    preferable
  • The workflow enactor generates, as RDF:
  • LSIDs
  • Data derivation links
  • Knowledge links
  • Process links
  • Organisation links
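
As a rough sketch of what automated generation could look like, the recorder below mints an identifier for each output and records derivation and process links whenever a step runs. It is not Taverna or Freefluo code: the class, the `example.org` authority, the counter-based identifiers and the link names are all hypothetical.

```python
import itertools


class ProvenanceRecorder:
    """Toy stand-in for a workflow enactor that records links as it runs."""

    def __init__(self, authority="example.org", namespace="datathing"):
        self.authority = authority          # hypothetical LSID authority
        self.namespace = namespace
        self._counter = itertools.count(1)  # simple source of unique ids
        self.triples = []                   # (subject, link_type, object)

    def mint(self):
        """Create a new LSID-shaped identifier."""
        return f"urn:lsid:{self.authority}:{self.namespace}:{next(self._counter)}"

    def run_step(self, process_name, func, input_id, input_value):
        """Run one step and record where its output came from."""
        output_value = func(input_value)
        output_id = self.mint()
        self.triples.append((output_id, "derived_from", input_id))
        self.triples.append((output_id, "produced_by", process_name))
        return output_id, output_value


recorder = ProvenanceRecorder()
seq_id = recorder.mint()
out_id, out_value = recorder.run_step("uppercase", str.upper, seq_id, "acgt")
for triple in recorder.triples:
    print(triple)
```

The user never writes a link by hand; every link is a side effect of running the step.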
9
Provenance generation
  • Configuring and generating provenance within
    Taverna

10
Storage
  • LSID has no protocol for storage
  • Taverna/Freefluo implements its own data/metadata
    storage protocol

[Diagram: Taverna/Freefluo passes data to the Data store and metadata to the Metadata Store through a publish interface]
11
Retrieval
  • LSID protocol used to retrieve data and metadata
  • Query handled separately

[Diagram labels: LSID aware client, RDF aware client, LSID interface, Query, Metadata Store, Data store]
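
The split between storing, retrieving by identifier, and querying can be sketched with an in-memory stand-in. The names `get_data` and `get_metadata` echo the two retrieval operations of the LSID protocol, but everything else is an assumption for illustration: the `InMemoryStores` class, its `publish` and `query` methods and the identifiers are hypothetical and do not reflect the actual Taverna/Freefluo storage protocol.

```python
class InMemoryStores:
    """Toy data store plus metadata store behind three separate entry points."""

    def __init__(self):
        self.data = {}      # identifier -> bytes
        self.metadata = []  # (subject, link_type, object) triples

    def publish(self, identifier, payload, links=()):
        """Storage side: not covered by LSID, so this API is invented here."""
        if identifier in self.data and self.data[identifier] != payload:
            raise ValueError("data behind an identifier must not change")
        self.data[identifier] = payload
        self.metadata.extend(links)

    def get_data(self, identifier):
        """Retrieval by identifier: the stored bytes."""
        return self.data[identifier]

    def get_metadata(self, identifier):
        """Retrieval by identifier: links whose subject is this identifier."""
        return [t for t in self.metadata if t[0] == identifier]

    def query(self, subject=None, link_type=None, obj=None):
        """Separate query path: match triples, None acts as a wildcard."""
        pattern = (subject, link_type, obj)
        return [
            t for t in self.metadata
            if all(p is None or p == v for p, v in zip(pattern, t))
        ]


stores = InMemoryStores()
stores.publish("id:a", b"ACGT")
stores.publish("id:b", b"hit list", links=[("id:b", "derived_from", "id:a")])
print(stores.get_metadata("id:b"))
print(stores.query(obj="id:a"))
```

Lookup by identifier only answers questions about one known item, while the query path can ask which items point at a given resource.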
12
LSID launchpad
  • Lightweight plug-in for Internet Explorer
    providing access to LSID data / metadata
  • demo

13
Using IBM's Haystack
GenBank record
Portion of the Web of provenance
Managing collection of sequences for review