Part 4: Pioneers and Examples, Carole Goble

1
Part 4: Pioneers and Examples
Carole Goble
2
Specific services: application-specific controlled vocabularies
Grid Applications
Standard services: semantic data integration, service discovery, workflow enactment, composition, provenance, portals
Open Grid Service Architecture Tupperware: upper services
Standard services: resource selection, matching and brokering
Open Grid Service Architecture Underware: plumbing services
Standard interfaces and behaviours for distributed systems: naming, service state, lifetime management, notification, registry management
Web Service Resource Framework, Web Service-Notification, WS-I
Standard mechanisms for describing and invoking services: WSDL, SOAP, WS-Security etc.
Web Services
3
Where SW technologies are being used in e-Science
CombeChem, myGrid, CMCS, Hero
Composing, validating and repairing workflows and
service compositions negotiations
Describing and linking provenance records
GriPhyN (Pegasus), GRIP
Matching and provisioning
Pegasus, Geodise, myGrid, Kepler, CAT-S
Resource, service, workflow, data
set, registration, description discovery
Notification topics
myGrid, Geodise, SEEK
Controlled vocabularies for metadata and data
Schema mediation
Knowledge-based guidance and recommendation
MIAKT, AstroGrid
SEEK, GEON, BIRN Artemis
Geodise
4
Publication and Discovery
  • Promote sharing and (re)use
  • Services, resources and workflows require a
    semantic-driven description
  • Semantics is key to negotiation, discovery and
    workflow composition
  • If you can't describe what you want, you can't
    have it.
  • If you can't describe what you've got, no-one can
    find it or use it.
  • To (re)use components we need a way of describing
    what they do, a place to put descriptions, and a
    way of searching for them.

Annotator
Domain User
5
Background
  • Semantic Web Services
  • DARPA Agent Markup Language
  • DAML-S http://www.daml.org
  • Semantic Web Services Initiative
  • OWL-S http://www.swsi.org
  • EU FP6 SDK Cluster
  • Web Service Modelling Ontology
  • http://www.wsmo.org/
  • Purpose: automated service discovery and
    composition suitable for agent-based frameworks.

6
http://www.mygrid.org.uk
Intelligent engineering design search and
optimisation for fluid dynamics. Matlab based.
http://www.geodise.org
7
Publication and Discovery
Ontologies
Classification
Knowledge consuming application Workflow
construction
Raw function/service/workflow
Discovery consumption mechanisms
Function/Service/ Workflow Annotation
Registration
Publishing
Retrieval
Function/Service/Workflow Repository
8
In silico biology: http://www.mygrid.org.uk
  • Construct in silico experiments, find and adapt
    others, manage the experiment lifecycle
  • Tupperware
  • Workflows and DQP
  • Semantic registries,
  • Knowledge-based provenance and metadata
    management
  • Event notification
  • Taverna workflow workbench

Middleware for data intensive in silico biology
by bioinformaticians
9
Discovery in Taverna Workflow Workbench
  • User chooses services or workflows.
  • A common ontology (in OWL) used to annotate and
    query any myGrid object including services.
  • Discover workflows and services described in the
    registry via Taverna.
  • Finding workflows that accept an input of semantic
    type "nucleotide sequence" requires annotating
    data as well as service inputs.

10
Discovery in Taverna Workflow Workbench
  • Drag a workflow entry into the explorer pane and
    the workflow loads.
  • Drag a service/workflow to the scavenger window
    for inclusion in the workflow.

11
Information Model
  • Components form a loosely coupled system
  • An Information Model for e-Science experiments,
    based on CCLRC scientific metadata model
  • XML messages between services conform to the
    model
  • Life Science Identifiers (URNs) uniquely
    identify all myGrid experimental objects
    (workflows, workflow templates, data, data sets
    etc.)

Domain specific knowledge model
Domain neutral in silico experiment data model
XML
http://cvs.mygrid.org.uk/cgi-bin/viewcvs.cgi/mygrid/MIR/model/
12
Model of services
operation: name, description, input, output, task, method, resource, application
service: name, description, author, organisation
input
parameter: name, description, semantic type, format, transport type, collection type, collection format
output
workflow
WSDL operation
WSDL service
Soaplab service
bioMoby service
13
Service Ontology Suite
parameters: input, output, precondition, effect
performs_task, uses-resource, is_function_of
Upper level ontology
Inspired by DAML-S
Publishing ontology
Informatics ontology
Molecular biology ontology
Organisation ontology
Task ontology
Bioinformatics ontology
Web service ontology
Current work: joint development on an Open
Biological Ontologies BioService Ontology.
http://obo.sourceforge.net/
14
A Blast Description
  • Service Name: Blast
  • Operation: execute
  • task: pairwise_local_aligning
  • resource: EMBL
  • application: blastn
  • Parameter
  • Input
  • Name: accession
  • semantic type: EMBL Nucleotide sequence id
  • transport data type: string
  • Output
  • Name: Result
  • semantic type: sequence alignment report
  • transport data type: string
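
The Blast description above can be read as a small nested record: a service has operations, and each operation has a task, a resource, an application, and typed parameters. The following is a minimal Python sketch of that shape, not the myGrid schema itself. The class names, field names and the `operations_for_task` helper are assumptions made for illustration; only the values come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    name: str
    semantic_type: str   # concept taken from the domain ontology
    transport_type: str  # wire-level data type


@dataclass
class Operation:
    name: str
    task: str
    resource: str
    application: str
    inputs: List[Parameter] = field(default_factory=list)
    outputs: List[Parameter] = field(default_factory=list)


@dataclass
class Service:
    name: str
    operations: List[Operation] = field(default_factory=list)


# Values copied from the slide; the structure is a hypothetical rendering.
blast = Service(
    name="Blast",
    operations=[
        Operation(
            name="execute",
            task="pairwise_local_aligning",
            resource="EMBL",
            application="blastn",
            inputs=[Parameter("accession", "EMBL Nucleotide sequence id", "string")],
            outputs=[Parameter("Result", "sequence alignment report", "string")],
        )
    ],
)


def operations_for_task(service: Service, task: str) -> List[str]:
    """Return the names of the operations that perform the given task."""
    return [op.name for op in service.operations if op.task == task]
```

Separating the semantic type from the transport type lets two parameters that are both plain strings on the wire still be told apart during discovery.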

15
Discovery
Service Providers
Ontology Store
Ontologists
Others
Vocabulary
WSDL
Feta Semantic Discovery
Soaplab
Bioinformaticians
Registry
Taverna Workbench
Registry (Personalised View)
Registry
Registry
Workflow Execution
FreeFluo WfEE
invoking
mIR
Store data and metadata
16
Feta Example
  • Domain dependent query
  • Find a workflow or service that performs
    nucleotide sequence alignment
  • performs task: aligning, or more specific
  • accepts input: nucleotide sequence, or more
    general

Biological data
  Bio Sequence data
    Nucleotide sequence data
    Protein sequence data
    ...
Task
  Aligning
    Local aligning
      Pairwise local aligning
    Global aligning
    ...
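
The Feta query on this slide combines two subsumption tests: the advertised task must be the requested task or something more specific, and the advertised input must be the type of the data in hand or something more general. The sketch below illustrates that rule with a plain parent map in Python. It is not Feta's implementation. The three service entries are hypothetical, and placing "Pairwise local aligning" under "Local aligning" is an assumption read off the concept names.

```python
# child concept -> parent concept (assumed reading of the two hierarchies)
PARENT = {
    "Bio Sequence data": "Biological data",
    "Nucleotide sequence data": "Bio Sequence data",
    "Protein sequence data": "Bio Sequence data",
    "Aligning": "Task",
    "Local aligning": "Aligning",
    "Global aligning": "Aligning",
    "Pairwise local aligning": "Local aligning",
}


def ancestors_or_self(concept):
    """Return the concept followed by all of its ancestors, nearest first."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_same_or_more_specific(concept, target):
    """True if `concept` equals `target` or is a descendant of it."""
    return target in ancestors_or_self(concept)


# Hypothetical registry entries, for illustration only.
SERVICES = {
    "blastn_service": {"task": "Pairwise local aligning",
                       "input": "Nucleotide sequence data"},
    "generic_aligner": {"task": "Aligning", "input": "Bio Sequence data"},
    "protein_global": {"task": "Global aligning",
                       "input": "Protein sequence data"},
}


def find(task, input_type):
    """Find services whose task is `task` or more specific and whose
    input is `input_type` or more general."""
    hits = []
    for name, description in SERVICES.items():
        task_ok = is_same_or_more_specific(description["task"], task)
        input_ok = is_same_or_more_specific(input_type, description["input"])
        if task_ok and input_ok:
            hits.append(name)
    return sorted(hits)
```

Calling `find("Aligning", "Nucleotide sequence data")` keeps the two services that can consume a nucleotide sequence and drops the protein-only one.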
17
Feta Semantic Discovery
18
Publication
Service Providers
Ontologists
Others
Ontology Store
Description extraction
WSDL
Interface Description
Vocabulary
Soaplab
Pedro Annotation tool
Annotation providers
Annotation/ description
Taverna Workbench
Registry (Personalised View)
Registry
Registry plug-in
Registry
19
Pedro Data Entry Tool
20
Annotating Anything
Ontologists
Ontology Store
Vocabulary
Haystack Provenance Browser
Pedro Annotation tool
Annotation providers
Annotation/ description
Scientists
Taverna Workbench
myGrid Information Repository
Store plug-in
Metadata store RDF Jena
Data Store RDBMS mySQL
21
Stratified metadata
  • Service Type and Class (OWL)
  • Service Instance (RDF)

22
Describing workflows
24
Application of Semantic Grid for engineers
  • VERTICAL: advice on workflow assembly
  • Semantic matching
  • Contextual advice
  • HORIZONTAL: advice on component configuration
  • Low level: at semantic level
  • What needs to be filled out (for a valid
    configuration)
  • High level: at knowledge level
  • Filled out with what? Why? Suggest suitable values
    (for best configuration/usage)
  • Integration
  • GUI mode: Workflow Composer Environment (WCE)
  • Text mode: workflow editor (Domain Script Editor)

25
Semantic Metadata Management System
26
System Deployment
27
Knowledge and Application Integration Architecture
Workflow Construction Environment
Semantic driven
Decision-Tree
Workflow Advisor
Workflow Wizard
Function/Workflow Manager
Archive Manager
Ontology Manager
Semantic Queries
Semantic Annotation
Database Archiving
Semantic Archiving
Function Archive
Workflow- Template Archive
Workflow Archive
Geodise Ontologies
28
Publication Function Annotator
  • Customised for Matlab functions
  • Automatic parsing of Matlab function source
  • Instantiating concepts defined in ontology
  • Semi-automatic filling of the ontology driven
    forms

29
Advice on Function Assembly (Integrated in Matlab
Knowledge Toolbox)
  • Goal
  • Function assembly
  • What can be deployed next and before?
  • Mechanism
  • Matlab → Java → WSDL → Web service
  • Function semantic interface
  • Semantic matching
  • Pre-requirements
  • Function has been annotated
  • Semantics available in the instance store
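
The question "what can be deployed next and before?" reduces to matching the semantic interface of one function against the others. Below is a minimal sketch of that matching, written in Python rather than Matlab. The function names and type names are hypothetical, and exact string equality stands in for the ontology-based matching the slide describes.

```python
# Hypothetical annotated functions: semantic input types and one output type.
FUNCTIONS = {
    "createGeometry": {"inputs": {"DesignParameters"}, "output": "Geometry"},
    "generateMesh": {"inputs": {"Geometry"}, "output": "Mesh"},
    "runSolver": {"inputs": {"Mesh"}, "output": "FlowSolution"},
    "plotResult": {"inputs": {"FlowSolution"}, "output": "Figure"},
}


def can_follow(current):
    """Functions that accept the output of `current` as one of their inputs."""
    produced = FUNCTIONS[current]["output"]
    return sorted(name for name, f in FUNCTIONS.items()
                  if produced in f["inputs"])


def can_precede(current):
    """Functions whose output satisfies one of the inputs of `current`."""
    needed = FUNCTIONS[current]["inputs"]
    return sorted(name for name, f in FUNCTIONS.items()
                  if f["output"] in needed)
```

Both pre-requirements on the slide show up here: the lookup only works for functions that have been annotated, and the annotations must be available in some store (here, the `FUNCTIONS` dictionary).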

30
Advice on Function Assembly (Integrated in Domain
Script Editor)
Domain script editing area
Ontology and semantics
Function configuration advice
Function assembly advice
31
Advice on Function Configuration
% get the default beam structure
beam = createBeamStruct
% (4) analyze the OMETH and advise on its additional control parameters
% (with default values)
beamcontrol = gdk_options(beam)
% check semantics
gdk_semantics(GD_NPOP)
% further configure these control parameters
% run options
s = OptionsMatlab(beamcontrol)
32
Advice on Function Assembly (Integrated in WCE
workflow advisor)
  • Contextual advice
  • Workflow composition via interface semantic
    matching
  • Function configuration via semantic annotation
    decomposition
  • Semantics-based function workflow discovery
  • Exploring new components in workflows
  • Intelligent workflow monitoring based on
    provenance data

Select a function and request advice
Function assembly advice
33
Non-invisible function discovery
34
Towards Service Orientated Paradigm
35
When to reason?
Ontologies
Classification
Knowledge consuming application Workflow
construction
Raw function/service/workflow
Discovery consumption mechanisms
Function/Service/ Workflow Annotation
Registration
Publishing
Retrieval
Function/Service/Workflow Repository
38
Simplifying interfaces
  • Creating and maintaining the ontology
  • Generating Concrete nodes (semantic instances)
  • Instantiating abstract nodes defined in ontology
  • Filling ontology driven forms with semantic
    content

39
Remarks and Reflections
  • Who is doing the discovering? Is it automated or
    manual?

User
Human manual
Machine automatic
Provider
UDDI style advertisements
Weak semantic descriptions Rewriting and
expansion Geodise Ontoview, Pedro tool
Human manual
Syntactic descriptions Semantic mining myGrid
Feta load tool Pegasus, Cardiff
Elaborate Semantic descriptions Simplification
views Geodise Ontoview
Machine auto
40
Reflections
  • Eager vs Late reasoning
  • If people are selecting then you need just enough
    semantics for a shortlist
  • Ontology invisibility
  • Painless publication
  • The rise of the specialist annotator
  • Describing for reuse is challenging
  • Reuse depends on costly semantic descriptions
  • Describing for someone else's benefit
  • Reuse by multiple stakeholders
  • Metadata pays off but it needs a network effect
    and there is a cost.

41
So far, Using Concepts
  • Controlled vocabulary for advertisements for
    workflows and services
  • Indexes into registries and mIR
  • Semantic discovery of services and workflows
  • Semantic discovery of repository entries
  • Type management for composition
  • Semantic workflow construction guidance and
    validation
  • Navigation paths between data and knowledge
    holdings
  • Semantic glue between repository entries
  • Semantic annotation and linking of workflow
    provenance logs

42
Provenance
  • Experiments being performed repeatedly, at
    different sites, different times, by different
    users or groups

A large repository of records about experiments!!
  • verification of data
  • recipes for experiment designs
  • explanation for the impact of changes
  • ownership
  • performance of services
  • data quality

Scientists
In silico experiments
43
Provenance forms
  • Derivations
  • A path like a workflow, script or query.
  • Linking items, usually in a directed graph.
  • An explanation of when, by whom, and how something
    was produced.
  • Execution Process-centric
  • Annotations
  • Attached to items or collections of items, in a
    structured, semi-structured or free text form.
  • Annotations on one item or linking items.
  • An explanation of why, when, where, who, what,
    how.
  • Data-centric

44
A digital lab book for chemists.
45
COSHH
54
Architecture
Viewing Tools
Sem. Web Apps
RDF over SOAP
Semantic Data
Results Data
56
getRecord()
57
getObservation()
58
RDF
  • Common model for metadata
  • A graph: a set of triples
  • Query over
  • Link together
  • Aggregate
  • Integrate
  • Avoids pre-commitment
  • Self-describing
  • Incremental
  • Extensible
  • RDQL, repositories, integration tools,
    presentation tools
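
Because an RDF graph is a set of triples, aggregating metadata from two sources is a set union, and querying is pattern matching over the triples. The sketch below shows both with plain Python tuples instead of an RDF library or RDQL. The `ex:` identifiers and the `match` helper are assumptions made for illustration.

```python
def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two independently produced descriptions (hypothetical identifiers).
workflow_metadata = {
    ("ex:workflow1", "ex:usesService", "ex:service1"),
    ("ex:workflow1", "ex:createdBy", "ex:user1"),
}
experiment_metadata = {
    ("ex:experiment1", "ex:ranWorkflow", "ex:workflow1"),
    ("ex:workflow1", "ex:createdBy", "ex:user1"),  # same statement, stated twice
}

# Aggregation is set union: shared identifiers link the graphs together,
# and duplicate statements collapse without any schema agreed in advance.
merged = workflow_metadata | experiment_metadata

about_workflow = match(merged, s="ex:workflow1")
uses_workflow = match(merged, o="ex:workflow1")
```

No table layout had to be fixed before merging, which is the sense in which the model avoids pre-commitment and stays incremental and extensible.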

Data
Workflow
Experiment
User
Service
Graphic based on Tim Berners-Lee
http://www.w3.org/2003/Talks/0521-www-keynote-tbl/slide22-0.html
59
Bridging islands
Service 1
Service 2
Workflow 1
Experimental Investigation 1
Data 1
60
Bridging islands Concepts and LSID
Service 1
Service 2
Workflow 1
RDF
RDF
RDF
RDF
RDF
RDF
Experimental Investigation 1
Data 1
61
Provenance Web
62
Provenance of data
  • Operational execution trail

GeneAC005412.6
SNP000010197
input
output
process: start time, end time
run_for
by_service
urn Clare Jennings
lsidHGVBase_retrieve
63
Provenance of knowledge
  • Declarative semantic execution trail

contains_single_nucleotide_polymorphism
GeneAC005412.6
SNP000010197
input
output
as stated by
process: start time, end time
run_for
by_service
urn Claire Jennings
lsidHGVBase_retrieve
64
Provenance of knowledge
urn Carole Goble
  • Trust and attribution

disputed by
contains_single_nucleotide_polymorphism
GeneAC005412.6
SNP000010197
input
output
as stated by
process: start time, end time
run_for
by_service
urn Claire Jennings
lsidHGVBase_retrieve
65
Provenance of knowledge
  • Aggregation and integration

process: start time, end time
run_for
by_service
urn Bill Jones
lsidBIGDbretrieve
as stated by
contains_single_nucleotide_polymorphism
GeneAC005412.6
SNP000010197
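
The last three slides treat a knowledge statement as something that can itself be described: it is stated by a process, may be disputed by a person, and may be backed by more than one source. The following is a minimal Python sketch of that idea, not the myGrid provenance model. Identifiers are copied from the slide labels as they appear in the transcript; the `process_1` and `process_2` names and the helper functions are hypothetical.

```python
# The knowledge statement itself, as a (subject, predicate, object) triple.
STATEMENT = (
    "GeneAC005412.6",
    "contains_single_nucleotide_polymorphism",
    "SNP000010197",
)

# Process records; the keys are hypothetical names for the two process nodes.
PROCESSES = {
    "process_1": {"by_service": "HGVBase_retrieve", "run_for": "Claire Jennings"},
    "process_2": {"by_service": "BIGDbretrieve", "run_for": "Bill Jones"},
}

# Statements about the statement: (statement, relation, source).
ASSERTIONS = [
    (STATEMENT, "as stated by", "process_1"),
    (STATEMENT, "as stated by", "process_2"),
    (STATEMENT, "disputed by", "Carole Goble"),
]


def supporting_services(statement):
    """Services whose process runs state the given knowledge statement."""
    return [
        PROCESSES[source]["by_service"]
        for stmt, relation, source in ASSERTIONS
        if stmt == statement and relation == "as stated by"
    ]


def disputed_by(statement):
    """People recorded as disputing the given knowledge statement."""
    return [
        source
        for stmt, relation, source in ASSERTIONS
        if stmt == statement and relation == "disputed by"
    ]
```

Keeping the statement separate from the assertions about it is what allows attribution, dispute and aggregation to be added later without changing the statement.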
66
Provenance
Ontology-aided workflow construction
  • RDF-based service and data registries
  • RDF-based metadata for experimental components
  • RDF-based provenance graphs
  • OWL based controlled vocabularies for database
    content
  • OWL based integration of experiment entities

RDF-based semantic mark up of results, logs,
notes, data entries
http://www.mygrid.org.uk
67
Aside: LSIDs
  • urn:lsid:AuthorityID:NamespaceID:ObjectID:RevisionID
  • urn:lsid:ncbi.nlm.nig.gov:GenBank:T48601:2
  • urn:lsid:ebi.ac.uk:SWISS-PROT.accession:P34355:3
  • urn:lsid:rcsb.org:PDB:1D4X:22
  • LSID Designator: a mandatory preface that notes
    that the item being identified is a life
    science-specific resource
  • Authority Identifier: an Internet domain owned by
    the organization that assigns an LSID to a
    resource
  • Namespace Identifier: the name of the resource
    (e.g., a database) chosen by the assigning
    organization
  • Object Identifier: the unique name of an item
    (e.g., a gene name or a publication tracking
    number) as defined within the context of a given
    database
  • Revision Identifier: an optional parameter to
    keep track of different versions of the same item
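
The parts listed above map directly onto the colon-separated fields of an LSID. As a minimal sketch, assuming the form `urn:lsid:Authority:Namespace:Object[:Revision]`, the Python function below splits an identifier into those parts. The `LSID` tuple and `parse_lsid` names are hypothetical, and the check is deliberately looser than a full validator.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]  # None when no revision is given


def parse_lsid(text: str) -> LSID:
    """Split an LSID into its authority, namespace, object and revision."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"missing urn:lsid designator: {text!r}")
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty field in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)
```

For example, `parse_lsid("urn:lsid:rcsb.org:PDB:1D4X:22")` yields authority `rcsb.org`, namespace `PDB`, object `1D4X` and revision `22`.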

68
Information Access
LSID aware client
RDF aware client
LSID interface
Query
Publish interface
Metadata Store
Taverna/ Freefluo
MIR Metadata Store RDF
data
Data store
MIR Data store XML
metadata
Query
XML aware client
69
Organisation level provenance
Process level provenance
Service
Project
runBy, e.g. BLAST @ NCBI
Experiment design
Process
Workflow design
componentProcess, e.g. web service invocation of
BLAST @ NCBI
Event
partOf
instanceOf
componentEvent, e.g. completion of a web service
invocation at 12.04pm
Workflow run
Data/knowledge level provenance
knowledge statements, e.g. similar protein
sequence to
run for
User can add templates to each workflow process
to determine links between data items.
Data item
Person
Organisation
Data item
Data item
data derivation e.g. output data derived from
input data
70
Provenance tracking
  • Automated generation of this web of links
  • Workflow enactor generates
  • LSIDs
  • Data derivation links
  • Knowledge links
  • Process links
  • Organisation links
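
To make "the workflow enactor generates the links" concrete, here is a minimal Python sketch of an enactor that mints an identifier for every process and data item and records derivation and process links as it runs a step. It is a toy, not Taverna or Freefluo. The class, the link names, the `example.org` authority and the counter-based identifiers are all assumptions made for illustration.

```python
import itertools


class ToyEnactor:
    """Runs steps and records provenance links as a side effect."""

    def __init__(self, authority="example.org"):
        self.authority = authority
        self._counter = itertools.count(1)
        self.links = []  # (subject, relation, object) triples

    def mint(self, namespace):
        """Mint a new LSID-shaped identifier (hypothetical numbering scheme)."""
        return f"urn:lsid:{self.authority}:{namespace}:{next(self._counter)}"

    def run_step(self, service, func, input_id, input_value):
        """Invoke one service and log data derivation and process links."""
        process_id = self.mint("process")
        output_id = self.mint("data")
        output_value = func(input_value)
        self.links.append((output_id, "derived_from", input_id))
        self.links.append((output_id, "output_of", process_id))
        self.links.append((process_id, "by_service", service))
        return output_id, output_value


enactor = ToyEnactor()
seq_id = enactor.mint("data")
# A stand-in "service" that reverses a sequence string.
out_id, out_value = enactor.run_step("reverse", lambda s: s[::-1], seq_id, "ACGT")
```

The scientist never writes a link by hand: every call to `run_step` leaves three triples behind, which is what makes the web of provenance cheap to build.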

Relationship BLAST report has with other items in
the repository
Other classes of information related to BLAST
report
71
Haystack (IBM/MIT)
GenBank record
Portion of the Web of provenance
Managing collection of sequences for review
73
Provenance metadata
  • Outside objects
  • RDF store
  • Within objects
  • LSID metadata.

74
Linked Provenance Resources
The subsumed concepts
Link to the log annotated with more general
concept
The subsuming concepts
Link to the log annotated with more specific
concept
75
Generating Links
The concept
The generated Link to related provenance
document
The name of the data
76
P. Afflard et al., The Grid(s)? @ Novartis, presented
at PRISM PharmaGrid retreat, July 2003
77
William Pike, Ola Ahlqvist, Mark Gahegan, Sachin
Oswal, Supporting Collaborative Science through a
Knowledge and Data Management Portal, in 1st
Semantic Web Conference (ISWC2003) Workshop on
Retrieval of Scientific Data, Florida, USA,
October 2003
78
Two views of a gravity model concept from
the Hero CODEX web tool
William Pike, Ola Ahlqvist, Mark Gahegan, Sachin
Oswal, Supporting Collaborative Science through a
Knowledge and Data Management Portal, in 1st
Semantic Web Conference (ISWC2003) Workshop on
Retrieval of Scientific Data, Florida, USA,
October 2003
  • a social network reveals which users favour
    different instances of the model, with edge
    length suggesting the degree of support.
  • An ontological description shows how one
    geoscientist constructs a model

79
Collaboratory for Multi-Scale Chemical Science
CMCS Pedigree Graph portlet showing provenance
relationships between resources (colour coded by
original relationship type).
CMCS Pedigree Browser showing the metadata and
relationships of the selected data set.
80
Provenance dimensions connected by concepts and
identifiers
project
Services
Workflow instances
Author
project
workflow template
Based on http://www.w3.org/2003/Talks/0521-www-keynote-tbl/slide22-0.html
81
awareness of colleagues' presence
BuddySpace
Access Grid Node
virtual meetings
mapping real time discussions/group sense making
NetMeeting
recovering information from meetings
enacting decisions/coordinating activities
synthesising artefacts
I-X planning tools

http://www.aktors.org/coakting/
Courtesy of David De Roure
82
GEON Grid Applications
http://www.geongrid.org/
Courtesy Bertram Ludaescher
83
http://www.aktors.org/miakt/
84
Reflections
  • Relationship between RDF and other data models.
  • When should we use RDF?
  • Scalability of technologies
  • Querying over and aggregation of metadata
  • How to do it?
  • Real examples
  • Users viewing complex models
  • Domain specific interfaces

85
Knowledge Stakeholders
Knowledge for the Grid Applications
Semantics for the Grid
Sources of Knowledge