Title: The Semantic Web, Service Oriented Architectures, the myGrid Experience
1 The Semantic Web, Service Oriented Architectures, the myGrid Experience
http://www.mygrid.org.uk
2 Roadmap
- The problem
- myGrid
- Semantic Service / Workflow Discovery
- Provenance and metadata modelling
- Semantic Web is Semantic Glue
3 EPSRC funded UK e-Science Program Pilot Project
Thanks to the other members of the Taverna project, http://taverna.sf.net
4
- Identify new, overlapping sequence of interest
- Characterise the new sequence at nucleotide and amino acid level
Cutting and pasting between numerous web-based services, e.g. BLAST, InterProScan etc.
5 Middleware for Life Science solutions
- Interoperation of services and data sources
- Repeat
- Reuse and Share
- Provenance
- Manage results
- My tools, my resources
6 Middleware for Life Science
10 First, find your service
- How to select among 3000 services?
- Mostly inputs and outputs are strings
- Domain specific descriptions of capabilities
- Selection is part of workflow assembly by bioinformaticians
- Selection of alternates for failure is also generally user defined; alternates are usually replicas, but need not be.
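The user-defined alternates mentioned on this slide can be sketched as an ordered list of callables tried in turn. This is a minimal illustration only, not Taverna's actual fault-handling code; `invoke_with_alternates`, `AllAlternatesFailed`, `primary_blast`, and `replica_blast` are hypothetical names introduced for the example.

```python
from typing import Callable, List, Sequence


class AllAlternatesFailed(Exception):
    """Raised when the primary service and every alternate have failed."""


def invoke_with_alternates(
    services: Sequence[Callable[[str], str]], payload: str
) -> str:
    """Try each service in the user-defined order; return the first result."""
    errors: List[str] = []
    for service in services:
        try:
            return service(payload)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{getattr(service, '__name__', service)}: {exc}")
    raise AllAlternatesFailed("; ".join(errors))


# Hypothetical stand-ins for a primary service and one of its replicas.
def primary_blast(sequence: str) -> str:
    raise ConnectionError("primary endpoint unreachable")


def replica_blast(sequence: str) -> str:
    return f"hits for {sequence}"


# The primary fails, so the call falls through to the replica.
result = invoke_with_alternates([primary_blast, replica_blast], "ACGT")
```

Because the list is just an ordered sequence, an alternate does not have to be a replica: any service the user judges equivalent can be placed after the primary.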
11 Which means describe your service
- Publish and find services (and workflows) with description using an ontology
- Define domain types for objects passed around workflow
- Define a set of dimensions with which service capabilities can be defined
- GRIMOIRES / WebDAV directory
- Tied to BioMOBY Central
13 Semantic discovery
- Publish and find services (and workflows) with description using an ontology (in OWL/RDF)
- Define domain types for objects passed around and a set of dimensions with which service capabilities can be defined using processor abstraction
- Bootstrapping descriptions
- Mining and maintaining descriptions
- The Expert Annotator
- GRIMOIRES / WebDAV directory
- Tie into BioMOBY Central
- http://phoebus.cs.man.ac.uk:8100/feta-beta/mygrid/descriptions/
Phillip Lord, Pinar Alper, Chris Wroe, and Carole Goble. Feta: A light-weight architecture for user oriented semantic service discovery. In Proc. of 2nd European Semantic Web Conference, Crete, June 2005.
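Discovery over ontology-typed inputs and outputs can be sketched as a subsumption check against a registry. The following is a toy illustration under stated assumptions, not the Feta implementation: the ontology terms, the `REGISTRY` entries, and the functions `is_a` and `find_services` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Toy ontology: each term maps to its direct parent (None for a root).
# The terms are illustrative and not taken from the myGrid ontology.
PARENT: Dict[str, Optional[str]] = {
    "sequence": None,
    "nucleotide_sequence": "sequence",
    "protein_sequence": "sequence",
    "report": None,
    "similarity_report": "report",
}


def is_a(term: str, ancestor: str) -> bool:
    """Return True if `term` equals `ancestor` or is one of its descendants."""
    current: Optional[str] = term
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


@dataclass(frozen=True)
class ServiceDescription:
    name: str
    input_type: str
    output_type: str


REGISTRY = [
    ServiceDescription("blastn", "nucleotide_sequence", "similarity_report"),
    ServiceDescription("blastp", "protein_sequence", "similarity_report"),
    ServiceDescription("reverse", "sequence", "sequence"),
]


def find_services(have: str, want: str) -> List[str]:
    """Find services that accept data of type `have` and produce type `want`."""
    return [
        s.name
        for s in REGISTRY
        # The service input must be at least as general as the data we hold,
        # and its output must be at least as specific as what we ask for.
        if is_a(have, s.input_type) and is_a(s.output_type, want)
    ]
```

With this registry, asking for any `report` produced from a `protein_sequence` selects only `blastp`, even though the query never names `similarity_report` directly; the ontology supplies the link.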
14 http://www.swsi.org/
OWL-S
WSMO
OWL-WS
WSDL-S
15 Semantic Web Services: Layered model
Generic Schema for Service (part of Information model)
Specific Application Ontology, e.g. caCORE
We don't describe WSDL, we describe operations and processors
We are classifying for people, not machines, so don't be too clever!
Web Interface
Wroe C, Goble CA, Greenwood M, Lord P, Miles S, Papay J, Payne T, Moreau L. Automating Experiments Using Semantic Data on a Bioinformatics Grid. IEEE Intelligent Systems, Jan/Feb 2004.
16 (Diagram: schema for service descriptions)
Service: name, description, author, organisation
Operation: name, description, task, method, resource, application
Parameter: name, description, semantic type, format, transport type, collection type, collection format
Relations: hasInput, hasOutput, subclass
Concrete kinds: WSDL based Web service, WSDL based operation, Soaplab service, bioMoby service, workflow, local Java code
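The attributes on this slide map naturally onto a small set of record types. The sketch below is one possible reading of the diagram, not the actual myGrid schema: the field names are transliterated from the slide, the assignment of subclasses to `Service` and `Operation` is an assumption, and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    name: str
    description: str = ""
    semantic_type: Optional[str] = None  # ontology term for the domain type
    format: Optional[str] = None
    transport_type: Optional[str] = None
    collection_type: Optional[str] = None
    collection_format: Optional[str] = None


@dataclass
class Operation:
    name: str
    description: str = ""
    task: Optional[str] = None
    method: Optional[str] = None
    resource: Optional[str] = None
    application: Optional[str] = None
    has_input: List[Parameter] = field(default_factory=list)
    has_output: List[Parameter] = field(default_factory=list)


@dataclass
class Service:
    name: str
    description: str = ""
    author: Optional[str] = None
    organisation: Optional[str] = None
    operations: List[Operation] = field(default_factory=list)


# Assumed subclassing: the slide names these kinds but the extracted text
# does not say which parent each one belongs to.
class WSDLBasedWebService(Service):
    """A service described by a WSDL document."""


class WSDLBasedOperation(Operation):
    """An operation exposed by a WSDL based Web service."""


# Invented example values, for illustration only.
blast = WSDLBasedWebService(
    name="blast",
    description="Sequence similarity search",
    operations=[
        WSDLBasedOperation(
            name="run",
            task="similarity_search",
            has_input=[Parameter("query", semantic_type="nucleotide_sequence")],
            has_output=[Parameter("hits", semantic_type="similarity_report")],
        )
    ],
)
```

Note that the description attaches to operations and their parameters rather than to the WSDL document as a whole, so the same shape can describe a Soaplab service, a workflow, or local code.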
17 Semantic Web Services: Semantic Descriptions for
- Discovery
- Automated discovery of services or workflows
- Knowledge assisted brokering and matchmaking
- Guided instantiation and substitution
- Composition
- Automated Composition
- Self organising SOA
- Guided workflow assembly
- Composition (workflow) verification and validation
18 Semantics-enabled Problem Solving
Task configuration
EDSO task ontology
Semantic service discovery
Workflow construction
Workflow Advisor
19 Observations
- Technical and Abstraction mismatches
- Man vs Machine. Manual vs Automation. Service vs Domain Semantics. Basic errors in modelling.
- Web services in the wild suck. Not everything is a Web Service.
- Legacy
- Services, middleware, content and practice.
- Practicality mismatches
- Automated or assisted discovery: desirable, likely, popular
- Automated composition: undesirable, unlikely, unpopular
- Capturing and Curating Content
- Annotation is hard. Building the Ontology is hard. QA is hard. Keeping the annotation up to date is hard. The Expert annotator. Altruism for Reuse. Quality Control.
- Hendler's Principle
- A little semantics goes a long way! Too complicated to use.
- Tools!!
20 Sharing takes effort.
- Unanticipated reuse by people you don't know in automated workflows.
- The metadata needed pays off but it's challenging and costly to obtain.
- Automated, service providers, network effects
- Quality control. Misuse. Inappropriate use.
- Competitive advantage, Intellectual property.
- Workflow design - local or licensed services
21 The devil is in the detail
Experiment provenance
Simple classifications of services
Descriptions in biological language
Simple workflow
Workflows for automagical execution: implicit iteration, generous typing
Descriptions for automatic service execution and
fault management
Debugging and rerunning provenance logs
Expressive ontologies to match up services
automatically
22 e-Scientific method: in vivo, in vitro, in silico
Courtesy Jim Myers, NCSA
23 Taverna workflow workbench in myGrid
http://taverna.sourceforge.net
http://www.mygrid.org.uk
24 Provenance in myGrid
- The process
- The data derivation path
- The ownership
- The evidence of knowledge
(Diagram: provenance graph with nodes a1, E1S1, X1, E1S2, Y1, Z1)
25 Provenance graph representation
- Identity for the node: URI
- Uniform Resource Identifier
- An extension of URL
- An RDF (Resource Description Framework) graph
- <X1> derivedFrom <a1>; <X1> inputOf <E1S2>
- Ontologies
- Telling what they are
- <X1> isA <gene>; <X1> hasFeature <hasSimilarityTo>
- Each URI is associated with
- A set of provenance statements
- An RDF provenance graph
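The statements on this slide can be sketched as a set of subject, predicate, object triples keyed by URI. This minimal example uses plain Python rather than an RDF library; the `http://example.org/provenance/` namespace, the helper names `uri`, `match`, and `derivation_path`, and the `Y1 derivedFrom X1` statement are assumptions added for illustration.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace; the slides do not give the real one.
BASE = "http://example.org/provenance/"


def uri(local_name: str) -> str:
    """Build a full URI for a node label such as 'X1'."""
    return BASE + local_name


GRAPH: Set[Triple] = {
    (uri("X1"), "derivedFrom", uri("a1")),
    (uri("X1"), "inputOf", uri("E1S2")),
    (uri("X1"), "isA", uri("gene")),
    (uri("Y1"), "derivedFrom", uri("X1")),  # assumed, not stated on the slide
}


def match(
    graph: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Set[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return {
        t
        for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def derivation_path(graph: Set[Triple], node: str) -> List[str]:
    """Follow derivedFrom links from `node` back to its original sources."""
    path: List[str] = []
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for _, _, source in sorted(match(graph, s=current, p="derivedFrom")):
            if source not in seen:
                seen.add(source)
                path.append(source)
                frontier.append(source)
    return path
```

Calling `match(GRAPH, s=uri("X1"))` returns the set of provenance statements associated with that URI, and `derivation_path(GRAPH, uri("Y1"))` walks back through `X1` to `a1`. Because nodes are identified by URI, two such graphs can be combined with a plain set union.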
26 Resource Description Framework
27 Provenance
- Flexible and extensible schema
- Data fusion and aggregation across provenance metadata
- Reasoning and querying over descriptions
- Transparent description
29 myGrid
30 Annotate Anything
- People, meetings, discussions, conference talks
- Scientific publications, recommendations, quality comments
- Events, notifications, logs
- Services and resources
- Schemas and catalogue entries
- Models, codes, builds, workflows
- Data files and data streams
- Sensors and sensor data
- DFDL, JSDL, SAML, WSDL, WSRF, DL, ML as RDF?
- If you are using a controlled vocabulary, then let's use a standard controlled vocabulary language.
31 Courtesy Joanne Luciano
Seamark Demonstration: Identification of new drug candidates for BRKCB-1
32 Observations
- Flexible metadata description for data
- Multi tiered model for different perspectives
- Machine vs Person: The ontologies for people discovery are not good enough for knowledge aggregation
- Make the semantics invisible
- Provenance aggregation: Identity crisis
- Exposing knowledge means knowledge exposure.
- Reluctance to give up knowledge assets. Vulnerability. Knowledge is power. Incentive models. IPR. Privacy.
- Capturing the Semantic Content explicitly.
- Acquiring ontology annotations. Hard to describe policies. Vagueness and trivia. Trying to capture people-focused provenance.
- Hendler principle: A little semantics goes a long way.
33 Use of Semantic Web Technologies: A Semantic Web of Life Science