Promoting reuse and repurposing on the Semantic Grid

Provided by: antoong

1
Promoting reuse and repurposing on the Semantic
Grid
  • Antoon Goderis
  • University of Manchester, UK
  • CHESS seminar, 19 July 2005

2
Talk plan
  • The grid
  • The semantic grid
  • Reuse and repurposing
  • 7 bottlenecks to repurposing
  • Semantics to the rescue

3
The Grid
  • Pervasive and dependable computing utility
  • A distributed computing infrastructure for
    advanced science and engineering
  • Coordinated resource sharing and problem solving
    in dynamic, multi-institutional virtual
    organisations

4
Science in the 21st century
  • Huge quantities of data
  • Huge number of data collection devices
  • Analysis is the bottleneck
  • Global distributed science
  • Collaboration and sharing the norm
  • In silico experiments
  • Build, reuse, repurpose on-line concurrent
    processes (workflows)

5
Grid application evolution
  • Smaller-scale data, less computationally
    intensive, complex heterogeneous applications,
    complex semantics, many people
  • Functional Genomics, Oceanography,
    Biodiversity, Earth Science, Neuroscience
  • Large-scale data, large numbers of machines,
    expensive computation, simple semantics, small
    numbers of people
  • High Energy Physics

6
The Semantic Grid
  • The Grid has been about large scale computation
  • But the applications are also about collaboration
  • A gap between grid computing endeavours and the
    vision of Grid computing
  • To support the full richness of the vision we
    need both grid and semantic web (technologies)
  • Knowledge is explicitly asserted and explicitly
    used

7
Figure: the Semantic Grid in relation to the
Semantic Web, the Classical Web and the Classical
Grid, along two axes: richer semantics and more
computation. Source: Norman Paton

8
Semantics in Grid workflows
  • Classification and discovery of computational and
    data resources; provenance trails
  • Declarative specification of services, workflows
    and their requirements; problem solving selection
  • Job control, distributed execution models,
    semantic integration, resource brokering,
    resource scheduling
  • Encoding performance metrics, service state,
    event notification topics, access rights to
    databases, personal profiles and security
    groupings; charging infrastructure

9
Talk plan
  • The grid
  • The semantic grid
  • Reuse and repurposing
  • 7 bottlenecks to repurposing
  • Semantics to the rescue

10
From building workflows to recycling them
  • Reuse of workflows
    • Best practice
    • Training
    • Peer review
  • Repurposing
    • Adapt and extend useful fragments
    • Build on best practice
    • Across groups / communities

11
Analyze This

12
Analyze This × scientists × workflows ×
versions × runs

13
Bridging user information need and workflow
descriptions

14
Bridging user information need and workflow
descriptions
Network effects!

15
Reuse and repurposing
  • A user will reuse a workflow or workflow fragment
    that fits their purpose and could be customised
    with different parameter settings or data inputs
    to solve their particular scientific problem.

16
Reuse and repurposing
  • A user will reuse a workflow or workflow fragment
    that fits their purpose and could be customised
    with different parameter settings or data inputs
    to solve their particular scientific problem.
  • A workflow fragment is a piece of an experimental
    description that is a coherent sub-workflow that
    makes sense to a domain specialist (in Ptolemy, a
    composite actor)
  • A snippet of workflow code annotation

17
Reuse and repurposing
  • A user will reuse a workflow or workflow fragment
    that fits their purpose and could be customised
    with different parameter settings or data inputs
    to solve their particular scientific problem.
  • A user will repurpose a workflow or workflow
    fragment by
    • finding one that is close enough to be the
      basis of a new workflow for a different
      purpose and
    • making small changes to its structure to fit
      it to its new purpose.
  • Aiming for automated discovery of ranked
    fragments
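
The distinction between reuse and repurposing can be made concrete with a minimal Python sketch. The `Workflow` class, its fields, and the step names are hypothetical and are not taken from myGrid, Kepler or Ptolemy. Reuse keeps the structure and changes only parameters or inputs; repurposing makes a small structural change.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Workflow:
    """Hypothetical workflow: an ordered tuple of step names plus parameters."""
    steps: tuple
    params: dict = field(default_factory=dict)


def reuse(wf, **new_params):
    """Reuse: same structure, different parameter settings or data inputs."""
    return replace(wf, params={**wf.params, **new_params})


def repurpose(wf, old_step, new_step):
    """Repurpose: a small structural change, here swapping one step."""
    steps = tuple(new_step if s == old_step else s for s in wf.steps)
    return replace(wf, steps=steps)


original = Workflow(
    steps=("fetch_sequence", "align", "visualise"),
    params={"database": "db_a"},
)
reused = reuse(original, database="db_b")
repurposed = repurpose(original, "visualise", "annotate_genes")

print(reused.steps == original.steps)  # True: structure unchanged
print(repurposed.steps)                # last step replaced
```

In this sketch the original workflow is never modified, so the same fragment can serve as the basis for several variants.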

18
7 bottlenecks to workflow repurposing
  • Lack of a comprehensive discovery model
  • Process knowledge acquisition bottleneck
  • Lack of workflow fragment rankings
  • Workflow interoperability
  • Restrictions on service availability
  • Rigidity of service and workflow definitions
  • Intellectual property rights on workflows

Make workflows usable
Collect enough workflows
19
A comprehensive discovery model
  • A user will repurpose a workflow or workflow
    fragment by
    • finding one that is close enough to be the
      basis of a new workflow for a different
      purpose and
    • making small changes to its structure to fit
      it to its new purpose.
  • Based on semantic annotation, find a set of
    workflows, which people can then edit
  • For scientists: data-flow-based queries in their
    jargon, largely abstracting from control
  • For developers: control-flow-based queries,
    largely abstracting from data
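
Discovery based on semantic annotation could look roughly like the following Python sketch. The concept hierarchy, the workflow names and the annotations are all invented for illustration; a real system would more likely use an ontology language and a reasoner. A query for a general concept also returns workflows annotated with more specific concepts.

```python
# Hypothetical is-a hierarchy: child concept -> parent concept.
IS_A = {
    "pairwise_alignment": "sequence_alignment",
    "multiple_alignment": "sequence_alignment",
    "sequence_alignment": "sequence_analysis",
}

# Hypothetical repository: workflow name -> set of annotated concepts.
ANNOTATIONS = {
    "wf_blast_report": {"pairwise_alignment", "visualisation"},
    "wf_gene_tree": {"multiple_alignment"},
    "wf_ocean_temps": {"data_cleaning"},
}


def ancestors(concept):
    """Return the concept together with all of its ancestors."""
    found = {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        found.add(concept)
    return found


def find_workflows(query_concept):
    """Workflows with at least one annotation subsumed by the query concept."""
    return sorted(
        name
        for name, concepts in ANNOTATIONS.items()
        if any(query_concept in ancestors(c) for c in concepts)
    )


print(find_workflows("sequence_alignment"))
# ['wf_blast_report', 'wf_gene_tree']
```

The result is a candidate set that people can then inspect and edit, as the slide suggests.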

20
Kepler
http://kepler.ecoinformatics.org/
Courtesy of Bertram Ludaescher

21
A comprehensive discovery model
  • Scientist queries
    • Find all processes where sequence alignment is
      followed by visualisation
    • Given a set of data points, services, or
      fragments, have these been connected up in an
      existing base of workflows? Alternatives?
    • Show me the provenance of this workflow
  • Developer queries
    • How have people applied this dataflow execution
      model (e.g. in Ptolemy, an SDF Director)?
    • How can it be combined with other execution
      models?
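
The first scientist query can be sketched in Python as a search over annotated data-flow graphs. The repository layout (`tasks`, `edges`) and all names are assumptions made for this example, and "followed by" is read here as "feeds directly or indirectly", which is one possible interpretation.

```python
def reachable(edges, start):
    """All steps downstream of `start` in a data-flow graph."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def followed_by(workflow, first_task, second_task):
    """True if a step doing `first_task` feeds a step doing `second_task`."""
    tasks, edges = workflow["tasks"], workflow["edges"]
    return any(
        tasks.get(down) == second_task
        for step, task in tasks.items()
        if task == first_task
        for down in reachable(edges, step)
    )


# Hypothetical repository of two annotated workflows.
REPOSITORY = {
    "wf1": {
        "tasks": {"s1": "retrieval", "s2": "sequence_alignment",
                  "s3": "visualisation"},
        "edges": {"s1": ["s2"], "s2": ["s3"]},
    },
    "wf2": {
        "tasks": {"s1": "visualisation", "s2": "sequence_alignment"},
        "edges": {"s1": ["s2"]},
    },
}

matches = [
    name for name, wf in REPOSITORY.items()
    if followed_by(wf, "sequence_alignment", "visualisation")
]
print(matches)  # ['wf1']
```

Only `wf1` matches, because in `wf2` the visualisation step comes before the alignment step.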

22
A comprehensive discovery model
  • Challenges
    • Libraries of (scientific) task-based patterns
      • E.g. task semantics of gene annotation
        pipelines classified in OWL
    • Libraries of design patterns for distributed
      behaviour
      • Identify how people build concurrent systems
        and how they choose (combinations of)
        execution semantics
      • A good start: workflow patterns for Petri nets
      • E.g. synchronizing merge and multi-merge

23
Workflow fragment rankings
  • A user will repurpose a workflow or workflow
    fragment by
    • finding one that is close enough to be the
      basis of a new workflow for a different
      purpose and
    • making small changes to its structure to fit
      it to its new purpose.
  • We need metrics for processes
    • For scientists: ranking by scientific relevance
    • For developers:
      • compare processes based on the same
        execution semantics
      • compare different execution semantics
  • Challenge: defining the metrics, and combining
    them into rankings
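
One simple way to combine metrics into a ranking is a weighted sum, sketched below in Python. The metric names, scores and weights are invented for illustration; the talk leaves the actual metrics as an open challenge, and a weighted sum is only one of many possible combination schemes.

```python
def rank_fragments(fragments, weights):
    """Sort fragments by a weighted sum of metric scores, highest first."""
    def score(fragment):
        return sum(
            weight * fragment["metrics"].get(metric, 0.0)
            for metric, weight in weights.items()
        )
    return sorted(fragments, key=score, reverse=True)


# Hypothetical fragments with scores in [0, 1] for two assumed metrics.
FRAGMENTS = [
    {"name": "A", "metrics": {"relevance": 0.9, "semantics_match": 0.2}},
    {"name": "B", "metrics": {"relevance": 0.6, "semantics_match": 0.9}},
    {"name": "C", "metrics": {"relevance": 0.3, "semantics_match": 0.4}},
]

# A scientist might weight scientific relevance most heavily,
# a developer the match in execution semantics.
scientist = {"relevance": 0.8, "semantics_match": 0.2}
developer = {"relevance": 0.2, "semantics_match": 0.8}

print([f["name"] for f in rank_fragments(FRAGMENTS, scientist)])
# ['A', 'B', 'C']
print([f["name"] for f in rank_fragments(FRAGMENTS, developer)])
# ['B', 'C', 'A']
```

The same fragments rank differently for the two audiences, which is why the metrics and their weighting both need to be defined.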

24
Workflow interoperability
  • A user will repurpose a workflow or workflow
    fragment by
    • finding one that is close enough to be the
      basis of a new workflow for a different
      purpose and
    • making small changes to its structure to fit
      it to its new purpose.
  • Workflows take a long time to build and get very
    large
  • The nice thing about standards
  • Different workflow systems, different (implicit)
    semantics
  • Import workflows across workflow environments
    • Manually redo it in your own
    • Wrapping
    • Auto-rewrite to new environment
  • eg

25
Workflow interoperability
  • To inform interoperation, we need a layer of
    abstraction that captures behavioural semantics
  • Many non-standardised formalisms out there
    • Functional languages - one paradigm fits all?
    • Petri nets
    • Process algebras
    • Finite State Machines
    • All (hierarchical-) combinations of these
  • Challenge
    • Behavioural design patterns to compare
      formalism classes, e.g. PN and SDF Director
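
As an illustration of what a behavioural semantics pins down, consider synchronous dataflow (SDF), where each actor produces and consumes a fixed number of tokens per firing. A periodic schedule exists only if the balance equations q[src] * produced = q[dst] * consumed have a positive integer solution. The Python sketch below solves them for a small graph; it is not Ptolemy's implementation, and the function name and graph encoding are assumptions.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * produced == q[dst] * consumed.

    `edges` holds (src, dst, produced, consumed) tuples. Assumes a connected
    graph; returns the smallest positive integer solution, or None if the
    rates are inconsistent.
    """
    rate = {actors[0]: Fraction(1)}
    pending = list(edges)
    progress = True
    while pending and progress:
        progress = False
        for edge in list(pending):
            src, dst, produced, consumed = edge
            if src in rate and dst in rate:
                if rate[src] * produced != rate[dst] * consumed:
                    return None  # no consistent schedule exists
            elif src in rate:
                rate[dst] = rate[src] * produced / consumed
            elif dst in rate:
                rate[src] = rate[dst] * consumed / produced
            else:
                continue  # neither end known yet; retry on a later pass
            pending.remove(edge)
            progress = True
    scale = lcm(*(r.denominator for r in rate.values()))
    return {actor: int(r * scale) for actor, r in rate.items()}


# A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
print(repetition_vector(["A", "B", "C"],
                        [("A", "B", 2, 3), ("B", "C", 1, 2)]))
# {'A': 3, 'B': 2, 'C': 1}
```

A process network (PN) model imposes no such fixed rates, which is one reason the two formalism classes are hard to compare without a shared abstraction. Note that `math.lcm` requires Python 3.9 or later.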

26
Conclusions
  • From the Grid to the Semantic Grid
  • Reuse ≠ repurposing
  • Task and behavioural semantics both needed for
    repurposing
  • Design patterns for distributed processes: a
    long road ahead
    • Task semantics
    • Behavioural semantics

27
EPSRC funded UK eScience Program Pilot Project
Many slides taken from Carole Goble
28
  • Core
  • Matthew Addis, Nedim Alpdemir, Tim Carver, Rich
    Cawley, Neil Davis, Alvaro Fernandes, Justin
    Ferris, Robert Gaizaukaus, Kevin Glover, Carole
    Goble, Chris Greenhalgh, Mark Greenwood, Yikun
    Guo, Jan Humble, Ananth Krishna, Peter Li,
    Phillip Lord, Darren Marvin, Simon Miles, Luc
    Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay,
    Savas Parastatidis, Norman Paton, Terry Payne,
    Matthew Pocock, Milena Radenkovic, Stefan
    Rennick-Egglestone, Peter Rice, Ian Roberts,
    Martin Senger, Nick Sharman, Robert Stevens,
    Victor Tan, Anil Wipat, Paul Watson, Jimi
    Worthington and Chris Wroe.
  • Users
  • Simon Pearce and Claire Jennings, Institute of
    Human Genetics, School of Clinical Medical
    Sciences, University of Newcastle, UK
  • Hannah Tipney, May Tassabehji, Andy Brass, St
    Mary's Hospital, Manchester, UK
  • Steve Kemp, Liverpool, UK
  • Postgraduates
  • Martin Szomszor, Duncan Hull, Jun Zhao, Pinar
    Alper, Keith Flanagan, Antoon Goderis, Tracy
    Craddock, Alastair Hampshire
  • Industrial
  • Dennis Quan, Sean Martin, Michael Niemi, Syd
    Chapman (IBM)
  • Robin McEntire (GSK)
  • Collaborators
  • Keith Decker

29
References
  • Publications on
  • Home page: www.cs.man.ac.uk/goderisa
  • myGrid site: www.mygrid.org.uk