Title: Holding slide prior to starting show
1Holding slide prior to starting show
2(Some) Key Issues in Grid Computing
David Walker, School of Computer Science, Cardiff University
http://www.cs.cf.ac.uk/user/David.W.Walker
3Main Thesis of Talk
- At a surface level, many aspects of Grid Computing appear to be straightforward, and reduce to simple programming tasks and the use of existing tools.
- This talk aims to show that, for domain scientists to use the Grid effectively, many challenging CS issues need to be addressed.
4A Typical Scientific Process
5Key Elements of the Grid
- The specification of problems: how do you program the Grid?
- The dynamic discovery of Grid resources.
- Provenance support for Grid applications.
- The interoperability and federation of different Grid middleware stacks.
- Grid access to legacy applications.
- Support for remote collaboration over the Grid.
6A Simple Example
- A simple use of the Grid involves a PSE (problem-solving environment) or portal that performs a set of pre-determined tasks.
- This corresponds to the utility computing mode of use.
- No support for building new applications or services.
- No support for dynamic discovery of resources.
- No support for collaboration.
7Programming the Grid
- Problem specification could involve:
  - Use of a high-level, domain-specific programming/scripting language.
  - Representing coordinated tasks with a workflow graph assembled in a visual programming environment.
  - Use of recommender systems to assist users in formulating and solving problems.
8Workflow
- Commonly used to represent applications composed of interacting services.
- Services may be hierarchical, i.e. composed of other services.
- Easy to represent graphically, but the graphical representation does not scale with the number of services or the number of inputs/outputs.
9Workflow Composition
Would like to support the domain scientist in
designing workflows to solve problems.
10Problems in Workflow Composition
- How do you know that the input port of one service is compatible with the output port of another service, given that the services may have been created by different people/organisations?
- Type signatures must match, but semantics must also match.
11Annotating Services
- Supporting plug-and-play between services in a workflow requires the use of ontologies.
- Need to give semantic content (meaning) to service inputs and outputs.
- This allows composition hints in the form of semantic suggestions. For example, for a given service port we could find all services that could be connected to it.
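The idea of semantic suggestions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy ontology, the `Port` class, and the service names are all hypothetical, and a real system would query an OWL ontology through a reasoner rather than a dictionary.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: each concept maps to its parent concept.
ONTOLOGY = {
    "Spectrum": "Vector",
    "Vector": "NumericData",
    "Matrix": "NumericData",
    "NumericData": None,
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or is a descendant of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)
    return False


@dataclass(frozen=True)
class Port:
    service: str    # name of the service that owns the port
    name: str       # port name
    datatype: str   # syntactic type signature, e.g. "float[]"
    concept: str    # semantic annotation taken from the ontology


def compatible(output, input_):
    """An output can feed an input if the type signatures match and the
    output's concept is subsumed by the input's concept."""
    return output.datatype == input_.datatype and is_a(output.concept, input_.concept)


def suggest(output, inputs):
    """List every input port that could be connected to `output`."""
    return [port for port in inputs if compatible(output, port)]


fft_out = Port("FFT", "result", "float[]", "Spectrum")
candidates = [
    Port("Plot", "data", "float[]", "Vector"),      # same type, broader concept
    Port("Invert", "m", "float[][]", "Matrix"),     # different type signature
    Port("Filter", "signal", "float[]", "Matrix"),  # same type, wrong meaning
]
print([port.service for port in suggest(fft_out, candidates)])
```

The `Filter` port shows why matching type signatures alone is not enough: the types agree, but the semantic annotations do not.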
12Types of Workflow Composition
13Workflow Composition in Semantic Grids
- Semantic Web technologies enable automation at several levels: automated resource discovery, selection, management, service composition, and execution.
- Promises automated, seamless interoperation of autonomous, heterogeneous, distributed applications.
- Our focus is on the use of Semantic Web technologies to automate service composition in Grid environments.
- See S. Majithia, D. W. Walker, and W. A. Gray, Automatic Composition of Web Services, in Proceedings of the UK e-Science Programme All-Hands Meeting 2004. Available online at http://www.allhands.org.uk/proceedings/papers/148.pdf
- Main developer is Shalil Majithia.
14Framework - Overview
- WFMS: Workflow Manager Service
- AWFC: Abstract Workflow Composition Service
- CWFC: Concrete Workflow Composition Service
- RS: Reasoning Service
- MMS: Matchmaking Service
- AWFR: Abstract Workflow Repository
- CWFR: Concrete Workflow Repository
- RB: Rulebase
15Framework - Interactions
16Abstract Workflow Composer
- An abstract workflow specifies a workflow without referring to a specific service implementation.
- The Abstract Composer tries to generate an abstract workflow by using:
  - AWF Repository: stores semantically annotated descriptions of services and workflows. An ontology is used to match services.
  - Rulebase: a rulebase specifies the recipe to achieve an objective.
  - Chaining services: try to chain services by matching service outputs and inputs.
17Concrete Workflow Composer
- A concrete workflow specifies an executable workflow by referring to specific service implementations.
- The Concrete Composer tries to generate an executable workflow by using:
  - Matchmaking: match the abstract workflow with the service implementations available at that time.
  - Chaining services: try to chain services by matching service outputs and inputs.
18Other Components
- Matchmaker service: based on that of Paolucci et al., adapted for dynamic substitution.
- Chaining service: a backward-chaining service based on domain ontologies.
- Repositories: store semantically annotated abstract and concrete workflows.
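Backward chaining over service inputs and outputs can be illustrated with a small sketch. It is not the framework's chaining service: the service table and concept names below are hypothetical, and matching is by exact concept name rather than by ontology-based reasoning.

```python
# Hypothetical service descriptions: name -> (input concepts, output concept).
SERVICES = {
    "Parse": (("RawFile",), "Signal"),
    "FFT": (("Signal",), "Spectrum"),
    "Plot": (("Spectrum",), "Image"),
}


def backward_chain(goal, available, services, _seen=frozenset()):
    """Return an ordered list of service names that produces `goal` from the
    `available` concepts, or None if no chain exists."""
    if goal in available:
        return []                      # nothing to do, the data already exists
    if goal in _seen:
        return None                    # guard against cyclic dependencies
    for name, (inputs, output) in services.items():
        if output != goal:
            continue
        plan = []
        for needed in inputs:
            sub_plan = backward_chain(needed, available, services, _seen | {goal})
            if sub_plan is None:
                break                  # this service cannot be satisfied
            plan += [step for step in sub_plan if step not in plan]
        else:
            return plan + [name]       # all inputs satisfied
    return None


print(backward_chain("Image", {"RawFile"}, SERVICES))
```

Starting from the requested output and working back towards the available data is what makes this "backward" chaining; the plan is returned in execution order.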
19Implementation
- All components are implemented as Web services using an Axis server.
- Services and workflows are described using OWL-S.
- A DQL/JTP server is used for subsumption reasoning.
- The rulebase is implemented in RuleML.
- A plug-in module enables generation of concrete workflows in BPEL4WS.
Snippet of OWL-S Profile for FFT
20Family Tree Example
- Family trees have 3 basic relationships:
  - Spouse_of
  - Child_of
  - Parent_of
- Other relationships (aunt, grandparent, cousin, etc.) can be expressed in terms of these relationships through an ontology.
21Cousins Example
- Suppose we want to create a workflow to find the cousins of a given person, X.
- The query is submitted to the WFMS, which checks the AWF repository (i.e., checks the annotated names of workflows).
- If there is no match, the rulebase is checked.
22Rulebase
- Grandparents(X) = Parents[Parents[X]]
- Cousins(X) = exclude{Grandchildren[Grandparents(X)], Children[Parents[X]]}
- Note: there is no rule for Grandchildren[X]. The Chaining Service would deduce how to do this from the ontology.
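As a sanity check on the two rules, the following sketch evaluates them over a small family. The family data and function names are hypothetical, and `grandchildren` is written by hand here, whereas in the framework it would be derived by the Chaining Service.

```python
# Hypothetical family data: child -> set of parents (the Child_of relation).
PARENTS = {
    "X": {"P1", "P2"},
    "Sib": {"P1", "P2"},
    "P1": {"G1", "G2"},
    "Aunt": {"G1", "G2"},
    "C1": {"Aunt"},
    "C2": {"Aunt"},
}


def parents(people):
    """All parents of anyone in `people`."""
    return set().union(*(PARENTS.get(person, set()) for person in people))


def children(people):
    """All children of anyone in `people`."""
    return {child for child, ps in PARENTS.items() if ps & set(people)}


def grandparents(x):
    # Rule 1: the parents of the parents of X.
    return parents(parents({x}))


def grandchildren(people):
    # Not in the rulebase; composed from the basic relationships.
    return children(children(people))


def cousins(x):
    # Rule 2: grandchildren of X's grandparents, excluding the children of
    # X's own parents (X and X's siblings).
    return grandchildren(grandparents(x)) - children(parents({x}))


print(sorted(cousins("X")))
```

The exclusion step removes X and X's siblings, who are also grandchildren of X's grandparents.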
23Abstract Workflow From Rulebase
Atomic service
Composite service
24WF after Recursive Application of Rulebase
25WF after Application of Chaining Service
Note: opportunity for optimization and parallelism.
26Dynamic Resource Discovery and Scheduling
- Assume that semantically annotated services can be found through a registry or repository service.
- Scheduling of workflow nodes on distributed resources.
- Early binding model: bind to a specific service/platform at composition time (validation).
- Intermediate binding model: bind at compile time (when converting from XML to executable form).
- Late binding model: bind dynamically at runtime.
- Later binding allows the use of more up-to-date information to make scheduling decisions.
- In our framework, binding is done by the Matchmaker Service, and can follow any of the above binding models.
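The practical difference between early and late binding can be shown with a toy registry. Everything here is an assumption made for illustration: the `Registry` class, the host names, and the use of a single load figure as the selection criterion are hypothetical and do not describe the Matchmaker Service.

```python
class Registry:
    """Hypothetical registry: abstract service name -> {endpoint: current load}."""

    def __init__(self):
        self.endpoints = {}

    def publish(self, abstract, endpoint, load):
        self.endpoints.setdefault(abstract, {})[endpoint] = load

    def best(self, abstract):
        """Pick the least-loaded endpoint known right now."""
        candidates = self.endpoints[abstract]
        return min(candidates, key=candidates.get)


def bind_early(workflow, registry):
    """Resolve every node immediately; the bindings are fixed from then on."""
    return [(node, registry.best(node)) for node in workflow]


def run_late(workflow, registry):
    """Resolve each node only when it is about to run."""
    for node in workflow:
        yield node, registry.best(node)


registry = Registry()
registry.publish("FFT", "hostA", 0.2)
registry.publish("FFT", "hostB", 0.9)

early = bind_early(["FFT"], registry)     # bound at composition time
registry.publish("FFT", "hostA", 0.95)    # hostA becomes busy afterwards
late = list(run_late(["FFT"], registry))  # bound at runtime

print(early, late)
```

The early binding still points at `hostA` even though it has become the busier host, while the late binding picks `hostB` using the updated information.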
27Provenance Support in Service-Oriented Grids
- A workflow may produce many intermediate and final data products that may later need to be reviewed and analysed.
- A person, project, or organisation may need to archive many such workflows and their results.
- Want to store the provenance of data products: how they were produced and why.
- Main developer is Shrija Rajbhandari.
28Provenance
- Provenance can be regarded as historical metadata that provides an explanation of how a particular data product has been generated.
- Uniquely defines the derived data.
- Identifies what data is passed between services.
- Provides a traceable path to the origin of the data.
29Provenance Importance and Problem
- No known standards support archiving provenance in a service-oriented Grid environment.
- Requires recording the provenance of:
  - The transformation of data that occurred during the invocation of services in a workflow.
  - Complex services executed via a workflow engine.
30Original Motivation
- Would like to be able to view an electronic publication, and click on tables and figures of results to:
  - See how they were generated (requires a provenance browser).
  - Re-run the workflows that generated the results to verify them, or to perform a what-if study by changing the workflow inputs.
  - See the results of any re-run workflows in the same format as the original data (table or graph).
31Provenance Model
Diagram components: RDF Schema; Workflow Engine (BPWS4J); Provenance Server; Interface; PCS; PQS; Jena; provenance MySQL database.
- PCS: Provenance Collection Service
- PQS: Provenance Query Service
- Jena is a Java framework for building Semantic Web applications: http://jena.sourceforge.net/
32Prototype Provenance System
- Provenance Schema
  - Resource Description Framework (RDF).
  - Provenance of workflow execution.
- Provenance Collection Service (PCS)
  - Provenance is represented in RDF statements.
  - Database storage.
- Provenance Query Service (PQS)
  - Client interface to browse provenance.
  - Allows re-execution of retrieved provenance for what-if style analysis.
33Prototype Dataflow
34Services Composition and Invocation
- Compose Web services using BPEL4WS.
- Execute with a BPEL4WS-compliant engine: IBM's BPWS4J.
- Dynamically invoke Web services using the Web Services Invocation Framework (WSIF).
35Provenance Recording Example: Adding two numbers and multiplying the result by a third number
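A minimal sketch of what recording provenance for this add-then-multiply workflow might look like is given below. It is an illustration only: plain Python tuples stand in for RDF statements, the predicate names and the `ProvenanceStore` class are hypothetical, and no claim is made about the schema used by the prototype.

```python
import itertools


class ProvenanceStore:
    """Stand-in for a provenance collection service: keeps
    (subject, predicate, object) statements in memory."""

    def __init__(self):
        self.statements = []
        self._ids = itertools.count(1)

    def record(self, service, inputs, output):
        """Record one service invocation and pass its output through."""
        invocation = f"invocation/{next(self._ids)}"
        self.statements.append((invocation, "service", service))
        for value in inputs:
            self.statements.append((invocation, "input", value))
        self.statements.append((invocation, "output", output))
        return output

    def query(self, predicate, obj):
        """Find every subject that has the given predicate and object."""
        return [s for s, p, o in self.statements if p == predicate and o == obj]


# Hypothetical local functions standing in for remote Web services.
SERVICES = {"add": lambda a, b: a + b, "multiply": lambda a, b: a * b}


def run_workflow(a, b, c, store):
    """Compute (a + b) * c, recording each service invocation."""
    total = store.record("add", (a, b), SERVICES["add"](a, b))
    return store.record("multiply", (total, c), SERVICES["multiply"](total, c))


store = ProvenanceStore()
result = run_workflow(2, 3, 4, store)
print(result, store.query("output", result))
```

Because the inputs of every invocation are stored, the same workflow can be called again with one input changed, which is the basis of a what-if re-execution.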
36Provenance Recording (cont..)
37Provenance Recording (cont..)
38Provenance Query
39Re-execution for what-if analysis
40Support for Collaboration in Grid Environments
- Collaboration can take various forms.
- Making services available to others.
- Making workflows available to others.
- Making results available to others.
- Collaboratively steering an application.
- Collaborative visualisation of results.
41Resource-Aware Visualisation Environment (RAVE)
- Aims to develop a collaborative visualization environment that scales across a wide range of network-enabled devices.
- Will respond to changes in network bandwidth and the capabilities of the target display device.
- Will start by examining the VizServer and COVISE systems.
- RAVE postdoc is Dr Ian Grimstead.
42RAVE Overview
43RAVE Motivation
- Current systems make assumptions about available resources.
- RAVE makes use of local and/or remote resources, and can react dynamically to changes in these resources and the network connecting them.
44RAVE Infrastructure
- The RAVE infrastructure is based on Web services.
- Services are published and discovered through a UDDI server.
- Main services are:
  - Data Service.
  - Render Service.
45Data Service
- Imports data from a file, web resource, or external application.
- Acts as a central distribution point for the scene graph.
- Bridging services link to external applications.
46Render Service
- Render services connect to the Data Service, which accepts and broadcasts changes in the scene graph.
- Render services contain the complete scene graph.
- View may be rendered in mono or stereo mode.
- Multiple render sessions are supported.
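The relationship between the Data Service and the render services resembles a publish/subscribe arrangement, which the sketch below illustrates. The class and method names are hypothetical and the scene graph is reduced to a flat dictionary; RAVE's actual interfaces are not described in these slides.

```python
class RenderService:
    """Holds its own complete copy of the scene and applies broadcast changes."""

    def __init__(self, name):
        self.name = name
        self.scene = {}

    def apply(self, node_id, properties):
        self.scene[node_id] = properties


class DataService:
    """Central distribution point: accepts changes and broadcasts them."""

    def __init__(self):
        self.scene = {}        # node id -> properties
        self.renderers = []

    def connect(self, renderer):
        # A newly connected renderer first receives the full current scene.
        renderer.scene = dict(self.scene)
        self.renderers.append(renderer)

    def update(self, node_id, properties):
        # Accept the change, then broadcast it to every connected renderer.
        self.scene[node_id] = properties
        for renderer in self.renderers:
            renderer.apply(node_id, properties)


data = DataService()
desktop = RenderService("desktop")
data.connect(desktop)
data.update("camera", ("position", 0, 0, 5))

late_joiner = RenderService("late-joiner")
data.connect(late_joiner)               # receives the existing camera node
data.update("mesh", ("colour", "red"))  # broadcast to both renderers

print(desktop.scene == late_joiner.scene == data.scene)
```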
47Thin Client
- A thin client is a client with modest rendering capabilities, e.g., a PDA.
- It can connect to a remote render service and make requests for off-screen rendered copies of the data.
- The local user can still manipulate the camera and underlying data.
48RAVE on Zaurus PDA
49Connecting to an Application
- The Data Service can receive live updates from an external application via a bridging service.
- Future work will extend this to allow computational steering.
50Other Grid Projects
- Quality of Service: http://www.cs.cf.ac.uk/user/Rashid/
- Grid-Enabled Computational Electromagnetics (GECEM): http://www.wesc.ac.uk/projects/gecem/
- Workflow Optimization Services for e-Science (WOSE): http://www.wesc.ac.uk/projects/wose/
51Summary
- Semantic Web technologies play a key role in enabling:
  - Plug-and-play in the composition of services to create workflows.
  - Dynamic discovery of resources.
  - Support for provenance.
- The above, together with collaborative visualisation, are important in convincing scientists (and others) to use the Grid.