Designing and Building a Biodiversity Grid: Experiences from the BiodiversityWorld Project

1
Designing and Building a Biodiversity Grid:
Experiences from the BiodiversityWorld Project
  • Andrew C. Jones
  • Cardiff University, UK
  • Andrew.C.Jones_at_cs.cardiff.ac.uk

2
The BiodiversityWorld project
  • 3-year e-Science project funded by the UK BBSRC
    research council
  • Universities of Reading, Cardiff and Southampton;
    The Natural History Museum (London)

3
Some Background ...
4
The GRID and e-Science
  • "A computational grid is a hardware and software
    infrastructure that provides dependable,
    consistent, pervasive, and inexpensive access to
    high-end computational capabilities" (Foster and
    Kesselman, The Grid)
  • "e-Science is about global collaboration in key
    areas of science and the next generation of
    infrastructure that will enable it. The
    infrastructure to enable this science revolution
    is generally referred to as the Grid" (Hey and
    Trefethen, The UK e-Science Core Programme and
    the Grid)

5
GRAB (GRid And Biodiversity)
  • 6-month DTI-funded demonstrator project
  • Project aim:
  • Assess the Grid's potential for collaborative
    research in biodiversity informatics
  • Supporting discovery and use of diverse
    biodiversity-related databases
  • Exploring use of Globus and SRB middleware

6
7
GRAB resource types
...
GRAB interface
  • Catalogue of life
  • Scientific and common names
  • Species Information System (SIS)
  • Images and geography
  • Climate
  • Max/min temperature and annual precipitation

8
Issues in GRAB
  • Problems installing Globus research software
  • Essentially wanted to send distributed requests
    and receive responses
  • Initial HTTP-based prototype worked well
  • Versions of SRB then available had little to
    offer
  • Globus 2 approach needed canned queries,
    temporary files, etc.; much more difficult than
    the HTTP prototype
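
The "send distributed requests, receive responses" pattern mentioned above can be sketched roughly as follows. This is an illustrative sketch only, not the GRAB prototype's code: the three `query_*` functions and their return values are hypothetical in-process stand-ins for what would have been HTTP calls to remote databases.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote sources; in a real deployment each
# function would issue an HTTP request to a remote database instead.
def query_catalogue(name):
    return {"source": "catalogue", "accepted_name": name}

def query_species_info(name):
    return {"source": "sis", "description": f"notes on {name}"}

def query_climate(name):
    return {"source": "climate",
            "layers": ["max_temp", "min_temp", "precipitation"]}

SOURCES = [query_catalogue, query_species_info, query_climate]

def fan_out(name, sources=SOURCES):
    """Send the same request to every source in parallel and collect replies."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(src, name) for src in sources]
        # Replies are gathered in submission order, one per source.
        return [f.result() for f in futures]

responses = fan_out("Vicia faba")
```

Each source is queried independently, so a slow or unavailable database delays only its own reply rather than serialising the whole request.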

9
What we want to achieve
10
Some difficult Biodiversity questions
  • How should conservation efforts be concentrated?
  • (example of Biodiversity Richness and
    Conservation Evaluation)
  • Where might a species be expected to occur, under
    present or predicted climatic conditions?
  • (example of Bioclimatic and Ecological Niche
    Modelling)
  • How can geographical information assist in
    selection among possible phylogenetic trees?
  • (example of Phylogenetic Analysis and
    Palaeoclimate Modelling)

11
Some relevant resource types
  • Data sources
  • Catalogue of life
  • Species Information Sources (SISs)
  • Species geography
  • Descriptive data
  • Specimen distribution
  • Geographical
  • Boundaries of geographical and political units
  • Climate surfaces
  • Genetic sequences
  • Analytic tools
  • Biodiversity richness assessment: various
    metrics
  • Bioclimatic modelling: bioclimatic envelope
    generation
  • Phylogenetic analysis (generation of
    phylogenetic trees)

12
Some challenges
  • Finding the resources
  • Knowing how to use these heterogeneous resources
  • Originally constructed for various reasons
  • Often little thought was given to standards or
    interoperability
  • So need to have appropriate associated metadata

13
Our vision (1)
  • Biodiversity Problem Solving Environment
  • Heterogeneous, diverse resources
  • Facilitating integration of both legacy and
    newly-developed resources
  • Flexible workflows
  • Main challenges centre around metadata,
    interoperability, resource discovery, etc.
  • High-performance computing secondary (though
    relevant)

14
Our vision (2)
  • Distinctive features
  • a biodiversity informatics GRID
  • interoperability with heterogeneous data, complex
    in structure
  • resilience to infrastructure change;
    interoperation with other GRIDs
  • interactive collaboration a secondary concern
  • We want to automate tasks such as the following
    analysis:

15
16
17
18
Our architecture
19
BiodiversityWorld as a flexible PSE
20
Interoperability in BiodiversityWorld
  • Initial proof-of-concept prototype used Java
    RMI; no serious attention to interoperability at
    that stage
  • Have now defined the BiodiversityWorld-Grid
    Interface (BGI), addressing the need to:
  • wrap resources to remove needless heterogeneity
  • wrap the wrapped resources (!) to insulate from
    infrastructure change
  • use metadata to cope with remaining heterogeneity
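
The two layers of wrapping described above might look something like the following. This is a minimal sketch under stated assumptions, not the actual BGI: every class, method and operation name here (`LegacyNameDatabase`, `WrappedResource`, `Transport`, `resolve_name`, and so on) is hypothetical, and the transport is a simple in-process stand-in for a real Grid or RMI mechanism.

```python
from abc import ABC, abstractmethod

class LegacyNameDatabase:
    """Hypothetical legacy resource with its own idiosyncratic API."""
    def lookup_by_latin(self, latin):
        return "ACCEPTED:" + latin.strip().capitalize()

class WrappedResource(ABC):
    """Layer 1: a uniform, operation-based interface over any resource."""
    @abstractmethod
    def invoke(self, operation, **params): ...

class NameResource(WrappedResource):
    """Wrapper that hides the legacy call style and result format."""
    def __init__(self, db):
        self._db = db

    def invoke(self, operation, **params):
        if operation == "resolve_name":
            raw = self._db.lookup_by_latin(params["name"])
            return {"accepted_name": raw.split(":", 1)[1]}
        raise ValueError(f"unsupported operation: {operation}")

class Transport(ABC):
    """Layer 2: hides how a call actually reaches the wrapped resource."""
    @abstractmethod
    def call(self, resource_id, operation, **params): ...

class InProcessTransport(Transport):
    """Stand-in transport; a Grid-based one would implement the same method."""
    def __init__(self, registry):
        self._registry = registry

    def call(self, resource_id, operation, **params):
        return self._registry[resource_id].invoke(operation, **params)

transport = InProcessTransport({"names": NameResource(LegacyNameDatabase())})
result = transport.call("names", "resolve_name", name="  vicia faba ")
```

Client code depends only on `Transport.call`, so replacing the underlying infrastructure means supplying another `Transport` subclass while the resource wrappers stay unchanged.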

21
BiodiversityWorld architecture


(Architecture diagram; labels recovered from the slide)
  • User interface
  • Presentation
  • Workflow enactment engine
  • Metadata repository
  • BiodiversityWorld Resources
  • Wrapped native resources
  • BGI API
  • BiodiversityWorld-GRID Interface (BGI)
  • The GRID

22
BGI architecture
23
Some implications
  • Wrapping
  • Various ways of introducing resources (see later)
  • Computationally intensive applications
  • Assume these will lie within a single BDW
    resource
  • Interoperability with other Grids
  • Could wrap non-BDW resources
  • Could rely on (e.g.) WSRF for communications with
    our GRID
  • Highly interactive applications
  • BGI OK for coarse-grained interaction; other
    possibilities (see later)

24
Resources for BiodiversityWorld
(Diagram of seven numbered resource configurations; recoverable labels:
Wrapped non-Java resource; Grid software (of some sort))
25
User interaction with BDW
26
Example work-flow (Climate-space Modelling)
  • Submit scientific name; retrieve accepted name
    and synonyms for species
  • Present or recent climate surfaces
  • Retrieve distribution data for species of interest
  • Model of climatic conditions where species is
    currently found
  • Prediction of suitable regions for species of
    interest
  • Possibly different climate surfaces (e.g.
    predicted climate)
  • World or regional maps
  • Projection of predicted distribution on to base
    map
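
The steps of this work-flow can be sketched as a chain of small functions. The sketch below is illustrative only: all data values are invented, the function names are hypothetical, and a simple rectangular (min/max) climate envelope is assumed as the model, since the slides do not say which modelling technique was used. The final map projection step is omitted.

```python
# Toy data: all values are invented for illustration only.
SYNONYMS = {"Species A": ["Species A", "Species a-old"]}
OCCURRENCES = {"Species A": [(0, 0)], "Species a-old": [(1, 1)]}
# Climate surfaces: grid cell -> (mean temperature, annual precipitation)
PRESENT = {(0, 0): (10.0, 800), (0, 1): (18.0, 300),
           (1, 0): (11.0, 850), (1, 1): (12.0, 900)}
PREDICTED = {(0, 0): (13.0, 700), (0, 1): (11.5, 820),
             (1, 0): (12.0, 880), (1, 1): (15.0, 600)}

def resolve_names(name):
    """Accepted name plus synonyms for a submitted scientific name."""
    return SYNONYMS.get(name, [name])

def distribution(names):
    """Grid cells where the species was recorded under any of its names."""
    return [cell for n in names for cell in OCCURRENCES.get(n, [])]

def fit_envelope(cells, surface):
    """(min, max) of each climate variable over the known localities."""
    values = [surface[c] for c in cells]
    return [(min(v[i] for v in values), max(v[i] for v in values))
            for i in range(2)]

def predict(envelope, surface):
    """Cells whose climate values all fall inside the envelope."""
    return sorted(cell for cell, vals in surface.items()
                  if all(lo <= v <= hi for v, (lo, hi) in zip(vals, envelope)))

names = resolve_names("Species A")
cells = distribution(names)
envelope = fit_envelope(cells, PRESENT)
suitable_now = predict(envelope, PRESENT)
suitable_future = predict(envelope, PREDICTED)
```

Fitting the envelope on present climate and then applying it to a different surface corresponds to the "possibly different climate surfaces" branch of the work-flow.
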
27
BDWorld / Triana in operation: Workflow creation
(design, editing)
28
Triana screen-shots
29
Triana screen-shots
30
Triana screen-shots
31
Triana screen-shots
32
BDWorld / Triana in operation: Workflow
execution (enactment, run-time)
33
Triana screen-shots
34
Triana screen-shots
35
Triana screen-shots
36
Triana screen-shots
37
Workflows
  • Creating a workflow
  • Workflows clearly good for capturing complex
    tasks
  • Good for tweaking tasks
  • But is this how users think?
  • If not, we should provide an environment that
    supports a more exploratory approach too, e.g.
  • User tries out some small subtasks
  • (S)he joins results together
  • System records interactions, so re-usable
    workflows can be composed
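
The exploratory approach suggested above, where the system records what the user does so that a workflow can be composed afterwards, might be sketched like this. The `Recorder` class and its methods are hypothetical and are not part of BDWorld or Triana; the subtasks are trivial list operations chosen only to keep the example short.

```python
class Recorder:
    """Runs subtasks for the user and records each step for later replay."""
    def __init__(self, **inputs):
        self.values = dict(inputs)  # name -> value (inputs and results)
        self.steps = []             # recorded (output, function, input names)

    def run(self, output, func, *input_names):
        """Apply func to named values, store the result, and log the step."""
        self.values[output] = func(*(self.values[n] for n in input_names))
        self.steps.append((output, func, input_names))
        return self.values[output]

    def as_workflow(self):
        """Turn the recorded steps into a function of fresh inputs."""
        steps = list(self.steps)

        def workflow(**inputs):
            values = dict(inputs)
            for output, func, input_names in steps:
                values[output] = func(*(values[n] for n in input_names))
            return values

        return workflow

# The user tries out small subtasks, then joins the results together.
session = Recorder(a=[3, 1, 2], b=[5, 4])
session.run("a_sorted", sorted, "a")
session.run("b_sorted", sorted, "b")
session.run("joined", lambda x, y: x + y, "a_sorted", "b_sorted")

# The recorded interactions become a re-usable workflow.
workflow = session.as_workflow()
replayed = workflow(a=[9, 7], b=[8])
```

Because steps refer to values by name rather than by content, the same recorded sequence can be re-run on different inputs.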

38
Other aspects of user interface
  • The drag-and-drop metaphor needs further
    research into the best ways to support:
  • resource discovery
  • resource matching
  • data management (e.g. temporary storage of
    intermediate results)

39
Complex interactions
  • BGI not well-suited to fine-grained interaction
  • Stand-alone applications difficult to wrap
  • may need, e.g., screen scraping
  • We're looking at:
  • Less portable by-pass mechanisms, e.g.
  • New BGI protocol
  • Existing techniques (in extremis) e.g. VNC
  • Plug-ins for the BDW client
  • External tools
  • (which will always be needed)

40
Role of metadata
  • Metadata is needed to enable discovery of
    resources and to indicate how they are to be used
  • Properties to help locate appropriate resources
  • Check interoperability, suggest transformations
  • Provenance of data sets
  • Log of work-flows executed
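
Two of the metadata roles listed above, locating resources by their properties and checking whether two resources can be connected, can be sketched as follows. The record fields, format labels and the `latlon_to_grid` converter name are all hypothetical; the slides do not describe the actual BDW metadata schema. Provenance and work-flow logging are not covered here.

```python
# Hypothetical metadata records; the field names are illustrative only.
RESOURCES = [
    {"id": "occurrences", "kind": "data", "topic": "distribution",
     "output": "points/latlon"},
    {"id": "envelope_model", "kind": "tool", "topic": "bioclimatic modelling",
     "input": "points/grid"},
    {"id": "richness", "kind": "tool", "topic": "richness assessment",
     "input": "points/latlon"},
]
# Known converters between data formats: (from, to) -> converter name.
TRANSFORMS = {("points/latlon", "points/grid"): "latlon_to_grid"}

def discover(**properties):
    """Ids of resources whose metadata matches every given property."""
    return [r["id"] for r in RESOURCES
            if all(r.get(k) == v for k, v in properties.items())]

def check_link(producer_id, consumer_id):
    """Can the producer's output feed the consumer's input?"""
    by_id = {r["id"]: r for r in RESOURCES}
    out_fmt = by_id[producer_id]["output"]
    in_fmt = by_id[consumer_id]["input"]
    if out_fmt == in_fmt:
        return ("compatible", None)
    if (out_fmt, in_fmt) in TRANSFORMS:
        # Formats differ, but a known transformation bridges them.
        return ("needs_transform", TRANSFORMS[(out_fmt, in_fmt)])
    return ("incompatible", None)

tools = discover(kind="tool")
direct = check_link("occurrences", "richness")
converted = check_link("occurrences", "envelope_model")
```

A workflow editor could call something like `check_link` whenever the user connects two components, and offer the suggested transformation when the formats do not match directly.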

41
Architectural issues in BDW
  • Globus 3 provides Grid Services, but still
    evolving (WSRF in Globus 4)
  • Trade-off: abstraction layer (BGI) including
    invocation mechanism
  • Insulates from change
  • Wraps resources to remove needless heterogeneity
  • Wraps the wrapped resources (!) to insulate from
    infrastructure change
  • (3 implementations now: Java RMI-, Globus OGSA-
    and Web Services-based)
  • Performance penalty
  • Assume computationally intensive applications lie
    in a single BDW resource
  • Hinders interoperation with other
    Grid/Web services

42
A dream
  • Desktop environment in which scientists drag
    and drop data sources, analysis and modelling
    tools, and visualisation interfaces into a
    desired sequence of operations which can be run
    automatically
  • BDWorld just about at this stage
  • With additional features (some described
    earlier), the environment could be made richer,
    more productive, and support research groups.
  • Essentially a component-based visual programming
    environment
  • Not just for biodiversity!

43
Acknowledgements
  • UK DTI, EPSRC and BBSRC; EU
  • Collaborators at
  • Cardiff University
  • Southampton University
  • The University of Reading
  • The Natural History Museum (London)
  • Organisations that have co-operated with these
    research projects, especially
  • Species 2000
  • ILDIS
  • FishBase
  • Hadley Centre for Climate Prediction and Research
