Title: Grid Portals for Earth Science: State-of-the-Art Survey
1. Grid Portals for Earth Science: State-of-the-Art Survey
SSA- IST 2005-034619
3rd DEGREE Workshop, KNMI, May 2008
2. Why GRID for Earth Science?
- Earth Science is fragmented into many disciplines
- Each focusing on parts of the puzzle
- Real Earth systems interacting through many interfaces
- Large-scale environmental modelling
- Earth Systems interacting at various spatio-temporal scales
- Data integration from different sources
- Satellite and in-situ instruments
- Global, multi-dimensional coverage
- Long time series and historic data, increasing
- Explosion of data: the "Data Deluge"
- Scientists and instruments widely scattered across geographical and organizational boundaries
- Requirement for large-scale computing networks
- Large infrastructure building is ongoing
- GRID model for large-scale "loosely distributed" computing
- Organizations keep full local control of resources
- But can easily share them when needed
- To support collaborations among "Virtual Organizations"
- Within one organization
- Across distributed organizations
3. DEGREE Objectives
www.eu-degree.eu
- Disseminate and promote uptake of Grid in the wider ES community and integrate newcomers
- Reduce the gap between ES users and Grid technology
- Explain and convince ES users of Grid benefits and capability to tackle new and complex problems
- Approach
- Focus on four specific areas
- Grid Application Families, Requirements and Test Suites
- Grid Job Control and Workflow Management
- Grid Data Management
- Portals for Grid, SOA and eCollaborations
- For each area
- Establish the state of the art in ES and other science communities
- ES requirements gathering and analysis
- Gap analysis
- Provide key requirements and recommendations for input to the ES Grid Roadmap
- Elaborate a set of ES application test suites targeted to illustrate and test selected ES requirements
- Generate ES Grid Roadmap
- Disseminate results to the wide ES and Grid communities
4. ES Grid Portals: Objective and Approach
- Objective
- Portals are in the critical path between User and Middleware
- As such, a key element to increase Grid uptake and exploitation
- Where do we stand now and what needs to be done to ensure adequacy for the future
- on Middleware (services) side
- on Portals side
- Approach to the task
- Establish current state of the art
- in ES
- in other e-Science communities
- Analyze the present solutions
- what are the ES Portals requirements?
- are they met?
- what can be improved and how?
5. State-of-art Surveys
- ES Grid Portals Survey
- establish current state-of-art baseline
- extent of uptake of Grid technologies in ES Portals
- methods, techniques and middleware used
- types of solutions where Grid is used
- ES Portals requirements
- Deliverable D4.1 (available on the web)
- Generic Grid Portals Survey
- looking outside ES Community
- other e-science communities' approach to Grid Portals
- focus on generic Portals middleware solutions
- analysis of ES Grid Portals requirements vs. middleware gaps
- recommendations and inputs for ES Grid Roadmap
- Deliverable D4.2 (available on the web)
www.eu-degree.eu
6. Results Overview
- State-of-art surveys
- Wide range of ES Portals scenarios exploiting Grid, SOA, eCollaboration
- Technology state-of-art summarized
- Workshop on ES Grid Portals
- CRS4, Sardinia
- Attended by a good balance of Grid and ES community people
- ES Grid Portals Classification
- ES Grid Portals Requirements
- Gap Analysis
- Recommendations
- towards ES
- towards Middleware Services
7. State-of-art Surveys: Results
- ES Grid Portals Survey
- Established current Grid technology uptake from surveys of 17 Earth Science Grid Portals, taken from an initial list of 32
- ES Portals Classification derived
- ES Portals Requirements gathered and analyzed
- Applications covered
- Generic, Ocean, Atmosphere, Cryosphere, Land Surface, Solid Earth
- Key Technologies
- Grid, e-Collaboration, Webservices, SOA, Ontology, Semantic Web, Metadata, Data Access
8. State-of-art Surveys: Results
- Generic Grid Portals Survey
- Detailed surveys of 14 Portals from outside the ES Community, selected from 37, to find state-of-the-art technology
- Gap Analysis
- Recommendations
- Applications covered
- Generic Grid computing, VRE, social science, digital media, bioinformatics, geosciences, meteorology, emergency handling
- Key Technologies
- Grid, e-Collaboration, Webservices, SOA, Portal, Metadata
9. ES Portals Classification
- Data Dissemination
- Discover, identify, access ES data
- Publish ES data and make it available to users
- Using Grid for data exchange and sharing
- Collaborative
- Online collaborative ES Virtual Communities
- Provide collective focal point for special interest groups
- Using Grid for communication and working together
- Sharing a common subset of ES tools and data
- Grid-based
- ES data intensive processing
- Access to service-based Grid infrastructure and resources for dynamic processing of ES specialist datasets
- Using Grid for high performance data processing
- "On demand" sharing of complementary resources
10. ES Portals Example: Data Dissemination
- GEONETWORK Portal
- optimized to support spatial data
- allows sharing of geo-referenced thematic datasets in a wide community of spatial data users
- enables access to geo-referenced databases, cartographic products and related metadata from a variety of sources
- standard
- implements and extends ISO 19115 Geographic Metadata, and OGC
- unifying approach is offered to the community, free and open source
- de-centralized
- nodes installed in individual organizations
- single entry point
- distributed search
- users can
- locate and access the data for creating new maps combining various layers of information
- processing is done off-line
- publish the new maps using the same Portal
- types of users
- Decision makers, development planners, humanitarian and emergency managers
- GIS experts, multidisciplinary geographical spatial data analysts and forecasters
- Researchers and value adders
11. ES Portals Example: Collaborative
- SSE (Service Support Environment)
- Common web portal based framework
- Allows service providers to easily make their services available to a broad community
- SSE Service directory contains a wide range of basic and complex ES community services
- Services are integrated directly in the Portal, or can remain in the service provider's environment and be accessed via the Portal
- New services can be composed using the SSE Workflow
- Services available
- Data and information provision, data conversion and processing, data delivery
- Thematic mapping, land use, environmental monitoring, etc.
- Product searches
- Demonstrations and promotions of new EO environment monitoring services
- Heterogeneous access to multi-mission satellite data
- Other new services under development
- Others in this group: AMI4FOR, UNEP, ETHER, TheVoice
12. ES Portals Example: Grid-based
- GRID-IFY (Grid-On-Demand)
- Spatial data (EO) application and Grid integration Portal framework environment
- Integration of EO Catalogues for product search and retrieval, OGC WMS for displaying product overlays on top of world maps
- Security management
- User registration, login and automated management of certificates using MyProxy
- At time of registration the user is assigned privileges to access specific applications, processing algorithms, services and data
- Application porting
- Implementation and configuration of an application decomposed into a set of processing modules/services using a basic Grid deployment framework model
- Simplified access to Grid services exploiting state-of-the-art Grid standards and technology
- Grid-based application deployment
- Combined scheduling of data and jobs, execution using Grid resources
- Desktop as well as web user interfaces
- Others in this group: IMPECT, VGISC, WEBGRECL, DATACROSSING, IDEAS, KWF-GRID, MEDIGRID
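As an illustration of the "OGC WMS for displaying product overlays" item, the sketch below builds a WMS 1.1.1 GetMap request URL. It is not taken from any of the surveyed portals: the endpoint `https://example.org/wms`, the layer names and the function name `build_getmap_url` are placeholders assumed for the example. Only the request parameters themselves come from the WMS specification.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Build an OGC WMS 1.1.1 GetMap request URL.

    bbox is (min_x, min_y, max_x, max_y) expressed in the given SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),      # drawn in the order listed
        "STYLES": "",                    # empty value requests default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
        "TRANSPARENT": "TRUE",           # lets the overlay sit on a base map
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer names, for illustration only.
url = build_getmap_url(
    "https://example.org/wms",
    ["world_base", "eo_product"],
    (-10.0, 35.0, 30.0, 60.0),
    800, 500,
)
print(url)
# Round-trip check: the bounding box survives URL encoding.
print(parse_qs(urlsplit(url).query)["BBOX"])
```

A portal would typically hand such a URL to a map widget or image tag; the request itself is plain HTTP, which is one reason WMS is easy to embed in existing portals.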
13. ES Grid Portals Requirements
- Generic requirements
- Interoperability between different Grid MW infrastructures
- Reliability, QoS
- Guaranteed fast turnaround
- Standard "off-the-shelf" tools for integrated Grid Security and User management
- Dynamic content authoring, addition of customized services, registration of available resources
- User support, how-to, tutorials
- ES specific requirements
- Strong emphasis on Metadata and Data, its Discovery and Access
- Working with very large datasets and file numbers
- Integration of heterogeneous distributed services (Grid Geo-services, OGC)
- Support "Gridification" in Geo-services and Spatial Data standards
- Tools and interfaces readily usable by ES scientists
- as application assembler as well as end user
- automated tools to assist deployment of ES applications and libraries on the Grid
- Facilitate integration with ES web services
- Interoperability with ES data catalogues
- Support for Earth Science sensors and thematic data
14. Key Requirements by ES Portal Type
- Data dissemination
- Registration and publication of new sources of data
- Search, locate and discover details of registered data collections
- Access to data
- Collaborative
- Structured, customized organization of the portal pages according to dedicated application themes, activities and functions
- Facilitate customization of the portal information and content by the real-time integration of contributions from individual users
- User identity management, access permissions control, account settings and customization of the individual user's environment
- Customized domain-specific tools for e-Collaboration
- Grid-based
- Front-end user interface for large-scale dynamic processing of ES specialist datasets
- Orchestration and coordination of low-level tools and services
- Ability to interface to different infrastructures
- Generic framework model to facilitate addition and easy gridification of new ES applications, independent of middleware implementation specifics
- Provide ready access to large Grid-based ES data collections and support the easy integration of new data collections for use in the data processing of Grid-based ES applications
15. ES Grid Portals Generic Model
- A Generalized Component Model View
- Requirements and design objectives serving two different domains
- End-user ease-of-use
- Application-developer ease-of-assembly
[Figure: generalized component model. Component labels: Application Data Services; Data Analysis Modelling; Search Catalogs; Data Processing; Data Access; Visualization; e-Collaboration; Webmap; Data Visualization; User Forum; News Announcements; Computation Results; Models; e-Communication Tools]
- Front-end user interface
- Domain of the end-user
- Reusable services stored in workflow repository
- Users can invoke available workflows and compose new ones
- Back-end services
- Domain of the application-developer
- Assemble new service components
- Publish services in workflow repository
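The split between the two domains can be sketched in a few lines of code. The following is a minimal, purely illustrative model and not the design of any surveyed portal: the class `WorkflowRepository`, its method names and the two toy services are assumptions made up for this example. Application developers publish services, and end users compose and invoke workflows built from them.

```python
from typing import Any, Callable, Dict, List


class WorkflowRepository:
    """In-memory stand-in for the workflow repository (hypothetical API)."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[Any], Any]] = {}
        self._workflows: Dict[str, List[str]] = {}

    # Back-end domain: the application developer publishes a service.
    def publish_service(self, name: str, func: Callable[[Any], Any]) -> None:
        self._services[name] = func

    # Front-end domain: the end user chains published services by name.
    def compose(self, workflow_name: str, service_names: List[str]) -> None:
        missing = [s for s in service_names if s not in self._services]
        if missing:
            raise KeyError(f"unknown services: {missing}")
        self._workflows[workflow_name] = list(service_names)

    # Front-end domain: run a stored workflow, feeding each output onward.
    def invoke(self, workflow_name: str, data: Any) -> Any:
        for service_name in self._workflows[workflow_name]:
            data = self._services[service_name](data)
        return data


repo = WorkflowRepository()
# Toy services standing in for real ES processing components.
repo.publish_service("kelvin_to_celsius",
                     lambda temps: [t - 273.15 for t in temps])
repo.publish_service("mean", lambda values: sum(values) / len(values))

repo.compose("mean_celsius", ["kelvin_to_celsius", "mean"])
result = repo.invoke("mean_celsius", [273.15, 283.15])
print(result)
```

The point of the sketch is the separation of roles: the end user only sees workflow names, while service implementations can change behind the repository.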
16. Gap Analysis
- Gap between Grid and local portal user management
- Request Grid CA certificate, register with VO, request local accounts, map Grid credentials to the local ones
- Single user identity for Grid but different local identities
- Allow an authenticated user to move seamlessly among different Grid portals
- Gap between the SOA requirements for portals and the available Grid-services resource framework
- Advanced SOA framework not available in gLite
- WSRF GT4 framework is often used in place of gLite
- OGSA-DAI and OMII partly cover the gap, but a standardized Grid-services resource framework for gLite is not solved yet
- Gap in the interoperability between portals
- Grid portals currently cannot reuse or federate their services for metadata searching and data visualization, using e.g.
- metadata schemas (e.g. FGDC)
- XML query language (XQuery)
- self-describing data formats (NetCDF, HDF)
- streaming protocols (OpenGIS WMS, WCS and Unidata OPeNDAP)
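The step "map Grid credentials to the local ones" is commonly handled with a mapping from certificate subject (DN) to local account. The sketch below assumes a file layout modelled on the Globus grid-mapfile convention (a quoted DN followed by a local user name). The sample DNs, the account names and the function names are invented for illustration.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<certificate DN>" <local_account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex honours the double quotes around the DN, which has spaces.
        dn, local_account = shlex.split(line)
        mapping[dn] = local_account
    return mapping


def local_account_for(dn, mapping):
    """Return the local account for a Grid identity, or None if unmapped."""
    return mapping.get(dn)


# Invented sample entries; real DNs are issued by a Grid CA.
SAMPLE = '''
# certificate subject (DN)              local account
"/C=NL/O=ExampleOrg/CN=Alice Example" alice
"/C=IT/O=ExampleLab/CN=Bob Example"   esuser02
'''

gridmap = parse_gridmap(SAMPLE)
print(local_account_for("/C=NL/O=ExampleOrg/CN=Alice Example", gridmap))
print(local_account_for("/C=FR/O=Unknown/CN=Nobody", gridmap))
```

The gap described on this slide is visible here: each portal keeps its own copy of such a mapping, so one Grid identity ends up with a different local identity at every portal.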
17. Gap Analysis
- Focus must be on ES functionality; the grid working as back-end does not have to be visible
- Higher level components targeted to ES
- Computation submission without descending to grid-job level, maybe even hiding the grid completely
- Big emphasis on Metadata and Data, its Discovery and Access
- Browsing and accessing datasets the ES way
- Support for Spatial Data: INSPIRE, SDI, ...
- Spatial data searches and OGC services (e.g. WMS)
- Tools integrated with the Grid
- Interoperability and interchange
- support for standard tools/protocols (ISO 19115, OPeNDAP, LAS, DODS, NetCDF, integration with OGSA-DAI)
- ontology / semantic web (developing the rudiments)
- Publish, subscribe, notify
- Search, locate, access and process ES datasets of interest
18. Gap Analysis
- Graphical interfaces for different kinds of ES data, activated by data type
- Input specification, e.g. area selection on a map for subset selection
- Output visualization and browsing components for displaying time series of images, image layering
- Such components exist, but they use different technologies and APIs
- Standard "off-the-shelf" tools for integrated Grid Security and User Management
- Interfacing Grid security and ES security / Portal login models
- User management integrated with certificate management
- Certificates generated on the portal by the portal, transparently
- Logging in to a portal should be enough to authenticate the user
- There are existing activities and software to remedy this, e.g. PURSE (EarthScienceGrid), GAMA, ...
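The "area selection on a map for subset selection" input can be reduced to a small computation: turning a bounding box into index ranges on a gridded dataset. The sketch below assumes a regular latitude/longitude grid whose points sit at `origin + i * step`. The function names and the 1-degree global grid are assumptions for the example, not part of any surveyed portal.

```python
import math


def axis_slice(lo, hi, origin, step, count):
    """Indices of grid points with origin + i*step inside [lo, hi]."""
    first = max(0, math.ceil((lo - origin) / step))
    last = min(count - 1, math.floor((hi - origin) / step))
    # max() keeps the slice empty when the range misses the grid.
    return slice(first, max(first, last + 1))


def bbox_to_slices(bbox, lat0, lon0, step, n_lat, n_lon):
    """Convert (min_lon, min_lat, max_lon, max_lat) to (lat, lon) slices."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return (axis_slice(min_lat, max_lat, lat0, step, n_lat),
            axis_slice(min_lon, max_lon, lon0, step, n_lon))


# Assumed grid: 1-degree spacing, latitudes -90..90, longitudes -180..179.
lat_slice, lon_slice = bbox_to_slices(
    (-10.0, 35.0, 30.0, 60.0),
    lat0=-90.0, lon0=-180.0, step=1.0, n_lat=181, n_lon=360,
)

# Stand-in data: each cell just records its own (row, column) index.
grid = [[(i, j) for j in range(360)] for i in range(181)]
subset = [row[lon_slice] for row in grid[lat_slice]]
print(lat_slice, lon_slice, len(subset), len(subset[0]))
```

The same slices could be applied to a NetCDF variable or sent as an index-range constraint to a data server, so the map widget and the data access layer only need to agree on the bounding box.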
19. Gap Analysis
- Another approach: integrating grid into existing portals
- Light-weight grid service interfaces to grid functionality for easy integration and/or mash-up creation
- Like Google Maps but for grids
- Would allow easy integration of grid services into (existing) ES portals
20. Recommendations
- Improved standardization
- common Grid Portal and API interface models
- foundational frameworks, class libraries
- abstract, generic Grid interface
- interoperability across Grid infrastructures, middleware, and software migrations
- standard interfaces for integration of Grid services and ES common tools
- metadata catalogues, data repositories, webmap and geospatial services, etc.
- Portlet technology fully exploited by ES as well as Grid developers
- support development of dedicated Grid Portlet interfaces ("plug-ins")
- for data management, job submission, grid information, workflow, security, grid-login etc.
- ES to increase uptake of Grid, building a critical mass in terms of
- infrastructure, resources, services, applications, tools and users
- large sustained effort as a long-term objective to increase the critical mass
- porting more ES software tools, algorithms, data
- increase the number of application developers and end-users accessing Grid infrastructures
- increased education, training and exposure of ES scientists to available Grid facilities
- ES to demonstrate commitment to Grid
- receive more support and commitment from the Grid Community
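One possible reading of the "abstract, generic Grid interface" recommendation is an interface that portal code depends on, with one adapter per middleware behind it. The sketch below is an assumption-laden illustration: `GridBackend`, `InMemoryBackend` and `run_reprojection` are hypothetical names, and the in-memory backend only pretends to run jobs. Real adapters for a specific middleware would wrap that middleware's own client tools and are not shown.

```python
from abc import ABC, abstractmethod


class GridBackend(ABC):
    """Middleware-neutral interface the portal code is written against."""

    @abstractmethod
    def submit(self, executable, arguments):
        """Submit a job and return an opaque job identifier."""

    @abstractmethod
    def status(self, job_id):
        """Return the job state as a string."""


class InMemoryBackend(GridBackend):
    """Toy adapter for demonstration: jobs 'finish' immediately."""

    def __init__(self):
        self._jobs = {}

    def submit(self, executable, arguments):
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {
            "executable": executable,
            "arguments": list(arguments),
            "state": "DONE",
        }
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["state"]


def run_reprojection(backend, product_id, target_srs):
    """Portal-level operation phrased in ES terms, not grid-job terms."""
    return backend.submit("reproject", [product_id, target_srs])


backend = InMemoryBackend()
job_id = run_reprojection(backend, "EO-PRODUCT-001", "EPSG:4326")
print(job_id, backend.status(job_id))
```

Under this reading, a migration from one middleware to another means writing a new adapter class, while portlets and portal pages that call `run_reprojection` stay unchanged.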
21. Statement
- ES and Grid are distinct communities with different aims, culture and background
- ES Community driven by needs of science
- Grid Community driven by computing technology and services
- Sustainability of Grid means increased exploitation by applications
- ES Community needs tools and facilities for large-scale processing and e-Collaboration among Virtual Organizations
- Objectives can be met by mutual support between these communities
- Sustainability through mutual exploitation and sharing of results, knowledge, expertise and resources
- A large sustained effort is needed to bring the two communities together and increase collaboration and understanding between them
22. Conclusion and Next Steps
- Strong demand for ES applications to access Grid infrastructure services using Portals
- Many demonstrate strong need for Grid, even if they don't use any Grid middleware or infrastructure
- implemented using webservices, GridSphere, etc.
- but different solutions need to be interoperable, scalable
- Need is constantly increasing as ES data collections and applications become more complex
- Increasing integration of data from diverse sources
- Large number of technology solutions to choose from
- Strong need to integrate OGC and other ES web services
- Contribute recommendations and key requirements for inclusion in the ES Grid Roadmap
- Dissemination of results to ES and Grid communities
- Suggested follow-ups
- Implementation, deployment, evaluation, demonstration...
- Integrating ES candidate applications using selected available Grid Portals technology (e.g. PGRADE, GRB, A-WARE), OGC services
- Integrate new methods for accessing ES data repositories and long-term archives
23. Thank you
24. ES Portals for Grid, SOA and e-Collaboration
- Results
- Wide range of ES Portals scenarios exploiting Grid, SOA, eCollaboration
- ES Portals Classification
- Data Dissemination: "Discover, identify and access ES data"
- Collaborative: "Online collaboration in ES Virtual Communities"
- Grid-based: "ES data intensive processing"
- ES Major Requirements
- Integration of heterogeneous distributed services (Grid Geo-services)
- Support "Gridification" in Geo-services and Spatial Data standards
- Standard "off-the-shelf" tools for integrated Grid Security and User Management
- Big emphasis on Metadata and Data, its Discovery and Access...