Title: WP2: Data Management
1 WP2 Data Management
- Horst Schwichtenberg
- FhG/SCAI
2 WP2 Data Management Introduction
Partners involved: SCAI (leader), IISAS, KNMI, GCRAS, CNRS, CGG
- Why Data Management
  - Data and metadata are indispensable for Earth Science (ES)
  - Access to distributed remote ES data has to be provided with IT solutions to make it possible to run complex ES experiments (coupled models, e.g. climate; different communities)
  - Experience of recent years shows that the available capabilities for accessing data in Grid infrastructures/middleware like EGEE are not sufficient
3 WP2 Objectives and Challenges
- Objectives
  - Defining the requirements from the analysis and technical specification of ES use cases concerning data management
  - Comparing capabilities of EU grid middleware and infrastructures with ES requirements
  - Providing data-oriented use cases/applications for grid middleware developers (test suite)
- Challenges
  - ES covers many domains and diverse data sources
  - Many definitions of grid, many different grid solutions (we have to limit ourselves)
4 WP2 Achievements
- ES Data Management Tasks 2.1/2.2
  - From PM 1 to PM 6, in parallel to WP1
  - Deliverable PM 12
  - Survey of existing data technologies and data usage policies in ES, based on a questionnaire where we asked:
    - What is typical for data provision and data flow in ES scenarios?
    - What are the typical policies in ES for data?
    - How are the data information systems/repositories organized?
  - 10 use cases and the WP2 questionnaire merged with WP1
  - 21 different scenarios were analyzed concerning data management and classified as in WP1 (simple, complex, complex workflows) and by whether they are deployed on the grid or not yet gridified
5 WP2 Achievements
- Deliverable Task 2.1/2.2
  - Data comes from instruments or simulation (e.g. weather prediction)
  - Data files are organized in domain-dependent formats, e.g. NetCDF (climate), HDF (satellite), GRIB (meteorological)
  - Data information systems provide metadata, especially for discovery, with standards like ISO 19115/19139 and Dublin Core
  - ES has as many standard formats as instruments and/or user communities
  - ES has to provide/develop, together with middleware developers, converters, adaptors for interoperation, and semantic solutions
- Interoperability is needed
  - Geographical information systems (GIS) provide a huge number of tools for storage, analysis and visualization of spatial information
    - Many commercial applications, lots of SMEs
    - The GIS community establishes its own standard, OpenGIS by OGC, based on web service standards
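As an illustration of the discovery metadata mentioned on this slide, here is a minimal sketch of a flat Dublin Core record built with the Python standard library. The Dublin Core element namespace is the published one; the wrapper element `record`, the helper `build_dc_record`, and all field values are hypothetical and do not come from the WP2 survey.

```python
import xml.etree.ElementTree as ET

# Published namespace of the Dublin Core Metadata Element Set 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build a flat record with one Dublin Core element per dict entry.

    `record` is a hypothetical wrapper element chosen for this sketch.
    """
    record = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Hypothetical dataset description, for illustration only.
example = build_dc_record({
    "title": "Example ozone profile dataset",
    "format": "application/x-netcdf",
    "coverage": "global",
    "date": "2007-01-01",
})

xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

A catalogue service could index such records by `dc:format` or `dc:coverage` so that users can discover data before requesting access to the files themselves.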
6 WP2 Achievements
- Deliverable Task 2.1/2.2
  - Data policies always exist for ES data, according to
    - the area: academic, industrial, commercial
    - the organisations delivering the data
  - Access may be restricted to a limited time (e.g. thematic campaigns)
  - Grid solutions have to consider access restrictions
  - Infrastructures like EGEE lack these features!
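The kind of access restriction described on this slide (by area and by limited time window) could be expressed roughly as follows. This is a sketch under stated assumptions, not a description of any existing grid service: the class `DataPolicy`, its fields, and the example campaign dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical data policy: who may access the data, and when."""
    allowed_sectors: FrozenSet[str]      # e.g. {"academic", "commercial"}
    valid_from: Optional[date] = None    # None means no lower bound
    valid_until: Optional[date] = None   # None means no upper bound

    def permits(self, sector: str, on: date) -> bool:
        """Return True if a user from `sector` may access the data on `on`."""
        if sector not in self.allowed_sectors:
            return False
        if self.valid_from is not None and on < self.valid_from:
            return False
        if self.valid_until is not None and on > self.valid_until:
            return False
        return True


# Hypothetical thematic campaign: academic access for six months only.
campaign_policy = DataPolicy(
    allowed_sectors=frozenset({"academic"}),
    valid_from=date(2007, 1, 1),
    valid_until=date(2007, 6, 30),
)

print(campaign_policy.permits("academic", date(2007, 3, 1)))
print(campaign_policy.permits("commercial", date(2007, 3, 1)))
```

The point of the sketch is that a policy check needs both the requester's area and the request date, which is more than a plain membership check provides.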
7 WP2 Achievements
- Task 2.4 Test suite
  - From PM 9 to PM 15
  - Deliverable PM 15
  - WP2 contributions to test suite
    - Use cases with special focus on data management
      - GOME
      - SPIDR
      - GRIMI2
8 WP2 Future plans
- Task 2.3 Compare ES requirements with existing grid solutions
  - From PM 4 to PM 21
  - Deliverable PM 21
  - Find missing pieces in Grid infrastructures (like EGEE) and middleware stacks
  - Recommendations to ES for new developments and porting
- Ongoing work
  - Analysis of interoperability on the query language and data model levels between OGC (WCS), OGSA-DAI (SQL) and NetCDF - OPeNDAP
  - Data access analysis of scientific array-based data models and relational structure models: SQL, XML/XQuery, OPeNDAP, ES
  - Analysis of catalogue services for ES applications and grid-infrastructure-specific metadata catalogues
  - Availability of grid services to fulfill data policies
  - Grid services for data export, processing and mining
  - Grid environments and data visualization tools
9 WP2 Future plans and Challenges
- Main activities
  - Preparing D2.2 on comparison of capabilities of EU grid middleware stacks and ES requirements
  - Extension of the test suite: common action of WP1, 2, 3, coordinated by WP1
- Challenges and risks
  - Capture the relevant grid solutions
  - Stimulate the middleware developers to provide solutions for complex ES scenarios, not only to address single requirements
10 WP2 Future Plans
- M2.3 / D2.2: Test suite, 1st version
- D2.3: Report of Grid data management tools for ES applications
- M2.4: Test suite report fully updated
DEGREE IST 2005-034619
Brussels
11 WP2 ES Requirements
- Main findings from survey
  - Enabling interoperability of ES systems at data level (data format conversion, data federation, transparent access to distributed data sources)
  - Support for metadata-intensive applications in distributed environments
  - Support for complex workflows; robust, fast data replication is indispensable
  - Web service (WS standards) based interfaces/tools (e.g. with OpenGIS services)
  - Data access and management solutions with commercial OSs (e.g. Microsoft)
  - Unification of ES data standards and/or converters
  - Fast transfer of large files and of a large number of different files
  - Different and extended data policies have to be processed
  - The middleware has to be extendable with semantic technologies for data
12 WP2 Conclusion
1. Hard to find a common language between the ES and Grid communities
2. The Grid community needs detailed technical requirements
   - ES requirements often tend to be more generic
   - Refining generic ES requirements into detailed technical requirements is a challenging task
- A good consortium structure helps us to overcome 1. and 2. (experienced ES and Grid community members; several have experience in both fields)
- Expected feedback from the ES community
  - Wider adoption of grid technology among the ES community
  - New ES projects (satellites, campaigns, ...) will be grid-enabled from the beginning
  - New platforms on top of Grid middleware will be developed by ES Grid