GRID COMPUTING FOR NEW EARTH SCIENCE PARADIGMS

Slides: 32
Provided by: Peti2

1
WP2 Data Management
Horst Schwichtenberg
2
WP2 Data Management
  • Contents
  • Overview Tasks
  • Including results and deliverable of D2.1/2.2
  • Test Suite - Data Management Example
  • Contributions by partner
  • SCAI, IISAS, KNMI, GCRAS, CNRS, CGG

3
WP2 Data management TASKS 2.1/2.2
  • Analysis of Existing data technologies and data
    usage policies in ES
  • What is typical for data provision and data
    flow in complex ES scenarios
  • What are the typical ES data policies
  • How are the data information systems/repositories
    organized
  • Deliverable of the survey ready (PM 12)
  • Begin PM 1, End PM 6
  • Milestones PM 6 and PM 12
  • Effort by partner

4
WP2 Data Management TASK 2.1/2.2
  • Requirements on Data management and Policies
  • Questionnaire to describe the data management of
    a given application
  • Data organisation
  • Data policy
  • Data access
  • Data information systems
  • Data flow before and during the computation
  • Metadata
  • 21 different scenarios were analyzed and classified
  • By complexity: simple, complex, complex
    workflows (WP1)
  • By grid status: on the grid, partly on a grid,
    not yet gridified
  • Some of them are based mainly on web services for
    data dissemination
  • Some of them are using Grid Infrastructures

5
WP2 Deliverable TASK 2.1/2.2
  • ES has
  • Global, regional, local applications
  • Alternative use of the data at different time
    and spatial resolution
  • Large historical distributed archives
  • Long term data archives have to be exploited
  • Near real-time access to data
  • For processing, value adding and dissemination
  • For now-casting and alert
  • Models to provide long term trends and forecast
  • Processing-intensive, data-intensive and complex
    applications
  • Integrate different data sources
  • Data fusion, data assimilation, data mining,
    modelling
  • Standardisation, Virtual Organisation,
  • Link data to information system and knowledge

6
WP2 Deliverable TASK 2.1/2.2
  • Data format
  • As many standard formats as instruments and/or
    user communities
  • Self-describing format (NetCDF, HDF, ...) or not
  • ASCII or binary, compressed or not
  • Meteorological formats (GRIB, BUFR)
  • Data files
  • Flat files
  • Organisation: simple to complex architecture,
    depending on
  • Size and number of files created
  • End users
  • Metadata linked to catalogue, especially for
    shared data
  • Database
  • Rarely used for the data themselves; use depends on
  • the size of the data (feasible if relatively small)
  • the organisation and the data provider
  • Mainly for Metadata
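
The format families listed above can often be told apart from the first bytes of a file. The following is a minimal sketch in Python, assuming only the widely documented leading signatures (`CDF` for classic NetCDF, the 8-byte HDF5 signature, `GRIB` and `BUFR` for the WMO formats). The names `SIGNATURES`, `sniff_format` and `sniff_file` are hypothetical and are not taken from the deliverable. Files wrapped in additional transmission headers would need a search rather than a prefix test.

```python
from typing import Optional

# Leading byte signatures of some self-describing and meteorological formats.
# NetCDF-4 files are HDF5 containers, so they are reported as HDF5 here.
SIGNATURES = [
    (b"\x89HDF\r\n\x1a\n", "HDF5"),
    (b"CDF\x01", "NetCDF classic"),
    (b"CDF\x02", "NetCDF 64-bit offset"),
    (b"GRIB", "GRIB"),
    (b"BUFR", "BUFR"),
]


def sniff_format(header: bytes) -> Optional[str]:
    """Return a format name for the first bytes of a file, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(8))
```

Plain ASCII or undocumented binary files return `None`, which is the case where external metadata in a catalogue becomes necessary.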


7
WP2 Deliverable TASK 2.1/2.2
  • A data policy always exists and concerns
  • Use of data
  • Publication of the results (co-author,
    acknowledgement, reference)
  • Large variety of data policies
  • User and data use: academic, industrial,
    commercial, accepted proposal
  • Data source
  • Confidential or sensitive
  • Restricted to authorized users, even for
    purchased data
  • Free on a web site
  • Organisation delivering the data
  • Access may be restricted for a limited time
    (thematic campaigns)
  • Absolute need to restrict access to a group and
    even to a person
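
The policy dimensions above (free data, restriction to groups or persons, restriction for a limited time) can be expressed as a small access check. This is only a sketch under stated assumptions: the class `DataPolicy`, its field names and the function `may_access` are hypothetical, and a real grid deployment would delegate such decisions to the middleware's VO and authorization services.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, Optional, Set


@dataclass
class DataPolicy:
    public: bool = False                      # "free on a web site"
    allowed_groups: Set[str] = field(default_factory=set)
    allowed_persons: Set[str] = field(default_factory=set)
    restricted_until: Optional[date] = None   # time-limited restriction


def may_access(policy: DataPolicy, person: str,
               groups: Iterable[str], today: date) -> bool:
    """Decide whether a person (with group memberships) may read the data."""
    if policy.public:
        return True
    # A time-limited restriction (e.g. a campaign) ends after the given date.
    if policy.restricted_until is not None and today > policy.restricted_until:
        return True
    if person in policy.allowed_persons:
        return True
    return bool(policy.allowed_groups & set(groups))
```

For example, a campaign dataset with `restricted_until=date(2007, 12, 31)` and `allowed_groups={"campaign-team"}` is readable only by team members during 2007 and by everyone afterwards.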


8
WP2 Deliverable TASK 2.1/2.2
  • Metadata for discovery of data and information
  • Resource Metadata (computing and storage
    resources, LFNs, ...)
  • Discovery Metadata (data objects for scientific
    description, e.g. ISO 19115, 19139, 19119, Dublin
    Core, ...)
  • Use Metadata describes the data objects and files
    needed for access to data
  • Metadata is central for ES
  • → Middleware has to support restricted access
  • Data discovery by ES portals (GEON)
  • Semantic and ontology techniques used for search,
    discovery and accessing widely dispersed,
    heterogeneous data sources
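
The three metadata kinds named on this slide can be pictured as separate parts of one dataset record. The sketch below is illustrative only: all class and field names are hypothetical, the discovery part borrows a few Dublin Core style elements (title, creator, subject, coverage), and real ISO 19115 records are far richer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiscoveryMetadata:
    # Scientific description used to find a dataset.
    title: str
    creator: str
    subject: List[str] = field(default_factory=list)
    coverage: str = ""


@dataclass
class UseMetadata:
    # What is needed to actually read the data.
    data_format: str
    file_names: List[str] = field(default_factory=list)


@dataclass
class ResourceMetadata:
    # Where the data live on the grid.
    lfns: List[str] = field(default_factory=list)
    storage_element: str = ""


@dataclass
class DatasetRecord:
    discovery: DiscoveryMetadata
    use: UseMetadata
    resource: ResourceMetadata


def discover(records: List[DatasetRecord], keyword: str) -> List[DatasetRecord]:
    """Keyword search over discovery metadata only (title and subjects)."""
    kw = keyword.lower()
    return [
        r for r in records
        if kw in r.discovery.title.lower()
        or any(kw == s.lower() for s in r.discovery.subject)
    ]
```

Keeping the discovery part separate means it can be exposed to a portal while the use and resource parts stay behind access control.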


9
WP2 Deliverable TASK 2.1/2.2
  • General requirements for middleware stacks and
    SOA
  • Interfaces/layers for access to heterogeneous,
    federated RDBMS
  • Access to data from different locations in a
    grid and from locations outside of a grid
    infrastructure
  • Web service (WS-* standards) based interfaces
    (esp. with OpenGIS services)
  • Fast transfer of large files and of a large number
    of different files
  • For complex workflows, robust and fast
    data replication is indispensable
  • Data access and management also for Microsoft's
    .NET
  • Support of metadata-intensive applications in
    distributed environments
  • User/role-based access control to metadata and
    data
  • Ontology technologies should be available for
    the resources and specific ES domains
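
The replication requirement can be illustrated with a toy replica catalogue that maps a logical file name (LFN) to its physical copies and prefers a copy at the local site. This is a sketch, not the API of any grid middleware: the class `ReplicaCatalog`, its methods and the `example.org` host names are all hypothetical.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse


class ReplicaCatalog:
    """Toy mapping from logical file names to replica URLs."""

    def __init__(self) -> None:
        self._replicas: Dict[str, List[str]] = {}

    def register(self, lfn: str, replica_url: str) -> None:
        urls = self._replicas.setdefault(lfn, [])
        if replica_url not in urls:
            urls.append(replica_url)

    def replicas(self, lfn: str) -> List[str]:
        return list(self._replicas.get(lfn, []))

    def best_replica(self, lfn: str, local_domain: str) -> Optional[str]:
        """Prefer a replica hosted in local_domain, else the first one known."""
        urls = self.replicas(lfn)
        for url in urls:
            host = urlparse(url).hostname or ""
            if host == local_domain or host.endswith("." + local_domain):
                return url
        return urls[0] if urls else None
```

A workflow step running at one site would call `best_replica` with its own domain and fall back to a remote copy only when no local replica exists.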

10
WP2 ESR Data Management Requirements for Grid
middleware stacks T2.3
  • TASK 2.3 Comparison of existing Grid Services
    and ES Requirements for Data management
  • Find missing pieces in Grid Infrastructures
    (EGEE) and middleware stacks based on the WP1
    and WP2 Requirements for ES applications
  • Recommendations to ES for new developments and
    porting of applications to grid environments
  • For example
  • Access to existing ES databases outside of Grid
    infrastructures like EGEE by DB interfaces like
    AMGA or OGSA-DAI
  • Integration of web service based standards like
    OGC/GIS with existing classical middleware stacks
    like gLite or UNICORE

11
WP2 ESR Data Management
Requirements for Grid middleware stacks T2.3
  • TASK 2.3
  • Begin PM 4 End PM 21
  • Milestones (M2.4) PM 21
  • Deliverable (D2.3) PM 21
  • Effort by partner

Next Milestone will be in PM23
DEGREE IST 2005- 034619
Internal Review at CRS4 12 June 2007
12
WP2 ESR Data Management
Testsuite Task 2.4
  • Work will be done in close cooperation with WP1
  • Begin PM x End PM 21
  • Milestones (Mx) PM 21
  • Deliverable (Dx) PM 21
  • Effort by partner

WP2 will contribute data management relevant
applications to the Testsuite
13
WP2 ESR Data Management Test suite Task 2.4

  • The task will provide typical ES applications with
    emphasis on data management
  • First application provided: GOME

Example: validation of the GOME/ERS experiment with
Lidar data
  • Two different instruments: ground-based Lidar and
    the spectrometer aboard the ERS satellite
  • The satellite data are stored by orbit or pixel,
    with different algorithms
  • The Lidar data are stored in monthly files with one
    profile/night
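
The mismatch between the two storage layouts is what makes this a data management example: a satellite measurement has to be mapped to the right monthly Lidar file and to the right night within it. The sketch below shows one way to do that mapping. It rests on assumptions not stated in the slides: a night is labelled by the date on which it begins, measurements before 12:00 are attributed to the previous night, and the file naming pattern `lidar_YYYYMM.dat` is invented for illustration.

```python
from datetime import date, datetime, timedelta


def lidar_night(ts: datetime) -> date:
    """Date label of the night a timestamp belongs to (assumed noon cut-off)."""
    return (ts - timedelta(hours=12)).date()


def lidar_file_for(ts: datetime) -> str:
    """Hypothetical monthly file name holding the profile for that night."""
    night = lidar_night(ts)
    return f"lidar_{night:%Y%m}.dat"
```

With this rule, an overpass at 02:30 on the first day of a month is matched against the last profile in the previous month's file.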
14
WP2 ESR Data Management Test suite Task 2.4


Part of Opera/NNO metadata scheme
Column            Type
----------------  ---------------------------
dataset           character varying(50)
level             character varying(5)
version           character varying(4)
orbit             integer
file_name         character varying(50)
start_date        timestamp without time zone
stop_date         timestamp without time zone
lat               numeric(8,2)
lon               numeric(8,2)
proc_center       character varying(50)
proc_date         timestamp without time zone
file_input        character varying(50)
proc_description  character varying(50)
footprint         geometry (Multipolygon)
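
To make the scheme concrete, the sketch below recreates the listed columns in an in-memory SQLite database and runs a typical lookup: which files overlap a given time window. Several things here are assumptions rather than part of the original scheme: the table name `product_metadata`, the SQLite type mapping, storing timestamps as ISO 8601 text, storing the footprint as plain text instead of a geometry type, and the two sample rows, which are invented.

```python
import sqlite3
from typing import List

DDL = """
CREATE TABLE product_metadata (
    dataset TEXT, level TEXT, version TEXT, orbit INTEGER, file_name TEXT,
    start_date TEXT, stop_date TEXT, lat REAL, lon REAL,
    proc_center TEXT, proc_date TEXT, file_input TEXT,
    proc_description TEXT, footprint TEXT
)
"""


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with two invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.executemany(
        "INSERT INTO product_metadata "
        "(dataset, level, version, orbit, file_name, start_date, stop_date) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("sample", "2", "1.0", 1, "sample_orbit_00001.dat",
             "2007-06-01T00:00:00", "2007-06-01T01:40:00"),
            ("sample", "2", "1.0", 2, "sample_orbit_00002.dat",
             "2007-06-01T01:40:00", "2007-06-01T03:20:00"),
        ],
    )
    return conn


def files_overlapping(conn: sqlite3.Connection, start: str, stop: str) -> List[str]:
    """File names whose [start_date, stop_date] interval overlaps [start, stop].

    ISO 8601 strings of equal layout compare correctly as plain text.
    """
    rows = conn.execute(
        "SELECT file_name FROM product_metadata "
        "WHERE start_date <= ? AND stop_date >= ? ORDER BY start_date",
        (stop, start),
    )
    return [row[0] for row in rows]
```

A production catalogue would keep the `footprint` column as a true geometry type so that spatial predicates can be evaluated inside the database.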
15
WP2 ESR Data Management Test suite Task 2.4


ES requirements for middleware developers
  • Secure and restricted access to (external)
    metadata in a grid environment
  • Preferably, interfaces follow industry standards
    (part of ES is industry)
  • The RDBMS needs to support spatial data types
    (OpenGIS conformant)
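
The spatial data type requirement exists because queries such as "which product footprints contain this ground station" need a geometric predicate. The sketch below shows what such a predicate computes, using the standard ray-casting test on a multipolygon given as a list of rings of (lon, lat) pairs. It is an illustration only: the function names are hypothetical, holes and date-line crossings are ignored, and in practice the test would be evaluated by the spatially enabled RDBMS itself.

```python
from typing import List, Sequence, Tuple

Ring = Sequence[Tuple[float, float]]  # vertices as (lon, lat)


def point_in_ring(lon: float, lat: float, ring: Ring) -> bool:
    """Ray casting: count crossings of a ray running east from the point."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def footprint_contains(lon: float, lat: float, multipolygon: List[Ring]) -> bool:
    """True if the point lies in any outer ring of the multipolygon."""
    return any(point_in_ring(lon, lat, ring) for ring in multipolygon)
```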
16
WP2 ESR Data Management Partner contribution
  • Partner SCAI TASK 2.1/2.2
  • Work done
  • Preparation of data management questionnaire
  • Collecting examples
  • Contribution to D2.1/2.2
  • Effort 0.78 (official)

17
WP2 ESR Data Management
Partner contribution
  • Partner SCAI TASK 2.3
  • Work today
  • Requirement: access to external distributed
    RDBMS from the Grid
  • Layers to be considered: OGSA-DAI, AMGA,
    Spitfire, ...
  • (Opaque layers hide the differences of the
    underlying DB systems)
  • Interoperability of the interfaces with grid
    services of middleware stacks
  • Capabilities and missing features of grid
    services and interfaces
  • First results (exp): OGSA-DAI (quasi-standard) is
    not integrated into gLite
  • AMGA: integrated tool of gLite, but very specific
  • Effort until today: 2.34 PM
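
The idea of an opaque layer that hides differences between underlying database systems can be sketched as a common query interface with interchangeable backends. The code below does not reproduce the API of OGSA-DAI, AMGA or Spitfire. The protocol `MetadataBackend`, both backend classes and `find_records` are hypothetical names chosen for illustration.

```python
import sqlite3
from typing import Any, Dict, List, Protocol


class MetadataBackend(Protocol):
    def query(self, attribute: str, value: Any) -> List[Dict[str, Any]]: ...


class SqliteBackend:
    """Backend over a relational table; columns are fixed at construction."""

    def __init__(self, conn: sqlite3.Connection, table: str, columns: List[str]):
        self.conn, self.table, self.columns = conn, table, columns

    def query(self, attribute: str, value: Any) -> List[Dict[str, Any]]:
        if attribute not in self.columns:  # never interpolate unknown names
            raise KeyError(attribute)
        sql = (f"SELECT {', '.join(self.columns)} FROM {self.table} "
               f"WHERE {attribute} = ?")
        return [dict(zip(self.columns, row))
                for row in self.conn.execute(sql, (value,))]


class InMemoryBackend:
    """Backend over a plain list of records, e.g. a cached catalogue."""

    def __init__(self, records: List[Dict[str, Any]]):
        self.records = records

    def query(self, attribute: str, value: Any) -> List[Dict[str, Any]]:
        return [r for r in self.records if r.get(attribute) == value]


def find_records(backends: List[MetadataBackend],
                 attribute: str, value: Any) -> List[Dict[str, Any]]:
    """Run the same query against every backend and merge the results."""
    results: List[Dict[str, Any]] = []
    for backend in backends:
        results.extend(backend.query(attribute, value))
    return results
```

The caller of `find_records` never sees which system answered, which is the property the slide asks of such layers.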

18
WP2 ESR Data Management Partner Contribution
  • Partner SCAI Planned Work
  • Task 2.3
  • Availability of grid services/tools to implement
    data policies
  • Continue with further requirements from WP1 and
    WP2 D2.2/2.1
  • Preparation and coordination of D2.3
  • Effort until PM 21 2.34 PM
  • Task 2.4
  • Contribution to Test Suite description of GOME
  • Contribution to Test Suite 2nd example
  • Effort until end of Project xx

19
WP2 ESR Data Management
Partner Contribution
  • Partner SCAI Planned Work
  • Task 2.3
  • Availability of grid services/tools to implement
    data policies
  • Preparation and coordination of D2.3
  • Effort until PM 23 xx
  • Task 2.4
  • Contribution to Test Suite description of GOME
  • Contribution to Test Suite 2nd example
  • Effort until end of Project xx

20
WP2 ESR Data Management
Partner Contribution
  • Partner UISAV TASK WP2.1/2.2
  • Work done
  • WP2 application questionnaires analysis
  • WP1 application questionnaires analysis
  • Contribution to D2.1/2.2
  • Data provision
  • Integration of relevant sections from WP1
    questionnaires to D2.1/2.2
  • Effort 1.29 (official)?

21
WP2 ESR Data Management Partner contribution
  • Partner UISAV TASK 2.3
  • Work today
  • Analysis of catalogue services for application
    and grid infrastructure specific metadata
    catalogues
  • Metadata catalogue types needed: application
    specific metadata, compute and storage resources
    metadata, discovery metadata, VO and security
    metadata
  • Analyzed software
  • Standard grid metadata catalogues (MDS/WS-MDS,
    RLS, MCS, AMGA)
  • RDF and ontology-capable (semantic) catalogues
    (RDFPeers, SDR, DSWS-R, TUPELO, Edutella)
  • Other catalogues (Graffiti, DIMES)
  • Observed properties
  • Content language, security, maturity, query
    language, distribution/integration of content,
    standards conformance

22
WP2 ESR Data Management Partner contribution
  • Partner UISAV TASK WP2.3
  • Planned work
  • Analysis of catalogue services for application
    and grid infrastructure specific metadata
    catalogues
  • Extend the set of analyzed catalogue services and
    observed properties
  • Prepare categorization of analyzed services
  • Document analysis (reports, deliverable)
  • TASK WP2.4
  • Work today
  • Draft of flood prediction application testsuite
    description
  • Planned work
  • Further elaboration of data management specific
    issues in Flood application testsuite
  • contribution to Testsuite 2nd example

23
WP2 ESR Data Management Partner contribution
  • Partner GCRAS
  • TASK WP2.1/2.2
  • Work done
  • Evaluation of ES grid data fusion applications
    (SPIDR, ESSE, CLASS)
  • GIS applications review and OpenGIS standards
    summary
  • Contribution to D2.1/2.2
  • Effort 1 PM

24
WP2 ESR Data Management Partner Contribution
  • Partner GCRAS
  • TASK WP2.3
  • Work today
  • Analysis of interoperability on the query
    language and data model levels between OGC
    (WCS), OGSA-DAI (SQL) and NetCDF - OPeNDAP
  • Metadata standards: catalog, inventory level and
    ordering extensions
  • Data access: analysis of scientific array-based
    data models and relational structure models (SQL,
    XML/XQuery, OPeNDAP, ESSE)
  • Effort until today 1.5

25
WP2 ESR Data Management Partner Contribution
  • Partner GCRAS
  • Planned Work
  • Task 2.3
  • Data visualization tools (connection to Grid
    environments)
  • Metadata search engines
  • ES grid services for data export, processing and
    mining
  • Task 2.4
  • Contribution to Testsuite description of GEONGrid
  • Contribution to Testsuite 2nd example
  • Effort until end of Project 0.9

26
WP2 ESR Data Management Partner contribution
  • Partner CNRS
  • TASK WP2.1/2.2
  • Work done
  • Evaluation of data policies and security
  • Contribution to D2.1/2.2
  • Effort x PM
  • TASK WP2.4
  • Providing Examples for Testsuite (1st: GOME)
  • Effort x PM

27
WP2 ESR Data Management Partner contribution
  • Partner KNMI
  • TASK WP2.1/2.2
  • Work done
  • Evaluation of data technologies in Weather
    forecast
  • Contribution to D2.1/2.2
  • Effort x PM
  • TASK WP2.4
  • Providing Examples for Testsuite
  • Effort x PM

28
WP2 ESR Data Management Partner contribution
  • Partner CGG
  • TASK WP2.1/2.2
  • Work done
  • Evaluation of data technologies in geophysics,
    esp. in large enterprises
  • Contribution to D2.1/2.2
  • Effort x PM
  • TASK WP2.3
  • in preparation
  • Effort x PM
  • TASK WP2.4
  • Providing Examples for Testsuite
  • Effort x PM

29
WP2 ESR Data Management Dependencies to other
Workpackages

  • Requirements


WP1 Requirements
WP2 Data management
  • WP3
  • Job management
  • Co-scheduling of data
  • Workflows
  • WP4
  • Portals
  • platforms

30
WP2 ESR Data Management Requirements for Grid
middleware stacks
Road map for Task 2.3
Months: Jun, Aug, Oct, Jan
Project months (PM): 13, 15, 17, 21

  • Checkpoint: internal report, esp. on missing
    pieces with EGEE; sync with WP34
  • Checkpoint: WP1/WP2 requirements considered
  • D2.3 ready
31
WP2 ESR Data Management Requirements for Grid
middleware stacks
  • Future
  • Universal or domain-specific WS-
    platform to access data in different Grid
    infrastructures (EGEE, NorduGrid)
  • Open GIS services to gLite, UNICORE, ...