PowerPointPrsentation - PowerPoint PPT Presentation

About This Presentation
Title:

PowerPointPrsentation

Description:

1) Climate and Environmental data Retrieval and Archiving ... Raw data file in DKRZ Archive. M.Lautenschlager (WDCC / MPI-M) / 15.06.06 / 11 ... – PowerPoint PPT presentation

Slides: 24
Provided by: mlau7
Transcript and Presenter's Notes

Title: PowerPointPrsentation


1
World Data Center Climate: Status and Portal Integration
Michael Lautenschlager, Hannes Thiemann and Frank Toussaint
ICSU World Data Center Climate
Model and Data / Max-Planck-Institute for Meteorology, Hamburg, Germany
GO-ESSP at LLNL Livermore, June 19th - 21st, 2006
WDCC Home: www.wdcc-climate.de / WDCC Contact: data_at_dkrz.de
2
Content: WDCC Status, CERA Concept, Portal Integration
3
WDCC Content
June 2006: 590 experiments / 79,000 data sets
Data from Earth System Modelling and Related Observations
ERA40
Start: Approved in January 2003
Maintenance: Model and Data (MD/MPI-M) and German Climate Computing Centre (DKRZ)
4
Data Export from WDC Climate
Corresponds to 2 - 10 TB/month
5
Geographical Distribution of WDCC Users
Total number of registered users: 750 (May 2006)
6
Data Import into WDC Climate
ECHAM5/MPI-OM IPCC AR4 Scenarios (ca. 110 TB)
7
CERA 1) Concept: Semantic Data Management
  • (I) Data catalogue and pointer to Unix files
      • Enable search and identification of data
      • Allow for access to the data as they are (coarse granularity raw data files)
  • (II) Application-oriented data storage in BLOB tables
      • Time series of individual variables are stored as BLOB entries in DB tables (fine granularity data products)
      • Allow for fast and selective data access
      • Storage in standard data formats (GRIB, NetCDF/CF)
      • Allow for application of standard data processing routines (PINGOs, CDOs)

1) Climate and Environmental data Retrieval and
Archiving
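
The two CERA storage levels described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CERA database schema: the class names `CatalogueEntry` and `BlobTable`, the field names, the file path and the use of ISO-style time stamps as keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogueEntry:
    """Level (I): metadata plus a pointer to a raw Unix file (coarse granularity)."""
    experiment: str
    raw_file_path: str  # hypothetical path, for illustration only


@dataclass
class BlobTable:
    """Level (II): time series of one variable, one BLOB per time step (fine granularity)."""
    variable: str
    data_format: str  # e.g. "GRIB" or "NetCDF/CF"
    blobs: Dict[str, bytes] = field(default_factory=dict)  # time stamp -> encoded record

    def select(self, start: str, end: str) -> List[bytes]:
        """Return only the records whose time stamp lies in [start, end]."""
        return [blob for stamp, blob in sorted(self.blobs.items())
                if start <= stamp <= end]


# Hypothetical content: three monthly records of one variable.
entry = CatalogueEntry("example_experiment", "/archive/example/raw_output.grb")
table = BlobTable("temperature", "GRIB")
table.blobs.update({"2000-01": b"jan", "2000-02": b"feb", "2000-03": b"mar"})
selected = table.select("2000-02", "2000-03")
```

The catalogue entry only points at a whole raw file, while the BLOB table lets a user fetch a selected time range of one variable without reading the rest.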
8
WDCC Data Topology
Level 1 - Interface: Metadata entries (XML, ASCII), data files
Level 2 - Interface: Separate files containing BLOB table data in application-adapted structure (time series of single variables)
A BLOB DB table corresponds to a scalable, virtual file at the operating system level.
9
CERA Data Model
10
Data matrix of model experiment
Model variables
Model Run Time
Raw data file in DKRZ Archive
2-D: small BLOBs (180 KB)
3-D: large BLOBs (3 MB)
Raw data file: direct model output (1.3 - 16.2 GB)
Each column is one BLOB table in CERA-DB
11
Climate Model Data Structures
  • Preferred DB-storage structure for web-based
    access
  • single variable
  • single level
  • time series of 2D gridded data records
  • Formats: GRIB-1, NetCDF/CF (- GRIB-2)

Application related data structure (2-D)
original data structure (4-D)
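
The step from the original 4-D structure to the application-related 2-D structure can be illustrated as follows. This is a minimal sketch that assumes the model output is held in memory as nested Python lists; the function name `extract_time_series`, the variable name and the grid values are hypothetical and not taken from the WDCC software.

```python
def extract_time_series(model_output, variable, level):
    """Pick one variable and one level, keeping the time axis.

    model_output: list over time steps; each step maps a variable name
    to a list over levels; each level holds one 2-D grid (list of rows).
    Returns a time series of 2-D gridded records.
    """
    return [time_step[variable][level] for time_step in model_output]


# Hypothetical model output: 2 time steps, 1 variable, 2 levels, 2x2 grid.
model_output = [
    {"temperature": [[[280.0, 281.0], [282.0, 283.0]],
                     [[270.0, 271.0], [272.0, 273.0]]]},
    {"temperature": [[[284.0, 285.0], [286.0, 287.0]],
                     [[274.0, 275.0], [276.0, 277.0]]]},
]

series = extract_time_series(model_output, "temperature", level=1)
```

Each element of `series` is one 2-D record, which matches the single-variable, single-level layout named on the slide as preferred for web-based access.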
12
DKRZ Architecture
TX7 Intel Itanium-2 with Linux
13
Portal Integration
Two strategies:
  • One-way integration: discovery and use metadata are integrated in a central data portal in one step
      • Example: C3Grid data catalogue (refer to the presentation by Heinrich Widmann)
  • Two-way integration: discovery metadata are integrated in a central data portal; use metadata are extracted from the remote archive when they are needed for data download and processing
      • Example: Primary data publication in TIB library catalogue (STD-DOI)
      • Example: WDCC integration in NDG (NERC Data Grid)
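
The difference between the two strategies can be sketched as two small portal classes. This is an illustration only: `RemoteArchive`, `OneWayPortal`, `TwoWayPortal` and the metadata dictionaries are hypothetical and do not describe the C3Grid, STD-DOI or NDG implementations.

```python
class RemoteArchive:
    """Stand-in for a data archive that keeps the use metadata."""

    def __init__(self, use_metadata):
        self._use_metadata = use_metadata
        self.requests = 0  # counts how often the portal had to call back

    def fetch_use_metadata(self, dataset_id):
        self.requests += 1
        return self._use_metadata[dataset_id]


class OneWayPortal:
    """Discovery and use metadata are both copied into the portal in one step."""

    def __init__(self):
        self.discovery = {}
        self.use = {}

    def integrate(self, dataset_id, discovery, use):
        self.discovery[dataset_id] = discovery
        self.use[dataset_id] = use

    def prepare_download(self, dataset_id):
        return self.use[dataset_id]  # answered from the portal's own copy


class TwoWayPortal:
    """Only discovery metadata are copied; use metadata are fetched on demand."""

    def __init__(self, archive):
        self.archive = archive
        self.discovery = {}

    def integrate(self, dataset_id, discovery):
        self.discovery[dataset_id] = discovery

    def prepare_download(self, dataset_id):
        return self.archive.fetch_use_metadata(dataset_id)  # second direction


archive = RemoteArchive({"ds1": {"format": "GRIB"}})
one_way = OneWayPortal()
one_way.integrate("ds1", {"title": "Example"}, {"format": "GRIB"})
two_way = TwoWayPortal(archive)
two_way.integrate("ds1", {"title": "Example"})
```

In the two-way case the archive is contacted only when a download is prepared, so the portal never holds a possibly outdated copy of the use metadata.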
14
Primary data publication (STD-DOI)
URL: http://www.std-doi.de/
Primary Data Publication Process
Data Review
ISO 690-2 Metadata for citation of electronic
media
15
Example Publ.-DOI from WDCC
16
DOI URN
17
Publ.-DOI
18
830 GB
19
Ident.-DOI
Data retrieval procedure is given at the end
(user identification is required)
20
WDCC Metadata and OAI-PMH
  • Open Archives Initiative Protocol for Metadata Harvesting

21
WDCC support of OAI-PMH requests
  • 1. Identify: get information about a repository
  • 2. ListMetadataFormats: list of available metadata formats
  • 3. ListSets: list the structure of a repository (sets,...)
  • 4. ListIdentifiers: list of all identifiers of a set
  • 5. GetRecord: retrieve one individual metadata record
  • 6. ListRecords: list records of a set (used for harvesting)
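
The six request types can be turned into GET URLs with the Python standard library. This is a minimal sketch, not the WDCC or dlese client code: `build_request` and `REQUIRED_ARGS` are hypothetical names, the base URL is the WDCC OAI server address from the slides with its URL punctuation restored, and the identifier is invented. `oai_dc` is the metadata prefix that OAI-PMH requires every repository to support; the slides do not state the prefixes used for ISO 19115 or DIF.

```python
from urllib.parse import urlencode

# Arguments that OAI-PMH 2.0 requires for each verb. The resumptionToken
# case, which replaces the other arguments, is ignored in this sketch.
REQUIRED_ARGS = {
    "Identify": (),
    "ListMetadataFormats": (),
    "ListSets": (),
    "ListIdentifiers": ("metadataPrefix",),
    "GetRecord": ("identifier", "metadataPrefix"),
    "ListRecords": ("metadataPrefix",),
}


def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request."""
    if verb not in REQUIRED_ARGS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    missing = [name for name in REQUIRED_ARGS[verb] if name not in arguments]
    if missing:
        raise ValueError(f"{verb} requires: {', '.join(missing)}")
    return base_url + "?" + urlencode({"verb": verb, **arguments})


BASE_URL = "http://uranus.dkrz.de:8080/oai/provider"  # address from the slides

identify_url = build_request(BASE_URL, "Identify")
harvest_url = build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
record_url = build_request(BASE_URL, "GetRecord",
                           identifier="oai:example:1",  # hypothetical identifier
                           metadataPrefix="oai_dc")
```

The sketch only builds the URLs and sends nothing, since the server address dates from 2006 and may no longer respond.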

22
OAI-PMH over HTTP
  • HTTP request
      • base URL
      • list of keyword arguments
      • Form: key=value pairs
      • Request type: GET or POST (URI syntax)
  • HTTP response
      • responseDate (format: UTCdatetime)
      • request (the request that generated the response)
      • error (incl. the request that generated the error)
  • http://www.openarchives.org/OAI/openarchivesprotocol.html
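
The response elements listed above can be read with the standard library XML parser. The following is a sketch under stated assumptions: `parse_envelope` is a hypothetical helper, and the sample response is hand-written for illustration, not captured from the WDCC server. The namespace and the `badArgument` error code come from the OAI-PMH 2.0 specification.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written example of an error response (not real server output).
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2006-06-19T12:00:00Z</responseDate>
  <request>http://uranus.dkrz.de:8080/oai/provider</request>
  <error code="badArgument">Missing metadataPrefix</error>
</OAI-PMH>
"""


def parse_envelope(xml_text):
    """Extract responseDate, request and error elements from a response."""
    root = ET.fromstring(xml_text)
    stamp = root.findtext("oai:responseDate", namespaces=NS)
    request = root.find("oai:request", NS)
    return {
        # UTCdatetime: ISO 8601 in UTC with a trailing "Z"
        "responseDate": datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
                                .replace(tzinfo=timezone.utc),
        "request_url": request.text,
        "request_args": dict(request.attrib),  # verb and arguments, if echoed
        "errors": [(err.get("code"), err.text)
                   for err in root.findall("oai:error", NS)],
    }


envelope = parse_envelope(SAMPLE_RESPONSE)
```

A harvester would typically check `errors` first and only then look for the payload element that belongs to the verb it sent.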

23
  • WDCC OAI server at http://uranus.dkrz.de:8080/oai/provider
    (Software: dlese (www.dlese.org), apache-tomcat 5.5.12, Java 1.5)
  • 35 IPCC experiments with more than 11000 datasets
      • Metadata format: ISO 19115
      • C3Grid (http://gsphere.awi.de:8080/gridsphere/gridsphere)
  • 40 STD-DOI experiments with more than 1700 datasets
      • Metadata format: DIF
      • GO-ESSP (NDG, http://ndg.badc.rl.ac.uk/)

24
NDG
OAI Harvesting (Pull or Notification)
DIF XMLs WDCC
OAI Server WDCC (Software dlese)
OAI Client NDG (dlese)
Catalog NDG record 1...n
Discovery Portal NDG
DIF XMLs Provider 2
OAI Server 2
Process
OAI Server n
Delivery
25
URL: http://glue.badc.rl.ac.uk/discovery/ Keyword: ECHAM4