Title:
1. Curator DB design
Curator meeting, GFDL, Sep 20
2. Why RDBMS
- A lot of information
  - Model metadata
  - Experiment metadata
  - Institution/user metadata
  - Data metadata
- Mostly in textual form
- The information is tightly interlinked, which is easy
to express by means of relational databases.
- Relational databases have well-developed means
for searching and extracting data (the SQL
query language and programming interfaces for most
languages), for local as well as remote users.
- Very reliable, safe technology.
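As a rough illustration of the points above, the sketch below stores tightly linked metadata in relational tables and retrieves it with a single SQL join. It uses Python's standard `sqlite3` module only for convenience. The table names (`model`, `experiment`, `dataset`), their columns and the sample rows are hypothetical and are not taken from the actual curator DB schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with three linked, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE model (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE experiment (
            id       INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            model_id INTEGER NOT NULL REFERENCES model(id)
        );
        CREATE TABLE dataset (
            id            INTEGER PRIMARY KEY,
            location      TEXT NOT NULL,
            experiment_id INTEGER NOT NULL REFERENCES experiment(id)
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.execute("INSERT INTO model (id, name) VALUES (1, 'demo_model')")
    conn.executemany(
        "INSERT INTO experiment (id, name, model_id) VALUES (?, ?, ?)",
        [(1, "control_run", 1), (2, "scenario_run", 1)],
    )
    conn.executemany(
        "INSERT INTO dataset (id, location, experiment_id) VALUES (?, ?, ?)",
        [(1, "/archive/demo/control", 1), (2, "/archive/demo/scenario", 2)],
    )
    conn.commit()
    return conn


def datasets_for_model(conn, model_name):
    """Follow the model -> experiment -> dataset links with one query."""
    return conn.execute(
        """
        SELECT e.name, d.location
        FROM dataset AS d
        JOIN experiment AS e ON d.experiment_id = e.id
        JOIN model AS m ON e.model_id = m.id
        WHERE m.name = ?
        ORDER BY d.id
        """,
        (model_name,),
    ).fetchall()
```

Calling `datasets_for_model(build_demo_db(), "demo_model")` returns one `(experiment name, data location)` pair per dataset. The same query could be issued by a local or a remote client through any SQL interface.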
3. Desirable Features of Model Data Factory
- Relational database storing metadata, containing descriptions of
  - model components and model configuration
  - scenarios
  - postprocessing (model output and CMOR) directives
  - experiments
  - variables
  - formalized rules of Quality Control
  - data locations
  - task scheduler
  - user and group accounts
- XML as data exchange format
  - for compliance with FRE
  - working format of existing third-party software
  - well suited to hierarchical metadata description
  - widely used, easy to exchange with other Data Portals
- Model Builder (FMS Runtime Environment in GFDL)
  - checks out available model components from DB
  - chooses model datasets from DB
  - sets postprocessing directives
4. Desirable Features of Model Data Factory
(continued)
- Climate Model Output Rewriter (CMOR) subsystem
  - prepares data consistently with specific project requirements
- Data Publisher
  - transfers data to Data Portal storage in accordance with settings from DB
- Data Portal Software Package
  - Configuration Manager (configures Aggregation Server and Data Portal Interface)
  - Search Catalog Engine
  - Data Subsampling Engine
  - Data Computation Engine
  - Data Visualization
  - Data Delivery Manager
5. Standard scenario of a functioning Model Data
Factory (ideal picture)
- Scientist builds a model in FRE using available
model components, datasets and a forcing scenario.
- FRE puts metadata about the built model, scenario and
experiment into curator DB and runs the experiment.
- Postprocessing subsystem extracts metadata about the
postprocessing plan from curator DB and
executes it, and on finish puts metadata about the
processed experiment back into DB.
- Data Publisher (DP) regularly checks curator DB
for new experiments marked as public and, if it
finds any, invokes CMOR.
- CMOR goes to curator DB for metadata and
processes the needed data following the metadata
instructions.
- DP calls QAC and then transfers data to Data
Portal storage.
- Configuration Manager configures Aggregation
Server and Data Portal Interface and puts records
about new public data in curator DB.
- End of process, data is ready to go.
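The Data Publisher steps in the scenario above can be sketched as a single polling pass. This is only an illustration under stated assumptions: the `experiment` table and its `is_public` and `published` columns are hypothetical, and CMOR, QAC and the data transfer are passed in as plain callables rather than real subsystem interfaces. The `published` flag is a simplification standing in for the records that the Configuration Manager writes back to curator DB.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for curator DB (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE experiment ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " is_public INTEGER NOT NULL DEFAULT 0,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    # Made-up rows: one new public experiment, one private, one already published.
    conn.executemany(
        "INSERT INTO experiment (name, is_public, published) VALUES (?, ?, ?)",
        [("exp_a", 1, 0), ("exp_b", 0, 0), ("exp_c", 1, 1)],
    )
    conn.commit()
    return conn


def publish_pass(conn, run_cmor, run_qac, transfer):
    """Run one polling pass and return the names of newly published experiments.

    run_cmor, run_qac and transfer are hypothetical callables that take an
    experiment name; run_qac returns True when the data passes the check.
    """
    pending = conn.execute(
        "SELECT id, name FROM experiment"
        " WHERE is_public = 1 AND published = 0 ORDER BY id"
    ).fetchall()
    published = []
    for exp_id, name in pending:
        run_cmor(name)          # rewrite the data to project requirements
        if not run_qac(name):   # skip experiments that fail the check
            continue
        transfer(name)          # copy the data to Data Portal storage
        conn.execute(
            "UPDATE experiment SET published = 1 WHERE id = ?", (exp_id,)
        )
        published.append(name)
    conn.commit()
    return published
```

Running `publish_pass` repeatedly (for example from a scheduler) picks up only experiments that are public and not yet published, so a second pass over the same database does nothing.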
6. Common functionality schema of Model Data
Factory
7. Database Compartments
Database curator design
- Model Metadata Compartment
  - contains model descriptions, allows building a coupled model of the needed configuration
- Variables Compartment
  - list of all related physical variables
- Workflow Compartment
  - contains scenarios, experiments, institutions, projects and users info
- Postprocessing Compartment
  - defines the postprocessing plan for a conducted experiment
- Data Portal Compartment
  - contains info about experiment data
9. Model Metadata Compartment (in development)
Workflow Compartment
Experiments
Variables Compartment
10. Data Samples from Model Compartment
11. Variables Compartment
Workflow Compartment
12. Data Sample from Variables Compartment
13. Workflow Compartment (in development)
14. Data Samples from Workflow Compartment
15. Postprocessing Compartment
Data Samples from Postprocessing Compartment
16. Data Portal Compartment
17. Data Samples from Data Portal Compartment
18. Curator DB is in use now
19. Future Development
- Align DB terms with conventional terminology.
- Set up model metadata schema standards and create
tables in curator DB following this schema.
- Fill these tables with real metadata extracted
from models of GFDL, CCSM, MIT and from ESMF
Component Database.
- Implement tables for observation data metadata.
- Implement DODS aggregated data support.
- Build an XML bridge for transcoding DB
input/output to and from XML.
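One possible shape for such an XML bridge is sketched below using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`table`, `record`, `field`, `name`) are assumptions made for this example, not an existing FRE or curator DB format, and all values are carried as text.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table, columns, rows):
    """Serialize database rows into a simple, hypothetical XML layout."""
    root = ET.Element("table", name=table)
    for row in rows:
        record = ET.SubElement(root, "record")
        for column, value in zip(columns, row):
            field = ET.SubElement(record, "field", name=column)
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(xml_text):
    """Parse the layout produced by rows_to_xml back into rows.

    Returns (table name, column names, rows); every value comes back
    as a string because XML text carries no type information.
    """
    root = ET.fromstring(xml_text)
    columns = []
    rows = []
    for record in root.findall("record"):
        fields = record.findall("field")
        if not columns:
            # Column names are taken from the first record.
            columns = [f.get("name") for f in fields]
        rows.append(tuple(f.text or "" for f in fields))
    return root.get("name"), columns, rows
```

A round trip through `rows_to_xml` and `xml_to_rows` preserves table name, column order and row order. Restoring column types would need extra information, such as a type attribute on each field, which is left out of this sketch.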
20. END
- Questions?
- Suggestions?
- Objections?
- Thanks!