ESG Publication Tools

Slides: 11

Provided by: deanwi


1
ESG Publication Tools
PCMDI Software Team
ESG All Hands Meeting, Boulder, Colorado, April 29, 2008
LLNL-PRES-403079
2
Overview

Publication is the process of generating metadata about ESG datasets and making that information available to ESG services
Search, browse, download, and server-side processing rely on published metadata
Eventually will tie into a notification service
Unit of work is a dataset
Question: is there a need to publish individual files directly?
Publication deals with files and aggregations as first-class objects
Persons responsible for publication are the data publishers

3
Goals

Publisher can read metadata in a collection of files, and:
Add new metadata
Modify existing metadata
Add, update, delete dataset
Flexibility to add new projects
Static configuration where possible (minimize coding)
Logic can be encapsulated in project-specific handlers
Metadata fields of interest are defined by the configuration
Different projects may have different metadata items
CF-1 support
Standard names
Spatio-temporal coordinates
Standard configuration
.ini style
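
As a rough illustration of the static, .ini-style configuration goal, the sketch below reads a per-project section with Python's standard configparser module. The section name, option names, and handler path are hypothetical and not taken from the ESG publisher itself; the point is only that metadata fields can be declared in configuration instead of code.

```python
import configparser

# Hypothetical configuration text: every section and option name here is
# an illustrative assumption, not the real ESG publisher schema.
EXAMPLE_INI = """
[project:ipcc_ar4]
handler = my_handlers.IPCCHandler
metadata_fields = model, experiment, run_name, time_frequency
required_fields = model, experiment
"""


def split_list(value):
    """Split a comma-separated option value into a list of clean names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_project_config(text, project):
    """Return the handler name and metadata field lists for one project."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["project:" + project]
    return {
        "handler": section.get("handler"),
        "metadata_fields": split_list(section.get("metadata_fields", fallback="")),
        "required_fields": split_list(section.get("required_fields", fallback="")),
    }


config = load_project_config(EXAMPLE_INI, "ipcc_ar4")
print(config["metadata_fields"])
```

Under this assumption, adding a new project would mean adding a new section, with a project-specific handler named only where extra logic is needed.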

4
Goals

GUI, but publishing is also scriptable
Quality control checks for
Duplication of data
Validity of coordinate metadata (e.g. monotonicity of time dimension)
Validity of standard name
Generation of THREDDS catalogs to support LAS, harvesting
Generation of data aggregations
Ability to publish both online and offline (tertiary storage) datasets
For offline data, requires a list of paths / file sizes
Support for Dublin Core
Some CF fields map to DC
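
Two of the quality control checks listed above can be sketched in a few lines. This is a minimal illustration, assuming time values are already available as a plain list of numbers and that duplicate data is detected by comparing checksums; the slides do not say how the publisher actually detects duplicates, and the function names are hypothetical.

```python
from collections import defaultdict


def first_time_violation(times):
    """Return the index of the first time value that is not strictly
    greater than its predecessor, or None if the axis is monotonic."""
    times = list(times)
    for index in range(1, len(times)):
        if times[index] <= times[index - 1]:
            return index
    return None


def find_duplicates(files):
    """Group (path, checksum) pairs by checksum and keep only the groups
    with more than one path. Checksum comparison is an assumption."""
    by_checksum = defaultdict(list)
    for path, checksum in files:
        by_checksum[checksum].append(path)
    return {c: paths for c, paths in by_checksum.items() if len(paths) > 1}


good_axis = [0.0, 30.0, 60.0, 90.0]
bad_axis = [0.0, 30.0, 30.0, 90.0]   # repeated value at index 2
duplicates = find_duplicates([
    ("a.nc", "abc123"),
    ("b.nc", "def456"),
    ("copy_of_a.nc", "abc123"),
])
```

Returning the offending index instead of a bare true/false makes it easier to report which time step in which file failed the check.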

5
The Process

Specify
Project (IPCC_AR4, C-LAMP, NARCCAP, ...)
Dataset
Metadata may be read from self-describing dataset, or input by user
Options for specifying a dataset
Read paths from a file
Regular expression template for paths
Directory name and file filter
Generate dataset metadata by
Scanning self-describing dataset to extract metadata
Aggregating variables
Create/replace/update/delete
Publish
Generate THREDDS catalog. The form of the catalog may depend on whether the dataset is aggregated, non-aggregated, or offline
Release data for harvesting
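
The three options for specifying a dataset's files (a path list, a regular expression template, and a directory plus file filter) might look like the following sketch. The function names, the '#' comment rule for path lists, the example paths, and the named groups in the template are all assumptions made for illustration; the functions work on a list of path strings so the example runs without touching a file system.

```python
import fnmatch
import posixpath
import re


def read_path_list(lines):
    """Keep non-empty lines that are not comments ('#' rule is assumed)."""
    paths = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            paths.append(line)
    return paths


def select_by_template(paths, template):
    """Keep paths that fully match the regular expression template and
    return each with the metadata captured by its named groups."""
    pattern = re.compile(template)
    selected = []
    for path in paths:
        match = pattern.fullmatch(path)
        if match is not None:
            selected.append((path, match.groupdict()))
    return selected


def select_by_directory(paths, directory, file_filter):
    """Keep paths directly inside `directory` whose file name matches a
    shell-style filter such as '*.nc'."""
    directory = directory.rstrip("/")
    return [
        path for path in paths
        if posixpath.dirname(path) == directory
        and fnmatch.fnmatchcase(posixpath.basename(path), file_filter)
    ]


listing = read_path_list([
    "# made-up example paths",
    "/data/model_a/run1/tas.nc",
    "/data/model_a/run1/notes.txt",
    "/data/model_b/run2/pr.nc",
    "",
])
template = r"/data/(?P<model>[^/]+)/(?P<run>[^/]+)/[^/]+\.nc"
by_template = select_by_template(listing, template)
by_directory = select_by_directory(listing, "/data/model_a/run1", "*.nc")
```

A template with named groups has the side benefit of yielding metadata (here a model and run name) from the path itself, which could supplement what is read from the self-describing files.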

6
Dataset publishing on an ESG node: Metadata specification

Dataset pane
shows metadata in a file, allows modification
is project-specific
metadata is extracted from the first file in the list

Output pane
displays logged results
log level is configurable

Expansion buttons in left pane correspond to publication steps.

Status bar
shows scan progress

7
Data scan
1. Dataset is created or updated based on input metadata. Required fields are highlighted. Selecting an extraction option starts the dataset scan. Options are:
- create a new dataset, or replace the dataset if it exists
- append or update: the files are added to an existing dataset
2. Files are scanned and internal database tables populated.
3. If an aggregation dimension is found or specified, variables are aggregated.
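
The scan steps above can be mimicked with an in-memory stand-in for the internal database. This is only a sketch under stated assumptions: the catalog is a plain dictionary, the option names "create_or_replace" and "append_or_update" are invented labels for the two extraction options, appending to a missing dataset is treated as an error, and aggregation is reduced to ordering each variable's files along the time dimension.

```python
def scan_dataset(catalog, name, files, option):
    """Apply one extraction option to `catalog` (dataset name -> {path: size}).
    Option names are hypothetical labels for the two choices on the slide."""
    if option == "create_or_replace":
        # Start from scratch, discarding any existing file list.
        catalog[name] = dict(files)
    elif option == "append_or_update":
        if name not in catalog:
            raise KeyError("dataset does not exist: " + name)
        # New paths are added; paths already present get the new size.
        catalog[name].update(files)
    else:
        raise ValueError("unknown extraction option: " + option)
    return catalog[name]


def aggregate_variables(records):
    """Group (variable, path, first_time) records by variable and order
    each variable's files by their first time value."""
    grouped = {}
    for variable, path, first_time in records:
        grouped.setdefault(variable, []).append((first_time, path))
    return {
        variable: [path for _, path in sorted(items)]
        for variable, items in grouped.items()
    }


catalog = {}
scan_dataset(catalog, "example.dataset", {"tas_2.nc": 200}, "create_or_replace")
scan_dataset(catalog, "example.dataset", {"tas_1.nc": 100}, "append_or_update")
aggregations = aggregate_variables([
    ("tas", "tas_2.nc", 365.0),
    ("tas", "tas_1.nc", 0.0),
    ("pr", "pr_1.nc", 0.0),
])
```

A real publisher would persist these tables and read the time values from the files themselves; the dictionary here only makes the difference between the two options easy to see.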
8
Data aggregation and publication