Title: The Earth System Curator
1The Earth System Curator
- Metadata Infrastructure for Climate Modeling
- Rocky Dunlap
- Georgia Tech
2What is the Earth System Curator?
- The goal of Curator is to link climate datasets
with a detailed description of the model that was run
to produce each dataset
- Use cases for climate model metadata
- Provenance (history of what happened)
- Archival and search (for models and datasets)
- Model inter-comparison
- Compatibility checking
- Generation of coupler components
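As a rough illustration of the compatibility-checking use case, the sketch below compares the fields one component exports with the fields another component expects, matching on short name and units. The `Field` type, the `check_compatibility` helper, and the matching rule are assumptions made for illustration; they are not part of Curator or ESMF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A coupling field as described by component metadata (hypothetical type)."""
    shortname: str
    units: str


def check_compatibility(exported, required):
    """Return a list of human-readable problems; an empty list means compatible."""
    by_name = {f.shortname: f for f in exported}
    problems = []
    for need in required:
        have = by_name.get(need.shortname)
        if have is None:
            problems.append(f"missing field {need.shortname}")
        elif have.units != need.units:
            problems.append(
                f"unit mismatch for {need.shortname}: {have.units} vs {need.units}"
            )
    return problems


# The exported fields reuse the dycore example; the consumer side is invented.
exported = [Field("DPEDT", "Pa s-1"), Field("DUDT", "m s-2")]
required = [Field("DUDT", "m s-2"), Field("DVDT", "m s-2")]
problems = check_compatibility(exported, required)
print(problems)
```

A real check would likely also compare grids, calendars, and standard names; only names and units are compared here to keep the idea visible.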
3Collaborators
- Earth System Modeling Framework (ESMF)
- Software infrastructure to facilitate building
numerical Earth System models
- Component-based model development
- Built-in tools for managing common modeling tasks
(coupling fields, calendars, grid creation, etc.)
- Earth System Grid (ESG)
- A large-scale distributed portal for hosting data
produced by Earth System models
- Services such as dataset ingest, faceted search,
dataset browsing, viewing metadata, downloading
datasets
4Getting Model Metadata into ESG
- ESMF modeling components are self-describing
- Metadata is exported from an ESMF component in
XML format
- The XML is ingested into ESG and exposed to the
data portal
- Users discover components and datasets via the
portal
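A minimal sketch of what "self-describing" could look like in practice: a component description held in memory is serialized to XML. The element and attribute names follow the example output in this deck; the `describe_component` function and the dictionary layout are hypothetical and do not represent the ESMF API.

```python
import xml.etree.ElementTree as ET


def describe_component(name, attributes):
    """Build a <model_component> element from simple name lists.

    `attributes` maps a singular tag (e.g. "discipline") to a list of names;
    each entry becomes <tag name="..."/> inside a <tag_set> wrapper.
    """
    root = ET.Element("model_component", {"name": name})
    for tag, names in attributes.items():
        wrapper = ET.SubElement(root, f"{tag}_set")
        for value in names:
            ET.SubElement(wrapper, tag, {"name": value})
    return root


component = describe_component(
    "Finite Volume Dynamical Core",
    {"discipline": ["Atmosphere"], "agency": ["NASA"]},
)
xml_text = ET.tostring(component, encoding="unicode")
print(xml_text)
```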
5Metadata Lifecycle
6Metadata Lifecycle
- ESMF component exports XML metadata
- The XML is validated and harvested into a Java
object representation
- The Java objects are persisted to a relational
database (RDBMS)
- Metadata in the RDBMS is then harvested into RDF,
a Semantic Web ontology language
- The RDF is accessed by the ESG web portal for
faceted search of the metadata
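The lifecycle steps can be sketched end to end in a few lines. This is an illustration under stated assumptions, not the Curator implementation: Python dataclasses stand in for the Java objects, an in-memory SQLite database stands in for the RDBMS, the "validation" is only a well-formedness and root-tag check, and the `http://example.org/curator#` namespace is a placeholder vocabulary.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass

XML = (
    '<model_component name="Finite Volume Dynamical Core">'
    '<agency_set><agency name="NASA" /></agency_set>'
    "</model_component>"
)
NS = "http://example.org/curator#"  # hypothetical vocabulary namespace


@dataclass
class Component:
    name: str
    agencies: list


def harvest(xml_text):
    """XML -> object: parse, do a minimal sanity check, build a Component."""
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    if root.tag != "model_component" or not root.get("name"):
        raise ValueError("not a named model_component")
    return Component(root.get("name"), [a.get("name") for a in root.iter("agency")])


def persist(conn, component):
    """Object -> relational tables; returns the new component row id."""
    conn.execute("CREATE TABLE IF NOT EXISTS component (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS agency (component_id INTEGER, name TEXT)")
    cur = conn.execute("INSERT INTO component (name) VALUES (?)", (component.name,))
    conn.executemany(
        "INSERT INTO agency VALUES (?, ?)",
        [(cur.lastrowid, a) for a in component.agencies],
    )
    return cur.lastrowid


def to_triples(conn):
    """Relational rows -> N-Triples-style statements (no literal escaping)."""
    triples = []
    rows = conn.execute("SELECT id, name FROM component ORDER BY id").fetchall()
    for cid, name in rows:
        subject = f"<{NS}component/{cid}>"
        triples.append(f'{subject} <{NS}name> "{name}" .')
        agencies = conn.execute(
            "SELECT name FROM agency WHERE component_id = ? ORDER BY name", (cid,)
        ).fetchall()
        for (agency,) in agencies:
            triples.append(f'{subject} <{NS}agency> "{agency}" .')
    return triples


conn = sqlite3.connect(":memory:")
persist(conn, harvest(XML))
triples = to_triples(conn)
print("\n".join(triples))
```

Each arrow in the chain is a separate translation with its own schema, which is why a change to the XML format tends to ripple through every later stage.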
7ESMF XML Output (example)
<model_component name="Finite Volume Dynamical Core">
  <discipline_set>
    <discipline name="Atmosphere" />
  </discipline_set>
  <physical_domain_set>
    <physical_domain name="Earth system" />
  </physical_domain_set>
  <agency_set>
    <agency name="NASA" />
  </agency_set>
  <institution_set>
    <institution name="Global Modeling and Assimilation Office (GMAO)" />
  </institution_set>
XML output will be available in the next ESMF release
8ESMF XML Output (example)
  <author_set>
    <author name="Max Suarez" />
  </author_set>
  <coding_language_set>
    <coding_language name="Fortran 90" />
  </coding_language_set>
  <model_component_framework_set>
    <model_component_framework name="ESMF (Earth System Modeling Framework)" />
  </model_component_framework_set>
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
  </variable_set>
</model_component>
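Metadata in this form can be read with any standard XML parser. The sketch below uses Python's standard library to pull the variable set out of an abbreviated copy of the example; the choice of Python and the dictionary layout are assumptions for illustration, not part of the ESMF or ESG tooling.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example output, keeping only the variable set.
XML = """
<model_component name="Finite Volume Dynamical Core">
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
  </variable_set>
</model_component>
"""

root = ET.fromstring(XML.strip())

# Map each short name to its (long name, units) pair.
variables = {
    v.get("shortname"): (v.get("longname"), v.get("units"))
    for v in root.findall("variable_set/variable")
}

for shortname, (longname, units) in variables.items():
    print(f"{shortname}: {longname} [{units}]")
```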
9ESG Prototype Data Portal
- Figure labels: Faceted search; Harvested component
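Faceted search over harvested metadata amounts to counting the values of each attribute and filtering records by the values a user selects. The sketch below shows the idea on in-memory records; the ESG portal itself queries RDF, and every record other than the dycore entry is invented for illustration, as are the `facet_counts` and `filter_by` helpers.

```python
from collections import Counter

# Hypothetical harvested records; facet names mirror the XML element names.
records = [
    {"name": "Finite Volume Dynamical Core", "discipline": "Atmosphere", "agency": "NASA"},
    {"name": "Example Ocean Model", "discipline": "Ocean", "agency": "Example Agency"},
    {"name": "Example Spectral Dycore", "discipline": "Atmosphere", "agency": "Example Agency"},
]


def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)


def filter_by(records, **selected):
    """Keep only the records matching every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]


counts = facet_counts(records, "discipline")
hits = filter_by(records, discipline="Atmosphere", agency="NASA")
print(dict(counts))
print([h["name"] for h in hits])
```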
10ESG Prototype Data Portal
11ESG Prototype Data Portal
12Demo of Dycore Portal
http://dycore.ucar.edu/
13Technical Challenges Along the Way
- Web-based systems often deal with many data
formats, which requires many translations.
- XML → Java objects → relational database → RDF
- Collaborative ontology development is
challenging. There is a need for both governance
structures and tools.
- This will continue to affect all areas of
e-science
14Future Work for Curator
- Continue working with ESG to add more
sophisticated metadata to the portal
- ESMF components to export more detailed XML
- Collaborate with international partners to
design metadata that can be used by a wide
audience