Title: Metadata and Data Discovery Expert Team
1Metadata and Data Discovery Expert Team
- Recommendations to the
- DMAC Steering Team
- May 10, 2006
2Metadata for IOOS
- The information needed to
- identify
- assess
- use
- access
- exchange
- transport
- archive
- data for IOOS
3Metadata Aspects
- Metadata Expert Team divided metadata issues into the following:
- Content
- Format specification
- Vocabularies
- Discovery
4Team's Approach
- Identify existing practices
- What currently exists that IOOS can readily use/adopt?
- Evaluate capabilities
- What does the use of a particular practice buy the IOOS community?
- Determine IOOS requirements
- How and what DMAC and IOOS participants need to do in order to utilize these systems
- Focus
- Catalog Services / Discovery Mechanisms
- Metadata Standards
5Recommendations
- Metadata Expert Team's effort resulted in five reports/recommendations to the DMAC Steering Team
- IOOS Metadata Requirements Version 1
- Metadata Standards Review and Supporting Tools
- Metadata Catalog Services
- IOOS Vocabulary Version 1
- Required and Recommended Vocabularies for IOOS
Metadata
6Recommendation
- IOOS Metadata Content Requirements
- Version 1
- The following information is proposed as initial
minimal content for IOOS metadata. Examples for
adapting the FGDC, DIF, and OBIS metadata
standards are included and should be applied
where appropriate.
7The Problem
- Info in Info out
- Very little metadata means very little capability
- FGDC and other metadata standards can be written with minimal information
- title, summary, keywords, lat/lon, date
- Doesn't provide the info needed for IOOS
8Metadata for IOOS
- Collection method
- Accuracy of location
- Temporal extent
- Observation parameters
- Analysis method
- Spatial extent
- Quality assurance
- Depth
- Instrument
- Calibration
- Altitude
- Units
- Valid values
- Title
- Summary
- Keywords
- Server
- File system
- Database name
- Data set format
- Online resource URL
- Lineage
- Unique ID
- Version history
- Liability
- System environment
- Data processing methods
- Access constraints
9Proposed metadata requirements
- For this initial recommendation
- List of metadata needs by desired functionality
- Use, access, transport, etc.
- Content only
- Ensure metadata contains the info needed to work in IOOS
- Format will be addressed in future
10Metadata content by functionality
- Organized metadata needs into the following categories
- Consumer use
- Data management
- Discovery
- Access
- Data transport
- Archive
11Consumer use
- Textual summary or abstract, purpose
- Observation parameters (attribute names, units, valid values, etc.)
- Data source information (instrument type, manufacturer, calibration, etc.)
- Originator
- Quality assurance / quality control methods
- Sample collection methods in field (when applicable)
- Sample analysis methods (when applicable)
- Data processing methods
- Spatial extent (horizontal and vertical geographic location)
- Accuracy of location
- Temporal extent
- Use constraints
- Liability
12Data management
- Title
- Unique identifier
- Textual summary or abstract, purpose
- Progress/status
- Observation parameters (attribute name, units, valid values, etc.)
- Data source information (instrument type, manufacturer, calibration, etc.)
- Originator
- Quality assurance / quality control methods
- Sample collection methods in field (when applicable)
- Sample analysis methods (when applicable)
- Data processing methods
- Spatial extent (horizontal and vertical geographic location)
- Accuracy of location
- Temporal extent
- Data set type (raster, vector, point)
- Data set format
- Data set location (server, file system, database name, etc.)
- Use constraints
- Liability
13Discovery
- These (traditional) fields provide very limited discovery
- Title
- Originator
- Keywords
- Spatial extent
- Temporal extent
- More fields should be added as IOOS evolves.
14Access
- Online resource URL (http, ftp, telnet, etc.)
- Access constraints
- Data set format
- System environment (software, operating system, etc.)
- Will coordinate with Data Transport Expert Team on additional fields and formats
15Data transport
- Data set type (raster, vector, point)
- Data set format (shapefile, tabular ASCII, GRIB, etc.)
- System environment (software, operating system, etc.)
- Will coordinate with Data Transport Expert Team on additional fields and formats
16Archive
- These fields were requested by the Archive Expert Team
- Unique ID
- Data set latency specification
- Data set lineage and version history
- Quality assurance / quality control methods
- Sample collection methods in field (when applicable)
- Sample analysis methods (when applicable)
- Data processing methods
- Data set citation and reference (publications based on analysis of data set, etc.)
- Expiration date (date data set should be deleted)
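The field lists on the functionality slides can be read as a per-functionality checklist. The following is a minimal sketch in Python, assuming a metadata record is held as a plain dictionary. The key names (such as `online_resource_url`), the `missing_fields` helper, and the sample record are all hypothetical; only the Discovery and Access lists are encoded, and the slides specify content, not key spellings or format.

```python
# Hypothetical key names for the Discovery and Access content lists.
REQUIRED_FIELDS = {
    "discovery": [
        "title", "originator", "keywords",
        "spatial_extent", "temporal_extent",
    ],
    "access": [
        "online_resource_url", "access_constraints",
        "data_set_format", "system_environment",
    ],
}


def missing_fields(record, functionality):
    """Return the listed fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS[functionality]
            if not record.get(name)]


# Made-up record, for illustration only.
record = {
    "title": "Example buoy temperature series",
    "originator": "Example Observing System",
    "keywords": ["IOOS", "Temperature"],
    "spatial_extent": {"west": -71.0, "east": -70.0,
                       "south": 41.0, "north": 42.0},
    "temporal_extent": ("2005-01-01", "2005-12-31"),
    "data_set_format": "netCDF",
}

print(missing_fields(record, "discovery"))
print(missing_fields(record, "access"))
```

A record that passes the Discovery check can still fail the Access check, which is the point of organizing the content by functionality.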
17Templates and Examples
- Templates and examples available for
- FGDC
- DIF
- OBIS
18Next steps
- IOOS Metadata Requirements
- Determine needs for machine-to-machine fields
- Discovery
- Access
- Data transport
- Modeling?
- Post current version on public web page
19Recommendation
- IOOS Metadata Requirements Version 1
- Discussion / Comments / Suggestions / Decisions
20Recommendation
- Metadata Standards Review and
- Supporting Tools
- Fund the further review of existing metadata standards to
- Identify those that meet IOOS requirements
- Expedite the process to review future standards
- Identify where further refinements are needed for existing metadata standards
21Matrix
- 14 metadata standards
- 850 metadata fields
- 6 functionalities
- 4 data types
22Status
- Selection of metadata standards
- DMAC plan, input from team
- Need input from community
- Crosswalk of fields
- Completed for FGDC, DIF, OBIS metadata, ISO 19115
- Started for others but needs lots of input and review
- Not a comprehensive list of fields to date
- Functionalities
- Some work mapping fields to IOOS minimal metadata requirements
- Data/observation types
- No work to date
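A crosswalk of this kind can be held as a table keyed by IOOS content item, with one column per metadata standard. The following is a minimal sketch in Python, assuming records are flat dictionaries. The element labels are illustrative short names chosen for this example and should be checked against each standard before use; the `translate` helper and the sample record are hypothetical, and a `None` entry only means the item is left unmapped in this sketch.

```python
# Illustrative crosswalk: IOOS content item -> label used in each standard.
# None marks an item that this sketch does not map for that standard.
CROSSWALK = {
    "unique_id": {"DIF": "Entry_ID", "FGDC": None},
    "title": {"DIF": "Entry_Title", "FGDC": "title"},
    "summary": {"DIF": "Summary", "FGDC": "abstract"},
}


def translate(record, source, target):
    """Rename the fields of a flat record from one standard's labels to another's.

    Returns the translated record and the list of content items that were
    present in the source but have no label in the target.
    """
    translated, unmapped = {}, []
    for item, labels in CROSSWALK.items():
        src, dst = labels.get(source), labels.get(target)
        if src not in record:
            continue  # item not present in the source record
        if dst is None:
            unmapped.append(item)
        else:
            translated[dst] = record[src]
    return translated, unmapped


# Made-up record, for illustration only.
dif_record = {
    "Entry_ID": "EXAMPLE-001",
    "Entry_Title": "Example salinity survey",
    "Summary": "Made-up record for illustration.",
}

fgdc_record, gaps = translate(dif_record, "DIF", "FGDC")
print(fgdc_record)
print(gaps)
```

Reporting the unmapped items alongside the translated record makes the gaps between standards visible, which is what the field-by-field review is meant to surface.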
23Matrix Metadata Standards
- 14 metadata standards
- Federal Geographic Data Committee (FGDC)
- DIF
- OBIS metadata
- OBIS schema
- ISO 19115
- JGOFS
- netCDF
- WMO core ocean gridded data
- Open Archival Information System (OAIS)
- Ecological Metadata Language
- Geographic Markup Language
- Sensor ML
- Marine XML
- Earth Science Markup Language
24Matrix Metadata Standards
25Matrix - Functionalities
- 6 functionalities
- Discovery
- Data consumer use
- Data manager use
- Access
- Transport
- Archive
- Results in IOOS minimal or must have metadata
26Matrix Functionalities Example
27Data/Observation Types
- Are these the right ones?
- Gridded, raster
- Single dimensional, point
- Vector
- Model data
- ???
28Status
- Selection of metadata standards
- DMAC plan, input from team
- Need input from community
- Crosswalk of fields
- Completed for FGDC, DIF, OBIS, ISO metadata
- Started for others but needs lots of input and review
- Functionalities
- Some work mapping fields to IOOS minimal metadata requirements
- Data/observation forms/structure
- No work to date
29Next Steps
- Critical component of DMAC
- Requires lots of time and attention to detail
- Complete as contract deliverable
- Complete crosswalks for other metadata standards
- Analyze field content for machine-to-machine interoperability
- Research fields needed to adequately describe observation types, forms, and parameters
- Track and incorporate efforts such as MMI, QARTOD, ORION to document observational parameters of interest to IOOS
30Recommendation
- Metadata Standards Review and Supporting Tools
- Discussion / Comments / Suggestions / Decisions
31Recommendation
- Metadata Catalog Services
- Metadata shall be provided to the National
Spatial Data Infrastructure (NSDI) Clearinghouse,
the Global Change Master Directory (GCMD), or the
Geospatial One-Stop (GOS) catalog. Additionally,
when applicable, metadata and data for biological
observations shall be submitted to the Ocean
Biogeographic Information System (OBIS)
32Existing Catalog Services
- Identification of catalog services
- National Spatial Data Infrastructure (NSDI) Clearinghouse
- Geospatial One-Stop (GOS)
- Global Change Master Directory (GCMD)
- Ocean Biogeographic Information System (OBIS)
33National Spatial Data Infrastructure (NSDI)
Clearinghouse
- Description
- part of the Federal Geographic Data Committee coordinated data sharing effort
- community of distributed data providers
- publish (server access) metadata collections
- portal now using GOS framework
- Metadata
- FGDC Content Standard for Digital Geospatial Metadata
- FGDC Profiles and Extensions
- Use
- 404 clearinghouse nodes
- 191 US domestic, 213 non-domestic
- access to over 9.6 million metadata records
- 8,283,326 US domestic, 1,348,120 non-domestic
34Geospatial One-Stop (GOS)
- Description
- part of the Federal Geographic Data Committee coordinated data sharing effort
- supports the e-gov Geospatial One-Stop Initiative
- access to registered geographic information and online access services including map services
- portal provides map-based query and filtering by topic categories
- metadata individually submitted and/or collections harvested
- Metadata
- FGDC Content Standard for Digital Geospatial Metadata
- FGDC Profiles and Extensions
- Use
- 923 publishers
- 103,870 metadata records
35Global Change Master Directory (GCMD)
- Description
- established by NASA to promote and facilitate the exchange of scientific data sets in support of global change research
- holds descriptions of earth and environmental science data sets and services
- portal provides the capability to search for data and services
- GCMD content is harvested by GOS
- Metadata
- Data Interchange Format (DIF)
- Service Entry Resource Format (SERF)
- Use
- contributions from 2564 data centers
- 47 countries (including US)
- contains over 16,300 unique metadata records
- over 4000 records under the category ocean
36Ocean Biogeographic Information System (OBIS)
- Description
- established by the Census of Marine Life (CoML)
- provides global geo-referenced information on marine species
- provides several spatial query tools for visualizing relationships among species and their environment
- includes data from an international federation of providers
- not limited to CoML-related projects
- Metadata
- Darwin Core 2 OBIS extension
- OBIS Discovery Metadata
- Use
- 98 databases worldwide
- access to 9.2 million data and metadata records
37Rationale
- These catalog services
- are national and/or international in extent
- are established resources with support and ample guidance
- provide for data discovery
- enable initial cataloging of IOOS data
- possibly applied as step in OS certification
38Next Steps
- Monitor progress of catalog service coordination
- all metadata submitted to these services discoverable through GOS
- duplication of metadata records among different services
- ongoing work by organizations maintaining these services
- Streamline the process for metadata submission
- Test application of Expert Team vocabulary recommendations in these services
- Address other existing services
- EPA Exchange Network
- National Water Quality Monitoring Network
39Recommendation
- Metadata Catalog Services
- Discussion / Comments / Suggestions / Decisions
40Recommendation
- IOOS Vocabulary
- Version 1
- IOOS should endorse the establishment of a controlled vocabulary and adopt Version 1 of the IOOS Vocabulary as specified
41Description
- A discrete set of terms that can be referenced in metadata (keywords) and other IOOS documentation
- Drawn from existing IOOS documentation
- The First U.S. Integrated Ocean Observing System (IOOS) Development Plan, Ocean.US Publication No. 9, January 2006.
- Contains terms identified as
- IOOS Identifiers
- IOOS Core Variables
- IOOS National Backbone Programs
42IOOS Identifiers
- Integrated Ocean Observing System
- IOOS
43IOOS Core Variables
- Bathymetry
- Bottom Character
- Contaminants
- Dissolved Nutrients
- Dissolved O2
- Fish Abundance
- Fish Species
- Heat flux
- Ice Distribution
- Ocean color
- Optical properties
- Pathogens
- Phytoplankton species
- Salinity
- Sea Level
- Surface Currents
- Surface Waves
- Temperature
- Zooplankton abundance
- Zooplankton species
44IOOS National Backbone Programs
- Altimeter Data Fusion Center
- ADFC
- Benthic Habitat Mapping and Monitoring
- Coastal Change Assessment Monitoring
- Coastal Field Data Collection Program
- CFDCP
- Coastal Mapping
- Coastal-Marine Automated Network
- C-MAN
- Coastwatch
- Commercial Statistics
- Coral reef mapping
- Coral reef monitoring
- Geostationary Operational Environmental Satellites
- GOES
- Global Seismic Network
- GSN
- Habitat Assessment
- Hydrographic Surveying
- National Current Observing Program
- National Data Buoy Center
- NDBC
- National Estuarine Research Reserve System
- NERRS
- National Observer Program
- National Stream Quality Accounting Network
- NSQAN
- National Streamflow Information Program
- NSIP
- National Water Level Observation Network
- NWLON
- Physical Oceanographic Real-Time System
- PORTS
- Polar Operational Environmental Satellite
- POES
- Protected Resource Surveys
- Recreational Fisheries
- Shoreline Change
45Application
- Initial application in metadata keywords
  <keywords>
    <theme>
      <themekt>IOOS Vocabulary Version 1</themekt>
      <themekey>Salinity</themekey>
      <themekey>Temperature</themekey>
      <themekey>Integrated Ocean Observing System</themekey>
      <themekey>IOOS</themekey>
      <themekey>National Estuarine Research Reserve System</themekey>
      <themekey>NERRS</themekey>
    </theme>
  </keywords>
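The keyword block on this slide can also be generated from a list of vocabulary terms instead of being typed by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the `build_keywords` helper is hypothetical, while the element names, thesaurus name, and terms are the ones shown on the slide.

```python
import xml.etree.ElementTree as ET


def build_keywords(thesaurus, terms):
    """Build a <keywords>/<theme> element with one <themekey> per term."""
    keywords = ET.Element("keywords")
    theme = ET.SubElement(keywords, "theme")
    ET.SubElement(theme, "themekt").text = thesaurus
    for term in terms:
        ET.SubElement(theme, "themekey").text = term
    return keywords


terms = [
    "Salinity",
    "Temperature",
    "Integrated Ocean Observing System",
    "IOOS",
    "National Estuarine Research Reserve System",
    "NERRS",
]

xml_text = ET.tostring(
    build_keywords("IOOS Vocabulary Version 1", terms), encoding="unicode"
)
print(xml_text)
```

Keeping the thesaurus name in one place means every record cites the same vocabulary version, which matters once a Version 2 exists.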
46Application
- Application in an ontology
  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns="http://marinemetadata.org/2005/02/ioos#"
      xml:base="http://marinemetadata.org/2005/02/ioos">
    <owl:Ontology rdf:about="">
    </owl:Ontology>
    <owl:Class rdf:ID="Physical">
      <rdfs:subClassOf rdf:resource="#Variable" />
    </owl:Class>
    <owl:Class rdf:ID="Variable" />
    <Physical rdf:ID="Surface_waves" />
    <Physical rdf:ID="Temperature" />
    <Physical rdf:ID="Bathymetry" />
    <Physical rdf:ID="Sea_level" />
    <Physical rdf:ID="Surface_Currents" />
    <Physical rdf:ID="Ice_distribution" />
    <Physical rdf:ID="Heat_flux" />
47Rationale
- By applying versioning and content control to a list of terms
- the list can be referenced
- relationships among the IOOS vocabulary and other vocabularies can be developed
- a knowledge base can begin to be established
- Establishing an IOOS vocabulary and applying these terms to data/metadata enhances the discovery of IOOS data in catalog systems
- A step toward interoperability by adopting a common vocabulary
48Next Steps
- Expand vocabulary as recommended to include
- Additional core variables as defined by IOOS
- OOS characteristics as determined by Regional Associations and individual Observing Systems
- System characteristics
- Sensors
- Observed parameters
- Etc.
- Submit Version 2 for Steering Team approval (Nov 06)
49Recommendation
- IOOS Vocabulary
- Version 1
- Discussion / Comments / Suggestions / Decisions
50Recommendation
- Required and Recommended
- Vocabularies for IOOS Metadata
- Require the application of controlled
vocabularies in metadata to support catalog
services
51Required and Recommended Vocabularies
- Six vocabularies are identified for initial use in IOOS metadata
- Two vocabularies are REQUIRED
- Others should be used where applicable to specific catalog services
52Required Vocabularies
- IOOS Vocabulary Version 1
- Identifiers (required)
- Core variable (recommended)
- National Backbone Programs (recommended)
- ISO Topic Categories
- farming
- biota
- boundaries
- climatologyMeteorologyAtmosphere
- economy
- elevation
- environment
- geoscientificInformation
- health
- imageryBaseMapsEarthCover
- intelligenceMilitary
- inlandWaters
- location
- oceans
- planningCadastre
- sociology
- structure
- transportation
- utilitiesCommunication
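The two required vocabularies lend themselves to an automated check on a record's keywords. The following is a minimal sketch in Python, assuming that one term from the IOOS Identifiers and one ISO Topic Category are enough to satisfy the requirement, which the slides do not state explicitly. The `check_required` helper and the sample keyword lists are hypothetical; the vocabulary terms are copied from the slides.

```python
IOOS_IDENTIFIERS = {"Integrated Ocean Observing System", "IOOS"}

ISO_TOPIC_CATEGORIES = {
    "farming", "biota", "boundaries", "climatologyMeteorologyAtmosphere",
    "economy", "elevation", "environment", "geoscientificInformation",
    "health", "imageryBaseMapsEarthCover", "intelligenceMilitary",
    "inlandWaters", "location", "oceans", "planningCadastre", "sociology",
    "structure", "transportation", "utilitiesCommunication",
}


def check_required(keywords):
    """Return a list of problems; an empty list means both checks pass.

    Assumption: at least one term from each required vocabulary suffices.
    """
    present = set(keywords)
    problems = []
    if not present & IOOS_IDENTIFIERS:
        problems.append("no IOOS identifier")
    if not present & ISO_TOPIC_CATEGORIES:
        problems.append("no ISO topic category")
    return problems


print(check_required(["IOOS", "oceans", "Salinity"]))
print(check_required(["Salinity"]))
```

Matching here is exact and case-sensitive; a catalog service might reasonably normalize case before comparing.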
53Recommended Vocabularies
- GCMD's Science Keywords and Associated Directory Keywords
- Apply to DIF and SERF metadata for use in GCMD
- Can be applied to FGDC metadata for discovery through NSDI and GOS
- IHO Codes for Oceans and Seas
- Apply to OBIS discovery metadata
- Can be applied to FGDC metadata for discovery through NSDI and GOS
- OBIS Marine Species Groups
- Apply to OBIS discovery metadata
- Can be applied to FGDC metadata for discovery through NSDI and GOS
- CF Standard Names
- Recommended by Modeling Caucus for model output datasets
- Can be applied to FGDC metadata for discovery through NSDI and GOS
54Rationale
- Recommendation supports the DMAC Plan's framework for interoperability by endorsing the adoption of common vocabulary
- Enhance IOOS data discovery
- Step toward optimizing use of existing catalog services
55Next Steps
- OOS community outreach
- identification of individual OOS vocabularies
- conversion of vocabularies into OWL
- ontology development harmonization among vocabularies
- Work with the RA Caucus, Marine Metadata Interoperability (MMI) project and other groups to advance the above items
56Recommendation
- Required and Recommended
- Vocabularies for IOOS Metadata
- Discussion / Comments / Suggestions / Decisions
57Work Plan
- Community recommendations critical
- Identify what needs to be captured per data type and collection method
- Sensor/Instrument
- Station characteristics
- QA/QC
- Attributes
- Transport
- Format/storage
- Identify vocabularies in use
- Coordinate work through RA Caucus
- Work with other organizations (MMI, QARTOD) to
compile guidance and formalize recommendations
58Work Plan
- Continuation of Standards evaluation
- Other standards to include
- Develop a schema repository
- Identify existing standards available to capture community needs
- Apply/combine portions of schema
- Explore registries
- Metadata
- Vocabulary
- Discrete entries for citing vocabularies for machine readability
- Develop guides
- Review available tools
59Ponderings
- How many metadata standards will be acceptable/workable?
- How many clearinghouse standards will be acceptable/workable?
- Will DMAC need to provide tools to garner participation?
- Who will write the code to make data discovery work? Who will write tools? How much funding is there for these things?
- Focus on how to build data discovery from various clearinghouses (mixture of all of above?)
- How will data discovery interoperate/coordinate with data access and transport?
- How will data discovery figure into system architecture plan now being developed?
- Build a portal to which clearinghouse?
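60We have some of these, so where's the problem? (1)
[Diagram: existing technologies arranged under the headings Metadata, Data, Protocol, and Content. Labels shown: HTTP, REST, SOAP, Z39.50, OPeNDAP, THREDDS, LAS, DCMI, ESML, ADL, TIF, DFDL, WxS, MarineXML, EML, NetCDF, FGDC, Coards/CF, GML, JPEG, ISO, HDF, ASCII]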
Slide from presentation by John Graybeal, MMI