Title: IPOPP Data Products for NPP
1. IPOPP Data Products for NPP
- Format and Metadata
- EOS/NPP Direct Broadcast Meeting
- Bangkok, Thailand
Bill Thomas, NPOESS IPO/MITRE
2. Mission Data Flow
Source: Wittman (Raytheon)
3. NPOESS products delivered at multiple levels
5. NPOESS Data Format Considerations
- Users and use patterns
  - Designed to serve operational users, current science users, and future archival users
  - Users via Centrals or via field terminals
  - Operational users need effectively all of the data quickly
  - Research users need current information, but often have time and resources to improve product quality with post-processing
  - Archival researchers look for highly selective data sets
- Sizing and Complexity
  - Increase in data volume over heritage instruments
  - Relatively short-duration granules, typically 30-90 seconds of data
  - Multiple versions of the data, from RDR to SDR to EDRs
- Anticipating change
  - Detailed formats, contents, product lists, and interfaces that must be accommodated by NPOESS data product framework format through the mission lifetime
6. NPOESS Data Format Overview
- Consistent structure across data products
  - Organization for each product is the same as all others
  - Granule is the fundamental unit of tracking, processing, and access
  - Granules can be combined into aggregations
- Data Products stored as HDF5 objects
  - Structural hierarchy from Files to Granules to Arrays
  - Files and Granules contain both data and metadata
  - Collection metadata (quasi-static) retained separate from granule (dynamic) metadata
- Profiles in XML
  - Describe the product contents
7. IPOPP Data Formats
- IPOPP data formats are planned to be consistent with the Mission-defined data formats used at the Centrals
- Benefits of commonality
  - Leverages contractor efforts in product definition
  - Facilitates usage scenarios that leverage both direct broadcast data and data processed at the Centrals
  - Facilitates interoperability of DB missions and Centrals
  - Facilitates DB participation in global Cal/Val campaigns
- Interface Control Documents and subsidiary configuration management documents will provide the details
8. NPOESS Motivation for HDF5
- Data Complexity
  - Several product types representing different processing levels
  - Many specific products
  - Complex products, with high data rates
- User Needs
  - Interoperability with user systems
  - Interface consistency
  - User community familiarity
- Maturity
- Performance
- Sustainability
9. Format Overview
- Data products are stored as a collection of n-dimensional arrays
- Aggregations extend a dataset dimension
- Data product, quality and geo arrays share a common index
- Quality Flags stored as packed bit arrays
- Multiple data products sharing GEO may be packaged together in a single file (if they share a common geolocation)
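As a minimal sketch of the layout described above, the following NumPy snippet models two toy granules whose data, quality, and geolocation arrays share one index, then aggregates them by extending the first dimension. The array names, shapes, and the choice of axis 0 as the extended dimension are assumptions for illustration, not taken from the NPOESS format definition.

```python
import numpy as np

def make_granule(n_scans, n_pixels, fill):
    """Build a toy granule: every array has the same (scan, pixel) shape."""
    shape = (n_scans, n_pixels)
    return {
        "data": np.full(shape, fill, dtype=np.float32),
        "quality": np.zeros(shape, dtype=np.uint8),
        "latitude": np.zeros(shape, dtype=np.float32),
        "longitude": np.zeros(shape, dtype=np.float32),
    }

def aggregate(granules):
    """Combine granules by extending axis 0 of every array (assumed convention)."""
    return {
        name: np.concatenate([g[name] for g in granules], axis=0)
        for name in granules[0]
    }

g1 = make_granule(4, 6, 1.0)
g2 = make_granule(4, 6, 2.0)
agg = aggregate([g1, g2])

# One (scan, pixel) index addresses a value, its quality byte, and its location.
i, j = 5, 3
sample = (agg["data"][i, j], agg["quality"][i, j], agg["latitude"][i, j])
```

Because every array is extended along the same axis, the shared index remains valid after aggregation.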
10. Quality Flags
- Quality Flags are densely stored arrays of quality information about their corresponding data product arrays
- Bit level quality factors grouped together and stored as a collection of generically named byte arrays
- The dimensions of the quality flag arrays are the same as the data product arrays that they qualify
- Information on the meaning and bit layout of the quality flag arrays is contained in the XML product profile
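To illustrate how bit-level quality factors packed into a byte array might be read back, here is a minimal sketch using NumPy masks and shifts. The bit layout below is hypothetical; in practice the meaning and position of each field would come from the XML product profile.

```python
import numpy as np

# Hypothetical layout of one quality-flag byte (NOT from a real product profile):
#   bits 0-1: overall quality (0-3)
#   bit  2  : day/night indicator
#   bits 3-7: unused
QUALITY_MASK, QUALITY_SHIFT = 0b00000011, 0
DAYNIGHT_MASK, DAYNIGHT_SHIFT = 0b00000100, 2

def unpack_field(flags, mask, shift):
    """Extract one bit field from every element of a packed flag array."""
    return (flags & mask) >> shift

# A toy quality-flag array with the same shape as the data array it qualifies.
qf1 = np.array([[0b101, 0b010],
                [0b111, 0b000]], dtype=np.uint8)

overall = unpack_field(qf1, QUALITY_MASK, QUALITY_SHIFT)
is_day = unpack_field(qf1, DAYNIGHT_MASK, DAYNIGHT_SHIFT).astype(bool)
```

The unpacked fields keep the shape of the packed array, so they can be used directly as masks on the data product array.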
11. Geolocation
- Geolocation stored as another Data Product
- Geolocation arrays may be packaged within the same file as the data arrays, or stored in a separate file and referenced from the data product file
- The dimensions of the geolocation arrays are the same as the data product arrays to which they apply
- A collection of standard geolocation types has been defined and is used in the data products
12. NPOESS Products HDF5 Conceptual Diagram
Source: Tomachevsky et al., HDF/HDFEOS Workshop IX, Dec 2005
13. Data Product Example: Cloud Products
- File (Root) Information
- All_Data contains the data arrays
- Data Products contains metadata and references the data arrays
- Product Information
- Aggregation Information
- Granule Information
14. Data Product Example
- CCL Data Product
- CCL Data Array
- Scale Factors
- Quality Flags
- Common Geolocation
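The labels above indicate that a data array is accompanied by scale factors. As a hedged sketch, the snippet below assumes a linear scale-and-offset convention and a reserved fill value. The slide does not state the actual convention, so the formula, the fill value, and all numbers here are assumptions.

```python
import numpy as np

def unscale(stored, scale, offset, fill_value):
    """Convert stored integers to physical values; fill values become NaN.

    Assumes physical = stored * scale + offset (an illustrative convention).
    """
    stored = np.asarray(stored)
    physical = stored.astype(np.float64) * scale + offset
    physical[stored == fill_value] = np.nan
    return physical

# Toy stored array; 65535 is a hypothetical fill value for missing data.
stored = np.array([[0, 100], [65535, 250]], dtype=np.uint16)
values = unscale(stored, scale=0.01, offset=200.0, fill_value=65535)
```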
15. NPOESS Data Products Metadata
- NPOESS metadata elements that are found in the FGDC metadata specification follow the FGDC naming convention that separates most words with underscores (e.g., Instrument_Short_Name). In some cases, a hyphen is used for a delimiter.
- NPOESS metadata elements that are aggregate elements have names concatenated together without delimiters (e.g., AggregateEndingDate).
- NPOESS metadata elements that have no FGDC metadata counterparts begin with N_ and follow the FGDC naming conventions (e.g., N_Processing_Domain).
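The three naming conventions above can be approximated by a small classifier. This is only a heuristic sketch. The category labels and the regular expression are assumptions made for illustration, and a single-word name (which the rules above do not cover) is reported as unknown.

```python
import re

def classify_metadata_name(name):
    """Guess which naming convention a metadata element name follows."""
    if name.startswith("N_"):
        # No FGDC counterpart: NPOESS-specific element.
        return "npoess_specific"
    if "_" in name or "-" in name:
        # Words separated by underscores (or occasionally hyphens): FGDC style.
        return "fgdc"
    if re.fullmatch(r"(?:[A-Z][a-z0-9]+){2,}", name):
        # Several capitalized words with no delimiter: aggregate element.
        return "aggregate"
    return "unknown"

examples = ["Instrument_Short_Name", "AggregateEndingDate", "N_Processing_Domain"]
kinds = {name: classify_metadata_name(name) for name in examples}
```

The N_ prefix is checked first because NPOESS-specific names also contain underscores and would otherwise be classified as FGDC names.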
16. HDF5 NPOESS Root Group Attributes
17. HDF5 NPOESS Product Group Attributes
18. HDF5 NPOESS Aggregate Attributes
19. HDF5 NPOESS Granule Attributes (1 of 2)
20. HDF5 NPOESS Granule Attributes (2 of 2)
21. HDF5 XML Users Block
- The XML Users Block for NPOESS Data Products provides a quick look into the metadata of the associated HDF5 file.
- Provides elements
  - N_Processing_Domain
  - Mission, Platform, and Instrument Names
  - N_Dataset_Type_Tag
  - Number_of_Data_Products
  - CollectionShortName(s)
  - Aggregation Information
  - Timestamps (Creation Timestamp, Observation Timestamps)
  - Percent Missing Data
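As a sketch of the quick-look idea, the snippet below parses a small XML document with Python's standard library and pulls out a few of the elements listed above. The root element name, the nesting, and all values are hypothetical. Only the element names N_Processing_Domain, N_Dataset_Type_Tag, Number_of_Data_Products, and CollectionShortName come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical user-block content; structure and values are illustrative only.
SAMPLE_USER_BLOCK = """<UserBlock>
  <N_Processing_Domain>example-domain</N_Processing_Domain>
  <N_Dataset_Type_Tag>EDR</N_Dataset_Type_Tag>
  <Number_of_Data_Products>2</Number_of_Data_Products>
  <Data_Product>
    <CollectionShortName>EXAMPLE-PRODUCT-A</CollectionShortName>
  </Data_Product>
  <Data_Product>
    <CollectionShortName>EXAMPLE-PRODUCT-B</CollectionShortName>
  </Data_Product>
</UserBlock>"""

def summarize_user_block(xml_text):
    """Return a small dictionary of quick-look fields from the XML text."""
    root = ET.fromstring(xml_text)
    return {
        "domain": root.findtext("N_Processing_Domain"),
        "type_tag": root.findtext("N_Dataset_Type_Tag"),
        "n_products": int(root.findtext("Number_of_Data_Products")),
        # iter() searches the whole tree, so nesting depth does not matter here.
        "collections": [e.text for e in root.iter("CollectionShortName")],
    }

summary = summarize_user_block(SAMPLE_USER_BLOCK)
```

In an HDF5 file the user block occupies the first bytes of the file, so a reader could obtain the XML text without opening the HDF5 structure itself. With h5py, the size of that block is available as `File.userblock_size`.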
22. Summary/Next Steps
- Benefits of IPOPP data formats
  - Commonality with archive formats
  - Consistency across products
  - Flexibility in packaging and aggregation
- Issues with DB users
  - How to facilitate use of the formats
  - How to collect and leverage user feedback
  - How/where to tailor or extend to satisfy unique DB needs
- Capturing data product design best practices
  - Ease of use as well as ease of creation
  - Flexibility vs. consistency vs. ease-of-use for a purpose