Title: NetCDF-4 Interoperability with HDF4 and HDF5, Ed Hartnett

1. NetCDF-4 Interoperability with HDF4 and HDF5
Ed Hartnett, Unidata, 8/4/9
2. Purpose of Interoperability Features: World Conquest
- The purpose of the interoperability features is to allow users to use netCDF programs on non-netCDF data archives.
- NetCDF-Java can read many data formats; the idea is to bring some of this functionality to the C/Fortran/C++ libraries.
3. Warning and Request
- HDF4 and HDF5 interoperability features are still being tested. They are not ready for operational use yet.
- The interoperability features are available in the netCDF daily snapshot release.
- Please use them and send feedback to support-netcdf@unidata.ucar.edu
4. Overview
- HDF4 Interoperability
  - What is HDF4 and why bother with it?
  - Reading HDF4 files with netCDF.
  - Limitations and request for help.
- HDF5 Interoperability
  - What is HDF5 and why bother with it?
  - Reading HDF5 files with netCDF.
  - Limitations.
5. What is HDF4?
- The original HDF format, superseded by HDF5.
- HDF4 has built-in 32-bit limits that make it unattractive for new data sets. It is still actively supported by The HDF Group, but no new features are added.
- Get more info about HDF4 at http://www.hdfgroup.org/products/hdf4
6. Why Read HDF4?
- Some important data sets are distributed in HDF4, for example the Aqua/Terra satellite data.
7. HDF4 Background
- HDF4 has several different APIs. The one of greatest interest to netCDF users is the SD (Scientific Data) API.
- The SD API is (intentionally) very similar to the netCDF classic data model, as the sketch below illustrates.
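To make the similarity concrete, here is a minimal sketch pairing SD calls with their rough netCDF classic counterparts. The file and object names are placeholders, error checking is omitted, and the pairing is an informal analogy rather than an exact equivalence. (As slide 8 notes, you would not normally link both APIs into one program; the two halves appear together only for comparison.)

#include <mfhdf.h>    /* HDF4 SD API */
#include <netcdf.h>

void sd_vs_netcdf(void)
{
   /* HDF4 SD: create a file and a 1D scientific dataset. */
   int32 dim_sizes[1] = {10};
   int32 sd_id = SDstart("file.hdf", DFACC_CREATE);   /* ~ nc_create() */
   int32 sds_id = SDcreate(sd_id, "pres", DFNT_INT32, 1,
                           dim_sizes);                 /* ~ nc_def_var() */
   SDendaccess(sds_id);
   SDend(sd_id);                                       /* ~ nc_close() */

   /* The rough netCDF classic equivalent. */
   int ncid, dimid, varid;
   nc_create("file.nc", NC_CLOBBER, &ncid);
   nc_def_dim(ncid, "dim", 10, &dimid);
   nc_def_var(ncid, "pres", NC_INT, 1, &dimid, &varid);
   nc_enddef(ncid);
   nc_close(ncid);
}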
8. Confusing: HDF4 Includes a NetCDF v2 API
- A netCDF v2 API is provided with HDF4 which writes SD data files.
- This must be turned off at HDF4 install-time if netCDF and HDF4 are to be linked in the same application.
- There is no easy way to use both HDF4's netCDF v2 API and netCDF's HDF4 read capability in the same program.
9. Reading HDF4 SD Files
- Starting with version 4.1, netCDF will be able to read HDF4 files created with the Scientific Dataset (SD) API.
- This is read-only: netCDF can't write HDF4!
- The intention is to make netCDF software work automatically with important HDF4 scientific data collections.
10. Building NetCDF to Read HDF4
- This is only available for those who also build netCDF with HDF5.
- HDF4, HDF5, zlib, and other compression libraries must exist before netCDF is built.
- Build like this:

./configure --with-hdf5=/home/ed --enable-hdf4
11. Compiling with HDF4
- Include the netcdf header file as usual.
- Include the locations of the netCDF, HDF5, and HDF4 include directories:

-I/loc/of/netcdf/include -I/loc/of/hdf5/include -I/loc/of/hdf4/include
12. Linking with HDF4
- The HDF4 and HDF5 libraries (and associated libraries) are needed and must be linked into all netCDF applications. The locations of the lib directories must also be provided (a minimal compile-and-link sketch follows):

-L/loc/of/netcdf/lib -L/loc/of/hdf5/lib -L/loc/of/hdf4/lib
-lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz
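As a concrete illustration, here is a minimal sketch of a program and the compile command these flags imply. The paths and the file name are placeholders, and the exact library order may vary by platform.

/* Sketch: open an HDF4 file through the netCDF API and close it.
 *
 * Compile and link (placeholder paths):
 *   cc -o read_hdf4 read_hdf4.c \
 *      -I/loc/of/netcdf/include -I/loc/of/hdf5/include -I/loc/of/hdf4/include \
 *      -L/loc/of/netcdf/lib -L/loc/of/hdf5/lib -L/loc/of/hdf4/lib \
 *      -lnetcdf -lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz
 */
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
   int ncid, status;

   /* "example.hdf" is a placeholder for a real HDF4 SD file. */
   if ((status = nc_open("example.hdf", NC_NOWRITE, &ncid)))
   {
      fprintf(stderr, "%s\n", nc_strerror(status));
      return 1;
   }
   return nc_close(ncid) ? 1 : 0;
}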
13. Use nc-config to Help with Compile Flags
- The nc-config utility is provided to help with compiler flags:

./nc-config --cflags
-I/usr/local/include

./nc-config --libs
-L/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4

./nc-config --flibs
-M/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4
14. Implementation Notes
- You don't need to identify the file as HDF4 when opening it with netCDF, but you do have to open it read-only (see the sketch below).
- The HDF4 SD API provides a named, shared dimension, which fits easily into the netCDF model.
- The HDF4 SD API uses other HDF4 APIs (like vgroups) to store metadata. This can be confusing when using the HDF4 data dumping tool, hdp.
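A small sketch of the read-only rule, with a placeholder file name; that nc_open() reports an error when write access is requested on an HDF4 file is my assumption about the behavior, not a documented guarantee.

#include <stdio.h>
#include <netcdf.h>

int main(void)
{
   int ncid;

   /* "example.hdf" is a placeholder HDF4 SD file. No special flag is
      needed; netCDF detects the format itself. */
   if (nc_open("example.hdf", NC_NOWRITE, &ncid))
      return 1;
   if (nc_close(ncid))
      return 1;

   /* Assumption: asking for write access on an HDF4 file fails. */
   if (nc_open("example.hdf", NC_WRITE, &ncid) == NC_NOERR)
   {
      printf("unexpected: write access granted\n");
      return 1;
   }
   return 0;
}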
15. C Code to Read HDF4 SD File

/* Create a file with one SDS, containing our phony data. */
sd_id = SDstart(FILE_NAME, DFACC_CREATE);
sds_id = SDcreate(sd_id, PRES_NAME, DFNT_INT32, DIMS_2, dim_size);
SDwritedata(sds_id, start, NULL, edge, (void *)data_out);
if (SDendaccess(sds_id)) ERR;
if (SDend(sd_id)) ERR;

/* Now open with netCDF and check the contents. */
if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
...
16. ncdump and HDF4 SD Files
- With HDF4 reading enabled, ncdump works on HDF4 files.
- Sample MODIS file:

../ncdump/ncdump -h MOD29.A2000055.0005.005.2006267200024.hdf
netcdf MOD29.A2000055.0005.005.2006267200024 {
dimensions:
        Coarse_swath_lines_5km\MOD_Swath_Sea_Ice = 406 ;
        Coarse_swath_pixels_5km\MOD_Swath_Sea_Ice = 271 ;
        Along_swath_lines_1km\MOD_Swath_Sea_Ice = 2030 ;
        Cross_swath_pixels_1km\MOD_Swath_Sea_Ice = 1354 ;
variables:
        float Latitude(Coarse_swath_lines_5km\MOD_Swath_Sea_Ice,
                Coarse_swath_pixels_5km\MOD_Swath_Sea_Ice) ;
                Latitude:long_name = "Coarse 5 km resolution latitude" ;
                Latitude:units = "degrees" ;
...
17. HDF-EOS Not Understood
- Many HDF4 data sets of interest follow the HDF-EOS metadata standard.
- Stored as a long text string in global attributes, the HDF-EOS metadata looks messy:

// global attributes:
        :HDFEOSVersion = "HDFEOS_V2.9" ;
        :StructMetadata.0 = "GROUP=SwathStructure\n\tGROUP=SWATH_1\n\t\tSwathName=\"MOD_Swath_Sea_Ice\"\n\t\tGROUP=Dimension\n\t\t\tOBJECT=Dimension_1\n\t\t\t\tDimensionName=\"Coarse_swath_lines_5km\"\n\t\t\t\tSize=406\n\t\t\tEND_OBJECT=Dimension_1\n\t\t\tOBJECT=Dimension_2\n\t\t\t\tDimensionName=\"Coarse_swath_pixels_5km\"\n\t\t\t\tSize=271\n\t\t\t...
18. HDF4 Read Testing
- Tested in libsrc4/tst_interops2.c, which creates some HDF4 files with the SD API, and then reads them with netCDF.
- If --enable-hdf4-file-tests is used with netCDF configure, some Aura/Terra satellite data files are downloaded from the Unidata FTP site, then read by libsrc4/tst_interops3.c.
19. HDF4 Interoperability Limitations
- File must be opened read-only.
- Only HDF4 SD data files are currently understood.
- This feature cannot be used at the same time as HDF4's netCDF v2 API, because HDF4 steals the netCDF v2 API function names. So you must use --disable-netcdf when building HDF4. (It might also work to use --disable-v2 for the netCDF build.)
20. Future HDF4 Work
- More tests.
- Support for HDF4 image types.
- Test support for compressed data.
- Add some support for HDF-EOS metadata in the
libcf library, using the HDF-EOS toolkit.
21. Request for User Help: What Data to Read?
- Please send me pointers to scientifically important HDF4 datasets.
- The intention is not to read arbitrary HDF4 data, just datasets of wide scientific interest.
22. Contribute Code to Write HDF4?
- Some programmers use the netCDF v2 API to write HDF4 files.
- It would not be too hard to write the glue code to allow v2 API -> HDF4 output from the netCDF library.
- The next step would be to allow netCDF v3/v4 API code to write HDF4 files.
- Writing HDF4 seems like a low priority to our users. I would be happy to help any user who would like to undertake this task.
23. What is HDF5?
- HDF5 is an extremely general data storage format with many advanced features: on-the-fly compression, parallel I/O, a rich data model, etc.
- Starting with netCDF-4.0, netCDF has been able to use HDF5 as a storage layer, exposing some of the advanced features.
- But, until version 4.1, only HDF5 files created with netCDF-4 could be understood by netCDF-4.
24. Why Read HDF5 Files?
- Many important datasets are available in HDF5
format, including data from the Aqua satellite.
25. Rules for Reading HDF5 Files
- NetCDF-4.1 provides read-only access to existing HDF5 files if they do not violate some rules (a reading sketch follows this list):
  - Must not use a circular group structure.
  - The HDF5 reference type (and some other obscure types) are not understood.
- Write access is still only possible with netCDF-4/HDF5 files.
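A minimal sketch of such read-only access, assuming a placeholder HDF5 file name and using only standard netCDF inquiry calls:

#include <stdio.h>
#include <netcdf.h>

int main(void)
{
   int ncid, nvars, v;
   char name[NC_MAX_NAME + 1];

   /* "sample.h5" is a placeholder for an HDF5 file not written by
      netCDF-4. It must be opened read-only. */
   if (nc_open("sample.h5", NC_NOWRITE, &ncid))
      return 1;

   /* List the variables netCDF can see in the file. */
   if (nc_inq_nvars(ncid, &nvars))
      return 1;
   for (v = 0; v < nvars; v++)
   {
      if (nc_inq_varname(ncid, v, name))
         return 1;
      printf("variable: %s\n", name);
   }
   return nc_close(ncid) ? 1 : 0;
}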
26. HDF5 Version 1.8 Background
- In version 1.8, HDF5 introduced dimension scales as a way of supporting shared dimensions.
- Also in version 1.8, HDF5 introduced ordering by creation, rather than ordering alphabetically.
- Most data providers don't use these features; they still write HDF5 1.6-style files. (A creation-ordering sketch follows.)
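For reference, a minimal sketch of opting in to creation ordering with the HDF5 1.8 API; the file and group names are placeholders.

#include <hdf5.h>

int main(void)
{
   hid_t gcpl, fileid, grpid;

   /* Track and index link creation order in groups created with this
      property list. This is the 1.8 feature that preserves the order in
      which objects were defined. */
   if ((gcpl = H5Pcreate(H5P_GROUP_CREATE)) < 0)
      return 1;
   if (H5Pset_link_creation_order(gcpl, H5P_CRT_ORDER_TRACKED |
                                  H5P_CRT_ORDER_INDEXED) < 0)
      return 1;

   /* "ordered.h5" and "grp" are placeholder names. */
   if ((fileid = H5Fcreate("ordered.h5", H5F_ACC_TRUNC, H5P_DEFAULT,
                           H5P_DEFAULT)) < 0)
      return 1;
   if ((grpid = H5Gcreate2(fileid, "grp", H5P_DEFAULT, gcpl,
                           H5P_DEFAULT)) < 0)
      return 1;

   if (H5Gclose(grpid) < 0 || H5Pclose(gcpl) < 0 || H5Fclose(fileid) < 0)
      return 1;
   return 0;
}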
27. NetCDF-4.1 Relaxes Some Restrictions for HDF5 Files
- Before netCDF-4.1, HDF5 files had to use creation ordering and dimension scales in order to be understood by netCDF-4.
- Starting with netCDF-4.1, read-only access is possible to HDF5 files with alphabetical ordering and no dimension scales. (Created by HDF5 1.6, perhaps.)
- An HDF5 file may have dimension scales for all dimensions, or for no dimensions (not for just some of them). A dimension-scale sketch follows.
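For context, a minimal sketch of attaching a dimension scale with the HDF5 high-level H5DS API, the mechanism netCDF-4 uses for shared dimensions. Names and sizes are placeholders, and this uses the 1.8-style H5Dcreate2 call rather than the 1.6-style call shown on the next slide.

#include <hdf5.h>
#include <hdf5_hl.h>

#define LAT_LEN 180

int main(void)
{
   hid_t fileid, latspace, varspace, latid, varid;
   hsize_t dims[1] = {LAT_LEN};

   /* "scales.h5" is a placeholder name. */
   if ((fileid = H5Fcreate("scales.h5", H5F_ACC_TRUNC, H5P_DEFAULT,
                           H5P_DEFAULT)) < 0)
      return 1;

   /* Create a dataset to serve as the "lat" dimension scale, and a data
      variable that will use it. */
   if ((latspace = H5Screate_simple(1, dims, NULL)) < 0 ||
       (varspace = H5Screate_simple(1, dims, NULL)) < 0)
      return 1;
   if ((latid = H5Dcreate2(fileid, "lat", H5T_NATIVE_FLOAT, latspace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)) < 0)
      return 1;
   if ((varid = H5Dcreate2(fileid, "pres", H5T_NATIVE_FLOAT, varspace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)) < 0)
      return 1;

   /* Mark "lat" as a dimension scale and attach it to dimension 0 of
      "pres". */
   if (H5DSset_scale(latid, "lat") < 0)
      return 1;
   if (H5DSattach_scale(varid, latid, 0) < 0)
      return 1;

   if (H5Dclose(varid) < 0 || H5Dclose(latid) < 0 ||
       H5Sclose(varspace) < 0 || H5Sclose(latspace) < 0 ||
       H5Fclose(fileid) < 0)
      return 1;
   return 0;
}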
28. HDF5 C Code to Write HDF5 File

/* Create file. */
if ((fileid = H5Fcreate(FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
                        H5P_DEFAULT)) < 0) ERR;

/* Create the space for the dataset. */
dims[0] = LAT_LEN;
dims[1] = LON_LEN;
if ((pres_spaceid = H5Screate_simple(DIMS_2, dims, dims)) < 0) ERR;

/* Create a variable. It will not have dimension scales. */
if ((pres_datasetid = H5Dcreate(fileid, PRES_NAME, H5T_NATIVE_FLOAT,
                                pres_spaceid, H5P_DEFAULT)) < 0) ERR;

if (H5Dclose(pres_datasetid) < 0 ||
    H5Sclose(pres_spaceid) < 0 ||
    H5Fclose(fileid) < 0) ERR;
29. NetCDF C Code to Read HDF5 File

/* Read the data with netCDF. */
if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
if (ndims_in != 2 || nvars_in != 1 || natts_in != 0 || unlimdim_in != -1) ERR;
if (nc_close(ncid)) ERR;
30. Future Plans for HDF5 Interoperability
- More testing.
- Proper handling of reference types. This will (probably) require an extension of the netCDF APIs.
- Better handling of strange group structures, if this proves necessary to read important data.
31. Summary
- With the 4.1 release, the netCDF C/Fortran/C++ libraries allow read-only access to some existing HDF4 and HDF5 data archives.
- The intention is not to develop a completely general translation, but instead to focus on datasets of significance to the Earth science community.
- Write capability is quite possible, but we don't plan on providing it because the demand for this is low.