Reading HDF family of formats via NetCDF-Java / CDM

1
Reading HDF family of formats via NetCDF-Java / CDM
  • John Caron
  • UCAR/Unidata

2
NetCDF-Java library
  • 100% Java
  • Open Source (LGPL, MIT)
  • Independent implementation
  • Used as a component in other software (partial list)
  • Integrated Data Viewer, THREDDS Data Server (Unidata)
  • Panoply (NASA)
  • ncBrowse (EPIC/NOAA)
  • Java NEXRAD Viewer (NCDC/NOAA)
  • MyWorld GIS (Northwestern)
  • EDC for ArcGIS, ERDDAP (SWFSC/NOAA)
  • Live Access Server (PMEL/NOAA)
  • ncWMS (Reading)
  • Matlab plug-in (USGS)

3
NetCDF-Java / CDM architecture
  • Layers shown in the diagram: Application, Scientific Feature Types, Datatype Adapter, NetcdfDataset, CoordSystem Builder, NetcdfFile, I/O service provider
  • Data sources shown: OPeNDAP, NetCDF-3, NetCDF-4, NcML, HDF5, GRIB, GINI, NIDS, Nexrad, DMSP

4
Format Readers (IOSP)
  • General: NetCDF, HDF5, HDF4, OPeNDAP
  • Gridded: GRIB-1, GRIB-2, GEMPAK
  • Radar: NEXRAD Level 2 and 3, DORADE, CINRAD, Universal Format
  • Point: BUFR, ASCII
  • Satellite: DMSP, GINI, McIDAS AREA
  • Misc: GTOPO, Lightning, etc.
  • Others in development (partial list)
  • AVHRR, GPCP, GACP, SRB, SSMI, HIRS (NCDC)
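
The idea of one reader (I/O service provider) per format can be sketched roughly as below. This is an illustration only, not the NetCDF-Java interface: the class `IOServiceProvider`, the list `PROVIDERS`, and the function `find_provider` are hypothetical names. The leading byte signatures are the well-known ones for each format, but the sketch assumes the signature sits at byte 0, which real files do not always guarantee.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IOServiceProvider:
    """Hypothetical stand-in for a format reader (not the NetCDF-Java API)."""
    name: str
    is_valid: Callable[[bytes], bool]  # inspects the first bytes of a file


# Leading signatures; assumed here to start at offset 0 (a simplification).
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
HDF4_MAGIC = b"\x0e\x03\x13\x01"

PROVIDERS: List[IOServiceProvider] = [
    IOServiceProvider("HDF5", lambda head: head.startswith(HDF5_MAGIC)),
    IOServiceProvider("HDF4", lambda head: head.startswith(HDF4_MAGIC)),
    IOServiceProvider("NetCDF-3", lambda head: head[:4] in (b"CDF\x01", b"CDF\x02")),
    IOServiceProvider("GRIB", lambda head: head.startswith(b"GRIB")),
]


def find_provider(head: bytes) -> Optional[str]:
    """Return the name of the first provider that claims the file, else None."""
    for provider in PROVIDERS:
        if provider.is_valid(head):
            return provider.name
    return None
```

Since NetCDF-4 files are stored as HDF5, this sketch would report them as HDF5; telling the two apart needs a look inside the file.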

5
Lines of Code (est.)

6
Why all the trouble?
  • 20-40% of C/C++ time spent on portability issues
  • Platform Independence
  • Linux, Solaris, Windows (Sun)
  • Mac OS X (Apple)
  • AIX, Linux, Windows, z/OS (IBM)
  • HP-UX (Hewlett-Packard)
  • Programmer productivity
  • Object-Oriented
  • Garbage collected: no memory leaks
  • Rich libraries
  • Open source
  • Faster than C for some applications

7
Independent implementation
  • Written entirely from reading HDF4, HDF5 file specifications
  • Helped debug (HDF5), validate file specs
  • File format spec is what will be needed in 100 years to read legacy data
  • OTOH, semantics not always obvious
  • Don't confuse reference implementation with the file/protocol specification

8
HDF family of formats
  • HDF5/NetCDF-4
  • HDF4
  • HDF-EOS
  • Note: read-only, no parallel I/O, etc.

9
HDF5/NetCDF4
  • Goal is to read all HDF5
  • Can read all HDF5 files that we have examples of
  • including references, soft links
  • Complete coverage difficult to guarantee: combinatorial explosion
  • Some esoteric features we are skipping
  • File drivers, external files, szip compression
  • Working on a comprehensive test harness
  • JNI interface to Netcdf4/HDF5 library
  • read every byte and compare
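
The "read every byte and compare" check might look roughly like the sketch below. It assumes each reader's output has already been collected into a dictionary mapping variable names to raw bytes; the function name `compare_reads` and that dictionary layout are hypothetical and say nothing about how the actual test harness is built.

```python
from typing import Dict, List


def compare_reads(a: Dict[str, bytes], b: Dict[str, bytes]) -> List[str]:
    """Return readable mismatches between two readers' raw variable data."""
    problems: List[str] = []
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            problems.append(f"{name}: missing from one reader")
            continue
        x, y = a[name], b[name]
        if len(x) != len(y):
            problems.append(f"{name}: length {len(x)} != {len(y)}")
            continue
        # Walk both buffers and report only the first differing byte.
        for i, (p, q) in enumerate(zip(x, y)):
            if p != q:
                problems.append(f"{name}: first difference at byte {i}")
                break
    return problems
```

An empty result means the two readers agreed on every variable they both produced.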

10
HDF4 / HDF-EOS
  • Complete, works against all examples
  • Tested against 400 sample files (27 Gb)
  • thanks to Ruth Duerr (NSIDC)
  • Spot checked against HDFView
  • Need systematic test to compare reading against
    the HDF4 C Library

11
Geolocation Primer

12
Swath
  • Float lat(245, 33477)
  • Float lon(245, 33477)
  • Float time(33477)
  • Float data(245, 33477)
  • You "just know" that it's swath data
  • 245 points cross track
  • 33477 along the track
  • Each scan has a time coordinate

13
Swath
  • Float lat(33477, 245)
  • Float lon(33477, 245)
  • Float time(33477)
  • Float data(245, 33477)

14
Swath
  • Float lat(999,999)
  • Float lon(999,999)
  • Float time(999)
  • Float data(999,999)

15
Swath
  • Float v1(999, 999)
  • Float v2(999, 999)
  • Float v3(999)
  • Float v4(999,999)

16
If you write data
  • Don't rely on variable name conventions
  • Don't rely on index ordering
  • Don't rely on matching index sizes
  • Minimize "you just have to know that"

17
Dimensions
  • Dimensions
  • d1 = 999
  • d2 = 999
  • Variables
  • float v1(d1=999, d2=999)
  • float v2(d1=999, d2=999)
  • float v3(d2=999)
  • float v4(d2=999, d1=999)
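
A small sketch of why named dimensions help, reading the slide's declarations as two dimensions d1 = 999 and d2 = 999. The dictionary layout and the helper names `shape` and `axis_permutation` are assumptions made for illustration. With sizes alone, v1 and v4 look identical; with names, a reader can see that v4 is stored transposed relative to v1.

```python
from typing import Dict, Tuple

# Declarations from the slide: variable name -> ordered dimension names.
DIMENSIONS: Dict[str, int] = {"d1": 999, "d2": 999}
VARIABLES: Dict[str, Tuple[str, ...]] = {
    "v1": ("d1", "d2"),
    "v2": ("d1", "d2"),
    "v3": ("d2",),
    "v4": ("d2", "d1"),
}


def shape(name: str) -> Tuple[int, ...]:
    """Sizes only: what a reader sees when dimensions are anonymous."""
    return tuple(DIMENSIONS[d] for d in VARIABLES[name])


def axis_permutation(src: str, dst: str) -> Tuple[int, ...]:
    """For each axis of dst, the position of the same named dimension in src."""
    src_dims = VARIABLES[src]
    return tuple(src_dims.index(d) for d in VARIABLES[dst])
```

Here `shape("v1")` and `shape("v4")` are equal, while `axis_permutation("v1", "v4")` shows the axes are swapped.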

18
Good
  • Variables
  • float v1(d1=999, d2=999)
  • v1:standard_name = Latitude
  • float v2(d1=999, d2=999)
  • v2:standard_name = Longitude
  • float v3(d2=999)
  • v3:standard_name = Time
  • float v4(d2=999, d1=999)
  • Data_type = Swath
  • Conventions = My unique name
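
With attributes like these, a reader can locate the georeferencing variables without relying on variable names. The sketch below assumes the attributes have been loaded into a plain dictionary; that layout and the function `find_by_standard_name` are hypothetical.

```python
from typing import Dict, Optional

# Attributes as on the slide: variable name -> attribute name -> value.
ATTRS: Dict[str, Dict[str, str]] = {
    "v1": {"standard_name": "Latitude"},
    "v2": {"standard_name": "Longitude"},
    "v3": {"standard_name": "Time"},
    "v4": {},
}


def find_by_standard_name(attrs: Dict[str, Dict[str, str]], wanted: str) -> Optional[str]:
    """Return the first variable whose standard_name matches, ignoring case."""
    for var, var_attrs in attrs.items():
        if var_attrs.get("standard_name", "").lower() == wanted.lower():
            return var
    return None
```

For example, `find_by_standard_name(ATTRS, "latitude")` picks out v1 even though nothing in the name "v1" hints at latitude.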

19
If you write data
  • Unique signature
  • Specify dimensions
  • Identify georeferencing coordinates
  • Identify data type
  • Units are not optional

20
HDF-EOS, HDF-EOS2
  • Read structural metadata field to obtain more semantics
  • Parse text in ODL
  • Data type: Swath, Grid, Point
  • Dimensions
  • Geolocation coordinate variable types: Latitude, Longitude, Time
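
Parsing the ODL text can be sketched as below. This is a minimal illustration: the `SAMPLE` text is made up in the general shape of HDF-EOS structural metadata and is not taken from a real file, the functions `parse_odl` and `swath_dimensions` are hypothetical, and multi-line or parenthesized ODL values are not handled.

```python
from typing import Any, Dict

# Illustrative text only; not copied from a real HDF-EOS file.
SAMPLE = """
GROUP=SwathStructure
  GROUP=SWATH_1
    SwathName="MySwath"
    GROUP=Dimension
      OBJECT=Dimension_1
        DimensionName="d1"
        Size=999
      END_OBJECT=Dimension_1
      OBJECT=Dimension_2
        DimensionName="d2"
        Size=999
      END_OBJECT=Dimension_2
    END_GROUP=Dimension
  END_GROUP=SWATH_1
END_GROUP=SwathStructure
END
"""


def parse_odl(text: str) -> Dict[str, Any]:
    """Turn nested GROUP/OBJECT blocks of key=value lines into nested dicts."""
    root: Dict[str, Any] = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child: Dict[str, Any] = {}
            stack[-1][value] = child
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            stack[-1][key] = value.strip('"')
    return root


def swath_dimensions(tree: Dict[str, Any], swath: str) -> Dict[str, int]:
    """Collect dimension name -> size for one swath group."""
    dims = tree["SwathStructure"][swath]["Dimension"]
    return {d["DimensionName"]: int(d["Size"]) for d in dims.values()}
```

Calling `swath_dimensions(parse_odl(SAMPLE), "SWATH_1")` recovers the named dimensions that the data arrays themselves do not carry.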

21
HDF-EOS, HDF-EOS2
  • Good
  • Unique signature, identify coordinates and data type
  • Not so good
  • ODL
  • Not using hdf4/5 constructs
  • Bad
  • No data units
  • No time coordinate units!

22
Better EOS
  • Variables
  • float v1(999, 999)
  • v1:standard_name = Latitude
  • v1:dims = d1 d2
  • float v2(999, 999)
  • v2:standard_name = Longitude
  • v2:dims = d1 d2
  • float v3(999)
  • v3:standard_name = Time
  • v3:dims = d2
  • float v4(999, 999)
  • v4:dims = d2 d1

23
NPP (i1.4.0.3_NPP_QUAL)
  • Good
  • XML better than ODL
  • Not so good
  • Not using hdf4/5 constructs
  • Bad
  • No data units
  • No time coordinate units!
  • Fatal Error please reboot
  • Metadata not in the same file

24
Summary
  • Netcdf-Java reads entire HDFx family
  • Good for Java-philes
  • Needs more testing
  • Send example files
  • Dimensions are not optional
  • Keep structural and georeferencing metadata in the same file as the data
  • Can also have specialized external files

25
Contact
  • caron_at_ucar.edu
  • Google "netcdf java"

26
NetCDF-4 and Common Data Model (Data Access Layer)

27
Dimension primer
  • Float lat(180)
  • Float lon(360)
  • Float alt(20)
  • Float time(1200)
  • Float data(1200,20,180,360)

28
Unique Name!
  • Float lfip(lfip=180)
  • Float lflop(lflop=180)
  • Float zorg(zorg=20)
  • Float skdf(skdf=1200)
  • Float dglot(skdf=1200, zorg=20, lfip=180, lflop=180)

29
  • Float lfip(180)
  • Float lflop(180)
  • Float zorg(20)
  • Float freebish(1200)
  • Float dglot(1200,20,180,180)

30
  • Float lat(180)
  • Float lon(180)
  • Float alt(20)
  • Float time(1200)
  • Float data(1200,20,180,180)