netCDF - PowerPoint PPT Presentation

About This Presentation
Title:

netCDF

Description:

Present the multi-dimensional arrays in a self-describing way ... Start[] = {2, 1}, Count[] = {3, 2}, Stride[] = {3, 2}, Imap[] = {1, 3} ... – PowerPoint PPT presentation

Number of Views:70
Avg rating:3.0/5.0
Slides: 20
Provided by: jianw8
Category:

less

Transcript and Presenter's Notes

Title: netCDF


1
netCDF
  • An Effective Way to Store and Retrieve Scientific
    Datasets

Jianwei Li 02/11/2002
2
Motivation
  • Related scientific datasets in a single file.
  • Define a file format
  • Present the multi-dimensional arrays in a
    self-describing way
  • Common data access through a simple interface
  • Typical ways of access to multi-dimensional
    arrays
  • Platform-independent / Network-transparent
  • Here comes netCDF!

3
Outline Today
  • Overview of netCDF
  • netCDF File Format
  • netCDF library and Programming Model
  • Advantages Disadvantages
  • Future Plans for NetCDF

4
What Is netCDF?
NetCDF (network Common Data Form) is an interface
for array-oriented data access. It defines a
machine-independent file format for representing
multi-dimensional arrays with ancillary data, and
provide support for creation, access, and sharing
of array-oriented data. The netCDF software is
implemented as I/O libraries for C, FORTRAN, C,
and Perl and other language for which a netCDF
library is available.
  • Self-describing.
  • Portable.
  • Direct-access.
  • Appendable.
  • Sharable.

5
Why use netCDF?
  • Facilitate the use of common datasets by distinct
    applications.
  • Permit datasets to be transported between or
    shared by dissimilar computers transparently,
    i.e., without translation.
  • Reduce the programming effort usually spent
    interpreting formats.
  • Reduce errors arising from misinterpreting data
    and ancillary data.
  • Facilitate using output from one application as
    input to another.
  • Establish an interface standard which simplifies
    the inclusion of new software into existing
    system.

6
Platforms that netCDF runs on?
  • OpenVMS
  • OS/2
  • SUNOS
  • ULTRIX
  • UNICOS
  • Windows NT
  • AIX
  • HPUX-9.05
  • IRIX, IRIX64
  • MSDOS (gcc, f2c, GNU make)
  • Linux
  • OSF

Platforms that range from CRAYs to personal
computers and include most UNIX-based
workstations.
7
netCDF dataset Components
  • Dimensions
  • name, length
  • Fixed dimension
  • UNLIMITED dimension
  • Variables named arrays
  • name, type, shape, attributes
  • Fixed sized variables array of fixed dimensions
  • Record variables array with its most-significant
    dimension UNLIMITED
  • Coordinate variables 1-D array with the same
    name as its dimension
  • Attributes
  • name, type, values, length
  • Variable attributes
  • Global attributes

8
An Example netCDF File
9
netCDF File Format
  • Header
  • Number of record variables
  • Dimension list
  • Global attribute list
  • Variable list
  • Data (row-major)
  • Fixed-sized data
  • data for each variable is stored contiguously
  • Record data
  • a variable number of fixed-size records, each of
    which contains one record for each of the record
    variables in order.
  • Both use extended XDR

10
Data access in netCDF dataset
Direct Access! Efficient for small subsets out of
large datasets Forms of access
  • access to all elements (whole array)
  • access to individual elements, specified with an
    index vector
  • access to array sections, specified with an index
    vector, and count vector
  • access to subsampled array sections, specified
    with an index vector, count vector, and stride
    vector
  • access to mapped array sections, specified with
    an index vector, count vector, stride vector, and
    an index mapping vector (map the indices of an
    element in an external subarray to the offset in
    internal buffer).

11
Dataset Access Example
  • Start 2, 1, Count 3, 2, Stride
    3, 2, Imap 1, 3

12
netCDF Library Programming Model
  • Modes definition mode, data mode
  • IDs dataset ID, dimension ID, variable ID,
    attribute number
  • Create Write Read by name Read sequencially
    Add dim,var,att

13
Example library functions
14
netCDF data types and type conversion
  • Different library functions for each data type
  • Automatic conversion to or from external type
  • Conversion from one numeric type to another

15
Other Data Types?
  • Data Structure
  • No direct support
  • Can build some data structure by grouping arrays
    or by using data in one array as pointers into
    another array. Inefficient, and may not be
    self-describing!
  • Character String
  • not a primitive netCDF external data type
  • Charater-string attributes well represented by
    NC_CHAR values and its length.
  • Charater-string variables use a
    character-position dimension as the most quickly
    varying dimension for the variable (dim-length
    the max string length)

16
Share Data Between Read and Write
  • Support for multiple read and single write
  • Use NC_SHARE mode to create or open dataset to
    use unbuffered I/O
  • Or call nc_sync() to force writing changes back
    to disk
  • Header changes in define mode are synchronized to
    disk only when nc_enddef() is called, reader must
    call nc_sync() to bring header up-to-date in its
    memory

17
Advantages
  • Simple and useful libraries for array-oriented
    data access
  • The dataset is self-describing
  • Portable and easy to share
  • Flexible and efficient data access
  • Mechanism to support extendable arrays

18
Disadvantages
  • No support for multiple concurrent write
  • Limited number of external numeric data types
    8-, 16-, 32-bit integers, or 32- or 64-bit
    floating-point numbers.
  • No support for nested data structures such as
    trees, nested arrays, or other recursive
    structures
  • Only one unlimited dimension. Multiple variables
    sharing an unlimited dimension, must all grow
    together.
  • Redefine(structure changes) that requires more
    file space is accomplished by copying the dataset.

19
Future plans for netCDF
  • Parallel write support
  • Support for bit data type and transparent data
    packing for bit variables
  • Support for efficient structure changes
  • Support for nested arrays and other data structure
Write a Comment
User Comments (0)
About PowerShow.com