Title: Unidata


1
Unidata's Common Data Model and the THREDDS Data
Server
  • John Caron
  • Unidata/UCAR, Boulder CO
  • Jan 6, 2006
  • ESIP Winter 2006

2
Outline
  • Definitions
  • Creating a Common Data (Access) Model from
    NetCDF, HDF5, OPeNDAP
  • CDM Coordinate Systems, Data Types
  • CDM implementation
  • NetCDF Markup Language (NcML)
  • The THREDDS Data Server

3
NetCDF-3
  • Machine and OS independent file format for
    self-describing scientific data
  • C library (Fortran, C++, Perl, IDL, MATLAB,
    Python, Ruby), Java library
  • Efficient subsetting of multidimensional arrays.
  • > 20,000 downloads last year

4
HDF5
  • Machine and OS independent file format for
    self-describing scientific data
  • C library (Fortran, Java, PyTables)
  • Evolution from HDF4, but different.
  • HDF-EOS, HDF5-EOS, standard formats for EOSDIS,
    ASCI, NPOESS
  • Parallel-IO, chunked storage, compression
    filters, many data types.
  • Developed at NCSA, now independent

5
NetCDF-4
  • Project funded by NASA to create new version of
    netCDF using the HDF5 file format.
  • Extend and merge netCDF and HDF5
  • Widespread use and simplicity of netCDF
  • Generality and performance of HDF5

6
NetCDF-Java 2.2 (nj22)
  • 100% Java library
  • Prototype implementation of CDM
  • File formats:
  • General: NetCDF, HDF5, OPeNDAP
  • Grids: GRIB1, GRIB2
  • Radar: NEXRAD, NIDS, DORADE
  • Satellite: DMSP, GINI
  • Access to THREDDS catalogs

7
OPeNDAP
  • Client-server protocol for scientific data access
  • C++ client and server, Java client and server
    libraries.
  • Current version 2.0: NASA ESE standard
  • Working on new 4.0 protocol spec

8
THREDDS
  • Originally funded by NSDL
  • discovery and use of scientific data
  • Middleware between data providers and users
  • Dataset Inventory Catalogs (XML)
  • Now part of Unidata core funding
  • Data Serving (pull)

9
What's a Data Model?
  • It's about scientific data: storing, accessing
  • It's an abstraction
  • Equivalent to an abstract object model in OOP
  • An Abstract Data Model describes data objects and
    what methods you can use on them

10
What's a Data Model?
  • An API is the interface to the Data Model for a
    specific programming language
  • A file format is a way to persist the objects in
    the Data Model.
  • A data access protocol plays the role of a file
    format.
  • The Abstract Data Model removes the details of
    any particular API and the persistence format.
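
The separation described on these two slides can be sketched in Java. This is an illustration only: the `VariableModel` interface, both implementing classes, and `mean` are hypothetical names invented here and are not part of any Unidata library. One interface plays the role of the abstract data model, two classes stand in for different persistence choices, and the client code is written once against the interface.

```java
final class DataModelSketch {

    // Hypothetical abstract data model: a named 1-D variable of doubles.
    interface VariableModel {
        String name();
        int size();
        double read(int index);
    }

    // Values held in memory, standing in for one persistence format.
    static final class InMemoryVariable implements VariableModel {
        private final String name;
        private final double[] values;

        InMemoryVariable(String name, double[] values) {
            this.name = name;
            this.values = values.clone();
        }

        public String name() { return name; }
        public int size() { return values.length; }
        public double read(int index) { return values[index]; }
    }

    // Values generated on request, standing in for a remote access protocol.
    static final class GeneratedVariable implements VariableModel {
        private final String name;
        private final int size;
        private final double start;
        private final double step;

        GeneratedVariable(String name, int size, double start, double step) {
            this.name = name;
            this.size = size;
            this.start = start;
            this.step = step;
        }

        public String name() { return name; }
        public int size() { return size; }
        public double read(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return start + index * step;
        }
    }

    // Client code depends only on the abstract model, not on how values are stored.
    static double mean(VariableModel v) {
        if (v.size() == 0) {
            throw new IllegalArgumentException("empty variable");
        }
        double sum = 0.0;
        for (int i = 0; i < v.size(); i++) {
            sum += v.read(i);
        }
        return sum / v.size();
    }

    public static void main(String[] args) {
        VariableModel a = new InMemoryVariable("temperature", new double[] {1.0, 2.0, 3.0});
        VariableModel b = new GeneratedVariable("temperature", 3, 1.0, 1.0);
        System.out.println(a.name() + " mean (in memory): " + mean(a));
        System.out.println(b.name() + " mean (generated): " + mean(b));
    }
}
```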

11
Creating a Common Data Access Model from NetCDF,
HDF5, OPeNDAP
12
NetCDF-3 Data Model
13
OPeNDAP Data Model (DAP-2)
14
HDF5 Data Model
15
Common Data (Access) Model
16
Coordinate Systems and Scientific Data Types
17
Common Data Model Layers
Coordinate Systems
Data Access
18
Coordinate Systems needed
  • NetCDF, OPeNDAP, HDF data models do not have
    integrated coordinate systems
  • so georeferencing not part of API
  • Need conventions to specify (e.g. CF-1, COARDS,
    etc.)
  • Contrast: GRIB, HDF-EOS, other specialized formats
  • Must be done in a general way
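
To show what "conventions to specify" means in practice, here is a minimal Java sketch of one rule from the CF convention: latitude and longitude coordinate variables can be recognized by their `units` attribute (for example `degrees_north`, `degrees_east`). The class and method names are hypothetical, and a real convention parser would also look at other attributes; this only illustrates that the georeferencing comes from metadata conventions rather than from the data access API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class AxisTypeSketch {

    enum AxisType { LATITUDE, LONGITUDE, UNKNOWN }

    // Unit strings that CF accepts for latitude coordinates.
    private static final Set<String> LAT_UNITS = new HashSet<>(Arrays.asList(
        "degrees_north", "degree_north", "degree_N", "degrees_N", "degreeN", "degreesN"));

    // Unit strings that CF accepts for longitude coordinates.
    private static final Set<String> LON_UNITS = new HashSet<>(Arrays.asList(
        "degrees_east", "degree_east", "degree_E", "degrees_E", "degreeE", "degreesE"));

    // Classify a coordinate variable from the value of its units attribute.
    static AxisType classify(String units) {
        if (units == null) {
            return AxisType.UNKNOWN;
        }
        String u = units.trim();
        if (LAT_UNITS.contains(u)) {
            return AxisType.LATITUDE;
        }
        if (LON_UNITS.contains(u)) {
            return AxisType.LONGITUDE;
        }
        return AxisType.UNKNOWN;
    }

    public static void main(String[] args) {
        for (String units : new String[] {"degrees_north", "degrees_east", "K"}) {
            System.out.println(units + " -> " + classify(units));
        }
    }
}
```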

19
Coordinate Systems
  • Same underlying mathematics as VisAD, ASCII

20
Scientific Data Types
  • Based on datasets Unidata is familiar with
  • APIs are evolving
  • How are data points connected?
  • Intended to scale to large, multifile collections
  • Intended to support specialized queries
  • Space, Time
  • Corresponding standard NetCDF file conventions

21
Point Observation Data
22
PointObsDataset Methods
  • // Collection of StructureData
  • Collection getData(LatLonRect boundingBox, Date start, Date end)
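
A space and time query of this shape might behave as in the sketch below. It is not the NetCDF-Java implementation: `Obs` and `Box` are hypothetical stand-ins for the slide's `StructureData` and `LatLonRect`, the search is a plain linear scan, and the box test ignores bounding boxes that cross the date line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

final class PointObsSketch {

    // Hypothetical stand-in for one observation record.
    static final class Obs {
        final double lat;
        final double lon;
        final Date time;
        final double value;

        Obs(double lat, double lon, Date time, double value) {
            this.lat = lat;
            this.lon = lon;
            this.time = time;
            this.value = value;
        }
    }

    // Hypothetical stand-in for a lat/lon bounding box (no date-line handling).
    static final class Box {
        final double latMin;
        final double latMax;
        final double lonMin;
        final double lonMax;

        Box(double latMin, double latMax, double lonMin, double lonMax) {
            this.latMin = latMin;
            this.latMax = latMax;
            this.lonMin = lonMin;
            this.lonMax = lonMax;
        }

        boolean contains(double lat, double lon) {
            return lat >= latMin && lat <= latMax && lon >= lonMin && lon <= lonMax;
        }
    }

    // Return observations inside the box whose time lies in [start, end].
    static List<Obs> getData(List<Obs> all, Box box, Date start, Date end) {
        List<Obs> hits = new ArrayList<>();
        for (Obs o : all) {
            boolean inTime = !o.time.before(start) && !o.time.after(end);
            if (inTime && box.contains(o.lat, o.lon)) {
                hits.add(o);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Obs> all = Arrays.asList(
            new Obs(40.0, -105.0, new Date(1000L), 1.5),
            new Obs(10.0, -105.0, new Date(1000L), 2.5),
            new Obs(40.0, -105.0, new Date(9000L), 3.5));
        List<Obs> hits = getData(all, new Box(35.0, 45.0, -110.0, -100.0),
                                 new Date(0L), new Date(5000L));
        System.out.println("matching observations: " + hits.size());
    }
}
```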

23
Trajectory Data
24
TrajectoryObs Methods
  • int getNumPoints()
  • StructureData getData(int point)

25
Station Data
26
StationObs Methods
  • // return List of Station
  • List getStations()
  • // return List of StructureData
  • List getData(Station s, Date start, Date end)

27
Radial Data
28
Radial methods
  • interface Radial
  • int getNumGates()
  • float getData(int gate)
  • float getStartingGate()
  • float getGateSize()
  • float getElevation()
  • float getAzimuth()
  • double getTime()
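
The gate accessors above are enough to locate each sample along a ray. The sketch below shows one way to do that, under assumptions the slide does not state: distances are in meters, angles are in degrees, azimuth is measured clockwise from north, and the Earth is treated as flat (no curvature or refraction correction). The class and method names are hypothetical.

```java
final class RadialSketch {

    // Distance from the radar to the start of the given gate.
    static double gateRange(double startingGate, double gateSize, int gate) {
        if (gate < 0) {
            throw new IllegalArgumentException("gate must be >= 0");
        }
        return startingGate + gate * gateSize;
    }

    // Convert (range, azimuth, elevation) to local x (east), y (north), z (up).
    // Flat-earth approximation; azimuth is assumed clockwise from north.
    static double[] toCartesian(double range, double azimuthDeg, double elevationDeg) {
        double az = Math.toRadians(azimuthDeg);
        double el = Math.toRadians(elevationDeg);
        double ground = range * Math.cos(el);
        return new double[] {
            ground * Math.sin(az),
            ground * Math.cos(az),
            range * Math.sin(el)
        };
    }

    public static void main(String[] args) {
        double r = gateRange(1000.0, 250.0, 4);
        double[] xyz = toCartesian(r, 90.0, 0.5);
        System.out.println("range = " + r);
        System.out.println("x = " + xyz[0] + ", y = " + xyz[1] + ", z = " + xyz[2]);
    }
}
```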

29
Gridded Data
30
Grid methods
  • interface GridCoordSys
  • CoordinateAxis getTaxis()
  • CoordinateAxis getXaxis()
  • CoordinateAxis getYaxis()
  • CoordinateAxis getZaxis()
  • Projection getProjection()
  • Array getDataCube(Range time, Range z, Range y,
    Range x)
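
The effect of a `getDataCube` call with four ranges can be sketched with plain Java arrays. This is an assumption-laden illustration, not the library code: the `Range` class here is a hypothetical inclusive `[first, last]` index range, and the grid is a `double[time][z][y][x]` array rather than the library's `Array` type.

```java
import java.util.Arrays;

final class GridSubsetSketch {

    // Hypothetical inclusive index range [first, last].
    static final class Range {
        final int first;
        final int last;

        Range(int first, int last) {
            if (first < 0 || last < first) {
                throw new IllegalArgumentException("bad range " + first + ".." + last);
            }
            this.first = first;
            this.last = last;
        }

        int length() {
            return last - first + 1;
        }
    }

    // Copy the requested sub-cube out of data[time][z][y][x].
    static double[][][][] getDataCube(double[][][][] data,
                                      Range time, Range z, Range y, Range x) {
        double[][][][] out = new double[time.length()][z.length()][y.length()][];
        for (int it = 0; it < time.length(); it++) {
            for (int iz = 0; iz < z.length(); iz++) {
                for (int iy = 0; iy < y.length(); iy++) {
                    double[] row = data[time.first + it][z.first + iz][y.first + iy];
                    if (x.last >= row.length) {
                        throw new IndexOutOfBoundsException("x range exceeds grid");
                    }
                    out[it][iz][iy] = Arrays.copyOfRange(row, x.first, x.last + 1);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][][][] grid = new double[2][1][3][4];
        for (int t = 0; t < 2; t++) {
            for (int j = 0; j < 3; j++) {
                for (int i = 0; i < 4; i++) {
                    grid[t][0][j][i] = t * 100 + j * 10 + i;
                }
            }
        }
        double[][][][] sub = getDataCube(grid, new Range(1, 1), new Range(0, 0),
                                         new Range(1, 2), new Range(2, 3));
        System.out.println(Arrays.deepToString(sub));
    }
}
```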

31
Image/Swath
32
Standardizing NetCDF Formats
  • Grid: CF-1 Convention
  • Need improvements for regional models (WRF), GIS
    info
  • Radar: Radar Exchange Format
  • With radar community (led by NCAR ATD)
  • Point Observations
  • Unidata Observation Dataset Conventions

33
CDM implementations: NetCDF-4 and NetCDF-Java
2.2
34
NetCDF-4 C Library
35
NetCDF-4 Status
  • 4.0 Beta: implements CDM access layer
  • complete, but waiting for HDF5 release 1.8 to
    finalize file format
  • 4.1: adding Coordinate Systems
  • 4.?: merge OPeNDAP access (pending funding)

36
NetCDF-Java 2.2 (nj22)
  • Prototype implementation of CDM
  • File formats:
  • General: NetCDF, HDF5, OPeNDAP
  • Grids: GRIB1, GRIB2
  • Radar: NEXRAD, NIDS, DORADE
  • Satellite: DMSP, GINI
  • Access to THREDDS catalogs
  • Implements NcML

37
Common Data Model
Coordinate Systems
Data Access
38
NetCDF-Java version 2.2 architecture (diagram labels, top layer first)
  • Application
  • Scientific Datatypes, Datatype Adapter
  • NetcdfDataset, CoordSystem Builder
  • NetcdfFile
  • OPeNDAP, ADDE, I/O service provider
  • I/O service provider formats: NetCDF-3, NetCDF-4, HDF5, NIDS, GRIB, GINI, Nexrad, DMSP

39
NetCDF-Java 2.2 Status
  • Data Access layer: Beta quality
  • also waiting for HDF5 release to finish NetCDF-4,
    commit to API
  • Coordinate Systems: early Beta
  • Finishing docs, runtime pluggability
  • Data Types: Alpha, still experimenting with APIs

40
NetCDF Markup Language (NcML)
  • XML representation of netCDF metadata (like
    ncdump -h)
  • Create new netCDF files (like ncgen)
  • Modify existing datasets
  • Add/delete/rename
  • Create logical sections of existing variables.
  • Create unions and aggregations of multiple
    existing datasets.

41
NcML example
  • <?xml version="1.0" encoding="UTF-8"?>
  • <netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2"
  •     location="/data/nids/N0R_20041119_2147">
  • <attribute name="DataType" value="Radar" />
  • <remove type="attribute" name="password" />
  • <variable name="Reflectivity" orgName="R34768">
  • <attribute name="units" value="dBZ" />
  • </variable>
  • </netcdf>
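
Because NcML is ordinary XML, the renaming in the example above can be inspected with the XML parser that ships with the JDK. The sketch below does not use the NetCDF-Java library and does not apply the NcML to any dataset; it only reads the document and lists which original variable names (`orgName`) are mapped to which new names. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class NcmlRenameSketch {

    // The NcML document from the slide, as a Java string.
    static final String EXAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<netcdf xmlns=\"http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2\"\n"
        + "    location=\"/data/nids/N0R_20041119_2147\">\n"
        + "  <attribute name=\"DataType\" value=\"Radar\" />\n"
        + "  <remove type=\"attribute\" name=\"password\" />\n"
        + "  <variable name=\"Reflectivity\" orgName=\"R34768\">\n"
        + "    <attribute name=\"units\" value=\"dBZ\" />\n"
        + "  </variable>\n"
        + "</netcdf>\n";

    // Map each original variable name (orgName) to its new name.
    static Map<String, String> renames(String ncml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(ncml.getBytes(StandardCharsets.UTF_8)));
            NodeList vars = doc.getElementsByTagName("variable");
            Map<String, String> out = new LinkedHashMap<>();
            for (int i = 0; i < vars.getLength(); i++) {
                Element v = (Element) vars.item(i);
                if (v.hasAttribute("orgName")) {
                    out.put(v.getAttribute("orgName"), v.getAttribute("name"));
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse NcML", e);
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : renames(EXAMPLE).entrySet()) {
            System.out.println(e.getKey() + " is renamed to " + e.getValue());
        }
    }
}
```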

42
NcML Aggregation
  • Union
  • Join Existing
  • Join New
  • Forecast Model Run

43
NcML Aggregation Example
  • <netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2">
  • <aggregation dimName="time" type="joinNew">
  • <variableAgg name="Temperature"/>
  • <variableAgg name="Pressure"/>
  • <scan location="C:/data/goes/" suffix=".gini"/>
  • </aggregation>
  • </netcdf>
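
A joinNew aggregation adds a new outer dimension (here `time`) and stacks the same variable from each file along it. The sketch below illustrates that idea on in-memory arrays only: each `double[][]` stands in for one file's 2-D variable, and the `joinNew` method is a hypothetical helper, not the NetCDF-Java aggregation code.

```java
import java.util.Arrays;
import java.util.List;

final class JoinNewSketch {

    // Stack same-shaped 2-D slices along a new outer dimension: result[time][y][x].
    static double[][][] joinNew(List<double[][]> slices) {
        if (slices.isEmpty()) {
            throw new IllegalArgumentException("nothing to aggregate");
        }
        int ny = slices.get(0).length;
        int nx = ny == 0 ? 0 : slices.get(0)[0].length;
        double[][][] out = new double[slices.size()][ny][];
        for (int t = 0; t < slices.size(); t++) {
            double[][] s = slices.get(t);
            if (s.length != ny) {
                throw new IllegalArgumentException("slice " + t + " has a different y size");
            }
            for (int y = 0; y < ny; y++) {
                if (s[y].length != nx) {
                    throw new IllegalArgumentException("slice " + t + " has a different x size");
                }
                out[t][y] = s[y].clone();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] file1 = {{1.0, 2.0}, {3.0, 4.0}};
        double[][] file2 = {{5.0, 6.0}, {7.0, 8.0}};
        double[][][] temperature = joinNew(Arrays.asList(file1, file2));
        System.out.println("time steps: " + temperature.length);
        System.out.println(Arrays.deepToString(temperature));
    }
}
```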

44
THREDDS Data Server
  • Integrates data access with THREDDS catalogs and
    services
  • Tomcat/Servlet, 100% Java, single war file
  • Data input is the NetCDF-Java 2.2 library
  • Data output:
  • OPeNDAP
  • HTTP Server
  • OGC Web Coverage Server (gridded)

45
THREDDS Data Server
  • Host: hostname.edu
  • HTTP Tomcat Server
  • Application: THREDDS Server, with Catalog.xml
  • Services: OPeNDAP, HTTPServer, WCS
  • NetCDF-Java library
  • Data: Datasets, IDD Data

46
TDS as WCS Gateway
  • Host: hostname.edu
  • HTTP Tomcat Server
  • Application: THREDDS Server, with Catalog.xml
  • Services: OPeNDAP, HTTPServer, WCS
  • NetCDF-Java library
  • OPeNDAP Server on anotherHost.org

47
TDS and NcML
  • Host: hostname.edu
  • HTTP Tomcat Server
  • Application: THREDDS Server, with Catalog.xml
  • Services: OPeNDAP, WCS
  • NetCDF-Java
  • Datasets

48
TDS and NcML
  • Server serves the dataset wrapped by the NcML
  • Client sees OPeNDAP or WCS, not NcML
  • Can fix metadata problems
  • Can augment metadata
  • Use NcML aggregation on the TDS
  • replaces the old Aggregation Server

49
TDS and Digital Libraries
  • Hosts: hostname.edu, otherhost.gov
  • HTTP Tomcat Server
  • Application: THREDDS Server, with Catalog.xml
  • Services: OPeNDAP, HTTPServer, WCS
  • NetCDF-Java library
  • Datasets
  • OPeNDAP Server

50
TDS and Digital Libraries
  • Framework to add metadata
  • By hand (collection level)
  • Automatic extraction from datasets
  • Send records to existing DLs
  • No search
  • Both collection and inventory level

51
Future Plans
  • NetCDF-Java
  • Get APIs stable, docs, runtime pluggability
  • NetCDF-4 (!)
  • HDF4, HDF-EOS, BUFR (need funding)
  • NetCDF-4 C Library
  • DataTypes too immature to port
  • NcML?
  • Java on the server

52
TDS Future Plans
  • Aggregation
  • Driven by IDD data (motherlode)
  • Pluggable Authorization
  • access control by dataset
  • Performance
  • Services
  • Coordinate System Verifier (e.g. CF-1)
  • Data access
  • Subset and get netcdf file

53
Conclusion: N + M instead of N × M things on your
TODO List!
  • File Format 1, File Format 2, File Format N
  • Visualization & Analysis, NetCDF file, OPeNDAP Server, WCS Service