Title: DAP Clients and Services
1. DAP Clients and Services
- Section 3
- APAC '07 OPeNDAP Workshop
- 12 Oct 2007
- James Gallagher
2. Outline
- Browsing a Server - jump right in
- DAP Requests and Responses - background on using DAP
- Finding Data
- Types of Clients
  - Graphical
  - Command line
  - Custom
3. Browsing a Server
- Type the server's URL into the browser
- Hyrax (and most other DAP servers) provide a way to browse data
  - Choose a data set using THREDDS catalogs and/or common directory traversal
  - Choose one or more variables within a data set using the HTML form interface
4. Open a server
Type the server's URL; the URL could be an entry in a catalog or HTML page.
Contents at the top level
These links become active when a dataset is listed. For a directory, they don't apply.
5. Browse its directory structure
Follow the Pathfinder links down to ...
6. ... and traverse all the way down to a file
... this point. Now we see a listing of datasets.
Descend into a dataset
7. Open a file
Note that the URL is duplicated here.
8. Supply a constraint and get ASCII data
Use the form elements to build a constraint
Note that the constraint is visible here, appended to the URL
9. The ASCII data view
Note the constraint and the .asc suffix
appended before the constraint.
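The URL on this slide can be assembled mechanically. The following is a minimal sketch, assuming DAP2 constraint syntax (comma-separated projections with optional [start:stop] or [start:stride:stop] index ranges, followed by selections that each begin with &). The function names are hypothetical and the index ranges are chosen only for illustration.

```python
def hyperslab(start, stop, stride=1):
    """Format one DAP2 array index range as [start:stop] or [start:stride:stop]."""
    if stride == 1:
        return f"[{start}:{stop}]"
    return f"[{start}:{stride}:{stop}]"


def ascii_url(dataset_url, projections, selections=()):
    """Return dataset_url + '.asc' + '?' + constraint expression.

    projections: list of (variable_name, [(start, stop), ...]) pairs.
    selections:  list of relational clauses such as 'depth<100'.
    """
    parts = []
    for name, ranges in projections:
        parts.append(name + "".join(hyperslab(a, b) for a, b in ranges))
    constraint = ",".join(parts) + "".join("&" + s for s in selections)
    # The suffix goes on the file name, before the '?' that starts the constraint.
    url = dataset_url + ".asc"
    return url + "?" + constraint if constraint else url


# Index ranges below are illustrative assumptions, not taken from a real dataset.
base = "http://localhost:8080/opendap/data/nc/fnoc1.nc"
print(ascii_url(base, [("u", [(0, 0), (0, 16), (0, 20)])]))
```

Keeping the suffix and the constraint as separate steps mirrors what the form does: the response type is chosen by the extension, and the subset is chosen by what follows the question mark.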
10. Spreadsheets can often read URLs, and they can parse the CSV output of Hyrax (and most other DAP servers)
Paste a DAP URL with the .ascii extension into the Location box
11. Data read into the spreadsheet. Sometimes you have to tell the spreadsheet how to import the data
12. Browsing summary
- Directory hierarchy browsing
- Data files open to an HTML form which enables choosing variables
- The form supports interactive construction of constraint expressions and ASCII data returns
- The form interface has many limitations but it can be used in many different situations
13. DAP background information
- Data are referenced by a URL
- DAP responses with metadata or data are requested using tokens appended to the URL
- Within a data granule, elements are accessed using a Constraint Expression
14. URLs Reference Data
- As we've seen, URLs reference data granules (usually files).
- DAP, version 2 defines three responses
  - DDS - syntactic metadata - information about the structure of the data
  - DAS - semantic metadata - background information about the data
  - DODS - data - actual data values, bundled with syntactic metadata to form a self-contained response.
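To make "syntactic metadata" concrete, here is a sketch that reads variable declarations out of a DDS. The DDS text is an illustrative sample written for this example (it is not copied from a server), and the parser is deliberately limited: it handles only flat scalar and array declarations, not nested Structure, Sequence or Grid blocks.

```python
import re

# Illustrative DDS text; names and sizes are assumptions for this sketch.
SAMPLE_DDS = """\
Dataset {
    Int16 u[time = 16][lat = 17][lon = 21];
    Int16 v[time = 16][lat = 17][lon = 21];
    Float32 lat[lat = 17];
    Float32 lon[lon = 21];
} example.nc;
"""

# One declaration per line: a type, a name, then zero or more [dim = size] parts.
DECL = re.compile(r"^\s*(\w+)\s+(\w+)((?:\[\w+\s*=\s*\d+\])*);\s*$")
DIM = re.compile(r"\[(\w+)\s*=\s*(\d+)\]")


def parse_simple_dds(text):
    """Return {variable: (type, [(dim_name, size), ...])} for flat declarations."""
    variables = {}
    for line in text.splitlines():
        match = DECL.match(line)
        if match:
            dap_type, name, dims = match.groups()
            variables[name] = (dap_type, [(d, int(n)) for d, n in DIM.findall(dims)])
    return variables


for name, (dap_type, dims) in parse_simple_dds(SAMPLE_DDS).items():
    print(name, dap_type, dims)
```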
15. DAP Data Model
- A Dataset is a collection of variables (tuples of type-name-value)
- Each variable has attributes which are also type-name-value tuples
- The Dataset may also have global attributes
16. Data Model Types
- Types of variables
  - Scalars: Byte, Integer, Float, String, URL
  - Array: N-dimensional
  - Structure: Simple aggregate type
  - Sequence: hierarchical table data
  - Grid: Array with map vectors (establishes a mapping between array indices and independent variable values)
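A Grid pairs an array with one map vector per dimension. The sketch below models that pairing in plain Python so the index-to-value mapping is visible. The class and field names are hypothetical, and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """An N-dimensional array plus one map vector per dimension."""
    name: str
    dims: tuple   # dimension names, outermost first
    maps: dict    # dimension name -> list of independent-variable values
    array: list   # nested lists, one nesting level per dimension

    def coords(self, *index):
        """Translate array indices into independent-variable values."""
        return tuple(self.maps[d][i] for d, i in zip(self.dims, index))

    def value(self, *index):
        """Return the array element at the given indices."""
        item = self.array
        for i in index:
            item = item[i]
        return item


# Made-up temperature values on a 2 x 3 latitude/longitude grid.
sst = Grid(
    name="sst",
    dims=("lat", "lon"),
    maps={"lat": [-40.0, -42.5], "lon": [145.0, 147.5, 150.0]},
    array=[[14.1, 14.3, 14.8], [12.9, 13.2, 13.6]],
)
print(sst.coords(1, 2), sst.value(1, 2))
```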
17. Attributes
- Scalars
- Vectors
- Structures
- No Grids or Sequences.
18. Accessing those responses
- For each of the responses, add the extension .dds, .das or .dods at the end of the URL file name.
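As a small illustration of this rule, the following sketch appends each response extension to a dataset URL. The helper name is hypothetical; the URL is the local test server address used in the NCO example later in this section.

```python
RESPONSES = {
    "dds": "syntactic metadata",
    "das": "semantic metadata",
    "dods": "data values plus syntactic metadata",
}


def response_url(dataset_url, response):
    """Append a DAP2 response extension to the file-name part of a URL."""
    if response not in RESPONSES:
        raise ValueError(f"unknown DAP2 response: {response!r}")
    return f"{dataset_url}.{response}"


dataset = "http://localhost:8080/opendap/data/nc/fnoc1.nc"
for ext, meaning in RESPONSES.items():
    print(response_url(dataset, ext), "->", meaning)
```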
19. ... or use the form interface
20. Other response types
- DAP4 will use XML to encode metadata and replace the two metadata objects (DDS and DAS) with a single response accessed using .ddx
- Virtually all servers support
  - Info (.info): An HTML page built using all the metadata
  - HTML (.html): The HTML form interface we've seen
  - ASCII (.asc, .ascii): The ASCII data dump, also already seen
21. Aggregation
- There are several different servers which can perform aggregation
  - TDS: Array data
  - GDS, Hyrax/JGOFS: Sequences (table data)
  - BES (but not when used in Hyrax): Any collection of data types aggregated to a Structure
- Aggregation maps searching and selecting from an inventory onto using a constraint expression
- Aggregation can eliminate the dichotomy between inventory searching/access and data access
22. An example aggregation
- http://satdat1.gso.uri.edu/thredds/dodsC/NWAtlanticDec_1km.html
23. THREDDS responses
- Use THREDDS to define a logical hierarchy that's distinct from the set of directories that actually hold the data.
- We can request THREDDS catalog XML files using catalog.xml or HTML pages using catalog.html after a directory name.
- While the directory browser works for any directory, THREDDS catalogs are valid only for the logical hierarchy they define
  - Files/directories not included in that hierarchy have no catalogs
24. THREDDS examples
- Switch Hyrax to the THREDDS HTML view
Choose the HTML view
25. The THREDDS HTML view
- The top-level THREDDS catalog on our test server defines a single data root directory (SVN Test Data Archive)
- This illustrates how THREDDS can be used to control the view of data presented by the server
- Use catalog.xml in place of catalog.html to get the catalog data in an XML document.
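Reading the XML form of a catalog takes only the Python standard library. The sketch below parses a tiny, hand-written catalog (not output from a real server) and lists each dataset that carries an access path. It matches elements by local name so it does not depend on the exact namespace URI; the attribute names name and urlPath and the sample paths are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration; a real catalog.xml will be larger.
SAMPLE_CATALOG = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example catalog">
  <dataset name="nc">
    <dataset name="fnoc1.nc" urlPath="data/nc/fnoc1.nc"/>
    <dataset name="coads_climatology.nc" urlPath="data/nc/coads_climatology.nc"/>
  </dataset>
</catalog>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def list_datasets(catalog_xml):
    """Return (name, urlPath) for every dataset element that has a urlPath."""
    root = ET.fromstring(catalog_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) == "dataset" and "urlPath" in element.attrib:
            found.append((element.get("name"), element.get("urlPath")))
    return found


for name, path in list_datasets(SAMPLE_CATALOG):
    print(name, path)
```

Container datasets without a urlPath are skipped, which matches the idea that the catalog defines a hierarchy and only its leaves point at data.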
26. Traverse the links to find data
27. THREDDS data set page
- THREDDS catalogs can list more than one access mechanism - here we see only DAP, but WCS, WMS, etc., are other possibilities
28. Choosing DAP access leads to the HTML form
29. DAP Summary
- DAP requests are made using a token appended to the filename part of the URL
- Responses defined by DAP2 and (in progress) DAP4 are DDS, DAS, DODS and DDX. These return metadata and data
- Other responses are used to access ASCII data values, HTML metadata pages and data access interfaces
- Constraint expressions are used to limit (subset, projection, selection) the data returned
30. DAP Summary, cont.
- THREDDS
  - is a distinct protocol
  - complements DAP
  - as Hyrax implements it, supports both HTML and XML views of the catalogs
  - defines a logical hierarchy that is distinct from the way the data are actually stored
31. Finding Data
- Ways to find data
  - The OPeNDAP Data Set List
  - GCMD
  - TPAC
  - Google
  - THREDDS
- We maintain a page with links to dataset searching sites
  - http://www.opendap.org/data/index.html
32. Common Features
- All of these data location features except Google depend on active community involvement in building catalogs of data
- The solutions can be described as static documents or crawlers
- Google and TPAC are crawlers
  - Crawlers can discover datasets without human intervention
  - They can make mistakes that seem silly
- The Dataset List, GCMD and THREDDS are static documents or collections of static documents
  - Static lists can be tailored by hand
  - They can go out of date quickly
33. Differentiating Features
- Google and TPAC
  - Google is just crawling HTML. If a server is not linked from an HTML page, it won't be found.
  - TPAC is preset with server locations and picks up changes at those sites
34. Differentiating Features, cont.
- The Static Lists
- The Dataset List has a very low metadata requirement
  - Not maintained as actively as either GCMD or THREDDS catalogs
- GCMD
  - The GCMD has a fairly high entry-level threshold
  - Professional staff maintain the GCMD as their sole job
- THREDDS
  - THREDDS catalogs are, or can be, located at the data - locality distributes maintenance
  - Quality varies from site to site
35. Finding Data Summary
- Locating data seems like it would be the place to start building a system, but it's far more varied than the one-size-fits-all approach most tried in the 1990s
- Crawlers and hierarchical lists show the most promise but maintained centralized lists are also useful
36. Accessing Data with DAP
- Web Browser
  - Already discussed
- Graphical clients
  - ncBrowse, ODC, Ferret, GrADS
- Command-line clients
  - getdap (UNIX, win32), loaddap (Matlab, IDL), nco (UNIX, win32)
- Custom clients
  - C, C++, Java, Python
  - netCDF
37. Using a Graphical Client
- Example: The OPeNDAP Data Connector
  - Combines data location with retrieval and display
  - Shows the built URL, including constraint expression
  - Can be transferred to another application
38. Start the ODC
39. The ODC opens to the search pane
Five different panes
Choices within a pane
40. Use the dataset list to find the TPAC climatologies
Choose the Antarctic Cooperative Research Centre TPAC/CSIRO Climatologies
then hit "To Retrieve" to move the selection to the next pane
41. The Retrieve pane
Double click levitus_annual_97.nc to see the contents of the file in the area on the right
42. The ODC shows the URL as it builds it.
Click the checkbox for SALT and O2. For both, set the range of z_index to 0 to 0. Make sure to hit tab/return in the boxes.
then hit "Output to" to move to the View pane
44. There are a number of ways to view the data. Here the plotter has been chosen (the default).
Hit "Plot to" to generate a plot using the default settings.
45. When the plot is made, the interface switches to the Preview tab
Switch back to the Variables tab to plot O2
46. Choose O2 from the menu, then hit "Plot to".
47. Now that the data have been read and cached, you can switch back and forth between variables quickly without any additional data transfers
When ready, go back to the Retrieve pane.
48. Choose TEMP
Set the constraint
then plot
50. ODC Summary
- The ODC provides a way to search for, access and plot data
- Acts as a URL builder: the URLs can be pasted into other applications
- We didn't need to know anything about DAP, its Request or Response objects or how a URL is used to request data
- The data set list often contains stale entries
- Also supports using the GCMD for data location - more on this when we cover searching
51. Using a Command-line Client
- Matlab - demonstration
- NCO - a powerful tool developed and maintained by another group
52. Matlab
- Demonstration of custom-built graphical interfaces for Matlab
  - Matlab scripting is used to build the interfaces and provide some dataset-specific processing
  - A Matlab command extension is used to read the data (written in C/C++).
- Two things are required in addition to Matlab: the DAP command extension (loaddap) and the graphical interface software.
53. Running the Matlab Demonstration
- Start Matlab
- Download the command extension
- Download the interface software
- In Matlab, change directory to the ml-ocean-testbed directory.
- Type OCEAN_TOOLBOX
- The interface will start
54. The Ocean Toolbox
55. Open a dataset
I choose the Pathfinder dataset
56. Fill in the information
SST and Quality fields
Load data into the Matlab workspace
57. Get the data
Load data into the Matlab workspace
58. Plot/Display the data
59. Using the loaddap command extension directly
- Start Matlab
- Add the directory with the extension to the Matlab command path
- Verify the command extension is working
- Feed it a URL
- Plot the data
60. Pass a URL, constrain the response to the u and v vectors only
Plot those vectors (see Figure 1)
61. Matlab Summary
- Command line client is the tool used to move the data
- Easily used in Matlab scripts to hide the details and make custom interfaces
- To use the command extension directly, the user must know
  - Data location (URL)
  - Internal structure of the data set (syntactic metadata - DDS/DDX)
  - How to write a constraint expression
62. NetCDF Operators (NCO)
- Unix command line client
- Unlike the previous two clients, NCO uses the netCDF client library to read from a DAP server
  - A client library is a collection of functions which hide the mechanics of (most of) the interaction with a server so the client can go about its business
  - The NCO client is, in fact, just the NCO package linked to our (OPeNDAP's) version of the netCDF library (a.k.a. the netCDF client library)
63. Build the NCO Software
- Change directory to /root/src/nco-3.9.2
  - root@slax cd /root/src/nco-3.9.2
- Run configure to build the Makefile, then build and install the software
  - root@slax ./configure
  - root@slax make
  - root@slax make install
64. Use NCO to Convert the FNOC1 vectors into a speed
- NCAP: NCO Arithmetic Processor
  - ncap -O -s "windspeed=sqrt(u^2+v^2)" http://localhost:8080/opendap/data/nc/fnoc1.nc wndspd.nc
  - The URL is the input file and wndspd.nc is the output
- Use ncdump to look at the result file
  - ncdump -h wndspd.nc
  - ncdump -v windspeed wndspd.nc
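For readers who want to check the arithmetic behind this step, here is the same calculation in plain Python: the speed is the magnitude of the (u, v) vector at each point. The function name is hypothetical and the input values are made up; in practice they would be the u and v arrays read from fnoc1.nc.

```python
import math


def wind_speed(u, v):
    """Return sqrt(u**2 + v**2) element by element for two equal-length lists."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    return [math.hypot(a, b) for a, b in zip(u, v)]


# Made-up vector components for illustration.
u = [3.0, -6.0, 0.0]
v = [4.0, 8.0, 2.5]
print(wind_speed(u, v))
```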
65. View the Result: ncBrowse
- We can use ncBrowse to look at the local netCDF file we just built
- ncBrowse can also look at the DAP server directly
  - Built using the DAP-enabled Java netCDF library (a client library where access to DAP servers hides behind the netCDF API)
66. Start ncBrowse
Double click on speed - the new data we made with the previous NCO example
67. Fix up the latitude and longitude axes, then Graph Variable.
68. We have to be somewhat savvy about the units - check back and look at the attributes
69. Custom clients
- What options exist to build clients?
  - C++ using libdap
  - C using Ocapi
  - C, Fortran using the netCDF client library
  - Python using PyDAP
  - Java using Java-OPeNDAP
  - Matlab and IDL using the respective versions of loaddap
70. Clients Summary
- Custom clients offer an opportunity to develop for a specific audience or a particular problem/project.
  - Example: ComMIT Tsunami inundation model client developed by NOAA/PMEL and BOM
- General purpose clients like loaddap can read any kind of data while clients built using the netCDF client library are limited to the semantics of netCDF
  - Example: Record access is slow because each access is a separate network request
71. Clients Summary, cont.
- ODC: A client built specifically to provide a browse capability for any data source
  - Uses Java-OPeNDAP
- Loaddap: A client built to read any data into an analysis application
  - Can be used as a building block for more sophisticated applications
  - Uses libdap (C++, Matlab) or Ocapi (C, IDL)
- netCDF client library: A client-building tool
  - convert legacy code
  - provide a simple way to write new applications
  - C, C++, Fortran