Title: How I found some data on the web
1- How I found some data on the web!
- (Aspects of on-line data transport)
Steven F. Gebert, Bernard A. Megrey
AOOS DMAC Data Management Workshop, January 27, 2005
2Overview
- OPeNDAP Open-source Project for a Network Data Access Protocol
- DODS Distributed Oceanographic Data System
- OGC Open Geospatial Consortium
- OWS Open GIS Web Services
3What the heck is OPeNDAP/DODS???
4What is OPeNDAP?
- Provides a way for consumers to access
oceanographic data anywhere on the Internet from
a wide variety of new and existing programs.
- By developing network versions of commonly used
data access Application Program Interface (API)
libraries, such as NetCDF, HDF, JGOFS, and
others, the OPeNDAP project can capitalize on
years of development of data analysis and display
packages that use those APIs, allowing users to
continue to use programs with which they are
already familiar.
5Data comes in many formats and flavors - from
text files to the desktop, to the mainframe, from
databases to GIS.
The End Result
6Benefits of using OPeNDAP
- Frequently, efforts at collaboration between
groups of researchers are frustrated by technical
issues with sharing their datasets.
7Disadvantages of using OPeNDAP
- Works very well with netCDF data formats, less
well with others
- A virtual server is needed for each data format
used (possibly increasing cost and/or maintenance)
- netCDF files do not handle time very well
- netCDF files are binary and are not readable by
humans without translation software
- Somewhat outdated technology, but pervasive, so
still in play
8Problems Solved by OPeNDAP
- Network communications problems impede the
collaborative efforts of geographically scattered
groups.
- OPeNDAP uses existing, well understood
technologies (based on the http protocol) to move
data across the Internet.
- Different groups use different data analysis
packages, and can't easily combine their data.
- OPeNDAP seamlessly gives users access to data in
a variety of different formats.
- Learning new software wastes effort that should
be directed toward looking at the data.
- From the user's point of view, enabling an
application to use OPeNDAP doesn't change its
behavior - it just extends the range of available
data.
- Most packages cannot use data in foreign formats.
Collaboration is effectively restricted to other
groups who chose the same package. Within the
group, researchers get "stuck" with a package
that doesn't really suit their requirements.
- Since OPeNDAP can translate between data formats,
the range of available data is greatly extended.
- Centralized data repositories cannot support
works in progress.
- OPeNDAP uses existing network protocols to allow
direct access to any compatible datasets that
researchers care to make available.
- It takes too long to rewrite an existing
application for a different data access API. It
also means the loss of procedures that were
developed in-house.
- OPeNDAP does not require new code or new
applications - existing applications can be
converted easily.
9Think of OPeNDAP as a programming framework that
exists on a web server that provides access to
Common Gateway Interface applications
11OPeNDAP Architecture
- OPeNDAP links a data-handling application with
disparate datasets in remote locations.
- OPeNDAP uses the client-server model:
- A client sends a data request across the Internet
to a server. The client is an application that
uses OPeNDAP functions for getting data.
- The server answers with the requested data. The
server is a Web server that can retrieve data
from particular datasets.
12OPeNDAP Architecture
13DODS is accessed via URLs and added constraints
- http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc.dds
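The URL on this slide follows a pattern: the dataset address, a suffix that selects the kind of response (.dds here), and optionally a constraint after a "?". A minimal Python sketch of that pattern follows, assuming the DAP 2 hyperslab form name[start:stride:stop]. The helper names hyperslab and opendap_url are made up for illustration, the suffix list is not exhaustive, and the server on the slide may no longer respond.

```python
# Response suffixes commonly served by DODS/OPeNDAP (DAP 2) servers (not exhaustive).
SUFFIXES = ("dds", "das", "dods", "html", "ascii")


def hyperslab(variable, *ranges):
    """Build a constraint like sst[0:1:0][0:1:179] from (start, stride, stop) index triples."""
    parts = []
    for start, stride, stop in ranges:
        if start < 0 or stop < start or stride < 1:
            raise ValueError(f"bad range: {start}:{stride}:{stop}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return variable + "".join(parts)


def opendap_url(dataset_url, suffix, constraint=""):
    """Append a response suffix and an optional constraint expression to a dataset URL."""
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix}")
    url = f"{dataset_url}.{suffix}"
    return f"{url}?{constraint}" if constraint else url


# Dataset address taken from the slide.
DATASET = "http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc"

# Structure description, the same request as on the slide.
print(opendap_url(DATASET, "dds"))

# First time step of sst over the full 180 x 360 grid, returned as ASCII.
first_month = hyperslab("sst", (0, 1, 0), (0, 1, 179), (0, 1, 359))
print(opendap_url(DATASET, "ascii", first_month))
```

Indices are zero-based and the stop index is inclusive, so 0:1:179 covers all 180 latitudes.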
14.dds
    Dataset {
        Float32 lat[lat = 180];
        Float32 lon[lon = 360];
        Float64 time[time = 254];
        Grid {
          ARRAY:
            Int16 sst[time = 254][lat = 180][lon = 360];
          MAPS:
            Float64 time[time = 254];
            Float32 lat[lat = 180];
            Float32 lon[lon = 360];
        } sst;
        Grid {
          ARRAY:
            Int16 mask[lat = 180][lon = 360];
          MAPS:
            Float32 lat[lat = 180];
            Float32 lon[lon = 360];
        } mask;
    } sst.mnmean.nc;
- A 180-element vector called "lat",
- A 360-element vector called "lon",
- A 254-element vector called "time",
- A "Grid" containing a three-dimensional array of
integer values (Int16) called sst, and three
"Map" vectors, which may look familiar, and
- Another Grid called mask.
15http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc.html
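The .dds response on slide 14 is plain text, so a client can read variable names, types and shapes from it before asking for any data. The sketch below is one rough way to do that in Python with a regular expression. It is not a full DDS parser: it only handles array declarations of the form seen on the slide, the DDS text is transcribed from the slide (whitespace is an assumption), and the function names are illustrative.

```python
import re
from math import prod

# DDS text as transcribed from the slide; exact whitespace is an assumption.
DDS_TEXT = """
Dataset {
    Float32 lat[lat = 180];
    Float32 lon[lon = 360];
    Float64 time[time = 254];
    Grid {
      ARRAY:
        Int16 sst[time = 254][lat = 180][lon = 360];
      MAPS:
        Float64 time[time = 254];
        Float32 lat[lat = 180];
        Float32 lon[lon = 360];
    } sst;
    Grid {
      ARRAY:
        Int16 mask[lat = 180][lon = 360];
      MAPS:
        Float32 lat[lat = 180];
        Float32 lon[lon = 360];
    } mask;
} sst.mnmean.nc;
"""

# One array declaration: type, name, then one or more [dim = size] groups.
DECLARATION = re.compile(r"^[ \t]*(\w+)[ \t]+(\w+)((?:\[\w+ *= *\d+\])+);", re.MULTILINE)
DIMENSION = re.compile(r"\[(\w+) *= *(\d+)\]")

# Element sizes in bytes for the three types that appear in this DDS.
ELEMENT_BYTES = {"Int16": 2, "Float32": 4, "Float64": 8}


def parse_declarations(dds_text):
    """Return (type, name, shape) for every array declaration, in document order."""
    declarations = []
    for match in DECLARATION.finditer(dds_text):
        dtype, name, dims = match.groups()
        shape = tuple(int(size) for _, size in DIMENSION.findall(dims))
        declarations.append((dtype, name, shape))
    return declarations


def find(declarations, name):
    """Return (type, shape) of the first declaration with the given name."""
    for dtype, decl_name, shape in declarations:
        if decl_name == name:
            return dtype, shape
    raise KeyError(name)


def array_bytes(declarations, name):
    """Uncompressed size of an array: number of elements times element size."""
    dtype, shape = find(declarations, name)
    return prod(shape) * ELEMENT_BYTES[dtype]


decls = parse_declarations(DDS_TEXT)
print(find(decls, "sst"))         # type and shape of the sst array
print(array_bytes(decls, "sst"))  # bytes needed to fetch all of sst
```

Knowing the shape up front is what lets a client request only a slice of sst through a constraint instead of the whole array.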
16Web Services-based Transport Protocol
17Web Services-based Transport Protocol
- Open GIS Web Services (OWS) from the Open
Geospatial Consortium (OGC) provide a new,
cutting-edge data transport protocol that includes:
- A Sensor Collection Service (SCS) server gathers
readings from in-situ environmental sensors via a
private network (cellular, microwave, etc.) and
provides summaries or interpretations of those
readings to SCS clients over the Web.
18Advantages
- Open source application with well-established
support community
- All encodings are based upon XML
- Encodings describe specialized vocabularies for
the transfer of specific kinds of data packages
as messages between application clients and
services, and between services.
- Includes all interoperability and connectivity
protocols
20Services
- Sensor Web Enablement (SWE)
- A thread of OGC work that links environmental sensors to the World
Wide Web.
21Services
- Web Mapping Service (WMS)
- standardizes the way in which clients request
maps. Clients request maps from a WMS instance in
terms of named layers and provide parameters such
as the size of the returned map as well as the
spatial reference system to be used in drawing
the map.
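As a rough illustration of such a request, the Python sketch below assembles a GetMap URL from named layers, a bounding box, an image size and a spatial reference system. It assumes the WMS 1.1.1 key-value parameter names (version 1.3.0 uses CRS in place of SRS). The endpoint http://example.org/wms and the layer name sst are hypothetical.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap request URL from the parameters a client supplies."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),               # named layers, comma separated
        "STYLES": "",                             # empty means the server's default style
        "SRS": srs,                               # spatial reference system for drawing
        "BBOX": ",".join(str(v) for v in bbox),   # minx, miny, maxx, maxy
        "WIDTH": width,                           # size of the returned map in pixels
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    # Keep commas, colons and slashes readable instead of percent-encoding them.
    return endpoint + "?" + urlencode(params, safe=",:/")


# Hypothetical server and layer; the box roughly covers Alaskan waters.
url = wms_getmap_url("http://example.org/wms", ["sst"], (-180, 50, -130, 72), 600, 300)
print(url)
```

The response to such a request is a rendered picture, which is the point of contrast with the Web Coverage Service described later.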
22Services
- Web Feature Service (WFS)
- The Web Feature Service (WFS) supports INSERT,
UPDATE, DELETE, QUERY and DISCOVERY of geographic
features. WFS delivers GML representations of
simple geospatial features in response to queries
from HTTP clients. Clients access geographic
feature data through WFS by submitting a request
for just those features that are needed for an
application.
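To make "GML representations of simple geospatial features" concrete, the sketch below parses a small hand-written GML 2 style response with Python's standard library. The sample document, the ex namespace, and the station names and positions are invented for illustration; only the gml and wfs namespace URIs are the standard ones, and a real server's feature schema will differ.

```python
import xml.etree.ElementTree as ET

# Hand-written sample response; stations and the "ex" namespace are hypothetical.
SAMPLE_GML = """
<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs"
                       xmlns:gml="http://www.opengis.net/gml"
                       xmlns:ex="http://example.org/stations">
  <gml:featureMember>
    <ex:station fid="station.1">
      <ex:name>Buoy A</ex:name>
      <ex:location>
        <gml:Point srsName="EPSG:4326">
          <gml:coordinates>-150.5,58.25</gml:coordinates>
        </gml:Point>
      </ex:location>
    </ex:station>
  </gml:featureMember>
  <gml:featureMember>
    <ex:station fid="station.2">
      <ex:name>Buoy B</ex:name>
      <ex:location>
        <gml:Point srsName="EPSG:4326">
          <gml:coordinates>-165.0,54.0</gml:coordinates>
        </gml:Point>
      </ex:location>
    </ex:station>
  </gml:featureMember>
</wfs:FeatureCollection>
"""

NAMESPACES = {
    "gml": "http://www.opengis.net/gml",
    "ex": "http://example.org/stations",
}


def read_point_features(gml_text):
    """Return (feature id, name, x, y) for each point feature in the collection."""
    root = ET.fromstring(gml_text.strip())
    features = []
    for member in root.findall("gml:featureMember", NAMESPACES):
        feature = member[0]  # the single feature element inside featureMember
        name = feature.findtext("ex:name", namespaces=NAMESPACES)
        coords = feature.findtext(".//gml:coordinates", namespaces=NAMESPACES)
        x, y = (float(value) for value in coords.split(","))
        features.append((feature.get("fid"), name, x, y))
    return features


for fid, name, x, y in read_point_features(SAMPLE_GML):
    print(fid, name, x, y)
```

Because the features arrive as structured XML, a client can filter, edit or re-project them, none of which is possible with a rendered map image.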
23Services
- Web Coverage Service (WCS)
- The Web Coverage Service supports the networked
interchange of geospatial data as "coverages"
containing values or properties of geographic
locations. Unlike the Web Map Service, which
returns static maps (server-rendered as
pictures), the Web Coverage Service provides
access to intact (unrendered) geospatial
information, as needed for client-side rendering,
multi-valued coverages, and input into scientific
models and other clients beyond simple viewers.
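The difference between a rendered map and an intact coverage can be shown with a toy example. In the Python sketch below the client holds the raw values, so it can choose its own class breaks for display and also compute with the numbers. The grid values, class breaks and symbols are invented for illustration and do not come from any real service.

```python
from bisect import bisect_right

# Hypothetical coverage: sea-surface temperature in degrees C on a 3 x 4 grid.
coverage = [
    [2.0, 3.5, 5.0, 6.5],
    [1.0, 2.5, 4.0, 5.5],
    [-1.0, 0.5, 2.0, 3.5],
]


def render(grid, breaks, symbols):
    """Client-side rendering: map each value to a symbol by its class interval."""
    if len(symbols) != len(breaks) + 1:
        raise ValueError("need one more symbol than class breaks")
    return ["".join(symbols[bisect_right(breaks, value)] for value in row)
            for row in grid]


def mean(grid):
    """A computation on the raw values, which a rendered picture could not support."""
    values = [value for row in grid for value in row]
    return sum(values) / len(values)


# The client picks its own classes: below 0, 0 to 2, 2 to 4, 4 and above.
picture = render(coverage, breaks=[0, 2, 4], symbols=".-+#")
for line in picture:
    print(line)
print(round(mean(coverage), 2))
```

A Web Map Service would have fixed the classes and colours on the server; with a coverage, the same values can be re-rendered or fed into a model.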
24Services
- Coverage Portrayal Service (CPS)
- The Coverage Portrayal Service defines a standard
interface for producing visual pictures from
coverage data. CPS extends the WMS interface and
uses the Styled Layer Descriptor (SLD) language
to support rendering of WCS coverages.
25Services
- Sensor Collection Service (SCS)
- The basic function of the Sensor Collection
Service (SCS) is to provide a web-enabled
interface to a sensor, collection of sensors or
sensor proxy. The Sensor Collection Service
provides a standard interface for clients to
collect and access sensor observations and
manipulate them in different ways. SCS instances
are collection points on the web for disparate
types and instances of sensors. SCS instances
deliver sensor observation values (e.g.,
temperature, ppm, chemical type) in response to
queries from HTTP clients.
- A Sensor Collection Service (SCS) server gathers
readings from in-situ environmental sensors via a
private network (cellular, microwave, etc.) and
provides summaries or interpretations of those
readings to SCS clients over the Web.
27Services
- Geocoder Service
- Geocoding is the process of linking words, terms
and codes found in a text string to their
applicable geospatial features, with known
positions (i.e., usually a point with x, y
coordinates but more generally any geometry). The
most commonly known type of geocoding is
converting a street address to a geographic
location.
28Services
- Gazetteer Service
- The Gazetteer Service is a network-accessible
service that retrieves the known geometries for
one or more features, given their associated
well-known feature identifiers (text strings),
which are specified at run-time through a query
(filter) request. The identifiers are any words
or terms that describe the features.
29RECOMMENDATIONS
- Construct a hybrid transport protocol system
- Implement a web services oriented transport
protocol, as the main workhorse, to take
advantage of emerging technologies.
- Implement OPeNDAP/DODS so as to remain compliant
with IOOS connectivity and data sharing
requirements.