Title: Mining For Lost Treasure
1Mining For Lost Treasure
- National Geospatial Data Clearinghouse
- Archibald Warnock
- U.S. Federal Geographic Data Committee
- A/WWW Enterprises
2What is Clearinghouse?
- A distributed service to locate geospatial data based on characteristics expressed in metadata
- Clearinghouse allows a user to pose a query of all or a portion of the community in a single session
- Like a spatial AltaVista
3National Geospatial Data Clearinghouse
- Distributed data producers and users.
- Key components
- Data documentation (metadata)
- Networking (Internet)
- Serving, searching, and accessing software
- Z39.50 Search and Retrieve Protocol
- WWW - World Wide Web
4Components of Clearinghouse
- There are three functional areas that interact to create the Clearinghouse:
- Metadata preparation and indexing
- Metadata service
- User access via Gateway forms
5Clearinghouse Method
Metadata preparation
6Clearinghouse Design
- The Clearinghouse in its distributed form includes a registry of servers, several WWW-to-Z39.50 gateways, and many Z39.50 servers
- A primary goal of Clearinghouse is to provide the ability to find spatial data throughout the entire community, not one site at a time
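The registry, gateway, and server arrangement can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Node` and `Gateway` classes, the node names, and the sample titles are hypothetical, the nodes are in-memory objects rather than real Z39.50 servers, and matching is a plain substring test on the title.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for one Clearinghouse server holding metadata records."""
    name: str
    records: list = field(default_factory=list)  # each record is a dict

    def search(self, term):
        # Case-insensitive substring match against the record title only.
        term = term.lower()
        return [r for r in self.records if term in r["title"].lower()]


class Gateway:
    """Forwards one query to every registered node and collates the hits."""

    def __init__(self, registry):
        self.registry = list(registry)

    def search(self, term):
        hits = []
        for node in self.registry:
            for record in node.search(term):
                # Tag each hit with the node it came from.
                hits.append({"node": node.name, "title": record["title"]})
        return hits


# Hypothetical registry of three nodes.
registry = [
    Node("node-a", [{"title": "County soils survey"}, {"title": "Road centerlines"}]),
    Node("node-b", [{"title": "Statewide soils map"}]),
    Node("node-c", [{"title": "Wetlands inventory"}]),
]
gateway = Gateway(registry)
results = gateway.search("soils")
for hit in results:
    print(hit["node"], "-", hit["title"])
```

The user poses one query and gets hits from every node in the registry, which is the "not one site at a time" goal in miniature.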
7Essential Configuration
[Diagram: a Web Client, the Gateways, FGDC, and the Clearinghouse Sites (four Nodes)]
8User downloads query form
9User sends query to web server
10Gateway passes query to Clearinghouse Servers
11Gateway receives and collates hits
12Client receives results summary as HTML
13Client can request a specific metadata record for
viewing
14Node in More Detail
[Diagram: Metadata, Index/DB]
15Data
- The most expensive investment for an organization
- Created by many different organizations
- To solve many different problems
- Using many different methods and technologies
16But . . .
- Data are hard to find
- Data are difficult to access
- Data are hard to integrate
- Data are not current
- Data are undocumented
- Data are incomplete
17The uses of metadata
- Provides documentation of existing internal geospatial data resources within an organization (inventory)
- Permits structured search and comparison of held spatial data by others (advertising)
- Provides end-users with adequate information to take the data and use it in an appropriate context (liability)
18Metadata Solutions
- Numerous software solutions available
- Commercial and freeware
- Standalone, DB-linked, GIS-linked
- Permit collection and structuring of FGDC-compatible metadata
- Present metadata as HTML, XML, or text
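As a rough illustration of presenting one structured record as text, XML, or HTML, the sketch below uses only the Python standard library. The record, its field names, and the `metadata` root element are hypothetical placeholders and do not follow the actual FGDC element names.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical record; field names are illustrative, not FGDC element names.
record = {
    "title": "Soils & geology of Example County",
    "originator": "Example Agency",
    "abstract": "Digitized soil and bedrock units.",
}


def to_text(rec):
    # One "field: value" line per field.
    return "\n".join(f"{key}: {value}" for key, value in rec.items())


def to_xml(rec):
    # One child element per field; ElementTree escapes special characters.
    root = ET.Element("metadata")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_html(rec):
    # A definition list, with field names and values escaped for HTML.
    rows = "".join(
        f"<dt>{html.escape(key)}</dt><dd>{html.escape(value)}</dd>"
        for key, value in rec.items()
    )
    return f"<dl>{rows}</dl>"


print(to_text(record))
print(to_xml(record))
print(to_html(record))
```

Because the record is structured, all three presentations come from the same source without re-entering any metadata.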
19GILS, Dublin Core and Others
- Dublin Core is a minimal (15 fields) generic metadata scheme for virtually any kind of document
- GILS represents a more detailed approach, including most of DC, providing greater interoperability
- GILS is less bibliographically oriented than (Z39.50) BIB-1
- GILS is lightweight compared to GEO (FGDC) and EOS/CIP (which have specific functional requirements)
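To make the difference between a minimal scheme and a detailed one concrete, here is a small sketch that reduces a detailed record to Dublin Core elements. The 15 element names are those of Dublin Core; the `CROSSWALK` mapping and the detailed field names are hypothetical and are not taken from any published FGDC or GILS crosswalk.

```python
# The 15 Dublin Core elements.
DUBLIN_CORE = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical crosswalk: detailed field name -> Dublin Core element.
CROSSWALK = {
    "dataset_title": "title",
    "originator": "creator",
    "theme_keywords": "subject",
    "abstract": "description",
    "publication_date": "date",
    "bounding_box": "coverage",
}


def to_dublin_core(detailed):
    """Keep only the fields that have a Dublin Core counterpart."""
    dc = {}
    for field_name, value in detailed.items():
        element = CROSSWALK.get(field_name)
        if element is not None:
            dc[element] = value
    return dc


# Hypothetical detailed record with two fields that have no DC counterpart here.
detailed = {
    "dataset_title": "Example County soils",
    "originator": "Example Agency",
    "abstract": "Digitized soil units.",
    "publication_date": "1998-05-01",
    "bounding_box": "-112.0 -108.0 45.0 47.0",
    "horizontal_datum": "NAD83",
    "attribute_accuracy": "Not assessed",
}
summary = to_dublin_core(detailed)
print(summary)
```

The generic record is enough for discovery, while the fields dropped here (datum, accuracy) are the kind of detail a geospatial profile exists to carry.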
20What Structured Metadata Means - 1
- GILS:
- Fewer fields
- More documents
- More metadata records
- Skinnier metadata records
- Easier abstraction
- FGDC:
- More fields
- Fewer documents
- Fewer metadata records
- Fatter metadata records
- Less abstraction
GILS is a good, general compromise
21What Structured Metadata Means - 2
- A Z39.50 profile defines a language
- At some level, Z39.50 is a detail
- Protocols are about communication, profiles are about abstraction, and GILS is about content
- Z39.50 guarantees that the user's query can be unambiguously decoded, but makes no guarantees about content
- We could implement the profile over any protocol
- HTTP, CORBA, etc.
- Do we have to use Z39.50?
- No, but the abstraction is required
- Z39.50 already includes the abstraction model
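One way to picture the abstraction a profile provides, independent of the protocol that carries the query, is a shared set of attribute names that each server maps onto its own local fields. The sketch below is hypothetical throughout: the `PROFILE` attribute names, the `Server` class, and the local field names are invented for illustration and are not taken from the GILS or GEO profiles.

```python
# Hypothetical abstract attribute names that every server agrees on.
PROFILE = {"title", "originator", "abstract"}


class Server:
    """A server that maps profile attributes onto its own local field names."""

    def __init__(self, name, field_map, rows):
        unknown = set(field_map) - PROFILE
        if unknown:
            raise ValueError(f"not in profile: {sorted(unknown)}")
        self.name = name
        self.field_map = field_map  # profile attribute -> local field name
        self.rows = rows

    def search(self, attribute, term):
        if attribute not in PROFILE:
            raise ValueError(f"unknown attribute: {attribute}")
        local_field = self.field_map.get(attribute)
        if local_field is None:
            return []  # this server does not support the attribute
        term = term.lower()
        return [
            row for row in self.rows
            if term in str(row.get(local_field, "")).lower()
        ]


# Two servers with different local schemas answer the same abstract query.
state = Server(
    "state",
    {"title": "dataset_name", "originator": "agency"},
    [{"dataset_name": "Statewide soils", "agency": "Example State GIS"}],
)
city = Server(
    "city",
    {"title": "TITLE_TXT"},
    [{"TITLE_TXT": "City soils and parcels"}, {"TITLE_TXT": "Street centerlines"}],
)

hits = [(s.name, row) for s in (state, city) for row in s.search("title", "soils")]
print(hits)
```

The query never mentions `dataset_name` or `TITLE_TXT`. The client needs only the shared attribute names, and that part would stay the same whichever protocol carried the request.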
22How much metadata is enough?
- Internal documentation for local use (local inventory)
- Basic documentation for discovery of information holdings (catalog/search)
- Detailed documentation to provide end-users with adequate information for re-use (asset management)
23Server Solutions
- Z39.50 Protocol is used
- GEO Geospatial Metadata Profile is published for Z39.50 implementors to understand FGDC metadata structures
- Supports search across numeric, text, date, and spatial-extent fields, as well as full text
- Freeware and commercial solutions
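A search that combines text, date, and spatial extent might look like the following sketch. It is a simplified illustration, not how any particular Z39.50 server implements the GEO profile: the `Record` fields, the sample records, and the `search` function are hypothetical, spatial extent is reduced to a latitude/longitude bounding box, and boxes are assumed not to cross the 180-degree meridian.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    title: str
    west: float
    east: float
    south: float
    north: float
    published: date


def boxes_overlap(rec, west, east, south, north):
    # Two boxes overlap unless one lies entirely to one side of the other.
    return not (
        rec.east < west or rec.west > east
        or rec.north < south or rec.south > north
    )


def search(records, text=None, bbox=None, after=None):
    """Return records matching every criterion that was supplied."""
    hits = []
    for rec in records:
        if text is not None and text.lower() not in rec.title.lower():
            continue
        if bbox is not None and not boxes_overlap(rec, *bbox):
            continue
        if after is not None and rec.published <= after:
            continue
        hits.append(rec)
    return hits


# Hypothetical records with approximate bounding boxes.
records = [
    Record("Montana hydrography", -116.1, -104.0, 44.4, 49.0, date(1998, 5, 1)),
    Record("Florida wetlands", -87.6, -80.0, 24.5, 31.0, date(1997, 2, 10)),
    Record("Montana roads", -116.1, -104.0, 44.4, 49.0, date(1995, 8, 20)),
]

# Query box is (west, east, south, north) in decimal degrees.
hits = search(records, bbox=(-112.0, -108.0, 45.0, 47.0), after=date(1996, 1, 1))
print([rec.title for rec in hits])
```

Each criterion is optional, so the same function serves a text-only search, a map-based search, or a combination of both.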
24Gateway in more detail
[Diagram: Nodes, Gateway]
25User Interfaces
- HTML-based forms hosted at Gateways are the primary access method
- Java map-based interface from MEL allows more sophisticated search
- Inclusion of search capabilities in GIS client software is possible
26Who's in Clearinghouse?
- 109 Nodes (servers) online as of 3/1/99
- 28 Federal, national scope
- 35 State/University state-wide scope
- 28 International scope or location
- 18 Local or Regional scope
27US Federal Participation
- National Park Service
- Army Corps of Engineers
- Tri-Services Center
- National Wetlands Inventory
- Census (sampler)
- Minerals Management Service
- NOAA (10)
- USGS (6)
- FEMA (sampler)
- NRCS climate and soils
- CIESIN/EPA
- CIESIN/NASA
- DOT NTAD
28State Participation
- West Virginia
- Washington
- Wisconsin
- Wyoming (2)
- Florida
- Alabama
- New Mexico
- Arizona
- Georgia
- Illinois
- Minnesota
- Alaska
- California
- Delaware
- Nebraska (2)
- New Jersey
- New York (2)
- North Carolina
- Oklahoma
- Kansas
- Texas
- Montana (3)
- Vermont
- Pennsylvania
29Regional/Local Participation
- Olympic Peninsula, WA
- Greater Yellowstone
- Helena NF
- Ecological Reserves, KS
- MIT/Mass Boston DOQs
- Great Lakes EIS
- Eastern Sierra
- McKinley Co, NM
- City of Santa Fe, NM
- North Texas GIS
- Research Planning
- Sabine R Authority, TX
- San Francisco Bay
- S Florida Ecosystem
- SW Natural Resources
30International Participation
- NOAA/Japan GOIN
- South Africa (2)
- ESA AVHRR sampler
- GELOS, Italy
- PAIGH, Mexico
- S57 Hydrography, Canada
- NRL MEL
- Africa DDS
- Inter-American Geospatial Data Network
- Hong Kong
- CIESIN/USDA Global Environmental Change
- Australia (10)
- Costa Rica
- Caribbean CEPNET, Jamaica
31Planned or Funded Nodes
- Mt Desert Island, ME
- SW Washington COG
- NASA GCMD
- CODEPLAN, Brazil
- Iowa
- Missouri
- Kentucky
- South Dakota
- Oregon
- Louisiana
- Ohio
- Connecticut MAGIC
- Colorado
- NW Ecosystems
32Clearinghouse provides...
- Discovery of spatial data
- Distributed search worldwide
- Uniform interface for spatial data searches
- Advertising for your data holdings
33For more information
Visit the FGDC website: http://www.fgdc.gov
Contact the Clearinghouse Coordinator, Doug Nebert (ddnebert_at_usgs.gov), or Archie Warnock (warnock_at_awcubed.com)