Title: Long-Term Preservation of Digital Geospatial Data: A Cooperative Project with Library of Congress
1Long-Term Preservation of Digital Geospatial Data: A Cooperative Project with Library of Congress
Steve Morris, North Carolina State University Libraries
Mountain Region GIS Advisory Council Meeting
September 15, 2006
2Overview
- Spatial Data Preservation Values and Considerations
- NC Geospatial Data Archiving Project
- Approaches to Preservation
- Challenges
- Workflow
- NC Spatial Data Infrastructure
- NC OneMap
- Regional/Local Partnerships and Data Sharing
- Coordinated Content Transfer
- Industry Engagement
- Historic and Geologic Map Preservation Project
3Today's geospatial data as tomorrow's cultural heritage
Future uses of data are difficult to anticipate
(as with Sanborn Maps).
4Temporal Data Supports Decision Making
- Land use change analysis
- Real Estate trend analysis
- Site selection
- (past uses?)
- Forecasting
Parcel Boundary Changes 2001-2004, North Raleigh, NC
5Time series Ortho imagery: Vicinity of Raleigh-Durham International Airport, 1993-2002
6Geospatial Data Risks
- Producer focus on current data
- Future support of data formats in question
- Shift to web services- and API-based access
- Inadequate or nonexistent metadata
- Increasing use of spatial databases for data
management
Many digital archiving challenges
7Geospatial data types: Aerial imagery
- 85 NC counties with orthophotos
- 1-5 flights per county
- 30-300 GB per flight
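The three figures on this slide imply a wide range for total imagery volume. The sketch below only multiplies out the extremes; it assumes every county sits at the same end of both ranges, which is unlikely in practice, so the results are outer bounds rather than an estimate. The function name is illustrative.

```python
# Figures from the slide: 85 counties, 1-5 flights per county, 30-300 GB per flight.
COUNTIES = 85
FLIGHTS_PER_COUNTY = (1, 5)
GB_PER_FLIGHT = (30, 300)


def storage_bounds_gb(counties, flights_range, gb_range):
    """Return (low, high) total storage in GB, taking every county at the extremes."""
    low = counties * flights_range[0] * gb_range[0]
    high = counties * flights_range[1] * gb_range[1]
    return low, high


low_gb, high_gb = storage_bounds_gb(COUNTIES, FLIGHTS_PER_COUNTY, GB_PER_FLIGHT)
print(f"{low_gb:,} GB to {high_gb:,} GB")  # 2,550 GB to 127,500 GB
```

That is roughly 2.5 TB at the low end and over 125 TB at the high end, for orthophotos alone.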
8Geospatial data types: Vector and Tabular
Economic, infrastructure, and ethnographic data
9Geospatial data types: Cartographic Project Files
Counterpart to the map is not just the dataset
but also models, symbolization, classification,
annotation, etc.
10How would you describe your current geospatial
archive?
11NC Geospatial Data Archiving Project (NCGDAP)
- Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
- One of 8 initial NDIIPP partnerships
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
- Objective: engage existing state/federal geospatial data infrastructures in preservation
Serve as catalyst for discussion within industry
13NDIIPP Overview
- National Digital Information Infrastructure and Preservation Program
- Congress appropriated $100 million for this effort, instructing the Library to spend an initial $25 million to develop and execute a congressionally approved strategic plan
- Eight initial projects, 2004-2007
- Web pages, cultural heritage, numeric data, video, business records, mixed content, geospatial (2)
- Developing partnerships and identifying issues
- Extensive interaction among NDIIPP projects
14Different Ways to Approach Preservation
- Technical solutions: How do we archive acquired content over the long term?
- Tools
- Hardware
- Software
- Cultural/Organizational solutions: How do we make the data more preservable, and more prone to be archived, from point of production?
- Collaboration
- Education
- Feedback
15Technical Approaches
- Receive data as is: variety of distribution methods
- Migration of some at-risk formats
- Metadata remediation, standardization, and synchronization
- Distilling complex objects into repository ingest items (not easy)
- Using DSpace for demonstration purposes
- In development: use of a METS record as a dormant item "brain" within the repository
Some unsustainable activities for learning
experience
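To make the METS idea concrete, here is a minimal sketch of a METS wrapper that lists an item's files with checksums and ties them together in a structural map. It is not the project's actual profile: the `build_mets` function, the `OBJID` value, the `USE` and `TYPE` labels, and the file names are all hypothetical, and a real record would also carry descriptive and administrative metadata sections.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(object_id, files):
    """Build a bare-bones METS document.

    files: list of (relative_path, mime_type, content_bytes) tuples.
    Returns the serialized XML as a string.
    """
    root = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    file_sec = ET.SubElement(root, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "original"})
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "dataset"})

    for index, (path, mime, content) in enumerate(files, start=1):
        file_id = f"FILE{index:04d}"
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {
            "ID": file_id,
            "MIMETYPE": mime,
            "SIZE": str(len(content)),
            "CHECKSUM": hashlib.sha256(content).hexdigest(),
            "CHECKSUMTYPE": "SHA-256",
        })
        # FLocat points at the file; here a path relative to the item.
        ET.SubElement(file_el, f"{{{METS_NS}}}FLocat", {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": path,
        })
        # The structural map references each file by its ID.
        ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})

    return ET.tostring(root, encoding="unicode")


# Hypothetical item with two shapefile components.
print(build_mets("example-parcels-2004", [
    ("parcels.shp", "application/octet-stream", b"placeholder bytes"),
    ("parcels.dbf", "application/octet-stream", b"more placeholder bytes"),
]))
```

Because the record travels with the item and names every component, it can sit unused in the repository and still describe the object if the content is later moved elsewhere.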
16Cultural/Organizational Approaches
- Feedback to metadata outreach program
- Feedback to coordinating bodies on adherence to content standards
- Engage existing spatial data infrastructure in archiving and preservation
- Engage software vendors and standards community
- Cross-fertilize with other national archiving efforts
Current use and data sharing requirements, not archiving needs, drive improved preservability of content and improvement of metadata
17Challenge: Vector Data Formats
- No widely supported, open vector formats for geospatial data
- Spatial Data Transfer Standard (SDTS) not widely supported
- Geography Markup Language (GML): diversity of application schemas and profiles threatens permanent access
- Spatial Databases
- The whole is more than the sum of the parts, and the whole is very difficult to preserve
- Can export individual data layers for curation
- Some thinking of using the spatial database as the primary archival platform
18Challenge: Geospatial Web Services
- How to capture records from decision-making processes?
- Possible: Atlas collections from automated image capture
- Web 2.0 impact: Emerging tiling and caching schemes (archive target?)
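As background for the tiling question, the sketch below shows how one widely used scheme, the XYZ grid over Web Mercator, maps a longitude and latitude to a tile index, and how quickly the number of tiles grows with zoom level. The slide does not name a specific scheme, so treating this one as representative is an assumption, and the function names are illustrative.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) tile indices in the common XYZ / Web Mercator scheme."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_at_zoom(zoom):
    """Total number of tiles covering the world at one zoom level."""
    return 4 ** zoom


print(lonlat_to_tile(0.0, 0.0, 1))   # (1, 1)
print(tiles_at_zoom(10))             # 1048576
```

Each added zoom level quadruples the tile count, which is part of why a tile cache is a large and awkward archive target compared with the source data behind it.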
19Challenge: Preserving Cartographic Representation
20General Workflow
- Receive Data from Agency
- Copy data from agency source to NCSU workstation
- Create DSpace collection space for the data
- Create administrative metadata
- Process geospatial metadata
- Scan geospatial formats and migrate to archival format
- Ingest original and archival data objects, and geospatial and administrative metadata, to DSpace
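The first steps of this workflow (copy, record administrative metadata, scan formats) can be sketched as below. This is an illustration under stated assumptions, not the project's tooling: the `receive_dataset` function, the `manifest.json` name, and the manifest fields are hypothetical, and the format scan is reduced to a tally of file extensions that a person would review before deciding what to migrate. DSpace ingest is not shown.

```python
import hashlib
import json
import shutil
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large imagery files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def receive_dataset(source_dir, workstation_dir, agency):
    """Copy an agency delivery and write a simple administrative manifest."""
    source, target = Path(source_dir), Path(workstation_dir)
    shutil.copytree(source, target)  # raises if the target already exists

    files = []
    for path in sorted(p for p in target.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(target).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })

    manifest = {
        "agency": agency,
        "received_utc": datetime.now(timezone.utc).isoformat(),
        "files": files,
        # Extension tally as a rough format scan for migration review.
        "extensions": dict(Counter(Path(f["path"]).suffix.lower() for f in files)),
    }
    (target / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

A call such as `receive_dataset("/mnt/agency_drive/parcels", "/data/incoming/parcels_2006", "Example County")` (paths and agency name are placeholders) leaves a copy of the delivery plus a manifest that records fixity values before any migration touches the files.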
21NCGDAP: Leveraging Existing Spatial Data Infrastructure (NC OneMap)
- NC OneMap: "Historic and temporal data will be maintained and available"; RAMONA data inventory
- Metadata outreach and content standards
- Regional Partnerships
- WGRT and other Coordination Efforts
- Data Sharing Agreements
- Frequent communication and discussion among
geospatial data community
22Challenge: Coordinated Content Transfer
- How to allow one data snapshot to be accessible by multiple agencies: more compelling use cases than preservation can put the data in motion (business continuity, disaster preparedness, etc.)
- Question: Capture frequency of data snapshot?
- Survey in progress to identify local government best practices, consumer agencies' needs
- Working Group for Roads and Transportation (WGRT)
- Stakeholder group working to build data depository for statewide local road data
- First serious effort to develop a plan for local-to-state data sharing on a regular basis
- Other Activities? (DHS, State Archives, Census, etc.)
23Partnership Activity
- ESRI
- Discussing software requirements; meetings with development teams April 2005
- Open Geospatial Consortium (OGC)
- Presented to Architecture Working Group Nov. 2005
- National Archives and Records Administration
- Investigations into GML for archiving; presentation to NARA technology team Dec. 2005
- FGDC Historical Data Working Group
- Ongoing, general geospatial data preservation issues
24Partnership Activity
- EDINA (University of Edinburgh, UK)
- NCSU is Associate Partner on UK project for geospatial institutional repositories
- UC Santa Barbara and Stanford University
- Collaboration with other NDIIPP geospatial project
- EROS Data Center
- Planned site visit
- Project visits to regional GIS groups
25Preservation of Digital Geologic and Historic Maps
- Georeferenced over 450 maps scanned by NC Geologic Survey
- Maps are available for download at http://wfs.enr.state.nc.us/NCGeologicMaps
1,200 24,000
15-min topo maps
1:31,680 1:430,000
1:500,000 1:2.5 M
26Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries, Steven_Morris_at_ncsu.edu
Web site: http://www.lib.ncsu.edu/ncgdap/
27NC Spatial Data Infrastructure: NC OneMap
- NC OneMap is a next-generation mechanism to coordinate and disseminate geographic information in North Carolina and interact with the NSDI.
- Objectives
- Build a common understanding of North Carolina data resources
- Enable widespread access and distribution of geospatial data
28NC OneMap Viewer
29NC OneMap
- Objectives (cont.)
- Develop ongoing data inventory for all geospatial data holdings: RAMONA (http://nc.gisinventory.net)
- Develop content standards for key data themes
- NC Geographic Information Coordinating Council (GICC)
- One of the defined characteristics of NC OneMap is that "Historic and temporal data will be maintained and available."
30Emerging Regional Partnerships
- Focused on development of shared infrastructure for cultivating access to data
- Becoming test beds for innovation in the area of data sharing and data management, including archiving
31Local Govt. Data Sharing
- Becoming more open, fewer agreements to sign
- Recent survey: over 20 state and federal agencies use local data
- Problem of local governments being swamped by requests
- Many requests are more compelling than archiving
- Content transfer is non-trivial: large dataset sizes, small rural staffs, technical limitations
32Earlier NCSU Acquisition Efforts
- NCSU University Extension project 2000-2001
- Target: County/city data in eastern NC
- Digital rescue, not digital preservation
- Project learning outcomes
- Confirmed concerns about long term access
- Need for efficient inventory/acquisition
- Wide range in rights/licensing
- Need to work within statewide infrastructure
- Acquired experience, unanticipated collaboration
33Big Geoarchiving Challenges
- Format migration paths
- Management of data versions over time
- Preservation metadata
- Preserving cartographic representation
- Keeping content repository-agnostic
- Preserving geodatabases
- Harnessing geospatial web services
- More