Title: GIS Data Preservation: Partnership with Library of Congress Steve Morris North Carolina State Univer
1. GIS Data Preservation: Partnership with Library of Congress
Steve Morris
North Carolina State University Libraries
NCPMA Fall Meeting
October 11, 2006
2. Today's geospatial data as tomorrow's cultural heritage
Future uses of data are difficult to anticipate
(as with Sanborn Maps).
3. Temporal Data Supports Decision Making
- Land use change analysis
- Real Estate trend analysis
- Site selection
- (past uses?)
- Forecasting
Parcel Boundary Changes 2001-2004 North Raleigh,
NC
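As a rough illustration of why retained snapshots matter for change analysis, the sketch below compares two parcel snapshots keyed by parcel ID. The parcel IDs, coordinates, and the `diff_snapshots` helper are hypothetical and not part of the presentation. Exact vertex comparison is a simplification; real change analysis would normally rely on GIS software.

```python
def diff_snapshots(older, newer):
    """Classify parcel IDs as added, removed, or changed between two snapshots."""
    shared = older.keys() & newer.keys()
    return {
        "added": sorted(newer.keys() - older.keys()),
        "removed": sorted(older.keys() - newer.keys()),
        "changed": sorted(pid for pid in shared if older[pid] != newer[pid]),
    }


# Hypothetical data: parcel ID -> boundary as a tuple of (x, y) vertices.
snapshot_2001 = {
    "P-001": ((0, 0), (0, 10), (10, 10), (10, 0)),
    "P-002": ((10, 0), (10, 10), (20, 10), (20, 0)),
}
snapshot_2004 = {
    "P-001": ((0, 0), (0, 10), (10, 10), (10, 0)),
    # In this made-up example, P-002 was subdivided into two new parcels.
    "P-003": ((10, 0), (10, 10), (15, 10), (15, 0)),
    "P-004": ((15, 0), (15, 10), (20, 10), (20, 0)),
}

changes = diff_snapshots(snapshot_2001, snapshot_2004)
print(changes)
```

If the 2001 snapshot had been overwritten, none of these differences could be recovered from the 2004 data alone.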
4. Time series: Ortho imagery
Vicinity of Raleigh-Durham International Airport, 1993-2002
5. Digital Preservation: Points of Failure
- Data is not saved, or
- can't be found, or
- media is obsolete, or
- media is corrupt, or
- format is obsolete, or
- file is corrupt, or
- meaning is lost
Solutions: Migration, Emulation, Encapsulation, XML
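The "file is corrupt" and "data is not saved" failures can be detected, though not repaired, by periodic checksum comparison. The slides do not describe this technique; the sketch below is a common practice added as an assumption, and the function names `sha256_of`, `build_manifest`, and `verify_manifest` are illustrative only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict) -> dict:
    """Compare the files under root with a previously stored manifest."""
    current = build_manifest(root)
    shared = manifest.keys() & current.keys()
    return {
        # Listed in the manifest but no longer present on disk.
        "missing": sorted(manifest.keys() - current.keys()),
        # Present on disk, but the content no longer matches the manifest.
        "corrupt": sorted(n for n in shared if manifest[n] != current[n]),
    }
```

A manifest would be built once when a snapshot is archived, stored alongside it, and rechecked on a schedule. Checksums say nothing about obsolete formats or lost meaning, which the solutions listed above are meant to address.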
6. Risks to Geospatial Data
- Producer focus on current data
- Data overwrite as common practice
- Future support of data formats in question
- No open, supported format for vector data
- Shift to web services-based access
- Data becoming more ephemeral
- Inadequate or nonexistent metadata
- Impedes discovery and use
- Increasing use of spatial databases for data management
- The whole is greater than the sum of the parts
7. How would you describe your current geospatial archive?
8. NC Geospatial Data Archiving Project (NCGDAP)
- Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
- One of 8 initial NDIIPP partnerships
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
- Objective: engage existing state/federal geospatial data infrastructures in preservation
Serve as catalyst for discussion within industry
10. Geospatial data types: Aerial imagery
85 NC counties with orthophotos
1-5 flights per county
30-300 GB per flight
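The figures on this slide imply a wide range for total imagery volume. The sketch below works out only the extreme bounds, assuming every county sits at the low or the high end of both ranges and using decimal terabytes (1 TB = 1000 GB). The actual total is not stated in the slides and would fall somewhere in between.

```python
COUNTIES = 85                 # NC counties with orthophotos (from the slide)
FLIGHTS_PER_COUNTY = (1, 5)   # minimum and maximum flights per county
GB_PER_FLIGHT = (30, 300)     # minimum and maximum size of one flight, in GB


def storage_bounds_gb(counties, flights, gb_per_flight):
    """Return (low, high) total storage in GB if all counties hit the extremes."""
    low = counties * flights[0] * gb_per_flight[0]
    high = counties * flights[1] * gb_per_flight[1]
    return low, high


low_gb, high_gb = storage_bounds_gb(COUNTIES, FLIGHTS_PER_COUNTY, GB_PER_FLIGHT)
print(f"{low_gb / 1000:.2f} TB to {high_gb / 1000:.1f} TB")
```

Under these assumptions the bounds are 2,550 GB and 127,500 GB, roughly 2.5 TB to 127.5 TB.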
11. Geospatial data types: Vector and tabular
Economic, infrastructure, and ethnographic data
12. Geospatial data types: Cartographic project files
Counterpart to the map is not just the dataset
but also models, symbolization, classification,
annotation, etc.
13. Project Technical Approaches
- Receive data as is: variety of distribution methods
- Migration of some at-risk formats
- Metadata remediation, standardization, and synchronization
- Distilling complex objects into repository ingest items (not easy)
- Build a digital repository (catalyst for discussion)
- Develop a repository ingest workflow (learning experience)
Some unsustainable activities for learning experience
14. Project Cultural/Organizational Approaches
- Engage data producer community and spatial data infrastructure through outreach and engagement; influence practice
- Sell the problem to software vendors and standards development
- Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
- Start a discussion about roles at the local, state, and federal level
Current use and data sharing requirements, not archiving needs, drive improved preservability of content and improvement of metadata
15. Challenge: Coordinated Content Transfer
- How to allow one data snapshot to be accessible by multiple agencies: more compelling use cases than preservation can put the data in motion (business continuity, disaster preparedness, etc.)
- Other activities? (DHS, WGRT, State Archives, Census, etc.)
- Question: Capture frequency of data snapshot?
- Survey just completed to identify local government best practices, consumer agencies' needs
16. NC Frequency of Capture Survey
- Survey objective
  - Document current practices for obtaining archival snapshots of county/municipal geospatial vector data layers
  - Seek guidance about frequency of capture
- Survey topics
  - General questions about data archiving practice
  - Specific questions about parcels, street centerlines, jurisdictional boundaries, and zoning
- Survey subjects
  - All 100 counties and 25 municipalities
  - 58% response rate
- Survey conducted September 2006
17. Survey Results Overview
- Two-thirds of responding agencies create and retain periodic snapshots
- Long-term retention more common in counties with larger populations
- Storage environments vary, with servers and CD-ROMs most common
- Offsite storage (or both onsite and offsite) is used by nearly half of the respondents
- Popularity of historic images has resulted in scanning and geo-referencing of hardcopy aerial photos among one-third of the respondents
18. Local Business Rules and Uses Driving Temporal Snapshotting
- Information technology policy (20%)
- Records retention policy (18%)
- Tax administration rules (25%)
- Land use change analysis (11%)
- Resolution of legal issues (18%)
- Historic mapping (56%)
- Other (30%)
19. Frequency of Capture: Parcel Data
20. Parcel Data: Archival Data Format
- Shapefile (76%)
- Geodatabase (36%)
- Arc Coverage (29%)
- Arc Interchange (7%)
- Other (10%)
52% of respondents indicated that a format conversion was carried out in creating the archival snapshot
Respondents were allowed to select multiple
formats
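Because respondents could select more than one format, the shares above add up to more than 100. The short calculation below makes that explicit. It assumes the figures are percentages of the same set of respondents, which the slide does not state outright.

```python
# Share of respondents selecting each archival format (figures from the slide).
format_share_pct = {
    "Shapefile": 76,
    "Geodatabase": 36,
    "Arc Coverage": 29,
    "Arc Interchange": 7,
    "Other": 10,
}

total_pct = sum(format_share_pct.values())

# Under the stated assumption, the total divided by 100 is the average
# number of formats selected per respondent.
mean_formats_per_respondent = total_pct / 100

print(total_pct, mean_formats_per_respondent)
```

The total is 158, so under that assumption a respondent selected about 1.6 formats on average.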
21. Parcel Attribute Data Handling
22. Digital Conversion of Hardcopy Resources
- Historic hardcopy maps
  - Scanned only (15.5%)
  - Scanned and georeferenced (9.9%)
- Aerial photos
  - Scanned only (8.5%)
  - Scanned and georeferenced (26.8%)
- None (54.9%)
23. Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
Steven_Morris_at_ncsu.edu
Web site: http://www.lib.ncsu.edu/ncgdap/