Title: Collection and Preservation of At-Risk Digital Geospatial Data: North Carolina Geospatial Data Archiving Project
1 Collection and Preservation of At-Risk Digital Geospatial Data
North Carolina Geospatial Data Archiving Project (NDIIPP Partnership)
Steve Morris, Head of Digital Library Initiatives, NCSU Libraries
Library of Congress Brown Bag Discussion
Dec. 15, 2005
2 Project Context
- Partnership between university library (NCSU) and state agency (NCCGIA)
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventory information
- Objective: engage existing state/federal geospatial data infrastructures in preservation
3 Targeted Content
- Resource Types
- GIS vector (point/line/polygon) data
- Digital orthophotography
- Digital maps
- Tabular data (e.g. assessment data)
- Content Producers
- Mostly state, local, regional agencies
- Some university, not-for-profit, commercial
- Selected local federal projects
4 Geospatial data types: Vector data
5 Time series vector data: Parcel Boundary Changes 2001-2004, North Raleigh, NC
6 Geospatial data types: Aerial imagery
7 Geospatial data types: Aerial imagery
8 Geospatial data types: Aerial imagery
9 Time series ortho imagery: Vicinity of Raleigh-Durham International Airport, 1993-2002
10 Geospatial data types: Tabular data (w/vector)
11 Today's geospatial data as tomorrow's cultural heritage
12 Risks to Digital Geospatial Data
.shp
.mif
.gml
.e00
.dwg
.dgn
.bsb
.bil
.sid
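The extensions above span vector, CAD, and raster families. As a rough illustration of how a data inventory might be tallied by format, here is a minimal sketch. The format names are the commonly used ones for these extensions; the `kind` labels and the `inventory` helper are assumptions made for this example, not part of the project.

```python
from collections import Counter
from pathlib import PurePath

# Extension -> (common format name, rough family label).
# The family labels are illustrative only, not a formal assessment.
FORMATS = {
    ".shp": ("ESRI Shapefile", "vector"),
    ".mif": ("MapInfo Interchange Format", "vector"),
    ".gml": ("Geography Markup Language", "vector"),
    ".e00": ("ArcInfo interchange (export) file", "vector"),
    ".dwg": ("AutoCAD drawing", "CAD"),
    ".dgn": ("MicroStation design file", "CAD"),
    ".bsb": ("BSB nautical chart", "raster"),
    ".bil": ("Band-interleaved-by-line raster", "raster"),
    ".sid": ("MrSID compressed raster", "raster"),
}


def inventory(paths):
    """Count files per (format name, family); unknown extensions are kept visible."""
    counts = Counter()
    for p in paths:
        ext = PurePath(p).suffix.lower()
        counts[FORMATS.get(ext, ("unknown", "unknown"))] += 1
    return counts
```

Keeping an explicit "unknown" bucket makes unrecognized formats show up in the tally instead of being silently dropped.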
13 Risks to Digital Geospatial Data
- Producer focus on current data
- Time-versioned content generally not archived
- Future support of data formats in question
- Vast range of data formats in use--complex
- Shift to web services-based access
- Archives have been a by-product of providing access
- Preservation metadata requirements
- Descriptive, administrative, technical, DRM
- Geodatabases
- Complex functionality
14 Industry Shift to Web Services
18 Work Plan in a Nutshell
- Work from existing data inventories
- NC OneMap Data Sharing Agreements as the blanket, individual agreements as the quilt
- Partnership work with existing geospatial data infrastructures (state and federal)
- Technical approach
- METS with FGDC, PREMIS?, GeoDRM?
- DSpace now, re-ingest to a different environment later
- Web services consumption for archival development
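To make the "METS with FGDC" item concrete, the following is a minimal sketch of wrapping an FGDC metadata record and a list of data files in a METS document. It uses only the Python standard library. The `build_mets` function, the IDs, and the `USE="original"` label are assumptions for illustration; the project's actual METS profile is not described in these slides.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def build_mets(fgdc_root, file_hrefs):
    """Wrap an FGDC metadata element and file references in a bare METS tree."""
    mets = ET.Element(f"{{{METS}}}mets")

    # Descriptive metadata section: the FGDC record is embedded as-is.
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="FGDC")
    ET.SubElement(wrap, f"{{{METS}}}xmlData").append(fgdc_root)

    # File section and structural map (hypothetical single-level structure).
    file_sec = ET.SubElement(mets, f"{{{METS}}}fileSec")
    grp = ET.SubElement(file_sec, f"{{{METS}}}fileGrp", USE="original")
    struct = ET.SubElement(mets, f"{{{METS}}}structMap")
    div = ET.SubElement(struct, f"{{{METS}}}div", DMDID="DMD1")

    for i, href in enumerate(file_hrefs, 1):
        fid = f"FILE{i}"
        f = ET.SubElement(grp, f"{{{METS}}}file", ID=fid)
        ET.SubElement(f, f"{{{METS}}}FLocat",
                      {"LOCTYPE": "URL", f"{{{XLINK}}}href": href})
        ET.SubElement(div, f"{{{METS}}}fptr", FILEID=fid)
    return mets
```

Calling `ET.tostring(build_mets(fgdc, ["parcels.shp", "parcels.dbf"]))` would serialize the wrapper; PREMIS or rights sections would go in an `amdSec`, which this sketch leaves out.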
19 Big Challenges
- Format migration paths
- Management of data versions over time
- Preservation metadata
- Harnessing geospatial web services
- Preserving cartographic representation
- Keeping content repository-agnostic
- Preserving geodatabases
- More
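One way to picture "harnessing geospatial web services" for archiving is requesting a map image from an OGC Web Map Service so that a rendered snapshot can be saved alongside the data. The sketch below only builds a WMS 1.1.1 GetMap request URL. The parameter names come from the WMS specification; the `getmap_url` helper, its defaults, and the example endpoint and layer name are hypothetical.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, width=1024, height=1024,
               srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",          # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# Hypothetical endpoint and layer, for illustration only.
url = getmap_url("https://example.org/wms", "parcels", (-79.0, 35.5, -78.5, 36.0))
```

A harvested image captures the cartographic representation at a point in time, but not the underlying data, which is part of why the two challenges are listed separately.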
20 Vector Data Format Options
- Option A: use an open format and have a really unfortunate transformation and limited vendor support for the output object
- Option B: use closed format but retain the original content and count on short- and medium-term vendor support
- Option C: do both to buy time and look for an open, ASCII-based solution (watch GML activity)
- No sweet spot, just an evolving and changing mix of flawed options that are used in combination
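Option C can be pictured as a per-file plan that always retains the original and, where an open target exists, also schedules a derivative. The sketch below is only an illustration of that idea. The `OPEN_TARGETS` mapping, the `ArchivePlan` type, and the choice of GML as the target are assumptions, not the project's decided migration paths, and no actual format conversion is performed.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping from source extension to an open-format derivative.
OPEN_TARGETS = {".shp": ".gml", ".mif": ".gml", ".e00": ".gml"}


@dataclass
class ArchivePlan:
    original: str              # always retained (Option B)
    derivative: Optional[str]  # open-format copy to create, if any (Option A)
    note: str


def plan_item(filename):
    """Plan 'do both': keep the original and name an open derivative if one is mapped."""
    path = PurePosixPath(filename)
    target = OPEN_TARGETS.get(path.suffix.lower())
    if target is None:
        return ArchivePlan(filename, None, "retain original only; no open target mapped")
    return ArchivePlan(filename, str(path.with_suffix(target)),
                       "retain original and create open-format derivative")
```

For example, `plan_item("parcels/wake_2004.shp")` names `parcels/wake_2004.gml` as the derivative, while a MrSID image with no mapped target is kept as the original only.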
21 Preservation Metadata Issues
- FGDC Metadata
- Many flavors, incoming metadata needs processing
- Cross-walk elements to PREMIS, MODS?
- Metadata wrapper/Content packaging
- METS (Metadata Encoding and Transmission Standard) vs. other industry solutions
- Need a geospatial industry solution for the METS-like problem
- GeoDRM a likely trigger: wrapper to enforce licensing (MPEG-21 references in OGIS Web Services 3)
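The cross-walk item can be sketched as a table of FGDC element paths mapped to simpler descriptive fields, with missing elements reported so that incomplete incoming records can be flagged for processing. The source paths follow the FGDC CSDGM XML element names; the target field names and the `crosswalk` helper are hypothetical and do not represent an official FGDC-to-MODS or FGDC-to-PREMIS mapping.

```python
import xml.etree.ElementTree as ET

# Target field (hypothetical name) -> FGDC CSDGM element path under <metadata>.
CROSSWALK = {
    "title": "idinfo/citation/citeinfo/title",
    "creator": "idinfo/citation/citeinfo/origin",
    "date": "idinfo/citation/citeinfo/pubdate",
    "abstract": "idinfo/descript/abstract",
    "west": "idinfo/spdom/bounding/westbc",
    "east": "idinfo/spdom/bounding/eastbc",
    "north": "idinfo/spdom/bounding/northbc",
    "south": "idinfo/spdom/bounding/southbc",
}


def crosswalk(fgdc_xml):
    """Return (mapped fields, names of fields missing or empty in the record)."""
    root = ET.fromstring(fgdc_xml)
    mapped, missing = {}, []
    for field, path in CROSSWALK.items():
        text = root.findtext(path)
        if text and text.strip():
            mapped[field] = text.strip()
        else:
            missing.append(field)
    return mapped, missing
```

Returning the missing fields alongside the mapped ones reflects the "many flavors" point: records from different producers will populate different subsets of the standard.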
22 Metadata Availability
23 Preserving Cartographic Representation
24 Repository Architecture Issues
- Interest in how geospatial content interacts with widely available digital repository software
- Focus on salient, domain-specific issues
- Challenge: remain repository-agnostic
- Avoid imprinting on repository software environment
- Preservation package should not be the same as the ingest object of the first environment
- Tension between exploiting repository software features vs. becoming software dependent
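One way to keep a preservation package independent of any particular repository is to store it as plain files with a checksum manifest that any later environment can verify. The sketch below illustrates that idea only. The `write_manifest` helper, the manifest file name, and the line layout are assumptions, not the project's packaging design.

```python
import hashlib
from pathlib import Path


def write_manifest(package_dir, manifest_name="manifest-sha256.txt"):
    """Write '<sha256>  <relative path>' lines for every file in the package."""
    root = Path(package_dir)
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.name == manifest_name:
            continue  # do not checksum the manifest itself
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(root).as_posix()}")
    (root / manifest_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Because the manifest is plain text and the paths are relative, the package can be re-ingested into a different environment without depending on the first repository's internal identifiers. For very large imagery files, reading in chunks would be preferable to `read_bytes()`.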
25 Project Status
- Completing inventory analysis stage
- Storage system and backup deployed
- DSpace deployed to production
- Metadata workflow finalized
- Ingest workflow near finalization
- Content migration workflow near finalization
- Regional site visits planned for coming months
- Wide range of outreach/collaboration: FGDC, ESRI, EDINA (JISC), USGS, OGC, TRB, etc.
- Pilot project: georegistering digital archival geologic maps
26 Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
Steven_Morris_at_ncsu.edu