Title: The North Carolina Geospatial Data Archiving Project Steven P. Morris North Carolina State University Libraries
Maintaining Long-Term Access to Geospatial Data
October 27, 2006
2. NC Geospatial Data Archiving Project
- Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
- One of 8 initial NDIIPP partnerships
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
- Objective: engage existing state/federal geospatial data infrastructures in preservation
Serve as catalyst for discussion within industry
3. Targeted data: Digital orthophotography
- 85 NC counties with orthophotos
- 1-5 flights per county
- 30-200 GB per flight
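As a rough illustration of what these ranges imply for archive storage, the sketch below multiplies out the low and high ends. It assumes, purely for illustration, that all 85 counties sit at the same end of each range, and it uses decimal units (1 TB = 1000 GB). The function name is hypothetical and not part of the project.

```python
def orthophoto_storage_range_gb(counties, flights_range, gb_per_flight_range):
    """Return (low, high) total storage in GB.

    Assumes every county sits at the same end of each range, so the
    result is a bounding envelope rather than a forecast.
    """
    low = counties * flights_range[0] * gb_per_flight_range[0]
    high = counties * flights_range[1] * gb_per_flight_range[1]
    return low, high


low_gb, high_gb = orthophoto_storage_range_gb(85, (1, 5), (30, 200))
print(f"{low_gb / 1000:.2f} TB to {high_gb / 1000:.2f} TB")
# prints "2.55 TB to 85.00 TB"
```

The wide envelope is the point: the archive has to be sized for tens of terabytes of imagery, not gigabytes.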
4. Targeted data: Vector data (w/tabular)
Economic, infrastructure, and ethnographic data
5. Today's geospatial data as tomorrow's cultural heritage
Future uses of data are difficult to anticipate
(as with Sanborn Maps).
6. Risks to State/Local Geospatial Data
- Producer focus on current data
- Data overwrite as common practice
- Future support of data formats in question
- No open, supported format for vector data
- Shift to web services-based access
- Data becoming more ephemeral
- Inadequate or nonexistent metadata
- Impedes discovery and use
- Increasing use of spatial databases for data management
- The whole is greater than the sum of the parts
7. Challenge: Vector Data Formats
- No widely-supported, open vector formats for geospatial data
- Spatial Data Transfer Standard (SDTS) not widely supported
- Geography Markup Language (GML): diversity of application schemas and profiles threatens permanent access
- Spatial Databases
- The whole is more than the sum of the parts, and the whole is very difficult to preserve
- Can export individual data layers for curation
- Some thinking of using the spatial database as the primary archival platform
8. Challenge: Cartographic Representation
Counterpart to the map is not just the dataset
but also models, symbolization, classification,
annotation, etc.
9. Challenge: Geospatial Web Services
- How to capture records from decision-making processes?
- Possible: Atlas collections from automated image capture
- Web 2.0 impact: Emerging tiling and caching schemes (archive target?)
10. Different Ways to Approach Preservation
- Technical solutions: How do we archive acquired content over the long term?
- Build a data repository, not as an end in itself but as a catalyst for discussion within the data community
- Develop a repository ingest workflow: create technical points of engagement with the NDIIPP partners
11. Different Ways to Approach Preservation
- Cultural/Organizational solutions: How do we make the data more preservable, and more prone to be archived, from point of production?
- Engage data producer community and spatial data infrastructure through outreach and engagement: influence practice
- Sell the problem to software vendors and standards development
- Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
- Start a discussion about roles at the local, state, and federal level
12. NCGDAP Technical Approach
- Receive data "as is": variety of distribution methods
- Migration of some at-risk formats
- Metadata remediation, standardization, and synchronization
- Distilling complex objects into repository ingest items (not easy)
- Using DSpace for demonstration purposes (keeping repository platform at arm's length)
- In development: use METS record as dormant item "brain" within the repository
Some unsustainable activities for learning experience
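To make the idea of a METS record travelling with each ingest item more concrete, here is a minimal sketch that builds a bare-bones METS document listing an item's files with checksums. It is an illustration only, not the project's actual METS profile: the function name, the `FILE001`-style IDs, the choice of SHA-256, and the use of relative file names as locations are all assumptions.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(item_id, files):
    """Build a minimal METS element tree (hypothetical helper).

    files: dict mapping a relative file name to its bytes content.
    Only a fileSec and a structMap are produced; descriptive and
    administrative metadata sections are left out of this sketch.
    """
    def q(tag):
        return f"{{{METS_NS}}}{tag}"

    root = ET.Element(q("mets"), {"OBJID": item_id})
    file_grp = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    struct_div = ET.SubElement(ET.SubElement(root, q("structMap")), q("div"))

    for n, (name, content) in enumerate(sorted(files.items()), start=1):
        file_id = f"FILE{n:03d}"  # assumed ID scheme
        file_el = ET.SubElement(file_grp, q("file"), {
            "ID": file_id,
            "CHECKSUM": hashlib.sha256(content).hexdigest(),
            "CHECKSUMTYPE": "SHA-256",
        })
        ET.SubElement(file_el, q("FLocat"), {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": name,  # relative path, an assumption
        })
        # Point the structural map at each file so the item is self-describing.
        ET.SubElement(struct_div, q("fptr"), {"FILEID": file_id})
    return root


root = build_mets("zipcodes", {"zip.shp": b"...", "zip.dbf": b"..."})
print(ET.tostring(root, encoding="unicode"))
```

Because the record carries its own file inventory and fixity values, the item remains interpretable even if it is later moved out of the demonstration repository.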
13. Building Data Bundles: The Zip Codes Example
14. Where is the Dataset?
15. Here's One!
- Files
- Multi-file dataset
- Georeferencing
- Metadata file
- Symbolization file
- Additional documentation
- License
- Disclaimer
- More
- Metadata
- FGDC
- Acquisition metadata
- Transfer metadata
- Ingest metadata
- Archive rights
- Archive processes
- Collection metadata
- Series metadata
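The "where is the dataset?" problem above can be sketched as a grouping step: loose files that share a base name are collected into one bundle and tagged by role. The slide does not name file formats, so the extension table below assumes shapefile-style conventions (`.shp`, `.prj`, `.shp.xml`, `.lyr`); the table and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical extension-to-role table (shapefile-style conventions assumed).
ROLE_BY_EXT = {
    ".shp": "dataset", ".shx": "dataset", ".dbf": "dataset",
    ".prj": "georeferencing",
    ".xml": "metadata",
    ".lyr": "symbolization", ".avl": "symbolization",
}


def split_name(filename):
    """Split a file name into (base name, extension).

    'zip.shp.xml' is treated as the metadata file for 'zip'.
    """
    if filename.lower().endswith(".shp.xml"):
        return filename[:-len(".shp.xml")], ".xml"
    stem, dot, ext = filename.rpartition(".")
    return (stem, "." + ext.lower()) if dot else (filename, "")


def build_bundles(filenames):
    """Group file names into {base name: {role: [files]}}."""
    bundles = defaultdict(lambda: defaultdict(list))
    for name in sorted(filenames):
        stem, ext = split_name(name)
        role = ROLE_BY_EXT.get(ext, "additional documentation")
        bundles[stem.lower()][role].append(name)
    return {stem: dict(roles) for stem, roles in bundles.items()}


bundles = build_bundles([
    "zip.shp", "zip.shx", "zip.dbf", "zip.prj",
    "zip.shp.xml", "zip.lyr", "license.txt",
])
for stem, roles in bundles.items():
    print(stem, roles)
```

Note the limitation: `license.txt` ends up in its own bundle because nothing in its name ties it to the zip code layer. Attaching licenses, disclaimers, and other documentation to the right dataset is the kind of judgment that makes bundle building hard to automate.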
16. Hub-and-Spoke Metadata Workflow
17. Hub-and-Spoke Metadata Workflow
18. Cultural: Changing Industry Thinking
- Is the geospatial industry temporally-impaired?
- Lack of access to older data
- Lack of tool/model support for temporal analysis
- Metadata: poor support for changing data
- Education: building class projects around available data (i.e., not temporal)
- Increased interest now in temporal applications?
- Increased demand for temporal data?
- Improved tool support: ArcGIS 9.2 animation tools, Geodatabase History, etc.
19. Cultural: Content Exchange Networks
- Solving the present-day problems of data sharing is a prerequisite to solving the problem of long-term access
- Leveraging more compelling business problems: disaster preparedness and business continuity needs can put the data in motion (siphon off to the archive)
- Engage existing spatial data infrastructure in archiving and preservation
- Content exchange network technical challenges
- Rights management
- Large-scale transfers on network
- Content packaging (MPEG 21 DIDL, XFDU, METS, ...)
20. Cultural: Engaging Standards Efforts
- Nov. 2005: EDINA and NCSU present on preservation challenges at the OGC Technical Committee Meeting
- Key points of intersection with standards efforts
- GML archival profile?
- Content packaging and content exchange
- Metadata support for temporal entities
- Archival use cases in GeoDRM
- Oct. 2006: meeting of Ad Hoc Historical Data Working Group at OGC TC; plans to develop a formal Data Preservation Working Group
21. Sept. 2006 Frequency of Capture Survey
- Survey objective
- Document current practices for obtaining archival snapshots of county/municipal geospatial vector data layers
- Seek guidance about frequency of capture
- Survey topics
- General questions about data archiving practice
- Specific questions about parcels, street centerlines, jurisdictional boundaries, and zoning
- Survey subjects
- All 100 counties and 25 municipalities
- 58% response rate
- Survey conducted September 2006
22. NC County/Municipal Agency Frequency of Capture: Parcel Data
Based on a percentage of the respondents that
indicate they actually archive some data
23. Project Status
What About Commercial Data?
Cultivating a commercial market for older data.
Part of permanent access is marketing,
advertising, and putting older data into the path
of the user
24. New Challenges: Platial vs. Spatial Imagery
- Mobile, LBS, and social networking applications drive demand for place-based data
- Example sources
- Oblique Imagery
- Street-view Imagery (e.g., A9.com)
- Transportation Dept. Videologs
- Long-term cultural heritage value in non-overhead imagery: more descriptive of place and function
25. New Challenges: Ajax Applications, Google Earth and All That
- Emerging online environments are increasingly used to make decisions; how are these decisions documented?
- How far will KML go?
- Temporal component in emerging tiling and caching standards?
26.
- Web mashup interactions with existing systems spur creation of intermediate content layers, e.g., tiling and caching of WMS services
- Identification of a standard tiling scheme may create a new preservation opportunity (temporal axis on caches?)
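One way to picture a "temporal axis on caches" is a tile address that carries a capture date alongside the usual zoom/column/row. The sketch below assumes the Web Mercator XYZ tiling commonly used by web mapping applications; the slides do not name a scheme, and the key layout, layer name, and function names are hypothetical.

```python
import math
from datetime import date


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile containing a WGS84 point.

    Uses the Web Mercator XYZ convention (origin at top-left); this
    choice of scheme is an assumption made for illustration.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def temporal_tile_key(layer, captured, zoom, x, y):
    """Build a cache key with the capture date as an extra axis."""
    return f"{layer}/{captured.isoformat()}/{zoom}/{x}/{y}.png"


x, y = lonlat_to_tile(0.0, 0.0, 1)
print(temporal_tile_key("ortho", date(2006, 10, 27), 1, x, y))
# prints "ortho/2006-10-27/1/1/1.png"
```

With a shared tiling scheme, snapshots taken on different dates line up tile for tile, so an archive could retain dated copies of only those tiles that changed.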
27. Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
ph: (919) 515-1361
Steven_Morris_at_ncsu.edu
http://www.lib.ncsu.edu/ncgdap