Title: Long Term Archive (LTA): Improving Access to the USGS/EROS Archive
1 Long Term Archive (LTA): Improving Access to the USGS/EROS Archive
- Tim Smith, SGT Data Management Lead
- October 5, 2009
2 Long Term Archive (LTA)
- Mission
  - The LTA provides data management, access, archive, and distribution for all data sets within the USGS Historical Archives that have long-term relevance to science and other USGS missions
- External Drivers
  - Continued transition from data sales to free data delivery over the internet (hard copy distribution ended in FY2009)
- Internal Drivers
  - Improve access to the archive
  - Develop and maintain data access, preservation, and distribution infrastructure as a project-to-project service
3 What we do
- The LTA is responsible for over 70 data collections
- We operate and maintain systems that schedule, acquire, process, and archive aerial, cartographic, topographic, and satellite data
- We provide reimbursable support activities for data services and access for other federal agencies
- We provide products to the public via on-demand high-resolution scans from the film archive and enable no-charge web downloads for all LTA data collections
- We operate, develop, evolve, and manage the infrastructure required to store, access, and distribute USGS/EROS data
- We perform activities that ensure long-term preservation of the USGS/EROS data archive
4 Where we are going
- We manage, operate, and maintain photographic and digital archives and ensure long-term preservation
- We manage, engineer, operate, and maintain the Computer Room 1 Mass Storage System (SL8500)
- We create and maintain off-site backups of LTA data
- We assist other projects in transitioning selected archives to the LTA (Landsat DAAC Topo, etc.)
- We operate a Data Ingest, Management and Archive Service and offer these services to other projects
- We support USGS archive affiliation activities with the National Archives and Records Administration (NARA)
5 Concept of Operation
6 Where we are going
- Improve access to the archive
- Modernize and evolve archive access capabilities, including Earth Explorer and GloVis. We strive to make these systems more customer self-sufficient
- Continue to build data capture systems to support the high-resolution film scanning capabilities (Phoenix 5 1,000 dpi scanning back system)
- Continue to generate single frame metadata from photo indexes and map line plots for historical aerial film
7 External Interfaces
- National Park Service
  - Film digitizing, management, metadata generation, and public access
- NASA
  - EO-1 acquisition scheduling, processing, archiving, and distribution
- NOAA
  - AVHRR acquisition scheduling, data reception, processing, archiving and distribution, ground station engineering, operations, and maintenance
- US Air Force
  - Eagle Vision: commercial satellite data receipt, processing, archiving, distribution, and license management
- USGS
  - Geography Discipline: commercial satellite and aerial data receipt, processing, archiving, distribution
  - Water Resources Discipline: GOESS ground system, NWIS, and NATWEB operations and maintenance
- National Interagency Fire Center
  - AVHRR Greenness Mapping and associated processing and distribution
- Fish and Wildlife Service
  - Film digitizing, storage, and public access
8 LTA Collection Diversity
- Data Types
  - Aerial
  - Cartographic
  - Topographic
  - Satellite
- Digital and film archive media
- Static and growing collections
500 terabytes
9 End to End Data Management
- Repeatable operations functions and processes
- Evolve the EROS access, preservation, and distribution data system
- Mass Storage
- Inventory Server
- Browse Server
- Networking
- Earth Explorer
- GloVis
- TRAM
- Data Distribution
10 Data Archive
- Manage all archived data within a near-line tape storage system
- Data migrations
- Backup copy generation
- Data access
- Backup copies stored at off-site locations
- Engineering studies lead development and enhancement efforts
SUN SL8500 Mass Storage System
11 Historical Film Archive Digitization
- Digitizing film rolls and photo indexes
- Created full frame browse with USGS logo attached
- Provided on-line access
- Discovered a host of unrecorded photos (orphans)
- Archiving the medium-resolution products (captured to make a browse image) in the Mass Storage System and providing access as they become available
- 6.5 million frames digitized since Oct 2004
- SCAR, AHAP, NHAP, NAPP web-enabled for no-charge downloads
- USGS collection released in December 2008
- (National Parks medium-resolution data also released as free downloads)
12 The LTA Film Digitizing Process
Digitizing Aerial Film Archive
- Placed photo indexes on-line
- Full frame browse and metadata
- Stored files near-line
- Free FTP downloads of medium-resolution 400 dpi imagery
Digitizing Completed
- USGS/NAPP: 1.4 M frames, Sept 05
- USGS/NHAP: 511 K frames, Nov 05
- USGS/Survey: 2.9 M frames, Mar 07
- Other collects: 1.7 M frames, 2009
Process flow (labels from the slide diagram)
- TIFF image to Storage System
- Archive to Phoenix digitizing system
- JPEG browse to Web interface
- Medium-resolution digitized product to customer
- Customer request
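The per-collection frame counts on this slide can be cross-checked against the overall figure of 6.5 million frames digitized since Oct 2004. The following is a minimal sketch in Python; the dictionary and function names are illustrative and not part of any LTA system.

```python
# Frame counts per collection, as listed on the slide.
frames_digitized = {
    "USGS/NAPP": 1_400_000,
    "USGS/NHAP": 511_000,
    "USGS/Survey": 2_900_000,
    "Other collects": 1_700_000,
}


def total_frames(counts):
    """Return the total number of frames across all collections."""
    return sum(counts.values())


total = total_frames(frames_digitized)
# Express the total in millions, rounded to one decimal place.
print(f"{total:,} frames (about {total / 1e6:.1f} million)")
```

The rounded total agrees with the 6.5 million frames quoted for the digitization effort.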
13 Creating Single Frame Metadata
- Scanned photo indexes linked to individual frame metadata record
- Implemented processes and software to
  - Estimate the Lat/Long coordinates for each frame
  - Update single frame metadata record
- Working USGS historical collection
  - 68,000 indexes
  - 2,600,000 frames
  - 350,000 frames annually
- Over 1.7 million produced as of 9/30/09
Photo Index
Single Frame
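The figures on this slide allow a rough estimate of how much single frame metadata work remains. The sketch below is hedged: it assumes that the 1.7 million records produced all count against the 2,600,000-frame USGS historical collection and that the 350,000 frames-per-year rate stays constant. Neither assumption is stated on the slide, and the function name is hypothetical.

```python
def years_to_complete(total_frames, frames_done, frames_per_year):
    """Estimate the years needed to finish at a constant annual rate.

    Assumes frames_done counts against total_frames (an assumption,
    not something the slide confirms).
    """
    remaining = max(total_frames - frames_done, 0)
    return remaining / frames_per_year


# Figures taken from the slide.
estimate = years_to_complete(2_600_000, 1_700_000, 350_000)
print(f"Roughly {estimate:.1f} years of work remaining")
```

Under those assumptions, about 900,000 frames would remain, a little over two and a half years of work at the stated rate.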
14 Historical Film Scanning Capability
- On-demand high-resolution scanned products from the historical film archive based on customer requests
- Currently charge $30/file plus a $5 per order USGS fee
- Files only retained for 30 days in case there is a customer question. Declass imagery is retained as a 7 micron source for future free downloads
- Aerial scans are being saved as 25 micron (1,000 dpi) through the Phoenix 5 system
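The slide treats 25 micron and 1,000 dpi as equivalent. The relationship follows from one inch being 25,400 microns, so the two figures match after rounding. A minimal sketch of the conversion, with illustrative function names:

```python
MICRONS_PER_INCH = 25_400  # 1 inch = 25.4 mm = 25,400 microns


def microns_per_pixel(dpi):
    """Pixel spacing in microns for a scan at the given dots per inch."""
    return MICRONS_PER_INCH / dpi


def dpi_from_microns(microns):
    """Dots per inch for a scan with the given pixel spacing in microns."""
    return MICRONS_PER_INCH / microns


print(f"1,000 dpi = {microns_per_pixel(1000):.1f} micron")
print(f"7 micron  = {dpi_from_microns(7):.0f} dpi")
```

By the same conversion, the 7 micron source retained for declass imagery corresponds to roughly 3,600 dpi.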
15 New High Resolution Data Capture Capabilities
- Reuse of film advance robotics, software, and workflow
- Proven roll-to-roll advance capabilities
- Better Light Scanning Back
  - 10200 x 13600 pixels
  - 1000 dpi
  - 25 micron
  - 2 min/frame scan time
- Improved LED light source
  - 16X brighter
- Operate 6 systems 24X5 and provide output images free over the internet
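The scanning back figures imply a physical scan area and an upper bound on weekly output. The sketch below works both out from the numbers on the slide. It assumes that "24X5" means 24 hours a day, 5 days a week, and it ignores roll changes, setup, and downtime, so the throughput is a theoretical ceiling rather than a measured rate. The function names are illustrative.

```python
def scan_area_inches(width_px, height_px, dpi):
    """Physical area covered by a scan, in inches (width, height)."""
    return width_px / dpi, height_px / dpi


def max_frames_per_week(systems, hours_per_day, days_per_week, minutes_per_frame):
    """Theoretical ceiling on frames scanned per week with no idle time."""
    minutes_available = systems * hours_per_day * days_per_week * 60
    return minutes_available / minutes_per_frame


width_in, height_in = scan_area_inches(10200, 13600, 1000)
print(f"Scan area: {width_in} x {height_in} inches")

ceiling = max_frames_per_week(systems=6, hours_per_day=24,
                              days_per_week=5, minutes_per_frame=2)
print(f"Upper bound: {ceiling:,.0f} frames per week")
```

With six systems at 2 minutes per frame, the ceiling works out to 21,600 frames per week over a 10.2 x 13.6 inch scan area.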
16 Medium Res. Web Enabling
- Large volume, static collection (110 TB)
- High public interest historical photos, 1939-Present
- Small file sizes: 10 MB (B/W), 30 MB (color)
- Web-enabling through standard systems
  - Earth Explorer and also GloVis for NAPP and NHAP
  - Registration required before each downloading session (throttle)
  - Specific scene selection required for each download, guards in place to exclude scripting
- No additional processing required for distribution
  - Archive data is the distribution data
- All data stored in the Mass Storage System
  - Nearly the entire collection resides on disk (98 TB cache)
  - Remaining data available near-line from tape
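The statement that nearly the entire collection resides on disk can be quantified from the two sizes on the slide. The sketch below assumes the 98 TB cache is devoted entirely to this 110 TB collection, which the slide does not state explicitly. The function names are illustrative.

```python
def cache_fraction(collection_tb, cache_tb):
    """Fraction of the collection that fits in the disk cache (at most 1.0)."""
    return min(cache_tb / collection_tb, 1.0)


def tape_only_tb(collection_tb, cache_tb):
    """Volume in TB that must be served near-line from tape."""
    return max(collection_tb - cache_tb, 0)


on_disk = cache_fraction(110, 98)
print(f"On disk: {on_disk:.1%}")
print(f"Near-line from tape: {tape_only_tb(110, 98)} TB")
```

Under that assumption, about 89% of the collection is served from disk and about 12 TB is retrieved from tape.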
17 The LTA: an explosion of free data
18 Questions?
- http://edcsns17.cr.usgs.gov/NewEarthExplorer/
- http://earthexplorer.usgs.gov
- http://glovis.usgs.gov