Building the Archives of the Future: NARA's Electronic Records Archives Program

1
Building the Archives of the Future: NARA's
Electronic Records Archives Program
Managing Electronic Records
Kenneth Thibodeau, Director, ERA Program, National
Archives and Records Administration
Richard Marciano, Research Scientist, San Diego
Supercomputer Center
3 October 2000
2
IT'S OUR FUTURE
  • - John Carlin, Archivist of the United States

3
Why do we need an Electronic Records Archives?
  • The conduct of business is increasingly enabled
    by, and dependent on, digital computer and
    communications technologies.
  • The records that are being created in this
    environment are increasingly electronic.
  • Many of these records cannot be expressed in
    non-electronic form
  • Digital technology is both necessary and
    advantageous for discovering and delivering
    information

4
Technical Challenges in Building ERA
  • Overcome technological obsolescence in a way that
    enables the preservation of demonstrably
    authentic records.
  • Find ways to take advantage of continuing
    progress in information technology in order to
    maintain and improve customer service
  • Build solutions that recognize that today's
    progress is tomorrow's obsolescence

5
What is the Electronic Records Archives?
The Electronic Records Archives is a
comprehensive, systematic, and dynamic means of
accomplishing the archival work that must be done
to provide continuing access to authentic
electronic records over time.
6
Archival Business Model Context
The Life Cycle of Records
7
Digital Preservation Strategies
  • Maintain original technology
  • Imitate original technology
  • Re-engineer software
  • Migrate data formats
  • Standardize data formats

Original Technology
State-of-the-art Technology
  • Collection-based Persistent Object Preservation

8
Collection-Based Persistent Object
Preservation: Method
  • Create meta-data models
  • the internal components of objects
  • the sequence of components within objects
  • the attributes of presentation of preserved
    objects
  • Apply models by marking up objects
  • Express links among records and collections as
    persistent data values
  • Define the semantics of components
  • Preserve the models, the transformed records and
    procedures to apply the models.
  • Provide rich, comprehensive and flexible
    meta-data management for discovery, retrieval,
    and preservation
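
The method above can be sketched in a few lines. This is only an illustration, assuming XML as the markup language and Python's standard xml.etree.ElementTree module; the e-mail model, the element names, and the sample record are hypothetical and are not taken from the ERA prototype.

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-data model for an e-mail record: the components in the
# sequence they must appear, each with a presentation attribute.
EMAIL_MODEL = [
    ("sender", {"display": "header"}),
    ("date", {"display": "header"}),
    ("subject", {"display": "header"}),
    ("body", {"display": "block"}),
]


def mark_up(record: dict, collection_id: str) -> ET.Element:
    """Apply the model to a raw record by tagging each component."""
    # The link from record to collection is stored as a plain data value.
    root = ET.Element("record", {"collection": collection_id})
    for name, presentation in EMAIL_MODEL:
        child = ET.SubElement(root, name, presentation)
        child.text = record[name]
    return root


def conforms(element: ET.Element) -> bool:
    """Check that the components appear in the sequence the model prescribes."""
    return [child.tag for child in element] == [name for name, _ in EMAIL_MODEL]


raw = {
    "sender": "archivist@example.gov",
    "date": "2000-10-03",
    "subject": "Transfer schedule",
    "body": "The records are ready for transfer.",
}
marked = mark_up(raw, collection_id="demo-collection-1")
xml_text = ET.tostring(marked, encoding="unicode")
```

Because the model is itself plain data, it can be preserved next to the marked-up records and reapplied later by whatever software is current at the time.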

9
Open Archival Information System
10
Persistent Object Preservation: Implementation
  • Comprehensive
  • All types of computer applications
  • All types of electronic records
  • Collections as well as individual records
  • All required archival processes
  • Infrastructure Independence
  • Objects and Collections of Objects
  • Enable replacement of any component
  • Scalable
  • Up to >> 100,000,000 objects
  • Down for small collections and institutions
  • Metacomputing - over the Internet
  • Extensible over the Records Lifecycle

11
Basic Process: Ingest
Archival Information Package
Submission Information Package
12
Basic Process: Access
Retrieve Records
Dissemination Information Package
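
The two basic processes can be read as a pair of transformations between the three OAIS package types. The sketch below is a simplified illustration, not the ERA design: the class and field names are hypothetical, and the SHA-256 fixity check is an assumption added to show one way an archive could confirm that a record has not changed between ingest and access.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SubmissionPackage:        # SIP: what the producer transfers
    record_id: str
    content: bytes
    producer: str


@dataclass
class ArchivalPackage:          # AIP: what the archive preserves
    record_id: str
    content: bytes
    metadata: dict = field(default_factory=dict)


@dataclass
class DisseminationPackage:     # DIP: what the consumer receives
    record_id: str
    content: bytes
    provenance: str


class Archive:
    def __init__(self):
        self._store = {}

    def ingest(self, sip: SubmissionPackage) -> ArchivalPackage:
        """Turn a SIP into an AIP, recording provenance and a fixity value."""
        fixity = hashlib.sha256(sip.content).hexdigest()
        aip = ArchivalPackage(
            sip.record_id, sip.content,
            {"producer": sip.producer, "sha256": fixity},
        )
        self._store[sip.record_id] = aip
        return aip

    def access(self, record_id: str) -> DisseminationPackage:
        """Retrieve a record and deliver it as a DIP after a fixity check."""
        aip = self._store[record_id]
        if hashlib.sha256(aip.content).hexdigest() != aip.metadata["sha256"]:
            raise ValueError(f"fixity check failed for {record_id}")
        return DisseminationPackage(record_id, aip.content, aip.metadata["producer"])


archive = Archive()
archive.ingest(SubmissionPackage("r1", b"hello", "Agency X"))
dip = archive.access("r1")
```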
13
ERA Design Strategy
14
ERA Concept model
15
Technology DETOUR: Persistent Archive
Infrastructure
  • Data object management
  • Ability to work with multiple types of storage
    systems, across separate administration domains
  • Richard Fisher [4]: Legacy Data Base
    Archiving
  • - data records/objects
  • - conceptual Approach for Data Archives where a
    technology neutral format (XML) is used and where
    through a reverse process one can restore a
    collection and query it (data records, metadata,
    audit trails)
  • Information management
  • Ability to define a collection independent of
    database choice
  • Ability to migrate collection onto new databases
  • Jeff Rothenberg [3]: Digital Records Last
    Forever, or Five Years, Whichever Comes First
  • - encapsulated document and metadata
  • Gregory Hunter, Charles Dollar [13]: Strategies &
    Best Practices for Managing the Storage &
    Preservation of E-Records
  • PRINCIPLE 8: Encapsulated Electronic Records
  • Store raw data, processed data, analysis
    parameters, correspondence, and metadata as a
    single physical entity
  • Use XML-based software to define the components
    of the electronic wrapper, including indexing
    terms for retrieval
  • Knowledge management
  • Mark Gilbert [2]: Content Management, XML &
    Records Management
  • - knowledge map technology for navigation
  • Julie Gable [8]: Document Management Update,
    What's New, What's Hot and What to be Wary
    About
  • - from DM to KM
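
The legacy database idea on this slide (export to a technology-neutral XML format, then reverse the process to restore a collection and query it) can be sketched as follows. This is a minimal illustration using Python's standard sqlite3 and xml.etree.ElementTree modules; the XML layout, the function names, and the sample table are hypothetical. It assumes a trusted table name, at least one row, and no empty values, and it restores every value as text.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table(conn, table):
    """Dump one table (column names and rows) to an XML string."""
    cursor = conn.execute(f"SELECT * FROM {table}")  # table name is trusted here
    columns = [description[0] for description in cursor.description]
    root = ET.Element("collection", {"name": table})
    for row in cursor:
        record = ET.SubElement(root, "record")
        for column, value in zip(columns, row):
            ET.SubElement(record, "field", {"name": column}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def restore_table(xml_text):
    """Reverse process: rebuild a queryable table from the XML alone."""
    root = ET.fromstring(xml_text)
    table = root.get("name")
    records = root.findall("record")
    columns = [f.get("name") for f in records[0].findall("field")]
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    for record in records:
        values = [f.text for f in record.findall("field")]  # all values are text
        conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", values)
    return conn


# A stand-in for the legacy system.
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE permits (id, holder)")
legacy.executemany("INSERT INTO permits VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

archived_xml = export_table(legacy, "permits")
restored = restore_table(archived_xml)
holders = [row[0] for row in restored.execute(
    "SELECT holder FROM permits WHERE id = '2'")]
```

Only the XML string has to survive; the database that produced it and the database that later loads it need not be the same product.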

16
ERA Synergy & Beyond
  • A uniform architecture is emerging across
  • persistent archives (NARA)
  • digital libraries (NSF)
  • NSF -- DLI2, National SMET Education Digital
    Library
  • -- NPACI data grid for neuroscience brain image
    federation
  • grid development (DOE, NASA, NLM)
  • DOE -- ASCI Data Visualization Corridor remote
    data processing
  • -- Particle Physics Data Grid object
    replication
  • NASA -- Information Power Grid distributed data
    processing
  • NLM -- Digital Embryo Project data grid for
    image processing and storage

17
ERA Research Benefits
  • Validation mechanism for the
  • common data management architecture
  • differentiation between knowledge, information,
    and data and the choice of representation
    standards
  • Integration vehicle for tying together
  • persistent archives with grid environments
  • grid environments with digital libraries
  • digital libraries with persistent archives

18
Knowledge-based Persistent Archives
Ingest
Manage
Access
(Topic Maps / Model-based Access)
→ 9 SLIDES
(Data Handling System - Storage Resource Broker)
→ 3 SLIDES
19
Data Handling System (1/3): Storage Resource
Broker & Meta-data Catalog
Application
Resource
Third-party copy
User
Remote Proxies
MCAT
Dublin Core
DataCutter
Application Meta-data
20
Collection Based Access (2/3)
  • Abstract data set naming and administration away
    from physical storage resource
  • Data sets defined by attributes
  • Logical collection used to group data sets across
    storage systems
  • Enables support for replication of data
  • Collection owned data
  • Authentication controlled by data handling system
  • Persistence controlled by data handling system
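
A small sketch may make the separation of logical names from physical storage concrete. It does not use the Storage Resource Broker API; the class names, attribute names, and storage locations are hypothetical and only illustrate the idea of a logical collection that groups data sets by attributes and tracks their replicas.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    logical_name: str
    attributes: dict
    replicas: list = field(default_factory=list)  # physical locations


class LogicalCollection:
    """Groups data sets by logical name, independent of where they are stored."""

    def __init__(self, name):
        self.name = name
        self._data_sets = {}

    def register(self, logical_name, attributes, location):
        """Record a data set, or add a replica if the name is already known."""
        data_set = self._data_sets.setdefault(
            logical_name, DataSet(logical_name, dict(attributes)))
        data_set.replicas.append(location)

    def find(self, **criteria):
        """Return logical names of data sets whose attributes match."""
        return sorted(
            name for name, data_set in self._data_sets.items()
            if all(data_set.attributes.get(k) == v for k, v in criteria.items())
        )

    def locate(self, logical_name):
        """Resolve a logical name to every physical replica."""
        return list(self._data_sets[logical_name].replicas)


collection = LogicalCollection("demo")
attrs = {"agency": "Census", "year": 1990}
# Hypothetical locations on two storage systems in different sites.
collection.register("census/1990/summary", attrs, "hpss://site-a/vol7/f001")
collection.register("census/1990/summary", attrs, "unix://site-b/data/f001")
collection.register("patents/1999/index", {"agency": "USPTO", "year": 1999},
                    "unix://site-b/data/f002")

matches = collection.find(agency="Census")
replicas = collection.locate("census/1990/summary")
```

Users query by attribute and never see a file path; moving or replicating a data set changes only the replica list.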

21
SRB Containers (3/3): Managing Archive Latency
SRB client
  • Create container in a logical storage resource
    containing at least one cacheable resource
  • Create objects in containers
  • Cache daemon will move filled containers to
    archive
  • synch and purge APIs

SRB Server
UNIX
HPSS
HPSS
container
Distributed Storage Resources
cached containers
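
The container workflow on this slide can be imitated with a toy model. This is not the SRB client API: the class names, the byte-count capacity, and the in-memory "cache" and "archive" dictionaries are all assumptions made for illustration. The point is that many small objects are written to a cache, bundled into containers, and moved to the slow archive as whole units.

```python
class Container:
    """A named bundle of small objects that is stored as a single unit."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def size(self):
        return sum(len(data) for data in self.objects.values())


class ContainerStore:
    def __init__(self, capacity):
        self.capacity = capacity    # bytes per container (assumed limit)
        self.cache = {}             # stands in for the cacheable resource
        self.archive = {}           # stands in for the archival resource
        self._counter = 0
        self._open = self._new_container()

    def _new_container(self):
        self._counter += 1
        container = Container(f"container-{self._counter}")
        self.cache[container.name] = container
        return container

    def put(self, key, data):
        """Create an object in the open container; start a new one when full."""
        if self._open.objects and self._open.size() + len(data) > self.capacity:
            self._open = self._new_container()
        self._open.objects[key] = data
        return self._open.name

    def sync(self, name):
        """Copy a cached container to the archive."""
        self.archive[name] = dict(self.cache[name].objects)

    def purge(self, name):
        """Drop a container from the cache once an archived copy exists."""
        if name not in self.archive:
            raise ValueError(f"{name} has not been synced to the archive")
        del self.cache[name]

    def run_cache_daemon(self):
        """Move every filled (no longer open) container to the archive."""
        filled = [name for name in self.cache if name != self._open.name]
        for name in filled:
            self.sync(name)
            self.purge(name)
        return filled


store = ContainerStore(capacity=10)
store.put("a.txt", b"12345")
store.put("b.txt", b"1234")
store.put("c.txt", b"123456")   # does not fit, so a second container is opened
moved = store.run_cache_daemon()
```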
22
Knowledge-based Persistent Archives
Ingest
Manage
Access
(Topic Maps / Model-based Access)
→ 9 SLIDES
(Data Handling System - Storage Resource Broker)
→ 3 SLIDES
23
Knowledge-based Access (1/9)
  • The relationships between knowledge and
    information layers define
  • Rules that can be applied to the collection
  • Rules for defining collection attributes
  • Rules for organizing attributes into a schema
  • Rules for feature extraction
  • Relationships that quantify associations
  • Organization of concepts into topic maps
  • Ontology mapping between concept maps
  • Mapping of concepts to collection attributes
  • Etc.

24
Knowledge Standards (2/9) using TOPIC MAPS
ISO/IEC 13250 (Jan. 2000): Bridging knowledge
representation & information management
  • STANDARD FOR
  • describing knowledge structures
  • associating them with information resources
  • solution for organizing and navigating large
    information pools
  • XTM SPECIFICATION

25
TOPIC MAPS (3/9)
  • Paradigm for K. navigation & synthesis
  • Concept of creating style sheets for K.-based
    information access and navigation
  • TMs define semantically customized views

26
The TAO of Topic Maps (4/9) (from XML Europe 2000
papers): NEXT 4 SLIDES
T is for Topic
Topics
Topic types
Topic names
27
The TAO of Topic Maps (cont.) (5/9)
A is for Association
Topic associations
Association types
28
The TAO of Topic Maps (cont.) (6/9)
O is for Occurrence
Occurrences
Occurrence Roles
29
The TAO of Topic Maps (cont.) (7/9)
→ Independence of topic associations & topic
occurrences (information resources)
Topic maps as portable semantic networks
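
The three TAO building blocks map naturally onto a small data structure. The sketch below is a hypothetical in-memory model, not an implementation of ISO/IEC 13250 or XTM; the class names, the "run-by" association type, and the sample topics are chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    topic_type: str
    # Occurrences point at information resources: (role, resource address).
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    assoc_type: str
    members: tuple   # names of the topics taking part


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, topic_type):
        self.topics[name] = Topic(name, topic_type)

    def add_occurrence(self, topic_name, role, resource):
        self.topics[topic_name].occurrences.append((role, resource))

    def associate(self, assoc_type, *members):
        self.associations.append(Association(assoc_type, tuple(members)))

    def related(self, name, assoc_type):
        """Topics linked to `name` by associations of the given type."""
        found = set()
        for association in self.associations:
            if association.assoc_type == assoc_type and name in association.members:
                found.update(m for m in association.members if m != name)
        return sorted(found)


tm = TopicMap()
tm.add_topic("ERA", "program")
tm.add_topic("NARA", "agency")
tm.associate("run-by", "ERA", "NARA")
tm.add_occurrence("NARA", "web site", "http://www.nara.gov")

linked = tm.related("ERA", "run-by")
```

Associations refer only to topics, never to occurrences, so the same network of topics can be carried over to a different pool of information resources by replacing the occurrence lists.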
30
Model-Based Archival Collection Management (8/9)
31
Towards a Model-based ERA? (9/9)
  • Using XML (XML, DTD, TM, ...)
  • Introducing rules (e.g. retention schedule rule)
  • Inference rules → to derive implicit knowledge
  • Validation rules → to express constraints
  • Presentation rules → style sheets / views
  • Archiving rules & models
  • Migrating collections & models: restoring a
    collection and querying it!
  • END OF DETOUR: Back to the ERA Infrastructure
    Concept!
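
The rule types listed on this slide can be illustrated with plain functions over a record. Everything here is hypothetical: the required fields, the record types, and the retention periods are invented for the example and do not reflect any actual NARA schedule.

```python
from datetime import date

REQUIRED_FIELDS = ("id", "record_type", "created")

# Hypothetical retention schedule: years to keep, or None for permanent.
RETENTION_YEARS = {"routine correspondence": 3, "treaty": None}


def validation_errors(record):
    """Validation rule: express constraints a record must satisfy."""
    return [f"missing {name}" for name in REQUIRED_FIELDS if name not in record]


def infer(record):
    """Inference rule: derive a fact that is implicit in the record type."""
    derived = dict(record)
    derived["permanent"] = RETENTION_YEARS[record["record_type"]] is None
    return derived


def disposition(record, today):
    """Retention schedule rule: decide what may happen to the record today."""
    years = RETENTION_YEARS[record["record_type"]]
    if years is None:
        return "retain permanently"
    created = record["created"]
    # Compare (year, month, day) tuples to avoid building an invalid date.
    cutoff = (created.year + years, created.month, created.day)
    reached = (today.year, today.month, today.day) >= cutoff
    return "eligible for disposal" if reached else "retain"


memo = {"id": "m-1", "record_type": "routine correspondence",
        "created": date(1995, 6, 1)}
treaty = {"id": "t-1", "record_type": "treaty", "created": date(1950, 1, 1)}
today = date(2000, 10, 3)

memo_status = disposition(memo, today)
treaty_status = disposition(treaty, today)
problems = validation_errors({"id": "x"})
```

Keeping the rules as data and small functions, separate from the records, is what lets them be archived and migrated along with the collection.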

32
(No Transcript)
33
Getting to ERA
  • Build on core technologies of the emerging
    National Information Infrastructure
  • Leverage efforts in the physical sciences, life
    sciences, spatial data, digital government, and
    digital library communities
  • Develop the Information Management Architecture
    for digital archives
  • Articulate and refine the archival business model

34
Partnerships
  • ISO draft Model of Open Archival Information
    System
  • NASA/Consultative Committee on Space Data
    Systems
  • International research on Permanent Authentic
    Records in Electronic Systems (InterPARES)
  • 7 international research teams, 10 national
    archives
  • Intelligent processing of electronic records
  • Army Research Laboratory, Georgia Tech Research
    Institute
  • Distributed Object Computation Testbed
  • Defense Advanced Research Projects Agency, U.S.
    Patent and Trademark Office
  • National Partnership for Advanced Computational
    Infrastructure
  • National Science Foundation
  • Archivists Workbench
  • NHPRC Grant to San Diego Supercomputer Center

35
How do these activities fit together?
  • OAIS Model: High level framework for entities,
    functions, data flows
  • InterPARES: Archival requirements, electronic
    records typology, preservation model, best
    practices
  • Intelligent processing: Tool sets for archival
    processes
  • DOCT: Persistent Object Preservation
  • NPACI: Core technologies for ERA
  • Archivists Workbench: Scale ERA for smaller
    archives

36
What have we accomplished?
  • Research prototype
  • migratable information architecture
  • scalable archive
  • Demonstrated application
  • Process from ingest through access
  • Multiple types of collections
  • Databases, e-mail, GIS, digital images, office
    automation files.
  • Experiments
  • Application of knowledge-based, natural language
    processing, and other technologies to archival
    processing of records

37
Additional Information
  • http://www.nara.gov
  • http://www.sdsc.edu/NARA
  • http://www.ces.btc.gatech.edu/research.htm
  • Digital Strategies 2000
  • National Archives at College Park, MD
  • Nov 16-17, 2000
  • http://www.nara.gov/program