Title: Persistent Digital Archives and Library System PeDALS


1
Persistent Digital Archives and Library System
(PeDALS)
  • South Carolina Information Technology Directors
    Association
  • September 8, 2008
  • Bill Henry, Matt Guzzi
  • SC Department of Archives and History

2
Background: Last Year
  • 2007 NHPRC grant proposal not funded
  • AZ Archives submitted multi-state grant proposal
    to Library of Congress
  • AZ proposal had same basic goals
  • SC too late for funding
  • Paid own expenses to join project

3
Electronic Archives Funding
  • One-time funding from General Assembly
  • Digitize paper records
  • Capture agency website snapshots
  • Purchase hardware and software
  • Library of Congress approved additional funds for
    project
  • SC now a fully-funded partner

4
What is PeDALS?
  • Persistent Digital Archives and Library System
  • Multi-state grant project funded by the Library
    of Congress and the Institute of Museum and
    Library Services
  • Five state partners: Arizona, Florida, New York,
    Wisconsin, South Carolina
  • Project will run 18-24 months; if successful,
    SCDAH intends to continue participation beyond
    this period
  • At the end of the project each partner will have
    a functioning digital archives system

5
Why is PeDALS Needed?
  • An increasing number of long-term and archival
    records are created and maintained only in
    digital formats
  • Traditional archival practices designed for paper
    records won't work in a digital environment
  • Need ability to preserve electronic records so
    that we can demonstrate authenticity and protect
    integrity
  • PeDALS is both a learning opportunity and a
    chance to implement a functioning system

6
Technical Goals
  • To develop a curatorial rationale that can be
    implemented in software to support an automated,
    integrated workflow to process collections of
    digital records
  • To build digital stacks: storage that has
    appropriate controls for preservation and
    disaster preparedness

7
Traditional Curatorial Processes for Paper Records
  • Appraisal
  • Acquisition
  • Arrangement and description
  • Housing and storage
  • Reference and access
  • Preservation

8
Curatorial Rationale for Digital Records
  • Transformation of traditional, paper-based
    practices into the digital arena
  • Focus on the rules, not the records
  • Automate the rules

9
Digital Stacks
  • More than storing the data (CD, tape, disk)
  • LOCKSS
    1. Automatic integrity checking and error detection
    2. Secure
    3. Geographically distributed

10
Additional Goals
  • To build a community of shared practice that
    meets the needs of a wide range of repositories
    - For best practices
    - For resource sharing
  • To remove barriers by keeping costs as low as
    possible

11
The Open Archival Information System (OAIS)
Reference Model
  • OAIS is an international (ISO) standard
  • Defines minimal set of responsibilities for
    long-term preservation
  • Can be applied to any information or object that
    needs to be retained long-term
  • OAIS does not specify a specific design or
    implementation
  • http://public.ccsds.org/publications/archive/650x0b1.pdf

12
View of an OAIS Environment
[Diagram: OAIS (PeDALS) at the center, interacting with Producer, Consumer, and Management]

13
PeDALS (OAIS) Functional Areas
  • Ingest
  • Archival storage
  • Data management
  • Administration
  • Preservation planning
  • Access

14
PeDALS Overview - 1
  • Agency records in an electronic records system
    are transferred via the Internet to the PeDALS
    system
  • Supplemental processing checks for file integrity
    and completeness prior to transfer

15
PeDALS Overview - 2
  • Agency records with associated metadata are
    transferred to middleware server (Microsoft
    BizTalk)
  • Rules-based software will transform records into
    format for long-term storage along with a copy
    for web access
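
A rules-based step of this kind can be pictured as a lookup from a record's properties to the copies that should be made. The sketch below is a hypothetical Python illustration, not the project's BizTalk configuration: the rule table, the format names, and the `plan_transformations` function are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical rule table: source format -> (preservation format, web-access format).
# The formats listed here are illustrative assumptions, not the PeDALS rules.
RULES = {
    "doc": ("pdfa", "pdf"),
    "tiff": ("tiff", "jpeg"),
    "csv": ("csv", "csv"),
}


@dataclass
class Plan:
    record_id: str
    preservation_format: str  # copy destined for long-term storage
    access_format: str        # copy destined for web access


def plan_transformations(record_id: str, source_format: str) -> Plan:
    """Look up the rule for a record's format and return the two copies to make."""
    key = source_format.lower()
    if key not in RULES:
        # No rule defined: surface the record for review rather than guessing.
        raise LookupError(f"no rule for format {source_format!r}")
    preservation, access = RULES[key]
    return Plan(record_id, preservation, access)
```

Keeping the rules in a table rather than in code paths reflects the "automate the rules" idea: changing curatorial policy means editing data, not rewriting the workflow.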

16
PeDALS Overview - 3
  • Records are transferred into LOCKSS servers for
    long-term preservation
  • LOCKSS is a dark archives

17
PeDALS Overview - 4
  • Public access will be provided via the web
  • Restricted records will be blocked from public
    access

18
Technology behind the South Carolina Digital
Archive
19
PeDALS Network Architecture
  • Agencies will have the ability to log in and
    upload records to the South Carolina Digital
    Archive.
  • BizTalk will check the incoming records for
    completeness and match the hash value on
    upload.
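
The completeness and hash check on upload can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BizTalk implementation: the slides do not name a hash algorithm or a manifest format, so SHA-256, the `manifest` dictionary (file name to expected hex digest), and the function names `sha256_of` and `check_upload` are all hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large records need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_upload(folder: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Return {file name: problem} for missing files and hash mismatches.

    `manifest` is an assumed structure supplied by the agency:
    file name -> expected SHA-256 hex digest.
    """
    problems = {}
    for name, expected in manifest.items():
        path = folder / name
        if not path.is_file():
            problems[name] = "missing"          # completeness check
        elif sha256_of(path) != expected.lower():
            problems[name] = "hash mismatch"    # integrity check
    return problems
```

An empty result means every file listed in the manifest arrived and matches its expected digest.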

20
Archivist Review
  • Once records are received, the Archivist will
    receive an email.
  • The files will then be reviewed and a high-level
    description will be entered in the Database
    Catalog.
  • The SIP (Submission Information Package) is
    created.

21
BizTalk
  • This is where the magic happens.

22
BizTalk Processes
  • DIP (Dissemination Information Package) created.
  • The Catalog database is updated with Access,
    Description and Preservation Information.
  • The Archival records are placed on the Manifest
    Server for Ingest into LOCKSS.
  • The public access database is updated.

23
LOCKSS (Lots of Copies Keep Stuff Safe)
  • Based at Stanford University.
  • LOCKSS has primarily been used for scientific
    journals and publications.
  • Open source; uses OpenBSD, a multi-platform
    4.4BSD-based UNIX-like operating system.

24
LOCKSS
  • Boots from CD; no operating system is installed
    on the server.
  • Communicates using a VPN (virtual private
    network).
  • Files for LOCKSS are stored on a separate Admin
    server running Linux.
  • 1 LOCKSS cluster with 7 servers in our private
    distributed LOCKSS network.
  • Initially set up to take in 1 TB of data; can be
    expanded.

25
LOCKSS Storage
  • Dark secure archival storage
  • LOCKSS is a sophisticated data storage system
    that scans for and repairs file corruption and
    other data integrity problems
  • Level 4 firewalls and geographic distribution
    provide added security
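
The idea of scanning for and repairing corruption across several copies can be illustrated with a simple majority vote over replica hashes. The Python sketch below is a deliberately simplified assumption, not the LOCKSS polling protocol: the `audit_and_repair` function, the in-memory `replicas` dictionary, and the use of SHA-256 are all hypothetical.

```python
import hashlib
from collections import Counter


def audit_and_repair(replicas: dict[str, bytes]) -> list[str]:
    """Compare replica hashes and overwrite minority copies with the majority copy.

    `replicas` maps a node name to that node's copy of one record.
    Returns the names of the nodes whose copy was repaired.
    """
    if not replicas:
        return []
    digests = {name: hashlib.sha256(data).hexdigest()
               for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) // 2:
        # Without a strict majority there is no trustworthy copy to repair from.
        raise RuntimeError("no majority agreement; manual review needed")
    good_copy = next(replicas[n] for n, d in digests.items() if d == winner)
    repaired = [n for n, d in digests.items() if d != winner]
    for name in repaired:
        replicas[name] = good_copy
    return repaired
```

With seven nodes, as in the cluster described above, up to three damaged copies could be outvoted and restored in this simplified model.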

26
Public Access Process
  • BizTalk Process - AIP (Archival Information
    Package).
  • This process moves records from LOCKSS to the
    Public Access web server based on the record
    access date.
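
Releasing records by access date amounts to a date comparison. A minimal Python sketch follows, assuming each record carries a single `access_date` field; the `Record` class and the `releasable` function are hypothetical names, not part of the PeDALS design.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    record_id: str
    access_date: date  # first day the record may be shown to the public


def releasable(records: list[Record], today: date) -> list[Record]:
    """Return the records whose access date has arrived; the rest stay dark."""
    return [r for r in records if r.access_date <= today]
```

Run on a schedule, a check like this would select only the records whose restriction period has ended for copying to the public web server.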

27
PeDALS Network Architecture
  • Web server will provide Internet access to
    records through a web-based search interface.
  • Access to records restricted by statute or
    otherwise will be blocked during restriction
    period.
  • Restricted records are held in the LOCKSS dark
    archive; no user copy is sent to the web server
    until public access is allowed.

28
Future Public Access
  • We are currently in the process of implementing
    the web component of Rediscovery.
  • This will allow the public to search our
    holdings.
  • We are hoping to use BizTalk to automatically
    populate the Rediscovery catalog.
  • Public access will be granted through URLs to the
    Rediscovery web component.

29
PeDALS Open Archival Information System (OAIS)
Network Architecture
30
Records Eligible for PeDALS
  • Permanently valuable electronic records scheduled
    for transfer to the SCDAH
  • Pilot project agencies and records:
  • Judicial Department: Supreme Court Case Files
  • Election Commission: Voter Registration
    Master Files
  • Public Service Commission: Orders
  • DHEC: Electronic Index to Death Certificates

31
Project Status
  • Core metadata defined and data dictionary
    completed
  • System design completed
  • Hardware and software acquired and installed
  • Agency partners and records identified
  • System prototype built (AZ and SC)
  • BizTalk training completed

32
On the Horizon
  • Other states purchase and configure hardware
    and software
  • First ingest of records in early winter
  • Develop public search website

33
Post-Grant
  • Move from pilot to production mode
  • Develop procedures for agency participation
  • Expand participation to additional agencies and
    records

34
PeDALS
  • Bill Henry
  • Electronic Records Consultant
  • henry_at_scdah.state.sc.us
  • (803) 896-6137
  • Matt Guzzi
  • Electronic Records Archivist
  • guzzi_at_scdah.state.sc.us
  • (803) 896-6103
