Transcript and Presenter's Notes

Title: Dr. Martin Halbert


1
Comparison of Strategies and Policies for
Building Distributed Digital Preservation
Infrastructure: Initial Findings from the
MetaArchive Cooperative
  • Dr. Martin Halbert
  • MetaArchive Cooperative
  • Wednesday, December 3, 2008
  • International Digital Curation Conference
  • Edinburgh, Scotland

2
Overview
  • Needs of cultural memory organizations (CMO) for
    digital preservation infrastructure that led to
    creation of MetaArchive
  • Framing comparison of some major digital
    preservation efforts and service offerings
  • Common distributed digital preservation (DDP)
    strategies
  • Findings from the MetaArchive Cooperative about
    DDP cooperatives

3
Cultural Memory Organizations (CMOs)
  • Small to medium-sized libraries
  • Small research institutes
  • Historical associations
  • Archives
  • Museums
  • NOT enormous national agencies (US LoC, UK BL)
  • Organizations responsible for institutional
    memory / research assets of their communities
  • Culture here means any resource of primary
    research value, for humanities, science, or other
    scholarship

4
Gaps in Digital Preservation Efforts
  • 66% of cultural heritage institutions (academic
    libraries, archives, art museums, public
    libraries, and other similar kinds of
    institutions) report that no one is responsible
    for digital preservation activities
  • 30% of all archives have been backed up one time
    or not at all

Source: 2005 NEDCC Survey by Bishoff and Clareson

5
The Problem
  • CMOs are rapidly digitizing or acquiring local
    digital archives with long term value for both
    scholarly and public research purposes
  • Yet CMO professionals most often lack affordable
    and scalable DP infrastructures
  • This lack of access to effective means for long
    term preservation of digital content is
    aggravated by a lack of consensus on DP issues
    and professional roles and responsibilities

6
Digital Curation/Preservation: An Emerging Field
  • Historically CMOs have been responsible for
    preservation of institutional memory
  • CMO administrators and funders are uncertain
    about how to carry out these responsibilities in
    the digital age
  • No consensus in CMOs on roles, best practices, or
    priorities in digital preservation
  • Many competing frameworks and assumptions brought
    forward from external groups and practitioners
    seeking to create this new field

7
What led to MetaArchive?
  • Planning meetings by a group of US librarians and
    archivists in 2002-2003 on concerns about
    preserving digital archives
  • Felt that we needed to do something practical to
    help each other preserve our data
  • Not based on studies, just the observation of our
    anxieties about doing something together to keep
    our (expensive) digital materials preserved and
    viable

8
The Need for Collaborative Approaches
  • The increased number and diversity of those
    concerned with digital preservation, coupled with
    the current general scarcity of resources for
    preservation infrastructure, suggests that new
    collaborative relationships that cross
    institutional and sector boundaries could provide
    important and promising ways to deal with the
    data preservation challenge.  These
    collaborations could potentially help spread the
    burden of preservation, create economies of scale
    needed to support it, and mitigate the risks of
    data loss.
  • Source: The Need for Formalized Trust in Digital
    Repository Collaborative Infrastructure,
    NSF/JISC Repositories Workshop (April 16, 2007)

9
MetaArchive
  • A distributed digital preservation cooperative
    for digital archives
  • Established in 2003 under the auspices of and
    with funding from the National Digital
    Information Infrastructure and Preservation
    Program (NDIIPP) of the US Library of Congress
  • A functioning DDP network using/building open
    source software
  • Organized as an incorporated nonprofit
    cooperative of libraries and other cultural
    memory organizations
  • Sustained by institutional membership fees, a
    cooperative agreement with the US LoC, and other
    sponsored funding
  • Provides training and models for other groups to
    establish similar distributed digital
    preservation networks
  • Fosters broader awareness of digital preservation
    issues
  • Designed to address in-the-trenches needs of
    CMOs after environmental scans of other options

10
Comparison of Selected Digital Preservation
Efforts
  • National Scientific Research Agency Efforts
      • PubMed Central Efforts in US and UK
      • Social Science Dataset Archives (UK DA, US ICPSR)
      • Big-Science Agency Efforts (UKRDS, NSF DataNET)
  • Cross-Disciplinary National Efforts
      • US NDIIPP
      • UK PLANETS
  • Non-Governmental E-Journal DP Efforts
      • LOCKSS
      • Portico

11
Differences and Variations
  • Variation evident in understanding of what
    constitutes digital curation/preservation (scope,
    practices, priorities)
  • Relative differences in prescriptivity and degree
    of centralization (top-down vs. bottom-up
    planning) between UK and US
  • Many specific differences in preservation and
    access aims and technologies

12
Similar Patterns
  • Emphasis on collaboration between groups to
    accomplish digital curation/preservation
  • Exploration of new professional roles, expertise,
    models, and best practices
  • Virtually all efforts examined embrace
    distributed digital preservation strategies
  • Most programs (then and now) do not directly
    address the needs of CMOs

13
Distributed Digital Preservation Strategies
  • Digital curation/preservation starts with secure
    and distributed bit-preservation plus good metadata
  • Technology for secure replication: many good DDP
    options (we use a private LOCKSS network)
  • Collaboration for digital curation/preservation:
    provides a framework for systematically exploring
    new data curation lifecycle roles for CMOs to
    carry out their core responsibility for curating
    institutional memory materials
  • Cooperative strategies for sustaining distributed
    digital preservation infrastructures

14
MetaArchive Phase I (2004-2007)
  • Developed a functioning network for distributed
    digital preservation (DDP) used by institutions
    with shared subject domain focus for mutual
    benefit
  • Developed this technical solution for DDP based
    on a reuse of LOCKSS technology, in the form of a
    separate network with higher capacity nodes
  • Created a conspectus database to capture
    collection-level preservation metadata pre-ingest
  • Created an administrative nonprofit corporation
    as an independent legal entity for membership
    agreements
  • Now preserving via DDP more than 650 collections
    from many different organizations

15
Collection Variety
  • Collections include:
      • Images
      • Text files
      • Multimedia files
      • Datasets
      • Program executables

16
MetaArchive Membership
  • 11 institutions currently
  • Emory, GA Tech, Auburn, VA Tech, FSU, Louisville,
    Hull, Rice, Boston College, Folger, and US
    Library of Congress
  • Membership doubled within the past year; plan to
    double again in the next 12 months
  • Now undertaking strategic alliances with other
    membership organizations to provide DDP services
    (NDLTD)

17
Catalytic Efforts
  • Host workshops in distributed digital
    preservation strategies
  • Instruct new MetaArchive members in network
    processes
  • Advise other groups considering DDP approaches
  • Advised/assisted in creation of two additional
    DDPNs
      • Alabama
      • Arizona

18
MetaArchive Phase II (2007-2010)
  • Established additional distributed archives
      • African Diaspora
      • Electronic Theses and Dissertations
      • Early modern literature
  • New software tools for enhanced conspectus,
    interoperability with grid-computing, format
    migration services
  • Became international with addition of Hull
    University in UK
  • Upcoming DDP workshops
  • Plan to double in size each year (on average)
    for this period, to reach a robust cooperative
    size
  • With funding from NHPRC, will provide consulting
    and outreach services on the MetaArchive model
    for DDP services

19
Membership Levels
  • Contributing Member Sites are institutions that
    need to preserve digital content, and therefore
    decide to contribute digital content into the
    preservation network. The preservation network
    acts for the common good to preserve the at-risk
    content submitted by the contributing sites.
    Contributing sites may also be preservation
    sites.
  • Preservation Member Sites are responsible for the
    basic ongoing activity of preserving digital
    content. At a minimum, every preservation site
    must include responsible staff and a node server
    of the relevant preservation network.
    Preservation sites collectively comprise a
    preservation network.
  • Sustaining Member Sites are responsible for the
    steering committee of the cooperative and for
    technical development of the computer systems
    that enable the preservation network. Sustaining
    sites may also be preservation sites and/or
    contributing sites.
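
Because one site can hold several membership levels at once, the levels above behave like combinable flags. The sketch below is one possible way to model that, assuming only what the slide states: a preservation site must have responsible staff and a node server. The class names, field names, and the example site are hypothetical and are not taken from MetaArchive's systems.

```python
from dataclasses import dataclass
from enum import Flag, auto


class SiteRole(Flag):
    """Membership levels; a site may combine them with the | operator."""
    CONTRIBUTING = auto()
    PRESERVATION = auto()
    SUSTAINING = auto()


@dataclass
class MemberSite:
    name: str
    roles: SiteRole
    has_node_server: bool = False
    has_responsible_staff: bool = False

    def problems(self) -> list[str]:
        """List unmet minimum requirements for the roles this site holds."""
        issues = []
        if SiteRole.PRESERVATION in self.roles:
            if not self.has_node_server:
                issues.append("preservation site needs a node server")
            if not self.has_responsible_staff:
                issues.append("preservation site needs responsible staff")
        return issues


# A hypothetical site that both contributes content and preserves it.
site = MemberSite(
    name="Example University",
    roles=SiteRole.CONTRIBUTING | SiteRole.PRESERVATION,
    has_node_server=True,
)
print(site.problems())  # ['preservation site needs responsible staff']
```

Using a flag type rather than a single-valued field keeps the "may also be" overlaps from the slide explicit in the data.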

20
Individual Roles
  • Program Managers are leaders that accept
    responsibility for coordinating the activities of
    a digital preservation network.
  • Data Wranglers are programmers and other
    technically adept workers that prepare local
    digital archives for ingestion into a
    preservation network.
  • System Administrators are staff members that
    maintain individual preservation node servers of
    the relevant preservation network.
  • Selectors are staff that identify and prioritize
    content to be preserved. They will most often be
    knowledgeable concerning the content of an
    institution's digital archives, and may have been
    the same individuals that originally created or
    acquired the archives.

21
Findings: Why DDP Cooperatives?
  • Enables collaborative pooling of resources
    (staff, expertise areas, technology,
    infrastructure, funds)
  • Also allows institutions to retain ownership
    individually of their part of the infrastructure,
    expertise, and operations
  • Defuses competitive jockeying between CMOs: no
    one institution is the primary leader to which
    the others sign agreements
  • Allows for decentered ongoing operations as
    individual institutions may join or leave
  • Flexible cooperatives can be assembled quickly
    without onerous new overhead, by leveraging sunk
    costs in existing institutions
  • Nonprofit organization promotes trust by other
    institutions from public sector

22
Questions and Answers
  • Some contacts
  • Martin Halbert (MetaArchive President, Emory
    representative) mhalber_at_emory.edu
  • Tyler Walters (MetaArchive Treasurer, GA Tech
    representative) tyler.walters_at_library.gatech.edu
  • Katherine Skinner (MetaArchive Executive
    Director) kskinne_at_emory.edu
  • Martha Anderson (LoC Program Officer)
    mande_at_loc.gov