IAC Digital Preservation Committee

1
IAC Digital Preservation Committee
  • 10 April 2007
  • Yale University Library

2
IAC Digital Preservation Committee
  • Outline
  • Charge and members
  • Accomplishments
  • Policy
  • Best practices
  • What's next

3
IAC Digital Preservation Committee
  • The DPC is an Integrated Access Council committee
    charged to:
  • Develop a digital preservation program by
    evaluating, compiling, documenting and
    articulating policies, procedures, best practices
    and systems in order to establish a digital
    preservation infrastructure at Yale University
    Library.
  • Work from a base of clearly articulated policies,
    then focus on preservation program planning and,
    finally, make recommendations for program
    implementation through digital preservation
    projects, initiatives, and system development.

4
IAC Digital Preservation Committee
  • Members
  • Rebekah Irwin, BRBL
  • David Gewirtz, ILTS/AMT
  • Kevin Glick, MSS/A
  • Audrey Novak, ILTS (Co-Chair)
  • Bobbie Pilette, Preservation (Co-Chair)
  • E.C. Schroeder, BRBL
  • Former members
  • Ann Green, ILTS/ITS, Co-Chair
  • Nicole Bouche, Beinecke Library
  • Gretchen Gano, Social Science Library

5
IAC Digital Preservation Committee
  • Accomplishments
  • Published a Digital Preservation policy that
    establishes a mission statement and promulgates
    preservation policies for institutional standards
    governing the quality, type and source of digital
    assets to be archived in the repository (revised
    Feb 2007).
  • Published best practices addressing: local
    practice for implementing PREMIS; preservation
    strategies; persistent identifiers; fixity
    (checksums, message digests and digital
    signatures); format registries; encoding and
    transmission of structured metadata; and care and
    handling of originals.
  • Modeled an organizational structure for the
    ongoing coordination and management of digital
    preservation. This structure recognizes that the
    responsibility for the creation and
    administration of digital preservation services
    at Yale is shared by three services: Metadata,
    Repository and Preservation.

6
Digital Preservation Best Practices
  • Digital preservation does not have established
    and vetted standards.
  • Issues and problems associated with preserving
    digital resources are numerous, complex and
    dynamic. DPC best practices are an effort to
    parse the larger digital preservation problem
    space into discrete issues and to identify
    processes, activities and/or methodologies that
    are emerging as standards.
  • This work by the DPC is by no means finished.
    More work is required to establish additional
    best practices for the myriad of related topics
    and to keep these recommendations current with
    the latest thinking and research in this field.
  • Note, too, that although informed by research,
    most of these best practices are untested in
    production preservation archives.

7
Best Practice - Care and Handling of Physical Collections
  • White paper to advise Library staff on how to
    protect originals during digital conversion.
    Available on the web site for easy access.
  • Sections include:
  • Assessment of Physical Collections
  • Criteria for Selecting Proper Scanning Equipment
  • Preparing the Scanning Surface
  • Specifications for Scanning
  • Handling Procedures for Library Materials

8
Care and Handling of Physical Collections, continued
  • Assessment of Physical Collections
  • Important to include the Preservation Department
    (contact: Tara Kennedy, Field Service Librarian)
  • List of questions to ask before scanning an
    object
  • Criteria for Selecting Proper Scanning Equipment
  • Describes available equipment and appropriate use
  • Indicates which materials can be scanned safely
    on each type of equipment
  • Preparing the Scanning Surface
  • How to clean the scanning surface (flatbed)

9
Care and Handling of Physical Collections, continued
  • Specifications for Scanning
  • Illumination levels and types,
  • Proper supports for bound materials,
  • Environmental considerations (dust, temperature,
    relative humidity)
  • Handling Procedures for Library Materials
  • Mostly common-sense reminders, but also
    specific suggestions, e.g., for oversized materials
  • Includes paper-based, multimedia (sound, film,
    historical, optical), objects

10
Best Practice - Fixity
  • Fixity, in preservation terms, means that the
    digital object has not been changed between two
    points in time or events.
  • Fixity checks such as checksums, message digests
    and digital signatures are used to verify a
    digital object's fixity.
  • Information created by these fixity checks
    provides evidence for the integrity and
    authenticity of the digital objects and is
    essential to enabling trust.

11
Fixity, continued
  • Fixity checks are all used in the same basic way.
    A value is initially generated and saved. Then,
    in response to an event (e.g., ingest) or over
    time, it is recomputed and compared to the
    original to ensure the object (file or bitstream)
    has not changed.
  • Not all fixity checks are the same.
  • Checksums are the simplest and least reliable
    method. They are typically used in
    error-detection to find accidental problems in
    transmission and storage. They do not account for
    such changes as the re-ordering of bytes or
    changes that cancel one another out.
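The checksum weakness described above can be illustrated with a deliberately simple additive checksum. This is a sketch under stated assumptions: `additive_checksum` is a hypothetical toy function, not one used by any Yale system, and stronger checksums such as CRC-32 do catch many of these cases. The MD5 digest is shown only for contrast.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """Toy checksum (hypothetical): sum of all byte values modulo 256."""
    return sum(data) % 256


original = b"preservation"

# Same bytes in a different order.
reordered = bytes(reversed(original))

# Two changes that cancel one another out: one byte up, one byte down.
tmp = bytearray(original)
tmp[0] += 1
tmp[1] -= 1
offsetting = bytes(tmp)

# The toy checksum cannot tell any of these apart.
same_after_reorder = additive_checksum(original) == additive_checksum(reordered)
same_after_offset = additive_checksum(original) == additive_checksum(offsetting)

# A message digest distinguishes the altered copies from the original.
digest_differs = (
    hashlib.md5(original).hexdigest() != hashlib.md5(reordered).hexdigest()
    and hashlib.md5(original).hexdigest() != hashlib.md5(offsetting).hexdigest()
)

if __name__ == "__main__":
    print(same_after_reorder, same_after_offset, digest_differs)
```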

12
Fixity, continued
  • Message digests are more secure. They are
    computed by applying a more complex algorithm to
    a file of any length to produce a short,
    uniform-length character string that is, for
    practical purposes, unique to that file. Change
    one pixel or one note in the file and the message
    digest will be completely different. (Ex.:
    93326bff6636655dcd6abff18ed2de997)
  • Digital signatures combine message digests with
    encryption. The message digest is created and
    then encrypted with the private key of a
    private/public key pair, so that anyone holding
    the public key can verify it.
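The two message digest properties named above (uniform length regardless of input size, and a completely different value after a small change) can be checked with Python's standard `hashlib` module. This is a minimal sketch; the sample inputs are invented for illustration and MD5 is used only because it is one of the algorithms named in this presentation.

```python
import hashlib

# Inputs of very different lengths (illustrative data only).
short_input = b"a"
long_input = b"a" * 1_000_000

digest_short = hashlib.md5(short_input).hexdigest()
digest_long = hashlib.md5(long_input).hexdigest()

# Change a single byte of the long input.
altered_input = b"b" + long_input[1:]
digest_altered = hashlib.md5(altered_input).hexdigest()

# Count how many of the hex characters differ between the two digests.
differing_positions = sum(
    1 for x, y in zip(digest_long, digest_altered) if x != y
)

if __name__ == "__main__":
    print(len(digest_short), len(digest_long))  # both digests have the same length
    print(digest_long != digest_altered)
    print(differing_positions)
```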

13
Fixity, continued
  • Current best practice for digital preservation
    repositories:
  • The creation of message digests using two
    algorithms, MD5 and SHA-1.
  • These are implemented in the widely used JHOVE
    format identification, validation and
    characterization application (e.g., in the Rescue
    Repository before and after ingest).
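The generate, save, recompute and compare cycle, using both MD5 and SHA-1, might look roughly like the following. This is a sketch using Python's standard library rather than JHOVE; the function names, the temporary file and its contents are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def compute_digests(path, chunk_size=65536):
    """Return MD5 and SHA-1 hex digests of a file, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}


def fixity_unchanged(path, stored_digests):
    """Recompute both digests and compare them to the stored values."""
    return compute_digests(path) == stored_digests


def demo():
    """Store digests for a file, then check them before and after a change."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "object.bin")
        with open(path, "wb") as handle:
            handle.write(b"digital object content")

        stored = compute_digests(path)            # e.g., before ingest
        before = fixity_unchanged(path, stored)   # e.g., after ingest

        with open(path, "ab") as handle:          # simulate corruption
            handle.write(b"!")
        after = fixity_unchanged(path, stored)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Storing two independent digests means a check fails unless both values still match.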

14
Best Practice - Format Registries and Tools
  • What is a Format?
  • A technical specification describing a standard
    encoding or representation of digital content
    stored in a file.
  • A file format extension such as .jpg indicates
    the encoded content is a digital image.
  • File encoding standards are used by programs to
    read the encoded information and present useable
    content of the file to a user's monitor or
    another output device.

15
Format Registries
  • What is a Format Registry?
  • A database that stores information about the
    technical specifications of an electronic file's
    format.
  • Format registries record file format changes over
    time so that files remain readable even when a
    format standard becomes technologically obsolete.
  • How does a format registry work?
  • Global Digital Format Registry

16
File Format Tools
  • File format identification and validation tools
    answer two questions:
  • How can we tell a file's type?
  • If we know its type, how can we be sure that it
    conforms to its format specification so that we
    know it is still useable?
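The first question, how to tell a file's type, is commonly answered by inspecting the signature ("magic number") at the start of the file rather than trusting its extension. The sketch below is a simplified illustration, not how any particular tool is implemented; the signature table covers only a few well-known formats, and the function names are hypothetical. Full validation against a format specification is far more involved and is not attempted here.

```python
import os

# Leading byte signatures for a few well-known formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
]

# Formats conventionally implied by a file extension.
EXTENSIONS = {
    ".jpg": "JPEG", ".jpeg": "JPEG", ".gif": "GIF",
    ".pdf": "PDF", ".tif": "TIFF", ".tiff": "TIFF",
}


def identify(data: bytes):
    """Return the format name matching the leading bytes, or None."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None


def extension_matches(filename: str, data: bytes) -> bool:
    """Check whether the extension agrees with the detected signature."""
    expected = EXTENSIONS.get(os.path.splitext(filename)[1].lower())
    return expected is not None and expected == identify(data)


if __name__ == "__main__":
    print(identify(b"%PDF-1.4 sample"))
    print(extension_matches("scan.jpg", b"GIF89a sample"))
```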

17
File Format Tools
  • JHOVE: A widely used file type identification,
    validation and characterization tool developed by
    Harvard University Library and JSTOR.
  • Handles many format types (e.g., AIFF, ASCII,
    BYTESTREAM, GIF, HTML, JPEG, JPEG2000, PDF, TIFF,
    UTF8, WAV, XML).
  • Is configurable in many respects, including the
    option to select full validation or "short"
    mode, in which only the header's signature is
    analyzed; the ability to include or exclude
    message digests in the output; and the choice of
    various output formats, including plain text and
    XML.
  • Because JHOVE does both file type identification
    and validation, it is currently Yale University
    Library's format-related tool of choice.

18
File Format Tools
  • Other tools
  • DROID (Digital Record Object Identification): A
    file type identification tool developed by the
    Digital Preservation Department of the National
    Archives of the United Kingdom to perform
    automated batch file format identification, using
    the PRONOM registry.
  • National Library of New Zealand Preservation
    Metadata Extract Tool: A tool that extracts
    metadata from file headers. This Java tool uses
    adapters to extract metadata from filetypes
    including MS Word, Word Perfect, Open Office, MS
    Works, MS Excel, MS PowerPoint, TIFF, JPEG, WAV,
    MP3, HTML, PDF, GIF, and BMP. This data is output
    in a standard XML format.
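As a rough illustration of extracting metadata from a file header and emitting it as XML, the sketch below reads the version and dimensions from a GIF header. It is not the National Library of New Zealand tool (which is written in Java); the function names and the XML element names are hypothetical, and the header bytes are constructed in the example rather than read from a real file.

```python
import struct
import xml.etree.ElementTree as ET


def extract_gif_metadata(data: bytes) -> dict:
    """Read version, width and height from the first 10 bytes of a GIF."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF header")
    # Width and height are 16-bit little-endian values at offsets 6 and 8.
    width, height = struct.unpack("<HH", data[6:10])
    return {
        "format": "GIF",
        "version": data[3:6].decode("ascii"),
        "width": width,
        "height": height,
    }


def to_xml(metadata: dict) -> str:
    """Serialize the metadata as a flat XML document (element names assumed)."""
    root = ET.Element("metadata")
    for key, value in metadata.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# A constructed header for a 640 x 480 GIF89a image.
sample_header = b"GIF89a" + struct.pack("<HH", 640, 480)

if __name__ == "__main__":
    print(to_xml(extract_gif_metadata(sample_header)))
```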

19
Best Practice - Persistent Identifiers
  • A persistent identifier (PI) is a unique name
    (identifier) associated with an internet resource
    that provides a link to the content and persists
    over changes of server location, ownership, and
    other state conditions.
  • A location (e.g., a given URL) is not a
    persistent identifier if the content moves to
    another location. The principal problem addressed
    by PIs is broken links to internet resources,
    i.e., the HTTP 404 error, "Document not found."
  • Persistent identification is not possible without
    an associated service. It is the service that
    supports persistence. The identifier takes you to
    the service, the service resolves to the object.
  • Optimally a PI should be created and assigned
    when the digital object is created.
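The relationship described above, where the identifier leads to a service and the service resolves to the object's current location, can be modeled in a few lines. This is a toy in-memory sketch: the `ResolverService` class, the identifier and the example.org URLs are all hypothetical and do not represent the Handle System or any other real resolver.

```python
class ResolverService:
    """Toy resolver: maps persistent identifiers to current locations."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier: str, url: str) -> None:
        """Assign an identifier, ideally when the object is created."""
        if identifier in self._locations:
            raise ValueError("identifier already assigned")
        self._locations[identifier] = url

    def move(self, identifier: str, new_url: str) -> None:
        """Record a new location; the identifier itself never changes."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier: str) -> str:
        """Return the current location for an identifier."""
        return self._locations[identifier]


resolver = ResolverService()
resolver.register("example-id-0001", "http://old-server.example.org/item/1")

# The content moves to a new server; only the service's record is updated.
resolver.move("example-id-0001", "http://new-server.example.org/objects/1")

if __name__ == "__main__":
    print(resolver.resolve("example-id-0001"))
```

Anyone who cited the identifier still reaches the object after the move, which is why persistence depends on the service being maintained.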

20
Best Practice - Persistent Identifiers
  • Several technologies are available to create
    persistent identifiers, such as:
  • CNRI Handle System: A generic system for
    assigning names to objects and resolving them.
    Key is the Global Handle Registry, which manages
    the namespace of all handle prefixes.
  • DOI (Digital Object Identifier): An application
    of the CNRI Handle System that associates
    intellectual property with structured metadata. A
    typical use of a DOI is to give a scientific
    paper or article a unique identifying number that
    can be resolved through the DOI resolver or the
    CNRI global handle resolver.
  • PURL: A Persistent Uniform Resource Locator is a
    URL that describes an intermediate (and more
    persistent) location which, when retrieved,
    results in a standard HTTP redirect to the
    current location of the resource.

21
Persistent Identifiers - Handle Server
  • The implementation of a CNRI handle server at YUL
    is tightly coupled to the implementation of the
    VITAL/Fedora Digital Repository Service.
  • Digital objects within the Digital Repository
    Service will have handles such as
  • http://moonpie8085/fedora/get/hdl10079.2F-2103288706
    (opaque), or
  • http://hdl.rutgers.edu/1782.1/SPCOLSMAPS.Map.b1849
    (semantic)
  • A handle server, like a web server, requires
    ongoing system administration, e.g., when
    resources are moved.
  • Continuing research in the assignment of handles
    to resources in other YUL repositories such as
    the Rescue Repository, Image Commons
    (DL/Insight), etc.

22
Best Practice - Maintenance Strategies
  • A1. Clear Allocation of Responsibilities
  • A2. Provision of the appropriate technical
    infrastructure
  • A3. Establishment and implementation of a plan for
    system maintenance, support and replacement
  • A4. Establishment and implementation of a plan for
    regular transfer of records to new storage media
  • A5. Adherence to appropriate storage and handling
    conditions for storage media
  • A6. Ensuring redundancy and regular backup
  • A7. Establishment of system security
  • A8. Disaster planning

23
Best Practice - Preservation Strategies
  • B1. Use of standards
  • B2. Data extraction and structuring
  • B3. Encapsulation
  • B4. Restricting the range of formats to be
    managed
  • B5. Technology preservation
  • B6. Reliance on backward compatibility
  • B7. Migration
  • B8. Software re-engineering
  • B9. Viewers and migration at the point of
    access
  • B10. Emulation
  • B11. Non-digital approaches
  • B12. Data restoration

24
Best Practice - PREMIS
  • PREservation Metadata Implementation Strategies
  • Yale Working Group
  • Matthew Beacom, Metadata Librarian, Catalog and
    Metadata Services (Co-chair)
  • Rebekah Irwin, Catalog Librarian for Digital
    Projects, Beinecke Library (Co-chair)
  • Youn Noh, Digital Resources Catalog Librarian,
    Catalog and Metadata Services
  • George Ouellette, Senior Programmer Analyst,
    Library ILTS
  • David Walls, Preservation Librarian, Library
    Preservation Dept
  • Yale Advisory Group
  • Reed Beaman, Associate Director for Biodiversity
    Informatics, Peabody Museum
  • Lee Faulkner, Media Director, Digital Media
    Center for the Arts
  • David Gewirtz, Project Manager, Library Projects,
    ITS
  • Kevin Glick, Electronic Records Archivist,
    Manuscripts and Archives
  • Edward Kairiss, Director, Instructional Computing
    Instructional Technology, ITS
  • Daniel Lee, E-Publishing/Internet Marketing
    Manager, Yale University Press
  • Thomas Raich, Associate Director, Information
    Technology, Art Gallery

25
Best Practice - PREMIS
  • Outcome
  • Develop PREMIS profiles that match specific
    digital collection and administrative needs
  • Base profile (up to 6 elements): This base
    profile of elements would support digital
    preservation of a wide range of digital assets.
  • Full profile (over 200 elements): This full
    profile would provide guidance to administrators
    of digital information assets acting as trusted
    custodians of material deemed to be of long-term
    value.

26
Best Practices - Summary
  • Most of these best practices are the outcome of
    current research projects.
  • Few are tested in production preservation
    repositories.
  • At Yale the Rescue Repository is becoming a local
    testbed.
  • Fixity: MD5 and SHA-1 message digests
  • JHOVE: file format identification and validation
  • Maintenance strategies
  • PREMIS base profile element set.
  • VITAL/Fedora Digital Repository Service
    implementation
  • Persistent identifiers through the CNRI Handle
    System.

27
What's Next
  • Goals
  • Creation of a Transition Team to continue the
    work of the DPC and, most importantly, within a
    6-month timeframe, create the roadmap for the
    implementation of the permanent management model
    for an ongoing digital preservation program.
  • The recommended structure consists of a core team
    representing 2 FTE, composed of staff with
    expertise in metadata, repository and
    preservation services. It is modeled as a
    virtual Digital Curation Center (DCC). The DCC
    will put into practice the identified best
    practices and the Digital Repository Service
    (DRS) Preservation Archive.
  • The Transition Team will prepare a business plan
    for the Digital Curation Center. The business
    plan will identify the DCC's vision, mission,
    goals and first-year deliverables; staffing
    models; budget; and timeline for creation.

28
IAC Digital Preservation Committee
  • Website
  • http://www.library.yale.edu/iac/dpc.html
