Towards smart storage for repository preservation services

1
Towards smart storage for repository preservation
services
  • Steve Hitchcock, David Tarrant, Adrian Brown (1),
    Ben O'Steen (2), Neil Jefferies (2) and Leslie Carr
  • Preserv 2 Project
  • School of Electronics and Computer Science,
    University of Southampton
  • (1) The National Archives, Kew
  • (2) Oxford University Library Services
  • At iPRES 2008: The Fifth International Conference
    on Preservation of Digital Objects, London, 29-30
    September 2008

2
Three-stage strategy for keeping your data safe
  • Ability to move data freely, easily and instantly
      • OAI, ORE, Atom
  • Reliable, trusted large-scale storage
      • Open storage
  • Risk profiling to invoke a range of selectable
    services
      • Smart storage

3
About institutional repositories
  • IRs in flux
  • Uncertainty in terms of target content -
    published papers, theses, research data, teaching
    materials - policy, rights, even locus of content
    and responsibility for long-term management.
  • OAI-ORE (Object Reuse and Exchange) effectively
    frees the data from being captive to repository
    software.
  • Commercial repository services, from
    software-specific services to digital library
    services or more general 'cloud' or network
    storage services.
  • Set up by institutions of higher education and
    research to manage and disseminate their digital
    intellectual outputs.
  • IRs are a special type of Web site, typically
    based on some repository software that presents a
    database of records pointing to the objects
    deposited.
  • The Preserv 2 project is investigating the
    provision of preservation services for IRs.

Photo: Flickr/cpikas

4
IRs are
  • Open source repository software
  • Open access content
  • Open archives, using OAI-PMH to share data with
    e.g. discovery services
  • Open repositories, where OAI-ORE enables the easy
    movement of data between different types of
    repository software

Photo: Flickr/Rightee

5
A new open: how open storage supports
preservation services
  • Open storage: large-scale storage devices based
    on open source software
  • Open storage averts the need for a repository
    layer to access first-class objects, that is,
    objects that can be addressed directly
  • In turn, these digital objects can be distributed
    and/or replicated over many open storage
    platforms.
  • In turn, able to select storage with built-in
    preservation support
  • Resilient storage platforms may be viable for
    preservation services aimed at multiple
    repositories
  • E.g. Sun Microsystems STK5800 (codenamed
    Honeycomb)
  • Google Repository

6
Smart storage
  • Smart storage combines an underlying passive
    storage approach with the intelligence provided
    through services.
  • The key to realising smart storage is to enable
    the services to communicate and share information
    with the digital content sources they may be
    acting on. This is done through machine-level
    application programming interfaces (APIs) and
    protocols.

7
APIs, interfaces and the Web architecture
  • Major services on the Web deploy their own
    simple, but different, APIs, e.g. Google Maps
  • Within the repository community, SWORD (Simple
    Web-service Offering Repository Deposit)
  • Open storage platforms such as Sun's STK5800 and
    the Amazon Simple Storage Service (S3)
  • To take advantage of open storage, repositories
    have to be able to talk to these services through
    their APIs.

8
Smart storage example: format services
  • Preservation methods affecting formats can be
    classified in three stages (seamless flow):
      • Format identification and characterization
        (which format?)
      • Preservation planning and technology watch
        (format risk and implications)
      • Preservation action, migration, etc. (what to
        do with the format)
  • Format-based services tend to be ad hoc processes
    for which some tools are available
  • E.g. PRONOM-DROID from The National Archives (UK)
      • PRONOM is an online registry of technical
        information, such as file format signatures
      • DROID is a downloadable file format
        identification tool that applies these
        signatures
  • These and other tools could be used in a more
    coordinated manner.
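
As an illustration of the identification stage, here is a minimal Python sketch of signature-based format identification, the general technique DROID applies using PRONOM signatures. The signature table and the name identify_format are assumptions made for this example. They do not reproduce PRONOM's signature file or DROID's interface, and real signatures are considerably richer than a few leading bytes.

```python
# Hypothetical signature table: leading "magic" bytes -> format label.
# Only a handful of well-known prefixes are listed, purely for illustration.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(data: bytes) -> str:
    """Return the label of the first signature that matches the start of data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.4 example content"))
```

A coordinated service would run such a check on each newly deposited object and pass the result on to the planning and action stages.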

9
Smart storage: DROID concept

10
Smart storage: DROID scheduling/history
  • Scheduling interface controls when a DROID
    classification needs to be performed.
  • Preserv 2 has developed a scheduling service that
    uses the Darwin Calendar Server and iCalendar
    format.
  • Provides a powerful scheduling service with many
    clients already available - Apple iCal, Mozilla
    Sunbird, and others - that can read and interpret
    the files so that past and future events can be
    reviewed.
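
To show what a scheduled classification might look like in iCalendar form, the following Python sketch builds a calendar file containing a single event. The function name droid_event, the PRODID value, the example UID and the example URL are all assumptions for illustration. The slides do not describe the fields the Preserv 2 scheduler actually writes.

```python
from datetime import datetime, timezone


def droid_event(uid: str, start: datetime, target_url: str) -> str:
    """Build a minimal iCalendar document with one VEVENT for a DROID run.

    `start` should be a timezone-aware datetime; it is written in UTC.
    """
    fmt = "%Y%m%dT%H%M%SZ"
    stamp = datetime.now(timezone.utc).strftime(fmt)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//droid-scheduler-sketch//EN",  # hypothetical product id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp}",
        f"DTSTART:{start.astimezone(timezone.utc).strftime(fmt)}",
        "SUMMARY:DROID format classification",
        f"URL:{target_url}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    when = datetime(2008, 9, 29, 9, 0, tzinfo=timezone.utc)
    print(droid_event("run-1@example.org", when, "http://repository.example.org/objects/1"))
```

Because the output is plain iCalendar, any calendar client that reads the format can display the event alongside past and future runs.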

11
Smart storage: DROID OAI-PMH interface
  • An OAI-PMH interface to open storage discovers
    the latest objects to have been deposited and
    which are ready for format classification.
  • Could also be performed by simpler RSS or
    Atom-based methods.
  • The interface has since been expanded to allow
    export of OAI-ORE resource maps in both RDF and
    Atom formats.
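
The discovery step can be sketched in Python as an OAI-PMH ListIdentifiers request restricted by date, followed by parsing of the returned record headers. The base URL, the helper names and the sample response are assumptions for illustration. The sketch parses an inline sample rather than contacting a server, and it ignores resumption tokens and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative response body; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2008-09-01</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def list_identifiers_url(base_url: str, since: str) -> str:
    """Build a ListIdentifiers request for records changed on or after `since`."""
    query = urlencode(
        {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc", "from": since}
    )
    return f"{base_url}?{query}"


def parse_headers(xml_text: str) -> list:
    """Extract (identifier, datestamp) pairs from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    return [
        (
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        )
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


if __name__ == "__main__":
    print(list_identifiers_url("http://repository.example.org/oai", "2008-09-01"))
    print(parse_headers(SAMPLE_RESPONSE))
```

Each (identifier, datestamp) pair returned this way is a candidate for a format classification run.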

12
Smart storage: DROID implementation
[Diagram: implementation of the DROID scheduling service. Labels
recoverable from the slide:
  • Components: DROID; Scheduler; DROID-OAI harvester; Open storage;
    Repository; Calendar server; History; Messaging (Atom?); Web server
    (HTTP, stores results of DROID events); User interface; Machine
    interface, API
  • Example clients: iCal, Outlook, Sunbird
  • Flows: Schedule event; OAI-PMH; Is event done?; url, date; Get
    results of event
  • Legend: Implemented; To be implemented]

13
Risk profiling
  • The scheduler will invoke actions based on the
    results of scanning by DROID, allied to
    decision-making tools that use intelligence from
    planning and technology watch tools, such as
      • PRONOM,
      • the Plato preservation planning tool from the
        EC-funded Planets project,
      • and others.
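
A minimal Python sketch of how a scheduler might turn an identified format into an action is given here. The risk scores, the threshold, the action names and the function choose_action are all invented for illustration. In the scheme described above, the intelligence would come from planning and technology watch tools rather than a hard-coded table.

```python
# Hypothetical risk scores per format label (0 = low risk, 1 = high risk).
# These numbers are invented for this sketch and carry no real meaning.
FORMAT_RISK = {
    "PDF": 0.2,
    "GIF": 0.4,
    "legacy word processor": 0.9,
}


def choose_action(format_label: str, threshold: float = 0.7) -> str:
    """Map an identified format to a preservation action."""
    risk = FORMAT_RISK.get(format_label)
    if risk is None:
        return "review"   # no risk information: flag for manual inspection
    if risk >= threshold:
        return "migrate"  # high risk: schedule a migration
    return "monitor"      # acceptable risk: keep watching


if __name__ == "__main__":
    for label in ("PDF", "legacy word processor", "unidentified"):
        print(label, "->", choose_action(label))
```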

Photo: Flickr/yourbartender

14
Summary: smart storage in the storage scheme
  • Binary stream
  • File system: need to store multiple streams with
    permissions
  • Content addressable: adds content validation and
    object identifiers, metadata required to locate
    an object
  • Open: adds error correction and recovery, places
    processing close to storage, solves some
    bandwidth problems
  • Smart: opens up the close-to-storage approach for
    application development, transition to 'cloud'
    storage
How smart storage addresses current storage
issues: see full paper
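
The content-addressable level of this scheme can be illustrated with a short Python sketch in which the identifier of an object is a digest of its content, so the same value serves for both location and validation. The class name, the in-memory dictionary and the choice of SHA-256 are assumptions for this example. The slides do not say how any particular storage platform derives its identifiers.

```python
import hashlib


class ContentAddressableStore:
    """In-memory store that uses the SHA-256 digest of the content as identifier."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store data and return the identifier derived from its content."""
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        """Return the stored data, validating it against its identifier."""
        data = self._objects[object_id]
        # Content validation: recompute the digest and compare.
        if hashlib.sha256(data).hexdigest() != object_id:
            raise ValueError("stored content does not match its identifier")
        return data


if __name__ == "__main__":
    store = ContentAddressableStore()
    object_id = store.put(b"example object")
    print(object_id, store.get(object_id))
```

Storing the same bytes twice yields the same identifier, which is what makes replication across several storage platforms straightforward to verify.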
15
Storage can become smarter
  • Openness, in its various forms, the ability to
    move data freely and easily, needs to be
    supplemented by decision-making that can be
    automated based on the supplied intelligence and
    information.
  • In this way, open storage can become smarter.
  • http://preserv.eprints.org/
