Title: A Shared Preservation Model for Institutional Repositories: SHERPA Digital Preservation
1. A Shared Preservation Model for Institutional Repositories
SHERPA DIGITAL PRESERVATION
Cambridge, July 6th 2005
- Sheila Anderson
- Arts and Humanities Data Service
2. SHERPA
- Acronym: Securing a Hybrid Environment for Research Preservation and Access
- Initiator: CURL (Consortium of University Research Libraries)
- Development Partners: Nottingham (lead), Edinburgh, Glasgow, Leeds, Oxford, Sheffield, York, British Library, AHDS
- Duration: 3 years, November 2002 to November 2005
- Funding: JISC and CURL
- Programme: FAIR (Focus on Access to Institutional Resources)
- Aims
  - to construct a series of institutional OAI-compliant e-print repositories
  - to investigate key issues in populating and maintaining e-print repositories
  - to work with service providers to achieve acceptable standards and the dissemination of the content
  - to investigate standards-based digital preservation of e-prints
  - to disseminate learning outcomes and advocacy materials
3. RCUK recognises the distinction between (a)
making published material quickly and easily
available, free of charge to users at the point
of use (which is the main purpose of open access
repositories), and (b) long-term preservation and
curation, which need not necessarily be in such
repositories. It is important to make the
distinction between these overlapping but
separate purposes. A resilient and
technologically robust framework for the
long-term storage and management of digital
resources will require the development of a
highly specialised and well coordinated service.
E-print repositories may have an important
contribution to make to such a service, for
instance through helping to set standards for the
formatting of data and metadata. Providing
effective access to such resources over the long
term will pose even greater challenges, and RCUK
will monitor the development and implementation
of the notion of Trusted Digital Repository as
a means of setting out clearly-defined standards
for the long term maintenance of digital
resources. However, it should not be presumed
that every e-print repository through which
published material is made available in the short
or medium term should also take upon itself the
responsibility for long-term preservation.
From: RCUK Position Statement on Access to Research
Outputs, www.rcuk.ac.uk/access/statement.pdf
4. SHERPA DP Project
- Acronym: Securing a Hybrid Environment for Research Preservation and Access Digital Preservation
- Development Partners: AHDS at King's College London (lead), Nottingham, Glasgow, Edinburgh, White Rose Consortium, London LEAP Consortium
- Repository Software: DSpace and Eprints; AHDS Preservation Repository
- Duration: 2 years, March 2005 to February 2007
- Funding: JISC and CURL
- Programme: JISC Digital Preservation and Records Management Programme
5. SHERPA DP Project
- Aims
  - To develop a persistent preservation environment for SHERPA Partners, based on the OAIS reference model, and including a set of protocols and software tools
  - To explore the use of METS for packaging and transferring metadata and content
  - To explore the use of open source software and tools to add functionality to and extend the storage layer of repository software applications
  - To create a Digital Preservation User Guide describing the model and its implementation
6.
- Disaggregated model
- Institutional repository for access
- Supra-institutional preservation service
7. Developing the Model
- Review of the OAIS Model
- Map OAIS functionality onto the proposed disaggregated model
- Identify workflows and processes at IRs
- Identify rights and responsibilities of each party
- Identify and assign services and actions to be carried out and apportion these
- Review and define AIPs, DIPs and SIPs
- Work up draft processes and procedures
8. Functionality
- Each party required to provide an agreed level of functionality
- Repositories likely to provide
  - Support for publishing metadata to be harvested
  - One or more methods for transferring content across the network
  - Alerting mechanisms for updated/additional content
- Preservation Service likely to provide
  - Support for harvesting metadata and content
  - One or more methods for transferring content and metadata back into the institutional repository
  - File format conversion tools, integrity checking, metadata extraction, obsolescence checking, alerting and migration, etc.
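The harvesting and alerting functions listed on this slide can be served by OAI-PMH, the protocol implied by the OAI-compliant repositories in the SHERPA aims. The following is a minimal Python sketch under stated assumptions, not the project's implementation: it builds a ListRecords request whose `from` argument limits the harvest to records changed since a given date. The base URL and the function name `build_harvest_url` are hypothetical; the verb and argument names are those defined by OAI-PMH.

```python
from urllib.parse import urlencode


def build_harvest_url(base_url, metadata_prefix="oai_dc",
                      from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb only, to fetch the next page.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: only records added or updated since this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint, used only for illustration.
print(build_harvest_url("https://repository.example.ac.uk/oai",
                        from_date="2005-07-01"))
```

A preservation service could run such a request on a schedule, storing the date of the last successful harvest and passing it as `from_date` on the next run.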
9. Repository Archiving
- Investigate and implement automated transfers of data between institutional repositories and the preservation repository
- Review DSpace and Eprints APIs, storage layers and module add-on capabilities
- Prototype and test SRB as a common storage medium
- Prototype and test API-based access mechanisms
- Prototype and test external synchronisation mechanisms
10. Preservation Actions
- Investigate the processes required to enable changes and updates to e-print content while ensuring its long-term integrity and preservation
- Create repository integrity checking and reporting services
- Create repository obsolescence checking, reporting and migration services
- Investigate remote alerting service capabilities
- Investigate mechanisms for automatic creation of new versions, or migration and redeposit
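As an illustration of the integrity and obsolescence checking services named above, here is a minimal Python sketch. It assumes a manifest that maps file names to SHA-256 checksums recorded at deposit, and a small register of at-risk formats keyed by file extension. The register contents, the function names and the sample files are all hypothetical; a real service would identify formats by more than the extension.

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical risk register: file extensions the service treats as at risk.
AT_RISK_FORMATS = {".wpd": "migrate to PDF", ".ps": "migrate to PDF"}


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_holdings(manifest, root):
    """Compare stored files with the manifest; return (name, problem) tuples."""
    report = []
    for name, expected in sorted(manifest.items()):
        path = root / name
        if not path.exists():
            report.append((name, "missing"))
            continue
        if sha256_of(path) != expected:
            report.append((name, "checksum mismatch"))
        action = AT_RISK_FORMATS.get(path.suffix.lower())
        if action:
            report.append((name, "format at risk: " + action))
    return report


# Demonstration with temporary files standing in for deposited e-prints.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "paper.pdf").write_bytes(b"%PDF-1.4 example")
    (root / "thesis.wpd").write_bytes(b"old word processor file")
    manifest = {n: sha256_of(root / n) for n in ("paper.pdf", "thesis.wpd")}
    (root / "paper.pdf").write_bytes(b"corrupted")  # simulate silent damage
    for entry in check_holdings(manifest, root):
        print(entry)
```

The resulting report could feed the alerting service: a checksum mismatch would trigger restoration from another copy, while a format-at-risk entry would trigger migration and redeposit.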
11. Metadata and METS
- Review existing metadata captured by repositories against agreed administrative and preservation metadata set
- Identify additional metadata requirements and capture methods
- Review the potential for the use of METS within the SHERPA environment
  - As a framework for combining and packaging metadata
  - As a transfer mechanism for metadata and e-prints
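To show what METS as a packaging framework might look like in practice, the sketch below builds a very small METS document for one e-print using Python's standard library. It is an assumption-laden illustration rather than the SHERPA DP profile: the choice of Dublin Core for descriptive metadata, the IDs, the `USE` and `TYPE` values and the function name `build_mets` are all hypothetical, and the output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
DC = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("mets", METS), ("xlink", XLINK), ("dc", DC)):
    ET.register_namespace(prefix, uri)


def build_mets(title, file_url, mimetype, checksum):
    """Wrap one file and its title in a minimal METS structure."""
    mets = ET.Element(f"{{{METS}}}mets")

    # Descriptive metadata: Dublin Core wrapped inside a dmdSec.
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="DC")
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    ET.SubElement(xml_data, f"{{{DC}}}title").text = title

    # File inventory: one file, referenced by URL, with a fixity value.
    file_sec = ET.SubElement(mets, f"{{{METS}}}fileSec")
    group = ET.SubElement(file_sec, f"{{{METS}}}fileGrp", USE="original")
    file_el = ET.SubElement(group, f"{{{METS}}}file", ID="FILE1",
                            MIMETYPE=mimetype, CHECKSUM=checksum,
                            CHECKSUMTYPE="SHA-256")
    ET.SubElement(file_el, f"{{{METS}}}FLocat",
                  {"LOCTYPE": "URL", f"{{{XLINK}}}href": file_url})

    # Structural map: ties the file to its descriptive metadata.
    struct_map = ET.SubElement(mets, f"{{{METS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS}}}div", TYPE="e-print", DMDID="DMD1")
    ET.SubElement(div, f"{{{METS}}}fptr", FILEID="FILE1")
    return mets


# Hypothetical e-print, for illustration only.
package = build_mets("An example e-print",
                     "https://repository.example.ac.uk/1/paper.pdf",
                     "application/pdf", "0" * 64)
print(ET.tostring(package, encoding="unicode"))
```

The same wrapper could carry administrative and preservation metadata in an `amdSec`, which is where an agreed preservation metadata set would most likely sit.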
12. Implementation
- Preservation plans drawn up
- Risk assessment finalised
- Policies and procedures finalised
- Cost models and business case developed
- Implement services
13. Digital Repository Preservation User Guide
- The User Guide will recommend standards, best practice, protocols and processes that might be used in the management, preservation and presentation of e-print repositories
- Will draw on experiences of SHERPA and other relevant projects, and include case studies
- Will complement Beagrie and Jones, The Preservation Management of Digital Material Handbook
14. Developing the SHERPA DP Trusted Repository Model
- Disaggregated model
- Analysing and integrating
- Institutional Repository workflow and processes
- Preservation Service workflow and processes
- OAIS as an ideal(?) functional model
- Lifecycle of an e-print
15. Institutional Repositories
16. Developing the SHERPA DP Trusted Repository Model
- Analysing and integrating
- Institutional Repository workflow and processes
- Preservation Service workflow and processes
- Lifecycle of an e-print
- OAIS as an ideal(?) functional model
17. Preservation Service
18. Developing the SHERPA DP Trusted Repository Model
- Analysing and integrating
- Institutional Repository workflow and processes
- Preservation Service workflow and processes
- Lifecycle of an e-print
- OAIS as an ideal(?) functional model
20. Implementation is rarely easy.
- Ingest: The services necessary to accept information packages from a Producer, carry out quality assurance (QA), and create an archival version
- Archival Storage: The services required to duplicate, store, and maintain the deposited data
- Data Management: Functions required to populate a search database, to allow the user community to locate a resource, and to administer the archive in its entirety
- Preservation Planning: Responsible for the development and review of the preservation plan
- Access: The facilities that allow users to locate, request and receive information packages in a usable form
- Administration: Responsible for managing the day-to-day operation of an OAIS and coordinating the activities of the above five OAIS functions
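The flow between the Ingest and Access functions described above can be sketched as a pair of small Python functions that turn a submission package (SIP) into an archival package (AIP) and then into a dissemination package (DIP). This is a simplified illustration only: the required metadata fields, the class layout and the function names are assumptions, and a real OAIS would also involve Archival Storage, Data Management and Preservation Planning.

```python
import hashlib
from dataclasses import dataclass, field

# Assumed minimum metadata set; the real set would be agreed between parties.
REQUIRED_FIELDS = ("identifier", "title", "creator")


@dataclass
class SIP:
    """Submission Information Package received from a Producer."""
    metadata: dict
    content: bytes


@dataclass
class AIP:
    """Archival Information Package held by the preservation service."""
    metadata: dict
    content: bytes
    sha256: str
    events: list = field(default_factory=list)


@dataclass
class DIP:
    """Dissemination Information Package delivered to a user."""
    metadata: dict
    content: bytes


def ingest(sip):
    """Ingest: QA the submission, then create the archival version."""
    missing = [name for name in REQUIRED_FIELDS if not sip.metadata.get(name)]
    if missing:
        raise ValueError("SIP rejected, missing metadata: " + ", ".join(missing))
    digest = hashlib.sha256(sip.content).hexdigest()
    return AIP(dict(sip.metadata), sip.content, digest, ["ingested"])


def disseminate(aip):
    """Access: verify integrity, then hand out a usable copy."""
    if hashlib.sha256(aip.content).hexdigest() != aip.sha256:
        raise ValueError("AIP failed integrity check")
    aip.events.append("disseminated")
    return DIP(dict(aip.metadata), aip.content)


# Hypothetical e-print submission, for illustration only.
sip = SIP({"identifier": "oai:example:1", "title": "An example e-print",
           "creator": "A. Author"}, b"%PDF-1.4 example")
aip = ingest(sip)
dip = disseminate(aip)
print(aip.events, dip.metadata["identifier"])
```

In the disaggregated model, `ingest` would run at the preservation service on packages transferred from the institutional repository, while `disseminate` would return content to that repository rather than directly to end users.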
21. OAIS Mandatory Responsibilities
- Negotiate for and accept information from information Producers
- Obtain sufficient control of the information provided to the level needed to ensure Long-Term Preservation
- Determine which communities should become the Designated Community and, therefore, should be able to understand the information provided
- Make the preserved information available to the Designated Community
- Ensure that the information to be preserved is independently understandable to the Designated Community without needing the assistance of the experts who produced the information
- Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original, or as traceable to the original
(OAIS Reference Model, page 3-1)
22. Responsibilities...
- E-print repositories must
  - Implement appropriate repository software
  - Develop selection, retention and ingest policies
  - Develop a rights framework
  - Specify a minimum metadata set, and provide details to the Preservation Service
  - Agree and implement a system for Persistent Identifiers
  - Support mechanisms for harvesting of metadata (and content)
  - Implement a mechanism for transferring IPs to the Preservation Service
  - Provide alerting mechanisms for updated/additional content
23. Responsibilities...
- Preservation Service must
  - Undertake preservation planning
  - Evaluate contents of archive and undertake risk assessment
  - Recommend updates to migrate current holdings
  - Develop recommendations for preservation standards and policies
  - Agree and implement a system for Persistent Identifiers
  - Monitor changes in technology environment, user service requests, and knowledge base
  - Develop detailed migration and test plans
24. Responsibilities...
- Preservation Service must
  - Undertake preservation actions
  - Provide a permanent storage facility
  - Create and manage multiple copies of content, including off-site storage
  - Manage storage hierarchy
  - Refresh/replace media
  - Provide disaster recovery capabilities
  - Implement migration plans and migrate holdings as appropriate
  - Manage version control
25. Many outstanding issues.
- Best method for encapsulating the many elements of an e-print information package?
- Agreeing and assigning PIDs: who, where, and how? Single bibliographic record or multiple PIDs?
- What about audit trails?
- Who is the master and who the slave? Or will this become a truly egalitarian and shared model for digital preservation?
26. Further Information
- http://www.ahds.ac.uk/about/projects/
- andrew.c.wilson_at_ahds.ac.uk
- kirti.bodhmage_at_ahds.ac.uk
- gareth.knight_at_ahds.ac.uk