Title: UK strategies for digital preservation and curation
1UK strategies for digital preservation and curation
- Chris Rusbridge, Digital Curation Centre
2Contents
- Background
- Digital Preservation Coalition
- Digital Curation Centre
- Other strategic activities
3Background
- CPA/RLG report 1995
- JISC/BL workshop, Warwick 1995
- CEDARS project starts, 1998
- CAMiLEON project starts, 1999
- JISC/BL workshop, March 1999
- Digital Preservation Coalition formed, 2002
- UK Legal Deposit legislation, 2003
- Digital Curation Centre formed, 2004
4Digital Preservation Coalition
- Aim: develop UK digital preservation agenda within international context
- Established as non-profit company July 2002
- May 2003, Digital Preservation Coordinator appointed
- March 2004, DPC has 27 members (BL, CURL, JISC, MLA, NA Scotland, OCLC, PRONI, TNA, ULCC founding members)
5Rationale
- Reliance on digital resources increasing rapidly
- Increasing expectations of long term availability
- Responsibility for stewardship
- Cultural and scientific value of resources
6Types of membership
Cross-Sectoral Membership
- Libraries
- Archives, Museums, Records
- Publishing, Media
- Data services, Science, Technology
- Government, Research, Policy
7What does the DPC do?
- Raise profile of digital preservation
- Run advocacy campaigns, etc
- Provide examples of good practice
- Highlight gaps and responsibilities
- Act as catalyst for action
8DPC action examples
- Digital preservation training, advocacy and outreach
- PR programme to raise media profile
- Survey members to assess DP needs
- Regular fora for training and advice
- What's New in Digital Preservation and Technology Watch reports
9DPC work packages
- Promoting Digital Preservation
- Acting to increase funding
- Fostering collaboration and forging strategic alliances
- Producing, providing, and disseminating information
- Promoting and developing services, technology, standards and training
- Continuing to develop the Coalition's activities
10Digital curation continuum
- Digital preservation: static; for later use?
- Digital curation: dynamic; in use now (and the future)?
11Digital Curation Centre
- History
- JISC Strategy (Beagrie et al)
- Lord and Macdonald report
- JISC and e-Science funding
- Call for bids 2003
- DCC Project for 3 years
- Service began March 2004
- Research begun September 2004
12Digital Curation Rationale
- Definition?
- maintaining and adding value to a trusted body of digital information for current and future use
13Assuring permanent access to the records of science and the humanities?
- Long term access to primary data
- Increasing data volumes from eScience and Grid-enabled / cyberinfrastructure applications
- Changing research paradigm: data-driven science, big science
- Observational data, simulations, large-scale experimentation
- Multi-media resources, statistical data, surveys, geo-spatial data
14Reports
- Tony Hey et al: The Data Deluge
- Atkins et al: Blue-Ribbon report on Cyberinfrastructure
- NSB: Draft report on Long-lived Digital Data Collections
- Recent report on Cyberinfrastructure in Social, Behavioural and Economic sciences
15Census Data on the Web
16Internet Archaeology publication with data
17Lord/Macdonald 1: data producers
18Lord/Macdonald 2
19Lord/Macdonald 3: data producers re-visited
20Conclusions?
- Need for continuing care of data
- More than digital preservation
- Hence Digital Curation Centre
- Providing advice and support
21Structure to Engage and Collaborate
- curation organisations, e.g. DPC
- communities of practice: users
- community support and outreach
- service definition and delivery
- Collaborative Associates Network of Data Organisations
- management and admin support
- research collaborators
- research
- development co-ordination
- testbeds and tools
- Industry
- standards bodies
22Matrix structure
23Activities desired from User Needs Analysis
- R&D issues: Annotation services, Ontology development, Automating metadata creation, Tools and toolkits, Data Format Description Language, Identifiers, Registries, Economic and cost-benefits studies
- Advisory services: Ask-a-Curator, FAQs, reports, briefings, awareness-raising materials, best practice guidance, Storage media, Like Erpanet, advise Government, Research Councils, funding bodies
- Professional development: Short courses, conferences, seminars, workshops, secondments to DCC and to working repository services
- Outreach: Leadership for the future, case studies, sharing solutions, collaboration with other partners, international peers, industry links
24Curated databases: some issues
- Integrating and publishing data so that someone else can use it.
- Annotating existing data and moving annotations to other databases
- Provenance: where did this data come from?
- Archiving: how do you preserve something that is constantly changing?
25Research approaches
- Publishing and integrating scientific databases
- Archiving past states of volatile databases
- Database provenance and annotation
- Organisational dynamics of trusted repositories
- Automating metadata extraction
- Cost-benefit analysis of data curation
- Rights and responsibilities
26Development approach
- OAIS (Open Archival Information System) linkage: focus on representation information
- link to global work on format registries?
- Concentrate on scientific data formats?
- Repository
- Representation Information
- Standards and Tools
- Aim for OAIS compliance
- Persistent identifiers
- Certification: RLG task force
- Open development: wiki and email list
27OAIS Reference Model: Functional Model
28Representation Net
29Development Roadmap
- Registry: complete prototype, link to PRONOM, GDFR etc, handover to service
- Representation information: describe CCLRC (science) data using EAST, etc
- Certification: work continues
- Additional tools: metadata extraction
- Testbeds, interactions with others
30Service definition and delivery
- Advisory services
- Responses to queries, from legal to technical guidance: HELPDESK_at_dcc.ac.uk
- Site visits (National Institute of Environmental eScience)
- Information Services
- Briefing Documents - Freedom of Information by Mags McGinley
- DIGITAL CURATION MANUAL
- 20 chapters written by community experts e.g. Metadata written by Michael Day, UKOLN
- Peer-reviewed
- Checklist for Compliance with best practices and standards
- Technology Watch
31Services: workshops
- 2005 Programme
- Preservation of medical databases: 24-25 May at the Gulbenkian Institute, Lisbon, in collaboration with ERPANET and the Wellcome Trust
- Persistent identifiers: liaising with NISO, 30 June at University of Glasgow
- Institutional repositories: 6 July at the University of Cambridge, UK, in collaboration with DSpace
- Cost models: in collaboration with Digital Preservation Coalition, July at British Library
32User requirements analysis
- Commissioned study
- Leona Carpenter
- Reporting now
- Desk-based research
- Focus groups
- Interviews
- Results will inform research, development, service definition / delivery and outreach
- Recommendations and priority tasks
33www.dcc.ac.uk
34www.ijdc.net
- Launch planned June/July
- Peer-reviewed contributions
- Peters Buneman and Burnhill, Editor (issue 1)
- Production editor Philip Hunter
35Sample issue
- Full papers
- Invited articles
- News and views
- Papers for submission are very welcome!
36 1st DCC International Conference
- Location - Bath UK
- 29-30 September 2005
- Keynote speakers
- Cliff Lynch CNI
- Graham Cameron European Bio-informatics Institute
- DCC Research update
- Social highlights
37Associates Network
- Goals: Develop understanding, share best practice, advance research, promote recognition, develop consensus
- Membership: International groups, national bodies, industry partners, funders, research groups, HEIs, FEIs, individuals
- Benefits: Early access to R&D outputs, advisory services, training, input to definition and design, community participation
- Discussion Forum: www.dcc.ac.uk
Please join us!
38Other strategic activities
- Legal deposit
- BL Digital Object Management system
- The National Archives
- European activities: DELOS, ERPANET etc
- JISC 4/04 Digital Preservation Programme
39JISC preservation programme
- Assessment of UK Data Archive and The National Archives compliance with OAIS/METS
- DAAT: Digital Asset Assessment Tool
- Digital Preservation Training Programme
- eSPIDA: An effective Strategic model for the Preservation and disposal of Institutional Digital Assets
- LIFE (Lifecycle Information for E-literature)
- Managing Digital Assets in Tertiary Education (Mandate)
40JISC preservation programme 2
- Managing Risk: a Model Business Preservation Strategy for Corporate Digital Assets
- METS Awareness Training
- Personal Archives Accessible in Digital Media (paradigm)
- PRESERV (PReservation Eprint SERVices)
- SHERPA Digital Preservation: Creating a Persistent Preservation Environment for Institutional Repositories
41Acknowledgements
- Slides from Maggie Jones, Liz Lyon, Peters Burnhill and Buneman, David Giaretta and others used with thanks.
42Trusted Repositories of Knowledge
- The Māori entrusted their knowledge to people, trained to be the repositories, who could
- receive information with the utmost accuracy
- store information with integrity beyond doubt
- retrieve the information without amendment
- apply appropriate judgement in the use of the information
- pass on the information appropriately.
- Whatarangi Winiata, (2002), Repositories of Rōpū Tuku Iho: A Contribution to the Survival of Māori as a People, Wellington: Library and Information Association of New Zealand Aotearoa Annual Conference, 17-20 November 2002
- Special thanks to Professors Derek Law and Seamus Ross
43Aims and Objectives for the DCC
- quality improvement in data curation and digital preservation
- initial focus: data as evidence for scholarly conclusions
- wider remit: scholarly communication and eLearning
- excellence in research and excellence in service
- working with repositories, rather than being one
- connecting communities via Associates Network
- universities and research institutes
- scientific data tradition and document tradition
- international and cross-sectoral