Title: Surveys of Digital Preservation Practices and Priorities: Findings of the MetaArchive Cooperative
1. Surveys of Digital Preservation Practices and
Priorities: Findings of the MetaArchive
Cooperative
- Katherine Skinner, Emory University
- Gail McMillan, Virginia Tech
NDIIPP Annual Partners Meeting, June 24, 2009
2. Two surveys, 158 participants
- Central aim: to better understand the terrain of
the emergent field of digital curation.
- How emergent is it?
- What trends are beginning to emerge within it?
3. Two surveys, 158 participants
- ETD
- December 2007-April 2008
- Universities and Colleges
- 96 Respondents
- Five Listservs
- Association of Research Libraries, Association of
Southeastern Research Libraries, Council of
Graduate Schools, Digital Library Federation, and
Electronic Theses and Dissertations
4. Two surveys, 158 participants
- Cultural Memory
- March 2009
- Archives, Museums, Libraries, Historical
Societies, Government Agencies
- 62 Respondents
- Three Listservs
- H-Museum, AA-L (Society of American Archivists),
and ERECS-L (Electronic Records Managers)
5. Survey questions addressed
- Who is collecting digital materials, what are
they collecting, and how are they storing these
materials?
- Who seeks to preserve their digital collections
and how do they want to preserve them?
- What are the biggest barriers to preservation?
- What are the most desired offerings in
preservation?
6. Who is collecting and what are they collecting?
- Cultural Memory
- 98.4% are collecting
- Range 1 GB-20 TB, average 2 TB
- Average growth 540 GB/year
- Formats/Genres include text (83%), video (76%),
audio (75%), email (47%), databases (48%),
websites (41%), GIS material (36%), and scads
more
- Repository structures include home-grown (65%),
CONTENTdm (17%), Fedora (9%), DSpace (7%),
Access/Excel (6%), plus SRB, FileMaker, and 10
others
7. Who is collecting and what are they collecting?
- ETDs
- 80% accept ETDs; 40% only accept ETDs
- Range 22-60 GB, average 41 GB
- Average growth 4.5 GB/year
- Formats/Genres include images (92%),
applications (89%), audio (79%), text (64%),
video (52%), and other (15%)
- Repository structures include DSpace (31%),
ETD-db (15%), Fedora (5%), Eprints (2%), as well
as locally developed solutions (34%) and
vendor-based solutions: bepress (6%), DigiTool
(6%), ProQuest (6%), and CONTENTdm (6%).
8. Formats (ETD and Cultural Memory)
- ETD
- .ppt
- .qt
- .tif
- .xml
- .wav
- .png
- .pdf
- .mpg
- .mp3
- .aif
- .avi
- .doc
- .gif
Cultural Memory: Textual documents, Databases, Still
images, Video, Audio, GIS, Websites, Email, Computer
games, Science data, Publications, Presentation
materials
.html .jpg .mov .dwt .xls .csv .zip .mix .snd .tex
.txt .midi .exe .jar
JP2 .ps
9. Platforms (ETD and Cultural Mem.)
- ETDdb
- Eprints
- Fedora
- DSpace
- Archimede
- bepress/Digital Commons
- CONTENTdm
- Cybertesis
- Dias
- DigiTool
- DLXS
- Proquest
MS Access Excel SRB ResCarta Augias-data Cumulus
CollectiveAccess Windows Explorer iRODS Filesystem
ArchivalWare FileMaker Pro iTunes
Documentum Fez Millennium Online
Catalog OhioLINK Oracle Sesame VTLS Vital Past
Perfect ANCS MINISIS CDs/DVDs In House
10. Structure (ETD and Cultural Mem.)
- Cultural Memory
- subject (33)
- collection (35)
- format (21)
- date (10)
- department (10)
- creator (8)
- funder (4)
- some Cultural Memory respondents selected
multiple ways
- ETD
- All in one directory (28)
- Date (26)
- Departments, Authors, or Disciplines (26)
- Access-level labels (7)
- Don't know (13)
11. Who is collecting and what are they collecting?
- Variation is the theme
- Infrastructures
- Data Structures
- Presents preservation challenges, to be sure!
12. Who seeks preservation and how do they want to
preserve?
- Readiness is low
- Most institutions are not even backing up
- Dearth of preservation plans and policies
- Desire is high
- Want training
- Want independent assessments
- Want to manage their own digital preservation
solutions
13. Who seeks preservation and how do they want to
preserve?
- Cultural Memory
- Only 50% back up 100% of their digital holdings
- Only 19% report having in-house expert
knowledge in digital preservation
- 79% have NO preservation plan
- 55% have NO written policies
- ETDs
- 95% are engaging SOME backup strategies
- 72% have NO preservation plan
14. Who seeks preservation and how do they want to
preserve?
- Cultural Memory
- 83% will develop policies in the next 3 years
- 90% cited interest in participating in a
community-based digital preservation solution
- Only 30% cited interest in third-party vendor
offerings, even at a reasonable cost
- ETDs
- 70% have experience with/knowledge of LOCKSS
- 92% cited interest in participating in an
NDLTD-supported LOCKSS-based ETD archive
15. Who seeks preservation and how do they want to
preserve?
- CMOs are engaging actively with the idea of digital
preservation
- High level of knowledge about community-based
approaches to digital preservation
- Outsourcing is not the top choice of institutions
as they pursue digital preservation; they would
rather participate in it themselves
16. What are the biggest barriers to preservation?
- Growth of digital collection
- Backups. NOT
- File formats
- Platforms
- Structures. NOT
- Lack of documented policies, procedures
17. What are the threats identified by our survey
respondents?
18. What are the most desired preservation offerings?
- Training provided by professional organizations
- Independent study/assessment
- Local courses in computer or digital technology
- Hire staff with digital knowledge and experience
- Hire consultants
- Training provided by vendors
19. The MetaArchive Cooperative
- The most effective preservation strategies
incorporate
- replication of content
- geographically distributed
- secure locations
- private network of trusted partners
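The value of replicating content across several sites can be illustrated with a small fixity audit. The sketch below is only an illustration under stated assumptions: it is not the LOCKSS polling protocol or MetaArchive's actual software, and the site names, sample content, and the `audit_replicas` helper are all hypothetical. It hashes each copy and flags any site whose copy disagrees with the majority.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(replicas: Dict[str, bytes]) -> Tuple[str, List[str]]:
    """Compare copies held at different sites (hypothetical helper).

    Returns the digest held by the largest number of sites and a sorted
    list of sites whose copy differs from it. Ties are not resolved in
    any special way; a real system would need a policy for them.
    """
    digests = {site: sha256_of(content) for site, content in replicas.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(site for site, d in digests.items() if d != majority_digest)
    return majority_digest, damaged


# Hypothetical example: three partner sites hold the same file,
# and one copy has suffered a single-character corruption.
replicas = {
    "site_a": b"thesis-2008.pdf contents",
    "site_b": b"thesis-2008.pdf contents",
    "site_c": b"thesis-2008.pdf c0ntents",
}
majority, damaged = audit_replicas(replicas)
print(damaged)
```

With at least three independent copies, a damaged replica can be both detected and repaired from the agreeing sites, which is one reason replication and geographic distribution appear together in the list above.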
20. Desirable Preservation Service
- Cooperative preservation network
- Standards
- Training Best practices, inc. technical
- Model policies
- Conversion or migration services
- Preservation services provided by third-party
vendors
- Access services
21. Conclusion
- Calf-Path Syndrome
- Idiosyncratic, ad-hoc data storage structures
- Increasingly difficult remediation
- MASH triage
- Survey documented narratives
- Outreach
- Offer help to those adrift in cyberspace
- Through collaboration there are cost-effective
and strong strategies that can protect cultural
memories
22. Questions?
- Katherine Skinner
- katherine.skinner_at_emory.edu
- Gail McMillan
- gailmac_at_vt.edu