Title: Standards Showcase: PREMIS (Preservation metadata)
1Standards Showcase: PREMIS (Preservation metadata)
- Rebecca Guenther, Library of Congress
- ALA Annual 2006
- LC booth presentation
- June 24-25, 2006
2Overview
- What is preservation metadata?
- Background
- PREMIS work
- Survey
- Data dictionary
- Features of the data dictionary
- Implementing PREMIS
- Future
3Digital preservation: advances, remaining challenges
- Groups around the world and conferences continue to make significant progress in raising awareness about the digital preservation imperative
- Gradual shift in focus from articulating the problem to solving it
- Not so much "Why is digital preservation important?" anymore; rather, "What must be done to achieve preservation objectives?"
- Many practical challenges in implementing reliable, sustainable digital preservation programs
- One key implementation challenge: preservation metadata
4Preservation metadata includes
Preservation Metadata
Content
- Provenance: Who has had custody/ownership of the digital object?
- Authenticity: Is the digital object what it purports to be?
- Preservation Activity: What has been done to preserve the digital object?
- Technical Environment: What is needed to render and use the digital object?
- Rights Management: What IPR must be observed?
- Makes digital objects self-documenting across time: 10 years on, 50 years on, forever!
5PREMIS background
- Pre-2002: various preservation metadata element sets released
- Different scopes, purposes, underlying models/assumptions
- No international standard; little consolidation of expertise/best practice
- June 2002: Preservation Metadata Framework
- International working group (jointly sponsored by OCLC, RLG)
- Comprehensive, high-level description of types of information constituting preservation metadata
- Used OAIS reference model as starting point
- Set of prototype preservation metadata elements
- Consensus-based foundation for developing formal preservation metadata specifications, but not an off-the-shelf, ready-to-implement solution
- Post-2002: Needed implementable preservation metadata, with guidelines for application and use, relevant to a wide range of digital preservation systems and contexts
- Motivated formation of PREMIS Working Group
6PREMIS Working Group
- Preservation metadata: key component of sustainable digital preservation
- June 2003: OCLC, RLG sponsored international working group
- PREMIS = Preservation Metadata: Implementation Strategies
- Objective
- Define implementable, core preservation metadata, with guidelines/recommendations for management and use
- Membership
- More than 30 experts from 5 countries: libraries, museums, archives, government agencies, private sector
- Co-Chairs: Priscilla Caplan (FCLA), Rebecca Guenther (LC)
7Membership
- Priscilla Caplan, FCLA (Chair)
- Rebecca Guenther, LC (Chair)
- Michael Alexander, British Library
- George Barnum, GPO
- Charles Blair, U. of Chicago
- Olaf Brandt, U. of Göttingen
- Adam Farquhar, British Library
- David Gewirtz, Yale
- Kevin Glavash, MIT/Dspace
- Cathy Hartman, U. of N. Texas
- Helen Hodgart, British Library
- Nancy Hoebelheinrich, Stanford
- Roger Howard/Sally Hubbard, Getty Museum
- Pam Kircher, OCLC
- John Kunze, Calif. Digital Library
- Brian Lavoie, OCLC liaison
- Robin Dale, RLG liaison
- Vicky McCarger, LA Times
- Jerry McDonough, NYU/METS
- Evan Owens, JSTOR
- Erin Rhodes, NARA
- Madi Solomon, Walt Disney Co.
- Angela Spinazze, ATSPIN
- Gunter Waibel, RLG
- Lisa Weber, NARA
- Robin Wendler, Harvard
- Hilde van Wijngaarden, KB
- Andrew Wilson, NAA
8Advisory Committee
- Howard Besser, UCLA
- Liz Bishoff, OCLC (via Colorado Digitization Program)
- Gerard Clifton, National Library of Australia
- Gail Hodge, CENDI
- Steve Knight, National Library of New Zealand
- Maggie Jones, Digital Preservation Coalition
- Nancy McGovern, Cornell
- Cliff Morgan, Wiley UK
- Richard Rinehart, U. of California, Berkeley
9Survey Report
- September 2004: Implementing Preservation Repositories for Digital Materials: Current Practice and Emerging Trends in the Cultural Heritage Community
- Survey of existing and planned digital repositories
- Mission, content, funding, preservation policies/strategies, take-up of OAIS, access mechanisms, and more
- Use of metadata to support repository processes, functions, policies; types of metadata collected; metadata storage/management practices
- 50 responses
- 28 libraries, 7 archives, 3 museums, and 11 other
- 13 different countries; 45 from U.S.
- 38 in planning; 33 development; 46 production
- Snapshot of current practices and emerging trends related to managing preservation metadata in digital archiving systems
- Variety of preservation contexts, institution types, and domains
10Survey findings
- Little experience with digital preservation
- Most didn't have an active preservation strategy
- Many not yet in production
- Cannot assess adequacy of metadata
- Lack of common vocabulary and conceptual framework
- Informed by OAIS reference model
- Difference of opinion as to meaning of OAIS compliance
- Metadata
- Many recording rights, provenance, technical, administrative, descriptive and structural
- Most repositories serve goals of both preservation and access
11PREMIS Data Dictionary
- May 2005: Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group
- 237-page report includes
- PREMIS Data Dictionary 1.0
- Accompanying report (context, data model, assumptions)
- Special topics, glossary, usage examples
- Set of XML schemas to support implementation
- Data Dictionary: comprehensive, practical resource for implementing preservation metadata in digital archiving systems
- Comprehensive view of information requirements needed to support digital preservation
- Based on deep pool of institutional experiences in setting up and managing operational capacity for digital preservation
- Builds on previous work
12From theory to practice
[Diagram labels: Preservation Metadata Requirements; Digital Archiving Systems; Framework; OAIS; PREMIS Data Dictionary]
13Winner 2005 Digital Preservation Award
14Some guiding principles and assumptions
- Implementable, core, preservation metadata
- Preservation metadata: maintain viability, renderability, understandability, authenticity, identity in a preservation context
- Core: What most preservation repositories need to know to preserve digital materials over the long term
- Implementable: rigorously defined; supported by usage guidelines/recommendations; emphasis on automated workflows
- Implementation neutral
- No assumptions on specific implementation
- Promote flexibility/interoperability
- Focus on semantic units: what you need to know (implementation-neutral) vs. metadata elements: how you record it (implementation-specific)
- Information that needs to be recoverable from the digital archiving system, independent of local implementation
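The semantic unit vs. metadata element distinction can be illustrated with a minimal sketch. Only the semantic unit name eventType comes from the Data Dictionary; the two storage layouts, the key name evt_type_col, and the XML element names are hypothetical local implementations invented for this example. The point is that the same information is recoverable from either layout.

```python
import xml.etree.ElementTree as ET

# Two hypothetical local implementations recording the same semantic unit.
# The key name and the XML element names are made up for illustration.
relational_row = {"evt_type_col": "migration"}
xml_record = "<record><event><type>migration</type></event></record>"


def event_type_from_row(row: dict) -> str:
    """Recover eventType from the hypothetical key/value layout."""
    return row["evt_type_col"]


def event_type_from_xml(xml_text: str) -> str:
    """Recover eventType from the hypothetical XML layout."""
    root = ET.fromstring(xml_text)
    return root.findtext("event/type")


# Both systems answer the same question despite different encodings.
recovered = {
    "eventType (system A)": event_type_from_row(relational_row),
    "eventType (system B)": event_type_from_xml(xml_record),
}
print(recovered)
```

Neither layout is prescribed by PREMIS; what matters in this sketch is that each system can export the semantic unit on request.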
15Scope of data dictionary
- Implementation independent
- Descriptive metadata out of scope
- Technical metadata applying to all or most format types
- Media or hardware details are limited
- Business rules are essential for working repositories, but not covered
- Rights information for preservation actions, not access
16PREMIS data model
Intellectual Entities
Rights
Agents
Objects
Events
17Sample Data Dictionary entry
18Semantic units pertaining to objects
- objectIdentifier
- preservationLevel
- objectCategory
- objectCharacteristics
- creatingApplication
- originalName
- storage
- environment
- signatureInformation
- relationship
- linkingEventIdentifier
- linkingIntellectualEntityIdentifier
- linkingPermissionStatementIdentifier
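As a rough sketch of how a repository might hold a few of these semantic units in memory, the Python dataclass below uses field names taken from the list above. The class name PremisObject, the helper unrecorded_units, the choice of which fields are optional, and all sample values are assumptions made for illustration, not rules from the Data Dictionary.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class PremisObject:
    """Hypothetical in-memory record; field names follow the semantic units."""
    objectIdentifier: str
    objectCategory: str
    preservationLevel: Optional[str] = None
    originalName: Optional[str] = None
    creatingApplication: Optional[str] = None
    linkingEventIdentifier: List[str] = field(default_factory=list)


def unrecorded_units(obj: PremisObject) -> List[str]:
    """List the semantic units that hold no value yet (None or empty list)."""
    return [f.name for f in fields(obj) if getattr(obj, f.name) in (None, [])]


# Sample values are illustrative only.
obj = PremisObject(
    objectIdentifier="obj-001",
    objectCategory="file",
    originalName="report.pdf",
)
print(unrecorded_units(obj))
```

A report like this could feed an automated workflow that flags objects whose metadata is still incomplete.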
19Semantic units pertaining to Events
- eventIdentifier
- eventType
- eventDateTime
- eventDetail
- eventOutcome
- eventOutcomeDetail
- linkingAgentIdentifier
- linkingObjectIdentifier
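To show how the Event semantic units might be populated by an automated workflow, here is a minimal sketch that records a checksum comparison as an event. The dictionary keys are the semantic units listed above; the function name record_fixity_event, the identifier strings, and the values chosen for eventType, eventDetail and eventOutcome are illustrative assumptions rather than a vocabulary taken from this presentation.

```python
import hashlib
from datetime import datetime, timezone


def record_fixity_event(event_id: str, object_id: str, agent_id: str,
                        content: bytes, expected_sha256: str) -> dict:
    """Recompute a SHA-256 digest and describe the check as an Event record."""
    actual = hashlib.sha256(content).hexdigest()
    ok = actual == expected_sha256
    return {
        "eventIdentifier": event_id,
        "eventType": "fixity check",  # illustrative value
        "eventDateTime": datetime.now(timezone.utc).isoformat(),
        "eventDetail": "SHA-256 digest recomputed and compared",
        "eventOutcome": "success" if ok else "failure",
        "eventOutcomeDetail": "" if ok else f"expected {expected_sha256}, got {actual}",
        "linkingAgentIdentifier": agent_id,    # who or what ran the check
        "linkingObjectIdentifier": object_id,  # which object was checked
    }


# Hypothetical identifiers and content for demonstration.
content = b"example bitstream"
stored_digest = hashlib.sha256(content).hexdigest()
good_event = record_fixity_event("evt-001", "obj-001", "agent-checker", content, stored_digest)
bad_event = record_fixity_event("evt-002", "obj-001", "agent-checker", content, "0" * 64)
print(good_event["eventOutcome"], bad_event["eventOutcome"])
```

The two linking identifiers are what tie the event back to an Agent and an Object in the data model.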
20Semantic units pertaining to Agents
- agentIdentifier
- agentName
- agentType
21Semantic units pertaining to Rights
- permissionStatement
- permissionStatementIdentifier
- relatedObject
- grantingAgent
- grantingAgreement
- permissionGranted
- act
- restriction
- termOfGrant
- permissionNote
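The following sketch suggests how a repository could consult permission statements before carrying out a preservation action. The key names come from the semantic units above, but the nesting (act, restriction and termOfGrant grouped under permissionGranted), the start/end shape of termOfGrant, the function name is_act_permitted, and all sample values are assumptions made for this example.

```python
from datetime import date

# Hypothetical permission statement; structure and values are illustrative.
statements = [
    {
        "permissionStatementIdentifier": "perm-001",
        "relatedObject": ["obj-001"],
        "grantingAgent": "agent-publisher",
        "permissionGranted": [
            {"act": "replicate", "restriction": [],
             "termOfGrant": {"start": date(2006, 1, 1), "end": None}},
            {"act": "migrate", "restriction": ["no more than 3 copies"],
             "termOfGrant": {"start": date(2006, 1, 1), "end": date(2010, 12, 31)}},
        ],
    }
]


def is_act_permitted(statements: list, object_id: str, act: str, on: date) -> bool:
    """Return True if some statement grants `act` for the object on that date."""
    for statement in statements:
        if object_id not in statement["relatedObject"]:
            continue
        for grant in statement["permissionGranted"]:
            term = grant["termOfGrant"]
            started = term["start"] <= on
            not_ended = term["end"] is None or on <= term["end"]
            if grant["act"] == act and started and not_ended:
                return True
    return False


print(is_act_permitted(statements, "obj-001", "migrate", date(2008, 6, 1)))
```

Restrictions are carried in the record but not evaluated here; a working repository would need its own business rules to interpret them.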
22Community interest
- As of March 2006
- 25,000 hits on Data Dictionary
- More than 100 subscribers to the PREMIS Implementers Group discussion list
- PREMIS Data Dictionary: product of collaboration and consensus
- PREMIS membership reflects variety of institutions, domains, countries
- Multiplicity of perspectives promotes applicability in multiplicity of contexts
- Digital preservation is a shared problem; this invites shared solutions
- Data Dictionary useful to any institution or organization committed to the long-term preservation of digital materials
23PREMIS Maintenance Activity
- Permanent Web presence, hosted by Library of Congress
- Centralized destination for information, announcements, and other PREMIS-related resources
- Discussion list for PREMIS implementers (PIG list)
- Coordinate future revisions of Data Dictionary and XML schema
- Editorial committee being established to guide development and revisions
- http://www.loc.gov/standards/premis/
24Current activities
- Documenting errata and proposed revisions to Data Dictionary (feedback through PIG list)
- http://www.loc.gov/standards/premis/changes.html
- PREMIS Implementers Registry
- http://www.loc.gov/standards/premis/premis-registry.html
- Consultancies, etc.
- Rights issues for digital preservation (Karen Coyle)
- PREMIS implementation guidelines and recommendations (Deborah Woodyard-Robinson)
- PREMIS-to-OAIS mapping (Brian Lavoie)
- PREMIS on the road
- Digital Curation Center PREMIS workshop (July 17-18, Glasgow)
- Repository workshop at National Library of Australia (Aug. 31)
- Investigating workshops in US
25Going forward
- Establish Editorial committee
- First revision of Data Dictionary
- Work with other initiatives (e.g., METS, Z39.87) to integrate PREMIS with existing standards, technologies, best practices
- Contribute preservation metadata resources to digital preservation community that are
- Openly available
- Oriented toward practical implementation
- Supported by a long-term commitment
- Tools
26Some implementers
- MathArc (Germany): a joint project funded by NSF (Cornell) and SUB Göttingen (DFG) to build a distributed archive for mathematical journals, distributed between two archives to keep information redundant.
- DAITSS (Florida): a preservation repository for the use of the libraries of the public universities of Florida. Uses a locally developed software application (DAITSS), which implements most of the PREMIS data elements.
- Ex Libris (DigiTool): an enterprise solution for the management of digital assets in libraries and academic environments, consisting of a number of modules, each designed to address different needs, functions, and workflows pertaining to the life cycle of a digital object
- For more information see
- http://www.loc.gov/premis/premis-registry.html
27Conclusion
- PREMIS Data Dictionary provides critical piece of reliable digital preservation infrastructure comprised of technology, standards, and best practice
- PREMIS Data Dictionary is a building block with which effective, sustainable digital preservation strategies can be implemented
- PREMIS Data Dictionary tightly focused on implementation
- Practical implementation was guiding principle in all discussions
- Developed tools to support implementation, released with Data Dictionary
- Further work, with encouragement for international participation and tools development, is ongoing
- Unglamorous but necessary infrastructure!
28URLs, etc.
- PREMIS Maintenance Activity
- http://www.loc.gov/standards/premis/
- PREMIS Working Group
- http://www.oclc.org/research/projects/pmwg/
- Data Dictionary for Preservation Metadata: Final Report of the PREMIS Working Group
- http://www.oclc.org/research/projects/pmwg/premis-final.pdf
- Please send project information to Implementers Registry and join the PIG list!