Title: Common Use Cases for Preservation Metadata
1 Common Use Cases for Preservation Metadata
- Deborah Woodyard-Robinson
- Digital Preservation Consultant
- deb_at_woodyard-robinson.com
- Long-term Repositories: Taking the Shock out of the Future
- APSR, NLA, Canberra, Australia, 31 August 2006
2 Use of use cases
- Broad use cases viewed from 2 angles
- Metadata must support the functions in a system
and the function (aim) of a system
A system (Repository) as a use case
Functions as use cases
3 PREMIS definitions
- Preservation Metadata: information a preservation repository uses to support the digital preservation process.
- Digital Preservation Process: functions to maintain viability, renderability, understandability, authenticity and identity of digital material in a preservation context
4 Digital information
- Step back further to understand the source of these functions
- Digital information exists on 3 levels
5
Intellectual (e.g. it's a photo of Deb dancing)
Conceptual (e.g. it's a file called IMGP0132.jpg)
0101010111 1001010010 0010011001
Physical (e.g. its a CD-ROM)
6
- The OAIS recognises these levels and discusses the functions needed to maintain them, as well as the information required to support these functions, i.e. the metadata.
7 OAIS metadata
- Packaging Information (i.e. how and where the bits are stored)
- Content Information including Representation Information (i.e. how to locate the bits and interpret the bits into data)
- Preservation Description Information including
- Reference Information
- Context Information
- Provenance Information
- Fixity Information
- (i.e. how to identify the data and interpret the data into information)
8
Intellectual: Preservation Description Information
Conceptual: Content Information including Representation Information
0101010111 1001010010 0010011001
Physical: Packaging Information
9
Intellectual: Preservation Description Information supports Understandability, Authenticity, Identity
Conceptual: Content Information including Representation Information supports Renderability
Physical: Packaging Information supports Viability
10 Maintain Renderability
- Actors: Customer, Staff
- System: Repository System
- Use cases: Monitor technology, Design preservation actions, Perform preservation actions, Supply renderable version
11 Use Case: Monitor technology
- Description
- Repository staff request or schedule a technology report.
- The system creates a report on the file formats, inhibitor types and technology required by repository contents.
- An external survey is conducted by the system based on report results and level of preservation required.
- The system registers endangered formats and technologies.
- The system surveys possible solutions in available registries.
- The system creates a report of findings for repository staff.
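The reporting steps in this use case could be sketched roughly as follows. This is a minimal illustration only: the `StoredObject` fields, the preservation level values ("full", "bit-level") and the `endangered_formats` set (standing in for the results of the external survey) are all hypothetical, not part of PREMIS or any real repository system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class StoredObject:
    # Hypothetical minimal record; field names are illustrative only.
    identifier: str
    file_format: str
    inhibitors: tuple = ()
    preservation_level: str = "full"


def technology_report(objects, endangered_formats):
    """Summarise formats and inhibitors, and list objects in endangered formats."""
    format_counts = Counter(obj.file_format for obj in objects)
    inhibitor_counts = Counter(i for obj in objects for i in obj.inhibitors)
    # Assumption: objects kept at "bit-level" only do not need to stay
    # renderable, so they are not reported as at risk.
    at_risk = sorted(
        obj.identifier
        for obj in objects
        if obj.file_format in endangered_formats
        and obj.preservation_level != "bit-level"
    )
    return {
        "formats": dict(format_counts),
        "inhibitors": dict(inhibitor_counts),
        "at_risk": at_risk,
    }


holdings = [
    StoredObject("obj-1", "WordStar 4"),
    StoredObject("obj-2", "JPEG", inhibitors=("password",)),
    StoredObject("obj-3", "WordStar 4", preservation_level="bit-level"),
]
report = technology_report(holdings, endangered_formats={"WordStar 4"})
print(report["at_risk"])
```

The sketch only works because each object carries its identifier, format, inhibitors and preservation level, which is the point of deriving metadata from the use case.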
12 Maintain Renderability
- Actors: Customer, Staff
- System: Repository System
- Use cases: Monitor technology, Design preservation actions, Perform preservation actions, Supply renderable version
- Supporting metadata required for these functions: Object Identifier, Preservation level, Format, Inhibitors, Environment, Relationships - structural
13 Maintain Viability
- Actors: Staff
- System: Repository System
- Use cases: Monitor storage media, Refresh media, Replicate on backup media, Replace media
- Supporting metadata: Content Location, Storage Medium
14 Maintain Understandability
- Actors: Customer, Staff
- System: Repository System
- Use cases: Record history and provenance, Maintain context, Understand an object
- Supporting metadata: Creation details, Original file name, Relationships - context, Rights, Events, Agents
15 Maintain Authenticity
- Actors: Customer
- System: Repository System
- Use cases: Check fixity, Apply fixity check, Read signature, Apply signature
- Supporting metadata: Fixity / check-sum details, Digital signature details
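A fixity check of the kind named in this use case can be illustrated with Python's standard `hashlib` module. This is a sketch under stated assumptions: the `fixity_metadata` dictionary layout and the function names are hypothetical, and digital signatures are not covered.

```python
import hashlib


def compute_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of the data using the named hash algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def check_fixity(data: bytes, fixity: dict) -> bool:
    """Compare a fresh digest with the one recorded in the fixity metadata."""
    return compute_digest(data, fixity["algorithm"]) == fixity["digest"]


content = b"example bitstream"

# "Apply fixity check": record the algorithm and digest at ingest
# (hypothetical metadata layout).
fixity_metadata = {"algorithm": "sha256", "digest": compute_digest(content)}

# "Check fixity": recompute later and compare with the recorded value.
print(check_fixity(content, fixity_metadata))         # unchanged content
print(check_fixity(content + b"!", fixity_metadata))  # altered content
```

Both the algorithm name and the digest have to be recorded, otherwise the check cannot be repeated later.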
16 Maintain Identity
- Actors: Customer
- System: Repository System
- Use cases: Apply unique identifier, Resolve unique identifier
- Supporting metadata: Object Identifier
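The two identity functions, applying and resolving a unique identifier, might look like this in a very small form. The `IdentifierRegistry` class and the storage paths are hypothetical; UUIDs are used purely for convenience and are not prescribed by the slides.

```python
import uuid


class IdentifierRegistry:
    """Hypothetical in-memory registry mapping identifiers to storage locations."""

    def __init__(self):
        self._locations = {}

    def apply_identifier(self, location: str) -> str:
        # Assign a new identifier that is unique within this registry.
        identifier = str(uuid.uuid4())
        self._locations[identifier] = location
        return identifier

    def resolve(self, identifier: str) -> str:
        # Look up the stored object's location; raises KeyError if unknown.
        return self._locations[identifier]


registry = IdentifierRegistry()
first = registry.apply_identifier("store/a/IMGP0132.jpg")
second = registry.apply_identifier("store/b/report.pdf")
print(registry.resolve(first))
```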
17 Application in a system
- Different repositories have different needs and will use different functions and therefore different metadata
- Mandatory if applicable
- Possible differences
- Handling objects at representation/file/bitstream level
- User community > Understandability details
- Authenticity
18 Application scenarios / 2 use cases
- Government record archives
- large volumes of government records to be archived, often under legislative obligation from electronic government initiatives
- mandated to preserve records, but also implementing specific retention schedules
- more influence over what the producers of records deposit
- authenticity is usually a vital aspect
19 Application scenarios / 2 use cases
- Private sector library (e.g. Wellcome Trust)
- very specific collection remit
- main users of the collection are internal to the organisation, therefore well defined user group and knowledge base
- interest in content only, can easily discard look and feel if not desired
- can normalise files to one format to manage if desired
20 Viability
- Government record archives
- Private sector library
- Requires
- Content Location
- Storage Medium
21 Renderability
- Government record archives
- Private sector library
- Requires
- Object Identifier
- Format details
- Inhibitors
- Environment details
- Relationships - structural
22 Understandability
- Government record archives
- Private sector library
- Requires
- Creation details
- Original file name
- Relationships - context
- Rights
- Events
- Agents
24 Authenticity
- Government record archives
- Private sector library
- Requires
- Fixity / check-sum details
- Digital signature details
26 Implementation
- PREMIS does not differentiate between what is required to be implicit or explicit
- Mandatory means "need to know" rather than "must exist as a metadata element"
- Record only if applicable, e.g. signature information is required only if signatures are used
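The "need to know" and "only if applicable" rules above can be expressed as a small check. This is a sketch, not a PREMIS conformance test: the element names, the `uses_signatures` flag and the `system_knowledge` set (things the system knows implicitly) are all hypothetical.

```python
def missing_information(obj: dict, system_knowledge: set) -> list:
    """List required information that is neither recorded nor implicitly known."""
    # Hypothetical baseline of things the repository needs to know.
    required = {"object_identifier", "format"}
    # Applicability rule: signature information only matters for signed objects.
    if obj.get("uses_signatures"):
        required.add("signature_information")
    # "Known" covers explicit object metadata and implicit system knowledge.
    known = set(obj.get("metadata", {})) | system_knowledge
    return sorted(required - known)


# Assumption: this repository normalises everything to one format,
# so the format is known without being recorded per object.
system_knowledge = {"format"}

unsigned = {"metadata": {"object_identifier": "obj-1"}, "uses_signatures": False}
signed = {"metadata": {"object_identifier": "obj-2"}, "uses_signatures": True}

print(missing_information(unsigned, system_knowledge))
print(missing_information(signed, system_knowledge))
```

The unsigned object passes even though it records no format element, while the signed object is flagged because signature information applies to it and is not known anywhere.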
27 Example: The National Archives, UK
- Identifier types are the same throughout the system so not explicit in metadata
- Storage media and location handled by system
- Relationships between representation, file and bitstream equivalents are implicit via the structure of data in the system.
- Format, inhibitor and environment information is/will be kept via PRONOM Unique Identifiers and the PRONOM registry
- i.e. format name, environment details etc. are known and explicitly recorded but not held with the object
- Levels of preservation are recorded in policies and retention schedules
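Keeping format details in a registry rather than with the object can be pictured as a simple lookup. In this sketch a plain dictionary stands in for the registry; it does not use any real PRONOM interface, and the identifier and registry entries are placeholders, not real PRONOM Unique Identifiers or records.

```python
# Hypothetical stand-in for an external format registry.
FORMAT_REGISTRY = {
    "fmt/placeholder-1": {
        "format_name": "Example Image Format 1.0",
        "environment": "Example Viewer 2.0",
    },
}

# The object's own metadata holds only the registry identifier.
stored_object = {"identifier": "obj-1", "format_id": "fmt/placeholder-1"}


def describe_format(obj: dict, registry: dict = FORMAT_REGISTRY) -> dict:
    """Resolve format details through the registry instead of the object."""
    details = registry.get(obj["format_id"])
    if details is None:
        raise KeyError(f"unknown format identifier: {obj['format_id']}")
    return details


print(describe_format(stored_object)["format_name"])
```

The format name and environment remain explicitly recorded and retrievable, but updating the registry entry updates what is known about every object that points to it.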
28 Summary
- Problem > solution > functions
- Functions > use cases
- Use cases > metadata (Reality check 1)
- However, do remember: the information you need to know may not need to be explicitly recorded in object metadata to be functional (Reality check 2)