Defining File Format Obsolescence: a risky journey

1
Defining File Format Obsolescence: a risky journey
  • David Pearson
  • APSR Project Manager
  • National Library of Australia
  • dapearso_at_nla.gov.au

2
The Problem
  • Format obsolescence is potentially a major
    problem for every repository manager. This is
    particularly true given:
  • Ever-increasing volume of digital material.
  • Plethora of file formats.
  • Dynamic nature of computing environments.
  • Rapid and unpredictable drivers that cause
    formats to become obsolete.
  • High business value of the specific content of
    some digital materials or collections can result
    in policies that mandate that access be
    maintained to this data for extended periods of
    time.
  • Repository managers need help to manage the
    quantity and diversity of file formats and their
    obsolescence risks.

3
AONS II
  • The APSR project objective for 2006 was to:
  • refine the Automatic Obsolescence Notification
    System (AONS) developed in an earlier stage of
    APSR, to a platform-independent downloadable tool
    that automatically provides information from
    authoritative international registries to support
    decisions on preservation action required to
    retain access to information resources stored in
    repositories.
  • However:
  • The international target registries could not
    provide machine-harvestable risk metrics.
  • In the context of AONS II, we had to come up with
    another way of quantifying file format risk.

4
Precedents
  • A number of different paradigms have informed our
    thinking on the nature of File Format
    Obsolescence:
  • The performance model developed by the National
    Archives of Australia.
  • The view-path model developed by the Koninklijke
    Bibliotheek (National Library of the
    Netherlands).

5
File Format Obsolescence
  • There are two predominant factors which may
    impede the retrieval of digital information.
    Access to:
  • The physical storage medium.
  • The logical file content.

(Dinosaurs, media and image courtesy of National
Archives of Australia).

6
Some initial thoughts on obsolescence
  • We are not making judgments about which formats
    should be used.
  • Similarly, we are not making judgments based on
    how hard a format will be to deal with once
    preservation action is needed.
  • We should not only look for indicators of
    obsolete formats, but also obsolescence in
    formats.
  • Risk is about the impending loss of the means of
    providing access.
  • The same format may well have different levels of
    obsolescence risk in different repositories.
  • It is perfectly reasonable to take into account
    that there may be more than one means of
    providing access to a file.
  • The purpose of obsolescence risk assessment is to
    inform decisions about the need to take action.
  • We are not about to be overwhelmed by the
    juggernaut of technical change.

7
  • The risk assessment questions must:
  • seek answers that will indicate the likely stage
    of obsolescence for a file format (in a specific
    real world repository).
  • As a consequence of having to cater for
    potentially thousands of possible file formats,
    the questions need to be generic and somewhat
    simplistic.
  • The questions still aim to allow a repository
    owner to build specific risk profiles of an
    individual file format.
  • The risk questions are classified into two
    general groups:
  • Community questions (which should be answerable
    by reference to the digital preservation community).
  • Repository view-path questions (which relate
    specifically to an individual environment and
    depend on the sustained availability of
    combinations of software and hardware).
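
The two-group classification above can be sketched as a small data structure. This is only an illustration, not the AONS II implementation: the class names, keys and question wording below are hypothetical, with the wording loosely paraphrased from the indicators named in this presentation.

```python
from dataclasses import dataclass
from enum import Enum


class Group(Enum):
    """The two general groups of risk questions."""
    COMMUNITY = "community"
    REPOSITORY = "repository view-path"


@dataclass(frozen=True)
class Question:
    key: str      # short identifier (hypothetical)
    text: str     # generic wording, applicable to any format
    group: Group


# Illustrative questions only; the real question set is not reproduced here.
QUESTIONS = [
    Question("supported", "Is rendering of the format currently supported?",
             Group.COMMUNITY),
    Question("local_view_path", "Is a workable view-path available locally?",
             Group.REPOSITORY),
]


def build_profile(format_name, answers):
    """Sort the answers for one format into the two question groups."""
    profile = {"format": format_name,
               Group.COMMUNITY.value: {},
               Group.REPOSITORY.value: {}}
    for question in QUESTIONS:
        if question.key in answers:
            profile[question.group.value][question.key] = answers[question.key]
    return profile


profile = build_profile("ExampleFormat 1.0",
                        {"supported": True, "local_view_path": False})
```

Keeping the questions generic and attaching the answers to a named format is one way to obtain a format-specific risk profile from a question set that has to cover thousands of formats.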

8
Community Questions
  • At a community level, the questions assume
    certain generic information might serve as useful
    indicators:
  • The current level of support for rendering the
    format.
  • How long it has been since the format version was
    first released.
  • How many versions have been released since that
    time.
  • The range of view-paths that could be used for
    acceptable presentation of content.
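
As a rough sketch, the four community indicators above could be recorded per format version and turned into simple warning flags. The field names and, in particular, the thresholds (10 years, 3 later versions, a single view-path) are assumptions chosen for illustration; the presentation does not define any numeric cut-offs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CommunityIndicators:
    """Community-level facts about one format version (hypothetical fields)."""
    format_name: str
    supported_renderers: int   # renderers that are currently supported
    first_released: date       # release date of this format version
    later_versions: int        # versions released since then
    view_paths: int            # known view-paths giving acceptable presentation


def community_flags(ind, today, max_age_years=10, max_later_versions=3):
    """Return human-readable warnings; the thresholds are illustrative assumptions."""
    flags = []
    if ind.supported_renderers == 0:
        flags.append("no currently supported renderer")
    age_years = (today - ind.first_released).days / 365.25
    if age_years > max_age_years:
        flags.append(f"released more than {max_age_years} years ago")
    if ind.later_versions > max_later_versions:
        flags.append(f"superseded by more than {max_later_versions} versions")
    if ind.view_paths <= 1:
        flags.append("at most one known view-path")
    return flags


old = CommunityIndicators("ExampleFormat 1.0", 0, date(1995, 1, 1), 5, 1)
new = CommunityIndicators("ExampleFormat 6.0", 4, date(2005, 1, 1), 0, 3)
old_flags = community_flags(old, today=date(2007, 1, 1))
new_flags = community_flags(new, today=date(2007, 1, 1))
```

The flags are indicators only: they point to a format that may deserve a closer look and do not by themselves decide that preservation action is required.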

9-17
Step 1 - Community Information Questions
18
Repository Questions
  • At a local repository level, the questions assume
    that it is possible for a repository manager to
    determine whether required view-paths for access
    are locally available and workable.
  • Other issues where subjective judgments may be
    needed include:
  • Decisions about how much notice is needed in
    order to take manageable action.
  • The degree of rendering difficulty that the
    repository owner and users are willing to bear.
  • The degree of loss that is acceptable.
  • What constitutes a base format unlikely to
    require repeated assessment (because it can be
    expected to be readable in all expected computing
    environments).
  • Whether there may be other sources of information
    worth checking for indications of a looming
    accessibility problem.
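
The local availability check can be illustrated with a minimal sketch in which a view-path is treated as a combination of software and hardware components, and a path counts as workable only if every component is available in the repository. The component names and the set-based representation are assumptions made for this example.

```python
def workable_view_paths(view_paths, locally_available):
    """Return the view-paths whose every component is available locally."""
    return [path for path in view_paths if set(path) <= locally_available]


# Two hypothetical view-paths for the same format: (renderer, OS, hardware).
view_paths = [
    ("ExampleViewer 2.0", "ExampleOS 5", "x86 PC"),
    ("OpenViewer 1.3", "ExampleOS 7", "x86 PC"),
]

# What two hypothetical repositories actually have on hand.
repository_a = {"ExampleViewer 2.0", "ExampleOS 5", "x86 PC"}
repository_b = {"ExampleOS 7", "x86 PC"}

paths_a = workable_view_paths(view_paths, repository_a)  # one workable path
paths_b = workable_view_paths(view_paths, repository_b)  # no workable path
```

The same format is accessible in one repository and not in the other, which is why the repository questions have to be answered locally rather than by the community.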

19-25
Step 2 - Collection/Repository Information
Questions
26
Some further thoughts on obsolescence
  • Some interesting points have already arisen in
    trying to apply the questions:
  • The approach tries to identify the need for a
    decision to take preservation action. We take
    action in order to regain or maintain access.
  • A file format is heading for obsolescence when a
    large part of the community of users cannot
    access it, or have decided to move content away
    from it.
  • Obsolescence may begin with inconvenience to
    users and end in the digital black hole of loss.
  • Not all file formats are created equal.
  • Open source renderers are a good thing, but they
    may not obviate the need to take preservation
    action.
  • Not all repository environments are maintained
    equally.

27
Next Steps?
  • The usefulness of these questions depends on
    there being a community to share the output.
  • Next steps:
  • Develop questions further into an acceptable
    standard (with partners).
  • Develop and quantify risk metrics (machine- and
    human-harvestable).
  • Develop automated workflows (usable by any
    application).
  • Develop a mechanism to share metrics (such as
    exporting results to a central web service or an
    external voting system).
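
One possible shape for metrics that are both machine- and human-harvestable is a plain JSON record per assessment. The sketch below only serialises such a record; the field names are hypothetical, no agreed standard exists yet, and how the record would reach a shared service is left open.

```python
import json


def export_assessment(repository, format_name, answers):
    """Serialise one assessment as indented JSON text (hypothetical layout)."""
    record = {
        "repository": repository,
        "format": format_name,
        "answers": answers,
    }
    # sort_keys gives a stable layout that is easy to compare between exports.
    return json.dumps(record, indent=2, sort_keys=True)


text = export_assessment(
    "example-repository",
    "ExampleFormat 1.0",
    {"rendering_supported": True, "local_view_paths": 2},
)
```

Because the output is ordinary JSON, any application able to parse it could consume the same export, and a person can still read it directly.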

28
Questions?
  • (Dinosaurs, media and image courtesy of National
    Archives of Australia).