DEVELOPING AN ISO REFERENCE MODEL FOR AN OPEN ARCHIVAL INFORMATION SYSTEM OAIS Presentation to Socie - PowerPoint PPT Presentation

Provided by: LouReichan
Transcript and Presenter's Notes


1
ISO Reference Model for an Open Archival
Information System (OAIS): Tutorial Presentation
Don Sawyer, NASA/National Space Science Data Center (NSSDC)
Lou Reich, Computer Sciences Corporation (CSC)
Library of Congress, June 13, 2003
2
Outline
  • History
  • Reference Model overview
  • Some Applications
  • Follow-on Activities
  • Producer-Archive Ingest Methodology Abstract
    Standard
  • Standard Submission Information Package
  • Archive Certification

3
NASA Role
  • National Space Science Data Center
  • NASA's first digital archive
  • Experienced many technology changes since 1966
  • Consultative Committee for Space Data Systems
  • International group of space agencies
  • Developed variety of science discipline-
    independent standards
  • Became the working body for ISO TC 20/SC 13 about
    1990
  • TC 20: Aircraft and Space Vehicles
  • SC 13: Space Data and Information Transfer Systems

4
Initial Archive Standards Proposal
  • ISO suggested that SC 13 should develop archive
    standards
  • Address data used in conjunction with space
    missions
  • Address intermediate and indefinite long term
    storage of digital data

5
Response
  • Response to Consultative Committee for Space Data
    Systems (CCSDS) and ISO TC 20/SC 13
  • No framework widely recognized for developing
    specific digital archive standards
  • Begin by developing a Reference Model to
    establish common terms and concepts
  • Ensure broad participation, including traditional
    archives
  • (Not restricted to space communities; all
    participation is welcome!)
  • Focus on data in electronic forms, but recognize
    that other forms exist in most archives
  • Follow up with additional archive standards
    efforts as appropriate

6
What is a Reference Model?
  • A framework
  • for understanding significant relationships among
    the entities of some environment, and
  • for the development of consistent standards or
    specifications supporting that environment.
  • A reference model
  • is based on a small number of unifying concepts
  • is an abstraction of the key concepts, their
    relationships, and their interfaces both to each
    other and to the external environment
  • may be used as a basis for education and
    explaining standards to a non-specialist.

7
Organizational Approach
  • Organize US contribution under a framework with
    NASA lead
  • Established liaison with Federal Geographic Data
    Committee (FGDC) and National Archives and
    Records Administration (NARA)
  • Agency archives and users must be represented in
    this process
  • An Open process
  • Important to stimulate dialogue with broad
    archive/user communities
  • Results of US and International workshops put on
    WEB
  • Support e-mail comments/critiques
  • Broad international workshops also held
  • UK and France
  • Issue resolution at ISO/Consultative Committee
    for Space Data Systems international workshops

8
Technical Approach
  • Investigate other Reference Models.
  • ISO Seven Layer Communications Reference Model
  • ISO Reference Model for Open Distributed
    Processing
  • ISO TC211 Reference Model for Geomatics
  • Define what is meant by archiving of data
  • Break archiving into a few functional areas
    (e.g., ingest, storage, access, and preservation
    planning)
  • Define a set of interfaces between the functional
    areas
  • Define a set of data classes for use in Archiving
  • Choose formal specification techniques
  • Data flow diagrams for functional models and
    interfaces
  • Unified Modeling Language (UML) for data classes

9
Results
  • Reference Model targeted to several categories of
    reader
  • Archive designers
  • Archive users
  • Archive managers, to clarify digital preservation
    issues and assist in securing appropriate
    resources
  • Standards developers
  • Adopted terminology that crosses various
    disciplines
  • Traditional archivists
  • Scientific data centers
  • Digital libraries

10
Reference Model Status
  • Already widely adopted as starting point in
    digital preservation efforts
  • Digital libraries (e.g., Netherlands National
    Library)
  • Traditional archives (e.g., US National Archives)
  • Scientific data centers (e.g., National Space
    Science Data Center)
  • Commercial Organizations (e.g., Aerospace
    Industries Association preservation working team)
  • Published as final CCSDS standard (Blue Book)
    available from
  • http://www.ccsds.org/documents/650x0b1.pdf
  • Recently published as a final ISO standard: ISO
    14721:2003

11
Reference Model for an Open Archival Information
System: Technical Overview
12
Open Archival Information System (OAIS)
  • Open
  • Reference Model standard(s) are developed using a
    public process and are freely available
  • Information
  • Any type of knowledge that can be exchanged
  • Independent of the forms (i.e., physical or
    digital) used to represent the information
  • Data are the representation forms of information
  • Archival Information System
  • Hardware, software, and people who are
    responsible for the acquisition, preservation and
    dissemination of the information

13
Document Organization
  • Introduction
  • Purpose and Scope, Applicability, Rationale, Road
    Map for Future Work, Document Structure, and
    Definitions of Terms
  • OAIS Concepts and Responsibilities
  • High level view of OAIS functionality and
    information models
  • OAIS external environment
  • Minimum responsibilities to become an OAIS
  • Detailed Models
  • Functional model descriptions and information
    model perspectives
  • Preservation perspectives
  • Media migration, compression, format conversions,
    and access service preservation
  • Archive Interoperability
  • Criteria to distinguish types of cooperation
    among archives
  • Annexes
  • Scenarios of existing archives, compatibility
    with other standards

14
Purpose, Scope, and Applicability
  • Framework for understanding and applying concepts
    needed for long-term digital information
    preservation
  • Long-term is long enough to be concerned about
    changing technologies
  • Starting point for model addressing non-digital
    information
  • Provides set of minimal responsibilities to
    distinguish an OAIS from other uses of archive
  • Framework for comparing architectures and
    operations of existing and future archives
  • Basis for development of additional related
    standards
  • Addresses a full range of archival functions
  • Applicable to all long-term archives and those
    organizations and individuals dealing with
    information that may need long-term preservation
  • Does NOT specify an implementation

15
Model View of an OAIS Environment
  • Producer is the role played by those persons, or
    client systems, who provide the information to be
    preserved
  • Management is the role played by those who set
    overall OAIS policy as one component in a broader
    policy domain
  • Consumer is the role played by those persons, or
    client systems, who interact with OAIS services
    to find and acquire preserved information of
    interest

16
OAIS Responsibilities
  • Negotiates and accepts Information from
    information producers
  • Obtains sufficient control to ensure long-term
    preservation
  • Determines which communities (designated) need to
    be able to understand the preserved information
  • Ensures the information to be preserved is
    independently understandable to the Designated
    Communities
  • Follows documented policies and procedures which
    ensure the information is preserved against all
    reasonable contingencies
  • Makes the preserved information available to the
    Designated Communities in forms understandable to
    those communities

17
OAIS Information Definition
  • Information is always expressed (i.e.,
    represented) by some type of data
  • Data interpreted using its Representation
    Information yields Information
  • Information Object preservation requires clear
    identification and understanding of the Data
    Object and its associated Representation
    Information
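
The statement that data interpreted using its Representation Information yields information can be shown with a few bytes. This is a minimal sketch, not the model's definition: the class and function names are hypothetical, and Representation Information is reduced to a `struct` format string (structure) plus a text label (semantics).

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    bits: bytes


@dataclass(frozen=True)
class RepresentationInformation:
    """Hypothetical, heavily simplified Representation Information."""
    struct_format: str  # structure: how the bits map to a data type
    meaning: str        # semantics: what the resulting value stands for


def interpret(data, rep):
    """Data interpreted using its Representation Information yields information."""
    (value,) = struct.unpack(rep.struct_format, data.bits)
    return {"value": value, "meaning": rep.meaning}


data = DataObject(bits=b"\x00\x00\x01\x02")
big_endian = RepresentationInformation(">I", "photon count (big-endian unsigned 32-bit)")
little_endian = RepresentationInformation("<I", "photon count (little-endian unsigned 32-bit)")

print(interpret(data, big_endian))     # value 258
print(interpret(data, little_endian))  # same bits, a different value
```

The same four bytes give different values under different Representation Information, which is why the Data Object and its Representation Information have to be preserved together.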

18
Information Package Definition
  • An Information Package is a conceptual container
    holding two types of information
  • Content Information
  • Preservation Description Information (PDI)

19
Information Package Variants
  • Submission Information Package
  • Negotiated between Producer and OAIS
  • Sent to OAIS by a Producer
  • Archival Information Package
  • Information Package used for preservation
  • Includes complete set of Preservation Description
    Information (PDI) for the Content Information
  • Dissemination Information Package
  • Includes part or all of one or more Archival
    Information Packages
  • Sent to a Consumer by the OAIS
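
One way to picture the three variants is as plain data types with two conversion steps. The sketch below makes several assumptions not found in the slides: the class layout, the `ingest` and `disseminate` function names, and the rule that an AIP is refused unless all four kinds of PDI named later in this tutorial (provenance, context, reference, fixity) are present.

```python
from dataclasses import dataclass

# The four kinds of PDI described later in the tutorial.
PDI_KINDS = ("provenance", "context", "reference", "fixity")


@dataclass
class SIP:
    content: bytes
    pdi: dict  # may be incomplete when the Producer submits it


@dataclass
class AIP:
    content: bytes
    pdi: dict  # complete set of PDI for the Content Information


@dataclass
class DIP:
    parts: list  # part or all of one or more AIPs


def ingest(sip, extra_pdi):
    """Build an AIP from a SIP, refusing it if any kind of PDI is missing."""
    pdi = {**sip.pdi, **extra_pdi}
    missing = [kind for kind in PDI_KINDS if kind not in pdi]
    if missing:
        raise ValueError(f"incomplete PDI, missing: {missing}")
    return AIP(sip.content, pdi)


def disseminate(aips, wanted_references):
    """Build a DIP holding the AIPs whose reference identifier was ordered."""
    return DIP([aip for aip in aips if aip.pdi["reference"] in wanted_references])


sip = SIP(b"image bytes", {"provenance": "instrument X", "reference": "id-1"})
aip = ingest(sip, {"context": "mission Y", "fixity": "checksum placeholder"})
dip = disseminate([aip], {"id-1"})
print(len(dip.parts))
```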

20
External Data Flow View
[Diagram: Producer, OAIS, and Consumer. The Consumer sends queries and orders to the OAIS and receives result sets.]
21
Detailed Models
  • Overview

22
Overview of Detailed Models
  • It was decided to do both a functional and an
    information model of the OAIS
  • Both models were tasked to
  • Use the models to better communicate OAIS
    Concepts
  • Use a well established, formal modeling technique
  • Stay as implementation independent as possible
  • Avoid detailed designs

23
Detailed Models
  • Information Model

24
General Principles
  • Define classes of information objects that
    illustrate information necessary to enable
    long-term storage and access to archives
  • The class definition should be implementation
    independent
  • Use a subset of Unified Modeling Language (UML)

25
UML Notation
26
Information Object
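[Diagram: an Information Object consists of a Data Object interpreted using Representation Information, which is itself interpreted using further Representation Information. A Data Object is either a Physical Object or a Digital Object; a Digital Object is made up of Bit Sequences.]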
27
Representation Information
  • The Representation Information accompanying a
    physical object, like a moon rock, may give
    additional meaning
  • It typically is a result of some analysis of the
    physically observable attributes of the rock
  • The Representation Information accompanying a
    digital object, or sequence of bits, is used to
    provide additional meaning.
  • It typically maps the bits into commonly
    recognized data types such as character, integer,
    and real and into groups of these data types.
  • It associates these with higher level meanings
    which can have complex inter-relationships that
    are also described

28
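Recursive Nature of Representation Information
  • Structure Information
  • Semantic Information
  • Other Representation Information

[Diagram: Representation Information comprises Structure Information, Semantic Information, and Other Representation Information. Semantic Information adds meaning to Structure Information, and Representation Information is itself interpreted using further Representation Information.]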
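
The recursion on this slide can be sketched in a few lines of Python. This is an illustration only, not part of the Reference Model: the `RepInfo` class, the `representation_network` function, and the sample names are hypothetical, and the sketch assumes the recursion stops at anything the Designated Community already understands (the `known` set).

```python
from dataclasses import dataclass, field


@dataclass
class RepInfo:
    """Hypothetical node: one piece of Representation Information."""
    name: str
    needs: list = field(default_factory=list)  # further RepInfo needed to interpret this one


def representation_network(rep, known):
    """Return the names of all Representation Information the archive must
    hold, following 'interpreted using' links until reaching something the
    Designated Community already understands (assumption: the `known` set)."""
    collected = []
    seen = set()

    def visit(node):
        if node.name in known or node.name in seen:
            return
        seen.add(node.name)
        collected.append(node.name)
        for dependency in node.needs:
            visit(dependency)

    visit(rep)
    return collected


# Hypothetical example: a table layout described by two documents.
ascii_def = RepInfo("ASCII")
english = RepInfo("English")
csv_spec = RepInfo("CSV format description", needs=[ascii_def, english])
dictionary = RepInfo("data dictionary", needs=[english])
table = RepInfo("table layout", needs=[csv_spec, dictionary])

print(representation_network(table, known={"ASCII", "English"}))
```

The less the community is assumed to know, the longer the list the archive has to preserve.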
29
Types of Information Used in OAIS
[Diagram: Information Object specializes into Content Information, Preservation Description Information, Packaging Information, Descriptive Information, and other types.]
30
Content Information
  • The information which is the primary object of
    preservation
  • An instance of Content Information is the
    information that an archive is tasked to
    preserve.
  • Deciding what is the Content Information may not
    be obvious and may need to be negotiated with the
    Producer
  • The Data Object in the Content Information may be
    either a Digital Object or a Physical Object
    (e.g., a physical sample, microfilm)

31
Preservation Description Information
  • Provenance Information
  • Describes the source of Content Information, who
    has had custody of it, what is its history
  • Context Information
  • Describes how the Content Information relates to
    other information outside the Information Package
  • Reference Information
  • Provides one or more identifiers, or systems of
    identifiers, by which the Content Information may
    be uniquely identified
  • Fixity Information
  • Protects the Content Information from
    undocumented alteration
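
The four kinds of PDI can be sketched as a record with a checksum for fixity. The names (`PDI`, `make_pdi`, `is_unaltered`) are hypothetical, and SHA-256 is an assumption: the Reference Model does not prescribe a fixity mechanism. A checksum detects undocumented alteration; it does not prevent it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PDI:
    provenance: str     # source, custody, and history of the Content Information
    context: str        # relation to information outside the package
    reference: str      # identifier for the Content Information
    fixity_sha256: str  # assumption: SHA-256 digest used as Fixity Information


def make_pdi(content, provenance, context, reference):
    """Create PDI for `content` (bytes), computing its fixity digest."""
    return PDI(provenance, context, reference, hashlib.sha256(content).hexdigest())


def is_unaltered(content, pdi):
    """True if `content` still matches the digest recorded in the PDI."""
    return hashlib.sha256(content).hexdigest() == pdi.fixity_sha256


content = b"plasma physics measurements"
pdi = make_pdi(content, "hypothetical instrument team", "part of mission data set", "id-0001")
print(is_unaltered(content, pdi))         # True
print(is_unaltered(content + b"!", pdi))  # False
```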

32
PDI Examples
33
Descriptive Information
  • Contain the data that serves as the input to
    documents or applications called Access Aids.
  • Access Aids can be used by a consumer to locate,
    analyze, retrieve, or order information from the
    OAIS.

34
Packaging Information
  • Information which, either actually or logically,
    binds and relates the components of the package
    into an identifiable entity on specific media
  • Examples of Packaging Information include tape
    marks, directory structures and filenames

35
OAIS Archival Information Package
[Diagram: Archival Information Package (AIP), with relationships labeled "delimited by", "derived from", and "further described by"]
  • Packaging Information: e.g., how to find Content
    Information and PDI on some medium
  • Package Description: e.g., information supporting
    customer searches for the AIP
  • Content Information: e.g., a hardcopy document; a
    document as an electronic file together with its
    format description; a scientific data set
    consisting of an image file, a text file, and a
    format descriptions file describing the other
    files
  • Preservation Description Information (PDI): e.g.,
    how the Content Information came into being, who
    has held it, how it relates to other information,
    and how its integrity is assured
36
AIP Types
  • Archival Information Unit (AIU) contains a single
    Data Object as the Content Object
  • Archival Information Collection (AIC) contains
    multiple AIPs in its Content Object
  • Each member of an AIC is an AIP containing
    Content Information and PDI
  • The AIC contains unique PDI on the collection
    process

[Diagram: an Archival Information Package is either an Archival Information Unit or an Archival Information Collection.]
37
Package Descriptions and Access Aids
  • Package Descriptions are needed by an OAIS to
    provide visibility and access to the OAIS
    holdings
  • Package Descriptions contain 1 or more Associated
    Descriptions which describe the AIP Content
    Information from the point of view of a single
    Access Aid
  • Some examples of Access Aids include
  • Finding Aids - assist the consumer in locating
    information of interest
  • Ordering Aids - allow the consumer to discover
    the cost of and order AIUs of interest
  • Retrieval Aids - enable authorized users to
    retrieve the AIU described by the Unit Descriptor
    from Archival Storage

38
Information Model Summary
  • Presented a model of information objects as
    containing data objects and representation
    objects
  • Classified information required for long-term
    archiving into 4 classes: Content Information,
    PDI, Packaging Information, and Descriptive
    Information
  • Described how these classes would be aggregated
    and related in an AIP to fully describe an
    instance of Content Information
  • Presented information needed for Access, in
    addition to that needed for Long-term
    Preservation
  • Put the Access oriented structures in the context
    of the other data needed to operate an OAIS

39
Detailed Models
  • Functional View

40
General Principles
  • Highlight the major functional areas important to
    digital archiving
  • Use functional decomposition to clarify the range
    of functionality that might be encountered
  • Don't decompose beyond two levels to avoid
    becoming too implementation dependent
  • Provide a useful set of terms and concepts
  • Do not imply that all archives need to implement
    all the sub-functions
  • Identify some common services which are likely to
    be needed, and are assumed to be available, as
    underlying support

41
Common Services
  • Modern, distributed computing applications assume
    a number of supporting services
  • Examples of Common Services include
  • inter-process communication
  • name services
  • temporary storage allocation
  • exception handling
  • security
  • file and directory services

42
Open Archival Information System: Six Functional
Entities
[Diagram: the Producer and Consumer surround six functional entities (Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access), with Management alongside. Flows shown: SIP into Ingest; queries, result sets, orders, and DIP between Access and the Consumer.]
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package
43
Functional Entities In An OAIS
  • Ingest: This entity provides the services and
    functions to accept Submission Information
    Packages (SIPs) from Producers and prepare the
    contents for storage and management within the
    archive.
  • Archival Storage: This entity provides the
    services and functions for the storage,
    maintenance and retrieval of Archival Information
    Packages.
  • Data Management: This entity provides the
    services and functions for populating,
    maintaining, and accessing both descriptive
    information which identifies and documents
    archive holdings and internal archive
    administrative data.
  • Administration: This entity manages the overall
    operation of the archive system.
  • Preservation Planning: This entity monitors the
    environment of the OAIS and provides
    recommendations to ensure that the information
    stored in the OAIS remains accessible to the
    Designated User Community over the long term even
    if the original computing environment becomes
    obsolete.
  • Access: This entity supports consumers in
    determining the existence, description, location
    and availability of information stored in the
    OAIS and allows consumers to request and
    receive information products.
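
As a rough illustration of how four of these entities hand work to each other, here is a toy in-memory sketch. Everything in it is hypothetical (the `MiniArchive` class, its method names, the dictionary-shaped SIP, the `aip-N` identifiers); the Reference Model does not specify an implementation. Administration and Preservation Planning are left out.

```python
from dataclasses import dataclass, field


@dataclass
class MiniArchive:
    """Toy archive: Ingest, Archival Storage, Data Management, and Access only."""
    archival_storage: dict = field(default_factory=dict)  # AIP id -> content
    data_management: dict = field(default_factory=dict)   # AIP id -> descriptive information

    def ingest(self, sip):
        """Ingest: accept a SIP and prepare its contents for storage and management."""
        aip_id = f"aip-{len(self.archival_storage) + 1}"
        self.archival_storage[aip_id] = sip["content"]      # stored by Archival Storage
        self.data_management[aip_id] = sip["description"]   # catalogued by Data Management
        return aip_id

    def query(self, term):
        """Access: answer a Consumer query with a result set of AIP ids."""
        term = term.lower()
        return sorted(
            aip_id
            for aip_id, description in self.data_management.items()
            if term in description.lower()
        )

    def order(self, aip_id):
        """Access: fill a Consumer order with a (very simplified) DIP."""
        return {"aip_id": aip_id, "content": self.archival_storage[aip_id]}


archive = MiniArchive()
archive.ingest({"content": b"\x01\x02", "description": "Plasma physics measurements"})
archive.ingest({"content": b"rock", "description": "Moon rock analysis"})
print(archive.query("moon"))
```

Keeping the content store and the descriptive catalogue in separate dictionaries mirrors the split between Archival Storage and Data Management.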

44
Ingest Data Flow Diagram
45
Preservation Planning
46
Preservation Perspectives

47
Migration Context
[Diagram: in the Data Management and Access view, a Content Information Identifier is linked to an AIP Identifier through a Descriptive Information Mapping. In the Archival Storage view, the AIP Identifier is linked through an Archival Storage Mapping to the Packaging Information, Preservation Description Information, and Content Information.]
48
Digital Migration
  • Digital Migration is defined to be the transfer
    of digital information, while intending to
    preserve it, within the OAIS.
  • Focus on preservation of the full information
    content
  • New information implementation replaces the old
  • OAIS has full control and responsibility over all
    aspects of the transfer

49
Migration Motivators
  • Motivators driving digital migrations
  • Media Decay
  • Often this is superseded by escalating media
    drive maintenance costs
  • Increased Cost Effectiveness
  • More cost-effective media types with higher
    volumes and lower drive maintenance costs
  • New User/Consumer Service Requirements
  • New formats more compatible with users'
    technology and applications
  • Proprietary software evolution
  • New software versions used to upgrade formats
    of the information objects being preserved

50
Digital Migration Approaches
  • Four primary types of digital migration in
    response to motivators, ordered by increasing
    risk of information loss
  • Refreshment
  • Media replacement with no bit changes
  • Replication
  • No change to Packaging Information or Content
    Information bits
  • Repackaging
  • Some bit changes in Packaging Information
  • Transformation
  • Reversible: Bit changes in Content Information
    are reversible by an algorithm
  • Non-reversible: Bit changes in Content
    Information are not reversible by an algorithm
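
A short Python sketch of the migration types above, ordered by risk. It is a simplification under stated assumptions: `Migration`, `classify`, and `is_reversible` are hypothetical names, packaging and content are reduced to byte strings, and bit-identical copies are reported as Replication because the bits alone cannot distinguish Refreshment from Replication. `is_reversible` checks one sample only, so it illustrates the idea rather than proving reversibility.

```python
import zlib
from enum import IntEnum


class Migration(IntEnum):
    """Values rise with the risk of information loss, as on the slide."""
    REFRESHMENT = 1
    REPLICATION = 2
    REPACKAGING = 3
    TRANSFORMATION = 4


def classify(old_packaging, new_packaging, old_content, new_content):
    """Classify a migration by which bits changed (all arguments are bytes)."""
    if old_content != new_content:
        return Migration.TRANSFORMATION
    if old_packaging != new_packaging:
        return Migration.REPACKAGING
    # Bit-identical copy: Refreshment or Replication. This sketch cannot
    # tell them apart from the bits and reports the riskier of the two.
    return Migration.REPLICATION


def is_reversible(original, forward, backward):
    """True if `backward` recovers `original` from `forward(original)`."""
    return backward(forward(original)) == original


content = b"moon rock sample 42"
print(classify(b"tar", b"zip", content, content).name)
print(is_reversible(content, zlib.compress, zlib.decompress))
```

Lossless compression is a reversible transformation in this sense; lower-casing text is not, because the original capitalization cannot be recovered.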

51
Access Preservation
  • Effective access to digital information requires
    the use of software
  • Application Programming Interfaces (APIs) may be
    cost-effectively maintained across time by an
    OAIS when
  • API is not too complex
  • API is applicable to a wide variety of AIUs
  • API source code may be ported to new environments
  • Extensive testing is needed to ensure against
    information loss
  • Preservation of executables by full emulation of
    underlying hardware is problematic
  • Hard to know what is the information being
    preserved
  • May not be possible to fully emulate associated
    devices

52
Archive Interoperability
53
Archive Interoperability Motivators
  • Users of multiple OAIS archives have reasons to
    wish for some interoperability or cooperation
    among the OAISs.
  • Consumers
  • Common finding aids to aid in locating
    information over several OAIS archives
  • Common Package Descriptor schema for access
  • Common DIP schema for dissemination, or a single
    global access site.
  • Producers
  • common SIP schema for submission to different
    archives
  • a single depository for all their products.
  • Managers
  • Cost reduction through sharing of expensive
    hardware; increasing the uniformity and quality of
    user interactions with the OAIS

54
Categories of Archive Interactions
  • Independent: No knowledge by one OAIS of
    standards implemented at another
  • Cooperating: Potentially common submission
    standards, and common dissemination standards,
    but no common access. One archive may make
    subscription requests for key data at the
    cooperating archive
  • Federated: Access to all federated OAISs is
    provided through a common set of access aids that
    provide visibility into all participating OAISs.
    Global dissemination and Ingest are options
  • Shared resources: An OAIS in which Management has
    entered into agreements with other OAISs to
    share resources to reduce cost. This requires
    various standards internal to the archive (such
    as ingest-storage and access-storage interface
    standards), but does not alter the community's
    view of the archive

55
Federated Archives
56
3 Levels of Autonomy in Associated Archives
  • No interactions and therefore no association
  • Associations that maintain your autonomy. You
    have to do certain things to participate, but you
    can leave the association without notice or
    impact to you.
  • Associations that bind you by contract. To
    change the nature of this association you will
    have to re-negotiate the contract. The amount of
    autonomy retained depends on how difficult it is
    to negotiate the changes.

57
Reference Model Summary
  • Reference model is to be applicable to all
    digital archives, and their Producers and
    Consumers
  • Identifies a minimum set of responsibilities for
    an archive to claim it is an OAIS
  • Establishes common terms and concepts for
    comparing implementations, but does not specify
    an implementation
  • Provides detailed models of both archival
    functions and archival information
  • Discusses OAIS information migration and
    interoperability among OAISs

58
Some Applications

59
Selected OAIS Usage Examples
  • Networked European Deposit Library (NEDLIB)
  • Royal Library of the Netherlands
  • IBM is developing an OAIS-like implementation
  • British National Library
  • Asking IBM to extend its OAIS-like
    implementation
  • Research Libraries Group (RLG) and Online Computer
    Library Center (OCLC)
  • Developed an OAIS based approach to trusted
    repositories
  • Web page to track OAIS implementation
    efforts/issues
  • http://www.rlg.org/longterm/oais.html
  • Library of Congress
  • Hosting METS XML data packaging approach
  • National Digital Information Infrastructure
    Preservation Program (NDIIPP)

60
Selected OAIS Usage Examples-2
  • InterPARES
  • Body of National Archives from many countries,
    adopted OAIS as a starting point for their
    modeling work
  • France set up a working group within ARISTOTE
  • interested in archive of digital information,
    including libraries and Dept of Justice.
  • http://www.aristote.asso.fr/ (in French)
  • astonishing unifying role from OAIS reference
    model
  • System for Preservation and Access to Data and
    Information (SIPAD)
  • French space agency plasma physics archive used
    the OAIS as a basis for design
  • National Space Science Data Center (NSSDC)
  • Evolving our archive using OAIS as a basis for a
    new architecture

61
Selected OAIS Usage Examples-3
  • National Archives and Records Administration
    contracted preservation work with San Diego
    Supercomputer Center
  • Both parties claimed use of the OAIS RM saved
    several weeks of effort in the specification of
    the task
  • Similar experiences between
  • National Library of France and French space
    agency (CNES) representatives
  • National Center for Supercomputing Applications
    HDF format developers and DNA researchers
  • Life Sciences Archive developer and micro-gravity
    researchers
  • United States Department of Agriculture and
    digital preservation experts

62
Follow-on Activities
  • Research Libraries Group has established a web
    page to track OAIS implementation efforts and
    issues
  • http://www.rlg.org/longterm/oais.html
  • CCSDS Certification Coordination Function
  • Will track and summarize various archive
    certification efforts
  • Will attempt to extract high-level
    model/checklist
  • RLG is organizing a group to establish
    certification approaches

63
Follow-on Activities - 2
  • Standard Submission Information Package
  • Just getting started under CCSDS Archive Ingest
    Working Group
  • CCSDS/ISO Producer-Archive Interface Methodology
    Standard
  • Provides framework for Producer/Archive
    interactions
  • Identifies steps and types of information
    exchanged during the negotiation
  • May be used as a checklist by archives

64
CCSDS/ISO Producer-Archive Interface Methodology
Abstract Standard Overview
65
Model View of an OAIS Environment
  • Producer is the role played by those persons, or
    client systems, who provide the information to be
    preserved
  • Management is the role played by those who set
    overall OAIS policy as one component in a broader
    policy domain
  • Consumer is the role played by those persons, or
    client systems, who interact with OAIS services
    to find and acquire preserved information of
    interest

66
Purpose
  • Standardize the relationships and interactions
    between an information Producer and an Archive
  • Abstract Model
  • Terms and Concepts
  • Define a methodology
  • Allows all actions to be structured within this
    context
  • Covers times from first contact by Producer until
    all information objects are received by the
    Archive
  • Provide guidance on the specialization of the
    methodology to meet the needs of classes of
    archives, or of specific archives

67
Scope
  • Identifies different phases in process of
    transferring information between Producer and
    Archive
  • Defines objectives of each phase
  • Actions to be carried out during each phase
  • Expected results from end of each phase
  • General framework able to be re-used for all
    processes related to Producer-Archive
    interactions
  • Basis for development of additional related
    standards
  • Basis for development of software tools to assist
    in different stages of the interactions between
    Producer and Archive

68
Applicability
  • All archives conformant to OAIS Reference Model
  • May be of interest to archives not conformant to
    OAIS Reference Model
  • Relevant to archives holding physical as well as
    digital materials

69
Methodology Conformance
  • When methodology is used by an archive for a
    particular archive project (acquiring a set of
    information)
  • Usage conforms when all actions in this standard
    have been considered and implemented as
    appropriate
  • When methodology has been specialized or extended
    to be a Community Standard, it conforms when
  • All actions have been considered and incorporated
    appropriately, AND
  • Methodology for creating the Community Standard
    has addressed the various work phases defined in
    section 4

70
Document Organization
  • Section 1
  • Purpose, Scope, Applicability, Conformance
  • Section 2
  • Overview of methodology, players, their
    relationships, activity phases
  • Section 3
  • Detailed analysis of the four phases defined
  • Preliminary definition phase
  • Formal phase
  • Transfer Phase
  • Validation Phase
  • Section 4
  • Work stages leading to a Community Standard
  • Annex: Overview of OAIS Reference Model
    applicable to this standard
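
The four phases listed under Section 3 can be pictured as a simple progression. The sketch below is only an illustration: `Phase` and `advance` are hypothetical names, and it assumes the phases run strictly in the order listed, with a project moving on only when the results of the current phase are accepted. The standard itself may allow iteration that this sketch ignores.

```python
from enum import Enum


class Phase(Enum):
    PRELIMINARY = "Preliminary definition phase"
    FORMAL = "Formal phase"
    TRANSFER = "Transfer phase"
    VALIDATION = "Validation phase"


# Assumption: the phases are worked through in the order they are listed.
ORDER = list(Phase)


def advance(current, results_accepted):
    """Return the next phase, the same phase if its results were not
    accepted, or None once the Validation phase has completed."""
    if not results_accepted:
        return current
    position = ORDER.index(current)
    if position + 1 < len(ORDER):
        return ORDER[position + 1]
    return None


phase = Phase.PRELIMINARY
while phase is not None:
    print(phase.value)
    phase = advance(phase, results_accepted=True)
```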

71
Overview Schematic
72
Preliminary Phase Outline
  • First contact
  • Preliminary definition, feasibility and assessment
  • Information to be archived
  • Digital objects and standards applied
  • Quantification
  • Object references
  • Security conditions
  • Legal and contractual aspects
  • Transfer operations
  • Validation
  • Schedule
  • Permanent impact on archive
  • Summary of cost, risks
  • Critical points
  • Preliminary agreement
73
Example Actions
74
Formal Phase Outline
75
Transfer Phase Actions
76
Validation Phase Actions
77
Creating a Community Producer-Archive Standard
  • Examples of communities creating such a standard
  • National or international standards bodies
  • National or international organizations
  • An individual archive to guide interactions with
    its Producers
  • Work stages to be considered
  • Definition of terminology
  • Information model for community
  • Standards and tools available or required
  • Address actions defined in the Abstract Standard
  • Best practices
  • Broad definition of the community
  • Include diverse representation on the writing
    committee
  • Publicize and seek comments from the community
  • Submit to standards body as appropriate

78
Status
  • Track versions of the document from
  • http://ssdoo.gsfc.nasa.gov/nost/isoas/us/overview.html
  • Register to participate
  • Version R-1, April 2003 just released for
    formal review
  • Review Site
  • http://www.ccsds.org/review/RPA305/uRPA305.html
  • Document
  • http://www.ccsds.org/review/RPA305/651x0r1.pdf