Title: Reference Model for an Open Archival Information System (OAIS) And Submission Agreements
1Reference Model for an Open Archival Information System (OAIS) and Submission Agreements
- NOAA DSA TIM
- Donald Sawyer/NASA/GSFC
- 26-October 2005
2Topics (time permitting)
- OAIS Reference Model
- Producer-Archive Interface Methodology Abstract Standard
- Submission Information Package (SIP) standardization (separate presentation)
3OAIS Reference Model
- Consultative Committee for Space Data Systems
- International group of space agencies
- Developed variety of science discipline-independent standards
- Became working body for ISO TC 20/SC 13 about 1990
- TC20: Aircraft and Space Vehicles
- SC13: Space Data and Information Transfer Systems
- Ensured broad participation, including traditional archives
- (Not restricted to space communities; all participation was welcomed!)
4What is a Reference Model?
- A framework
- for understanding significant relationships among the entities of some environment, and
- for the development of consistent standards or specifications supporting that environment.
- A reference model
- is based on a small number of unifying concepts
- is an abstraction of the key concepts, their relationships, and their interfaces both to each other and to the external environment
- may be used as a basis for education and explaining standards to a non-specialist.
5Organizational Approach
- Organized US contribution under a framework with NASA lead
- Established liaison with Federal Geographic Data Committee (FGDC) and National Archives and Records Administration (NARA)
- Agency archives and users must be represented in this process
- An open process
- Important to stimulate dialogue with broad archive/user communities
- Results of US and international workshops put on the Web
- Supported e-mail comments/critiques
6Technical Approach 1
- Investigate other Reference Models.
- ISO Seven Layer Communications Reference Model
- ISO Reference Model for Open Distributed Processing
- ISO TC211 Reference Model for Geomatics
- Define what is meant by archiving of data
- Break archiving into a few functional areas
(e.g., ingest, storage, access, and preservation
planning)
7Technical Approach 2
- Define a set of interfaces between the functional areas
- Define a set of data classes for use in archiving
- Choose formal specification techniques
- Data flow diagrams for functional models and interfaces
- Unified Modeling Language (UML) for data classes
8Results 1
- Reference Model targeted to several categories of reader
- Archive designers
- Archive users
- Archive managers, to clarify digital preservation issues and assist in securing appropriate resources
- Standards developers
- Adopted terminology that crosses various disciplines
- Traditional archivists
- Scientific data centers
- Digital libraries
9Results 2
- Widely adopted as starting point in digital preservation efforts
- Digital libraries (e.g., Netherlands National Library)
- Traditional archives (e.g., US National Archives)
- Scientific data centers (e.g., National Space Science Data Center)
- Commercial organizations (e.g., Aerospace Industries Association preservation working team)
- Published as final CCSDS standard (Blue Book), available from
- http://www.ccsds.org/documents/650x0b1.pdf
- Published as a final ISO standard: ISO 14721:2003
10Purpose and Scope 1
- Framework for understanding and applying concepts needed for long-term digital information preservation
- Long-term is long enough to be concerned about changing technologies
- Also can be starting point for model addressing non-digital information
11Purpose and Scope 2
- Provides set of minimal responsibilities to distinguish an OAIS from other uses of "archive"
- Framework for comparing architectures and operations of existing and future archives
12Purpose and Scope 3
- Basis for development of additional related standards
- Addresses a full range of archival functions
- Ingest, Archival Storage, Data Management, Access, Preservation Planning, Administration
13Applicability
- Applicable to all long-term archives and those organizations and individuals dealing with information that may need long-term preservation
- Does NOT specify an implementation
14Conformance
- How does an archive conform?
- It discharges the set of minimal responsibilities
- It supports the basic information concepts that address a definition of information and types of information packages
- How do other documents conform?
- By using OAIS terms and concepts
15Who wants to conform to OAIS?
- All organizations that need to preserve digital information for extended periods
- To demonstrate a level of awareness of digital preservation needs
- Other standards and documents
- For effective communication and integration
16Open Archival Information System (OAIS)
- Information
- Any type of knowledge that can be exchanged
- Data are the representation forms of information
- Archival Information System
- Hardware, software, and people who are
responsible for the acquisition, preservation and
dissemination of the information
17View of an OAIS Environment
- Producer provides the information to be preserved
- Management sets overall OAIS policy
- Consumer seeks and acquires preserved information
of interest
[Figure: the OAIS (archive) with Producer, Consumer, and Management as the entities in its environment]
18OAIS Responsibilities 1
- Negotiates and accepts information from information producers
- Obtains sufficient control to ensure long-term preservation
- Determines which communities (designated) need to be able to understand the preserved information
19OAIS Responsibilities 2
- Ensures the information to be preserved is independently understandable to the Designated Communities
- Follows documented policies and procedures that ensure the information is preserved against all reasonable contingencies
- Makes the preserved information available to the Designated Communities in forms understandable to those communities
20OAIS Information Definition
- Information is always expressed (i.e., represented) by some type of data
- Data interpreted using its Representation Information yields Information
[Figure: a Data Object, interpreted using its Representation Information, yields an Information Object]
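The definition above can be sketched in a few lines of Python. This is only an illustration, not part of the OAIS standard: the `RepresentationInformation` class, its fields, the comma-separated layout, and the `interpret` function are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepresentationInformation:
    """Hypothetical, simplified Representation Information: how to read the bits."""
    encoding: str        # character encoding of the bytes
    field_names: tuple   # meaning of each comma-separated field
    units: dict          # unit for each field, where one applies


def interpret(data_object: bytes, rep_info: RepresentationInformation) -> dict:
    """Data interpreted using its Representation Information yields Information."""
    text = data_object.decode(rep_info.encoding)
    values = text.strip().split(",")
    if len(values) != len(rep_info.field_names):
        raise ValueError("data does not match the declared field layout")
    return {
        name: (value, rep_info.units.get(name))
        for name, value in zip(rep_info.field_names, values)
    }


# The bytes alone say nothing about what the two fields mean.
data_object = b"2005-10-26,21.5\n"
rep_info = RepresentationInformation(
    encoding="ascii",
    field_names=("date", "temperature"),
    units={"temperature": "degrees Celsius"},
)
information_object = interpret(data_object, rep_info)
print(information_object)
```

Without `rep_info`, a future reader would hold only the bytes; the archive therefore has to preserve both together.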
21Information Package Definition
[Figure: an Information Package containing Content Information and Preservation Description Information]
- An Information Package is a conceptual container holding two types of information
- Content Information
- Preservation Description Information (PDI)
22Content Information
- The information that is the original target of preservation
- Deciding what is the Content Information may not be obvious and may need to be negotiated with the Producer
- The Content Data Object in the Content Information may be either a Digital Object or a Physical Object (e.g., microfilm, a physical sample)
23Preservation Description Information (PDI) 1
- Reference Information
- Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified
- Provenance Information
- Describes the source of the Content Information, who has had custody of it, and what its history is
24Preservation Description Information (PDI) 2
- Context Information
- Describes how the Content Information relates to other information outside the Information Package
- Fixity Information
- Protects the Content Information from undocumented alteration
25Examples of PDI
- Reference
- Bibliographic description, persistent IDs
- Provenance
- Metadata on preservation process
- Context
- Pointers to related collections
- Fixity
- Digital signatures, checksums
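As a minimal sketch of the checksum example of Fixity Information, the Python below records a digest when content arrives and checks it again later. The choice of SHA-256 and the function names are assumptions for illustration; OAIS does not prescribe an algorithm.

```python
import hashlib


def compute_fixity(content: bytes) -> str:
    """Return a SHA-256 digest to store as Fixity Information."""
    return hashlib.sha256(content).hexdigest()


def verify_fixity(content: bytes, recorded_digest: str) -> bool:
    """True if the content still matches the digest recorded earlier."""
    return compute_fixity(content) == recorded_digest


original = b"Content Information as received from the Producer"
recorded = compute_fixity(original)

print(verify_fixity(original, recorded))                 # unchanged copy
print(verify_fixity(original + b" (edited)", recorded))  # altered copy
```

A checksum only detects an undocumented change; it does not prevent one or say who made it, which is where digital signatures and Provenance Information come in.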
26Information Package Variants
- Submission Information Package (SIP)
- Negotiated between Producer and OAIS
- Sent to OAIS by a Producer
- Archival Information Package (AIP)
- Information Package used for preservation
- Holds complete set of Preservation Description Information for the Content Information
- Dissemination Information Package (DIP)
- Includes part or all of one or more Archival Information Packages
- Sent to a Consumer by the OAIS
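One way to picture the three variants is as the same container at different stages. The sketch below is an assumption-laden illustration: the class names, the string-valued PDI fields, and the `to_aip` and `to_dip` helpers are hypothetical, and `to_dip` draws on a single AIP although a DIP may combine several.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PDI:
    """The four PDI categories; None means 'not supplied yet'."""
    reference: Optional[str] = None
    provenance: Optional[str] = None
    context: Optional[str] = None
    fixity: Optional[str] = None

    def missing(self) -> list:
        return [name for name, value in vars(self).items() if value is None]


@dataclass
class InformationPackage:
    kind: str        # "SIP", "AIP", or "DIP"
    content: bytes
    pdi: PDI = field(default_factory=PDI)


def to_aip(sip: InformationPackage, archive_pdi: PDI) -> InformationPackage:
    """Build an AIP from a SIP, filling PDI gaps with archive-supplied values."""
    merged = PDI(**{
        name: getattr(sip.pdi, name) or getattr(archive_pdi, name)
        for name in vars(sip.pdi)
    })
    if merged.missing():
        raise ValueError(f"AIP needs complete PDI; missing {merged.missing()}")
    return InformationPackage("AIP", sip.content, merged)


def to_dip(aip: InformationPackage, start: int, end: int) -> InformationPackage:
    """Build a DIP holding part (or all) of one AIP's content."""
    return InformationPackage("DIP", aip.content[start:end], aip.pdi)


sip = InformationPackage("SIP", b"observations 1..100", PDI(provenance="Instrument team"))
aip = to_aip(sip, PDI(reference="ID-0001", context="Part of a larger collection",
                      fixity="checksum recorded at ingest"))
dip = to_dip(aip, 0, 12)
print(aip.kind, aip.pdi.missing(), dip.kind, dip.content)
```

The point of the completeness check is that a SIP may arrive with partial PDI, while the package kept for preservation may not.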
27External Data Flow View
[Figure: external data flows linking the Producer and the Consumer to the OAIS; labeled flows: queries, result sets, orders]
28OAIS Archival Information Package
Archival Information Package (AIP)
- delimited by Packaging Information (e.g., how to find Content Information and PDI on some medium)
- Package Description is derived from the AIP (e.g., information supporting customer searches for the AIP)
- Content Information (e.g., a hardcopy document; a document as an electronic file together with its format description; a scientific data set consisting of an image file, a text file, and a format descriptions file describing the other files)
- Content Information is further described by Preservation Description Information (PDI) (e.g., how the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured)
29Packaging Information
- Information which, either actually or logically, binds and relates the components of the package into an identifiable entity on specific media
- Examples of typical Packaging Information include tar files, directory structures, filenames, and tape marks
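Taking the tar file example, the sketch below uses Python's standard `tarfile` module to bind Content Information and PDI into one identifiable entity. The member names and the `content/` and `pdi/` directory layout are hypothetical choices for the example, not something OAIS defines.

```python
import io
import tarfile


def build_package(members: dict) -> bytes:
    """Bind named components into a single tar archive held in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def list_package(package: bytes) -> list:
    """Return the member names, i.e. what the packaging lets a reader locate."""
    with tarfile.open(fileobj=io.BytesIO(package), mode="r") as archive:
        return archive.getnames()


package = build_package({
    "content/data.csv": b"2005-10-26,21.5\n",
    "content/format_description.txt": b"columns: date, temperature (deg C)\n",
    "pdi/provenance.txt": b"Delivered by the instrument team\n",
    "pdi/fixity.txt": b"checksum recorded at ingest\n",
})
print(list_package(package))
```

Here the tar structure and the file names are the Packaging Information; they carry no science content themselves but tell a reader where the Content Information and PDI sit.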
30Package Description
- Contains the data that serves as the input to documents or applications called Access Aids.
- Access Aids can be used by a Consumer to locate, analyze, retrieve, or order information from the OAIS.
31Functional Entities 1
- Ingest: This entity provides the services and functions to accept Submission Information Packages (SIPs) from Producers and prepare the contents for storage and management within the archive
- Archival Storage: This entity provides the services and functions for the storage, maintenance and retrieval of Archival Information Packages
- Data Management: This entity provides the services and functions for populating, maintaining, and accessing both descriptive information that identifies and documents archive holdings and internal archive administrative data.
32Functional Entities 2
- Administration: This entity manages the overall operation of the archive system
- Preservation Planning: This entity monitors the environment of the OAIS and provides recommendations to ensure that the information stored in the OAIS remains accessible to the Designated Community over the long term.
- Access: This entity supports Consumers in determining the existence, description, location and availability of information stored in the OAIS and allows Consumers to request and receive information products
33OAIS Functional Entities
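[Figure: OAIS functional entities. The Producer sends SIPs to Ingest. Ingest passes AIPs to Archival Storage and Descriptive Info to Data Management. Access takes queries and orders from the Consumer, returns result sets, draws on Descriptive Info from Data Management and AIPs from Archival Storage, and delivers DIPs. Preservation Planning and Administration complete the archive, with Management outside it.]
SIP = Submission Information Package; AIP = Archival Information Package; DIP = Dissemination Information Package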
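The data flows among Ingest, Archival Storage, Data Management, and Access can be sketched as a toy in-memory archive. Everything here is a simplifying assumption: the `ToyArchive` class, the dictionary-shaped SIP, and the keyword query are illustrative only, and Administration and Preservation Planning are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyArchive:
    """In-memory stand-in for the flows among four OAIS functional entities."""
    archival_storage: dict = field(default_factory=dict)  # AIP id -> content
    data_management: dict = field(default_factory=dict)   # AIP id -> descriptive info

    def ingest(self, sip: dict) -> str:
        """Ingest: accept a SIP, store an AIP, and register Descriptive Info."""
        aip_id = f"aip-{len(self.archival_storage) + 1}"
        self.archival_storage[aip_id] = sip["content"]
        self.data_management[aip_id] = sip["description"]
        return aip_id

    def query(self, keyword: str) -> list:
        """Access: answer a Consumer query with a result set from Data Management."""
        return [
            aip_id
            for aip_id, description in self.data_management.items()
            if keyword.lower() in description.lower()
        ]

    def order(self, aip_id: str) -> dict:
        """Access: fill a Consumer order by building a DIP from a stored AIP."""
        return {
            "description": self.data_management[aip_id],
            "content": self.archival_storage[aip_id],
        }


archive = ToyArchive()
archive.ingest({"description": "Sea surface temperature, 2004", "content": b"sst-data"})
archive.ingest({"description": "Ozone profiles, 2005", "content": b"ozone-data"})

result_set = archive.query("ozone")
dip = archive.order(result_set[0])
print(result_set, dip["description"])
```

Keeping descriptions and content in separate stores mirrors the split between Data Management and Archival Storage: queries never touch the stored packages, only orders do.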
34Submission Agreement
- Negotiated between Producer and Archive
- Identifies the SIPs to be submitted by the Producer
- May include mandatory requirements
- Not further expanded in the OAIS Reference Model
35Reference Model Summary
- Reference model is applicable to all digital archives, and their Producers and Consumers
- Establishes common terms and concepts for comparing implementations, but does not specify an implementation
- Identifies a minimum set of responsibilities for an archive to claim it is an OAIS
- Provides detailed models of both archival functions and archival information
- Also discusses OAIS information migration and interoperability among OAISs
36Producer-Archive Interface Methodology Abstract Standard (PAIMAS)
NOAA DSA TIM
C. Huc/CNES, D. Boucon/CNES-SILOGIC, D.M. Sawyer/NASA/GSFC, J.G. Garrett/NASA-Raytheon
37Why a new standard? Needs for standardization: problems
- The relations between archives and data Producers are rarely simple and easy
- nonconformity of received data,
- unclear and imprecise definition of the data to be delivered,
- failure to meet delivery schedule,
- late detection of errors in archived data,
- non-management of modifications.
- => Can be detrimental to archived information quality and the cost of the operation.
- Ever increasing diversity of the producers
- Data complexity
- Each project develops its own methodology on the basis of a process that is roughly the same from one project to another
- => Work duplicated, no generality, excessively high costs, etc.
38Methodology: Context
[Figure: the OAIS functional entities diagram, with the PAIMAS focus marked on the interface between the Producer and the archive]
SIP Submission Information Package
AIP Archival Information Package
DIP Dissemination Information Package
39Methodology: Description
- The archive project is broken down into 4 main phases
- Preliminary Phase,
- Formal Definition Phase,
- Transfer Phase,
- Validation Phase.
- Each phase has extensive action tables.
- Specialization for a community.
40Methodology: The phases relationships
Phase objectives and results:
- Preliminary Phase: define the information to be archived; identification and preliminary resources estimation. Result: Preliminary Agreement
- Formal Definition Phase: negotiate and develop the Submission Agreement (data to be delivered, complementary elements, schedule). Results: Dictionary, Formal model, Submission Agreement
- Transfer Phase: actual transfer of the objects. Result: Transferred object files
- Validation Phase: validate the transferred objects. Results: Validation agreement, Anomalies, Data ready to archive
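As a rough sketch of how the four phases chain through their results, the Python below reports which phase a project is in from the results delivered so far. The `Phase` enum, the `RESULTS` table, and the `current_phase` function are hypothetical; the result names come from the slide, and the loop in which Anomalies send objects back to the Producer is deliberately not modelled.

```python
from enum import Enum


class Phase(Enum):
    PRELIMINARY = 1
    FORMAL_DEFINITION = 2
    TRANSFER = 3
    VALIDATION = 4


# Results that close each phase (Anomalies are omitted: they reopen work
# rather than complete a phase).
RESULTS = {
    Phase.PRELIMINARY: {"Preliminary Agreement"},
    Phase.FORMAL_DEFINITION: {"Dictionary", "Formal model", "Submission Agreement"},
    Phase.TRANSFER: {"Transferred object files"},
    Phase.VALIDATION: {"Validation agreement", "Data ready to archive"},
}


def current_phase(delivered: set):
    """Return the first phase whose results are incomplete, or None when done."""
    for phase in Phase:
        if not RESULTS[phase] <= delivered:
            return phase
    return None


delivered = {"Preliminary Agreement", "Dictionary"}
print(current_phase(delivered))
```

With only the Preliminary Agreement and the Dictionary in hand, the project is still in the Formal Definition Phase, since the Formal model and Submission Agreement are outstanding.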
41Methodology: Preliminary phase context
[Figure: the Producer and the Archive take part in the Preliminary Phase, which produces the Preliminary Agreement]
42Methodology: Preliminary phase sub-phases
43Methodology: Formal Definition Phase context
[Figure: the Preliminary Agreement feeds the Formal Definition Phase, which produces the Dictionary, Data Model, and Submission Agreement]
44Methodology: Formal Definition Phase sub-phases and action table
Id | Formal Definition Phase: Submission Agreement | Involves
F-36 | Draw up the Submission Agreement: | Producer and/or Archive
- information to be transferred (e.g., SIP contents, SIP packaging, data models, Designated Community, legal and contractual aspects)
- transfer definition (e.g., specification of the Data Submission Sessions)
- validation definition
- change management (e.g., conditions for modification of the agreement, for breaking the agreement)
- schedule (submission timetable).
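The five parts of the Submission Agreement listed for action F-36 lend themselves to a simple completeness check. The sketch below is illustrative only: the key names are derived from the five items, while the dictionary layout and every value in `draft_agreement` are made-up examples, not a PAIMAS-defined format.

```python
# Section names taken from the five items listed for action F-36.
REQUIRED_SECTIONS = (
    "information_to_be_transferred",
    "transfer_definition",
    "validation_definition",
    "change_management",
    "schedule",
)


def missing_sections(agreement: dict) -> list:
    """Return the required sections that are absent or left empty."""
    return [name for name in REQUIRED_SECTIONS if not agreement.get(name)]


# Hypothetical draft; all values are invented for illustration.
draft_agreement = {
    "information_to_be_transferred": {
        "sip_contents": ["data files", "format descriptions"],
        "sip_packaging": "tar",
        "designated_community": "atmospheric scientists",
    },
    "transfer_definition": {"data_submission_sessions": "one per month"},
    "validation_definition": {"checks": ["checksum match", "file count"]},
    "change_management": {},  # not negotiated yet
    "schedule": {"first_submission": "2006-01"},
}

print(missing_sections(draft_agreement))
```

A check like this catches an agreement that is silent on change management before the Transfer Phase begins, which is one of the problems the methodology is meant to prevent.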
45Methodology: Transfer Phase
Inputs: Data Model of object files to deliver; Schedule
- Actual transfer of the objects
- carry out the transfer test
- manage the transfer
Output: Transferred object files
46Methodology: Validation Phase
Input: Transferred object files
- Validate the transferred objects
- carry out the validation test
- manage the validation
Outputs: Data ready to archive; Anomalies and Validation acknowledgement (returned to the Producer)
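A minimal sketch of a validation step, assuming the Submission Agreement supplies a manifest of expected file names and SHA-256 digests. PAIMAS does not define these checks; the `validate_transfer` function, the manifest format, and the report keys are hypothetical.

```python
import hashlib


def validate_transfer(manifest: dict, received: dict) -> dict:
    """Compare received object files against the agreed manifest.

    manifest: file name -> expected SHA-256 hex digest
    received: file name -> file content as bytes
    """
    anomalies = []
    ready = []
    for name, expected_digest in manifest.items():
        if name not in received:
            anomalies.append(f"{name}: missing")
        elif hashlib.sha256(received[name]).hexdigest() != expected_digest:
            anomalies.append(f"{name}: checksum mismatch")
        else:
            ready.append(name)
    for name in received:
        if name not in manifest:
            anomalies.append(f"{name}: not in the agreement")
    return {
        "acknowledged": not anomalies,   # validation acknowledgement
        "anomalies": anomalies,          # reported back to the Producer
        "ready_to_archive": ready,
    }


manifest = {
    "a.dat": hashlib.sha256(b"alpha").hexdigest(),
    "b.dat": hashlib.sha256(b"beta").hexdigest(),
}
received = {"a.dat": b"alpha", "b.dat": b"BETA", "c.dat": b"gamma"}
report = validate_transfer(manifest, received)
print(report)
```

The three report fields correspond to the phase outputs: an acknowledgement when nothing is wrong, a list of anomalies otherwise, and the objects that are ready to archive.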
47Specialization
- Adapt the generic standard to a particular community (which can range from an international organization to a simple archive service)
- Steps involved to define a community standard
- terminology,
- data dictionary and information model,
- standards,
- common tools.
- Analyze each action of the generic standard (add
and delete actions as appropriate)
48Conclusion
- PAIMAS identifies
- the phases in the process of transferring information,
- the objective of the phases,
- the actions that must be carried out,
- the expected results.
- PAIMAS is a basis
- for further specialization by a particular community
- for the identification of standards and implementation guides,
- for identification and development of a set of software tools.
49PAIMAS Status
- PAIMAS approved as final Consultative Committee for Space Data Systems standard
- http://public.ccsds.org/publications/archive/651x0b1.pdf
- PAIMAS is undergoing ISO review as a final ISO standard
- Expect approval this fall, 2005
50End of presentation