Title: An Open Archival Repository System for UT Austin
1An Open Archival Repository System for UT Austin
- Preparing the Way for the Next Generation of
Scholars and Information Seekers
2Knowledge Gateway
- We will provide access for every citizen, via a
personalized Internet window, into the resources
of our libraries, collections, museums and much
more.
- Larry Faulkner
3UT Austin Owns an Abundance of Content
- We have print and analog content that can be
digitally reformatted
- We have a growing abundance of born-digital
material that exists only in digital formats
4And much of it is already on the Web.
5Not just about how to publish anymore
- It is about how to keep publishing and keep
maintaining
- It's about what to keep and how to keep it
6What to keep publishing
- Librarians refer to the process of continuing to
maintain something as PRESERVATION
7When it was just paper
- Most people didn't care that much because paper
(even the highly acidic kind) outlived the people
who produced it
- Plus, libraries were around and you could just
dump all the paper you wanted to keep into the
library and they would worry about it.
8Information (Especially Digital Information)
Doesn't Outlive People Without Lots of Work and
Planning
9 - The University wants a great deal of the
knowledge it produces and the information it
makes available to persist over long periods of
time
10Quick Review
- We have lots of information
- We will continue to produce lots more information
- Some of it we need to keep for a long time
- Some of it we don't
11So, what's the problem?
12Long-term Preservation
- Implicit in the Knowledge Gateway concept is
continued access to the riches of content we hold
in our collections, laboratories, museums,
lectures, classrooms, etc.
13Hardware required to consume digital information
Client
This is all post-production
Network
Storage Systems
Storage Media
Backup System
Server
Information Content
14Can't just preserve media, information content,
disk drives; must focus on all components of the
systems that provide digital content
15Back in the day
- THE content was catalog records that described
library materials
16Today
- The digital content is the actual research data
and scholarship itself: books, journals, software,
datasets, music, video, still images, all manner
of digitized and born-digital content
17Yesterday
- We needed systems that were designed and optimized
for producing and managing electronic catalog
records that described print and analog materials
18Today
- We need systems designed and optimized for
producing and managing the actual material to
which the metadata refer
19Yesterday
- The content that libraries produced was catalog
records
20Today
- Libraries are producing digital collections AND
catalog records
21Need a new model for working in this environment
- We still have authors/producers
- We still have libraries/archives
- We still have end-users
22We are not the only ones facing this challenge
- Harvard Library's Digital Initiative
- MIT's DSpace
23The Digital Library community is adopting a new
ISO Standard
- Open Archival Information System (OAIS)
- A conceptual framework to assist organizations
who have material they want to preserve
24Origins of OAIS
- In 1995 the International Organization for
Standardization (ISO) and the Consultative
Committee for Space Data Systems (CCSDS)
initiated a planning process to develop a
reference model for the long-term preservation of
digital materials obtained from observations of
planetary space
25Purpose 1
- Framework for understanding and applying concepts
needed for long-term digital information
preservation
- Long-term is long enough to be concerned about
changing technologies
26Purpose 2
- Ascribes a set of responsibilities to an OAIS that
distinguish it from other archives
- Conceptual framework useful for comparing
archives and benchmarking
27Purpose 3
- Basis for development of additional standards
- Metadata and encoding schemes, for example
- Identifies a full range of archival functions
- Ingest, archival storage, data management,
access, preservation planning, administration
28OAIS
- Information
- Any type of knowledge that can be exchanged
- Data are the representation forms of information
- Archival Information System
- Hardware, software, and people who are
responsible for the acquisition, preservation,
and dissemination of the information
29OAIS is not an implementation plan.
Implementation is based on the concepts in the
reference model.
30OAIS Role
- Assumes information is produced outside of the
archives and is intended for delivery to users
who are outside the system
- Describes functions of a long-term archive
- Distinguishes content management from content
production and development
31Just as a vocabulary for discussing content
production has taken root
- Digitization
- Graphics design
- Markup language
- User interface design
- Usability
- Accessibility
- Layout
- Development
32So we need a common vocabulary for discussing the
archival
- Librarians/archivists and content producers will
need to be able to communicate more effectively
about content than they have in the past
- To facilitate this, OAIS offers definitions
33Functions of the Archive
- Accessioning of information content
- Storage
- Data Management
- Access
- Preservation Planning
- Administration
34OAIS Definition for Information
- Information is always expressed or represented by a
data type
- Data yields information
[Diagram: a Data Object, interpreted using its Representation Information, yields an Information Object]
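The relationship between Data Object, Representation Information, and Information Object can be illustrated with a small Python sketch. It assumes, purely for illustration, that the Representation Information is nothing more than a character-encoding name; the function name `interpret` is hypothetical and not part of OAIS.

```python
def interpret(data_object: bytes, representation_info: str) -> str:
    """Interpret a Data Object (raw bytes) using Representation
    Information (here, only an encoding name) to yield an
    Information Object (readable text)."""
    return data_object.decode(representation_info)


# The same two bytes yield different information depending on
# which Representation Information is used to interpret them.
data_object = b"\xc3\xa9"
as_utf8 = interpret(data_object, "utf-8")      # one character
as_latin1 = interpret(data_object, "latin-1")  # two characters

print(len(as_utf8), len(as_latin1))
```

The point of the sketch is that preserving the bytes alone is not enough: without the matching Representation Information, the archive cannot reliably recover the information the bytes were meant to carry.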
35Primary roles in OAIS Reference Model
- Producer: Provides information to be preserved
- Archive: Sets overall OAIS policy and manages content
- Consumer: Seeks and acquires preserved information
36Information Package Definition
Content Information
Preservation Information
37Information package
Dissemination Information Package (DIP)
Archival Information Package (AIP)
Submission Information Package (SIP)
38Content Information
- The information that is the original target of
preservation
- This may not be obvious and may require
negotiation with producer
39Preservation Description Information (PDI) 1
- Reference Information
- Identifiers by which content information may be
uniquely identified
- Provenance Information
- Description of the source of the Content
Information, who has had custody, its history
40Preservation Description Information (PDI) 2
- Context Information
- Describes how the Content Information relates to
other information outside the Information Package
- Fixity Information
- Protects the Content Information from
undocumented alteration
41Examples of PDI
- Reference
- Bibliographic description, persistent IDs (URN,
PURL)
- Provenance
- Metadata on preservation process
- Context
- Pointers to related collections
- Fixity
- Digital signature, checksum
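The fixity example (a checksum) can be sketched in Python. This is a minimal sketch, assuming SHA-256 as the digest algorithm; OAIS does not prescribe one, and the function names `compute_fixity` and `verify_fixity` are hypothetical.

```python
import hashlib


def compute_fixity(content: bytes) -> str:
    """Return a SHA-256 hex digest to record as Fixity Information."""
    return hashlib.sha256(content).hexdigest()


def verify_fixity(content: bytes, recorded_digest: str) -> bool:
    """Return True if the content still matches the recorded digest."""
    return compute_fixity(content) == recorded_digest


# Hypothetical content, used only to exercise the two functions.
original = b"Computer Science Technical Report, page 1"
digest = compute_fixity(original)

print(verify_fixity(original, digest))         # unchanged content
print(verify_fixity(original + b"x", digest))  # altered content
```

Recording the digest at ingest and recomputing it later is one way an archive could detect undocumented alteration of the Content Information.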
42Submission Information Package
- Negotiated between Producer and OAIS
- Sent to OAIS by Producer
- Consists of metadata and additional information
about the producer's content
43Archival Information Package
- Information Package used for preservation
- Holds complete set of Preservation Description
Information for the Content Information
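The package definitions above can be sketched as a small Python data structure. This is an illustrative sketch only: the class names, field names, and the `ingest` function are hypothetical, and the rule that all four PDI categories must be non-empty before a SIP becomes an AIP is an assumption standing in for "complete set" on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PDI:
    """Preservation Description Information: the four categories."""
    reference: Optional[str] = None          # e.g. a persistent ID
    provenance: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    fixity: Optional[str] = None             # e.g. a checksum

    def missing(self) -> List[str]:
        """List the PDI categories that are still empty."""
        parts = {
            "reference": self.reference,
            "provenance": self.provenance,
            "context": self.context,
            "fixity": self.fixity,
        }
        return [name for name, value in parts.items() if not value]


@dataclass
class InformationPackage:
    kind: str       # "SIP", "AIP", or "DIP"
    content: bytes  # the Content Information
    pdi: PDI


def ingest(sip: InformationPackage) -> InformationPackage:
    """Turn a SIP into an AIP, refusing packages with incomplete PDI."""
    if sip.kind != "SIP":
        raise ValueError("ingest expects a SIP")
    gaps = sip.pdi.missing()
    if gaps:
        raise ValueError("incomplete PDI: " + ", ".join(gaps))
    return InformationPackage("AIP", sip.content, sip.pdi)


# Hypothetical values, used only to exercise the sketch.
sip = InformationPackage(
    kind="SIP",
    content=b"lecture recording",
    pdi=PDI(
        reference="urn:example:lecture-1",
        provenance=["deposited by producer"],
        context=["part of a lecture series"],
        fixity="placeholder-checksum",
    ),
)
aip = ingest(sip)
print(aip.kind)
```

In practice, what counts as complete PDI would come out of the negotiation between Producer and OAIS rather than a hard-coded check.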
44Dissemination Services
CONSUMER
Metadata registry, Workbench tools, Access
control, Security
ARCHIVE
45Producers must identify content for long-term
preservation and negotiate agreement for
ingestion and dissemination
METADATA
Producers: Audio, Video, Text, Still Images
OAIS
Decision
Temp -local -shared
46President Faulkner did not say
- That we would make some content available one
day, then other content the next
- Or that we would simply make the most recently
digitized or produced content available
47Library as metaphor
- Libraries don't produce books or journals or
Computer Science Technical Reports, but they
serve as the repository for those materials
- The Library represents a broad array of shared
systems: systems shared by faculty, students,
staff and public