Integrating metadata schema registries with digital preservation systems to support interoperability


1
Integrating metadata schema registries with
digital preservation systems to support
interoperability
  • Michael Day, UKOLN, University of Bath, UK
    m.day_at_ukoln.ac.uk
  • 2003 Dublin Core Conference, Seattle, Washington,
    USA, 28 September - 2 October 2003

2
Presentation outline
  • Preservation metadata
  • Purpose
  • Standards
  • Concepts of interoperability
  • Metadata capture and inheritance
  • Object exchange
  • Metadata schema registries
  • Definitions
  • Application to digital preservation systems
  • Concluding thoughts

3
Preservation metadata (1)
  • All digital preservation strategies depend - to
    some extent - on the creation, capture and
    maintenance of metadata
  • "Preserving the right metadata is key to
    preserving digital objects" (ERPANET Briefing
    Paper, 2003)
  • Defined as
  • The various types of data that will allow the
    re-creation and interpretation of the structure
    and content of digital data over time (Ludäscher,
    Marciano & Moore, 2001)

4
Preservation metadata (2)
  • Metadata fulfil various roles, e.g.
  • "to find, manage, control, understand or
    preserve information over time" (Cunningham,
    2000)
  • Descriptive information; technical information
    about formats and structure; information about
    provenance and context; administrative
    information, e.g. for rights management
  • Current schemas either very complex or only
    provide a basic framework (sometimes both!)
  • Perception that different strategies and objects
    will need different metadata

5
Preservation metadata - standards
  • Developed from many different perspectives
  • Digital libraries
  • METS, NISO Z39.87 (to support digitisation
    initiatives)
  • OCLC/RLG Framework, Cedars, NEDLIB, NLA, NLNZ
  • OAIS influence has been greatest in this area
  • Records management and archival description
  • Pittsburgh BAC, RKMS, NAA, VERS, PRO, EAD, etc.
  • Also standards not specifically developed for
    preservation, but with some overlap
  • Multimedia
  • MPEG-7, SMPTE, etc.
  • Rights management
  • <indecs>, MPEG-21, etc.

6
The OAIS model
  • Reference Model for an Open Archival Information
    System (OAIS)
  • ISO 14721:2003
  • Established a common framework of terms and
    concepts
  • Influential on the design of some schemas
  • e.g., OCLC/RLG Metadata Framework
  • Identified basic functions
  • Ingest, Data Management, Archival Storage,
    Administration, Access, Preservation Planning

7
OAIS functional model
[Figure: OAIS Functional Entities (Figure 4-1). Labels: Producer, Consumer and Management outside the archive; Ingest, Data Management, Archival Storage, Access, Administration and Preservation Planning inside it; flows labelled SIP, AIP, DIP, descriptive info., queries, result sets and orders.]

8
OAIS information objects
  • Information Object (basic concept)
  • Data Object (bit-stream)
  • Representation Information (permits the full
    interpretation of Data Object into meaningful
    information)
  • Information Object Classes
  • Content Information
  • Preservation Description Information (PDI)
  • Packaging Information
  • Descriptive Information

9
OAIS information packages
  • Information package
  • Container that encapsulates Content Information
    and PDI
  • Packages for submission (SIP), archival storage
    (AIP) and dissemination (DIP)
  • AIP: ... a concise way of referring to a set of
    information that has, in principle, all of the
    qualities needed for permanent, or indefinite,
    Long Term Preservation of a designated
    Information Object
  • PDI: other information (metadata) which will
    allow the understanding of the Content
    Information over an indefinite period of time
  • Reference, Provenance, Context, Fixity
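
The package structure on this slide can be sketched as a small data structure. This is only an illustrative reading of the slide, not part of OAIS or of the presentation: the class names, field types and the choice of SHA-256 for Fixity are all assumptions made for the example.

```python
from dataclasses import dataclass
import hashlib


@dataclass
class PreservationDescriptionInformation:
    """The four PDI categories named on the slide (field types are assumed)."""
    reference: str    # identifier for the content
    provenance: list  # history of custody and processing events
    context: str      # relationship of the content to other material
    fixity: str       # checksum used to detect undocumented change


@dataclass
class InformationPackage:
    """Container for Content Information plus its PDI."""
    kind: str                 # "SIP", "AIP" or "DIP"
    data_object: bytes        # the bit-stream
    representation_info: str  # what is needed to interpret the bit-stream
    pdi: PreservationDescriptionInformation

    def fixity_ok(self) -> bool:
        # Recompute the checksum and compare it with the stored Fixity value.
        return hashlib.sha256(self.data_object).hexdigest() == self.pdi.fixity


def make_package(kind, data, representation_info, reference, context):
    """Build a package, recording Fixity at the moment of packaging."""
    pdi = PreservationDescriptionInformation(
        reference=reference,
        provenance=[f"packaged as {kind}"],
        context=context,
        fixity=hashlib.sha256(data).hexdigest(),
    )
    return InformationPackage(kind, data, representation_info, pdi)
```

A package built with make_package("AIP", ...) reports fixity_ok() as True until its data_object is altered, which is the kind of check the Fixity category is meant to support.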

10
Draft categorisation (1)
[Figure: diagram labelled Practical and Conceptual, showing NLA, CEDARS, NEDLIB, NLNZ, OCLC/RLG, OAIS, METS, Z39.87, DCMI, RKMS, PITT, VERS, PRO and MPEG-7.]

11
Draft categorisation (2)
  • Earliest schemas were largely conceptual in
    nature
  • e.g. Pittsburgh BAC model, Cedars outline
    specification, OCLC/RLG WG I
  • Gradually moving towards a more practical focus
  • e.g., VERS, NLNZ, METS, PREMIS WG
  • Convergence on XML (DTDs and Schemas)
  • But there is an urgent need for all this
    practical experience to be shared
  • e.g., published schemas, advice on
    implementation, etc.

12
Implementation
  • We need to prove the practical value of metadata
    frameworks and 'outline specifications'
  • It can be difficult for implementers to use these
    as a guide to the design of real systems
  • We need to move from the conceptual to the
    practical, and beyond proof-of-concept
  • Positive signs
  • METS/NISO Z39.87
  • OCLC/RLG PREMIS WG looking at implementation
    strategies for preservation metadata

13
Sustainability (1)
  • Balance risks with costs
  • There is a perception that metadata creation and
    maintenance will be expensive
  • But costs associated with data recovery are not
    trivial
  • Need to balance the risks of data loss with the
    cost of creating metadata
  • Cost/benefit analysis
  • Robust selection criteria
  • Co-operation between repositories
  • Re-use of existing metadata

14
Sustainability (2)
  • Avoid imposing unnecessary costs
  • Avoid large schemas (?)
  • Need to identify the right metadata - 'core
    metadata' (?)

15
Interoperability (1)
  • Heterogeneity
  • The need to cope with a wide (and growing) range
    of preservation metadata standards, object types,
    formats, etc.
  • No realistic prospect of a single standard
  • Repositories will need to manage a range of
    metadata standards, at least within ingest and
    access functions

16
Interoperability (2)
  • Metadata creation and capture
  • Created by humans or captured automatically?
  • Some metadata already exists, e.g.
  • Embedded within objects
  • In separate databases
  • Generated by particular processes
  • Need for this metadata to be captured at
    creation, ingest, migration, and at other
    appropriate points in object life-cycle
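
Automatic capture at a given life-cycle point might look something like the sketch below. It is an assumption-laden illustration rather than a method from the presentation: the function name, the record keys and the choice of properties to capture (size, SHA-256 checksum, a format guess from the file extension) are all hypothetical. Metadata that already exists, for example from a separate database, is passed in and re-used.

```python
import hashlib
import mimetypes
from datetime import datetime, timezone


def capture_metadata(filename, data, stage, existing=None):
    """Return a metadata record captured at one life-cycle stage.

    `existing` holds metadata that was already available (e.g. created by
    a human or held in a separate database); captured values are added to
    a copy of it and overwrite any keys with the same name.
    """
    record = dict(existing or {})
    guessed_type, _ = mimetypes.guess_type(filename)
    record.update({
        "filename": filename,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "format_guess": guessed_type,  # None when the extension is unknown
        "stage": stage,                # e.g. "creation", "ingest", "migration"
        "captured_at": datetime.now(timezone.utc).isoformat(),
    })
    return record
```

Calling the function again at migration, with the ingest record as `existing`, would carry the earlier descriptive metadata forward while refreshing the technical values.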

17
Interoperability (3)
  • Benefits of interoperability
  • To support the capture or inheritance of
    metadata, e.g. on ingest
  • To support the management of multiple formats and
    metadata schemas within a digital preservation
    system
  • Current metadata specifications not entirely
    clear on how this should be done
  • To support the exchange of information packages
    outside the repository, e.g. by converting to
    standard 'exchange formats'
  • Networks of 'trusted repositories'

18
Registries (1)
  • Registries of metadata schemas may offer one way
    to deal with this problem
  • Parallel concept of format registries
  • There is "a pressing need to establish
    reliable, sustained repositories of file format
    specifications, documentation, and related
    software" (Lawrence, et al., 2000)
  • DSpace 'bitstream format registry'
  • Typed Object Model (TOM) project
  • Digital Library Federation, et al. recently
    proposed a Global Digital Format Registry

19
Registries (2)
  • Metadata schema registries
  • "formal systems that can disclose authoritative
    information about the semantics and structure of
    the data elements that are included within a
    particular metadata scheme" (Heery, et al., 2000)
  • Existing registries include the XML.org Registry
    and Repository (OASIS), and metadata registries
    set up by DCMI and SMPTE

20
Registry functions (1)
  • Provides support for the ingest process
  • Support conversion and metadata capture tools
  • May also provide support for the access function
  • The export of Dissemination Information Packages
  • The exchange of information packages (AIPs?) with
    other repositories; conversion to exchange
    standards
  • Can link metadata where there are multiple
    instances within the system
  • May help to manage schema evolution
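
As a rough illustration of how a registry could support conversion on ingest, the sketch below keeps element definitions per schema plus mappings between schemas, and uses them to convert a record. Everything here is hypothetical: the SchemaRegistry class, its method names and the idea of storing crosswalks as simple element-to-element dictionaries are assumptions for the example, not a description of any existing registry.

```python
class SchemaRegistry:
    """Toy in-memory registry of metadata schemas and crosswalks."""

    def __init__(self):
        self._schemas = {}     # schema name -> {element name: definition}
        self._crosswalks = {}  # (source, target) -> {source elem: target elem}

    def register_schema(self, name, elements):
        self._schemas[name] = dict(elements)

    def register_crosswalk(self, source, target, mapping):
        # Only accept mappings between elements the registry knows about.
        for src, dst in mapping.items():
            if src not in self._schemas[source] or dst not in self._schemas[target]:
                raise KeyError(f"unknown element in mapping: {src} -> {dst}")
        self._crosswalks[(source, target)] = dict(mapping)

    def describe(self, schema, element):
        """Disclose the registered definition of one element."""
        return self._schemas[schema][element]

    def convert(self, record, source, target):
        """Convert a record; return (converted, unmapped) so nothing is lost."""
        mapping = self._crosswalks[(source, target)]
        converted = {mapping[k]: v for k, v in record.items() if k in mapping}
        unmapped = {k: v for k, v in record.items() if k not in mapping}
        return converted, unmapped
```

Returning the unmapped elements separately is a design choice for the sketch: a repository would probably want to keep what it could not convert rather than discard it silently.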

21
Registry functions (2)
[Figure: OAIS Functional Entities (Figure 4-1), repeated from slide 7.]

22
Registry functions (2)
[Figure: OAIS Functional Entities (Figure 4-1), with a box labelled Registry and two labels reading Schemas added to the diagram.]

24
Registries - organisational issues
  • Registries are part of infrastructure
  • Distributed vs. centralised approaches
  • Concept of 'shared services'
  • Experimental distributed registries are based on
    Resource Description Framework (RDF)
  • CORES Registry
  • Encourage re-use of metadata (the 'application
    profile' concept)
  • Are other technologies more suitable?
  • Who should be responsible for them?

25
Summing up
  • Interoperability supports the reuse of metadata
    and the exchange of information objects
  • There is a possible role for metadata registries
    to help manage these (and other) processes - but
    the concept needs extensive scoping and
    evaluation
  • Registries are NOT a panacea

26
Acknowledgements
  • UKOLN is funded by Resource: The Council for
    Museums, Archives and Libraries, the Joint
    Information Systems Committee (JISC) of the UK
    higher and further education funding councils, as
    well as by project funding from the JISC and the
    European Union. UKOLN also receives support from
    the University of Bath, where it is based.