Central Registry for Digitized Objects: Linking Production and Bibliographic Control

1
Central Registry for Digitized Objects: Linking
Production and Bibliographic Control
Ralf Stockmann, Göttinger Digitization Center
2
As things are now
  • Huge ventures in
    • Digitization
      • Google
      • Microsoft
      • National programs
      • Local centers
    • Accessibility
      • World Digital Library
      • European Digital Library
      • National portals
      • Google Book Search

3
As things are now
  • We are only at the dawn of mass digitization
  • Leaving behind the stage of manual (workshop-style) production
  • Entering industrialization
  • Scanning robots
  • Accessible full text (OCR)

4
Lack of
  • Coordination in digitization activities
  • Who scans what, where, when, in which quality, and
    how will it be accessible?
  • How is quality defined?
  • What do we agree on?

5
Facing the Consequences
Technical Improvements
Costs
Waste of Resources
Additional Benefit
6
The Solution
  • Central registry for digitized objects
  • Focused on the production context (no user
    frontend)
  • API-driven (Application Programming Interface)
  • Query / Ingest
  • Simple integration into existing
    workflow tools
  • Batch mode (lists)
  • Open Source / free service
  • Matching on volume level
  • Score / probability
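
The query / ingest / batch-mode idea above can be sketched roughly as follows. This is only an illustrative in-memory stand-in, not the registry's actual API: the `Registry` and `Volume` names, the method names, and the use of a plain string similarity as the "score" are all assumptions made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Volume:
    # Hypothetical minimal record: one entry per physical volume.
    title: str
    institution: str


class Registry:
    """Hypothetical in-memory stand-in for the central registry."""

    def __init__(self):
        self._volumes = []

    def ingest(self, volume):
        # Ingest: a project registers a volume it has scanned or plans to scan.
        self._volumes.append(volume)

    def query(self, title):
        # Query: return (score, volume) pairs, best match first.
        # The score is a plain string similarity in [0, 1],
        # not a calibrated probability.
        hits = [
            (SequenceMatcher(None, title.lower(), v.title.lower()).ratio(), v)
            for v in self._volumes
        ]
        return sorted(hits, key=lambda hit: hit[0], reverse=True)

    def query_batch(self, titles):
        # Batch mode: look up a whole list of titles in one call.
        return {title: self.query(title) for title in titles}
```

A workflow tool could call `query_batch` with its scan list before production starts and flag every title whose best score exceeds a locally chosen threshold.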

7
Implementation
[Architecture diagram; layers as listed on the slide]
Backend Services: EROMM / EDL / OCLC / ...
Registry / Metadata Store
Aggregator / Normalizer / Mapping
API: Query, Ingest
Collections / Projects (one Ingest path each):
  • Notice of Intent
  • Running Project
  • Present Collections
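
As a rough illustration of the Aggregator / Normalizer / Mapping layer, the sketch below maps records from differently structured sources onto one common set of field names. The source names (`source_a`, `source_b`) and all field names are hypothetical placeholders; they do not describe the real export formats of EROMM, EDL, or OCLC.

```python
# Hypothetical field mappings: source field name -> registry field name.
FIELD_MAPS = {
    "source_a": {"ti": "title", "au": "author", "yr": "date"},
    "source_b": {"Title": "title", "Creator": "author", "Issued": "date"},
}


def normalize(source, raw_record):
    """Map one source-specific record onto the registry's common field names."""
    mapping = FIELD_MAPS[source]
    record = {}
    for src_field, value in raw_record.items():
        target = mapping.get(src_field)
        if target is not None:  # drop fields the registry does not store
            record[target] = " ".join(str(value).split())  # collapse whitespace
    record["source"] = source
    return record


def aggregate(batches):
    """Normalize records from several sources into one flat list."""
    return [
        normalize(source, raw)
        for source, raws in batches.items()
        for raw in raws
    ]
```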
8
Metadata Store
  • Bibliographic
    • Title
    • Author
    • Date
    • Place of publication
    • Number of pages (?)
    • Language
    • Print / Format
    • Edition
  • Technical
    • Resolution
    • Color depth
    • File type / compression
  • Accessibility
    • Institution
    • Persistent identifier
    • Rights
    • URL
    • Status

Matching / Score: what
Additional judging: who, where, which quality,
how accessible
Decisive factor: when
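
One possible way to express the three field groups above as a data structure is sketched below. The class and attribute names, the units (dpi, bits), and the reuse of the three project states from the implementation slide as values for "Status" are assumptions made for illustration; the slides do not define a concrete schema.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Assumed values, borrowed from the three project states on the slide.
    NOTICE_OF_INTENT = "notice of intent"
    RUNNING_PROJECT = "running project"
    PRESENT_COLLECTION = "present collection"


@dataclass
class Bibliographic:
    title: str
    author: Optional[str] = None
    date: Optional[str] = None
    place_of_publication: Optional[str] = None
    pages: Optional[int] = None  # marked "(?)" on the slide
    language: Optional[str] = None
    print_format: Optional[str] = None
    edition: Optional[str] = None


@dataclass
class Technical:
    resolution_dpi: Optional[int] = None
    color_depth_bits: Optional[int] = None
    file_type: Optional[str] = None  # file type / compression


@dataclass
class Accessibility:
    institution: str
    status: Status
    persistent_identifier: Optional[str] = None
    rights: Optional[str] = None
    url: Optional[str] = None


@dataclass
class RegistryRecord:
    bibliographic: Bibliographic
    technical: Technical
    accessibility: Accessibility


def _encode(obj):
    # json.dumps calls this only for objects it cannot serialize itself.
    if isinstance(obj, Enum):
        return obj.value
    raise TypeError(f"not JSON serializable: {type(obj).__name__}")


def to_json(record):
    """Serialize a record for transfer through an ingest call."""
    return json.dumps(asdict(record), default=_encode, ensure_ascii=False)
```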
9
Obstacles
  • (Open source) tools for automated matching /
    scoring?
  • Interface for manual comparison / decision making
  • Multivolume works: low rate of uniformity (near
    50% of physical SUB stock before 1900)
  • Unicode
  • Transliteration tables
  • Randomly bound books
  • Reliable identifier
  • ISBN for old books?
  • Anticipated rate of accuracy: 50-70%
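
To make the matching / scoring obstacle more concrete, here is a minimal sketch using only the Python standard library. It is not a tool named in the talk: the weights are untuned placeholders, the field names are assumptions, and stripping diacritics is a crude substitute for real transliteration tables (it maps "ö" to "o", not to "oe").

```python
import unicodedata
from difflib import SequenceMatcher

# Illustrative, untuned weights per bibliographic field.
WEIGHTS = {"title": 0.6, "author": 0.3, "date": 0.1}


def normalize_text(text):
    """Casefold and strip diacritics and punctuation before comparing."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    kept = [
        c if c.isalnum() else " "
        for c in decomposed
        if not unicodedata.combining(c)  # drop accents split off by NFKD
    ]
    return " ".join("".join(kept).split())


def similarity(a, b):
    """String similarity in [0, 1] on the normalized forms."""
    return SequenceMatcher(None, normalize_text(a), normalize_text(b)).ratio()


def match_score(candidate, existing):
    """Weighted similarity over the fields that both records actually have."""
    score, used = 0.0, 0.0
    for field, weight in WEIGHTS.items():
        a, b = candidate.get(field), existing.get(field)
        if a and b:
            score += weight * similarity(a, b)
            used += weight
    return score / used if used else 0.0
```

Scores in a middle band would still need the manual comparison interface mentioned above; the sketch only ranks candidates.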

10
Appreciation of Values
  • The goal is NOT to build a reliable database in
    terms of library standards,
  • but to prevent further waste of resources.
  • If we manage to achieve just 50% precision,
  • we will have saved at least 50% of the funding!

11
Work Packages
  • Define metadata model
  • Set up database
  • Implement mapping tools
  • Define API calls
  • Implement API
  • Build some connectors to popular mass
    digitization workflow tools (e.g. Goobi)
  • Establish ISBN workflow
  • Harvest existing sources
  • Start with a community of actual projects
  • Get some (!) funding
  • Estimated schedule: 6 months

12
Thank You (stockmann_at_uni-goettingen.de)