Introducing - PowerPoint PPT Presentation

About This Presentation
Title:

Introducing

Description:

A FEDORA-based Digital Library System utilizing Digital Object Prototypes. Kostas Saidis ... Kostas Saidis, George Pyrounakis, Mara Nikolaidou, Proc. ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: Introducing


1
Introducing Pergamos
European FEDORA User Meeting, Copenhagen, 28 September 2005
A FEDORA-based Digital Library System utilizing Digital Object Prototypes
Kostas Saidis, saiko_at_di.uoa.gr
Libraries Computer Center, Department of Informatics & Telecommunications, University of Athens
2
Outline
  • Motivation: The University of Athens (UoA) DL
  • Digital Objects (DOs)
  • DO Storage (FEDORA)
  • DO Manipulation (DL Application Logic)
  • Digital Object Prototypes
  • Automatic DO Type Conformance
  • Scope of Prototypes & Collection Management
  • Implementation Details
  • A Preview of Pergamos
  • Discussion

3
The UoA DL Project
  • Over 1 million objects originating from 8
    disparate collections
  • Folklore notebooks, Ancient papyri, UoA
    Historical Archive, Byzantine music manuscripts,
    Theatrical photos & brochures, Informatics
    research papers and dissertations, Medical
    images, Press articles
  • Heterogeneous material, in terms of content type,
    metadata, structure, user requirements
  • Mostly digitized material, requiring detailed
    cataloging

4
UoA DL Project Metadata
  • Build a Web-based DL System to handle all
    material
  • Centralized DL approach due to
  • Existing hardware infrastructure
  • Funding restrictions
  • Administration simplicity
  • FEDORA is our DO Repository

5
UoA DL Project Metadata Contd.
  • Small Team
  • 2.5 developers, 1 librarian, 1 manager
  • Requirements, Specifications, Development,
    Digitization & Cataloging Management
  • while everyday tasks keep running!
  • Cataloging Personnel
  • Scholars & Experts in each collection's domain
    (not librarians)
  • Strict Schedule
  • First Collection deadline: early 2006
  • Project deadline: end of 2006

6
Motivation
  • Simplify & speed up the cataloging process
  • Provide effective Web-based cataloging interfaces
  • Automate content ingestion
  • Decrease development time
  • Avoid custom coding for each content variation
  • Elaborate on reusable and configurable DL modules
  • Provide the means to treat content variations in
    a unified manner

7
Digital Objects
  • A Digital Object is a human-generated artifact
    consisting of the digital content and related
    information

8
FEDORA
  • FEDORA Digital Object Model
  • Content Models, Datastreams, Behavior
    Definitions, Mechanisms & Disseminators
  • FEDORA is a DO Repository
  • Focus on how each DO part is encoded & stored
  • Effectively handles issues related to storage,
    preservation & versioning, searching & indexing,
    interoperability

9
Traditional 2-tier Approach
10
DL Application Logic
  • Cataloging, Workflows, Collection Building &
    Management, User Interfaces, etc.
  • DL Modules manipulate DOs at a higher level of
    abstraction
  • Focus on the overall behavior of the DO (what are
    the DO parts and how do they behave)
  • DOs reflect the underlying real-world objects:
    they behave according to their nature, their
    essence, their type

11
DO Typing information
  • Do we effectively capture, express and utilize
    the nature (type) of DOs?

12
An example: Theatrical Collection
  • Albums containing photos of National Theater
    Performances
  • What is a Photo DO?
  • A digital image
  • stored in various formats (e.g. high quality, www
    quality, thumbnail)
  • accompanied by the metadata required for
    describing the picture
  • What is an Album DO?
  • A container of Photo DOs accompanied by
    theatrical play metadata

13
A 2nd example: Historical Archive
  • University's Senate Session Proceedings > Folders
    > Sessions > Items
  • What is an Item DO?
  • A digital image (capturing 1 or 2 pages)
  • stored in various formats (e.g. high quality, www
    quality, thumbnail)
  • What is a Session DO?
  • A container of Item DOs & metadata
  • What is a Folder DO?
  • A container of Session DOs & metadata

14
DO Typing Information
  • FEDORA Content Models express DO Typing
    information
  • Content Models are metadata attributes (e.g.
    photo, album) that we use as a guide
  • Humans interpret Content Models, not the DL
    System
  • Manual resolution of DO Typing issues

15
Problems
  • Catalogers carry out manual XML editing at a low
    level of abstraction, with overly technical,
    complex & over-detailed semantics
  • Developers generate ad-hoc, custom & non-reusable
    implementations of DO types & variations of
    behavior
  • DL modules exhibit limited evolution and
    configuration capabilities

16
DO Typing Information
  • The DL System should resolve DO Typing issues
    automatically
  • (in a manner transparent to the DL Application
    Logic)

17
Automatic DO Type Conformance
  • The designer specifies the various DO types
  • and the DL System makes DOs conform to these
    type specifications automatically
  • How?

18
By drawing on the notions of OO
19
The OO Viewpoint
  • In the OO model an object is itself aware of its
    nature and behaves accordingly
  • Objects are conceived as instances of a type,
    automatically conforming to the type's
    definitions & specifications
  • OO types are separate entities (named either
    classes or prototypes)

20
Digital Object Prototypes
  • A DO Prototype is a DO Type Specification, a
    separate entity that defines the DO's:
  • Constitutional parts: metadata sets, files,
    structure, etc.
  • Private behaviors: DO internal operations such
    as serializations, validations, assignment of
    default values, content conversions, etc.
  • Public behaviors (behavior schemes): the DO
    external interface, consisting of high-level
    operations such as Detail View, Browse View, Edit
    View, etc.
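A minimal sketch of what such a type specification could look like in code. The slides do not show how Pergamos represents a prototype internally, so the names below (FieldSpec, Prototype, schemes, instantiate, validate), the Python language and the sample field values are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FieldSpec:
    """One field of a metadata set (hypothetical shape)."""
    name: str
    mandatory: bool = False
    default: str = ""


@dataclass
class Prototype:
    """A DO type specification: constitutional parts plus behaviors."""
    type_name: str
    md_sets: Dict[str, List[FieldSpec]]  # constitutional parts
    # Public behaviors (behavior schemes), keyed by name.
    schemes: Dict[str, Callable[[dict], str]] = field(default_factory=dict)

    def instantiate(self) -> dict:
        """Private behavior: build an empty instance and assign defaults."""
        return {
            "type": self.type_name,
            "metadata": {
                set_name: {f.name: f.default for f in fields}
                for set_name, fields in self.md_sets.items()
            },
        }

    def validate(self, instance: dict) -> List[str]:
        """Private behavior: list mandatory fields that are still empty."""
        missing = []
        for set_name, fields in self.md_sets.items():
            for f in fields:
                if f.mandatory and not instance["metadata"][set_name][f.name]:
                    missing.append(f"{set_name}.{f.name}")
        return missing


# Hypothetical Photo prototype; field names and values are made up.
photo = Prototype(
    type_name="photo",
    md_sets={"DC": [FieldSpec("title", mandatory=True),
                    FieldSpec("language", default="el")]},
    schemes={"detailView": lambda do: "Photo: " + do["metadata"]["DC"]["title"]},
)

instance = photo.instantiate()
missing_before = photo.validate(instance)
instance["metadata"]["DC"]["title"] = "Antigone, Act I"
detail = photo.schemes["detailView"](instance)
```

The point of the sketch is that the parts, the private behaviors and the public behaviors live in one place, so a DL module never needs to know how a Photo is put together.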

21
OO Encapsulation
22
Photo Prototype & Instances
23
DO Prototypes & Instances
  • The designer carries out the definition of DO
    Prototypes; the DL System handles the rest
  • DO Prototypes represent the realization of the
    Content Model notion in an OO fashion
  • The process of generating a DO from a Prototype
    is called instantiation
  • The resulting object is an instance of the
    prototype
  • A DO instance automatically conforms to the
    Prototype's specifications
  • Stored DOs vs DO instances

24
3-tier DL Architecture
25
Digital Object Dictionary
  • The runtime environment in which DO instances and
    Prototypes operate
  • Instantiation of DOs based on the prototype
    specifications (private behaviors: load & parse
    XML, assign default values, etc.)
  • Exposure of the public DO behaviors through a
    high-level, uniform API (for use by DL Modules)
  • Serialization of the DO instance back to FEDORA
    (private behaviors: serialize data structures in
    XML, perform validations, etc.)

26
Expression of DL Application Logic
  • A DL Module performs the following steps:
  • Acquire the DO Instance
  • do = dictionary.acquireObject(type)
  • do = dictionary.acquireObject("uoadl:1024")
  • Perform operations upon it
  • do.getMDSet("DC").getField("title")
  • dictionary.executeBehavior(do, "editView")
  • Store the DO in the repository
  • dictionary.saveObject(do)
  • Cleaner, simpler, more effective
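The three steps above can be sketched end to end. This is only an illustration under stated assumptions: a plain dict stands in for the FEDORA repository, the classes are hypothetical re-creations of the method names on the slide (acquireObject, getMDSet, getField, executeBehavior, saveObject), setField is an added helper, and only the acquire-by-pid form is covered. The pid is written in FEDORA's namespace:id form.

```python
import copy


class MDSet:
    """A metadata set exposing its fields by name."""

    def __init__(self, fields):
        self._fields = fields

    def getField(self, name):
        return self._fields[name]

    def setField(self, name, value):  # helper not shown on the slide
        self._fields[name] = value


class DigitalObject:
    """A DO instance: pid, type name and metadata sets held in memory."""

    def __init__(self, pid, type_name, md_sets):
        self.pid = pid
        self.type_name = type_name
        self.md_sets = md_sets

    def getMDSet(self, name):
        return MDSet(self.md_sets[name])


class Dictionary:
    """Hypothetical DO Dictionary over a dict standing in for the repository."""

    def __init__(self, repository, behaviors):
        self.repository = repository  # {pid: stored record}
        self.behaviors = behaviors    # {type name: {behavior name: function}}

    def acquireObject(self, pid):
        # Work on a copy so the stored record changes only on saveObject.
        record = copy.deepcopy(self.repository[pid])
        return DigitalObject(pid, record["type"], record["md_sets"])

    def executeBehavior(self, do, name):
        # Dispatch on the type of the instance, not on the calling module.
        return self.behaviors[do.type_name][name](do)

    def saveObject(self, do):
        self.repository[do.pid] = {
            "type": do.type_name,
            "md_sets": copy.deepcopy(do.md_sets),
        }


repository = {
    "uoadl:1024": {"type": "photo", "md_sets": {"DC": {"title": "Untitled"}}},
}
behaviors = {
    "photo": {"editView": lambda do: "Edit photo: "
              + do.getMDSet("DC").getField("title")},
}
dictionary = Dictionary(repository, behaviors)

do = dictionary.acquireObject("uoadl:1024")              # 1. acquire
do.getMDSet("DC").setField("title", "Antigone, Act I")   # 2. operate
view = dictionary.executeBehavior(do, "editView")
dictionary.saveObject(do)                                # 3. store
```

The module code at the bottom never touches XML or datastreams; everything type-specific is resolved inside the dictionary.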

30
3-tier DL Architecture
Composition of DO behaviors
DO Typing Instantiation
Separation of Concerns
Storage
31
Pergamos
  • If it sounds like Greek

35
Scope of Prototypes
  • Should we have global DO Types?
  • Collection-pertinent types: a DO Prototype is
    defined in the context of a Collection
  • Support fine-grained definition of
    collection-specific kinds of material
  • Hierarchical naming scheme for types
  • Theatrical Collection Photo: dl.theatre.photo
  • Medical Collection Photo: dl.medical.photo
  • Stored in the contentModel metadata attribute
  • Avoid type collisions
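A small sketch of how collection-scoped type names keep two Photo types apart. The helper names (qualify, split_type, register), the registry dict and the metadata set names are assumptions made for illustration; only the dotted names dl.theatre.photo and dl.medical.photo come from the slide.

```python
def qualify(collection, local_type):
    """Build a hierarchical type name such as dl.theatre.photo."""
    return ".".join(["dl", collection, local_type])


def split_type(type_name):
    """Return (collection scope, local type) of a hierarchical type name."""
    scope, _, local_type = type_name.rpartition(".")
    return scope, local_type


prototypes = {}


def register(type_name, prototype):
    """Register a prototype, refusing a second definition of the same name."""
    if type_name in prototypes:
        raise ValueError("type already defined: " + type_name)
    prototypes[type_name] = prototype


# Two Photo types with different (made-up) metadata sets, no collision.
register(qualify("theatre", "photo"), {"md_sets": ["DC", "play"]})
register(qualify("medical", "photo"), {"md_sets": ["DC", "diagnosis"]})
```

Because the full name is what gets stored in the contentModel attribute, a lookup on a stored DO always lands on the prototype of its own collection.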

36
Album Prototype & Instances
39
Collection Management
  • DL: a hierarchy of DO instances
  • Collections are also DOs
  • The DL itself is a DO, representing the
    super-collection (the collection of all the
    collections)
  • Easily add new collections & sub-collections
  • All content is modeled in a unified manner & can
    be characterized
  • Allow the DL designer to work out the details of
    each collection independently, yet in a uniform
    manner

40
DL as a Hierarchy of DO instances
43
Implementation details
  • DO Prototypes are:
  • Specified in XML form
  • Stored in the TEMPLATE datastream of the
    appropriate Collection DO
  • Loaded, parsed & interpreted by the DO Dictionary
    in its bootstrap procedure
  • Transparent to FEDORA
  • DO Instances are supplied with the CONTAINER
    datastream, containing the pids of the DOs they
    contain
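To illustrate how a list of child pids in a CONTAINER datastream yields the whole DL hierarchy, here is a hedged sketch. The actual datastream format is not shown on the slides; the dict-based repository, the pids and most type names below are made up, and CONTAINER is modeled simply as a list of pids.

```python
def descendants(repository, pid):
    """Yield the pids below `pid`, depth-first, by following CONTAINER lists."""
    for child_pid in repository[pid].get("CONTAINER", []):
        yield child_pid
        yield from descendants(repository, child_pid)


# Hypothetical repository: the DL, one collection, one album, two photos.
repository = {
    "uoadl:1": {"type": "dl", "CONTAINER": ["uoadl:2"]},
    "uoadl:2": {"type": "dl.theatre", "CONTAINER": ["uoadl:3"]},
    "uoadl:3": {"type": "dl.theatre.album",
                "CONTAINER": ["uoadl:4", "uoadl:5"]},
    "uoadl:4": {"type": "dl.theatre.photo"},
    "uoadl:5": {"type": "dl.theatre.photo"},
}

# Collect every theatrical photo reachable from the DL root object.
photos = [pid for pid in descendants(repository, "uoadl:1")
          if repository[pid]["type"] == "dl.theatre.photo"]
```

Since the DL, its collections and their contents are all DOs, one traversal routine covers every level of the hierarchy.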

44
DO Prototypes in detail
  • MD Sets
  • Specification of each individual field (label,
    description, multi-value, mandatory, UI
    characteristics)
  • Serialization information (how to store it in
    FEDORA)
  • Field mappings (under development)
  • Files: automatic conversions (tiff -> jpeg &
    thumb)
  • Batch Import: automatically create DOs from zip
    bundles
  • Structure: allowed children types
  • Browsers: browse field
  • Indices: e.g. subject catalog
  • Behavior schemes: atomic DO elements
45
Discussion
46
Pergamos
  • Historical Archive (production)
  • Folklore Notebooks (testing)
  • Theatrical Collection, Medical Images & Byzantine
    music manuscripts (finalization of requirements
    & specifications)
  • Undergoing development: the remaining
    collections are coming next
  • Historical Archive will be published in early
    2006
  • with a multi-lingual UI, hopefully!

47
Public DO Behaviors
FEDORA Behaviors | Behavior Schemes
Are defined in each DO separately | Are defined once and in one place (in the Prototype)
Operate on the datastreams | Operate on the atomic elements of a DO
Invoked directly on the DO | Invoked as in OO Dynamic Method Dispatch
Require the a priori existence of datastreams | Instantiation (empty DO)
Generic | Targeted on UI issues
Exposed as Web services | Web services will be of use after the DL has been built
48
Future Work
  • Fully implement the OO paradigm
  • OO Inheritance for DO Prototypes (e.g. the
    Notebook type derives from the Book type)
  • OO Polymorphism for DO instances (e.g. the DO
    uoadl:1234 is both a Notebook & a Book)
  • Supply general purpose linking capabilities that
    exceed structural relations (FEDORA Metadata for
    Object-to-Object Relationships?)
  • Deliver on schedule

49
Conclusions
  • If in doubt, use FEDORA
  • Flexible & Extensible (they mean it)
  • 1 year of Pergamos development, 2 months of
    testing & 3 months of production use (Historical
    Archive) with no serious problems
  • Though, Sandy & Carl, I'd be grateful for some
    minutes of your time!!!
  • DO Prototypes: a realization of Content Models in
    OO terms, implemented on top of FDOM to handle DO
    Typing issues automatically
  • Detailed report on Pergamos to appear

50
Thank You
  • Questions?
  • Comments?
  • For details:
  • "On the Effective Manipulation of Digital
    Objects: A Prototype-based Instantiation
    Approach", Kostas Saidis, George Pyrounakis, Mara
    Nikolaidou, Proc. 9th European Conference on
    Research and Advanced Technology for Digital
    Libraries, ECDL 2005, Vienna, Austria, September
    2005
  • email: saiko_at_di.uoa.gr