FEDORA - PowerPoint PPT Presentation

Slides: 43
Provided by: cke2
Learn more at: http://worldcat.org

Transcript and Presenter's Notes

Title: FEDORA


1
FEDORA
  • Selecting and Implementing an Open Source Digital
    Repository
  • Corey Keith
  • ckeith_at_loc.gov

2
Introduction
  • History
  • FEDORA Overview
  • Object Oriented Principles
  • LC's Requirements
  • LC's Architecture
  • Review

3
Pop Quiz
  • XML
  • OAIS
  • METS
  • FEDORA
  • DSPACE

4
FEDORA History
  • Continuing Research Project
  • Cornell 1997
  • Prototype Application
  • University of Virginia
  • Fedora 1.0
  • Open Source Release 2002
  • Fedora 1.2
  • Tomorrow!

5
Options, options, options
  • Very few tools directly compete with each other
  • Many tools can be used to accomplish similar
    behavior
  • Many tools fulfill parts of the functionality
    needed for a repository
  • Roll your own solution

6
Why Fedora?
  • Repository architects and developers are excited
  • Object oriented approach to digital objects
  • Open Source Project
  • Funded development (and support)
  • Java Based
  • Multiple HW Platforms

7
Flexible
  • Integrates well with existing systems
  • CGI Scripts
  • Web Services
  • Leaves most decisions to implementers

8
Extensible
  • Again, no product can do it all
  • Imaging, Audio, Transformations, Courseware
  • Easy to add new functionality to objects
  • Embraces web services
  • Open APIs
  • Access
  • Management

9
Digital Object
  • What is the definition of a digital object?
  • Documents, such as articles, preprints, working
    papers, technical reports, conference papers
  • Books
  • Theses
  • Data sets
  • Computer programs
  • Visualizations, simulations, and other models
  • Multimedia publications
  • Administrative records
  • Published books
  • Bibliographic datasets
  • Images
  • Audio files
  • Video files
  • Reformatted digital library collections
  • Learning objects
  • Web pages

list taken from the dspace.org website
10
Repository Architecture
  • Objects
  • Behavior Definitions
  • Behavior Mechanisms
  • API
  • Management
  • Access

11
Object Oriented
  • A software design method that models the
    characteristics of abstract or real objects using
    classes and objects.
  • Proven Techniques for Software Development
  • Requirements gathering: use cases
  • Developers speak to librarians and other
    stakeholders
  • Facilitates reuse of functionality
  • Design Patterns
  • Not hacking Perl Scripts to make an institutional
    repository

12
Object Oriented
  • Data
    • Metadata
      • MODS: Descriptive
      • METS: Structural
      • MIX, etc.: Technical
    • Bit streams
      • Actual files: JPG, TIF, WAV, MP3, TEI, EAD
  • Methods (Behaviors)
    • Do stuff with the data

13
Object Oriented Concepts
  • Classes
    • Objects of the same type belong to a class
  • Interfaces
    • A contract defining behaviors a class of objects
      will implement
  • Encapsulation
    • Behaviors operate on the data in an object
  • Reflection
    • Discover what interfaces and behaviors an object
      implements
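
The reflection idea above can be illustrated with plain Java reflection. This is only a sketch of the general object oriented concept, not of how FEDORA itself exposes behaviors; the names ImageBehaviors, SampleImage, and describeBehaviors are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: these type names are hypothetical, not part of Fedora.
class ReflectionDemo {
    // Lists "Interface.method" for every interface the object's class
    // directly implements.
    static List<String> describeBehaviors(Object object) {
        List<String> behaviors = new ArrayList<>();
        for (Class<?> contract : object.getClass().getInterfaces()) {
            for (Method method : contract.getMethods()) {
                behaviors.add(contract.getSimpleName() + "." + method.getName());
            }
        }
        Collections.sort(behaviors); // reflection does not guarantee method order
        return behaviors;
    }

    public static void main(String[] args) {
        System.out.println(describeBehaviors(new SampleImage()));
    }
}

// The contract: behaviors that every image object promises to implement.
interface ImageBehaviors {
    String getHighResolutionTIF();
    String getLowResolutionJPG();
}

// One class of objects that fulfils the contract.
final class SampleImage implements ImageBehaviors {
    public String getHighResolutionTIF() { return "page.tif"; }
    public String getLowResolutionJPG() { return "page.jpg"; }
}
```

A caller needs no prior knowledge of SampleImage: it asks the object which contracts it implements and which behaviors those contracts define.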

14
Image Objects
  • Two File Image Object
    • Data
      • High-resolution version: TIF
      • Low-resolution version: JPG
  • MrSID File Image Object
    • Data
      • MrSID file

15
Basic Image Interface
  • getHighResolutionTIF
  • getLowResolutionJPG

16
Basic Image Interface Implementations
  • Two File Image Object
    • getHighResolutionTIF
      • returns high resolution TIF
    • getLowResolutionJPG
      • returns low resolution JPG
  • MrSID Image Object
    • getHighResolutionTIF
      • processes the MrSID file to return a high
        resolution TIF file of the image
    • getLowResolutionJPG
      • processes the MrSID file to return a low
        resolution JPG of the image
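
The two implementations of the Basic Image Interface can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class names are hypothetical, files are modeled as byte arrays, and the MrSID conversion is a labeled placeholder rather than a real decoder call.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: class names are hypothetical, not Fedora APIs.
class BasicImageDemo {
    public static void main(String[] args) {
        BasicImage twoFile = new TwoFileImageObject(
                "tif-data".getBytes(StandardCharsets.UTF_8),
                "jpg-data".getBytes(StandardCharsets.UTF_8));
        BasicImage mrSid = new MrSidImageObject("sid-data".getBytes(StandardCharsets.UTF_8));
        // The caller sees one interface, whatever files sit behind it.
        for (BasicImage image : new BasicImage[] {twoFile, mrSid}) {
            System.out.println(image.getClass().getSimpleName() + " -> "
                    + image.getLowResolutionJPG().length + " bytes");
        }
    }
}

// The Basic Image Interface from the slide.
interface BasicImage {
    byte[] getHighResolutionTIF();
    byte[] getLowResolutionJPG();
}

// Stores both versions and simply returns them.
final class TwoFileImageObject implements BasicImage {
    private final byte[] tif;
    private final byte[] jpg;

    TwoFileImageObject(byte[] tif, byte[] jpg) {
        this.tif = tif.clone();
        this.jpg = jpg.clone();
    }

    public byte[] getHighResolutionTIF() { return tif.clone(); }
    public byte[] getLowResolutionJPG() { return jpg.clone(); }
}

// Stores one MrSID file and derives both versions on request.
final class MrSidImageObject implements BasicImage {
    private final byte[] mrSid;

    MrSidImageObject(byte[] mrSid) { this.mrSid = mrSid.clone(); }

    public byte[] getHighResolutionTIF() { return convert("TIF"); }
    public byte[] getLowResolutionJPG() { return convert("JPG"); }

    // Placeholder (assumption): a real implementation would call a MrSID
    // decoder here. This only tags the stored bytes with the target format.
    private byte[] convert(String format) {
        String tagged = format + ":" + new String(mrSid, StandardCharsets.UTF_8);
        return tagged.getBytes(StandardCharsets.UTF_8);
    }
}
```

Code that displays an image calls getLowResolutionJPG without knowing whether the file was stored or derived.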

17
Sheet Music Object
  • Data
    • MODS Metadata
    • Images of the pages (Image Objects)
    • TEI encoded text of the lyrics (TEI Objects)
  • Behaviors
    • getPageImage(Pagenumber)
      • Invoke getLowResolutionJPG on the page's Image
        Object to return the image!
    • getMODS
    • getLyrics
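
A sheet music object that delegates to its page images might look like the sketch below. It assumes 1-based page numbers and models MODS and TEI as plain strings; PageImage and SheetMusicObject are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and data shapes are assumptions.
class SheetMusicDemo {
    public static void main(String[] args) {
        List<PageImage> pages = new ArrayList<>();
        pages.add(() -> new byte[] {10});
        pages.add(() -> new byte[] {20, 21});
        SheetMusicObject song = new SheetMusicObject("<mods/>", pages, "<TEI/>");
        System.out.println("page 2 is " + song.getPageImage(2).length + " bytes");
    }
}

// The only behavior the sheet music object needs from an image object.
interface PageImage {
    byte[] getLowResolutionJPG();
}

final class SheetMusicObject {
    private final String mods;
    private final List<PageImage> pages;
    private final String teiLyrics;

    SheetMusicObject(String mods, List<PageImage> pages, String teiLyrics) {
        this.mods = mods;
        this.pages = new ArrayList<>(pages);
        this.teiLyrics = teiLyrics;
    }

    // Delegates to the image object for the requested page (1-based).
    byte[] getPageImage(int pageNumber) {
        if (pageNumber < 1 || pageNumber > pages.size()) {
            throw new IllegalArgumentException("No such page: " + pageNumber);
        }
        return pages.get(pageNumber - 1).getLowResolutionJPG();
    }

    String getMODS() { return mods; }

    String getLyrics() { return teiLyrics; }
}
```

Because the sheet music object only depends on the PageImage contract, its pages can be two-file images or MrSID images without any change to getPageImage.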

18
FEDORA's Interface Implementation
[Figure: diagram relating a Behavior Definition Object, a Data Object, and a Behavior Mechanism Object. Graphics taken from presentations available at www.fedora.info]
19
What is FEDORA?
  • Plumbing
  • Manage associations between objects and their
    interfaces
  • Invoke behaviors from an interface to which an
    object subscribes
  • Manages or references files

20
What does FEDORA currently not do?
  • Digital Library in a Box
  • Requires integration and custom development
  • Prescribe the right way to do things
  • Implementers are free to choose
  • Best practices still being fleshed out

21
LC's Requirements
  • Complex Digital Objects
  • Structurally
  • METS structMap
  • Rich descriptive metadata
  • Exploiting MODS features
  • relatedItem

22
Choosing Repository Software
  • Fedora provides a foundation to build on
  • LC member of initial deployment team
  • No other software is like FEDORA
  • Except general purpose programming languages

23
How LC is implementing FEDORA
  • Types of Digital Objects
  • Sheet Music
  • Scores
  • Sound Recordings
  • Compact Discs
  • Manuscripts
  • Photographs
  • Websites
  • Collections
  • Less emphasis
  • Intellectual output of a university's research
    faculty

24
METS Profiles
  • Correlates well with classes of objects
  • Articulates
  • Structure of an object
  • Metadata requirements
  • METS documents conforming to profiles are
    ingested into repository
  • Atomization
  • Behavior association

25
Architecture
  • Fedora (Repository)
  • Cocoon (Application Layer)

[Figure: architecture diagram. Labels: user, web browser, Cocoon, Fedora Service APIs, Fedora Repository System]
26
SIP vs AIP
  • Complex digital objects are atomized into small
    reusable objects upon ingest to FEDORA
  • Sheet Music METS Profile (SIP)
    • Sheet music object (AIP)
      • Structural metadata encoded in METS
      • Descriptive metadata encoded in MODS
    • Image objects for each page (AIP)
      • TIF and JPG files
      • Technical metadata encoded in MIX
    • TEI object for the lyrics (AIP)
      • TEI file
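
Atomization on ingest can be sketched as a function from one submission package to a list of archival objects. The identifier scheme below ("/page/N", "/lyrics") and the Atomizer class are invented for illustration; the slides do not say how LC names the resulting objects.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the identifier scheme is an assumption.
class AtomizeDemo {
    public static void main(String[] args) {
        for (String id : Atomizer.atomize("sheetmusic:1", 2, true)) {
            System.out.println(id);
        }
    }
}

final class Atomizer {
    // Returns identifiers for the archival objects (AIPs) derived from one
    // sheet music submission package (SIP).
    static List<String> atomize(String baseId, int pageCount, boolean hasLyrics) {
        if (pageCount < 0) {
            throw new IllegalArgumentException("pageCount must not be negative");
        }
        List<String> aips = new ArrayList<>();
        aips.add(baseId); // sheet music object: METS structure plus MODS description
        for (int page = 1; page <= pageCount; page++) {
            aips.add(baseId + "/page/" + page); // image object: TIF, JPG, MIX
        }
        if (hasLyrics) {
            aips.add(baseId + "/lyrics"); // TEI object
        }
        return aips;
    }
}
```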

27
Why this Architecture?
  • Clean Separation of Concerns
    • Logic: Makes it go!
    • Content: From FEDORA
    • Style: Web Designers
  • Object not bound to display
    • Repository is for preservation of metadata and
      files, not markup (HTML)
    • Markup accomplished in Cocoon layer
  • Leverage use of METS structural metadata
  • Performance: Cocoon caching

28
User Interface Development
  • Web Designers
  • Relate to objects and behaviors
  • Can develop in HTML for display
  • XSLT
  • Uses XML from repository to drive display

29
(No Transcript)
30
Other Pieces of the Repository Puzzle
  • Other open source tools
  • Cocoon
  • XML Publishing Framework
  • Lucene
  • Text Indexing and Search API
  • Someone has to write software!
  • Java to build Lucene indexes
  • XSP searching
  • More XSLT than you want to see

31
Digital Object Production
  • How are we building these digital objects?
  • MySQL
  • Cocoon
  • XSLT
  • Homegrown Java
  • Technical metadata extraction

32
Cocoon
  • XML Publishing Framework (Toolbox)
  • Generate
    • From files (or URLs)
    • From databases
    • From code (XSP, JSP, PHP)
  • Transform
    • XSLT
  • Serialize
    • XML, HTML, PDF, SVG, MIDI?
  • Caching
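
The generate, transform, serialize sequence can be approximated with the standard Java XSLT API (javax.xml.transform). This is not Cocoon code: Cocoon configures such pipelines declaratively, and the class name PipelineSketch, the sample record, and the stylesheet are assumptions made for this sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Not Cocoon: a plain JAXP sketch of the same three pipeline stages.
class PipelineSketch {
    static String run(String xml, String xslt) {
        try {
            // Transform stage: compile the stylesheet.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            // Generate stage supplies the XML; serialize stage writes the result.
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Pipeline failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<record><title>Sheet Music</title></record>";
        String xslt =
                "<xsl:stylesheet version=\"1.0\""
              + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
              + "<xsl:output method=\"html\"/>"
              + "<xsl:template match=\"/record\">"
              + "<h1><xsl:value-of select=\"title\"/></h1>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
        System.out.println(run(xml, xslt));
    }
}
```

Changing the display means editing the stylesheet string (or file), not recompiling the Java code.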

33
XSLT
  • Philosophy
  • Get data into XML as early in the workflow as
    possible
  • Flexibility
  • Easy to change logic in XSLT
  • No need to recompile
  • Performance Issues

34
Resources Needed for FEDORA (Cheap)
  • Hardware Requirements
  • Minimal for experimentation
  • Installs on Windows PC
  • Packaged to get up and running quickly
  • Demo set of objects
  • Scales with hardware in a production environment

35
Resources Needed for FEDORA (Expensive)
  • 1 or More Developers
    • 1: Kick the tires
    • More: Real production
  • Application Architects
  • Requirement Analysts
  • Subject Matter Experts
    • Articulate requirements
      • Object structure
      • Descriptive metadata

36
Summary
  • Five Questions
  • Who
  • What
  • When
  • Why
  • Where

37
Who
  • Institutions with resources to do software
    development
  • Unique requirements for digital library software
  • Preexisting tools do not fit the need
  • Need for integration of existing systems into one
    management infrastructure

38
What
  • Digital Library Plumbing
  • Very general purpose
  • Use it to build almost any digital library
    application

39
When
  • December 10th: Version 1.2

40
Why
  • Robust Set of tools to build YOUR repository
  • User support high from FEDORA development team
  • Smart people working on hard problems

41
Where
  • www.fedora.info

42
Questions