Title: The Fedora Project April 28-29, 2003 CNI, Washington DC
1 The Fedora Project: April 28-29, 2003, CNI, Washington DC
- Thornton Staples, University of Virginia
- Sandy Payette, Cornell Information Science
2 Priorities for digital libraries
- Managing digital resources as if they are all the same
- Delivering digital resources as if they are all unique and free to participate in any number of contexts
- Supporting digital scholarship wherever it may lead
3 Shortcomings of commercial digital library products
- Narrow focus on specific media formats (e.g. image databases, document management)
- Fail to effectively address interrelationships among digital entities
- Fail to address interoperability
- Fail to provide facilities for managing programs and tools that deliver digital content
- Not extensible: do not enable easy integration of new tools and services
4 The Flexible Extensible Digital Object Repository Architecture (FEDORA)
- Developed as a DARPA- and NSF-funded research project at Cornell (1997-present)
- Interpreted and re-implemented at University of Virginia (1999)
- Virginia prototype supported a testbed of 10,000,000 digital objects with very good results (1999-2001)
- Andrew W. Mellon Foundation granted Virginia and Cornell $1,000,000 to develop a full-featured, web-based production FEDORA system (2002)
5 Users access data objects through behaviors.
[Diagram label: Application services]
6 Managers have direct access to each component of a data object.
7 The Current Project
- An efficient, scalable, freely distributable FEDORA repository system ASAP
- A complete basic management interface with the initial release
- Add important digital library functionality in later releases
- Multiple testbed repositories to deploy and evaluate the software
- Make all software open source
8 Deployment Partners
- Indiana University, Digital Library group
- King's College London, Humanities Computing
- Library of Congress, Motion Picture and Recorded Sound Division
- Los Alamos National Laboratory, Research Library
- New York University, Humanities Computing
- Northwestern University, Academic Computing
- Oxford, The Refugee Studies Center
- Tufts, Digital Collections and Archives
9 Fedora 1.0
- Architecture
- Software
- Release 1.0 Features
- Demo Use Cases
10 Digital Object Model: Architectural View
- Persistent ID (PID): globally unique persistent id
- Disseminators: public view; access methods for obtaining disseminations of digital object content
- System Metadata: internal view; metadata necessary to manage the object
- Datastreams: protected view; content that makes up the basis of the object
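The four parts of the object model can be pictured as a small data structure. This is only a sketch of the idea on the slide, not Fedora's actual classes: the names `DigitalObject`, `disseminate`, `get_dc_record`, and the PID `demo:1` are hypothetical, and disseminators are modeled as plain Python functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DigitalObject:
    """Illustrative container mirroring the four parts named on the slide."""
    pid: str  # globally unique persistent id
    # internal view: metadata needed to manage the object
    system_metadata: Dict[str, str] = field(default_factory=dict)
    # protected view: the content that makes up the object
    datastreams: Dict[str, bytes] = field(default_factory=dict)
    # public view: method name -> function that produces a dissemination
    disseminators: Dict[str, Callable[["DigitalObject"], bytes]] = field(
        default_factory=dict
    )

    def disseminate(self, method: str) -> bytes:
        """Return content only through a named public access method."""
        if method not in self.disseminators:
            raise KeyError(f"no such dissemination method: {method}")
        return self.disseminators[method](self)


# Hypothetical example object with one datastream and one public method.
obj = DigitalObject(
    pid="demo:1",
    system_metadata={"state": "active"},
    datastreams={"DC": b"<dc>Example title</dc>"},
    disseminators={"get_dc_record": lambda o: o.datastreams["DC"]},
)
```

Calling `obj.disseminate("get_dc_record")` returns the DC datastream bytes; a caller in this sketch has no other public route to the datastreams.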
11 Digital Object Model: Example Disseminators
- Persistent ID (PID)
- Disseminators
  - Default: Get Profile, List Items, Get Item, List Methods, Get DC Record
  - Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh
- System Metadata
- Datastreams
12 Object Behavior Contracts
[Diagram: Behavior Definition Object, Data Object, Behavior Mechanism Object, and Web Service, connected by links labeled behavior subscription, behavior contract, and data contract]
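One way to read the contract labels on this slide is as two checks: the mechanism must provide every method the definition names (behavior contract), and the data object must supply every datastream the mechanism needs (data contract). The sketch below assumes that reading; all class names, fields, and PIDs are hypothetical and are not taken from the Fedora code base.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class BehaviorDefinition:
    pid: str
    methods: FrozenSet[str]  # abstract method names the definition promises


@dataclass(frozen=True)
class BehaviorMechanism:
    pid: str
    implements: str                       # PID of the behavior definition it fulfills
    methods: FrozenSet[str]               # methods the bound web service provides
    required_datastreams: FrozenSet[str]  # inputs the service needs from the object


def contracts_hold(
    bdef: BehaviorDefinition,
    bmech: BehaviorMechanism,
    object_datastreams: Iterable[str],
) -> bool:
    """Check the behavior contract and the data contract (assumed semantics)."""
    behavior_ok = bmech.implements == bdef.pid and bdef.methods <= bmech.methods
    data_ok = bmech.required_datastreams <= set(object_datastreams)
    return behavior_ok and data_ok


# Hypothetical image behaviors.
image_def = BehaviorDefinition("demo:bdef-image", frozenset({"get_thumbnail", "get_high"}))
image_mech = BehaviorMechanism(
    "demo:bmech-image",
    implements="demo:bdef-image",
    methods=frozenset({"get_thumbnail", "get_high"}),
    required_datastreams=frozenset({"MASTER"}),
)
```

With these values, `contracts_hold(image_def, image_mech, ["MASTER", "DC"])` is true, while an object that lacks a `MASTER` datastream fails the data contract.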
13 Basic Repository Architecture
- Repository System
- Object Management
- Lifecycle (Ingest/create → Store → Delete → Approve → Purge)
- Validation
- PID Generation
- Version management
- Access Control
- Preservation support
- Object Access
- Object Dissemination
- Object Reflection
- Service Mediation
14 Fedora and Web Services
- Fedora Repository system is a web service
  - Access/Search (API-A) and Management (API-M)
  - Service descriptions published using WSDL
  - Both SOAP and HTTP bindings
- Back-end services
  - Digital object behaviors implemented as linkages to other distributed web services
  - Service binding metadata (WSDL) stored in special Fedora Behavior Mechanism objects
  - Fedora acts as mediator to these services
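The mediator role described above can be sketched as a lookup followed by a call: the repository finds the binding for the requested method, fetches the datastream, and invokes the backend on the client's behalf. This is an illustration under simplifying assumptions, not the Fedora implementation: the class `Mediator`, the method `get_shouted`, and the function `uppercase_service` are hypothetical, and a local function stands in for a remote web service described by WSDL.

```python
from typing import Callable, Dict, Tuple

# A binding pairs a backend service with the datastream it consumes. It stands
# in for the service binding metadata kept in Behavior Mechanism objects.
Binding = Tuple[Callable[[bytes], bytes], str]


class Mediator:
    """Toy repository that sits between clients and backend services."""

    def __init__(self) -> None:
        self.datastreams: Dict[Tuple[str, str], bytes] = {}  # (pid, datastream id) -> content
        self.bindings: Dict[Tuple[str, str], Binding] = {}   # (pid, method) -> binding

    def disseminate(self, pid: str, method: str) -> bytes:
        service, datastream_id = self.bindings[(pid, method)]
        content = self.datastreams[(pid, datastream_id)]
        # The client never calls the backend directly; the repository does.
        return service(content)


def uppercase_service(data: bytes) -> bytes:
    """Hypothetical content transform; a real one would be a remote web service."""
    return data.upper()


repo = Mediator()
repo.datastreams[("demo:1", "TEXT")] = b"finding aid"
repo.bindings[("demo:1", "get_shouted")] = (uppercase_service, "TEXT")
```

A client asks only for `repo.disseminate("demo:1", "get_shouted")`; which backend performs the transformation is a detail the repository resolves.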
15 Fedora Repository System: Client and Web Service Interactions
[Diagram: Fedora Repository System between a frontend and a backend. Frontend labels: user, web browser, client application, Service. Backend labels: Web Service Dispatch, Web Service, Content Transform Service.]
16 Fedora Repository System
17 Server Design: 3 Layers
- Interface (service exposure): Access API (API-A, API-A-LITE), Management API (API-M)
- Application (logic): implements management, access, and security in terms of the Fedora object model
- Storage (database and file system): XML object serializations, cache(s)
18 Fedora 1.0 Features
- Public APIs - exposed as web services
- Flexible Digital Object Model
- XML submission and storage (extension of METS Schema)
- Local and distributed content
- Data (any type) and metadata (any schema: DC, other)
- Supports inter-relationships among objects
- Behavior contracts for objects
- Associate services with objects
- Objects can provide launch-pad or tool to use object content
- Repository System
- Management Service - manage digital resources, metadata, as well as computer programs, services and tools that support them
- Access Service - repository search and object disseminations
- Mediation - interactions with other distributed web services for content transformation and presentation
- Admin GUI client - object creation, update, purge, search
- OAI-PMH Provider - provides OAI-DC
- Basic Access Control - IP-based
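IP-based access control, the last feature listed, amounts to testing the client address against a list of permitted networks. The sketch below shows that test with Python's standard `ipaddress` module; it is not Fedora's own code, and the function name `is_allowed` and the `ALLOWED` policy are hypothetical (the ranges are reserved documentation addresses).

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowed_networks: Iterable[str]) -> bool:
    """Return True if the client address falls inside any permitted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)


# Hypothetical policy: one documentation subnet plus the local host.
ALLOWED = ["192.0.2.0/24", "127.0.0.1/32"]
```

A request from `192.0.2.17` would pass this check and one from `198.51.100.5` would not. Address checks of this kind are coarse: they identify machines or networks, not individual users.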
19 Fedora 1.0 (available May 16, 2003)
- Open Source Software
- GNU General Public License (GPL)
- Implementation Technologies
- Sun Java J2SDK1.4
- Apache Tomcat 4.1 and Apache Axis (SOAP)
- Xerces 2-2.0.2 for XML parsing and validation
- Saxon 6.5 for XSLT transformation
- Schematron 1.5 for validation
- MySQL 3.23.52 and Mckoi relational databases
- Deployment Platforms
- Windows 2000, NT, XP
- Solaris
- Linux
20 DEMO: Basic Use Cases
21 Finding Aids Collections at Virginia
Connect to Repository
22 Fedora Future Development Plans
1 Year Out
- Backend Service Mediation
  - SOAP dispatcher
- Full API-M implementation
- Advanced Access Control
  - Shibboleth
  - XML Policy expression
  - Fine-grained enforcement
- Object Versioning (for content)
- More object creation tools
- Improved Searching
- Performance
  - Tuning
  - Caching
2-3 Years Out
- Integrity Management Tools
  - Service liveness checker
  - Link liveness checker
  - Logging and stats
- Object Versioning (for behaviors)
- R2R Federation
  - Shared PID resolver service
  - Interoperability features
- Performance
  - Repository clustering
  - Load balancing
- Reliability
  - Fault Tolerance
  - Mirroring and Replication
23 Questions