Title: The Fedora Project
1 The Fedora Project
Tim Sigmon, University of Virginia
- JA-SIG Winter Conference
- December 9, 2003
2 This Fedora Project is not the Red Hat Fedora project.
3 The Fedora Project
- Fedora Digital Object Repository System
- Extensible digital object model
- Repository System exposed via Web service APIs
- Scalable, persistent storage for content and metadata
- Local and remote content
- Associate services with objects
- Content versioning
- Fedora Use cases
- Content Management (CMS)
- Digital Library architecture
- Digital Asset Management
- Institutional Repository
- Scholarly publishing
- Preservation
- Open source software
4 Priorities for digital libraries
- Managing digital resources as if they are all the same
- Delivering digital resources as if they are all unique and free to participate in any number of contexts
- Supporting digital scholarship wherever it may lead
5 Shortcomings of commercial digital library products
- Narrow focus on specific media formats (e.g. image databases, document management)
- Fail to effectively address interrelationships among digital entities
- Fail to address interoperability
- Fail to provide facilities for managing programs and tools that deliver digital content
- Not extensible: do not enable easy integration of new tools and services
6 Fedora History
- Research (1997-present)
- DARPA and NSF-funded research project at Cornell
- (Carl Lagoze and Sandy Payette)
- Reference implementation developed at Cornell
- First Application (1999-2001)
- University of Virginia digital library prototype
- (Thorny Staples and Ross Wayland)
- Scale/stress testing for 10,000,000 objects
- Open Source Software (2002-present)
- Andrew W. Mellon Foundation granted Virginia and Cornell $1 million to develop a production-quality Fedora system
- Fedora 1.0 released in May 2003
- www.fedora.info
7 Fedora 1.x
- Architecture
- Software
- Release 1.2 Features
- Demo Use Cases
8 Digital Object Model: Architectural View
- Persistent ID (PID) - globally unique persistent id
- Disseminators - public view: access methods for obtaining disseminations of digital object content
- System Metadata - internal view: metadata necessary to manage the object
- Datastreams - protected view: content that makes up the basis of the object
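The four parts of the object model can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, field names, the "THUMB" datastream label and the PID "example:1" are all invented for illustration and are not taken from the Fedora software.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Datastream:
    """Protected view: one piece of content or metadata inside the object."""
    ds_id: str
    mime_type: str
    content: bytes


@dataclass
class DigitalObject:
    """Illustrative container for the four parts named on the slide."""
    pid: str  # globally unique persistent id
    # Internal view: metadata needed to manage the object.
    system_metadata: Dict[str, str] = field(default_factory=dict)
    # Protected view: the content that makes up the object.
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # Public view: method name -> function that produces a dissemination.
    disseminators: Dict[str, Callable[["DigitalObject"], bytes]] = field(
        default_factory=dict
    )

    def list_methods(self) -> List[str]:
        """Report which disseminations this object can provide."""
        return sorted(self.disseminators)

    def disseminate(self, method: str) -> bytes:
        """Return a view of the content through a named public method."""
        return self.disseminators[method](self)


def get_thumbnail(obj: DigitalObject) -> bytes:
    # Hypothetical behavior: serve the datastream labeled "THUMB" unchanged.
    return obj.datastreams["THUMB"].content


image = DigitalObject(pid="example:1", system_metadata={"state": "active"})
image.datastreams["THUMB"] = Datastream("THUMB", "image/jpeg", b"\xff\xd8thumb")
image.disseminators["get_thumbnail"] = get_thumbnail
```

In this sketch a user only ever calls list_methods() and disseminate(), while a manager would work with datastreams and system_metadata directly.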
9 Digital Object Model: Example Disseminators
- Persistent ID (PID)
- Disseminators
- Default: Get Profile, List Items, Get Item, List Methods, Get DC Record
- Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh
- System Metadata
- Datastreams
10 Object Behavior Contracts
[Diagram. Objects: Behavior Definition Object, Data Object, Behavior Mechanism Object, Web Service. Relationships: behavior subscription, behavior contract, data contract.]
11 DEMO: Basic Use Cases
- Image (multiple datastreams)
- Image (MrSID)
- EAD (Rita Mae Brown papers)
- Text conversion (TEI to PDF)
- Basic Search
12 Users access data objects through behaviors (or disseminations).
[Diagram label: Application services]
13 Managers have direct access to each component of a data object.
14 Fedora and Web Services
- Fedora Repository system is a web service
- Access/Search (API-A) and Management (API-M)
- Service descriptions published using WSDL
- Both SOAP and HTTP bindings
- Back-end services
- Digital object behaviors implemented as linkages to other distributed web services
- Service binding metadata (WSDL) stored in special Fedora Behavior Mechanism objects
- Fedora acts as mediator to these services
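The mediator role described above can be sketched in a few lines. This is an illustrative model only, not Fedora's implementation: the Repository and BehaviorMechanism classes, the "text-service" name and the to_upper method are hypothetical, and a plain Python function stands in for a remote web service that would really be described by WSDL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class BehaviorMechanism:
    """Stand-in for a Behavior Mechanism object: binds methods to a service."""
    service_name: str
    # Method name -> id of the datastream handed to the service as input.
    bindings: Dict[str, str]


class Repository:
    """Mediates between a dissemination request and a back-end service."""

    def __init__(self) -> None:
        # Service name -> callable standing in for a distributed web service.
        self.services: Dict[str, Callable[[str, bytes], bytes]] = {}
        # PID -> (datastreams, mechanism that serves the object's behaviors).
        self.objects: Dict[str, Tuple[Dict[str, bytes], BehaviorMechanism]] = {}

    def disseminate(self, pid: str, method: str) -> bytes:
        datastreams, mechanism = self.objects[pid]
        if method not in mechanism.bindings:
            raise KeyError(f"{pid} does not support {method}")
        source = datastreams[mechanism.bindings[method]]
        service = self.services[mechanism.service_name]
        # The caller never contacts the service; the repository dispatches.
        return service(method, source)


def text_service(method: str, data: bytes) -> bytes:
    # Hypothetical content transform service with a single operation.
    if method == "to_upper":
        return data.upper()
    raise ValueError(f"unknown operation: {method}")


repo = Repository()
repo.services["text-service"] = text_service
repo.objects["example:2"] = (
    {"TEXT": b"hello"},
    BehaviorMechanism("text-service", {"to_upper": "TEXT"}),
)
result = repo.disseminate("example:2", "to_upper")
```

Keeping the binding in a separate mechanism object means a service can be swapped without touching the data object, which is the design point the slide makes.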
15 Fedora Repository System: Client and Web Service Interactions
[Diagram. Frontend: users, web browser, client applications, Web Service. Fedora Repository System: Web Service Dispatch. Backend: Content Transform Services.]
16 Fedora Repository Service Interfaces
- Management Service (API-M)
- Ingest - XML-encoded object submission
- Create - interactive object creation via API requests
- Maintain - interactive object modification via API requests
- Validate - application of integrity rules to objects
- Identify - generate unique object identifiers
- Security - authentication and access control
- Preserve - automatic content versioning and audit trail
- Export - XML-encoded object formats
- Access Service (API-A and API-A-LITE)
- Search - search repository for objects
- Object Reflection - what disseminations can the object provide?
- Object Dissemination - request a view of the object's content
- OAI-PMH Provider Service
- OAI-DC records
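A dissemination request over an HTTP binding comes down to naming the object, the behavior definition and the method. The slides do not give the URL syntax, so the layout below (base/get/PID/behavior-definition PID/method) is an assumption for illustration only, as are the host, the PIDs and the function name.

```python
from urllib.parse import quote


def dissemination_url(base: str, pid: str, bdef_pid: str, method: str) -> str:
    """Build a REST-style dissemination URL.

    The path layout is hypothetical; check it against the API documentation
    of the repository release you are actually using.
    """
    parts = [
        base.rstrip("/"),
        "get",
        quote(pid, safe=":"),       # keep the namespace colon readable
        quote(bdef_pid, safe=":"),
        quote(method, safe=""),
    ]
    return "/".join(parts)


url = dissemination_url(
    "http://localhost:8080/fedora/", "example:5", "example:1", "getThumbnail"
)
```

A client could pass the resulting string to any HTTP library; nothing here contacts a server.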
17 Fedora Repository System
18 Fedora 1.2 Software Feature Set
- Open Fedora APIs
- Repository as web services (REST and SOAP bindings), WSDL interface defs
- Flexible Digital Object Model
- Content View - objects as bundle of items (content and metadata)
- Service View - objects as a set of service methods (behaviors)
- Extensible functionality by associating services with objects
- Repository System
- Core Services - Management, Access/Search, OAI-PMH
- Storage - XML object store, relational db object cache, relational db object registry
- Mediation - auto-dispatching to distributed web services for content transformation
- Auto-Indexing - system metadata and DC record of each object
- HTTP Basic Authentication and Access Control
- Built-in disseminator services - XSLT x-form, image manipulation, xml-to-PDF
- Content Versioning
- Automatic version control (saves version of content/metadata when modified)
- Enables date-time stamped API requests (see object as it looked at a point in time)
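Date-time stamped requests follow naturally once every modification is kept as a new version. The sketch below shows one way such a lookup could work; it is an assumption for illustration, not Fedora's storage design, and the class name, method names and dates are invented.

```python
import bisect
from datetime import datetime
from typing import List, Optional


class VersionedDatastream:
    """Keeps every saved version so content can be read as of a past moment."""

    def __init__(self) -> None:
        self._stamps: List[datetime] = []   # modification times, ascending
        self._contents: List[bytes] = []    # content saved at each time

    def modify(self, content: bytes, when: datetime) -> None:
        # Assumes modifications arrive in chronological order.
        if self._stamps and when <= self._stamps[-1]:
            raise ValueError("modifications must move forward in time")
        self._stamps.append(when)
        self._contents.append(content)

    def as_of(self, when: Optional[datetime] = None) -> bytes:
        """Return the newest version, or the one current at `when`."""
        if not self._stamps:
            raise LookupError("datastream has no versions")
        if when is None:
            return self._contents[-1]
        # Index of the last version saved at or before `when`.
        index = bisect.bisect_right(self._stamps, when) - 1
        if index < 0:
            raise LookupError("no version existed at that time")
        return self._contents[index]


ds = VersionedDatastream()
ds.modify(b"first draft", datetime(2003, 5, 16))
ds.modify(b"revised text", datetime(2003, 12, 1))
earlier = ds.as_of(datetime(2003, 9, 1))
latest = ds.as_of()
```

Here a request stamped 1 September 2003 sees the first version, while an unstamped request sees the latest one.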
19 Fedora Software Distribution Package
- Open Source (Mozilla Public License)
- 100% Java (Sun Java J2SDK 1.4)
- Supporting Technologies
- Apache Tomcat 4.1 and Apache Axis (SOAP)
- Xerces 2-2.0.2 for XML parsing and validation
- Saxon 6.5 for XSLT transformation
- Schematron 1.5 for validation
- MySQL and Mckoi relational database
- Oracle 9i support
- Deployment Platforms
- Windows 2000, NT, XP
- Solaris
- Linux
20 DEMO: Basic Use Cases
- Image (multiple datastreams)
- Image (MrSID)
- EAD (Rita Mae Brown papers)
- Text conversion (TEI to PDF)
- Basic Search
21 Projects using Fedora
- University of Virginia - digital library (images, EAD, e-texts)
- Tufts University - educational (VUE/concept maps) digital library
- VTLS - basis for new commercial product (library system)
- Indiana University - EVIA Digital Archive (video)
- Northwestern - academic technologies (images, art, video, e-texts)
- Rutgers University - digital library (e-journals, numeric data)
- Yale University - Electronic Records Archive
- New York University - Humanities Computing Group
22 Fedora Downloads since May 2003
- Total downloads: >1500
- Average downloads per day: 9
- Countries: 32
- Types of orgs
- Universities: libraries, IT, departments
- Software and technology companies
- Defense/military
- Banks
- National libraries and archives
- Publishers
- Research labs
- Library automation vendors
- Scholarly societies
23 Future Software Releases
December 2003 to December 2004
- Fedora Object XML (FOXML)
- Internal storage format - direct expression of Fedora object model
- Better support for relationships (kinship metadata)
- Better support for audit trail (event history)
- Format identifiers for dynamic service binding
- Shibboleth authentication
- Policy Enforcement
- XACML expression language
- Fedora policy enforcement module
- Web interface for easy content submission
- Batch object modification utility
- Administrative Reporting
- Object Event History (ABC/RDF disseminations)
- Better support for collections
- New ingest and export formats (METS 1.3, DIDL)
24 Future Development Proposals
- Digital Library in a Box
- Full-featured DL application with Fedora inside
- Optimized for common set of content types
- Fedora Power Server
- Integrity Management Tools
- Service and link liveness checker
- Fault Tolerance
- Mirroring and Replication
- Peer-to-peer interoperability features
- Repository clustering
- Load balancing
- Object Creation Tools
- Workflow applications based on content models
- Web interface for document/content submission
25 Questions?
www.fedora.info