Title: New Value from the DSpace Foundation and Fedora Commons
1 New Value from the DSpace Foundation and Fedora Commons
DuraSpace
- Michele Kimpton and Sandy Payette
- Executive Directors
2 Social and Technical Forces (2000-present): Waves of Repository-Enabled Applications
- Institutional Repositories
- Digital Collections
- Digital Libraries
- Collaborative Spaces and Web 2.0
- Scholarly and Scientific Infrastructure
- E-Research
- Data (archiving, linking, sharing)
3 Implications for our future work
more open
more collaborative
more web-oriented
more interoperable
more distributed
4 Emergence of Infrastructure
Systems
- Integrate components
- Central control
- Dedicated/specialized gateways
- More closed
- More preconceived
Networks
- Integrate systems
- Distributed control
- Generic gateways
- More open
- More reconfigurable
Source: Understanding Infrastructure: Lessons for New Scientific Infrastructure,
http://deepblue.lib.umich.edu/handle/2027.42/49353
5 Source: Francine Berman, Got Data? A Guide to Data Preservation in the Information Age, December 2008, pp. 50-56 (page 55)
6 History: DSpace and Fedora
- Two open source repository systems
- DSpace
- End-user application and repository
- Turn key system providing easy out-of-box
- Fedora
- Web services (repository and supporting services)
- Flexible, modular, and scalable
- Enabling technology supporting
- scholarship, science, culture, education
- open access
- preservation and archiving
7 DSpace and Fedora Installations
Universities, Research Centers, Libraries, Archives, Cultural Heritage, Government, More
Largest share of open repositories worldwide
over 700 institutions tracked in our registries
8 DSpace Foundation and Fedora Commons: 501(c)(3) non-profit organizations
Progression of Partnership
Web APIs Storage Abstraction Architecture Strategy
DuraSpace Future Joint Offerings Business
Strategy Communication/Outreach
SWORD Deposit MS Word Plug-In
9 http://blogs.the451group.com/opensource/
10 Goals of Strategic Partnership
- Stewardship
- Support and align open source development communities for DSpace and Fedora
- Keepers of the cause (durability, access)
- Innovation
- Think beyond existing platforms
- New strategic directions for repositories
- New products and services
- Sustainability
- Devise business models that fit our sector
- Services that generate revenue for non-profits
11 What About the Cloud?
A style of computing where massively scalable
IT-related capabilities are provided as a
service using Internet technologies to multiple
external customers. (Gartner, 6/08).
An emerging architecture in which data and
applications reside in cyberspace, allowing
users to access via the internet (Pew Internet
9/08)
12 Types of Cloud Services
- Software as a Service (SaaS)
- e.g., Google Apps
- Cloud Computing
- e.g., Amazon Elastic Compute Cloud (EC2)
- Cloud Storage
- e.g., Amazon Simple Storage Service (S3)
13 Cloud Services
For the organization: elastic, web-based infrastructure for storage and compute
14 Vision: Federated Repositories and Cyberinfrastructure
Heaven
DuraSpace
15 DuraSpace Proposition
- Trust and durability in the cloud
16 What have we learned from our users?
Focus Groups
Site Visits
Forums
17 Problems
Preservation important but difficult to implement
- Tools and processes unproven
- Limited IT support
- Capital expenditures limited
- Task can be overwhelming (replication, migration, emulation, etc.)
18 Problems
Barriers to making content more accessible and
useful to researchers
- Systems not interoperable
- Heterogeneous applications/platforms
- Lack of common standards
- Inelastic compute capability
19 Advantages of Cloud Services
- Flexibility
- Scalability
- Pay for use
- Easy to implement
- Cost
20 Cost
- Public cloud providers drive cost down through scale, location, and virtualization technology
Large data centers (50k) can achieve 5 to 7 times cost savings over medium data centers (1,000).
Source: Hamilton, J., Internet-Scale Service Efficiency (Sept 08)
21 Issues
- Security
- Transparency
- Data lock-in
- SLAs
- Trust
22 DuraSpace
Trusted management of and access to durable
digital assets in the cloud
DuraSpace Mediating Service
23 DuraSpace: Notional Architecture
24 Architectural view
25 Core services: Preservation based
- Replicate to multiple storage providers
- Replicate to multiple geographic areas
- Be able to manage content and services through a web-based Dashboard
- Includes integrity checking and monitoring
- Pay for use for services and storage
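As an illustration of the replication bullets above, here is a minimal sketch in Python. It is not DuraSpace code: the `InMemoryProvider` class, the provider names, and the regions are hypothetical stand-ins for real cloud storage providers, and SHA-256 is an assumed choice of checksum. It only shows the idea of writing the same content to several providers and verifying each copy.

```python
import hashlib


class InMemoryProvider:
    """Hypothetical stand-in for one cloud storage provider."""

    def __init__(self, name, region):
        self.name = name
        self.region = region
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


def sha256_hex(data):
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def replicate(key, data, providers):
    """Write data to every provider, then read each copy back and verify it.

    Returns a dict mapping provider name to True if the stored copy
    matches the checksum of the original content.
    """
    expected = sha256_hex(data)
    results = {}
    for provider in providers:
        provider.put(key, data)
        results[provider.name] = sha256_hex(provider.get(key)) == expected
    return results


# Two providers in different (assumed) geographic areas.
providers = [
    InMemoryProvider("provider-a", "us-east"),
    InMemoryProvider("provider-b", "eu-west"),
]
report = replicate("thesis.pdf", b"example content", providers)
print(report)
```

A real service would replace the in-memory class with calls to each provider's storage API and record the per-provider results for the dashboard.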
26 Technology Services
- Build and run services on top of content stored in the cloud
- Search
- Aggregation
- Streaming
- Migration
- Hosting
- Enable others to build services/apps on top of
content
27 Use Cases: DuraSpace with Cloud Storage
- Online backup for text, images, datasets, video, audio
- Preservation: multiple copies, geographies, administrations
- Temporary or permanent project storage
28 Use Cases: DuraSpace with Cloud Compute
- Streaming service for video
- JPEG2000 image engine
- Indexing and other processing heavy jobs
- Staging area for repository ingest
- Repositories in cloud
- Data and text mining over open data
- Aggregation and web 2.0 tools on open content and
collections
29 DuraSpace software
- Open source: Apache license
- Open core
- Run your own: private clouds, university consortia
- Extensible: research partners
30 Critical success factors
- Ease of use, simplicity
- Trusted partner for end user
- Cost effective
- Scalable/Flexible
- Can establish key partnerships with service providers
- Can build community of developers and users
31 Timeline
- Identified initial cloud partners
- Identified initial pilot partners
- Defined initial requirements
- Initial open source release: Q3 2009
- Begin pilot: Fall 2009
- Extensions available for repository platforms: Q1 2010
- Roll out to repository community: Q1 2010
- Launch production service: Q2 2010
32 Initial capabilities
- Replication, up to three providers (including local store)
- Web-based Dashboard
- Data integrity checking and monitoring
- Can push content from DSpace/Fedora repository platform
- Integrated billing
- Compute capability
- A few initial compute services TBD
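The integrity checking and monitoring capability listed above could look something like the following sketch. It is illustrative only: the store names, the dict-based stores, and the rule that the majority checksum is authoritative are all assumptions, not a description of how DuraSpace works.

```python
import hashlib
from collections import Counter


def checksum(data):
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(key, stores):
    """Compare the copies of `key` across stores and repair outliers.

    `stores` maps a store name to a dict of key -> bytes (hypothetical
    stand-ins for a local store and cloud providers). The checksum held
    by a strict majority of stores is treated as correct; that rule is
    an assumption of this sketch. Returns the sorted names of the stores
    that were repaired.
    """
    digests = {}
    for name, objects in stores.items():
        data = objects.get(key)
        digests[name] = checksum(data) if data is not None else None

    present = [d for d in digests.values() if d is not None]
    if not present:
        raise KeyError(key)

    good_digest, votes = Counter(present).most_common(1)[0]
    if votes * 2 <= len(stores):
        raise ValueError("no majority copy; manual review needed")

    good_data = next(
        stores[name][key] for name, d in digests.items() if d == good_digest
    )
    repaired = []
    for name, digest in digests.items():
        if digest != good_digest:
            stores[name][key] = good_data  # overwrite the bad or missing copy
            repaired.append(name)
    return sorted(repaired)


# Three copies (including a local store); one cloud copy is corrupted.
stores = {
    "local": {"image.jp2": b"original bytes"},
    "cloud-a": {"image.jp2": b"original bytes"},
    "cloud-b": {"image.jp2": b"corrupted bytes"},
}
repaired = audit_and_repair("image.jp2", stores)
print(repaired)
```

With three copies, a single damaged or missing copy can be detected and restored from the other two; with only two copies a disagreement cannot be resolved automatically, which is why the sketch raises an error in that case.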
33 Listen
- Sandy and Michele's DuraSpace webinar: http://www.education-webevents.com/