Title: The NSDL: A Case Study in Interoperability
1The NSDL: A Case Study in Interoperability
William Y. Arms, Cornell University
2Acknowledgement and Disclaimer
The NSDL is a program of the National Science
Foundation's Directorate for Education and Human
Resources, Division of Undergraduate
Education. The NSDL Core Integration is a
collaboration between the University Corporation for
Atmospheric Research (Dave Fulker), Columbia
University (Kate Wittenberg) and Cornell
University (Bill Arms). The ideas discussed in
this talk do not represent the official views of
the NSF or the Core Integration team.
3Research Funding Europe and USA
Europe: Grant is awarded to carry out the research plan specified in the proposal.
USA: Grant is awarded to carry out research in the area described in the proposal, but is not expected to follow the precise plan.
4New Initiatives during a Grant
Program          Activity                  University
Gigabit testbed  Mosaic                    Illinois
CSTR             Lycos                     Carnegie Mellon
DLI-1            Google PageRank           Stanford
DLI-2            Open Archives Initiative  Cornell
Examples of significant partial funding that was
not envisaged in the proposal.
5NSF-funded Research Programs
[Diagram labels: Research, New ideas]
6The NSDL Program
NSF's objective: Build a comprehensive digital library for all aspects of science education.
NSF's approach:
- Solicitation encouraged a wide diversity of proposals, divided into general categories
- Best 60 proposals funded, with more to follow
- Grants allow projects flexibility
Result: A splendid set of projects. A challenge in interoperability!
7NSDL Collections Funded by the NSF (a)
Focused collections
11NSDL Collections Funded by the NSF (b)
Aggregates and federations
15NSDL Service Projects Funded by the NSF
19NSDL Core Integration Team Funded by the NSF
20Responsibility without Authority
Core Integration
- Budget: 4-6 million
- Staff: 25-30
- Management: Diffuse
How can a small team, without direct management
control, create a very large-scale digital
library?
21How Big might the NSDL be?
- All branches of science, all levels of education, very broadly defined
- Five year targets
- 1,000,000 different users
- 10,000,000 digital objects
- 10,000 to 100,000 independent sites
22The NSDL program funds only a fraction of the
relevant collections.
Collections
23Every Collection is Different
24The Core Integration Task ...
... to provide a coherent set of services across
great diversity.
25A Spectrum of Interoperability
26Approaches to interoperability
The conventional approach
- Wise people develop standards: protocols, formats, etc.
- Everybody implements the standards.
- This creates an integrated, distributed system.
Unfortunately ...
- Standards are expensive to adopt.
- Concepts are continually changing.
- Systems are continually changing.
- Different people have different ideas.
27Interoperability is about agreements
- Technical agreements cover formats, protocols, security systems, etc., so that messages can be exchanged.
- Content agreements cover the data and metadata, and include semantic agreements on the interpretation of the messages.
- Organizational agreements cover the ground rules for access, for changing collections and services, payment, authentication, etc.
The challenge is to create incentives for independent digital libraries to adopt agreements.
28Function versus cost of acceptance
[Chart: function versus cost of acceptance. Region labels: Few adopters, Many adopters]
29Example textual mark-up
[Chart: function versus cost of acceptance. Plotted points: SGML, XML, HTML, ASCII]
30Example security
[Chart: function versus cost of acceptance. Plotted points: Public key infrastructure, Login ID and password, IP address]
31Levels of interoperability
Level       Agreements                                        Example
Federation  Strict use of standards                           AACR, MARC, Z39.50
            (syntax, semantic, and business)
Harvesting  Digital libraries expose metadata;                Open Archives metadata harvesting
            simple protocol and registry
Gathering   Digital libraries do not cooperate;               Web crawlers and search engines
            services must seek out information
32Metadata Strategy
- Metadata is expensive
- The NSDL cannot afford to create it manually
33Metadata Strategy
- Support eight standard formats
- Collect all existing metadata in these formats
- Provide crosswalks to Dublin Core
- Expose records in the metadata repository for others to harvest
- Concentrate on collection-level metadata
- Use automatic generation to augment item-level metadata
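The "crosswalks to Dublin Core" item can be illustrated with a small sketch. The talk does not say how NSDL crosswalks are implemented, so everything here is an assumption for illustration: the two source formats, their field names (one loosely modeled on MARC field tags, one invented), and the function name `to_dublin_core`. Only the target names are real Dublin Core elements.

```python
# Hypothetical crosswalk tables: source field name -> Dublin Core element.
# The source formats and field names are invented for this sketch; the
# targets are elements of the Dublin Core element set.
CROSSWALKS = {
    "marc_like": {
        "245a": "title",
        "100a": "creator",
        "650a": "subject",
        "520a": "description",
        "260b": "publisher",
    },
    "local_catalog": {
        "name": "title",
        "author": "creator",
        "keywords": "subject",
        "url": "identifier",
    },
}


def to_dublin_core(record, source_format):
    """Map one source record (a dict) onto Dublin Core elements.

    Values may be a single string or a list of strings. Fields with no
    entry in the crosswalk table are dropped, which is one reason a
    crosswalk usually loses information.
    """
    mapping = CROSSWALKS[source_format]
    dc = {}
    for field, values in record.items():
        element = mapping.get(field)
        if element is None:
            continue  # no Dublin Core equivalent in this sketch
        if isinstance(values, str):
            values = [values]
        # Dublin Core elements are repeatable, so collect values in lists.
        dc.setdefault(element, []).extend(values)
    return dc


# Made-up example record in the hypothetical "marc_like" format.
example = {
    "245a": "Plate Tectonics",
    "100a": "A. Author",
    "650a": ["Geology", "Earth science"],
    "999x": "local note",
}
example_dc = to_dublin_core(example, "marc_like")
```

Keeping one table per source format means that supporting another format is a matter of adding a table rather than changing the mapping code.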
34The Metadata Repository
The metadata repository is a resource for service providers. It holds information about every collection and item known to the NSDL.
[Diagram labels: Users, Services, Metadata repository, Collections]
35Services Strategy
36The Metadata Repository as a Resource
- Records are exposed through the Open Archives Initiative harvesting protocol.
- Core Integration team will provide some services based on the metadata repository.
- The architecture encourages others to build services.
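As a rough illustration of how a service provider might harvest exposed records, the sketch below builds a `ListRecords` request and parses a response. It assumes the request and namespace conventions of version 2.0 of the Open Archives Initiative Protocol for Metadata Harvesting; the talk does not say which protocol version the NSDL used. The base URL, the sample response, and the function names are hypothetical, and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used in OAI-PMH 2.0 responses carrying oai_dc metadata.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's base URL."""
    if resumption_token:
        # A resumptionToken is sent on its own, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", OAI_NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", OAI_NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    return records, (token or None)


# Made-up response from a hypothetical repository at example.org.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Introduction to Plate Tectonics</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, next_token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://example.org/oai", resumption_token=next_token)
```

A real harvester would fetch each URL, parse the response, and repeat until no resumption token is returned.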
37Example Search Service
[Diagram labels: Collections, Metadata repository, OAI, Search and Discovery Services, SDLIP, http, Portal (three instances)]
James Allan, Bruce Croft (University of Massachusetts, Amherst)
38Research Challenges
Extending the Architecture to Support Greater Riches
- Federations with rich sets of agreements (e.g., MARC, Z39.50)
- Rich object models (e.g., interactive, dynamic, continuous time)
- Language tools (e.g., thesaurus, gazetteer)
... and Lesser Riches
- Web crawling
- Automated quality control