Digital Library Architecture: A Service-Based Approach

Digital Library Architecture: A Service-Based Approach. Mo i Rana, Norway, November 10, 1998. Sandra Payette, Department of Computer Science, Cornell University.

1
Digital Library Architecture: A Service-Based Approach
Mo i Rana, Norway, November 10, 1998
Sandra Payette, Department of Computer Science, Cornell University
payette@cs.cornell.edu
http://www2.cs.cornell.edu/payette/presentations/DL-architecture.ppt
2
Overview
  • Why talk about DL architecture?
  • Digital Libraries - the architectural perspective
  • Review of service-based architecture
  • NCSTRL - a working example
  • Dienst - existing service-oriented architecture
  • Cornell next generation (component-oriented)
  • Conclusion

3
Why Talk about Digital Library Architecture?
  • Web alone is not a digital library
  • Commercial packages limited
    • limited flexibility
    • standards issues
    • network-enabled applications, not DL architecture
  • Must position for broader DL opportunities

4
Web by itself not a DL Architecture
  • Documents - Files, CGI, MIME-Types
  • Naming - URLs
  • Document Servers - HTTP servers
  • Resource Discovery - web crawlers
  • Collections - web pages, ad-hoc
  • Intellectual property (IP) - access control lists, passwords, ad-hoc

5
WWW Infrastructure Evolving
  • Resource Description Framework (RDF)
    • will allow rich metadata semantics for documents
    • http://www.w3.org/RDF/
  • Extensible Markup Language (XML)
    • will allow highly structured documents and rich
      linking (relationship) capabilities
    • http://www.w3.org/XML/
  • Uniform Resource Names (URNs)
    • will allow for persistent, globally unique
      identifiers

6
But still need Digital Library Architecture
  • Richer document model - digital objects
  • Persistent, unique naming - URNs
  • Well-defined digital library services
  • Better facilities for resource discovery
  • Flexible definition of collections
  • Management of distributed content services
  • Rights management for intellectual property

7
Digital Library Interoperability
8
Digital Library Architecture: Key Principles
  • Open Architecture
    • functionality partitioned into a set of
      well-defined services
    • services accessible via a well-defined protocol
  • Modularization
    • promotes interoperability
    • scalable to different clientele (research
      library, informal web)
  • Federation
    • enables aggregation into logical collections
  • Distribution
    • of content (collections) and services
    • of administration and management of the DL

9
Component-Ware Digital Libraries
Digital Objects
10
NCSTRL: A Working Example
A Globally Distributed Digital Library
120 Institutions in US, Europe, and Asia
11
NCSTRL Participants: collections federated
  • 120 institutions
    • Universities/labs - research reports
    • European Research Consortium for Informatics and
      Mathematics (ERCIM)
    • Los Alamos (Physics pre-prints, ACM)
    • D-Lib Magazine
  • 40 independent servers

12
Federation of Collections
13
Documents in Distributed Repositories
14
Multi-Format Document Model
15
NCSTRL: Real-world testbed for ...
  • modular system based on a standard open
    architecture
  • study of hard, real-world problems: policy
    issues, quality of service, federation of
    publishers
  • creation of a self-sustaining international
    federated digital collection

16
Dienst: NCSTRL technical base
  • Implements a service-based architecture for
    distributed digital libraries
  • Protocol and reference implementation
  • Network of services
  • WWW browser access
  • Uniform search over distributed indexes
  • Access to documents in distributed repositories
  • Access to multi-formatted documents

17
Dienst: Service-Based Architecture
  • Document model
  • Naming service (CNRI's Handle System)
  • Repository service
  • Indexer service
  • Collection service
  • User Interface service

18
Dienst Document Model
19
(No Transcript)
20
Dienst Document Protocol
  • Documents addressable through their URNs
  • Document service requests
    • get document metadata
    • get document formats
    • get document in format
    • get document partition (page) in format
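
The four document service requests listed on this slide can be sketched as methods on a small in-memory service. This is only an illustration of the request shapes, not the Dienst protocol itself: the class names, method names, and page contents are assumptions. The identifier and title are taken from the further-reading slide.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical in-memory document: metadata plus pages per format."""
    metadata: dict
    # format name -> list of pages (each page is bytes)
    pages_by_format: dict = field(default_factory=dict)


class DocumentService:
    """Illustrative stand-in for a repository answering the four requests."""

    def __init__(self):
        self._docs = {}  # URN -> Document

    def add(self, urn, document):
        self._docs[urn] = document

    def get_metadata(self, urn):
        # "get document metadata"
        return dict(self._docs[urn].metadata)

    def get_formats(self, urn):
        # "get document formats"
        return sorted(self._docs[urn].pages_by_format)

    def get_document(self, urn, fmt):
        # "get document in format": all pages of that format, concatenated
        return b"".join(self._docs[urn].pages_by_format[fmt])

    def get_page(self, urn, fmt, page_number):
        # "get document partition (page) in format"; pages count from 1
        return self._docs[urn].pages_by_format[fmt][page_number - 1]


service = DocumentService()
service.add(
    "ncstrl.cornell/TR98-1690",
    Document(
        metadata={"title": "An Infrastructure for Open-Architecture Digital Libraries"},
        # Made-up page contents, for illustration only.
        pages_by_format={"text": [b"page one\n", b"page two\n"]},
    ),
)
```

Every request takes the document's URN as its first argument, which mirrors the slide's first point: documents are addressed by name, not by file location.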

21
Dienst 5.0 Document Protocol
  • More complex document model
    • versions
    • hierarchical part specification
    • binders (multi-part documents)
  • Structure service request
    • Reveal, in XML, full or collapsed structure of a
      document
      • e.g., chapters, sections, figures, etc.
    • Describe multiple views of a document
      • e.g., bibliography, content, thumbnails
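
As a rough illustration of "full or collapsed structure" in XML, the sketch below renders a hypothetical document outline either completely or cut off at a chosen depth. The element names, the `collapsed` attribute, and the outline itself are assumptions made for this example. They are not the schema that Dienst 5.0 uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical document outline: (kind, label, children)
OUTLINE = ("document", "TR98-1690", [
    ("chapter", "1", [
        ("section", "1.1", [("figure", "1", [])]),
        ("section", "1.2", []),
    ]),
    ("chapter", "2", []),
])


def to_xml(node, max_depth=None, depth=0):
    """Return an Element for node; stop descending once max_depth is reached."""
    kind, label, children = node
    element = ET.Element(kind, {"id": label})
    if max_depth is not None and depth >= max_depth:
        if children:
            # Mark that detail was omitted rather than silently dropping it.
            element.set("collapsed", "true")
        return element
    for child in children:
        element.append(to_xml(child, max_depth, depth + 1))
    return element


# Full structure: chapters, sections, and figures.
full = ET.tostring(to_xml(OUTLINE), encoding="unicode")
# Collapsed structure: chapters only.
collapsed = ET.tostring(to_xml(OUTLINE, max_depth=1), encoding="unicode")
```

A client could request the collapsed form first to build a table of contents, then ask for the full structure of a single chapter.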

22
Dienst Core Services
WWW browser
Dienst User Interface
23
Dienst Protocol: Building Gateways to non-Conforming Sites
24
Dienst Collection Service
25
Naming Service
  • Documents identified by globally unique names
  • Names are persistent, permanent
  • Registered names resolve to specific location
    (URL)

Persistent Identifier (e.g., URN): cnri.dlib/april97-payette
  Naming Authority: cnri.dlib
  Item Name: april97-payette
Location (URL): http://www.somewebserver.org/somedirectory/somefile
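
A minimal sketch of the naming idea on this slide, assuming a simple in-memory registry: the identifier cnri.dlib/april97-payette splits into a naming authority and an item name, and the registered name resolves to whatever location is current. The `NameRegistry` class and the second URL are hypothetical. This is not the Handle System API.

```python
def split_identifier(identifier):
    """Split 'naming-authority/item-name' at the first slash."""
    authority, sep, item = identifier.partition("/")
    if not sep or not authority or not item:
        raise ValueError(f"not of the form authority/item: {identifier!r}")
    return authority, item


class NameRegistry:
    """Toy resolver: registered names map to a current location (URL)."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        split_identifier(identifier)  # reject malformed names early
        self._locations[identifier] = url

    def resolve(self, identifier):
        return self._locations[identifier]


registry = NameRegistry()
name = "cnri.dlib/april97-payette"
registry.register(name, "http://www.somewebserver.org/somedirectory/somefile")

# The document moves; only the registry entry changes, the name does not.
# (The new URL is invented for this example.)
registry.register(name, "http://www.somewebserver.org/newdirectory/somefile")
```

Citations keep using the same name after the move, which is the sense in which the slide calls names persistent.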
26
Identifiers: Current Initiatives
  • IETF Uniform Resource Names (URN)
    • specification of URN framework
    • requirements for resolution systems
    • syntax definition
  • Existing Systems
    • CNRI's Handle System (used by NCSTRL)
    • OCLC PURLs
    • DOI Initiative

27
Looking Ahead: Current Research at Cornell
  • Digital Objects and Repository
    • FEDORA
    • Joint work in Interoperability with CNRI
  • Access Management
  • Resource Discovery
    • STARTS (Cornell/Stanford collaboration)
    • Intelligent Distributed Searching
  • Collection Definition

28
Digital Object is...
getSection getArticle
getTrack getLabel
getChapter getPage
getFrame getLength
recognizable by what it can do
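
One way to read "recognizable by what it can do" is structural typing: a client asks which operations an object supports, not what class it is. The sketch below uses Python's `typing.Protocol` for this. The protocol names, the `ScannedBook` class, and the snake_case method names (standing in for the slide's getChapter, getPage, getTrack, getLabel) are assumptions. This is not FEDORA's actual interface.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class BookLike(Protocol):
    def get_chapter(self, number: int) -> str: ...
    def get_page(self, number: int) -> str: ...


@runtime_checkable
class RecordingLike(Protocol):
    def get_track(self, number: int) -> str: ...
    def get_label(self) -> str: ...


class ScannedBook:
    """Hypothetical digital object that behaves like a book."""

    def __init__(self, chapters):
        self._chapters = chapters  # list of chapters, each a list of page strings

    def get_chapter(self, number):
        return "\n".join(self._chapters[number - 1])

    def get_page(self, number):
        pages = [page for chapter in self._chapters for page in chapter]
        return pages[number - 1]


def content_types(obj):
    """Name the content-type interfaces an object satisfies."""
    candidates = {"book": BookLike, "recording": RecordingLike}
    return [name for name, proto in candidates.items() if isinstance(obj, proto)]


book = ScannedBook([["p1", "p2"], ["p3"]])
```

`ScannedBook` never declares that it is a book. It is recognized as one only because it offers the chapter and page operations.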
29
What the client sees vs. What the object is
Book
Content-Type Interfaces
MARC
Mechanism
Structure
30
FEDORA DigitalObject
31
FEDORA: Extensibility for Content Types
  • Simple, familiar content types
  • Complex, compound, dynamic content types

32
Resource Discovery
  • Meta-Searching for Resource Discovery
    • query multiple document sources
    • choose best sources to evaluate a query
    • evaluate the query at these sources
    • merge the query results from these sources
  • Stanford Protocol Proposal for Internet Retrieval
    and Search (STARTS)
    • www-db.stanford.edu/gravano/starts.html
    • www.cs.cornell.edu/NCSTRL/STARTS/STARTShome.htm
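
The meta-searching steps on this slide (choose sources, evaluate the query there, merge the results) can be sketched with toy keyword matching. The `Source` class, the scoring rule, and the sample collections are all assumptions made for illustration. This is not the STARTS protocol.

```python
class Source:
    """Hypothetical document source with a tiny keyword index."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # doc id -> text

    def estimate(self, terms):
        # Crude source-selection score: number of documents containing any term.
        return sum(
            any(term in text.lower().split() for term in terms)
            for text in self.documents.values()
        )

    def search(self, terms):
        # Score each document by how many query terms it contains.
        hits = []
        for doc_id, text in self.documents.items():
            words = text.lower().split()
            score = sum(term in words for term in terms)
            if score:
                hits.append((score, self.name, doc_id))
        return hits


def meta_search(query, sources, max_sources=2):
    terms = query.lower().split()
    # 1. Choose the best sources for this query.
    ranked = sorted(sources, key=lambda s: s.estimate(terms), reverse=True)
    chosen = [s for s in ranked[:max_sources] if s.estimate(terms) > 0]
    # 2. Evaluate the query at each chosen source.
    results = [hit for source in chosen for hit in source.search(terms)]
    # 3. Merge into one ranked list (highest score first, ties by name and id).
    return sorted(results, key=lambda hit: (-hit[0], hit[1], hit[2]))


sources = [
    Source("cornell", {"TR1": "digital library architecture", "TR2": "compilers"}),
    Source("stanford", {"S1": "digital library search protocol"}),
    Source("other", {"O1": "operating systems"}),
]
merged = meta_search("digital library", sources)
```

In this toy every source scores documents the same way, so merging is a plain sort. Independent sources generally rank differently, which makes the merge step much harder in practice.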

33
Distributed Collection Service: Definition and Access
User Interface
Intelligent routing based on regional conditions
Central Collection Server
34
Conclusions: Design with an Eye Toward the Future
  • Know limitations of ad-hoc web development and
    commercial packages
  • Embrace a service-based approach
    • modular designs increase flexibility,
      extensibility, plug-in/plug-out
    • well-defined services with protocols to enable
      federation and interoperability
    • can utilize various technologies or commercial
      software underneath the service layers
  • Watch Web developments in XML and RDF

35
Further reading
  • Lagoze and Payette, An Infrastructure for
    Open-Architecture Digital Libraries,
    http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690
  • Davis and Lagoze, NCSTRL: Design and Deployment
    of a Globally Distributed Digital Library, Draft
    of submission to IEEE Computer Special Issue on
    Digital Libraries, February 1999,
    http://www2.cs.cornell.edu/lagoze/papers/NCSTRL-IEEE3.doc
  • Payette, Persistent Identifiers, RLG DigiNews,
    http://www.rlg.org/preserv/diginews/diginews22.html
  • Payette and Lagoze, Flexible and Extensible
    Digital Object and Repository Architecture
    (FEDORA),
    http://www2.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html