The Digital Library: Current Technologies and Challenges

1
The Digital Library: Current Technologies and
Challenges
  • William H. Mischo
  • w-mischo_at_uiuc.edu
  • Grainger Engineering Library Information Center
  • University of Illinois at Urbana-Champaign
  • SLA Global 2000
  • October 18, 2000

2
Outline
  • Definition of Digital Library.
  • Elements of a Digital Library.
  • Full-Text Document Technologies.
  • Illinois Testbed.
  • XML: its Role and Importance.
  • Distributed Repository model.
  • Role of Libraries and Librarians.

3
The Digital Library
  • Digital, Virtual, Electronic Library as
    network-based library without regard to place and
    time.
  • Implementation issues.
  • Digital Collections vs. Digital Library.
  • Must Emphasize Integration of Collections and
    Services.

4
Elements of DL
  • Collections.
  • Services.
  • Technologies and Standards.
  • Integration of All.

5
Full-Text Technologies
  • Continuum of Web-Enabled Technologies.
  • Evolving Technologies and Standards.
  • All Presently being Utilized.
  • Role and History of Markup.
  • XML: its Role and Importance.
  • The Smart Document.

6
Illinois DLI-I Project
  • Funded under DLI-I by NSF, DARPA, and NASA,
    1994--1998. Awards made to 6 universities.
  • Large-Scale Testbed, Distributed Repository
    Models, Evaluation, Web Software.
  • CNRI D-Lib Test Suite Program, 1998--2001.
  • Collaborating Partners Program: AIP, APS, ASCE,
    IEE, NRL, ASM, ACM, NTT Learning Systems,
    Elsevier.

7
Illinois Testbed
  • American Institute of Physics (APL, JAP, RSI):
    16,000 articles, 1995--.
  • American Physical Society (PRL):
    10,000 articles, 1995--, weekly updates.
  • ASCE Journals (25 titles):
    9,000 articles, 1995--.
  • IEE Proceedings and Electronics Letters:
    8,500 articles, 1993--.
  • ASM (American Society for Materials) Handbook.
  • ACM (Association for Computing Machinery).
  • Elsevier Science.

8
Project Issues
  • Evolution of the Document.
  • Information Environment.
  • Use of Metalanguages & Transformations (SGML,
    XML).
  • Searching over Full-Text of Journals vs. Abstract
    & Index Service Database.
  • Rendering and Styling (SGML, XML, MathML).
  • Dynamic Metadata for Normalization, Linking.
  • Breadth and Depth of Collections.
  • User Needs.

9
Accomplishments
  • Process & Retrieve from Multiple Publishers with
    Heterogeneous DTDs.
  • Cross-Repository Searching.
  • SGML to XML Conversion.
  • Metadata Extraction, Representation, Merging.
  • Transformation & Rendering Technologies.
  • Dynamic Linking: Forward/Backward, from/to A&I
    Services.
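
The metadata extraction and merging step across heterogeneous DTDs could look roughly like the sketch below. The element names (`atl`, `au`, `article-title`, and so on), the publisher keys, and the sample records are hypothetical and are not taken from any real publisher DTD or from the Testbed software; the sketch only shows the idea of mapping each publisher's markup onto one common record.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-publisher element names; real publisher DTDs differ.
FIELD_MAPS = {
    "pub_a": {"title": "atl", "author": "au", "year": "yr"},
    "pub_b": {"title": "article-title", "author": "author", "year": "pubyear"},
}


def first_text(root, tag):
    """Return the stripped text of the first element named `tag`, or ''."""
    element = next(root.iter(tag), None)
    if element is None or element.text is None:
        return ""
    return element.text.strip()


def extract_metadata(xml_text, publisher):
    """Map one publisher's article markup onto a common metadata record."""
    root = ET.fromstring(xml_text)
    names = FIELD_MAPS[publisher]
    return {
        "title": first_text(root, names["title"]),
        "authors": [el.text.strip() for el in root.iter(names["author"]) if el.text],
        "year": first_text(root, names["year"]),
        "source": publisher,
    }


# Illustrative sample documents, one per hypothetical DTD.
SAMPLE_A = "<art><atl>Thin films</atl><au>A. Smith</au><au>B. Jones</au><yr>1999</yr></art>"
SAMPLE_B = (
    "<article><front><article-title>Bridge loads</article-title>"
    "<author>C. Lee</author><pubyear>1998</pubyear></front></article>"
)

records = [extract_metadata(SAMPLE_A, "pub_a"), extract_metadata(SAMPLE_B, "pub_b")]
```

Once every record shares the same field names, cross-repository searching and linking can operate on `records` without knowing which DTD each article came from.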

10
Ongoing Investigations
  • Support simultaneous searching of A&I Services,
    Distributed Repositories, enhanced navigation,
    expanded gateway functions.
  • Metadata Harvesting: Replicative or Distributed
    Approaches.
  • Z39.50 protocols, HTTP Harvesting, Spider
    Technology.
  • Archiving of Electronic Resources.
  • Local Resolution of Resources.
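
Simultaneous searching of several sources can be sketched as below. The in-memory `REPOSITORIES` table, the source names, and the sample records are hypothetical stand-ins; a real gateway would issue Z39.50 or HTTP requests where `search_one` does a substring match.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for a publisher repository and an A&I service.
REPOSITORIES = {
    "publisher_repo": [
        {"id": "10.1234/1", "title": "XML rendering of physics articles"},
    ],
    "ai_service": [
        {"id": "10.1234/1", "title": "XML rendering of physics articles"},
        {"id": "10.1234/2", "title": "Metadata harvesting with XML"},
    ],
}


def search_one(name, term):
    """Search a single source; a real gateway would query a remote server here."""
    term = term.lower()
    return [
        dict(record, source=name)
        for record in REPOSITORIES[name]
        if term in record["title"].lower()
    ]


def search_all(term):
    """Query every source in parallel and keep one record per identifier."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda name: search_one(name, term), REPOSITORIES))
    merged = {}
    for hits in result_lists:
        for record in hits:
            # The first source to report an identifier wins.
            merged.setdefault(record["id"], record)
    return list(merged.values())


hits = search_all("xml")
```

Deduplicating on a shared identifier is one simple merging policy; a replicative (harvested) approach would instead build the merged index ahead of time.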

11
XML (eXtensible Markup Language)
  • Subset of SGML, a Data Description Language
    (Metalanguage).
  • Allows fine-granularity markup of content and
    structure. Authors can create their own elements
    (extensible).
  • Tags define the Structure of the Document, not
    its Presentation Format.
  • Two levels of conforming XML: well-formed
    (correct document structure, no DTD required) and
    valid (well-formed and checked against a DTD).
  • Displays natively only in IE 5.0 and Netscape
    6.0.
  • Powers B2B, compatible with Relational DBs.
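
The well-formedness check, the level that needs no DTD, can be illustrated with Python's standard library. This is a minimal sketch with made-up sample documents. The standard-library parser used here does not validate against a DTD, so checking the second level would need a validating parser (for example the third-party lxml package).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML (tags nested and closed properly)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Author-defined elements: nothing here is declared in any DTD.
GOOD = "<article><title>Thin films</title><author>A. Smith</author></article>"
# The <title> element is closed by the wrong tag, so this is not well-formed.
BAD = "<article><title>Thin films</author></article>"

good_result = is_well_formed(GOOD)
bad_result = is_well_formed(BAD)
```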

12
Role of XML
  • "If you ask 20 people in the industry, 'What is
    XML?' you'll get 20 different answers." -- Dale
    Fuller, CEO, Inprise Corporation.
  • Vendor-Neutral, Platform-Independent Structured
    Information Standard.
  • Document Representation and Interchange Standard.
  • Applications can externalize their data as XML.
  • XML for data, CSS for the presentation layer, XSL
    to modify the structure of the document.

13
Distributed Repository Model
  • Information Environment in which we Operate.
  • Web-Based and Publisher-Centric.
  • Multiple Relationships and Nodes.
  • Need for Gateway and Navigation Tools.
  • Need for Integration, Linking.
  • Publisher Repository approaches to Retrieval.
  • A&I Service Issues.

14
Distributed Repository Issues
  • Integration of discrete publisher repositories,
    local and remote A&I services, OPAC, Web
    resources, and local data within gateway and
    navigation tools.
  • Issues for user access:
  • Users need to identify the appropriate publisher
    repository, but presently interfaces differ, and
    full-text and controlled vocabulary searching are
    often not offered.
  • A&I services are not full-text but offer
    controlled vocabulary; they have no links to
    full-text repositories.

15
Distributed Repository Search
  • Needed feature set:
  • A&I services need links to full-text at the
    article level via Digital Object Identifier (DOI),
    and vocabulary switching within controlled
    vocabularies. Will we see consolidation of A&I
    services? Other information providers?
    PubMed/PubRef, PubSCIENCE (DOE/OSTI).
  • Publisher metadata repository for central
    searching: deposit metadata in conjunction with
    DOI.
  • Browser technology that fully incorporates XML,
    CSS.

16
Digital Object Identifier (DOI)
  • DOI is both a unique identifier of a piece of
    digital content AND a system to access that
    content digitally.
  • "The ISBN for the 21st Century" -- Norman Paskin.
  • DOI system has two main parts (the identifier
    and a directory system) and a third logical
    component, a database.
  • Developed by AAP (Association of American
    Publishers), now managed by International DOI
    Foundation.

17
DOI Construction
  • First open standard for content identification.
  • DOI is a number that identifies a digital object:
  • 10.1063/S000369519903216
  • 10: Registration Agency Prefix
  • 1063: Publisher Prefix
  • S000369519903216: Suffix (Publisher-assigned
    ID)
  • Suffix can be SICI or PII.
  • The DOI, together with the URL pointing to the
    digital object, is registered with the
    International DOI Foundation:
  • 10.1234/4356 -> http://www.pubsite.org/apr99/artl1.pdf
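
The construction above can be sketched as a small parser. The function name and the dictionary keys are illustrative and simply follow the labels used on this slide; the checks are a simplification, not the full DOI syntax rules.

```python
def parse_doi(doi):
    """Split a DOI into the three parts named on the slide."""
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError("a DOI needs a prefix and a suffix separated by '/'")
    agency, dot, publisher = prefix.partition(".")
    if agency != "10" or not dot or not publisher:
        raise ValueError("expected a prefix of the form 10.<publisher>")
    return {
        "agency_prefix": agency,        # "10" in the slide's example
        "publisher_prefix": publisher,  # "1063" in the slide's example
        "suffix": suffix,               # publisher-assigned ID
    }


parts = parse_doi("10.1063/S000369519903216")
```

Only the first "/" separates prefix from suffix, so a publisher-assigned suffix that itself contains slashes is kept whole.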

18
Using a DOI
  • DOIs are resolved using the Handle System
    technology from CNRI (Corporation for National
    Research Initiatives).
  • Retrieval of the object is a two-step process:
    the link is sent to a central directory where the
    current Web address is stored; the location is
    sent back to the browser with a special message
    to redirect to that address, e.g.:
  • dx.doi.org/10.100/1 redirects to www.pub/art1.pdf
  • CrossRef Project: major Sci-Tech professional
    societies and commercial publishers.
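
The two-step retrieval can be sketched as a directory lookup followed by a redirect response. This is a toy model, not the Handle System itself: the `DIRECTORY` table holds only the slide's example entry, the `http://` scheme is added as an assumption, and the use of HTTP status 302 for the "special message" is likewise an assumption.

```python
# Toy central directory: DOI -> current Web address (slide's example entry only).
DIRECTORY = {"10.100/1": "http://www.pub/art1.pdf"}


def resolve(request_path):
    """Step 1: look the DOI up. Step 2: answer with a redirect to its location."""
    doi = request_path.lstrip("/")
    location = DIRECTORY.get(doi)
    if location is None:
        return 404, {}
    # Assumed redirect status; the browser then fetches the Location address.
    return 302, {"Location": location}


status, headers = resolve("/10.100/1")
```

Because links cite the DOI rather than the address, a publisher that moves an article only has to update its entry in the directory.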

19
Reference Linking
  • In some fields, e.g. Physics, publishers have
    linking agreements already in place.
  • Alternatives to DOI:
  • PubMed/PubRef (National Library of Medicine)
  • PubSCIENCE (DOE/OSTI)
  • OpCit project
  • Proprietary Link Managers (AIP, APS)
  • System design calls for one URL for each DOI; the
    underlying technology can handle multiple URLs,
    however.

20
Current Work
  • Pilot Project involving CNRI, SFX, Academic
    Ideal.
  • OpenURL Protocol.
  • Recent Letter to CrossRef and IDF.
  • Demonstration Project at Illinois and OhioLink.
  • Local Resolver.
  • Localizing Name Resolution for AIP, ASCE,
    Elsevier, other publishers.
  • Use of CrossRef Metadata Database for identifying
    Publisher from DOI and linking to Local Copy, A&I
    Services, Library Assistance.
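
A local resolver of the kind described above might work roughly as follows. The `LOCAL_HOLDINGS` table and the `local.example.edu` address are hypothetical, and the real demonstration project would consult the CrossRef metadata database instead of a hard-coded table.

```python
# Hypothetical table: publisher prefix -> base address of the locally held copy.
LOCAL_HOLDINGS = {"1063": "http://local.example.edu/fulltext/1063/"}

# Global resolver address, as given in the "Using a DOI" slide.
GLOBAL_RESOLVER = "http://dx.doi.org/"


def choose_link(doi):
    """Prefer the local copy when the publisher is held locally, else go global."""
    prefix, _, suffix = doi.partition("/")
    publisher = prefix.split(".", 1)[-1]
    local_base = LOCAL_HOLDINGS.get(publisher)
    if local_base is not None:
        return local_base + suffix
    return GLOBAL_RESOLVER + doi


local_link = choose_link("10.1063/S000369519903216")
remote_link = choose_link("10.1234/4356")
```

This is one way to address the "appropriate copy" problem: the same DOI leads different institutions to the copy each one is licensed to use.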

21
Computer Technologies
  • XML Appliances: Intel XML Accelerator.
  • Thin Desktops:
  • Legacy-free PCs
  • Network appliances (Sun Rays).
  • Ubiquitous Computing:
  • Pocket PCs -- Windows CE machines
  • PalmPilots.

22
Wireless Technologies
  • Wireless Computing
  • Security issues
  • Bandwidth and throughput limited
  • CDPD (Cellular Digital Packet Data)
  • Web clipping vs. portable HTML
  • Cell Phone/Pocket PC combination.
  • With Pocket Devices, use by patrons and staff
    for remote search, processing.

23
Role of the Sci-Tech Library
  • Function of Library:
  • Collect source materials
  • Organize materials
  • Provide access to materials.
  • Change: the above activities are now distributed,
    not confined to a specific place.
  • Question: How do the support services for these
    activities need to change?

24
Issues
  • Library as Function not Place.
  • Acknowledgment of and Support for the Library's
    Role in the Campus Information Infrastructure.
  • Provide a Digital Library out of digital
    collections.
  • Moving up on the Information Food Chain: personal
    collection, colleague, e-mail, Web, Library.
  • Archiving issues (Open Archives Initiative;
    Archive implies an access mechanism).

25
4th Generation Information System
  • Simultaneous Searching of Multiple Resources.
  • Remote Reference and Instruction (Collaboration
    and Whiteboard--apply Help Desk Software).
  • Software-Aided Search Navigation and
    Modification.
  • Dynamic Links to Full-Text. Appropriate Copy
    problem.
  • One-Stop-Shopping.

26
Role of the Academic Librarian
  • In addition to Raising money & dealing with
    Publishers/vendors:
  • Experts in Information Seeking Process, Research,
    and Instructional Programs.
  • Knowledge of Emerging Information Technologies.
  • Ability to Work Effectively at Campus Level.
  • Ability to Train, Mobilize, and Enthuse Staff.
  • Cooperative Endeavors with other Departments,
    Grant Agencies, and Government Agencies.