Title: OAI Overview


1
OAI Overview
  • DLESE OAI Workshop
  • April 29-30, 2002
  • John Weatherley (jweather_at_ucar.edu)

2
Workshop Schedule
  • Day 1
  • Morning
  • Overview of OAI
  • Look at OAI tools and resources
  • Afternoon
  • DLESE OAI software installation, configuration
    and setup
  • Day 2
  • Morning
  • Overview of NSDL and DLESE interoperability
    architecture
  • NSDL metadata overview
  • Metadata and OAI

3
Resources
  • Workshop presentation slides, links to tools, and
    other OAI resources are located at
    http://oai.dlese.org

4
What is DLESE and NSDL?
  • DLESE: Digital Library for Earth System
    Education
  • Provides access to digitally accessible resources
    for learning about the Earth system
  • NSDL: National Science (STEM) Digital Library
  • Network of scholarly and educational digital
    libraries related to science (DLESE will be part
    of this network)

5
1. What is the OAI?
  • What is the Open Archives Initiative (OAI)?
  • Organization dedicated to solving problems of
    digital library interoperability by defining
    simple protocols and standards
  • Grew out of the e-prints (arXiv) community at Los
    Alamos
  • What is the OAI Protocol for Metadata Harvesting
    (OAI-PMH)?
  • Protocol to transfer metadata from a source
    archive to a destination archive
  • How is the OAI-PMH Being Used by the NSDL and
    DLESE?
  • The OAI-PMH has been adopted as a primary means
    of gathering and sharing metadata among
    contributors
  • Also used to facilitate internal management of
    metadata stores

6
What is Metadata?
  • Data refers to digital objects, e.g. the resources
    themselves
  • Metadata is data about data, e.g. a description
    of a resource, not the resource itself
  • OAI is used to transmit metadata

7
2. Definitions / Concepts
  • Basic Principles
  • Harvesting vs. Federation
  • Data Providers vs. Service Providers
  • Underlying Technology
  • HTTP and XML
  • XML Namespaces and Schema
  • Protocol Policies and Conventions
  • Basic Policies
  • Sets

8
Harvesting vs. Federation
  • Competing approaches to interoperability
  • Federation is when services such as searching are
    run remotely
  • Harvesting is when metadata is transferred from
    remote sources to the destination where the
    services are located
  • Federation requires more effort at the remote
    site but is easier for the local system
  • Harvesting requires less effort at the remote
    site; services are provided by the local system
  • OAI uses the harvesting model
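
The contrast between the two approaches can be sketched in a few lines. This is only an illustration: the archive names, record IDs, and titles below are invented, and plain dictionaries stand in for the remote sites.

```python
from typing import Dict, List

# Hypothetical remote archives: each maps a record id to a title.
REMOTE_ARCHIVES: Dict[str, Dict[str, str]] = {
    "archive_a": {"a1": "Plate tectonics primer", "a2": "Ocean currents"},
    "archive_b": {"b1": "Tectonics lab exercise"},
}


def remote_search(archive: str, term: str) -> List[str]:
    """Stand-in for a search service that runs at the remote site."""
    records = REMOTE_ARCHIVES[archive]
    return [rid for rid, title in records.items() if term.lower() in title.lower()]


def federated_search(term: str) -> List[str]:
    """Federation: each query is forwarded to every remote site."""
    hits: List[str] = []
    for archive in REMOTE_ARCHIVES:
        hits.extend(remote_search(archive, term))
    return sorted(hits)


def harvest() -> Dict[str, str]:
    """Harvesting: copy the metadata into one local store ahead of time."""
    local: Dict[str, str] = {}
    for records in REMOTE_ARCHIVES.values():
        local.update(records)
    return local


def harvested_search(local: Dict[str, str], term: str) -> List[str]:
    """The search service runs entirely against the local copy."""
    return sorted(rid for rid, title in local.items() if term.lower() in title.lower())
```

Both searches return the same hits here; what differs is where the work happens. In the federated case every remote site must run a search service, while in the harvested case the remote sites only hand over metadata.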

9
Data Providers vs. Service Providers
  • Data Providers refer to entities who possess
    metadata and are willing to share this with
    others (e.g. collection builders)
  • Service Providers are entities who harvest data
    from Data Providers in order to provide
    higher-level services to users (e.g. searching,
    browsing, recommender systems, etc.). The NSDL
    and DLESE are examples.

10
Features of the OAI Approach
  • Lightweight: low overhead for Data Providers
  • Protocol is relatively simple to implement
  • Many plug-and-play tools publicly available
  • Transports any metadata framework that can be
    made available in XML form (details to come)
  • Details of searching, browsing, annotation and
    other advanced services are handled by the
    Service Provider

11
Metadata Harvesting Framework
[Diagram: Data Providers (collection builders) are connected through the OAI protocol (over HTTP) to a Service Provider (DLESE, NSDL), which holds the harvested records and serves the library user.]
  1. Service Provider polls periodically for new
    records
  2. New records are downloaded and cached by the
    Service Provider
  3. Service Provider provides searching, browsing,
    and other services over the data
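
The three steps in the framework can be sketched as a small in-memory model. The class names, record fields, and dates are assumptions made for illustration; a real Service Provider would issue OAI-PMH requests over HTTP instead of calling a local object.

```python
from datetime import date
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    identifier: str
    datestamp: date
    title: str


class DataProvider:
    """Hypothetical in-memory stand-in for a collection builder's archive."""

    def __init__(self, records: List[Record]) -> None:
        self.records = records

    def list_records(self, since: date) -> List[Record]:
        # Return only records created or modified on or after `since`.
        return [r for r in self.records if r.datestamp >= since]


class ServiceProvider:
    def __init__(self) -> None:
        self.cache: Dict[str, Record] = {}
        self.last_harvest = date.min

    def poll(self, provider: DataProvider, today: date) -> int:
        # Steps 1 and 2: ask for new records, then cache them locally.
        new_records = provider.list_records(self.last_harvest)
        for rec in new_records:
            self.cache[rec.identifier] = rec
        self.last_harvest = today
        return len(new_records)

    def search(self, term: str) -> List[str]:
        # Step 3: services run over the cached copy only.
        return sorted(
            r.identifier for r in self.cache.values() if term.lower() in r.title.lower()
        )
```

Each call to `poll` transfers only what changed since the previous call, so repeated polling stays cheap for the Data Provider.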

12
HTTP and XML
  • The OAI-PMH is an almost stateless
    request/response protocol
  • Requests and responses are sent via the HTTP
    protocol
  • Requests are encoded as GET/POST operations
  • Responses are well-formed XML documents
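
As a rough sketch of how a request is encoded as an HTTP GET, the helper below assembles a query string with the Python standard library. The function name `build_request` is made up for this example; the base URL and argument values are taken from the sample URLs elsewhere in these slides.

```python
from urllib.parse import urlencode


def build_request(base_url: str, verb: str, **arguments: str) -> str:
    """Return a GET URL carrying the verb and its arguments as a query string."""
    params = {"verb": verb, **arguments}
    # urlencode percent-encodes reserved characters such as ':' in identifiers.
    return base_url + "?" + urlencode(params)


url = build_request(
    "http://oai.dlese.org/provider",
    "GetRecord",
    identifier="dlese:DLESE-000-000-000-002",
    metadataPrefix="dlese_ims",
)
print(url)
```

The same key/value pairs could equally be sent as the body of a POST; only the transport differs, not the arguments.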

13
Well-formed and Valid XML
  • Correct
      <car>
        <make>Dodge</make>
        <model>Spirit</model>
        <year>1994</year>
        <owner>
          <name>you</name>
          <plate>CO</plate>
        </owner>
      </car>
  • Incorrect
      <car>
        <make>Dodge</make>
        <model>Spirit</model>
        <year>1994
        <owner>
          <plate>CO</plate>
          <name>you</name>
      </car>
      </owner>
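
A quick way to see the difference is to hand both documents to an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; note that it checks well-formedness only, and checking validity against a DTD or schema would need a validating parser. The helper name `is_well_formed` is an assumption for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, False on any parse error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


CORRECT = (
    "<car><make>Dodge</make><model>Spirit</model><year>1994</year>"
    "<owner><name>you</name><plate>CO</plate></owner></car>"
)

# <year> is never closed, and </car> arrives before </owner>.
INCORRECT = (
    "<car><make>Dodge</make><model>Spirit</model><year>1994"
    "<owner><plate>CO</plate><name>you</name></car></owner>"
)

print(is_well_formed(CORRECT), is_well_formed(INCORRECT))
```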

14
DTD, Schemas & Namespace
  • DTDs (Document Type Definitions)
  • Describe the elements of XML instance documents
  • Are not themselves written as well-formed XML
  • Some data-typing
  • Namespaces are harder to deal with
  • Schemas
  • Describe the elements of XML instance documents
  • Are themselves well-formed XML
  • Strong data-typing
  • Namespaces are easier to deal with
  • Namespace
  • Collection of related element names identified by
    a name label (e.g. dc)

15
XML Namespaces and Schema
  • Consistency and data quality are ensured by using
    XML Schema descriptions for each possible
    response
  • XML Namespaces are used where necessary to
    clearly define which parts of the responses are
    actual metadata and which support the OAI-PMH.
  • Example
  • http://www.cstc.org/cgi-bin/OAI/CSTC.pl?verb=GetRecord&identifier=oai%3ACSTC%3A103&metadataPrefix=oai_dc
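
To illustrate how namespaces separate the protocol envelope from the metadata payload, here is a small sketch that parses a hand-written response. Everything in it is an assumption made for illustration: the sample record is invented, and the namespace URIs are those of OAI-PMH 2.0 and Dublin Core 1.1, which may differ from what a server of the protocol version covered in this workshop would return.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample response; not the actual output of any server.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:CSTC:103</identifier>
        <datestamp>2002-04-29</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def namespace_of(tag: str) -> str:
    """ElementTree spells qualified names as '{namespace-uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


root = ET.fromstring(SAMPLE)
identifier = root.findtext(".//oai:identifier", namespaces=NS)  # protocol part
title = root.findtext(".//dc:title", namespaces=NS)             # metadata part
counts = Counter(namespace_of(el.tag) for el in root.iter())
```

Grouping elements by namespace, as `counts` does, shows at a glance which parts of a response belong to the protocol and which to the metadata format.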

16
Basic OAI Policies and Conventions
  • Each metadata record from a given Data Provider
    must have a unique ID (OAI ID is not necessarily
    the same as the record ID)
  • Each metadata record must be persistent so that
    Service Providers can always refer back to the
    source
  • Each record must have a date stamp indicating
    creation / modification date
  • Dates provide a mechanism for incremental and
    continuous transfer of metadata by only
    requesting records that have changed since the
    previous harvest
  • Flow control: resumption tokens can be used to
    return partial results. The client is issued a
    token, which may be presented to the server to
    receive more results
  • Multiple formats of metadata are allowed
  • Examples: Dublin Core, DLESE IMS
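
The flow-control policy can be sketched as a loop that keeps presenting the most recent token until the server stops issuing one. The server below is a hypothetical in-memory stand-in with made-up identifiers; it encodes an offset in the token, but a client should treat tokens as opaque.

```python
from typing import List, Optional, Tuple

IDENTIFIERS = [f"oai:example:{n}" for n in range(1, 8)]  # 7 made-up identifiers
PAGE_SIZE = 3


def list_identifiers(resumption_token: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
    """Hypothetical server: return one page, plus a token if more results remain."""
    start = int(resumption_token) if resumption_token else 0
    page = IDENTIFIERS[start:start + PAGE_SIZE]
    next_start = start + PAGE_SIZE
    token = str(next_start) if next_start < len(IDENTIFIERS) else None
    return page, token


def harvest_with_tokens() -> Tuple[List[str], int]:
    """Client: repeat the request with each new token until none is returned."""
    collected: List[str] = []
    requests_made = 0
    token: Optional[str] = None
    while True:
        page, token = list_identifiers(token)
        requests_made += 1
        collected.extend(page)
        if token is None:
            break
    return collected, requests_made
```

With seven identifiers and a page size of three, the client needs three requests to collect the full list.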

17
Sets
  • OAI-PMH mechanism to allow for harvesting of
    sub-collections
  • Semantics for sets are defined outside of the
    protocol
  • Sets are defined by conventions established
    between data and service providers
  • Example sets within DLESE might be DWEL, COMET,
    LDEO, etc.
  • Example sets within the NSDL might be DLESE,
    DLESE:DWEL, DLESE:COMET, DLESE:LDEO, etc.
  • Sets can be established that enable querying
    (e.g. by topic, author name, subject area, etc.)
  • Example: The Open Digital Library (Suleman, 2001)
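
A minimal sketch of selecting a sub-collection, assuming nested sets are written with a colon separator (as OAI-PMH setSpec values are). The record IDs and the `NSF` set are invented for the example.

```python
from typing import Dict, List

# Hypothetical records, each tagged with the set it belongs to.
RECORD_SETS: Dict[str, str] = {
    "rec-1": "DLESE:DWEL",
    "rec-2": "DLESE:COMET",
    "rec-3": "DLESE",
    "rec-4": "NSF",
}


def in_set(record_set: str, requested: str) -> bool:
    """True if record_set equals the requested set or is nested beneath it."""
    return record_set == requested or record_set.startswith(requested + ":")


def harvest_set(requested: str) -> List[str]:
    """Return the ids of all records in the requested set or its subsets."""
    return sorted(rid for rid, s in RECORD_SETS.items() if in_set(s, requested))
```

Requesting `DLESE` returns every DLESE record including those in its subsets, while requesting `DLESE:DWEL` narrows the harvest to that one sub-collection.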

18
3. Requirements to be a Data Provider
  • Source of metadata
  • Human or automated resource catalogers
  • Metadata mappings
  • Crosswalks from native formats to DC or other
    formats
  • Server technology
  • Handled by the OAI software
  • Datestamps
  • Deletions
  • Unique identifiers
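
A metadata crosswalk can be as simple as a field-to-field lookup table. The sketch below maps a made-up native record onto unqualified Dublin Core element names; the native field names are assumptions, and real crosswalks usually also need value conversions that are omitted here.

```python
from typing import Dict, List

# Hypothetical native field names mapped to Dublin Core elements.
CROSSWALK: Dict[str, str] = {
    "resource_title": "title",
    "author": "creator",
    "summary": "description",
    "keywords": "subject",
    "url": "identifier",
}


def to_dublin_core(native: Dict[str, object]) -> Dict[str, List[str]]:
    """Translate a native record into repeatable Dublin Core elements."""
    dc: Dict[str, List[str]] = {}
    for field, value in native.items():
        element = CROSSWALK.get(field)
        if element is None:
            continue  # no Dublin Core equivalent, so the field is dropped
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(str(v) for v in values)
    return dc


native_record = {
    "resource_title": "Volcano basics",
    "keywords": ["volcano", "geology"],
    "internal_status": "reviewed",
}
dc_record = to_dublin_core(native_record)
```

Fields with no mapping, such as `internal_status` here, are silently dropped, which is one reason crosswalks to a simpler format tend to lose information.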

19
4. The OAI-PMH
  • Service Requests
  • Identify
  • ListMetadataFormats
  • ListSets
  • GetRecord
  • ListIdentifiers
  • ListRecords
  • Date Ranges
  • Resumption Tokens

20
Identify
  • Purpose
  • Return general information about the archive and
    its policies
  • Parameters
  • None
  • Sample URL
  • http://oai.dlese.org/provider?verb=Identify

21
ListMetadataFormats
  • Purpose
  • List metadata formats supported by the archive as
    well as their schema locations and namespaces
  • Parameters
  • identifier: ID of a specific record (O)
  • Sample URL
  • http://oai.dlese.org/provider?verb=ListMetadataFormats

22
ListSets
  • Purpose
  • Provide a hierarchical listing of sets in which
    records may be organized
  • Parameters
  • None
  • Sample URL
  • http://oai.dlese.org/provider?verb=ListSets

23
GetRecord
  • Purpose
  • Returns the metadata for a single identifier in
    the form of an OAI record
  • Parameters
  • identifier: ID of the record (R)
  • metadataPrefix: metadata format (R)
  • Sample URL
  • http://oai.dlese.org/provider?verb=GetRecord&identifier=dlese%3ADLESE-000-000-000-002&metadataPrefix=dlese_ims

24
ListIdentifiers
  • Purpose
  • List all unique identifiers corresponding to the
    records in the repository
  • Parameters
  • from: start date (O)
  • until: end date (O)
  • resumptionToken: flow control mechanism (X)
  • Sample URL
  • http://oai.dlese.org/provider?verb=ListIdentifiers

25
ListRecords
  • Purpose
  • Retrieves metadata for multiple records
  • Parameters
  • from: start date (O)
  • until: end date (O)
  • resumptionToken: flow control mechanism (X)
  • set: set to harvest from (O)
  • metadataPrefix: metadata format (R)
  • Sample URL
  • http://oai.dlese.org/provider?verb=ListRecords&metadataPrefix=dlese_ims
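
The parameter markings on the request slides can be turned into a small argument checker. This sketch assumes that (R) means required, (O) optional, and (X) exclusive, meaning a resumptionToken must be the only argument when present. The rule table follows the parameters as listed on these slides and the function name `validate` is made up for the example.

```python
from typing import Dict, List, Set

# Required and optional arguments per verb, as listed on the slides.
RULES: Dict[str, Dict[str, Set[str]]] = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": set()},
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
    "ListIdentifiers": {"required": set(), "optional": {"from", "until"}},
    "ListRecords": {"required": {"metadataPrefix"}, "optional": {"from", "until", "set"}},
}

# Verbs whose slides list resumptionToken as an exclusive (X) argument.
TOKEN_VERBS = {"ListIdentifiers", "ListRecords"}


def validate(verb: str, args: Dict[str, str]) -> List[str]:
    """Return a list of problems with the request; an empty list means it is acceptable."""
    if verb not in RULES:
        return [f"unknown verb: {verb}"]
    names = set(args)
    if "resumptionToken" in names:
        if verb not in TOKEN_VERBS:
            return [f"resumptionToken not accepted by {verb}"]
        extra = sorted(names - {"resumptionToken"})
        return [f"resumptionToken must be the only argument, also got: {extra}"] if extra else []
    rule = RULES[verb]
    errors = [f"missing required argument: {n}" for n in sorted(rule["required"] - names)]
    allowed = rule["required"] | rule["optional"]
    errors += [f"unexpected argument: {n}" for n in sorted(names - allowed)]
    return errors
```

For example, `validate("GetRecord", {"identifier": "x"})` reports the missing `metadataPrefix`, while a ListRecords request carrying only a `resumptionToken` passes.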

26
DLESE Architecture
[Diagram. Labels: DLESE Portal; Library Users; Services (e.g. What's New); Search & Discovery; Metadata Repository; NSDL; Direct Entry; Resources; Collections. Three of the connections are labelled OAI.]

27
References
  1. Building Interoperable Digital Libraries: A
    Practical Guide to creating Open Archives,
    Hussein Suleman (hussein_at_vt.edu), JCDL 2001
    Tutorial.
  2. A Framework for Building Open Digital
    Libraries, Hussein Suleman and Edward A. Fox, in
    D-Lib Magazine, December 2001.
    http://www.dlib.org/dlib/december01/suleman/12suleman.html
  3. The Open Archives Initiative,
    http://www.openarchives.org