From eprint archives to open archives and OAI: the Open Citation project

1
From eprint archives to open archives and OAI:
the Open Citation project
  • By The Open Citation Project team
  • Presented by Steve Hitchcock, Southampton
    University
  • These slides prepared for the JISC/NSF Digital
    Libraries Initiative (DLI) All Projects Meeting,
    Edinburgh, 24-25th June 2002
  • OpCit is a joint JISC-NSF
  • International Digital Libraries Project 1999-2002

2
About this presentation
  • The aim is to:
  • Report progress since Stratford All-Projects
    meeting in 2000
  • Demonstrate new services developed by the
    project
  • Highlight the role of the project in the Open
    Archives Initiative
  • Outline key tasks remaining
  • Look beyond the Open Citation Project

3
Recap 1: principal partners
  • Southampton University, IAM (Intelligence,
    Agents, Multimedia) Research Group, PI Stevan
    Harnad
  • Citation-ranked search, EPrints.org, user surveys
  • Cornell University, Digital Library Research
    Group, PI Carl Lagoze
  • Architecture for reference linking, experiments
    with the ACM Digital Library and D-Lib magazine,
    OAI technical support center
  • arXiv.org, Paul Ginsparg
  • Now based at Cornell University. Still the
    largest archive of freely accessible
    author-deposited scientific papers

4
The Open Citation Project deliverables
  • The Open Citation Project (OpCit) is developing
    software and services to support the Open
    Archives Initiative (OAI). OpCit can help OAI
    data providers and service providers
  • Citebase: citation-ranked search
  • EPrints.org software: free software to build and
    manage OAI-compliant eprint archives
  • API for reference linking, an interface on which
    reference linking applications can be built

5
Recap 2: last time at Stratford
  • Reference links on PDF copies of papers

6
Citebase, a new interface to the scholarly
literature

7
Citebase, a citation-ranked search engine
  • http://citebase.eprints.org/
  • Google for the refereed literature
  • Citebase is based on a citation database
  • Harvests metadata using OAI-PMH
  • Extracts reference lists from arXiv papers
  • Provides impact (and other)-ranked search based
    on reference data
  • Re-exports metadata and references
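
The ranking idea in this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Citebase's actual implementation: it assumes reference lists have already been extracted and resolved to paper identifiers, and it ranks search hits purely by the number of distinct papers citing them. The function names and identifiers are hypothetical.

```python
from collections import Counter


def citation_counts(references):
    """Count incoming citations from a mapping of paper id -> list of cited paper ids."""
    counts = Counter()
    for citing, cited_list in references.items():
        # Count each citing paper once per cited paper, and ignore self-citations.
        for cited in set(cited_list):
            if cited != citing:
                counts[cited] += 1
    return counts


def rank_by_citations(hits, references):
    """Order search hits by citation count, highest first; ties keep their hit order."""
    counts = citation_counts(references)
    return sorted(hits, key=lambda paper_id: -counts[paper_id])


# Hypothetical identifiers, for illustration only.
REFERENCES = {
    "paper-a": ["paper-b", "paper-c"],
    "paper-b": ["paper-c"],
    "paper-d": ["paper-c", "paper-b", "paper-b"],
}
```

Calling `rank_by_citations(["paper-a", "paper-b", "paper-c"], REFERENCES)` puts the most-cited hit first. A real service would combine this with text matching and could offer other orderings, as the slide's "(and other)-ranked" suggests.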

8
Evaluating Citebase
  • http://citebase.eprints.org/survey/
  • The evaluation is aimed at users of arXiv, and
    all others who use bibliographic services to
    access the refereed journal literature. You can
    contribute (June-July 2002) using the form linked
    above.
  • Aims of the evaluation:
  • Discover the users' awareness of related
    services
  • Assess usability with a practical exercise
  • Invite the users' views on the main features
  • Assess the level of user satisfaction with the
    service

9
Citebase: further developments
  • OpenURL-enabled: pointing Citebase links at
    library and journal services
  • Google interface using DP9: getting Citebase
    results, and open archives, into Google
  • Metadata format and XML schema for citations:
    making citation metadata harvestable via OAI-PMH.
    Possible formats include:
  • Academic Metadata Format: a local profile
    format, some collaborative experiments performed
    within OpCit
  • OpenURL metadata, moving towards NISO
    standardisation
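
As a rough illustration of what "OpenURL-enabled" linking involves, the sketch below appends citation metadata to a link resolver's base URL as a query string. The resolver address and citation values are hypothetical, and the key names (`genre`, `aulast`, `volume`, `spage`) follow the style of the early OpenURL drafts; treat them as assumptions, not as the format Citebase adopted.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, citation):
    """Append citation metadata to a link resolver's base URL as a query string."""
    # Drop empty fields so the resolver only sees metadata we actually have.
    params = {key: value for key, value in citation.items() if value}
    return resolver_base + "?" + urlencode(params)


# Hypothetical resolver address and citation values.
link = build_openurl(
    "http://resolver.example.org/openurl",
    {"genre": "article", "aulast": "Smith", "volume": "12", "spage": "345", "issue": ""},
)
```

The point of the design is that the link carries the citation's metadata, not a fixed target, so each library's resolver can decide where the link should lead for its own users.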

10
Recap 3: API for reference linking
  • getLinkedText: contents of the paper,
    reference-linked, plus lots of metadata for the
    paper
  • getReferenceList: this paper's references
  • getCurrentCitationList: the list of
    works citing this paper (best knowledge)
  • getMyData: metadata for this paper
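
The four operations can be pictured with a small in-memory stand-in. The project's API is written in Java; the Python sketch below only mirrors the shape of the four methods, and its fields, return types, and the way reference markers such as "[1]" are turned into links are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Surrogate:
    """Illustrative stand-in for a reference-linking surrogate of one paper."""
    metadata: dict
    text: str = ""
    references: list = field(default_factory=list)  # works this paper cites
    citations: list = field(default_factory=list)   # known works citing this paper

    def get_my_data(self):
        return dict(self.metadata)

    def get_reference_list(self):
        return list(self.references)

    def get_current_citation_list(self):
        # "Best knowledge": only the citations discovered so far.
        return list(self.citations)

    def get_linked_text(self):
        # Wrap each numbered marker "[n]" in a link to the n-th reference's URL.
        linked = self.text
        for number, ref in enumerate(self.references, start=1):
            marker = f"[{number}]"
            linked = linked.replace(marker, f'<a href="{ref["url"]}">{marker}</a>')
        return linked


# Hypothetical paper and reference.
paper = Surrogate(
    metadata={"title": "An example paper"},
    text="See [1].",
    references=[{"title": "A cited work", "url": "http://example.org/cited"}],
)
```

Here `paper.get_linked_text()` returns the text with "[1]" wrapped in a link to the cited work.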

11
Surrogates in the API
  • Based on an automatic analysis of the work, a
    surrogate for a scholarly work (and for other
    works, for citations) consists of the following
    three XML files:
  • Bibliographic data for the scholarly work
  • References contained in that work, and their
    contexts within the full text
  • Citations of that work
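
To make the three-file structure concrete, here is a minimal sketch that serialises one surrogate as three XML documents. The element names (`bibdata`, `references`, `reference`, `context`, `citations`, `citation`) are invented for illustration and are not the project's schema; the keys of `bib` are assumed to be valid XML element names.

```python
import xml.etree.ElementTree as ET


def surrogate_documents(bib, references, citations):
    """Return the three XML documents of a surrogate as strings."""
    bib_root = ET.Element("bibdata")
    for name, value in bib.items():
        ET.SubElement(bib_root, name).text = value

    refs_root = ET.Element("references")
    for ref in references:
        item = ET.SubElement(refs_root, "reference")
        ET.SubElement(item, "title").text = ref["title"]
        # Each reference keeps the passages of full text where it is cited.
        for context in ref.get("contexts", []):
            ET.SubElement(item, "context").text = context

    cites_root = ET.Element("citations")
    for title in citations:
        ET.SubElement(cites_root, "citation").text = title

    roots = (("bibdata", bib_root), ("references", refs_root), ("citations", cites_root))
    return {name: ET.tostring(root, encoding="unicode") for name, root in roots}


# Hypothetical content.
docs = surrogate_documents(
    {"title": "A paper"},
    [{"title": "R1", "contexts": ["as shown in [1]"]}],
    ["C1"],
)
```

Keeping the three parts separate means the citation document, which changes as new citing works are found, can be updated without touching the other two.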

12
API evaluation
  • API tested on D-Lib Magazine and the ACM Digital
    Library. Try the demo at
    http://cs-tr.cs.cornell.edu/RefLinkingDemo/
  • Performance (in terms of accuracy of data
    extracted):
  • Reference analysis: 86.7%
  • Item analysis (bib data, contexts, and
    references for a given paper): 82.42%
  • Implementability:
  • Simple interface: Surrogate s = new Surrogate
    (some-url)
  • Portable: written in Java, has run on Solaris,
    Win2K, and NT4
  • Installation: API source code plus public domain
    jar files

13
EPrints.org software
  • http://www.eprints.org/
  • Generates eprint archives that are compliant with
    the Open Archives Initiative Protocol for
    Metadata Harvesting. EPrints is free (GPL)
    software. It is aimed at organisations and
    communities.
  • EPrints v. 2.0 released February 2002 (now on v.
    2.0.1, which fixes bugs and typos). Features:
  • Internationalised metadata stored as Unicode
  • Support for multiple archives on one server
  • Improved user interface

14
OpCit and OAI
  • OAI Aggregator (Celestial): collecting and
    caching the results from OAI data providers to
    improve the efficiency of data harvesting
    http://celestial.eprints.org
  • OAI infrastructure: proxies, caches, gateways.
    Improve interoperability, scalability and
    reliability of OAI services. Joint work with Old
    Dominion University, see paper
    http://arxiv.org/abs/cs.DL/0205071
  • OAI Registration and Validation: performed at
    Cornell
    http://www.openarchives.org/Register/BrowseSites.pl
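
The caching idea behind an aggregator can be sketched as follows. This is not Celestial's implementation: it is a minimal in-memory cache keyed by request URL, with the network call injected as a function so the example runs without a real data provider. The class, the fake fetcher, and the repository address are hypothetical.

```python
class CachingHarvester:
    """Cache harvested responses so repeat requests do not hit the data provider."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URL and returning the response body
        self._cache = {}

    def get(self, url):
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


calls = []


def fake_fetch(url):
    """Stand-in for a network request; records each call it receives."""
    calls.append(url)
    return "<OAI-PMH>response for " + url + "</OAI-PMH>"


harvester = CachingHarvester(fake_fetch)
first = harvester.get("http://archive.example.org/oai?verb=Identify")
second = harvester.get("http://archive.example.org/oai?verb=Identify")
```

The second call is answered from the cache, so the data provider is contacted only once. A production aggregator would also need cache expiry and incremental re-harvesting, which are omitted here.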

15
EPrints and OAI
  • EPrints feeds repository URLs straight into the
    OAI registration process (if so desired by the
    EPrints administrator)
  • A scan of the OAI database of registered sites
    shows many sites use EPrints software to create
    repositories

16
A repository administrator's view of OAI
  • "As we have introduced our repository to our
    faculty and staff, we have emphasized the point
    that because they would be depositing their
    material in an OAI-compliant archive, it would
    automatically and painlessly be discoverable from
    various other points around the globe. Luckily,
    we were right."
  • Roy Tennant, eScholarship, California Digital
    Library, June 2002
    http://www.ecs.soton.ac.uk/harnad/Hypermail/Amsci/2085.html

17
OpCit user surveys and data mining
  • Maximising impact
    Maximising access
  • Results from Mining the Social Life of an Eprint
    Archive: http://opcit.eprints.org/tdb198/opcit/
  • When interoperability is not enough: show authors
    what users do when open access services are
    available

18
Key project tasks remaining
  • OpCit formally ends in September 2002. Before
    then:
  • Evaluation and reporting of the results
  • Programmer's guide to using the API
  • Journal and conference papers
  • Final reports to JISC and NSF

19
Beyond OpCit
  • Beyond the project, the following will continue
    to be developed:
  • Citebase
  • EPrints.org
  • OAI
  • and variously applied in the JISC FAIR
    programme (start 2002)
  • http://www.jisc.ac.uk/dner/development/programmes/fair.html
  • Targeting Academic Research for Deposit and
    Disclosure (lead institution: Southampton
    University)
  • e-prints UK (RDN, King's College London):
    citation analysis service for eprints database
  • Machine-readable rights metadata (Loughborough
    University)

20
What we have achieved, what we have learned
  • OAI is gathering momentum
  • Software for building OAI repositories is
    available
  • Institutional archives are being created, but
    need to be filled by authors
  • Attracting authors requires evidence of services
    that will improve the visibility and impact of
    their works
  • Citation-ranked search and reference linking are
    examples of OAI services that do this
  • The infrastructure supporting OAI services
    continues to be enhanced
  • Resource discovery and current awareness are
    exemplar OAI services now. Future services may
    include preservation management and personalization

21
Credits
  • Other contributors to the project include:
  • Technical development at Southampton is directed
    by Les Carr
  • Research at Cornell by Donna Bergmark
  • EPrints.org software is being developed by Chris
    Gutteridge
  • Citebase is produced and managed by Tim Brody
  • Project manager is Steve Hitchcock
  • A copy of these slides can be found on the OpCit
    Web site:
  • http://opcit.eprints.org/. Look for Papers and
    Presentations
  • Contact: Steve Hitchcock, sh94r_at_ecs.soton.ac.uk