Title: Lessons from the Open Citation Project
1. Lessons from the Open Citation Project
- Presented by Steve Hitchcock, Southampton University
- These slides were prepared for "The Open Archives Initiative: application and exploitation", a one-day seminar on the application and exploitation of the OAI Protocol for Metadata Harvesting, May 14, 2003, London
- A joint JISC-NSF International Digital Libraries Project, 1999-2002
3. A post-Google information environment
- Electronic journals exist in a post-Gutenberg and a post-Google information environment
- The ability to locate a specified item of information precisely and instantly among the mass of information available on the Web has profound implications.
- In the electronic environment the search engine has become the de facto interface to information, rather than the fragmented packages that have migrated from the print world.
4. About this presentation
- Citebase: citation-ranked search and impact discovery service
- New scientometric indices
- Evaluating Citebase
- EPrints.org software: free software to build and manage OAI-compliant eprint archives
- Growth of OAI, Eprints.org and institutional archives
- How to accelerate the growth of OAI eprint archives
5. Citebase, a discovery service with usage- and citation-based ranking
- http://citebase.eprints.org/
- Google for the refereed literature
- Citebase is based on a citation database
- Harvests metadata using OAI-PMH
- Extracts and indexes citations from published research papers stored in the larger open access, OAI disciplinary archives: currently arXiv, CogPrints and BioMed Central
- Provides impact (and other)-ranked search based on reference data
- Re-exports metadata and references
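The harvesting step can be illustrated with a minimal Python sketch of an OAI-PMH ListRecords exchange. This is not Citebase's code. The base URL, record identifier, title and resumption token are hypothetical, and the response is a hand-written sample rather than output from a real archive. Only the verb, the argument names and the XML namespaces come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier",
                                       default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # An empty or missing token means the list is complete.
    return records, (token.strip() or None)

# Hand-written sample response (hypothetical identifier, title and token).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example paper one</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A real harvester would fetch each URL over HTTP and repeat the request with the returned resumptionToken until none is returned.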
6. Some old and new scientometric (publish or perish) indices of research impact
- Quality-level and citation-counts of the journal in which the article appears
- Citation-counts for the article
- Citation-counts for the researcher
- Co-citations, co-text (cited with whom/what else?)
- Citation-counts for the preprint
- Usage-measures (hits, Webmetrics)
- Time-course analyses, early predictors, etc.
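Two of the indices above, per-article citation counts and co-citations, can be derived directly from reference lists. The Python sketch below assumes a simple mapping from each citing paper to the papers it cites. The paper identifiers are made up for illustration, and this is not a description of how Citebase computes its rankings.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reference data: citing paper -> papers it cites.
REFERENCES = {
    "paper-A": ["paper-C", "paper-D"],
    "paper-B": ["paper-C", "paper-D", "paper-E"],
    "paper-F": ["paper-C"],
}

def citation_counts(references):
    """Count how many distinct papers cite each paper."""
    counts = Counter()
    for cited in references.values():
        counts.update(set(cited))  # a citing paper counts once per cited paper
    return counts

def cocitation_counts(references):
    """Count how often each pair of papers appears in the same reference list."""
    pairs = Counter()
    for cited in references.values():
        for a, b in combinations(sorted(set(cited)), 2):
            pairs[(a, b)] += 1
    return pairs

def rank_by_citations(paper_ids, counts):
    """Order papers by descending citation count, ties broken by identifier."""
    return sorted(paper_ids, key=lambda p: (-counts[p], p))

counts = citation_counts(REFERENCES)
pairs = cocitation_counts(REFERENCES)
ranked = rank_by_citations(["paper-E", "paper-D", "paper-C"], counts)
```

The same counting applies per researcher or per journal once each paper is mapped to its authors or venue.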
7. Citebase, a new interface to the scholarly literature
8. Time-Course of Citations (red) and Usage (hits, green)
Witten, Edward (1998) String Theory and Noncommutative Geometry. Adv. Theor. Math. Phys. 2, 253
1. Preprint or Postprint appears.
2. It is downloaded (and sometimes read).
3. Eventually citations may follow (for more important papers).
4. This generates more downloads, etc.
Perhaps the most important new information to become available for bibliometric studies is the per article readership information. Kurtz et al. (2003) "The NASA Astrophysics Data System: Sociology, Bibliometrics and Impact" http://cfa-www.harvard.edu/kurtz/jasist-submitted.ps
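A time-course like the one plotted on this slide can be built by counting download and citation events per month since the paper was deposited. The Python sketch below shows one way to do that binning. The dates are invented for illustration and the function names are hypothetical.

```python
from collections import Counter
from datetime import date

def months_since(start, event):
    """Whole calendar months between the deposit date and an event."""
    return (event.year - start.year) * 12 + (event.month - start.month)

def time_course(deposited, event_dates):
    """Return a list where index m holds the number of events in month m after deposit."""
    offsets = [months_since(deposited, d) for d in event_dates]
    counts = Counter(m for m in offsets if m >= 0)  # ignore events before deposit
    if not counts:
        return []
    return [counts[m] for m in range(max(counts) + 1)]

# Invented example dates for a single paper.
deposited = date(2002, 1, 15)
downloads = [date(2002, 1, 20), date(2002, 1, 25), date(2002, 2, 3), date(2002, 4, 10)]
citations = [date(2002, 3, 1), date(2002, 4, 2)]

usage_series = time_course(deposited, downloads)
citation_series = time_course(deposited, citations)
```

Plotting the two series on a shared month axis gives the usage-then-citation pattern described in steps 1 to 4.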
9. Evaluating Citebase
- http://opcit.eprints.org/opcitevaluation.shtml
- First detailed user evaluation of an open access Web citation indexing service
- The evaluation was aimed at users of arXiv, and all others who use bibliographic services to access the refereed journal literature.
- Citebase was evaluated by nearly 200 users from different backgrounds between June and October 2002
- Just prior to the evaluation Citebase had records for 230,000 papers, indexing 5.6 million references.
- By discipline, approximately 200,000 of these papers are classified within arXiv physics archives.
10. Results of Citebase evaluation
- Web-based citation indexing of open access eprint archives is closer to a state of readiness for serious use than had previously been realised
- Within the scope of its primary components, the search interface and services available from its rich bibliographic records, Citebase can be used simply and reliably for the purpose intended
- Tasks can be accomplished efficiently with Citebase regardless of the background of the user
- Links to citing and co-citing papers are features of Citebase that are valued by users
- Citebase compares favourably with other bibliographic services
- Coverage is seen as a limiting factor. Non-physicists were frustrated at the lack of papers from other sciences
11. Accomplishing tasks with Citebase
- Tasks can be accomplished efficiently with Citebase regardless of the background of the user.
- A key part of the evaluation assessed the usability of Citebase with a practical exercise to build a short bibliography based on a series of questions
[Charts: All users; Physicists only. Legend: yellow line, T = true; blue, F = false; purple, N = no response]
12. Most useful features of Citebase
- Links to citing and co-citing papers are features of Citebase that are valued by users
13. Citebase compares favourably with other bibliographic services
14. Growth of OAI, Eprints.org and Institutional Archives
- How OAI Archives for institutional research output have been growing and how to accelerate their growth
- The following slides are taken from the presentation "The Research Impact Cycle", which contains key data on the growth of open access through the self-archiving of institutional (peer-reviewed) research. These data can be freely used or adapted for other talks. Copy this PPT version for reuse.
- http://www.ecs.soton.ac.uk/harnad/Temp/self-archiving.ppt
- Data collected and analysed by Tim Brody, Electronics and Computer Science, Southampton University
15. Growth in number of OAI Archives (now 140 Archives, but the average number of papers per Archive (9000) needs to grow faster!)
16. EPrints.org software
- http://www.eprints.org/
- Generates eprint archives that are compliant with the OAI Protocol for Metadata Harvesting.
- Eprints.org software has been used to build institutional archives and disciplinary archives.
- In conjunction with OAI, Eprints.org has been a primary motivator for institutional archives
- Eprints.org v. 2.0 released February 2002 (now on v. 2.2.1)
- EPrints is free (GPL) software, aimed at organisations and communities.
17. Growth in number of Eprints.org Archives (c. 70) (again, average number of papers per Archive, c. 120, needs to grow faster!)
18. Work that needs to be done to accelerate growth per archive
- These curves must become convex upward
- Institutional self-archiving policies are needed
19. What have we learned from the Open Citation Project?
- OAI is gathering momentum
- Software for building OAI repositories is available
- Institutional archives are beginning to be created, but need to be filled by authors
- Attracting authors requires evidence of services that will improve the visibility and impact of their works
- Citation-ranked search and reference linking are examples of OAI services that do this
20. Online or Invisible? (Lawrence 2001)
- average of 336% more citations to online articles compared to offline articles published in the same venue
- Lawrence, S. (2001) Free online availability substantially increases a paper's impact. Nature, 411 (6837), 521
- http://www.neci.nec.com/lawrence/papers/online-nature01/
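Lawrence (2001) reports this figure as a percentage increase averaged across publication venues. The Python sketch below works through that arithmetic. The per-venue citation means are invented for illustration and are not Lawrence's data, and the function names are hypothetical.

```python
def percent_increase(offline_mean, online_mean):
    """Percentage by which the online mean exceeds the offline mean."""
    return (online_mean - offline_mean) / offline_mean * 100.0

def mean_percent_increase(venues):
    """Average the per-venue percentage increases (not the pooled means)."""
    increases = [percent_increase(off, on) for off, on in venues.values()]
    return sum(increases) / len(increases)

# Invented values: venue -> (mean citations offline, mean citations online).
VENUES = {
    "venue-1": (2.0, 5.0),  # 150% more citations online
    "venue-2": (4.0, 6.0),  # 50% more citations online
}

average_increase = mean_percent_increase(VENUES)
```

Averaging within venues first compares online and offline articles that appeared in the same place, which is why the figure is stated "in the same venue".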
21. What is needed to fill the archives
- Universities: Adopt a university-wide policy of self-archiving all university research output, e.g. Southampton (ECS) Research Self-Archiving Policy, http://www.ecs.soton.ac.uk/lac/archpol.html
- Departments: Create Departmental OAI-compliant Eprint Archives
- University Libraries: Provide digital library support for research self-archiving and archive-maintenance
- Promotion Committees: Request a standardized online CV from all candidates, with refereed publications all linked to their full-texts in the Departmental Archives
- Research Funders: Assess research impact online (from the online CVs)
22. Mandating online UK Research Assessment CVs linked to university eprint archives
- "will set an example for the rest of the world that will almost certainly be emulated in terms of research assessment and research access" (Ariadne, issue 35, April 30, 2003)
- http://www.ariadne.ac.uk/issue35/harnad/
23. Exploiting OAI
- OAI has become the critical technical infrastructure for open access to author self-archived papers in institutional archives
- OAI enables cross-archive services such as Citebase
- Open access data and services promise increased visibility and impact for authors
- OAI resources will begin to grow significantly when authors realise this, and when research councils start mandating open access to the publication of results of funded research
24. Credits: Open Citation Project at Southampton
- Principal Investigator is Stevan Harnad
- Technical development at Southampton is directed by Les Carr
- EPrints.org software is being developed by Chris Gutteridge
- Citebase is produced and managed by Tim Brody
- Project manager is Steve Hitchcock
- A copy of these slides can be found on the OpCit Web site
- http://opcit.eprints.org/. Look for Papers and Presentations
- Contact Steve Hitchcock sh94r_at_ecs.soton.ac.uk