Title: RePEc: an early example of an open library
1 RePEc: an early example of an open library
- Thomas Krichel
- 2006-12-14
2 Open libraries and scholarly communication
- RePEc is an example of an Open Library.
- An Open Library is loosely defined as an application of the principles of open-source software (OSS) to libraries. The concept is
- vague
- in the making
- but has some history
- Looking at RePEc will fix ideas.
3 scholarly communication
- Is mainly about scholars communicating
- between themselves
- to students, occasionally
- Thus it is essentially a community activity.
- Traditionally, there have been two intermediaries acting as external agents.
- libraries
- publishers
4 when tradition ends
- Two external shocks
- The Internet arrives and reduces distribution costs to zero
- Computer technology arrives and reduces storage costs somewhat
- The opportunity sets of community members and external agents increase
- Proposition: the future depends largely on what the community members decide. External agents have little impact.
5 discipline communities
- Scholars of various disciplines have varying habits of research, publication, and evaluation.
- It is likely that the Internet will emphasize those differences rather than reduce them.
6 examples: disciplines with established informal publishing
- Preprint communities
- Physics → arxiv.org
- Mathematics → arxiv.org, partially
- Working paper communities
- Computer Science → CiteSeer
- (working papers disappearing)
- Economics → RePEc
7 change is tough
- Change has to come from inside the discipline.
- There has to be a pioneering individual who
- is technically well versed
- is managerially smart
- has extraordinary forward thinking
- is willing to take considerable risk with her career
- Ginsparg, Krichel, Giles, and Lawrence are rare
8 RePEc History
- It started with me as a research assistant in the Economics Department of Loughborough University of Technology in 1990.
- a predecessor of the Internet allowed me to download free software without effort
- but academic papers had to be gathered in a painful way
9 CoREJ
- published by HMSO
- Photocopied contents tables of recently published economics journals received at the Department of Trade and Industry
- Typed list of the working papers recently received by the University of Warwick library
- The latter was the more interesting.
10 working papers
- early accounts of research findings
- published by economics departments
- in universities
- in research centers
- in some government offices
- in multinational administrations
- disseminated through exchange agreements
- important because of the 4-year publishing delay
11 1991-1992
- I planned to circulate the Warwick working paper list over listserv lists.
- I argued it would be good for them
- increase incentives to contribute
- increase revenue for ILL
- After many trials, Warwick refused.
- Towards the end of that time, I was offered a lectureship and decided to start working on my own collection.
12 1993: BibEc and WoPEc
- Fethy Mili of Université de Montréal had a good collection of papers and gave me his data.
- I put his bibliographic data on a gopher and called the service "BibEc".
- I also gathered the first ever online electronic working papers on a gopher and called the service "WoPEc".
13 NetEc consortium
- BibEc: printed papers
- WoPEc: electronic papers
- CodEc: software
- WebEc: web resource listings
- JokEc: jokes
- HoPEc
- a lot of Ec!
14 WoPEc to RePEc
- WoPEc was a catalog record collection
- WoPEc remained the largest web access point
- but getting contributions was tough
- In 1996 I wrote the basic architecture for RePEc.
- ReDIF
- Guildford Protocol
15 1997: RePEc principle
- Many archives
- archives offer metadata about digital objects (mainly working papers)
- One database
- The data from all archives forms one single logical database despite the fact that it is held on different servers.
- Many services
- users can access the data through many interfaces.
- providers of archives offer their data to all interfaces at the same time. This provides for an optimal distribution.
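As a rough illustration of the "many archives, one database, many services" principle, the following Python sketch merges records held by separate archives into one logical dataset keyed by handle, then lets two different "services" read the same data. The archive codes, the second record, and all function names are made up for illustration; this is not RePEc's actual software.

```python
from typing import Dict, List

Record = Dict[str, str]

def merge_archives(archives: Dict[str, List[Record]]) -> Dict[str, Record]:
    """Combine the records of all archives into one logical database.

    Records are keyed by handle; each record also remembers which
    (hypothetical) archive supplied it.
    """
    database: Dict[str, Record] = {}
    for archive_name, records in archives.items():
        for record in records:
            merged = dict(record)
            merged["archive"] = archive_name
            database[record["handle"]] = merged
    return database

def title_listing(database: Dict[str, Record]) -> List[str]:
    """One 'service': an alphabetical list of titles."""
    return sorted(record["title"] for record in database.values())

def items_per_archive(database: Dict[str, Record]) -> Dict[str, int]:
    """Another 'service': the number of items each archive contributes."""
    counts: Dict[str, int] = {}
    for record in database.values():
        counts[record["archive"]] = counts.get(record["archive"], 0) + 1
    return counts

# Two archives held on different servers (illustrative data only).
archives = {
    "sur": [{"handle": "RePEc:sur:surrec:9601",
             "title": "Dynamic Aspect of Growth and Fiscal Policy"}],
    "xxx": [{"handle": "RePEc:xxx:wpaper:0001",
             "title": "A Hypothetical Paper"}],
}
database = merge_archives(archives)
```

Both services read the same merged dictionary, so an archive that publishes its metadata once reaches every interface at the same time.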
16 RePEc is based on 670 archives
- WoPEc
- EconWPA
- DEGREE
- S-WoPEc
- NBER
- CEPR
- Blackwell
- US Fed in Print
- IMF
- OECD
- MIT
- University of Surrey
- CO PAH
- Elsevier
17 to form a 433k item dataset
- 191,000 working papers
- 237,000 journal articles
- 1,400 software components
- 2,300 book and chapter listings
- 11,000 author contact and publication listings
- 10,000 institutional contact listings
18 RePEc is used in many services
- Econpapers
- Decomate Z39.50 service
- NEP New Economics Papers
- Inomics
- RePEc author service
- IDEAS
- RuPEc
- EDIRC
- LogEc
- CitEc
19 describes documents
- Template-Type: ReDIF-Paper 1.0
- Title: Dynamic Aspect of Growth and Fiscal Policy
- Author-Name: Thomas Krichel
- Author-Person: RePEc:per:1965-06-05:thomas_krichel
- Author-Email: T.Krichel_at_surrey.ac.uk
- Author-Name: Paul Levine
- Author-Email: P.Levine_at_surrey.ac.uk
- Author-WorkPlace-Name: University of Surrey
- Classification-JEL: C61 E21 E23 E62 O41
- File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur/surrec/surrec9601.pdf
- File-Format: application/pdf
- Creation-Date: 199603
- Revision-Date: 199711
- Handle: RePEc:sur:surrec:9601
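A template like the one above can be read by a very small parser. The Python sketch below assumes that each line has the form `Field: value` and that a line starting with whitespace continues the previous value; both are simplifying assumptions for illustration, and the function names are hypothetical rather than part of any RePEc software.

```python
from typing import List, Tuple

def parse_template(text: str) -> List[Tuple[str, str]]:
    """Parse 'Field: value' lines into an ordered list of (field, value) pairs.

    A list is used instead of a dict because fields such as Author-Name
    repeat, and their order ties each author to the fields that follow.
    Assumption: a line starting with whitespace continues the previous
    value and is joined without a separator (suitable for wrapped URLs).
    """
    fields: List[Tuple[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and fields:
            name, value = fields[-1]
            fields[-1] = (name, value + line.strip())
            continue
        # Split at the first colon only, so handles and URLs stay intact.
        name, _, value = line.partition(":")
        fields.append((name.strip().lower(), value.strip()))
    return fields

def author_names(fields: List[Tuple[str, str]]) -> List[str]:
    """Return all author names in the order they appear."""
    return [value for name, value in fields if name == "author-name"]

sample = """\
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Name: Paul Levine
File-URL: ftp://www.econ.surrey.ac.uk/
  pub/RePEc/sur/surrec/surrec9601.pdf
Handle: RePEc:sur:surrec:9601
"""
fields = parse_template(sample)
```

Field names are lower-cased here because the document and person templates in these slides use different capitalization for the same kind of field.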
20 describes persons (RAS)
- template-type: ReDIF-Person 1.0
- name-full: MANKIW, N. GREGORY
- name-last: MANKIW
- name-first: N. GREGORY
- handle: RePEc:per:1984-06-16:N__GREGORY_MANKIW
- email: ngmankiw_at_harvard.edu
- homepage: http://post.economics.harvard.edu/faculty/mankiw/mankiw.html
- workplace-institution: RePEc:edi:deharus
- workplace-institution: RePEc:edi:nberrus
- Author-Article: RePEc:aea:aecrev:v:76:y:1986:i:4:p:676-91
- Author-Article: RePEc:aea:aecrev:v:77:y:1987:i:3:p:358-74
- Author-Article: RePEc:aea:aecrev:v:78:y:1988:i:2:p:173-77
- .
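The Author-Article fields link a person record to document records by handle. The Python sketch below assumes that document handles are colon-separated in the pattern `RePEc:archive:series:item`, as in `RePEc:aea:aecrev:v:76:y:1986:i:4:p:676-91`. The component labels (`authority`, `archive`, `series`, `item`) and function names are illustrative choices, and person and institution handles are not covered.

```python
from collections import Counter
from typing import List, NamedTuple

class DocumentHandle(NamedTuple):
    authority: str  # "RePEc"
    archive: str    # e.g. "aea"
    series: str     # e.g. "aecrev"
    item: str       # everything after the third colon

def split_handle(handle: str) -> DocumentHandle:
    """Split a document handle into its four assumed components."""
    parts = handle.split(":", 3)  # at most three splits: the item may contain colons
    if len(parts) != 4:
        raise ValueError(f"not a four-part handle: {handle!r}")
    return DocumentHandle(*parts)

def papers_per_series(handles: List[str]) -> Counter:
    """Count an author's claimed papers per (archive, series) pair."""
    return Counter((h.archive, h.series) for h in map(split_handle, handles))

author_articles = [
    "RePEc:aea:aecrev:v:76:y:1986:i:4:p:676-91",
    "RePEc:aea:aecrev:v:77:y:1987:i:3:p:358-74",
    "RePEc:aea:aecrev:v:78:y:1988:i:2:p:173-77",
]
first = split_handle(author_articles[0])
counts = papers_per_series(author_articles)
```

Because the link is a handle rather than a name string, the person record points at exactly one document record regardless of how the author's name is spelled there.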
21 describes institutions
- Template-Type: ReDIF-Institution 1.0
- Primary-Name: University of Surrey
- Primary-Location: Guildford
- Secondary-Name: Department of Economics
- Secondary-Phone: (01483) 259380
- Secondary-Email: economics_at_surrey.ac.uk
- Secondary-Fax: (01483) 259548
- Secondary-Postal: Guildford, Surrey GU2 5XH
- Secondary-Homepage: http://www.econ.surrey.ac.uk/
- Handle: RePEc:edi:desuruk
22 what do open libraries do?
- Identify records
- Relate identified records
- These actions require human control.
- They prepare for assessment of performance.
23 key to success
- Have a small group of volunteers
- Disseminate as widely as possible
- Demonstrate to authors and institutions that it works for them.
- institutional registration
- author registration
24 institutional registration
- It started with one sad geezer making a list of departments that have a web site.
- I persuaded him that his data would be more widely used if integrated into the RePEc database.
- Now he is a happy geezer and one of our three crucial volunteers.
25 RePEc author service
- RePEc document data has author names as strings.
- The authors register with RAS to list contact details and identify the papers they wrote.
- This is classic access control, but done by the authors.
26 author registration
- It started when funding allowed us to hire a crazy programmer to write an author registration system.
- The system went online as "HoPEc" in late 2000.
- It has since been renamed "RePEc author service" (RAS).
- A recent grant from OSI allows for a rewrite and
expansion.
28 LogEc
- It is a service by Sune Karlsson that tracks usage of items in the RePEc database
- abstract views
- downloads
- Christian Zimmermann mails a monthly usage summary to
- archive maintainers
- RAS registrants
29 authors' incentives
- Authors perceive the registration as a way to achieve common advertising for their papers.
- Author records are used to aggregate usage logs across RePEc user services for all papers of an author.
- Stimulates an "I am bigger than you are" mentality. Size matters!
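The aggregation of usage logs by author can be sketched in a few lines of Python. The log format (service, paper handle, download count), the handles, the author identifiers, and the numbers are all invented for illustration; this is not the actual LogEc code.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

LogEntry = Tuple[str, str, int]  # (service, paper handle, downloads)

def usage_per_author(author_papers: Dict[str, List[str]],
                     logs: Iterable[LogEntry]) -> Counter:
    """Sum downloads over all services for every paper an author has claimed."""
    per_paper: Counter = Counter()
    for _service, handle, downloads in logs:
        per_paper[handle] += downloads  # merge logs across services
    totals: Counter = Counter()
    for author, handles in author_papers.items():
        totals[author] = sum(per_paper[handle] for handle in handles)
    return totals

# Papers claimed by two hypothetical registered authors.
author_papers = {
    "author_a": ["paper_1"],
    "author_b": ["paper_1", "paper_2"],  # co-author of paper_1
}
# Made-up monthly logs from two user services.
logs = [
    ("IDEAS", "paper_1", 5),
    ("EconPapers", "paper_1", 7),
    ("IDEAS", "paper_2", 3),
]
totals = usage_per_author(author_papers, logs)
ranking = totals.most_common()  # authors ordered by total downloads
```

A co-authored paper counts in full for each of its registered authors in this sketch; sharing downloads between co-authors would be a different design choice.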
30 recently
- In 2004, Peter Jacso compared RePEc services with the EconLit proprietary professional database.
- IDEAS and LogEc were Peter's pick.
- EconLit was Peter's pan.
- He slammed the working paper coverage of EconLit.
- He could have slammed other things.
31 RePEc / EconLit partnership
- RePEc now delivers all its working paper data to EconLit, without getting the journal data of EconLit in return.
- This may seem absolutely perverse! A bunch of volunteers laboring for a multi-million concern!
- In fact it serves RePEc well because it adds officialdom.
32 partnership with library officialdom
- Recently, the Zentralbibliothek fuer Wirtschaftswissenschaften (ZBW) has become interested in working with RePEc.
- The aim is to build a new dataset, called LoTEc for the moment.
- LoT means Long Term, but could also mean a lot.
33 automated full-text collection
- The idea behind LoTEc is to build a system that collects full text from metadata.
- take title and author data
- search Google for it
- examine the search results to see if they could be the full text
- http://openlib.org/home/krichel/proposals/khabarovsk.pdf is a funding application with more ideas about such systems.
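The three steps above can be sketched in Python as follows. The search step is passed in as a function so that no real search engine API is assumed; `fake_search`, the shape of a search result, the 0.8 similarity threshold, and the PDF-only filter are all illustrative assumptions, not a description of LoTEc or konz.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List

SearchResult = Dict[str, str]  # assumed keys: "url" and "title"

def build_query(title: str, author: str) -> str:
    """Step 1: turn title and author metadata into a search query."""
    return f'"{title}" {author}'

def looks_like_full_text(record_title: str, result: SearchResult,
                         threshold: float = 0.8) -> bool:
    """Step 3: accept a result if it is a PDF whose title is close to the record's."""
    similarity = SequenceMatcher(
        None, record_title.lower(), result["title"].lower()
    ).ratio()
    return result["url"].lower().endswith(".pdf") and similarity >= threshold

def find_candidates(title: str, author: str,
                    search: Callable[[str], List[SearchResult]]) -> List[str]:
    """Run the query (step 2) and keep the URLs that pass the check."""
    results = search(build_query(title, author))
    return [r["url"] for r in results if looks_like_full_text(title, r)]

def fake_search(query: str) -> List[SearchResult]:
    """Stand-in for a real search engine; returns canned, made-up results."""
    return [
        {"url": "http://example.org/papers/growth.pdf",
         "title": "Dynamic Aspect of Growth and Fiscal Policy"},
        {"url": "http://example.org/about.html",
         "title": "About our department"},
    ]

candidates = find_candidates(
    "Dynamic Aspect of Growth and Fiscal Policy", "Krichel", fake_search
)
```

Keeping the search function pluggable means the matching logic can be tested on canned results before any live queries are sent.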
34 konz project
- The konz project software is a prototype for LoTEc
- uses DBLP
- title searches only
- limited to PDF / MS office full-text
- Results of konz are thin: only 5% of full texts found. Probably the fault of bad programming (by myself).
35 will LoTEc do better?
- Recent work on a small sample of 460 papers by Bergstrom finds a self-archiving ratio of about 80% for top-level recent research papers.
- There is a chance that we will find these if we can improve on the konz software.
- Initial tests to be done on the current version of konz in early 2007.
36 what good is a paper in an archive?
- Unless we have the authors' permission, we cannot make a paper available in archived form (on our server). All we could do is make metadata available.
- To gather permissions from the authors, the plan is to use the RePEc author service in stage four of the ACIS project.
37 validation / permissions interface
- For a paper that an author has written, RAS would present a set of potential full-text files and ask
- is this a full text of this paper?
- can we make an archived copy of this paper available for public use?
- This interface is supposed to be set up during 2007.
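One possible way to record the answers to these two questions is sketched below in Python. The class, field, and function names are hypothetical and the URLs are placeholders; the planned RAS interface may store this information quite differently.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CandidateFile:
    """A potential full-text file shown to the author for one paper."""
    url: str
    is_full_text: Optional[bool] = None  # question 1; None means not yet answered
    may_archive: Optional[bool] = None   # question 2; None means not yet answered

def archivable(candidates: List[CandidateFile]) -> List[str]:
    """Return the URLs confirmed as full text AND cleared for public archiving."""
    return [
        c.url for c in candidates
        if c.is_full_text is True and c.may_archive is True
    ]

candidates = [
    CandidateFile("http://example.org/a.pdf", is_full_text=True, may_archive=True),
    CandidateFile("http://example.org/b.pdf", is_full_text=True, may_archive=False),
    CandidateFile("http://example.org/c.pdf"),  # author has not answered yet
]
cleared = archivable(candidates)
```

Treating an unanswered question as "no" keeps the default conservative: nothing is archived without an explicit yes from the author.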
38 KEY idea 1
- RePEc attracts a community of users and contributors
- The community itself is the focus of attention
- RePEc describes the living rather than the dead.
- Forget about documents!
39 KEY idea 2
- Forget about users!
- Disseminate widely
- Users will come through Google anyway.
- And Google loves RePEc services
- puts RePEc services at the top when the query consists of the name of an author
40 obstacles to open libraries
- lack of imagination and entrepreneurship
- inability to form alliances
- user-centered thinking
- document-centered thinking
- technical competence required
- OAI-PMH
- XML and XML Schema
- Unicode
- the "C" word
41 http://openlib.org/home/krichel
- Thank you for your attention!