Title: The Use of Usage
1The Use of Usage
- Michael J. Kurtz
- Harvard-Smithsonian
- Center for Astrophysics
2Collaborators
- Johan Bollen - LANL
- Paul Ginsparg - arXiv/Cornell
- Alberto Accomazzi - ADS/SAO
- Edwin Henneken - ADS/SAO
- Guenther Eichhorn - ADS → Springer
3The Use of Usage
- Sources of Usage Information
- Properties of Usage Information
- Applications of Usage Information
4Sources of Usage Information
- Library logs: reshelving statistics
- Historical
- Journal volume, not article
- Publisher/Consolidator logs
- ADS etc
- Log Consolidators
- MESUR
- The Need for Standards
5A typical ADS query
6Sorted by citation count
Next get review articles
7Papers citing papers
Now get heavily read
8What scientists are reading
9Johan Bollen (LANL): Principal investigator.
Herbert Van de Sompel (LANL): Architectural consultant.
Aric Hagberg (LANL): Mathematical and statistical consultant.
Luis Bettencourt (LANL): Mathematical and statistical consultant.
Ryan Chute (LANL): Database management, ingestion and normalization.
Lyudmilla Balakireva: Database management, ingestion and normalization.
The Andrew W. Mellon Foundation has awarded a
grant to Los Alamos National Laboratory (LANL) in
support of a two-year project that will
investigate metrics derived from the
network-based usage of scholarly information. The
Digital Library Research Prototyping Team of
the LANL Research Library will carry out the
project. The project's major objective is
enriching the toolkit used for the assessment of
the impact of scholarly communication items, and
hence of scholars, with metrics that derive from
usage data.
10MESUR general approach
- Generalizable, quantitative results
- Create very large-scale reference data set
- Usage, citation and bibliographic data combined
- Various communities, various collections
- Investigate sampling issues
- Effects of sampling on usage-based assessment
- Mapping and characterization of scholarly community
- Uncertainty quantification: noise, bots,
- Investigate validity of usage data and usage-based metrics
- Cross-validation: compare to existing, accepted journal-focused metrics and data
- Not selling 1 metric: exploring many possibilities, many facets of impact
- Explorative approach: not top-down, bottom-up exploration
- Lay foundation for scientific, generalizable study of usage data-based assessment
11Data normalization and ingestion
- Minimal requirements for all usage data
- Unique usage events (article level)
- Fields: unique session ID, date/time, unique document ID and/or metadata, request type
- Note difference with usage statistics
- 2007 9 1 0 0 1 CFA cffoe A172080.N1.Vanderbilt.Edu unknown AST A 1996SPIE.2828..64S http://foe.edu/abs/1996SPIE.2828..64S http://www.google.com
- 2007 9 1 0 0 1 CFA cffoe 210.94.41.89 unknown PHY A 2007ApPhL.90a2120C http://foe.edu/abs/2007ApPhL.90a2120C http://www.google.co.kr
- 2007 9 1 0 0 1 CFA cffoe 24-196-228-125.dhcp.gwnt.ga.charter.com unknown AST A 2000ASPC.213.333S http://foe.edu/abs/2000bioa.conf.333S http://scholar.google.com
- 2007 9 1 0 0 4 CFA cffoe 163.152.35.114 4700387eae PHY A 1993WRR..29.133S http://foe.edu/abs/1993WRR..29.133S http://scholar.google.com
- 2007 9 1 0 0 6 CFA cffoe pd9e980fc.dip0.t-ipconnect.de 45f0c69881 AST X 2007AN..328.841H http://arXiv.org/abs/0708.1863 http://foe.edu
- 2007 9 1 0 0 1 CFA cffoe A172080.N1.Vanderbilt.Edu unknown AST A 1996SPIE.2828..64S http://foeabs.edu/abs/1996SPIE.2828..64S http://www.google.com
- 2007 9 1 0 0 1 CFA cffoe 210.94.41.89 unknown PHY A 2007ApPhL.90a2120C http://foeabs.edu/abs/2007ApPhL.90a2120C http://www.google.co.kr
- 2007 9 1 0 0 1 CFA cffoe 24-196-228-125.dhcp.gwnt.ga.charter.com unknown AST A 2000ASPC.213.333S http://foeabs.edu/abs/2000bioa.conf.333S http://scholar.google.com
- 2007 9 1 0 0 4 CFA cffoe 163.152.35.114 4700387eae PHY A 1993WRR..29.133S http://foeabs.edu/abs/1993WRR..29.133S http://scholar.google.com
- 2007 9 1 0 0 6 CFA cffoe pd9e980fc.dip0.t-ipconnect.de 45f0c69881 AST X 2007AN..328.841H http://arXiv.org/abs/0708.1863 http://foeabs.edu
- 2007 9 1 0 0 6 CFA cffoe foel25144.4u.com.gh 47002f8eda PHY A 2002AGUFM.S21A0965M http://foeabs.edu/abs/2002AGUFM.S21A0965M http://www.google.com
- 2007 9 1 0 0 6 CFA cffoe 66-215-171-214.dhcp.ccmn.ca.charter.com 4681d22a6f AST A 2001PSS..49.657R http://foeabs.edu/cgi-bin/bib_query?bibcode2001P26SS..49.657R http://cfa-www.edu
- 2007 9 1 0 0 7 CFA cffoe nat-ptouser3.uspto.gov unknown PHY A 2005ApPhL.86g2106M http://foeabs.edu/abs/2005ApPhL.86g2106M http://www.google.com
- 2007 9 1 0 0 7 CFA cffoe cpe-71-65-25-115.ma.res.rr.com unknown PHY A 1980SPIE.205.153S http://foeabs.edu/abs/1980SPIE.205.153S http://www.google.com
- 2007 9 1 0 0 7 CFA cffoe customer3491.pool1.unallocated-106-0.orangehomedsl.co.uk unknown PHY A 1983ElL..19.883V http://foeabs.edu/abs/1983ElL..19.883V http://www.google.co.uk
- 2007 9 1 0 0 8 CFA cffoe Uranus.seas.ucla.edu 46672d96b2 PHY A 1966Phy..32.385K http://foeabs.edu/abs/1966Phy..32.385K http://www.google.com
- 2007 9 1 0 0 9 CFA cffoe 75-121-173-37.dyn.centurytel.net 46cf1fd8a6 AST D 1984ApJS..56.257J http://vizier.cfa.edu/viz-bin/VizieR?-sourceIII/92/ http://foeabs.edu
- 2007 9 1 0 0 13 CFA cffoe foel17-18.kln.forthnet.gr unknown AST A 1987cosm.book...C http://foeabs.edu/abs/1987cosm.book...C http://www.google.gr
- 2007 9 1 0 0 15 CFA cffoe hades.astro.uiuc.edu 46f707564d PRE A 2007arXiv0707.3146N http://foeabs.edu/abs/2007arXiv0707.3146N http://foeabs.edu
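The sample records above can be split into the minimal fields the slide lists (session ID, date/time, document ID, request type). The following is a minimal Python sketch, not the MESUR or ADS ingestion code. It assumes each record has 15 whitespace-separated tokens: six date/time numbers, two site tokens whose meaning the slide does not state, then host, session ID, database, request type, bibcode, target URL and referrer. The names `UsageEvent`, `parse_event` and `reads_per_document` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class UsageEvent:
    # Field names are assumptions; the slide only names the minimal fields.
    timestamp: datetime
    host: str
    session_id: Optional[str]  # None when the log says "unknown"
    database: str              # e.g. AST, PHY, PRE
    request_type: str          # e.g. A, X, D
    document_id: str           # the bibcode
    target_url: str
    referrer: str


def parse_event(line: str) -> UsageEvent:
    """Parse one log record, assuming the 15-token layout described above."""
    tokens = line.split()
    if len(tokens) != 15:
        raise ValueError(f"expected 15 fields, got {len(tokens)}")
    year, month, day, hour, minute, second = (int(t) for t in tokens[:6])
    # tokens[6] and tokens[7] ("CFA", "cffoe") are skipped: meaning not stated.
    host, session, database, request_type, document_id, target, referrer = tokens[8:]
    return UsageEvent(
        timestamp=datetime(year, month, day, hour, minute, second),
        host=host,
        session_id=None if session == "unknown" else session,
        database=database,
        request_type=request_type,
        document_id=document_id,
        target_url=target,
        referrer=referrer,
    )


def reads_per_document(events: List[UsageEvent]) -> Counter:
    """Collapse events into per-document counts (a 'usage statistic')."""
    return Counter(e.document_id for e in events)


SAMPLE = [
    "2007 9 1 0 0 1 CFA cffoe A172080.N1.Vanderbilt.Edu unknown AST A "
    "1996SPIE.2828..64S http://foeabs.edu/abs/1996SPIE.2828..64S http://www.google.com",
    "2007 9 1 0 0 4 CFA cffoe 163.152.35.114 4700387eae PHY A "
    "1993WRR..29.133S http://foeabs.edu/abs/1993WRR..29.133S http://scholar.google.com",
]

if __name__ == "__main__":
    events = [parse_event(line) for line in SAMPLE]
    for event in events:
        print(event.timestamp, event.session_id, event.request_type, event.document_id)
    print(reads_per_document(events))
```

The difference from usage statistics shows up in `reads_per_document`: once events are collapsed into counts, the session ID and timestamp are gone, so the counts can no longer say which documents were read together.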
12Properties of Usage Information
- Obsolescence
- Comparison with citations
- Article type
- Network properties
13Obsolescence
14Usage vs Age 110 years
15Usage vs Age 25 years
16Usage vs Age 90 weeks
17Scholarly Usage Model
18Students (Google Scholar)
19Five Usage Modes
20General Public (Google)
21Cites vs Use by Scholars
22The 100 year relation
23Cites CI only
24Article Type
- Newspapers are rarely cited
- Trimble et al. is the most-read paper in astronomy every year, but is almost never cited
- ApJ articles are cited 300 times as often as BAAS articles, but are read only 6 times as often; the number of reads per cite differs by a factor of 50
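The factor of 50 follows directly from the two ratios, if the figures are read as per-article ratios of ApJ to BAAS. The symbols $C$ (cites per article) and $R$ (reads per article) are introduced here for illustration only:

```latex
\frac{C_{\mathrm{ApJ}}}{C_{\mathrm{BAAS}}} = 300, \qquad
\frac{R_{\mathrm{ApJ}}}{R_{\mathrm{BAAS}}} = 6
\quad\Longrightarrow\quad
\frac{R_{\mathrm{BAAS}}/C_{\mathrm{BAAS}}}{R_{\mathrm{ApJ}}/C_{\mathrm{ApJ}}}
= \frac{C_{\mathrm{ApJ}}}{C_{\mathrm{BAAS}}} \cdot \frac{R_{\mathrm{BAAS}}}{R_{\mathrm{ApJ}}}
= \frac{300}{6} = 50
```

So a BAAS article collects about 50 times as many reads per cite as an ApJ article.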
25Networks - ADS
26Usage map
- 200M usage events
- 2006 usage only
- JCR journals (~7600)
Red, orange: psych, cogn
Green: phys, chem
Olive: material science
Blue: biology
Purple: pharma
27Applications of Usage
- Articles
- Individuals
- Departments
- Journals
- Countries
28Finding Articles
- 2nd Order Operators, in ADS since 1996
- People who read this set of articles also read this set most often
- Finds the most popular, or hottest articles
- A collaborative filter
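The "people who read this set also read that set" idea can be sketched as a co-read count. The slides do not describe how the ADS second-order operators are implemented, so this is only a minimal illustration in Python. The function name `also_read`, the reader IDs and the article IDs are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def also_read(
    reads: Dict[str, Set[str]], seed: Set[str], top: int = 3
) -> List[Tuple[str, int]]:
    """Rank articles outside `seed` by how many seed readers also read them.

    `reads` maps a reader (or session) ID to the set of article IDs it read.
    """
    counts: Counter = Counter()
    for articles in reads.values():
        if articles & seed:                  # this reader read part of the seed set
            counts.update(articles - seed)   # credit everything else they read
    return counts.most_common(top)


# Toy data: hypothetical readers r1..r4 and articles A..E.
READS = {
    "r1": {"A", "B", "C"},
    "r2": {"A", "C"},
    "r3": {"B", "D"},
    "r4": {"D", "E"},
}

if __name__ == "__main__":
    # Readers of A are r1 and r2; both read C, only r1 read B.
    print(also_read(READS, {"A"}))
```

Restricting `reads` to recent events before counting would be one way to move from "most popular" toward "hottest"; that refinement is an assumption and is not shown in the slides.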
29Most Popular
30Hottest papers on dark energy
31Measuring Individuals
- The number of times one's articles are read is a valid measure of one's scientific impact, similar to citation counts
- Use has different properties than cites; together they form a two-dimensional view of productivity
32The Read-Cite Diagram
33Professional Astronomers
34Two-dimensional productivity model
35Different productivities - histories
36Productivity vs Age - Cites
37Productivity vs Age - Reads
38Productivity vs Age Read10
39Measuring Departments
40Department Size Can Matter
41Measuring Journals
- Beyond the Impact Factor
- Reads vs Cites differences will be important here
- The New York Times would have a low Impact Factor
43IF vs PageRank
44Readership rates per article
45Relative Currency
46Regional Differences
47Measuring Countries
- Authors are from countries; reads as a function of the author's country have yet to be studied
- Readers are from countries; their activities allow one to measure the Scientific Wealth of Nations
48ADS use vs per capita GDP
49Per astronomer ADS vs GDP
50Astronomers vs GDP
51Modeling Countries Research
52A Good Fit
53Astronomy is representative
ADS predicts average of Cites & articles better than cites predicts articles
54Measuring Changes