Title: Boolean, bibliometrics, and beyond
1Boolean, bibliometrics, and beyond
Part 2
LIS 670 Donna Bair-Mundy
2Bibliometrics
3Bibliometrics: a definition
Using quantitative analysis and statistics to
examine patterns in academic publishing, now
including information transmitted via the World
Wide Web
4Bibliometrics: what it looks at
- Author productivity
- Citation analysis: impact factors, indexing
- Obsolescence of information resources: half-life of articles
- Dispersion of articles in certain fields
- Word frequencies
5Bibliometrics: Purposes (1)
(Diagram: Physics branching into Astrophysics, Biophysics, and Subatomic particle physics)
- Provide evolutionary models of science, technology, and scholarship
- Invisible colleges
- Structure of scholarly disciplines
- Evolution of a discipline over time
- Global warming
- Evolution of concepts
6Bibliometrics: Purposes (2)
- Assist development of information retrieval methodologies
- Provide tools for studying information use and impact
- Assist in selection and deselection of resources
7Properties of scientific literature
- Fragmentary - each paper contributes a small piece to the puzzle under study
- Derivative - scientific papers rely heavily on previous research (acknowledged in citations)
- Edited - peer reviewed by anonymous referees
8Evolution of a discipline
Cole and Eales - 1917 - The history of comparative anatomy: a statistical analysis of the literature
- Purpose: "to reduce to geometric form the activities of the corporate body of anatomical research, and the relative importances from time to time of each country and division of the subject"
- Looked at 6,436 publications dealing with animal anatomy for the period 1543 to 1860
Published in Sci. Progr. 11: 578-596.
9Evolution of a discipline
Cole and Eales - 1917 - The history of comparative anatomy: a statistical analysis of the literature
- When were the periods of greater or less importance?
- Where were the centers of activity at any given time?
- As the field grew, how and when did it begin to be subdivided into narrower fields?
Looking at publications within a field to tell us
about the field itself
10Evolution of a discipline: IS
Harmon, Glynn - 1971 - On the evolution of information science. JASIS 22(4): 235-241
- Emergence and development of information science
- Relationships and roles of information science within a potentially emergent suprasystem of knowledge
11Science, politics, and economics
E. Wyndham Hulme - 1923 - Statistical bibliography in relation to the growth of modern civilization
- First to use the term "statistical bibliography"
- Purpose: "to ascertain and illustrate by bibliographical data, various stages in the development of the mechanics of civilization"
Published by Butler and Tanner Grafton (London)
12Hulme (cont'd)
- Used 13 annual issues of The International Catalogue of Scientific Literature, from 1901 to 1913
- Counted author entries for various subjects
- Tabulated number of indexed journals by countries (which countries are highly productive in science?)
13Hulme (cont'd)
- Felt that subject division in a discipline was a sign of growth
- Concluded that scientific publication output is influenced by population change and political and economic movements
14Research output by countries
J. Martin van Zyl - 2013 - The generalized Pareto distribution fitted to research outputs of countries. Scientometrics 94(3): 1099-1109
- Which continent (besides Antarctica) is not represented?
- Why might that be?
- What might be the consequences?
15Cost of research
16Consequences
Search term: number of results
- ebola: 722 results
- ebolavirus: 984 results
- aids: 122,722 results
- hiv: 196,414 results
17Author productivity
Alfred J. Lotka - 1926 - Statistics: the frequency distribution of scientific productivity
- Purpose: to "determine, if possible, the part which men of different calibre contribute to the progress of science"
- Looked at Chemical Abstracts Index, then Geschichtstafeln der Physik
Published in J. Washington Acad. Sci. 16: 317-325.
18Lotka's Law
The number of authors y in a given subject who each produce x publications is inversely proportional to x raised to some power n.
19Lotka's Law - scientific publications
Inverse square law of scientific productivity
$x^n \cdot y = C$
Where:
- x = number of publications
- y = number of authors credited with x publications
- n = constant (equals 2 for scientific subjects)
- C = constant
20Lotka's Law - scientific publications
(Graph of $x^n \cdot y = C$; axis label: No. of authors)
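To make the inverse square law concrete, here is a minimal Python sketch that tabulates the expected number of authors for each publication count. The function name and the starting figure of 100 single-paper authors are hypothetical, chosen only for illustration; the constant C is taken to be the number of authors with exactly one publication, since at x = 1 the law reduces to y = C.

```python
def lotka_expected_authors(single_paper_authors, max_papers, n=2.0):
    """Expected number of authors y with x publications, from x**n * y = C.

    Assumption: C equals the number of authors with exactly one
    publication, because at x = 1 the law gives y = C.
    """
    c = float(single_paper_authors)
    return {x: c / x**n for x in range(1, max_papers + 1)}


# Hypothetical field in which 100 authors have published exactly one paper.
expected = lotka_expected_authors(100, 5)
for x, y in expected.items():
    print(f"{x} publication(s): about {y:.2f} authors")
```

With n = 2, doubling the number of publications cuts the expected number of authors to one quarter, which is why prolific authors are so rare.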
21Relative impacts of journals
Gross and Gross - 1927 - College libraries and chemical education
- Purpose: select appropriate journals for a chemical library to provide good education for students
- Which journals to collect?
- Tabulated 3,633 citations found in the 1926 volume of the Journal of the American Chemical Society
- First use of citation analysis rather than publication counts
Published in Science 66: 385-389
22Relative impacts of journals
Journal Citation Reports
"JCR is still the only usable tool to rank thousands of scholarly and professional journals..." (Peter Jacso)
23Relative impacts of journals
Journal Citation Reports
24Relative impacts of journals
Journal Citation Reports
25Relative impacts of journals
Journal Citation Reports
26Relative impacts of journals
Journal Citation Reports
27Citation Indexing
Eugene Garfield - 1955 - Citation indexes for science: a new dimension in documentation through association of ideas
- Impact factor: influence of an article based on citations to it
- Science Citation Index
Published in Science 122: 108-111.
28Problems of indexing
- 1955: The interrelationship between the chemistry and the biological organisms of the soils of Cambodia.
- 1995: The soil ecology of Kampuchea
29Citation matrix
(Diagram: citing articles linked to the cited articles they reference)
30ISI Web of Science (1)
31ISI Web of Science (2)
32ISI Web of Science (3)
33ISI Web of Science (4)
34ISI Web of Science (5)
35Science Citation Index
Association-of-ideas index
(Diagram: citing articles linked to the cited articles they reference)
http://libweb.hawaii.edu/uhmlib/databases/er_title.html#WEB
36Co-citation analysis
Articles that cite the same article are likely to
both be of interest to the reader of the cited
article
(Diagram: two citing articles pointing to the same article. These two articles are likely to be related.)
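The idea on this slide can be sketched in a few lines of Python: for each pair of citing articles, count how many cited articles they have in common. The article labels and the `citations` mapping are hypothetical data made up for illustration. Note that the bibliometrics literature often calls this particular measure (two articles sharing a reference) bibliographic coupling.

```python
from itertools import combinations

# Hypothetical data: each citing article mapped to the set of articles it cites.
citations = {
    "A": {"X", "Y"},
    "B": {"X"},
    "C": {"Y", "Z"},
    "D": {"X", "Y"},
}


def shared_citation_counts(citations):
    """For each pair of citing articles, count the cited articles they share."""
    counts = {}
    for a, b in combinations(sorted(citations), 2):
        shared = citations[a] & citations[b]
        if shared:
            counts[(a, b)] = len(shared)
    return counts


related = shared_citation_counts(citations)
# List the pairs with the most shared references first.
for pair, n_shared in sorted(related.items(), key=lambda item: -item[1]):
    print(pair, n_shared)
```

Pairs with a higher count are the stronger candidates for "these two articles are likely to be related"; pairs with nothing in common are left out entirely.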
37Selecting productive journals
Samuel Clement Bradford - 1934 - Sources of information on specific subjects
- Purpose: to develop a means by which librarians could select the most usable periodicals
- First paper published on observations of scattering
- Bradford's Law
Published in Engineering 137: 85-86
38Bradford's Law of Scattering (1)
- "If scientific journals are arranged in order of decreasing productivity of articles on a given subject, they may be divided into a nucleus of periodicals more particularly devoted to the subject and several groups or zones containing the same number of articles as the nucleus, when the numbers of periodicals in the nucleus and succeeding zones will be as $1 : n : n^2 : n^3 \ldots$"
39Bradford's Law of Scattering (2)
No. of source journals   No. of articles per source   Total no. of articles
 1                       60                            60
 2                       35                            70
   Zone 1:  3 sources, 130 articles
 1                       30                            30
 2                       25                            50
 2                        9                            18
 4                        8                            32
   Zone 2:  9 sources, 130 articles
10                        6                            60
 7                        5                            35
 5                        4                            20
 5                        3                            15
   Zone 3: 27 sources, 130 articles
40Bradford's Law of Scattering (3)
3 sources 130 articles
9 sources 130 articles
27 sources 130 articles
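The zone totals above can be reproduced with a short Python sketch. It takes the (number of source journals, articles per source) pairs from the slide's table, ranks them by productivity, and cuts a new zone each time the running article total reaches one third of all articles. The function name and the rule of treating each table row as indivisible are assumptions made for this illustration.

```python
# (number of source journals, articles per source), from the slide's table.
rows = [(1, 60), (2, 35), (1, 30), (2, 25), (2, 9),
        (4, 8), (10, 6), (7, 5), (5, 4), (5, 3)]


def bradford_zones(rows, zone_count=3):
    """Split journals, ranked by productivity, into zones of equal article totals.

    Returns a list of (journals in zone, articles in zone) tuples.
    Assumption: each table row is kept whole rather than split across zones.
    """
    total = sum(journals * per_source for journals, per_source in rows)
    target = total / zone_count
    zones = []
    journals_in_zone = 0
    articles_in_zone = 0
    for journals, per_source in sorted(rows, key=lambda r: r[1], reverse=True):
        journals_in_zone += journals
        articles_in_zone += journals * per_source
        # Close the zone once it holds its share; the last zone takes the rest.
        if articles_in_zone >= target and len(zones) < zone_count - 1:
            zones.append((journals_in_zone, articles_in_zone))
            journals_in_zone = 0
            articles_in_zone = 0
    zones.append((journals_in_zone, articles_in_zone))
    return zones


zones = bradford_zones(rows)
print(zones)
```

Each zone holds 130 articles, while the journal counts grow as 3, 9, 27: a ratio of 1 : 3 : 9, so n = 3 for this data set.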
41George Kingsley Zipf 1935
The psycho-biology of language: an introduction to dynamic philology
- Frequency distributions of words
- Two laws: one for less frequently occurring words, one for frequently occurring words
Published by MIT Press
42Zipf's Law of High Frequency Words
Proposed in 1949 by George Kingsley Zipf
$r \cdot f = c$
Where:
- r = rank (in terms of frequency)
- f = frequency (no. of times the given word is used in the text)
- c = constant for the given text
- For a given text the rank of a word multiplied by the frequency is a constant.
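A quick way to try the law on any text is to rank its words by frequency and multiply each rank by the corresponding frequency. The Python sketch below does this; the function name, the tokenizing rule, and the sample sentence are hypothetical choices for illustration only.

```python
from collections import Counter
import re


def rank_frequency_table(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first.

    Assumption: words are runs of letters and apostrophes, lowercased.
    Words with equal frequency still receive distinct consecutive ranks.
    """
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]


sample = "the cat and the dog and the bird"
table = rank_frequency_table(sample)
for row in table:
    print(row)
```

A sentence this short will not show a constant product; the regularity only begins to appear among the high-frequency words of a long natural-language text, such as a whole book.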
43Application of Zipf's laws
William Goffman - automatic indexing
- Determine transition point between high- and low-frequency words
- Collect equal number of words above and below the transition point
- Eliminate trivial words using stop list
- Remaining content-bearing words indicate document contents
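The four steps above can be sketched in Python as follows. The slides do not give a formula for the transition point; the sketch uses one commonly attributed to Goffman, T = (-1 + sqrt(1 + 8 * I1)) / 2, where I1 is the number of words occurring exactly once, so treat that formula as an assumption. The function name, the `window` parameter, and the tiny stop list are likewise hypothetical.

```python
from collections import Counter
import math
import re

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it"}


def index_terms(text, window=3, stop_words=STOP_WORDS):
    """Pick candidate index terms around the transition point.

    Assumption: transition point T = (-1 + sqrt(1 + 8 * I1)) / 2,
    where I1 is the number of words that occur exactly once.
    """
    words = re.findall(r"[a-z]+", text.lower())
    freq = Counter(words)
    singles = sum(1 for f in freq.values() if f == 1)
    transition = (-1 + math.sqrt(1 + 8 * singles)) / 2

    ranked = freq.most_common()  # highest frequency first
    # Take `window` words on each side of the transition point.
    above = [w for w, f in ranked if f >= transition][-window:]
    below = [w for w, f in ranked if f < transition][:window]

    # Drop trivial words; what remains should be content-bearing.
    terms = [w for w in above + below if w not in stop_words]
    return transition, terms


text = ("soil soil soil soil the the the the the ecology ecology ecology "
        "chemistry chemistry cambodia kampuchea organisms")
transition, terms = index_terms(text)
print(transition, terms)
```

The very frequent word "the" falls well above the window and would also be caught by the stop list, while the words clustered around the transition point are kept as candidate index terms.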
44Obsolescence of resources
Charles F. Gosnell - 1944 - Obsolescence of books in college libraries
- Purpose: "to discover lines of trend or curves of distribution by means of which this rate of obsolescence may be expressed in mathematical form"
Published in College Res. Libr. 5: 115-125
45Curve of obsolescence
(Graph: number of users plotted against age at time of use)
46Alan Pritchard 1969
Statistical bibliography or bibliometrics?
- Coined the term "bibliometrics": "the application of mathematics and statistical methods to books and other media of communication"
Published in Journal of Documentation 25(4): 348-349
47Google indexing criteria
- Text within page being indexed to determine topic
- Links to page being indexed
- Anchor text of links to page being indexed (indication of topic)
- Weight links to page being indexed by links to the linking pages
For a good explanation of Bradford's Law of Scattering see...
48Google
Treating links as citations to compute PageRank
(Diagram labels: high-weight linkage, low-weight linkage)
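As a rough illustration of treating links as citations, the Python sketch below runs a simplified PageRank-style iteration over a tiny link graph. It is not Google's production algorithm: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the four-page graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links.

    links maps each page to the list of pages it links to.
    Assumptions: damping factor 0.85; a page with no outgoing links
    spreads its rank evenly over all pages.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A link from a highly ranked page passes on more weight.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web: A, B, and D all link to C; C links back to A.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page A receives only one link, but it comes from the highly ranked page C, so A ends up far above B and D: a high-weight linkage counts for more than a low-weight one.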
49Citation tree rings represent the citation
history of an article. The color of a citation
ring denotes the time of corresponding citations.
The thickness of a ring is proportional to the
number of citations in a given time slice.
Chen, C. 2006. CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. Journal of the American Society for Information Science and Technology 57(3): 359-377.
50Bibliometrics in Action
A time-zone view of mass-extinction research.
Chen, C. 2006. CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. Journal of the American Society for Information Science and Technology 57(3): 359-377.
52Adding bibliometric visualizations to digital
library search results
53Adding bibliometric visualizations to digital
library search results