Title: Knowledge structure for information professionals
1Knowledge structure for information professionals
- Lecture 2: Intro to bibliometrics
2Bibliometrics
- biblio: derived from biblion, the Greek word for book
- metrics: derived from metrikos, the Greek word for measurement
3Traditional definition
- The quantitative study of literatures as they are
reflected in bibliographies (White and McCain, 1989).
- The study of patterns of authorship, publication,
and literature use by applying various
statistical analyses (Lancaster, 1977).
4Alternative definition: the behavioral element
- the study of the application of statistical
methods to the study of ... human
document-related behavior,
- specifically, behavior that involves (i)
determining, (ii) expressing, or (iii) acting
on preferences for documents (Jonathan Furner)
5Reader/User-centered bibliometrics
- new kinds of document
- electronic documents, non-scholarly documents
- new kinds of behavior
- hypertext linking, mentioning, retrieval, usage,
purchasing (e.g., frequency of link activation,
hit rate, frequency of download, time spent viewing)
6Object and assumption (Furner)
- objective: the analysis of human preferences for
documents, in order to
- reward authors,
- recommend documents, or
- represent document-network structure
- assumption: any action of writing, publishing,
citing, reading, viewing, or buying a document at
any given time is the outcome of a decision to
select that document rather than any other
- i.e., the action is an expression of a preference
ordering over the universal set of documents
7Major areas of bibliometric research
- Indicators of research performance
- Citation analysis
- Bibliometric laws (Bradford's law, Lotka's law,
Zipf's law)
8Applications of bibliometrics
- in resource distribution
- identifying authors most worthy of promotion;
research areas most worthy of funding; journals most
worthy of purchase; etc.
- in collection development
- identifying the most-useful materials by
analyzing circulation records, journal / e-journal
usage statistics, etc.
- in information retrieval
- identifying top-ranked documents and authors: those
most highly cited, most highly co-cited, most popular, etc.
- in the sociology of knowledge
- identifying structural and temporal relationships
between documents, authors, research areas, etc.
9ISI Citation index
- The Web of Science is an enhanced web version of
the Institute for Scientific Information's
citation indexes, including the Science Citation
Index, the Social Sciences Citation Index, and
the Arts and Humanities Citation Index.
- The Web of Science measures the impact of
particular journals and is one of the sources
used by the National Research Council to rank
graduate programs.
10ISI indicators
- Impact factors
- B = total citations in 2001 to articles in
journal X published 1999-2000
- C = number of journal X articles published
1999-2000
- 2001 impact factor of journal X = B/C
- Immediacy index
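The impact factor ratio B/C above can be checked with a few lines of Python. This is only a minimal sketch: the function name `impact_factor` and the figures for journal X are hypothetical, chosen for illustration, and are not real journal data.

```python
def impact_factor(citations_b: int, articles_c: int) -> float:
    """Return B / C as defined on the slide.

    citations_b: citations in the target year (e.g. 2001) to articles
                 the journal published in the two preceding years.
    articles_c:  number of articles the journal published in those
                 two preceding years.
    """
    if articles_c <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_b / articles_c


# Hypothetical figures for journal X (illustrative only).
b = 600   # citations in 2001 to articles published 1999-2000
c = 240   # articles published 1999-2000
print(impact_factor(b, c))  # 2.5
```

A value of 2.5 would mean that, on average, each 1999-2000 article in the journal was cited two and a half times during 2001.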
11Cited half-life
- Definition: the number of years it takes for the
number of citations to decline to 50% of its
current total value.
- A measure of how long articles in a journal
continue to be cited after publication.
12Citing half-life
- Definition: the median age of all cited articles
in the journal during the year.
- A measure of how current the references cited in
the journal are.
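Both half-life measures can be read as a median age, so one small function may serve for either. The sketch below assumes a common reading of the definitions: counting back from the current year, the half-life is the number of years needed to account for at least half of the citations. It works in whole years only (published figures are usually interpolated to a fraction of a year), and the function name `half_life` and the sample counts are hypothetical.

```python
from typing import Optional


def half_life(counts_by_age: list[int]) -> Optional[int]:
    """Return the smallest age (in whole years) whose cumulative count
    reaches at least 50% of all citations, or None if there are none.

    counts_by_age[0] holds citations to items 1 year old,
    counts_by_age[1] to items 2 years old, and so on.
    """
    total = sum(counts_by_age)
    if total == 0:
        return None
    running = 0
    for age, count in enumerate(counts_by_age, start=1):
        running += count
        if 2 * running >= total:   # cumulative share has reached 50%
            return age
    return None


# Hypothetical citation counts grouped by age of the cited article.
counts = [10, 30, 40, 20, 15, 5]
print(half_life(counts))  # 3
```

For a cited half-life, the counts would be citations received by the journal this year, grouped by the age of the cited article. For a citing half-life, they would be the references made by the journal this year, grouped by the age of the referenced item.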
13Source (Amin & Mabe 2000)
14Citation analysis
- Bibliographic coupling
- Document co-citation analysis
- Author co-citation analysis
15Subject variation
Source (Amin & Mabe 2000)
16Bibliographic coupling
- The assumption: two documents that both cite the same
previously published docs have something in
common.
- The strength of bibliographic coupling depends on
the number of references the two papers have in
common.
- results in a cluster of citing documents
- is fixed and permanent
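Coupling strength as described above is simply the number of shared references, which maps onto a set intersection. The sketch below is illustrative only: the function name `coupling_strength` and the reference identifiers are hypothetical.

```python
def coupling_strength(refs_a: set[str], refs_b: set[str]) -> int:
    """Number of references that two citing papers have in common."""
    return len(refs_a & refs_b)


# Hypothetical reference lists of two citing papers.
paper_p = {"r1", "r2", "r3", "r4"}
paper_q = {"r3", "r4", "r5"}
print(coupling_strength(paper_p, paper_q))  # 2
```

Because a paper's reference list does not change after publication, this value never changes either, which is why the coupling is described as fixed and permanent.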
17Co-citation coupling
- Number of times two documents are jointly cited
in later publications,
- results in a cluster of cited docs (a research
front)
- Co-citation patterns change as the interests
and intellectual patterns of the field change
18Co-citation strength
- S = (co-citations of A and B) / (citations of either A or B)
- (Small 1973)
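The ratio above can be sketched in Python. This example assumes that "citations of either A or B" means the number of citing documents that cite at least one of the two, so that S lies between 0 and 1. The function name `cocitation_strength` and the sample data are hypothetical.

```python
def cocitation_strength(reference_lists: dict[str, set[str]],
                        doc_a: str, doc_b: str) -> float:
    """S = (citing docs that cite both A and B) / (citing docs that cite A or B)."""
    both = 0
    either = 0
    for refs in reference_lists.values():
        cites_a = doc_a in refs
        cites_b = doc_b in refs
        if cites_a and cites_b:
            both += 1
        if cites_a or cites_b:
            either += 1
    return both / either if either else 0.0


# Hypothetical citing papers and the documents each one cites.
citing = {
    "p1": {"A", "B"},
    "p2": {"A", "B", "C"},
    "p3": {"A"},
    "p4": {"B", "C"},
    "p5": {"C"},
}
print(cocitation_strength(citing, "A", "B"))  # 0.5
```

Unlike bibliographic coupling, this value can change over time: each newly published paper that cites A, B, or both alters the counts.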
19Fractional citation counting
- Reference length bias
- Each citing item has a total voting strength of
one, but divides that single vote equally among
all references it cites. If an item contains ten
references, each citation has a fractional weight
of 1/10
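The voting scheme described above can be sketched as follows. The function name `fractional_counts` and the sample reference lists are hypothetical, and the sketch assumes each reference is counted once per citing item even if it is listed twice.

```python
from collections import defaultdict


def fractional_counts(reference_lists: dict[str, list[str]]) -> dict[str, float]:
    """Give every citing item one vote, split equally among its references."""
    scores: dict[str, float] = defaultdict(float)
    for refs in reference_lists.values():
        unique_refs = set(refs)          # assumption: duplicates count once
        if not unique_refs:
            continue                     # an item with no references casts no vote
        weight = 1.0 / len(unique_refs)
        for ref in unique_refs:
            scores[ref] += weight
    return dict(scores)


# Hypothetical citing items: one with ten references, one with two.
citing = {
    "long_review": [f"r{i}" for i in range(10)],   # each reference gets 1/10
    "short_note": ["r0", "r1"],                    # each reference gets 1/2
}
scores = fractional_counts(citing)
print(round(scores["r0"], 3), round(scores["r5"], 3))
```

Here a citation from the short note carries five times the weight of a citation from the long review, which is how the scheme compensates for reference length bias.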
20Link-based indexing
PageRank relies on the democratic nature of the
web by using its vast link structure as an
indicator of an individual page's value. An
in-link to a page is considered a vote for the
authority of the page.
21PageRank
- Votes cast by pages that are themselves
"important" weigh more heavily and help to make
other pages "important."
- PR(A) = (1 - d) + d * (PR(t1)/C(t1) +
PR(t2)/C(t2) + ... + PR(tn)/C(tn))
- PR is the PageRank of a page
- t1 ... tn are the pages linking to page A
- C is the number of outbound links that a page has
- d is a damping factor, usually set to 0.85
- In plain language
- A page's PageRank = 0.15 + 0.85 * (a share of
the PageRank of every page that links to it)
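Because each page's PageRank depends on the PageRank of the pages linking to it, the formula is usually evaluated by repeated substitution. The sketch below applies the slide's formula a fixed number of times to a tiny link graph. It is illustrative, not a production implementation: the function name `pagerank`, the starting value of 1.0, the iteration count, and the three-page graph are all assumptions, and every page in the example has at least one outbound link.

```python
def pagerank(out_links: dict[str, list[str]],
             d: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Iterate PR(A) = (1 - d) + d * sum(PR(t) / C(t)) over pages t linking to A."""
    # Collect every page that appears as a source or as a link target.
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)

    pr = {page: 1.0 for page in pages}   # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page passes on an equal share of its own rank.
            incoming = sum(pr[src] / len(targets)
                           for src, targets in out_links.items()
                           if page in targets)
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph page C collects votes from both A and B, so it ends up with the highest rank, and A ranks above B because its single in-link comes from the highly ranked C.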
22The reasons to cite
23Cited document as concept symbols
- By condensing or capsulizing a complex
original text into a few standard statements, the
community of scientists can more easily confirm,
refute or build upon the earlier work. This
serves the needs of the specialty by enabling
work to go on unencumbered by the necessity of
unraveling the complete meaning and implications
of the earlier text, even though this may result
in the distortion or oversimplification of the
original
- (Small 1978, p. 338)