Bibliometrics uncovered: principles

1
Bibliometrics uncovered: principles & practices
of citation-based research evaluation
  • Rosarie Coughlan
  • Research Support Librarian (STM)
  • Rosarie.coughlan@nuigalway.ie

2
Outline
  • Research evaluation using bibliometrics: the
    context
  • What to measure & when
  • Tools to retrieve the data
  • Limitations & considerations
  • Evolving models & metrics
  • Bibliometrics: best practice
  • Influencing the debate

3
Drivers for increased evaluation and assessment
4
Scholarship is increasingly collaborative & global
Largest collaboration in 2006: 2,512 authors, a
collaboration of collaborations
5
Justify budget of €2bn, colleges warned
UNIVERSITIES and colleges will have to justify
their €2bn budget as part of an unprecedented
government investigation into third-level
spending. By Ralph Riegel and John Walshe, Irish
Independent, Thursday August 14 2008
6
  • According to Higher Education Authority
    figures, the number of permanent
    research-oriented staff in universities has risen
    30 per cent in just six years. No wonder the
    Department of Education can't devote resources to
    reducing pupil/teacher ratios or helping autistic
    children. It's too busy funding research papers
    on post-modernist underwater hang gliding
    techniques in Outer Mongolia

Marc Coleman, Sunday Independent, May 18th 2008
7
  • The metrics of bibliometrics


8
Bibliometrics
  • Bibliometrics
  • the discipline of measuring the performance of
    a researcher, a collection of articles, a
    journal, a research discipline or an institution.
  • This process involves the application of
    statistical analyses to study patterns of
    authorship, publication, and literature use
    (Lancaster 1977).
  • Citation analysis
  • Citation analysis involves counting how many
    times a paper or researcher is cited by other
    scholars in the field. This performance measure
    assumes that influential scientists and important
    works are cited more often than others.

Citation counts are a measure of impact, and
impact is closely related to quality. Nonetheless,
the two concepts are not synonymous!
9
Stakeholders in research evaluation
Thomson (2007) Using Bibliometrics
10
What to measure (1)?
  • Academic Peer Review, Employer Review, Faculty/
    Student Ratio, Citations per Faculty,
    International Faculty, International Students,
    Teaching Excellence, Student Satisfaction,
    Heads/Peer Assessments, A-level/Higher Points,
    Unemployment, Firsts/2.1s Awarded, Total Papers,
    Total Citations, Citation Impact (cites per
    paper), Percent Cited, Paper Impact Relative to
    Field, Percentile Rank in Field, Collaboration
    Indicators, Expected Citation Count, Ratio of
    Citations to Expected Citation Count, Expected
    Citation Rate for Category, Mean/Median Citation,
    H-Index, Citation Frequency Distribution,
    Time Series Trends, Eigenfactor, Impact Factor,
    Bibliometrics, Scientometrics, Informetrics,
    Citation Analysis, Webometrics, Virtual
    Ethnography, Web Mining, Conference Papers,
    Number of Patents, Co-Citation, Immediacy Factor,
    PageRank, Weighted In-Degree, Weighted
    Out-Degree, In-Degree Entropy, Out-Degree
    Entropy, g-index

11
When to measure (2): variables of scale?
"higher education's most critical goals are
difficult, if not impossible, to
measure" (Birnbaum, 2001, p.84)
  • (1) Applicability
  • Impact vs. quality
  • Scholarly publishing practices
  • Knowledge dissemination methods different in SSH
  • Regional/ national significance (regional
    readership?)
  • Field or discipline variation
  • Not all influences are counted, e.g. books, gov
    pubs, grey literature etc.
  • (2) Accuracy
  • Citation bias may exist
  • Publication exclusion
  • Same name authors
  • English language bias
  • Bias to international titles
  • Number of authors (distribution of work)
  • Cronyism
  • (3) Validity
  • Quality vs. quantity
  • Time-span
  • Only a small percentage of articles are highly
    cited
  • Controversial papers

12
  • Questions
  • Evaluating research is difficult! "Not everything
    that counts is countable, and not everything that
    is countable counts." (Albert Einstein,
    1879-1955)
  • but
  • Is a partial portrait an invalid portrait?

13
What to measure?
"higher education's most critical goals are
difficult, if not impossible, to
measure" (Birnbaum, 2001, p.84)
Funding Proposals?
Tenure & Promotion?
Field Strengths / Collaborators?
Publication Dissemination?
University Rankings?
14
The tools: strengths & weaknesses
  • S - Complete indexing of authors' addresses
  • Complete indexing of a known proportion of
    academic journals
  • Multidisciplinary coverage
  • Indexing of citations
  • Better foreign language coverage
  • 1200 Open Access Journals
  • Increased SSH pubs, including journals, book
    chapters
  • 3500 titles (June 2009)
  • ______________________________
  • W - limited coverage
  • English language bias
  • exclusion of certain types of documents
  • classification of journals by discipline
  • changes in journal titles
  • names spelled the same way (homographs)
  • (SCI, SSCI AHSSCI)
  • S - Complete indexing of authors' addresses
  • Complete indexing of a known proportion of
    academic journals
  • Multidisciplinary coverage
  • Indexing of citations
  • International coverage
  • ______________________________
  • W - limited coverage
  • English language bias
  • exclusion of certain types of documents
  • classification of journals by discipline
  • changes in journal titles
  • names spelled the same way (homographs)
  • S - Peer-reviewed papers, theses, books,
    abstracts, and other scholarly literature
  • Variety of publishers & prof. socs
  • OA Journals / Institutional Repositories
  • View citing articles
  • Articles ranked by weighing the full text of each
    article, author, and publication by relevance &
    times cited
  • Full-text via NUI Galway Library
  • Link to full-text via Library (on campus)
  • _________________________
  • W
  • Data dirty - duplication
  • Data structure not suited for citation analysis;
    these points ruin its great potential for a wide
    range of subjects

15
Issues (1): mining the data - attribution
  • NUIGalway
  • NUI, Galway
  • NUI Galway
  • NUIG
  • National University of Ireland Galway / National
    University of Ireland, Galway
  • UCG
  • University College Galway
  • University College Hospital Galway
  • UCHG
  • University Hospital Galway
  • University Hospital
  • OÉ Gaillimh
  • Ollscoil na hÉireann Gaillimh
  • Ollscoil na hÉireann, Gaillimh


14 of 83 for National University of Ireland,
Galway
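The attribution problem above can be illustrated in code. Below is a minimal sketch, not any tool's actual method, of collapsing the affiliation variants listed on the slide to one canonical name before counting outputs; the normalisation rules and canonical form are illustrative assumptions (accent folding for the Irish-language variants is omitted).

```python
# Hypothetical sketch: map institutional name variants to one canonical
# affiliation before counting outputs. The variant list comes from the
# slide; the normalisation rules are illustrative only.
CANONICAL = "National University of Ireland, Galway"

VARIANTS = {
    "nuigalway", "nui galway", "nuig",
    "national university of ireland galway",
    "university college galway", "ucg",
    # Irish-language forms would also need accent folding (not shown)
}

def normalise(affiliation: str) -> str:
    """Collapse case and punctuation, then test against known variants."""
    key = "".join(ch for ch in affiliation.lower() if ch.isalnum() or ch == " ")
    key = " ".join(key.split())  # squeeze runs of whitespace
    return CANONICAL if key in VARIANTS else affiliation

print(normalise("NUI, Galway"))  # -> National University of Ireland, Galway
```

Unmatched affiliations pass through unchanged, which is how roughly one in five outputs can be lost from an institutional count, as the next slide shows.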
16
Impact of inconsistent identifier use
Loss of output: approx. 19%
Source: Scopus
17
Issue (2): mining the data - author name syntax
  • Murphy, Paul V.
  • Murphy, P. V.
  • Murphy, P.
  • Murphy, Paul
  • Murphy, Paul Vincent

Use of multiple author name variants
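One way to pool outputs scattered across such variants is to reduce each name to a surname-plus-first-initial key. This is an illustrative assumption, not the method any citation index uses, and it deliberately shows the homograph risk: genuinely different "Murphy, P." authors would also be merged.

```python
# Illustrative sketch: reduce author-name variants to a surname +
# first-initial key so outputs for one person can be pooled.
# The keying rule is an assumption and will conflate homographs.
def author_key(name: str) -> str:
    surname, _, given = name.partition(",")
    initial = given.strip()[:1].upper()
    return f"{surname.strip().lower()}:{initial}"

variants = ["Murphy, Paul V.", "Murphy, P. V.", "Murphy, P.",
            "Murphy, Paul", "Murphy, Paul Vincent"]

keys = {author_key(v) for v in variants}
print(keys)  # all five slide variants collapse to a single key
```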
18
Examples
Institutions
Countries
Research Groups
Papers
Journals
Authors
19
  • Benchmarking countries

Bibliometric outputs
20
Data source
21
Jiao Tong University in Shanghai: Top 500
Universities
22
European Commission Science Indicators 2007
23
  • Benchmarking countries
  • (a bibliometric overview)

(1) ISI Essential Science Indicators
24
Which country has the highest impact in Biology &
Biochemistry?
  • by
  • Citation
  • Papers
  • Cites per paper

Source: ISI Essential Science Indicators
(coverage 1954- )
25
What are the highest ranking disciplines
nationally?
  • What are the highest ranking disciplines
    nationally by
  • Citation
  • Papers
  • Cites per paper

Source: ISI Essential Science Indicators
(coverage 199?)
26
  • Benchmarking institutions
  • (a bibliometric overview)
  • ISI Essential Science Indicators
  • Scopus

27
What are the emerging areas of research
excellence, institution n.? (ISI Essential Science
Indicators (c1954-))
NB data derived from ISI journal categories
28
Sample: Irish HEIs by discipline: Biochemistry,
Genetics & Molecular Biology (Scopus 1996 - )
  • Getting the most from the metric to identify
    local / inter-institutional hotspots
  • Express it in relative terms, i.e. number of
    scholars and level of funding (productivity
    indicator)
  • Expected citation ratio (aligned to world
    discipline average) / percentile position of
    the paper based on citations in the same field.
  • Disciplinarity - level of multidisciplinarity in
    a set of papers from 0 to 1 (the lower the
    number, the greater the multidisciplinarity).
    (Herfindahl Index)

NB data derived from Scopus journal categories
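The Herfindahl-style disciplinarity score mentioned above is simply the sum of squared discipline shares in a paper set: close to 1 means the set is concentrated in one field, lower values mean greater multidisciplinarity. A minimal sketch, with invented category labels:

```python
# Herfindahl index as a disciplinarity score: sum of squared discipline
# shares in a paper set. Near 1 = monodisciplinary; lower = more
# multidisciplinary, as the slide notes. Labels are invented.
from collections import Counter

def herfindahl(categories: list[str]) -> float:
    counts = Counter(categories)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

mono  = ["Biochemistry"] * 10
mixed = ["Biochemistry"] * 4 + ["Genetics"] * 3 + ["Molecular Biology"] * 3

print(herfindahl(mono))   # 1.0 (single discipline)
print(herfindahl(mixed))  # ~0.34 (more multidisciplinary)
```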
29
  • Benchmarking disciplines
  • (a bibliometric overview)
  • ISI Essential Science Indicators
  • (2) Scopus (Elsevier)
  • (3) ISI Web of Science

30
What are the highest ranking disciplines within
the University (using citation analysis)?
NB data derived from Scopus journal categories
31
  • Benchmarking authors / groups

Bibliometric outputs
Scopus ISI Web of Science
32
Pro-Intelligent Design Astronomer Denied Tenure
Ranks Top in His Department According to
Smithsonian/NASA Database, Say Analysts (The
Chronicle of Higher Education, 1st June 2007)
Mr. Gonzalez has a normalized h-index of 13, the
second highest of the 10 astronomers in his
department. The only person who ranks higher is
Curtis J. Struck, a professor with an h-index of
17.
33
Metrics
  • Total number of pubs
  • Total number of cites
  • H-index
  • J. E. Hirsch (2005): number of papers (N) in a
    given dataset having N or more citations.
  • Number of papers in journals with IF > N.
  • Key variables
  • career profile
  • Early career researchers!
  • Only meaningful when compared to others within
    the same discipline area (e.g. Life Sciences vs
    Physics)
  • Bias against exceptional papers:
  • Paper A: 1000 cites
  • Paper B: 3 cites!
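Hirsch's definition above translates directly into code: rank papers by citations and find the largest N such that the N-th paper has at least N citations. The citation counts are invented for illustration; note how the two-paper example from the slide yields h = 2 regardless of how exceptional Paper A is.

```python
# h-index as defined on the slide: the largest N such that N papers in
# the set have at least N citations each. Counts are invented.
def h_index(citations: list[int]) -> int:
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

print(h_index([1000, 3]))         # 2 - the 1000-cite paper adds nothing extra
print(h_index([10, 8, 5, 4, 3]))  # 4
```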

34
Data integrity: author profiles in citation
indices (1)
Select relevant author name variants
35
Ensuring author profile accuracy for improved
precision tracking
  • Checklist
  • Coverage
  • Publication years
  • Inclusion of all publication types (i.e. peer
    review articles and conference proceedings)
  • Accuracy (name syntax)
  • Are all possible name variants for your profile
    included?
  • Are there any gaps?
  • Are there any inaccuracies?
  • How an author set may be affected by a name
    change
  • Accuracy (affiliation syntax)
  • Your current affiliation
  • Your affiliation history?
  • Accuracy (paper groups)
  • Whether all papers in one set are indeed by the
    same author
  • Whether an article has been omitted from a set
  • Select the feedback link
  • Use your NUIG email (this is used as verification
    by Scopus)
  • 3-4 weeks time lag
  • You will receive notification when your profile
    has been updated

36
Example
  • Functions
  • Top cited papers
  • citations
  • H-index

37
  • The h-index: number of papers (N) in the list
    that have N or more citations.
  • Discounts disproportionate weight of highly cited
    papers or papers that have not yet been cited.

Scopus h-index: 42
38
ISI Web of Science: Distinct Author ID
Select "Provide Feedback" to update your
profile. Can also do an institutional batch load.
39
  • Assessing grant publication productivity & impact
    (using citation analysis)

Scopus ISI Web of Science Google Scholar
40
Assessing publication impact
  • Sources
  • Researchers in the field/discipline
  • Institutions / Departments / Labs
  • Journals, core collections within the discipline
  • Discipline impact by
  • Citing researchers
  • Journals
  • Institutions / Departments / Labs
  • Knowledge transfer
  • Across disciplines
  • Across countries

41

Cited Reference Searching
[Diagram: traditional search vs. cited reference
search, each tracing a chain of papers from 2004
and 2003 through 2001, 1996, 1993, 1987, 1982
and 1957]
42
Cited Reference Search ISI Web of Science
  • Esashi F, Christ N, Cannon J, Liu Y, Hunt T,
    Jasin M, West SC: CDK-dependent phosphorylation
    of BRCA2 as a regulatory mechanism for
    recombinational repair. Nature 2005,
    434(7033):598-604.

Funded by Cancer Research UK
43
ISI Web of Science (1)
View citing articles: 2nd-generation cites =
indirect recognition (must be normalised to
field)
Find related: articles that share the same
references
View Journal IF
44
ISI Web of Science (2)
Analyse by: Author, Doc type, Source title, Subject
45
Scopus (1)

Receive RSS feed for new citing articles
Source: Scopus. Cited 112 times
46
Scopus (2) Results analysis
Knowledge transfer: what are the
domains/disciplines using the knowledge created
in the original work?
Journals where the citing research is being
published
Authors?
47
Google Scholar

Google Scholar: cited 107 times
48
Benchmarking authors / group evolving metrics
  • Authors/ Groups
  • Successive h-index - h2 index (h individuals with
    an h-index of at least h)
  • hP index (ranked list of authors & their number
    of pubs)
  • hC index (ranked list of authors & their number
    of cites)
  • Market power value
  • Social network analysis (co-authorship /
    co-publication collaboration activity)
  • Webometrics
  • Network-based metrics: see MESUR project - seeks
    to capture ALL interaction as attention metrics,
    i.e. PDF downloads etc.: when, where, who, why?
    Clickstream data yields
  • Behaviours between web sites & web pubs
  • Google Web-URL citations
  • Google Analytics: who, how, where web searchers
    are accessing sites; time periods, i.e. peaks of
    activity etc.
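The successive h2 index in the list above applies the h-index rule one level up, to a group: the largest h such that h researchers in the group each have an individual h-index of at least h. A minimal sketch, with invented individual h-index values:

```python
# "Successive" h2 index: the largest h such that h researchers in the
# group each have an individual h-index of at least h. Values invented.
def h2_index(individual_h: list[int]) -> int:
    ranked = sorted(individual_h, reverse=True)
    return max((i for i, h in enumerate(ranked, 1) if h >= i), default=0)

group = [17, 13, 9, 6, 4, 2]  # individual h-indices in a department
print(h2_index(group))        # 4 - four members have h-index >= 4
```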

49
  • Measure field trends or journals

ISI Web of Knowledge Journal Citation Reports
50
Journal evaluation criteria
  • Peer evaluation: respondents in specific
    disciplines are asked to rate or classify the
    journals into several categories, so that
    rankings can be produced.
  • Journals' compliance with publication criteria,
    i.e. periodicity, blind review of manuscripts
    and so on.
  • Diffusion: diffusion of journals through a
    variety of methods: presence in the main
    international databases of their discipline,
    presence in libraries, inclusion in directories
    of periodicals, presence on the Internet, or
    inclusion of contributions from foreign authors.
  • Citation analysis

51
Citation counts as a measure for evaluation (1)
  • Measure
  • Numbers of publications: level of production of
    new knowledge
  • Number of citations
  • Citation networks: establishing knowledge
    transfer across titles / disciplines
  • Metrics
  • Impact Factor
  • Eigenfactor Score
  • Article Influence Score
  • SCImago Journal Rank
  • ISI Essential Science Indicators
  • SCImago Journal Rank (data source: Scopus)

52
The Impact Factor (1)
[Timeline diagram: 2008 Impact Factor vs.
Immediacy Index, spanning 2008, 2007, 2006 and
all previous years]
Impact Factor: citations during the current
year (2008 in this example) to articles
published within the prior two years.

     No. of citations in 2008 to articles
     published in the journal in 2007 and 2006
IF = _____________________________________
     No. of articles published in the journal
     in 2007 and 2006
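The slide's two-year formula can be computed directly: citations in year Y to items from years Y-1 and Y-2, divided by the number of articles published in those two years. The numbers below are invented for illustration.

```python
# Two-year impact factor per the slide's formula: citations in year Y
# to items published in Y-1 and Y-2, over the article count for those
# two years. Numbers are invented.
def impact_factor(cites_y1: int, cites_y2: int,
                  articles_y1: int, articles_y2: int) -> float:
    return (cites_y1 + cites_y2) / (articles_y1 + articles_y2)

# e.g. 2008 IF: cites in 2008 to 2007/2006 papers over 2007/2006 output
print(impact_factor(cites_y1=320, cites_y2=280,
                    articles_y1=150, articles_y2=150))  # 2.0
```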
53
The Impact Factor (2)
  • Interpreting the metric
  • Negates research longevity: a given IF for any
    journal only presents an average - it cannot be
    used to measure the performance of an individual
    author.
  • Quantity vs. quality
  • Negative cites & citation bias
  • Time
  • Not all research work is published and cited in
    the citation indices.
  • Different fields of research publish at different
    rates, i.e. Biomedicine vs. Engineering.
  • Internationally recognized measurement of journal
    influence
  • Shows the most influential journals by
    discipline, publisher etc.
  • Used mainly for
  • making decisions on where to publish
  • making decisions on which journals to subscribe
    to (cost effectiveness)
  • Calculated against activity in the 2 previous
    years

54
Journal impact: the Impact Factor

55
Journal impact: number of articles published


56
SCImago Journal Rank (2008)
Data source: Scopus
  • The SJR indicator attributes different weight to
    citations depending on the "prestige" of the
    citing journal (excl. journal self-citations);
    prestige is estimated using the Google PageRank
    algorithm on the network of journals.
  • The SJR indicator includes the total number of
    documents of a journal in the denominator of the
    relevant calculation.
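The PageRank-style idea behind this can be sketched with a toy power iteration: a citation from a prestigious journal is worth more than one from an obscure one. The three-journal network below is invented; real SJR additionally normalises by article counts and applies its own damping and self-citation rules, so this is only the underlying principle.

```python
# Toy power-iteration sketch of PageRank-style journal prestige: a
# citation from a prestigious journal counts for more. The 3-journal
# network is invented; real SJR adds further normalisations.
def prestige(links: dict[str, list[str]], damping=0.85, iters=50):
    journals = list(links)
    rank = {j: 1.0 / len(journals) for j in journals}
    for _ in range(iters):
        new = {j: (1 - damping) / len(journals) for j in journals}
        for src, targets in links.items():
            share = damping * rank[src] / len(targets)
            for t in targets:
                new[t] += share  # pass prestige along each citation link
        rank = new
    return rank

# A cites B and C; B cites C; C cites A (self-citations excluded)
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
for j, r in sorted(prestige(links).items(), key=lambda x: -x[1]):
    print(j, round(r, 3))
```

Journal C ends up ranked highest because it receives citations from both other journals, including the well-cited A.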

57
Reputation metrics: citation networks
58
Prestige metrics, e.g. UK ABS Academic Journals
Quality Guide
  • Criteria used to rank journals include:
  • Originality
  • Well executed etc.
  • Top journals have the highest citation Impact
    Factor in their field
  • (IF available for 493 out of 1041 journals in
    the ABS guide)
  • Academic Journals Quality Guide - aim: "to
    benefit membership and academics to make
    informed decisions, whether at the level of the
    business school or at the level of the
    individual academic"

59

Outputs
Books, monographs, book chapters, reports,
working papers etc
60
Citation counts as a measure for evaluating books &
other publications (1)
  • Measure
  • Numbers of publications: level of production of
    new knowledge
  • Metrics & means
  • Classification based on scholars' quality
    perceptions, e.g. case study: Flemish Law
  • single & multi-authored books
  • PhD theses
  • Pubs of length > 5 pages
  • Library collections analysis
  • Number of academic library copies per book title
  • E.g. use of WorldCat (Lianmans, CWTS 2007)
  • Google Books (GB world books unknown)
  • Journal & book weights
  • Lists of journals & publishers rated in surveys
    of national & international experts; trimmed
    statistical weights computed (Lewel et al., 1999;
    Moed et al., 2002; Nederhof et al., 2001)
  • need application to international standards

61
Evaluating books & other publications - evolving
metrics (2)
Opportunities for evaluation of heterogeneous
publication types
  • Establishing robust institutional systems
  • Research Support System
  • Institutional Repositories
  • Libraries: key agencies in helping to collect &
    curate research outputs, analysis &
    interpretations

Outputs, outputs, outputs
62
Limits of bibliometrics
If you want to fatten a pig, you don't keep
weighing it.
63
64
Bad Science: Funding, findings & the impact
factor (Guardian, 14th Feb 2009)
"studies of flu shots funded by pharmaceutical
companies are more likely to be published in
prestigious journals than those funded by other
sources, in spite of the fact that they have the
same sample size and comparable
methodology" (study by Thomas Jefferson,
Cochrane Vaccine Institute: "Relation of study
quality, concordance, take home message, funding,
and impact in studies of influenza vaccines:
systematic review")
65
Limitations: the tools
  • Applicability
  • Accuracy
  • Validity
  • WoS & Scopus not strong in coverage of humanities
    journals
  • Not strong on non-English sources
  • The humanities, engineering, computer science are
    less dependent on journals than other subject
    areas
  • Google Scholar data is very dirty and there is
    duplication; the data structure is not suited for
    citation analysis; these points ruin its great
    potential for a wide range of subjects
  • Limited number of articles in any indices
  • Same-name authors, also known as homographs

THES-QS 2008: net change of rank by country
66
  • Evolving metrics!

67
Evolving metrics (1)
  • Institutions
  • G Factor: links to university web sites by other
    international universities (perspective based)
  • Authors/ Groups
  • Successive h-index - h2 index (h individuals with
    an h-index of at least h)
  • hP index (ranked list of authors & their number
    of pubs)
  • hC index (ranked list of authors & their number
    of cites)
  • Market power value
  • Topic / fields
  • Networks of Science: social network analysis
    (co-authorship / co-publication collaboration
    activity)
  • Journals
  • Journal h-index
  • Eigenfactor
  • Diffusion Factor
  • Webometrics
  • Network-based metrics: see MESUR project - seeks
    to capture ALL interaction as attention metrics,
    i.e. PDF downloads etc.: when, where, who, why?
    Clickstream data yields
  • Behaviours between web sites & web pubs
  • Google Web-URL citations
  • Google Analytics: who, how, where web searchers
    are accessing sites; time periods, i.e. peaks of
    activity etc.

68
Evolving metrics (2): Open Access
  • Citations (C)
  • CiteRank
  • Co-citations
  • Downloads (D)
  • C/D Correlations
  • Hub/Authority index
  • Chronometrics
  • Latency/Longevity
  • Endogamy/Exogamy
  • Book citation index
  • Research funding
  • Students
  • Prizes
  • h-index
  • Co-authorships
  • Number of articles
  • Number of publishing years
  • Semiometrics (latent semantic indexing, text
    overlap, etc.)

Source: Harnad, S. (2008) Metrics & Mandates
69
  • Establishing Best Practices
  • Consider whether available data can address the
    question
  • Choose publication types, field definitions, and
    years of data
  • Decide on whole or fractional counting
  • Judge whether data require editing to remove
    artifacts
  • Ask whether the results are reasonable
  • Use relative measures, not just absolute counts
  • Obtain multiple measures
  • Recognize the skewed nature of citation data
  • Confirm data collected are relevant to question
  • Compare like with like

A 2-pronged approach: it takes experts to
evaluate experts
70
Considerations
  • Relevant and appropriate
  • Are metrics correlated with other performance
    estimates?
  • Do metrics really distinguish excellence as we
    see it? Are these the metrics the researchers
    would use?
  • Cost effective
  • The beauty of citations: data accessibility,
    coverage, cost & validation
  • Few studies undertaken of specialised databases
    (indexing non-journal pubs); validity & coverage
    yet to be proven
  • Transparent, equitable and stable
  • Is it clear what the metrics do?
  • Are all institutions, staff & subjects treated
    equitably?
  • Peer review?
  • Influence on behaviour?
  • Articles - publish less / only very best papers?
  • Collaborate more intensely?
  • Search engine optimisation?
  • Strategic planning & benchmarking
  • Are bibliometrics retrospective only? Or do they
    provide a reliable predictive value with which to
    evaluate future research strategy i.e. time
    series trends?

71
The Library supporting institutional research
evaluation
  • Provide customized research performance profiles
    and workshops for disciplines and schools
  • Getting Published & Making an Impact
  • Benchmarking Your Research Performance using
    Bibliometrics
  • University rankings
  • Local, national & international research
    benchmarking requirements
  • Advocacy and awareness of evolving metrics,
    measures and their application
  • Customized bibliometric data profiles
  • Data analysis of university profile in key
    citation indices
  • Research Office
  • Quality Office
  • Deans of Research
  • Research / Centre Heads / PIs
  • Library colleagues

Key collaborations
72
Some conclusions
  • Evaluating research is difficult! "Not everything
    that counts is countable, and not everything that
    is countable counts." (Albert Einstein,
    1879-1955)
  • Do the metrics corroborate or validate peer
    review, and equally does peer review moderate the
    metrics?
  • Need to move towards research impact, not outputs
  • A 2-pronged approach: it takes experts to
    evaluate experts
  • Use multiple metrics, fit for purpose and to best
    advantage
  • Content is king, but context is God: maximising
    research tracking & dissemination exposure =
    greater impact!

73
Bibliometrics uncovered: principles & practices
of citation-based research evaluation. Thank You!
  • Rosarie Coughlan
  • Research Support Librarian (STM)
  • Rosarie.coughlan@nuigalway.ie