CiteSearch: Multifaceted Fusion Approach to Citation Analysis


1
CiteSearch: Multi-faceted Fusion Approach to
Citation Analysis
  • Kiduk Yang and Lokman Meho
  • Web Information Discovery Integrated Tool
    Laboratory
  • School of Library and Information Science,
    Indiana University
  • January 18, 2007

2
CiteSearch: What, Why, How
  • Goal
  • Quality Assessment of Scholarly Publications
  • Motivation
  • Lack of comprehensive citation database
  • Limitations of conventional citation analysis
  • One-dimensional assessment
  • Misleading evaluation
  • Approach
  • Multi-faceted, Fusion-based Citation Analysis
  • Combine data from multiple citation databases
  • Assess quality using various quality evaluation
    measures

3
CiteSearch Study Overview
  • Objectives
  • Investigate current citation analysis environment
  • Test the viability of CiteSearch system
  • Method
  • Search citation databases and compare the results
  • Setup
  • Study sample
  • Publications of 15 SLIS faculty members (approx.
    1,100 publications)
  • Databases used
  • Google Scholar, Scopus, Web of Science
  • Citation sources
  • Journals and conference papers in 1996-2005

4
Citation Databases
  • Data collection
  • WoS: 100 hours
  • Scopus: 200 hours
  • GS: over 3,000 hours

5
Scopus and WoS Citation Count
  • Scopus vs. WoS
  • 14.0% (278) more citations by Scopus
  • More comprehensive coverage by Scopus (15,000 vs.
    8,700 periodicals)
  • Scopus & WoS
  • Scopus increases WoS citations by 35% (710)
  • WoS increases Scopus citations by 19.0% (432)
  • Relatively low overlap (58%) and high uniqueness
    (42%)

[Venn diagram: Web of Science (2,023) and Scopus (2,301). Scopus only: 26% (710); overlap: 58% (1,591); WoS only: 16% (432); Scopus ∪ WoS: 2,733]
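
The percentages on the Scopus and WoS slide follow from three region counts: citations found only by Scopus, only by WoS, and by both. The sketch below recomputes them from the counts reported on the slide. The function name `overlap_stats` and the dictionary keys are illustrative, not part of CiteSearch.

```python
def overlap_stats(only_a: int, both: int, only_b: int) -> dict:
    """Summarize how two citation sources overlap, given the three region counts."""
    union = only_a + both + only_b
    total_a = only_a + both
    total_b = only_b + both

    def pct(part: int, whole: int) -> float:
        return round(100 * part / whole, 1)

    return {
        "total_a": total_a,
        "total_b": total_b,
        "union": union,
        "overlap_pct": pct(both, union),            # share of the union found by both
        "unique_pct": pct(only_a + only_b, union),  # share found by exactly one source
        "a_adds_to_b_pct": pct(only_a, total_b),    # how much A increases B's count
        "b_adds_to_a_pct": pct(only_b, total_a),    # how much B increases A's count
    }


# Counts from the slide: A = Scopus, B = Web of Science.
stats = overlap_stats(only_a=710, both=1591, only_b=432)
print(stats)
```

With these inputs the totals come out to 2,301 (Scopus), 2,023 (WoS), and 2,733 (union), matching the slide, and the rounded percentages agree with the reported 58% overlap, 42% uniqueness, 35% increase from Scopus, and roughly 19% increase from WoS.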
6
Impact of Scopus By Research Area
- varies significantly between research areas
7
Impact of Scopus on Faculty Members' Relative
Ranking
Scopus significantly alters the relative ranking
of those faculty members that appear in the
middle of the rankings
8
Scopus & WoS Citation Count By Document Type
[Venn diagram, conference papers only: WoS (229) and Scopus (359). Scopus only: 54% (267); WoS only: 28% (137); overlap: 18% (92); Scopus ∪ WoS: 496]
9
Scopus & WoS Summary of Results
  • Coverage
  • Varies greatly between research areas
  • Increase in citations ranges from 5% to 99% by
    combining results from both databases
  • Scopus has a much better coverage of conference
    proceedings
  • Overlap: 18%
  • Scopus only: 54%
  • WoS only: 28%
  • Ranking by citation count
  • Relative ranking of faculty members changes
    significantly for those in the middle

10
Google Scholar Citations By Document Type
11
Citations By Language
12
Impact of GS By Research Area
13
Impact of GS on Faculty Members' Relative Ranking
GS does not significantly alter the rankings of
faculty members
14
GS vs. Scopus∪WoS
  • GS increases WoS∪Scopus citations by 93% (2,552)
  • Scopus∪WoS increases GS citations by 26% (1,104)
  • GS identifies 53% (or 1,448) more citations than
    WoS∪Scopus
  • GS has much better coverage of conference
    proceedings
  • (1,849 by GS vs. 496 by Scopus∪WoS)
  • GS has over twice as many unique citations as
    Scopus∪WoS
  • (2,552 vs. 1,104, respectively)

[Venn diagram: Google Scholar (4,181) and Scopus∪WoS (2,733). GS only: 48% (2,552); overlap: 31% (1,629); Scopus∪WoS only: 21% (1,104); GS ∪ Scopus∪WoS: 5,285]
15
GS & Scopus∪WoS Summary of Results
  • Coverage
  • Varies greatly between research areas
  • 23% to 144% increase by combining GS & Scopus∪WoS
  • 5% to 98% increase by combining Scopus & WoS
  • GS has strong coverage in CS & IS
  • HCI, IR, computational linguistics, social
    informatics
  • Scopus∪WoS has stronger coverage in LS
  • Bibliometrics, collection development,
    information policy
  • GS provides significantly better coverage of
    non-English materials
  • GS (7%), Scopus (1%), WoS (1%)
  • Ranking
  • No significant changes in relative ranking of
    faculty members

16
Findings
  • Scopus, WoS, and GS complement rather than
    replace each other
  • GS can be useful in showing evidence of broader
    international impact than could possibly be done
    through Scopus and WoS
  • GS can be very useful for citation searching
    purposes; however, it is not conducive to
    large-scale comparative citation analyses
  • Scopus significantly alters the relative citation
    ranking of scholars as measured by Web of
    Science. GS does not.

17
Conclusions
  • Multiple sources of citations should be used to
    generate accurate citation counts and rankings
  • Citation databases complement one another
  • Small overlap between sources may significantly
    influence relative ranking
  • Multi-faceted citation analysis is needed
  • Citation coverage varies by research area,
    document type, and language
  • CiteSearch can greatly facilitate citation
    analysis
  • Enormous effort is required to
  • Refine search strategy
  • Parse search results
  • Eliminate noise (duplicate citations)
  • Extract & normalize citation metadata
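
The noise-elimination and normalization steps listed above can be sketched as follows. This is a minimal illustration, not the CiteSearch implementation: the record fields (`title`, `year`, `source`), the matching rule (normalized title plus year), and the sample records are all assumptions made up for the example.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def merge_citations(records: list[dict]) -> list[dict]:
    """Collapse duplicate citations and remember which sources reported each one."""
    merged: dict[tuple, dict] = {}
    for record in records:
        # Assumed duplicate rule: same normalized title and same year.
        key = (normalize_title(record["title"]), record["year"])
        if key not in merged:
            merged[key] = {
                "title": record["title"],
                "year": record["year"],
                "sources": set(),
            }
        merged[key]["sources"].add(record["source"])
    return list(merged.values())


# Hypothetical search results from three databases.
records = [
    {"title": "Fusion Approach to Citation Analysis", "year": 2006, "source": "WoS"},
    {"title": "Fusion approach to citation analysis.", "year": 2006, "source": "Scopus"},
    {"title": "A Different Paper", "year": 2005, "source": "GS"},
]
unique = merge_citations(records)
print(len(unique), unique[0]["sources"])
```

Keeping the set of sources per citation is what makes overlap and uniqueness counts possible later. Real records would need fuzzier matching (author names, truncated titles, missing years) than this exact-key rule.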

18
CiteSearch System Overview
  • A Web-based citation search and analysis tool
  • Work-in-progress prototype system
  • Search multiple citation sources
  • Google Scholar, Web of Science, Scopus, EBSCO,
    ProQuest, etc.
  • Extract and compile citation metadata
  • Parse & normalize the search results
  • Compute various citation-based quality evaluation
    measures
  • Document-based measures
  • Weighted citation counts, CiteRank
  • Author-based measures
  • Weighted publication counts, H-Index, Mentor-Index
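
Of the author-based measures listed, the H-Index has a standard definition: the largest h such that the author has h publications with at least h citations each. The slides do not define the other measures, so only this one is sketched here; the function name and the sample counts are illustrative.

```python
def h_index(citation_counts: list[int]) -> int:
    """Return the largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical per-publication citation counts for one author.
print(h_index([10, 8, 5, 4, 3]))  # → 4
```

Because the input is a list of per-publication citation counts, the result depends on which databases supplied those counts, which is why combining sources matters for author-level measures.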

21
CiteSearch System Architecture
22
End
23
CiteSearch System Work-in-Progress
  • Federated Citation Search
  • To compile comprehensive & usable citation data
  • Query multiple citation databases
  • Filter out noise
  • e.g., invalid, duplicate citations
  • Extract & normalize metadata
  • bibliographical metadata (e.g., title, author,
    year, source, etc.)
  • citation metadata (e.g., doctype, subject,
    language, etc.)
  • Multi-faceted Citation Analysis
  • To produce multi-faceted quality/impact
    assessment measures that
  • account for variance in citation quality (e.g.,
    Weighted citation counts, CiteRank)
  • consider various facets of evaluation metric
    (e.g., Document type, language)
  • accommodate different aspects of quality assessment
    (e.g., H-Index, Mentor-Index)
  1. Compute citation-based quality scores (CQS) for
     each publication
  2. Compute CQS for authors, schools, publishers
     using publication CQS
  3. Compute CQS for each publication weighted by
     author/school/publisher scores
  4. Compute CQS for authors, schools, publishers
     using weighted publication CQS
  5. Repeat steps 3 and 4 until convergence
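
The iterative CQS computation above can be sketched as a fixed-point loop. The slides give no formulas, so everything specific here is an assumption chosen for illustration: the initial score is the raw citation count, only authors are scored (schools and publishers are omitted), each citation is weighted by 1 plus the mean score of the citing publication's authors, and scores are normalized to sum to 1. The names `iterate_cqs`, `authors_of`, and `cited_by` are hypothetical.

```python
from statistics import mean


def normalize(scores: dict) -> dict:
    """Scale scores to sum to 1 (left unchanged if they sum to 0)."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else dict(scores)


def author_scores(pub_cqs: dict, authors_of: dict) -> dict:
    """Author CQS: normalized sum of the scores of the author's publications."""
    totals: dict = {}
    for pub, authors in authors_of.items():
        for author in authors:
            totals[author] = totals.get(author, 0.0) + pub_cqs[pub]
    return normalize(totals)


def iterate_cqs(authors_of: dict, cited_by: dict, tol: float = 1e-9, max_iter: int = 100):
    # Step 1 (assumed form): publication CQS starts as the raw citation count.
    pub_cqs = normalize({p: float(len(cited_by.get(p, []))) for p in authors_of})
    # Step 2: author CQS from publication CQS.
    author_cqs = author_scores(pub_cqs, authors_of)
    for _ in range(max_iter):
        # Step 3 (assumed weighting): a citation counts 1 plus the mean
        # score of the citing publication's authors.
        weighted = {
            p: sum(
                1.0 + mean(author_cqs[a] for a in authors_of[q])
                for q in cited_by.get(p, [])
            )
            for p in authors_of
        }
        pub_cqs = normalize(weighted)
        # Step 4: author CQS from the weighted publication CQS.
        updated = author_scores(pub_cqs, authors_of)
        delta = max(abs(updated[a] - author_cqs[a]) for a in updated)
        author_cqs = updated
        # Step 5: stop once author scores no longer change.
        if delta < tol:
            break
    return pub_cqs, author_cqs


# Hypothetical corpus: publication -> authors, publication -> citing publications.
authors_of = {"p1": ["alice"], "p2": ["bob"], "p3": ["carol"], "p4": ["alice", "bob"]}
cited_by = {"p1": ["p2", "p3", "p4"], "p2": ["p3"], "p3": [], "p4": []}
pub_cqs, author_cqs = iterate_cqs(authors_of, cited_by)
print(pub_cqs, author_cqs)
```

The baseline weight of 1 per citation keeps uncited authors' citations from counting for nothing, and `max_iter` caps the loop because convergence is not guaranteed for every weighting scheme or citation graph.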

24
CiteSearch Study Citation Databases
  • Web of Science
  • 3 Institute for Scientific Information (ISI)
    databases
  • Standard tool for citation studies worldwide
  • 35 million records from 9,000 publishers
  • Scopus
  • Produced by Elsevier
  • 27 million records from 15,000 publishers
  • Google Scholar
  • 500 million records
  • UBC (http://weblogs.elearning.ubc.ca/googlescholar/archives/025964.html)
  • Unknowns
  • Coverage (subject, publisher, time-span)
  • Document type and refereed status of records

25
Google Scholar Citations by Year
26
Sources of Unique Citations
27
CiteSearch Study GS Scopus WoS
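[Three-way Venn diagram: Google Scholar (4203), Scopus (2308), WoS (2025); GS ∪ Scopus ∪ WoS (5307). Region labels as extracted, in percent of the union with counts: 8.2 (435), 5.3 (282), 48.3 (2561), 18.3 (970), 11.7 (617), 4.3 (230), 3.8 (204)]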