Title: Citation Analysis for the Free, Online Literature
1Citation Analysis for the Free, Online Literature
- Tim Brody
- Intelligence, Agents, Multimedia Group
- University of Southampton
2Content
- Current services for Open Access Literature
- Institutional Archives Registry
- Metadata Harvesting through Celestial
- Citebase Search
- Citation Linking
- Search and Navigation Service
- Web Impact as a predictor of Citation Impact
3Institutional Archives Registry
5Sites in the IAR
- Things we want to know
- GNU EPrints sites
- Other research collections (Other Archives, Open Journals)
- BOAI 1 vs. BOAI 2
- A submission form consisting of
- URL, Name, OAI URL, Country, type, full-text, software
- Can't (yet) track full-texts
- (Create a master list so archives only register once?)
6Celestial
- Designed to
- Be an abstraction over OAI-PMH versions
- Cache OAI metadata records
- Technological questions
- How big can OAI-PMH go? (OK for 5 million records so far)
- How reliable are OAI-PMH implementations?
- Feeds Citebase, IAR, some external users
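The harvesting step that Celestial abstracts over can be illustrated with a minimal sketch: reading one OAI-PMH 2.0 ListRecords response, extracting record identifiers and titles, and noting the resumption token that signals further pages. The sample XML, the `parse_list_records` helper and the identifier values are hypothetical; this is not Celestial's own code.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, heavily trimmed ListRecords response for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2003-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append({"identifier": identifier, "title": title})
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    # An absent or empty token means the list is complete.
    return records, (token or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
```

A cache would store each record keyed by its identifier and re-issue the request with the returned token until no token remains.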
9Services for Open Access Literature
- OAIster, Scirus, Google
- Search Engines, Navigation Tools, Analysis, Assessment
- BMC, Citeseer, Citebase
- Citation Analysis/Linking Services (Citebase / Citeseer / OpenURL / DOI), Version Linking Services
- OAI-PMH Transport
- arXiv.org
- Self-Archived Full-texts (Pre/Post-prints), Open Access Publishing
- n.b. Scirus/OAIster aren't citation-analysis aware yet; Google indexes Citeseer. Not an exhaustive list.
10Citation Analysis & Linking
- A citation is a reference from one work to another; as a hyperlink, a citation link
- Citation analysis uses citation relationships to analyse patterns in research
- As a graph, a work (paper, book etc.) is a vertex and a citation an edge
- Bibliometrics
- (study of patterns in literature)
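The graph view of citations can be sketched in a few lines: works are vertices, citations are directed edges, and a work's citation count is the in-degree of its vertex. The paper identifiers and the `build_graph` helper below are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical citation links: (citing work, cited work).
CITATIONS = [
    ("paper_c", "paper_a"),
    ("paper_c", "paper_b"),
    ("paper_d", "paper_a"),
    ("paper_d", "paper_c"),
]


def build_graph(edges):
    """Return (references, cited_by) adjacency maps for a directed graph."""
    references = defaultdict(set)  # work -> works it cites (outgoing edges)
    cited_by = defaultdict(set)    # work -> works citing it (incoming edges)
    for citing, cited in edges:
        references[citing].add(cited)
        cited_by[cited].add(citing)
    return references, cited_by


references, cited_by = build_graph(CITATIONS)

# Citation count of a work = in-degree of its vertex.
citation_counts = {work: len(citers) for work, citers in cited_by.items()}
```

Keeping both adjacency maps makes it cheap to follow a link in either direction, from a work to what it cites or to what cites it.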
11Digitometric/Infometric Analysis
- Bibliometrics for the online age
- Couple citation analysis with Web analysis
- (how many times has x been accessed?)
- Similar to readership studies, but easier to survey and more comprehensive
- (though subject to the same problems of copies being re-distributed, multiple accesses etc.)
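One way to soften the multiple-access problem is to collapse rapid repeat hits from the same host before counting. The sketch below assumes web log entries have already been parsed into (host, paper, timestamp) tuples and uses an arbitrary one-hour window; the log format, the window and the `count_accesses` helper are all assumptions, not a description of how any existing service counts accesses.

```python
from collections import Counter

# Hypothetical, pre-parsed web log entries: (host, paper id, unix timestamp).
LOG = [
    ("host1", "paper_a", 0),
    ("host1", "paper_a", 30),     # repeat within the window: ignored
    ("host2", "paper_a", 45),
    ("host1", "paper_a", 7200),   # same host, but well after the window
    ("host2", "paper_b", 50),
]

WINDOW_SECONDS = 3600  # assumed de-duplication window


def count_accesses(entries, window=WINDOW_SECONDS):
    """Count accesses per paper, collapsing rapid repeats from one host."""
    last_counted = {}  # (host, paper) -> timestamp of last counted access
    counts = Counter()
    for host, paper, ts in sorted(entries, key=lambda e: e[2]):
        key = (host, paper)
        if key in last_counted and ts - last_counted[key] < window:
            continue  # treat as the same reading session
        last_counted[key] = ts
        counts[paper] += 1
    return counts


access_counts = count_accesses(LOG)
```

This does nothing about re-distributed copies, which never reach the server's logs at all.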
12Citebase Search
13Citation Linking
- Retrieve and cache full-texts
- LaTeX, PDF, XML
- Extract reference list
- Extract individual references
- Parse references into components
- Author, year, title, journal, volume, pagination
- Store in structured database
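The "parse references into components" step can be sketched with a regular expression for a single, assumed reference style of the form "Authors (Year) Title. Journal Volume, Pages". Real reference lists mix many styles, so a production parser needs far more than this; the pattern, the `parse_reference` helper and the sample reference are hypothetical.

```python
import re

# Pattern for one assumed style: Authors (Year) Title. Journal Volume, Pages
REFERENCE_PATTERN = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>.+?)\s+(?P<volume>\d+),\s*"
    r"(?P<pages>\d+(?:-\d+)?)\.?$"
)


def parse_reference(text):
    """Split one reference string into components, or return None."""
    match = REFERENCE_PATTERN.match(text.strip())
    return match.groupdict() if match else None


# Hypothetical reference, invented for illustration.
parsed = parse_reference(
    "Smith, J. and Jones, A. (1999) An example title. "
    "Journal of Examples 12, 34-56."
)
```

The resulting dictionary maps directly onto the columns of a structured reference table, and a `None` result flags a reference that needs a different pattern or manual handling.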
14Citebase Search
16Citebase Search: Navigation by Citation Links
- Current article (article with reference list)
- Past (reference links)
- Future
- Related (co-cited)
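Navigation by citation links can be sketched as three lookups around one article. The sketch assumes that "past" means the works an article cites, "future" means the works that cite it, and "co-cited" means works appearing alongside it in the same reference lists; that reading of the diagram labels, the sample data and the `navigate` helper are assumptions.

```python
from collections import Counter

# Hypothetical reference lists: citing article -> articles it cites.
REFERENCE_LISTS = {
    "paper_x": {"current", "paper_a", "paper_b"},
    "paper_y": {"current", "paper_a"},
    "paper_z": {"paper_b"},
    "current": {"paper_old"},
}


def navigate(article, reference_lists):
    """Return (past, future, co_cited) neighbours of one article."""
    # Past: works the article itself cites.
    past = set(reference_lists.get(article, ()))
    # Future: works whose reference lists include the article.
    future = {citing for citing, refs in reference_lists.items()
              if article in refs}
    # Co-cited: works cited alongside the article, with co-citation counts.
    co_cited = Counter()
    for citing in future:
        for other in reference_lists[citing]:
            if other != article:
                co_cited[other] += 1
    return past, future, co_cited


past, future, co_cited = navigate("current", REFERENCE_LISTS)
```

Ranking co-cited works by their count gives a simple ordering for a "related" list.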
18Predicting Citation Impact
- The Web gives us access to new metrics
- Download/access frequency
- Can early-day download frequency give an indication of longer-term citation frequency?
- (Web logs from the UK arXiv.org mirror, citation data from Citebase Search)
- Pearson correlation after 6 months of web logs: 0.42 for the High Energy Physics sub-arXiv
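For reference, the Pearson correlation between per-paper download counts and citation counts can be computed as below. The numbers are made up for illustration and are not data from the arXiv.org logs or from Citebase; the `pearson` helper is a plain textbook implementation, not the Correlation Generator's code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-paper figures, not data from the study.
downloads = [10, 20, 30, 40]
citations = [1, 3, 2, 5]

r = pearson(downloads, citations)
```

A value near 1 means papers downloaded more often also tend to be cited more often; a value near 0 means download counts say little about later citations.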
24Assessing Research(ers)
- Citation Impact
- By-Paper, Author, Journal, Institution
- Web Impact
- Predictor of citation-impact, combine with citation-impact
- Search Engines
- More detailed research assessment
25Comparing Online/Offline Impact
- Using ISI CD-ROM data
- Use Web crawlers to find online articles
- Compare citation impact of online and offline articles
- By discipline, by journal, by author?
- Initial results for Physics show 2-3x increase
- arXiv.org
- Southampton, U. Quebec, Oldenburg (de)
26Relevant Web Pages
- EPrints http://www.eprints.org/
- IAR http://archives.eprints.org/
- Citebase Search
- http://citebase.eprints.org/
- Celestial
- http://celestial.eprints.org/
- Correlation Generator
- http://citebase.eprints.org/analysis/correlation.php
- Tim Brody <tdb01r@ecs.soton.ac.uk>