Text mining and Open Access publishing

1
Text mining and Open Access publishing
Matthew Cockerill, Technical Director, BioMed
Central
2
Summary
  • What is Open Access publishing?
  • Open Access publishing and text mining
  • About BMC Bioinformatics
  • The BioCreative supplement

3
Summary
  • What is Open Access publishing?
  • Open Access publishing and text mining
  • About BMC Bioinformatics
  • The BioCreative supplement

4
The current model of publishing scientific
research
  • Scientists carry out research
  • They write up their results
  • They submit them to a journal
  • Other scientists act as peer reviewers and
    editorial advisers
  • Finally, the publisher sells access to that
    research back to the scientific community

5
What's wrong with this status quo?
  • Restricted access to scientific research is
    contrary to the interests of:
      • the scientists who do the research
      • the funders who pay for it
      • society as a whole
  • It is an historical artefact of the economics of
    print publishing
  • It is a serious obstacle to mining of full text
    information

6
BioMed Central: The Open Access publisher
  • Commercial organization
  • Published first article in mid-2000
  • Strict policy of immediate Open Access to all
    research articles

7
Growth of BioMed Central
8
Momentum for Open Access
  • PubMed Central
  • Public Library of Science
  • Open Access declarations: Budapest/Bethesda/Berlin
  • Software open-source movement
  • Mass cancellation of titles from traditional
    publishers

9
BioMed Central's business model for open access
publishing
  • Keep costs down via:
      • Online submission and peer review
      • Automated tools to streamline article processing,
        conversion and layout
  • Processing charge (currently 525) for accepted
    articles
  • No processing charge for authors at member
    institutions

10
Institutional membership
More than 400 institutions are members of BioMed
Central, including, to name just a few:
  • CalTech
  • Cancer Research UK
  • Columbia University
  • Cornell University
  • University of California
  • Dana-Farber Cancer Institute
  • Harvard University
  • INSERM
  • Imperial College
  • Institut Pasteur
  • John Innes Centre
  • Johns Hopkins University
  • Kyoto University
  • Max Planck Institutes
  • Memorial Sloan-Kettering Cancer Center
  • MRC Laboratory of Molecular Biology
  • National Institutes of Health
  • National Institute for Medical Research
  • NHS England
  • Princeton University
  • Rockefeller University
  • TIGR
  • TSRI
  • Tufts University
  • Wellcome Trust Sanger Institute
  • University of Wisconsin
  • World Health Organization
  • Yale University

11
Summary
  • What is Open Access publishing?
  • Open Access publishing and text mining
  • About BMC Bioinformatics
  • The BioCreative supplement

12
Mining the full text
  • Analysing results of high-throughput experiments
    means biologists increasingly need text-mining
    tools
  • PubMed is currently the primary resource for text
    mining (it's what's available), but:
      • Abstracts omit critical information
      • Techniques developed for abstracts may not
        effectively use extra information in full text
  • Fully Open Access corpora, in standard XML
    formats, will help

13
Data mining - BioMed Central
http://www.biomedcentral.com/info/about/datamining
  • Entire corpus of full text XML downloadable by
    FTP as a single zip file
  • Various groups working with the data
      • E.g. Pre-BIND (automatic extraction of possible
        protein-protein interaction information from full
        text)
  • No restrictions on redistribution
      • This means other groups can use the same corpus to
        repeat and build on results
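
A corpus distributed as a single zip of XML files can be walked without unpacking it to disk. The following is only a minimal sketch under stated assumptions: the element names (`art`, `ti`) and the in-memory demo archive are hypothetical stand-ins, not the actual BioMed Central schema or file layout.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_articles(zip_source):
    """Yield (member name, parsed XML root) for each .xml file in the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                with archive.open(name) as handle:
                    yield name, ET.parse(handle).getroot()


def build_demo_zip():
    """Build a tiny in-memory archive standing in for the real corpus (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical element names, chosen only for illustration.
        archive.writestr("a1.xml", "<art><ti>Protein A binds protein B</ti></art>")
        archive.writestr("readme.txt", "not an article")
    buffer.seek(0)
    return buffer


# Collect the title of every XML article; non-XML members are skipped.
titles = {name: root.findtext("ti") for name, root in iter_articles(build_demo_zip())}
print(titles)
```

To run this against a downloaded corpus, pass the path of the zip file to `iter_articles` in place of the demo archive and adjust the element names to the real schema.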

14
Data mining - BioMed Central (screen shot)
15
Data mining - PubMed Central
http://www.pubmedcentral.com/about/oai.html
  • Standard NLM archiving/interchange XML DTD:
    common format across multiple publishers
  • Only a subset of PubMed Central participating
    publishers allow download of full text XML
      • BioMed Central
      • Public Library of Science
      • Hopefully, more will follow.
  • XML made available via OAI interface
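
An OAI interface is queried with plain HTTP requests whose parameters are defined by the OAI-PMH protocol (`verb`, `metadataPrefix`, `set`, `from`). As a rough sketch, the request URL can be assembled as below. The base URL is a placeholder, not the PubMed Central endpoint, and `oai_dc` is the generic Dublin Core prefix; the prefix that selects full text XML is repository-specific and is not assumed here.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): substitute the repository's real OAI base URL.
BASE_URL = "https://example.org/oai"


def list_records_url(metadata_prefix, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # restrict harvesting to one set
    if from_date:
        params["from"] = from_date    # only records changed since this date
    return BASE_URL + "?" + urlencode(params)


url = list_records_url("oai_dc", from_date="2004-01-01")
print(url)
```

Fetching the URL and following resumption tokens for large result sets is left out to keep the sketch free of network access.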

16
Data mining - PubMed Central
17
Adding structure to full text data
  • Some examples of useful structure:
      • Structure of article itself (figure legends,
        materials and methods, references, etc.)
      • MathML, CML, etc.
      • Disambiguated references to genes/proteins
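
When the article structure is explicit in the XML, a mining tool can address individual parts such as the methods section or the figure legends directly. The sketch below illustrates the idea on a made-up sample; the element names (`sec`, `title`, `p`, `fig`, `caption`) are loosely modelled on NLM-style markup and are an assumption, not a statement about any particular publisher's files.

```python
import xml.etree.ElementTree as ET

# Made-up sample article (assumption), not taken from any real corpus.
SAMPLE = """
<article>
  <body>
    <sec><title>Materials and methods</title><p>Cells were lysed.</p></sec>
    <sec><title>Results</title><p>Binding was observed.</p>
      <fig><caption><p>Figure 1. Gel image.</p></caption></fig>
    </sec>
  </body>
</article>
""".strip()


def sections_by_title(root):
    """Map each section title to the text of its direct paragraph children."""
    result = {}
    for sec in root.iter("sec"):
        title = sec.findtext("title", default="").strip()
        paragraphs = ["".join(p.itertext()).strip() for p in sec.findall("p")]
        result[title] = " ".join(paragraphs)
    return result


def figure_legends(root):
    """Collect the text of every figure caption."""
    return ["".join(c.itertext()).strip() for c in root.iter("caption")]


root = ET.fromstring(SAMPLE)
sections = sections_by_title(root)
legends = figure_legends(root)
print(sections["Materials and methods"])
print(legends)
```

Because only direct `p` children of a section are joined, the figure legend is kept separate from the running text of the Results section.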

18
Authoring tools are key
  • Manuscript structure: EndNote, TeX/BibTeX pretty
    good already
  • MathML
      • Publicon, TeX etc.
  • CML
      • ChemSketch etc.
  • Gene/protein reference markup?
      • Semi-automatic markup during authoring
      • Author reviews and confirms markup
      • System prompts author to clarify ambiguity (cf.
        grammar checker, code intelligence)
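
The semi-automatic markup idea can be sketched as a lexicon lookup that proposes identifiers and flags terms with more than one candidate, so that the authoring tool knows when to ask the author. This is a toy illustration only: the lexicon, the identifier labels and the function name `suggest_markup` are all hypothetical, and a real tool would need a far richer name-recognition step.

```python
import re

# Hypothetical lexicon (assumption): surface form -> candidate identifiers.
LEXICON = {
    "p53": ["GENE:TP53"],
    "CAT": ["GENE:catalase", "GENE:chloramphenicol-acetyltransferase"],
}


def suggest_markup(text):
    """Return (term, candidates, needs_author_choice) for each lexicon hit."""
    suggestions = []
    for match in re.finditer(r"\b\w+\b", text):
        term = match.group(0)
        candidates = LEXICON.get(term)
        if candidates:
            # More than one candidate means the author has to disambiguate.
            suggestions.append((term, candidates, len(candidates) > 1))
    return suggestions


suggestions = suggest_markup("CAT activity rose after p53 induction.")
for term, candidates, ambiguous in suggestions:
    status = "author must choose" if ambiguous else "author confirms"
    print(f"{term}: {candidates} ({status})")
```

Unambiguous hits are still shown to the author for confirmation, matching the review step on the slide.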

19
Summary
  • What is Open Access publishing?
  • Open Access publishing and text mining
  • BMC Bioinformatics
  • The BioCreative supplement

20
BMC series of online journals
  • BMC Biochemistry
  • BMC Bioinformatics
  • BMC Biotechnology
  • BMC Cell Biology
  • BMC Chemical Biology
  • BMC Developmental Biology
  • BMC Ecology
  • BMC Evolutionary Biology
  • BMC Genetics
  • BMC Genomics
  • BMC Immunology
  • BMC Microbiology
  • BMC Molecular Biology
  • BMC Neuroscience
  • BMC Pharmacology
  • BMC Physiology
  • BMC Plant Biology
  • BMC Structural Biology
  • BMC Anesthesiology
  • BMC Blood Disorders
  • BMC Cancer
  • BMC Cardiovascular Disorders
  • BMC Clinical Pathology
  • BMC Clinical Pharmacology
  • BMC Complementary and Alternative Medicine
  • BMC Dermatology
  • BMC Ear, Nose and Throat Disorders
  • BMC Emergency Medicine
  • BMC Endocrine Disorders
  • BMC Family Practice
  • BMC Gastroenterology
  • BMC Geriatrics
  • BMC Health Services Research
  • BMC Infectious Diseases
  • BMC International Health and Human Rights
  • BMC Medical Education
  • BMC Medical Ethics
  • BMC Medical Imaging
  • BMC Medical Informatics and Decision Making
  • BMC Medical Research Methodology
  • BMC Musculoskeletal Disorders
  • BMC Nephrology
  • BMC Neurology
  • BMC Nuclear Medicine
  • BMC Nursing
  • BMC Ophthalmology
  • BMC Oral Health
  • BMC Palliative Care
  • BMC Pediatrics
  • BMC Pregnancy and Childbirth
  • BMC Psychiatry
  • BMC Public Health
  • BMC Pulmonary Medicine
  • BMC Surgery
  • BMC Urology
  • BMC Women's Health

21
BMC Bioinformatics
22
RSS feeds
23
Open access leads to high visibility
  • Indexing/Linking
      • PubMed
      • MEDLINE
      • ISI
      • BIOSIS
      • CAS
      • CrossRef
      • Scirus
      • Open Archives Initiative
      • Citebase
      • Google
  • Archiving
      • PubMed Central
      • INIST
      • LOCKSS
      • Max Planck
      • OhioLINK

24
BMC Bioinformatics - citation impact
25
Summary
  • What is Open Access publishing?
  • Open Access publishing and text mining
  • About BMC Bioinformatics
  • The BioCreative supplement

26
Process for publishing in BMC Bioinformatics
supplement
  • Follow BMC Bioinformatics Research Article
    instructions for authors
  • Send articles to BioCreative organizers, who will
    coordinate peer review; do not submit articles
    online
  • Supplement passed on to BioMed Central for XML
    markup and publication
  • 400 processing charge/article

27
Instructions for authors
28
Access to supplement
  • All articles in supplement covered by BioMed
    Central's Open Access licence agreement
      • Free access
      • Free re-distribution/re-use
  • Supplement indexed in PubMed and permanently
    archived in PubMed Central

29
That's it