Title: and
1 and
- Tools for exploring the biomedical information
landscape
Les Grivell EMBO Electronic Information Programme
EAHIL 2004, Santander,
2 - Electronic information programme
Online research information environment for the
life sciences
A next generation information service for the
life sciences
Communities_at_embo
Life Sciences Mobility Portal
3 But first, let me take you back not to
Altamira, but to the early days of scientific
publishing (pre-impact factor)
4 When libraries were comfortable places that had
everything you needed
5 and it was possible to keep track of the
literature (more or less)
6 Where are we now?
Publishing is big business
- STM publishing is a multi-billion EUR
activity (in the UK alone, GBP 22 billion in
2000)
- An estimated 164000 scientific periodicals
worldwide; around 16% of these are online
7 Core science core journals
- PubMed lists some 4600 journals in biomedical
disciplines
- As of 19 Sept 2004, 4429 of these are online
- The PubMed database provides access to circa 15
million abstracts (but if you can't be found, you
won't be read)
- The Science Citation Index lists 5876 journals
with impact factors ranging from 54.45 to 0.00
(you've been found, but are you worth reading?)
8 Another information explosion: genomics
9 Raw sequences are not the only form of digital
information
10 The nice thing about biological information
resources is that there are so many ...
- Hundreds of different databases, many in
flat-file format
- A variety of user interfaces
- General lack of interoperability
11 Wouldn't it be nice to find all published
literature references for a large set of gene
symbols and explore their relationships?
Co-regulated genes
Find literature
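The idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the gene symbols (GENEA, GENEB, GENEC), the document identifiers and the abstracts are all invented, and the helper names build_index and co_mentions are hypothetical. It shows a small inverted index from symbols to documents, plus a count of how often two symbols are mentioned together.

```python
from collections import defaultdict
from itertools import combinations
import re

# Toy abstracts: identifiers, symbols and text are invented for illustration.
ABSTRACTS = {
    "doc1": "GENEA and GENEB are co-regulated under heat shock.",
    "doc2": "Expression of GENEB depends on GENEC.",
    "doc3": "GENEA, GENEB and GENEC respond to the same stimulus.",
}


def build_index(abstracts, symbols):
    """Map each requested gene symbol to the documents that mention it."""
    wanted = set(symbols)
    index = defaultdict(set)
    for doc_id, text in abstracts.items():
        # Naive tokenisation: runs of letters, digits, underscores, hyphens.
        for token in re.findall(r"[A-Za-z0-9_-]+", text):
            if token in wanted:
                index[token].add(doc_id)
    return index


def co_mentions(index):
    """Count the documents in which each pair of symbols appears together."""
    return {
        (a, b): len(index[a] & index[b])
        for a, b in combinations(sorted(index), 2)
    }


index = build_index(ABSTRACTS, ["GENEA", "GENEB", "GENEC"])
pairs = co_mentions(index)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

Exact token matching is the weak point of such a sketch: it ignores synonyms and cannot tell a gene symbol from an ordinary word that is spelled the same way.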
12 This is not really such a novel idea ...
13 Fritz Saxl (1890-1948)
"I do not want people to be forever searching in
the library! This searching wears on the nerves,
and they must not be wasted on such
stupidities..."
Aby Warburg (1866-1929)
14 Saxl, Warburg: Mnemosyne Atlas
15 Some text search engines
Bibliographic databases
Full text / web-pages
16 PubMed
No direct linkage to other datasets
All documents stored and indexed in one location
17 main features
- Ability to interconnect literature articles with
different types of molecular data, including
images
- Ability to search through and retrieve journal
articles and other full text documents, even when
in different physical locations
- Ability to support multi-lingual documents and
queries
- Services free to the academic community
A discovery tool
Features implemented via conceptual
fingerprinting
18 conceptual fingerprints
19 prototypes
- Initial prototypes in September 2002 and July
2003
- Current prototype online since 1st March 2004
- Next launch due mid-October 2004
20 E-BioSci
21 and now a word about
8 partners (DE, ES, FR, UK)
(Platform)
13 partners (ES, FR, IT, NL, UK)
(Research project)
22 Oriel's aims
23 Wouldn't it be nice to be able to navigate from
an image to literature and molecular databases?
24 Gene symbol identification in text
Text containing symbols
25 Improved literature - molecular dataset linkage
Twinkle, twinkle, little star, How I wonder what
you are. Up above the world so high, Like a
diamond in the sky. Twinkle, twinkle, little
star, How I wonder what you are
26 Problems in gene symbol recognition
- Many gene symbols are indistinguishable from
everyday words or abbreviations
- Synonyms
- Homonyms
- Homonym synonyms (ELK1 = SAP1, CAR1 = SAP1,
BD-2 = SAP1, RIP1_SAPOF = SAP1)
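The homonym-synonym problem can be made concrete with a small lookup table. This is a minimal sketch, not a description of any real tool: it assumes the slide's list is to be read as four pairs in which SAP1 is an alternative name for ELK1, CAR1, BD-2 and RIP1_SAPOF, and the function names build_alias_table and resolve are hypothetical.

```python
from collections import defaultdict

# Pairs as listed on the slide, read here as (primary symbol, synonym).
# That reading is an assumption about the slide's layout.
ALIAS_PAIRS = [
    ("ELK1", "SAP1"),
    ("CAR1", "SAP1"),
    ("BD-2", "SAP1"),
    ("RIP1_SAPOF", "SAP1"),
]


def build_alias_table(pairs):
    """Map every name (primary or synonym) to the primaries it may denote."""
    table = defaultdict(set)
    for primary, alias in pairs:
        table[primary].add(primary)
        table[alias].add(primary)
    return table


def resolve(symbol, table):
    """Classify a symbol found in text as unknown, unique or ambiguous."""
    candidates = sorted(table.get(symbol, ()))
    if not candidates:
        return "unknown", candidates
    if len(candidates) == 1:
        return "unique", candidates
    return "ambiguous", candidates


table = build_alias_table(ALIAS_PAIRS)
print(resolve("ELK1", table))
print(resolve("SAP1", table))
```

A dictionary lookup can only flag the ambiguity. Choosing among the candidates needs context from the surrounding text, such as the organism or other genes mentioned nearby.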
27 Word-processing
28 Natural language processing
29 Protein interaction networks
[Figure: network diagram. Nodes: ataxia, Yfh1, Ssc1, Isu1, Oct1. Edge labels: requires, regulates, interacts, activates.]
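A network of this kind can be assembled from relation statements found in text. The sketch below is a deliberately simple pattern-matching illustration rather than real natural language processing, and it does not reproduce the edges of the figure. The protein names (ProtA, ProtB, ProtC) and the sentences are invented. The relation verbs come from the slide, with "interacts with" assumed as the textual form of "interacts".

```python
import re

# Relation verbs taken from the slide's edge labels.
RELATIONS = ("requires", "regulates", "interacts with", "activates")

# Hypothetical pattern: capitalised name, relation verb, capitalised name.
PATTERN = re.compile(
    r"\b(?P<source>[A-Z][A-Za-z0-9]*)\s+"
    r"(?P<relation>" + "|".join(RELATIONS) + r")\s+"
    r"(?P<target>[A-Z][A-Za-z0-9]*)\b"
)


def extract_edges(sentences):
    """Return (source, relation, target) triples found in the sentences."""
    edges = []
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            edges.append(
                (match["source"], match["relation"], match["target"])
            )
    return edges


def to_adjacency(edges):
    """Group labelled edges by their source node."""
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
    return graph


# Invented sentences with placeholder protein names.
SENTENCES = [
    "ProtA activates ProtB in this toy example.",
    "ProtB interacts with ProtC.",
    "ProtC requires ProtA.",
]

edges = extract_edges(SENTENCES)
graph = to_adjacency(edges)
print(graph)
```

Fixed patterns like this miss passive constructions, negation and names that are not capitalised, which is where fuller natural language processing becomes necessary.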
30 Hoffman & Valencia (Madrid)
31 Some web-addresses