Title: Harnessing the Semantic Web to Answer Scientific Questions
1 Harnessing the Semantic Web to Answer Scientific Questions
- A Health Care and Life Sciences Interest Group demo
- Susie Stephens, Principal Research Scientist, Lilly
2 Agenda
- Health Care and Life Sciences Interest Group
- Scientific Use Case
- Technological Approach
- Demonstration
- Benefits of the Semantic Web
3 Health Care and Life Sciences Interest Group
- HCLSIG is chartered to develop and support the use of Semantic Web technologies and practices to improve collaboration, research and development, and innovation adoption in the Health Care and Life Science domains
More details on HCLS are available at http://www.w3.org/2001/sw/hcls/
4 Benefits of Semantic Web Technologies
- Fusion of data across many scientific disciplines
- Easier recombination of data
- Querying of data at different levels of granularity
- Capture provenance of data through annotation
- Perform inference across data sets
- Machine processable approach
- Data can be assessed for inconsistencies
5 Scientific Use Case
- Use case focuses on Alzheimer's Disease (AD)
- AD is a devastating illness that impacts 26.6 million people worldwide
- Prevalence is predicted to quadruple to 106.8 million by 2050
- Many different types of evidence need to be integrated
- An active Web community exists for AD research
6 Scientific Data Sets
- Integration and analysis of heterogeneous data sets
- Hypothesis, Genome, Pathways, Molecular Properties, Disease, etc.
PDSPki
NeuronDB
Reactome
Gene Ontology
BAMS
Allen Brain Atlas
BrainPharm
Antibodies
Entrez Gene
MESH
NC Annotations
PubChem
Mammalian Phenotype
SWAN
AlzGene
Homologene
Publications
7 Scientific Hypothesis and Research Questions
- Scientific Hypothesis
  - Amyloid beta peptide may impair memory by inhibiting long-term potentiation (LTP)
- Research Questions
  - By what mechanism does amyloid beta inhibit LTP?
  - Can we identify a novel therapeutic target based on this mechanism?
  - How can we validate the therapeutic target?
8 Technological Approach
- Careful modeling that reflects biology to enable integration of data sources
- All bio-entities were assigned URIs
- Most data translated to RDF and managed in a triple store
- Other data maintained in original store and mapped to RDF
- Using a reasoner to infer triples to increase expressiveness of queries
- Query data with SPARQL and visualization tools
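The approach above (URIs for entities, data held as RDF triples, a reasoner that adds inferred triples, pattern-based queries) can be sketched in a few lines of plain Python. This is a toy illustration, not the demo's actual stack: the `http://example.org/` URIs, the class names, and the helpers `infer_types` and `match` are all hypothetical, and a real deployment would rely on a triple store and SPARQL instead.

```python
# Toy sketch: all example.org URIs, class names, and facts here are hypothetical.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Asserted triples: (subject URI, predicate URI, object URI).
TRIPLES = {
    (EX + "receptorA", RDF_TYPE, EX + "Receptor"),
    (EX + "Receptor", SUBCLASS_OF, EX + "MembraneProtein"),
    (EX + "MembraneProtein", SUBCLASS_OF, EX + "Protein"),
}


def infer_types(triples):
    """Apply (x type C), (C subClassOf D) => (x type D) until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, p2, d in list(inferred):
                new = (s, RDF_TYPE, d)
                if p2 == SUBCLASS_OF and c == o and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


def match(triples, s=None, p=None, o=None):
    """Return sorted triples matching a pattern; None plays the role of a query variable."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


INFERRED = infer_types(TRIPLES)
BEFORE = match(TRIPLES, p=RDF_TYPE, o=EX + "Protein")
AFTER = match(INFERRED, p=RDF_TYPE, o=EX + "Protein")
print("before inference:", [t[0] for t in BEFORE])
print("after inference: ", [t[0] for t in AFTER])
```

Before inference, the query for instances of `Protein` returns nothing. After inference it finds `receptorA`, which is the sense in which inferred triples make queries more expressive.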
9 Conclusions
- The Semantic Web provides the ability to query across many disparate data sources to discover new insights
- Potential to identify patterns and insights across many data sources
- Data needs to be carefully modeled
- Flexible re-use of data, which is important in a discipline where knowledge is frequently updated
10 Acknowledgements
- HCLS Demo Contributors
- John Barkley (NIST)
- Olivier Bodenreider (NLM, NIH)
- Bill Bug (Drexel University College of Medicine)
- Huajun Chen (Zhejiang University)
- Paolo Ciccarese (SWAN)
- Kei Cheung (SenseLab, Yale)
- Tim Clark (SWAN)
- Don Doherty (Brainstage Research Inc.)
- Kerstin Forsberg (AstraZeneca)
- Ray Hookaway (HP)
- Vipul Kashyap (Partners Healthcare)
- June Kinoshita (AlzForum)
- Joanne Luciano (Harvard Medical School)
- Scott Marshall (University of Amsterdam)
- Chris Mungall (NCBO)
- Eric Neumann (Teranode)
- Eric Prud'hommeaux (W3C)
- Jonathan Rees (Science Commons)
- Susie Stephens (Eli Lilly)
- Mike Travers (
- Gwen Wong (SWAN)
- Elizabeth Wu (SWAN)
- Data Providers
- Judith Blake (MGD)
- Mikail Bota (BAMS)
- David Hill (MGD)
- Oliver Hoffman (CL)
- Minna Lehvaslaiho (CL)
- Colin Knep (Alzforum)
- Maryanne Martone (CCDB)
- Susan McClatchy (MGD)
- Simon Twigger (RGD)
- Allen Brain Institute
- Vendor Support
- OpenLink: Kingsley Idehen, Ivan Mikhailov, Orri Erling, Mitko Iliev
- HP: Ray Hookaway, Jeannine Crockford
11 (Diagram: data sources and the entity types each contributes)
- NeuronDB: Protein (channels/receptors), Neurotransmitters, Neuroanatomy, Cell, Compartments, Currents
- PDSPki: Proteins, Chemicals, Neurotransmitters
- Reactome: Genes/proteins, Interactions, Cellular location, Processes (GO)
- GO: Molecular function, Cell components, Biological process, Annotation gene, PubMedID
- BAMS: Protein, Neuroanatomy, Cells, Metabolites (channels), PubMedID
- BrainPharm: Drug, Drug effect, Pathological agent, Phenotype, Receptors, Channels, Cell types, PubMedID, Disease
- Allen Brain Atlas: Genes, Brain images, Gross anatomy -> neuroanatomy
- Entrez Gene: Genes, Protein, GO, PubMedID, Interaction (g/p), Chromosome, C. location
- Antibodies: Genes, Antibodies
- MESH: Drugs, Anatomy, Phenotypes, Compounds, Chemicals, PubMedID
- NC Annotations: Genes/Proteins, Processes, Cells (maybe), PubMedID
- PubChem: Name, Structure, Properties, Mesh term
- Mammalian Phenotype: Genes, Phenotypes, Disease, PubMedID
- Homologene: Genes, Species, Orthologies, Proofs
- SWAN: PubMedID, Hypothesis, Questions, Evidence, Genes
- AlzGene: Gene, Polymorphism, Population, Alz Diagnosis
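Entity types that appear in more than one source are the natural points at which the data sets can be joined. The sketch below computes those overlaps for a handful of the sources on this slide. It rests on assumptions: the source-to-type assignment is one reading of the flattened diagram, the type names are normalized by hand (for example "Proteins" and "Protein" both become `protein`), and `shared_entity_types` is a hypothetical helper rather than part of the demo.

```python
from itertools import combinations

# Subset of the slide's sources with hand-normalized entity types.
# Both the assignment and the normalization are assumptions of this sketch.
SOURCES = {
    "NeuronDB": {"protein", "neurotransmitter", "neuroanatomy"},
    "PDSPki": {"protein", "chemical", "neurotransmitter"},
    "Entrez Gene": {"gene", "protein", "pubmed_id"},
    "Allen Brain Atlas": {"gene", "neuroanatomy"},
    "AlzGene": {"gene", "polymorphism", "population"},
}


def shared_entity_types(sources):
    """Return {(source_a, source_b): shared types} for every overlapping pair."""
    links = {}
    for a, b in combinations(sorted(sources), 2):
        common = sources[a] & sources[b]
        if common:
            links[(a, b)] = common
    return links


LINKS = shared_entity_types(SOURCES)
for (a, b), common in LINKS.items():
    print(f"{a} <-> {b}: {', '.join(sorted(common))}")
```

Pairs with no shared type, such as AlzGene and PDSPki in this subset, can only be connected indirectly through a third source, which is one reason the integration relies on many data sets at once.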