The Bioinformatics of Small Molecules

1
The Bioinformatics of Small Molecules
  • David Wishart
  • University of Alberta
  • david.wishart_at_ualberta.ca

VanBUG Oct. 13, 2005
2
25,000
metabolite
3
Why Are Small Molecules Important?
  • Constituents of all macromolecules (DNA, RNA,
    protein, carbohydrates)
  • Serve as cofactors and signaling molecules to
    1000s of proteins
  • The chemistry part of biochemistry
  • 99% of all drug entities and 90% of all drug
    types are small molecules
  • 90% of all biomarkers used in clinical chemistry
    are small molecules

4
Small Molecules sit on top of the Pyramid of Life
  • Metabolomics: 1400 Chemicals
  • Proteomics: 3000 Enzymes
  • Genomics: 25,000 Genes
5
Molecular Informatics
  • Cheminformatics: 1400 Chemicals
  • Bioinformatics: 3000 Enzymes
  • Bioinformatics: 25,000 Genes
6
Cheminformatics vs. Bioinformatics
  • Cheminformatics: The application of information
    technology to the study, analysis, distribution
    and archiving of chemical data
  • Bioinformatics: The application of information
    technology to the study, analysis, distribution
    and archiving of molecular biological data

7
Two Solitudes
Bioinformatics
Cheminformatics
8
Cheminformatics vs. Bioinformatics
Cheminformatics
  • Established in the 1960s
  • Designed for the needs of organic chemists
  • User-pay, limited public access
  • Funded by large companies (MDL, Beilstein, Sigma,
    CAS)
Bioinformatics
  • Established in the 1990s
  • Designed for the needs of molecular biologists
  • Web-based, open access model
  • Funded by large govt agencies (NCBI, EBI, NIH,
    GC)

9
Blurring the Borders
2000
2005
Metabolomics
Systems Biology
Proteomics
Genomics
10
What's Driving This?
NIH Roadmap
11
What's Driving This?
  • Government-funded drug discovery and drug research
  • Drive to find newer and better clinical
    biomarkers
  • Molecular imaging (fMRI, PET)
  • Biosimulation and improved modeling of metabolic
    pathways
  • Extending the past success of biology's open data
    access model to chemistry

12
Major New Initiatives
  • PubChem: NIH/NCBI initiative
  • BIND/SMID: Genome Canada initiative
  • KEGG: Japanese initiative
  • ChEBI: European initiative
  • DrugBank: U of Alberta
  • Human Metabolome Project: U of Alberta
  • SimCell: U of Alberta

Primary Focus on Databases
13
PubChem
http://pubchem.ncbi.nlm.nih.gov/
14
PubChem
  • Released Sept. 16, 2004
  • Part of the NIH Molecular Libraries Roadmap, led
    by Steve Bryant
  • Contains more than 850,000 molecules
  • 3 linked databases: Compound, Substance and
    BioAssay
  • Links out to PubMed abstracts, NCBI 3D structures
    and other Entrez resources

15
PubChem Details
16
NIH vs ACS
17
BIND Small Molecules
www.bind.ca
18
SMID: Small Molecule Interaction Database
http://smid.blueprint.org/
19
BIND/SMID
  • Shows links (and mol. contacts) between small
    molecules and the macromolecules to which they
    bind
  • Extracted from PDB data
  • Supports search by SMID ID, Protein GI, PDB ID,
    Domain ID, Taxonomy
  • Supports BLAST sequence searches
  • SMID Genomes lists putative ligand interactions
    based on SMID/SMID BLAST

20
SMID Genomes
21
KEGG: Kyoto Encyclopedia of Genes and Genomes
http://www.genome.jp/kegg/
22
KEGG
  • First small molecule database, established in
    1996
  • Links small molecules to EC data and known
    pathways
  • Source data for many other small molecule
    databases and tools
  • Provides limited linkage between small molecules
    and the enzymes they interact with

23
KEGG Contents
  • PATHWAY: 29,921 pathways generated from 246
    reference pathways
  • GENES: 1,138,129 genes in 31 eukaryotes, 241
    bacteria and 24 archaea
  • LIGAND: 12,973 compounds, 2,469 drugs, 11,148
    glycans, 6,442 reactions
  • BRITE: 7,526 KO (KEGG Orthology) groups

24
ChEBI: Chemical Entities of Biological Interest
http://www.ebi.ac.uk/chebi/
25
ChEBI
  • Includes 5719 compounds and other molecular
    entities that are either products of nature or
    synthetic products used to intervene in the
    processes of living organisms
  • Derived from KEGG Ligand, IntEnz and Chemical
    Ontology
  • Provides structures, names, synonyms, InChI,
    SMILES, ontology, Registry #s

26
Major New Initiatives
  • PubChem: NIH/NCBI initiative
  • BIND/SMID: Genome Canada initiative
  • KEGG: Japanese initiative
  • ChEBI: European initiative
  • DrugBank: U of Alberta
  • Human Metabolome Project: U of Alberta
  • SimCell: U of Alberta

27
DrugBank
http://redpoll.pharmacy.ualberta.ca/drugbank/
28
DrugBank
  • A freely accessible, web-enabled, fully queryable
    database that links drug structure/activity data
    with protein structure/function/sequence data
  • Brings well-developed bioinformatics concepts of
    search and comparison to medicinal chemistry
  • Links bioinformatics, proteomics and drug
    discovery together

29
DrugBank
  • Contains nomenclature, synthesis,
    structure/activity, physical chemistry info on
    1000 FDA approved drugs
  • Contains nomenclature, structure, sequence,
    pharmacology, drug metabolism info on
    corresponding biomolecular targets
  • Wrapped with extensive querying and search tools

30
DrugBank Browser
31
DrugCard Links
32
Query Tools
PharmaBrowse
ChemQuery
33
Query Tools
SeqSearch
DataExtractor
34
DrugBank Stats
35
DrugBank Applications
  • Newly sequenced proteomes can be analyzed
    automatically for similarities to existing drug
    targets, giving researchers quick lead ideas
  • Newly determined protein structures can be
    Autodocked to a large 3D structure database of
    known, well-behaved compounds to suggest lead
    ideas

36
DrugBank Applications
  • Newly synthesized or identified lead compounds
    can be compared to existing structures to
    assess/predict possible efficacy, cross
    reactivity, metabolism or physical properties
  • Existing drugs can be compared or analyzed for
    key trends, properties or features to help in
    drug design and drug synthesis efforts
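
The structure comparison described above can be illustrated with a small sketch. The slides do not say which similarity measure DrugBank uses, so the Tanimoto coefficient here is an assumption, chosen because it is a common way to compare compounds by shared structural features. The function names, compound names and feature sets are all hypothetical and are not DrugBank data.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two sets of structural features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scores = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets, invented for illustration only.
library = {
    "compound_A": {"aromatic_ring", "carboxylic_acid", "ester"},
    "compound_B": {"aromatic_ring", "amide"},
    "compound_C": {"amine", "thioether"},
}
query = {"aromatic_ring", "carboxylic_acid", "hydroxyl"}

ranking = rank_by_similarity(query, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A new lead compound that scores highly against a known drug would then be a candidate for closer inspection of that drug's recorded properties.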

37
Major New Initiatives
  • PubChem: NIH/NCBI initiative
  • BIND/SMID: Genome Canada initiative
  • KEGG: Japanese initiative
  • ChEBI: European initiative
  • DrugBank: U of Alberta
  • Human Metabolome Project: U of Alberta
  • SimCell: U of Alberta

38
Human Metabolome Database
www.hmdb.ca
39
HMDB
  • A web-accessible database that links endogenous
    human metabolite data to genes and diseases
  • Brings phys/chem data, structure data,
    spectroscopic data, concentration data, disease
    data and molecular biology data (SNPs, sequences,
    EC, GenBank, UniProt, GO, reactions, pathway,
    KEGG) into single repository

40
The HMDB MetaboCard
53
Major New Initiatives
  • PubChem: NIH/NCBI initiative
  • BIND/SMID: Genome Canada initiative
  • KEGG: Japanese initiative
  • ChEBI: European initiative
  • DrugBank: U of Alberta
  • Human Metabolome Project: U of Alberta
  • SimCell: U of Alberta

54
Biosimulation: Three Types of Simulation
  • Atomic Scale: 0.1 - 1.0 nm; coordinate data, dynamic data;
    0.1 - 10 ns; molecular dynamics
  • Meso Scale: 1.0 - 10 nm; interaction data (Kon, Koff, Kd);
    10 ns - 10 ms; mesodynamics
  • Continuum Model: 10 - 100 nm; concentrations, diffusion rates;
    10 ms - 1000 s; fluid dynamics
55
Nationalism in Simulation
  • Petri Nets: Germany, Japan
  • Flux-Balance Analysis: USA
  • Pi Calculus: France
  • ODEs and PDEs: Japan, UK
  • Agent-Based methods (CA): Canada

56
CA Methods in Games
SimCity 2000
The Sims
57
Dynamic Cellular Automata
  • A novel method to apply Brownian motion to
    objects in the Cellular Automata lattice (mimics
    collisions)
  • Takes advantage of the scale-free nature of
    Brownian motion and the scale-free nature of
    heterogeneous mixtures to allow simulations to
    span many orders of time (nanosec to hours) and
    space (nanometers to meters)
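
One way to read the scale-free remark above is that a lattice random walk has a mean squared displacement that grows linearly with the number of steps, whatever physical length or time one step stands for. The following is a minimal sketch of that property, not SimCell's implementation. The function name and parameter values are hypothetical.

```python
import random


def mean_squared_displacement(n_walkers, n_steps, seed=0):
    """Average squared distance from the origin after n_steps unit moves
    on a square 2D lattice, taken over n_walkers independent walkers."""
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    total = 0
    for _ in range(n_walkers):
        x = y = 0
        for _ in range(n_steps):
            dx, dy = rng.choice(moves)
            x += dx
            y += dy
        total += x * x + y * y
    return total / n_walkers


# The ratio MSD / steps should stay close to 1 (in lattice units) at every scale.
for steps in (10, 100, 1000):
    msd = mean_squared_displacement(1000, steps)
    print(steps, round(msd / steps, 2))
```

Because only the ratio matters, the same walk can be read as nanometers and nanoseconds or as much coarser units, as long as the lattice spacing and time step are rescaled together.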

58
SimCell
http://wishart.biology.ualberta.ca/SimCell/
59
SimCell
  • Java application that uses Dynamic Cellular
    Automata (DCA) to model motions, interactions,
    transport and transformations at the meso-scale
    (10^-8 to 10^-6 m)
  • Uses a square, 2D lattice to model processes;
    lattice squares are equivalent to 3x3 nm regions
  • Molecular objects are moved randomly and
    interactions determined according to a set of
    interaction rules that are only applied when
    objects are in contact (collision detection)
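
The scheme in these bullets (random moves on a square 2D lattice, with rules applied only on contact) can be sketched in a few lines. This is a toy illustration under stated assumptions, not SimCell's Java code. The periodic boundaries, the single rule "substrate touching an enzyme becomes product with probability p_react", and all names and parameter values are hypothetical.

```python
import random


def simulate(size=20, n_enzyme=10, n_substrate=60, steps=200, p_react=0.5, seed=1):
    """Toy dynamic cellular automaton; returns the product count after each step."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    rng.shuffle(cells)
    grid = {}  # maps an occupied lattice square to "E", "S" or "P"
    for pos in cells[:n_enzyme]:
        grid[pos] = "E"
    for pos in cells[n_enzyme:n_enzyme + n_substrate]:
        grid[pos] = "S"

    neighbours = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    history = []
    for _ in range(steps):
        # Movement: each object tries one random step and stays put if blocked.
        for pos in list(grid):
            dx, dy = rng.choice(neighbours)
            new = ((pos[0] + dx) % size, (pos[1] + dy) % size)
            if new not in grid:
                grid[new] = grid.pop(pos)
        # Interaction rule: applied only to substrates in contact with an enzyme.
        for pos, kind in list(grid.items()):
            if kind != "S":
                continue
            touching = any(
                grid.get(((pos[0] + dx) % size, (pos[1] + dy) % size)) == "E"
                for dx, dy in neighbours
            )
            if touching and rng.random() < p_react:
                grid[pos] = "P"
        history.append(sum(1 for k in grid.values() if k == "P"))
    return history


history = simulate()
print(history[::20])  # product count sampled every 20 steps
```

If each lattice square stands for a 3x3 nm region as on the slide, the default 20x20 grid would cover a 60x60 nm patch.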

60
Diffusion in Cytoplasm
61
Enzyme-Substrate Progress Curves
[Lactate] = L_0 (1 - e^{-kt})
pyruvate + NADH → lactate + NAD+
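
Assuming the curve on this slide is the first-order form [Lactate] = L_0 (1 - e^{-kt}), with L_0 the plateau concentration and k a rate constant, it can be evaluated as in the sketch below. The function names and the numerical values of L_0 and k are hypothetical, chosen only to show the shape of the curve.

```python
import math


def lactate(t, l0, k):
    """Lactate concentration at time t for a first-order progress curve."""
    return l0 * (1.0 - math.exp(-k * t))


def half_time(k):
    """Time at which the curve reaches half of its plateau value."""
    return math.log(2) / k


l0, k = 1.0, 0.1  # hypothetical plateau (arbitrary units) and rate constant (1/s)
for t in (0, 5, 10, 20, 50):
    print(t, round(lactate(t, l0, k), 3))
print("half-time:", round(half_time(k), 2))
```

Fitting k to simulated or observed time points would be the natural next step for comparing the two curves.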
62
The TCA Cycle SimCell
Acetate
Acetyl-CoA
Glycerol
Pyruvate
Oxaloacetate
Citrate
Isocitrate
L-Malate
α-Ketoglutarate
Fumarate
2
1
Succinate dehydrogenase
Succinate
Succinyl-CoA
63
Succinate Production
Observed Predicted (SimCell)
64
Glycerol Consumption
Observed Predicted (SimCell)
65
Summary
  • Small molecules are an integral part of genomics,
    proteomics and systems biology
  • Several drivers are pushing or merging
    cheminformatics into bioinformatics
  • New databases and new techniques are emerging to
    assist in drug discovery, toxicology, biomarker
    ID, cellular metabolism and cellular modelling
  • Bioinformatics → Biosimulation

66
Thanks
  • Craig Knox
  • Savita Shrivastava
  • An Chi Guo
  • Murtaza Hassanali
  • Zhan Chang
  • Jennifer Woolsey
  • Kevin Jewell
  • Dan Tzur
  • Kevin Jeroncic
  • Joey Cruz
  • David Arndt
  • David Block
  • Peter Tang
  • Russ Greiner

AICML
67
BioAssay Database
  • The BioAssay Database contains bioactivity
    screens of chemical substances described in
    PubChem Substance
  • Provides searchable descriptions of each
    bioassay, including descriptions of the
    conditions and readouts specific to a screening
    protocol