Title: Science In Silico
1. Science In Silico
- Carol J. Bult, Ph.D.
- Mouse Genome Informatics
- The Jackson Laboratory
- August 2002
2. http://www.nature.com/genomics/human/papers/409860a0_r_2.html
3. Genomes on Parade
From M. Gerstein, Yale University, 2001
4. Genome Sequencing Status 2002 (completed genomes)
- 16 Archaea
- 68 Bacteria
- 913 Viruses
- 8 Eukaryotes
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome
5. Now Available in Draft Form!
The Mouse Genome
http://www.ncbi.nlm.nih.gov
http://genome.ucsc.edu
http://www.ensembl.org
6. Genetics is key to human health. The Jackson Laboratory is key to genetics.
http://www.jax.org
7. Research Areas
- Cancer genetics
- Development and aging
- Immune system and blood disorders
- Neurological and sensory disorders
- Metabolic diseases
- Bioinformatics and statistical genetics
8. Why Mouse? Human and mouse genomes have conserved blocks of genetic material
(Figure labels: Human, Mouse)
Source: Lisa Stubbs, Lawrence Livermore National Lab
9. Why Mouse? Humans and mice share many of the same genes
Mouse-Human Comparative Map (2 cM around Acrb gene)
http://www.informatics.jax.org
10. Why Mouse? Humans and mice suffer from similar diseases
Species Concordance for Susceptibility Alleles for Hypertension
Hypertension QTL in Mouse and Human (B. Paigen and G. Churchill)
11. The Paradigm Shift in Biology
- "The new paradigm, now emerging, is that all the genes will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical. An individual scientist will begin with a theoretical conjecture, only then turning to experiment to follow or test that hypothesis."
- - Walter Gilbert. Toward a paradigm shift in biology. Nature 349:99 (1991).
12. http://cagle.slate.msn.com/news/gene/
13. What is Bio-Informatics?
- The information technologies that bring together
data, analytical software and methods, and people
to drive biological discovery
14. Data Integration is the Key to Making Sense of Sequence
http://www.informatics.jax.org
15. Exploring Biology on the Internet
- What diseases are known to be associated with specific genes?
- Where are the genes in the genome?
- What are the characteristics of these genes?
- - i.e., size and structure
- What are the known mutations in these genes?
- Where is the gene expressed?
- Do the mutations impact protein structure?
- Are there equivalent genes in other organisms?
- Are the genes in other organisms associated with similar biological or disease processes as in humans?
- Are there gene- and protein-specific reagents, like clones and antibodies, available to use in experimental approaches?
16. Concepts
- Basic genetics and molecular biology
- Connections between genes, variation, and disease
- Protein structure and function
- Comparative genomics
- Model organisms can yield insights into human biology and disease
- Orthology and paralogy
17. Not all databases are created equal...
- Primary Databases (GenBank, EMBL, DDBJ)
- - Original submissions by experimentalists
- - Database staff organize but don't add additional information
- Derivative Databases (NCBI, model organism databases, etc.)
- - Curated/expert review
- - Compilation and correction of data
- - Computationally derived
- - Combinations
- Sifted Databases (GeneCards)
- - Compilation of data from different derivative databases
18. Today's Focus
- Given a sequence, gene, or disease of interest, how do I use existing bioinformatics resources to find out the current state of knowledge?
19. Start with a Known Gene or Disease
21. http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=OMIM
22. p53
23. Information for this Gene in OMIM
Links to Gene Details
25. Survey of Mutations in this Gene
26. http://archive.uwcm.ac.uk/uwcm/mg/hgmd0.html
28. Link to NCBI's LocusLink
29. Links!!
http://www.ncbi.nlm.nih.gov/LocusLink/index.html
30. Map Details
32. http://genome.ucsc.edu
33. Links Again!!
Get the sequence of the Gene and corresponding Protein
34. http://bioinfo.weizmann.ac.il/cards/index.html
37. Links!!
http://www.ncbi.nlm.nih.gov/LocusLink/index.html
39. Details for the equivalent mouse gene
http://www.informatics.jax.org
40. Start with a Sequence
41. Get a sequence for this gene here
Or here
42. Sequence in FASTA format
Jim Kent's BLAT search tool: http://genome.ucsc.edu
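For readers who have not met FASTA before: a record is a header line beginning with ">" followed by one or more lines of sequence. The sketch below is a minimal illustration of reading that layout, not part of the original slides. The function name `parse_fasta` and the example record are hypothetical, and the sequence shown is made up rather than taken from any database.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence} for FASTA-formatted text lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; the header is everything after ">"
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence may be wrapped over several lines; join them
            records[header] += line
    return records


# Hypothetical record for illustration only (not a real database entry)
example = """>example_gene hypothetical record for illustration
ATGGAGGAGCCGCAG
TCAGATCCTAGC
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```

Text in this form is what sequence search tools generally expect to be pasted into their query box.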
44. http://genome.ucsc.edu
45. OMIM
LocusLink
PubMed
GeneCards
MGI
46. Final Thoughts
- Biological databases are MUCH more interconnected than they used to be
- Easier to navigate to and from related data
- But can be confusing: the same information is presented in multiple ways. Easy to get lost in a maze of info.
- Don't assume that because you find information at a particular resource, that resource is the one that produced or curated it!
- Many information resources (like GeneCards) sift data from many databases and re-display it
- Proper attribution for electronic resources is as important as for published information
47. Human Genome Project Information
http://www.ornl.gov/hgmis/
GOOGLE!
http://www.google.com
48. http://cagle.slate.msn.com/news/gene/