Introduction to Genomics and Proteomics - Historical Perspective and the Future
Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C)

Transcript and Presenter's Notes

1
Introduction to Genomics and Proteomics - Historical Perspective and the Future
Eleftherios P. Diamandis, M.D., Ph.D., FRCPC (C)
UNIVERSITY OF TORONTO
(Course 1505S / Jan. 9, 2001)
2
Organization of the Lecture
- Historical Background
- The Human Genome Project
- Critical Technologies
  - Massive, automated sequencing
  - DNA and RNA analysis
  - Mass spectrometry
  - DNA and protein microarrays
  - Bioinformatics
  - Single nucleotide polymorphisms
- Applications
  - Diagnostics
  - Therapeutics
  - Pharmacogenetics
- Ethics
- Patents

3
Historical Milestones
Year      Milestone
1866      Mendel's discovery of genes
1871      Discovery of nucleic acids
1951      First protein sequence (insulin)
1953      Double helix structure of DNA
1960s     Elucidation of the genetic code
1977      Advent of DNA sequencing
1975-79   First cloning of human genes
1986      Fully automated DNA sequencing
1995      First whole genome (Haemophilus influenzae)
1999      First human chromosome (Chr 22)
2000      Drosophila / Arabidopsis genomes
2001      Human and mouse genomes
4
Terminology
DNA           Genomics
mRNA          Transcriptomics
Protein       Proteomics
Metabolites   Metabolomics
Functional genomics, proteomics, etc.
5
History
On June 26, 2000, at the White House, it was announced that the Human Genome Project was essentially completed by:
- Celera Genomics (private company)
- The National Human Genome Research Institute and its International Partners (publicly funded)
Work has yet to be published, but Celera scientists submitted a paper to Science on December 6, 2000.
7
Diagnostics / Prognostics
- Does my DNA predispose me to a specific disease?
- Do I want to know? (Ethics)
- Genetic mutations and disease: cancer, diabetes, Alzheimer's, heart disease
- Whole genome scans for identification of mutations/polymorphisms?

AACC2000-2 - 1

8
Pharmacogenetics and Pharmacogenomics
Goal is to associate human sequence polymorphisms with:
- Drug metabolism
- Adverse effects
- Therapeutic efficacy
Resulting benefits:
- Decrease drug development cost
- Optimize selection of clinical trial participants
- Increase patient benefit
9
Critical Protein Technologies
Protein:
- Make pure form (recombinant)
- Activity
- Reagents (antibodies)
- Identification (sequencing)
- Identify post-translational modification (glycation, phosphorylation, etc.)
- Protein-protein interactions (physiological function)
- Gene protein knockout / transgene
10
Models of Human Disease
- Identify natural human knockouts
- Develop mice with every gene (or gene combination) being knocked out (this project is now underway!)
11
Expressed Sequence Tags (ESTs)
- Cloned cDNAs from various tissues (cDNA libraries)
- Can search through them by BLAST analysis
- Can purchase them, fully sequence and characterize them
- Great help for new gene identification
12
Gene Patents
- Gene fragments
- Whole genes without function
- Whole genes with function
- Whole genes with function and utility (enablement)
13
Where Do We Stand Today? (July 2000)
Public Consortium: 85% of genome is done
- 24% finished form
- 22% near finished
- 38% draft
- rest is being done
Celera: Claims to have more than 99% of genome now!
Incyte: They may have all the genes!
14
Where Does the Individual Researcher Stand?
At the end of the day, each gene must be looked at in great detail:
- structure
- function
- physiology/pathways
- pathophysiology
- connection to disease
- tools
Individual researchers can make the big discoveries on a very specific gene or a very specific gene family.
Great time for individual researchers.
15
The Future of Genome Projects
- Human
- Mouse (just started)
- Rat
- Zebrafish
- Dog
- Other primates
The Era of Comparative Genomics (you can learn a lot about humans by studying yeast, Drosophila, mouse, etc.)
16
The Impact of the Human Genome Project in Medicine
- You can't make a car if you are missing parts
- Once all genes are known, we will start understanding their function: PATHWAYS
- We will then be able to correlate disease states to certain genes (Pathobiology): DISEASE -> GENE(S), GENE(S) -> DISEASE
- We will then find ways for rational treatments (designer drugs), prevention, diagnosis
17
Gene Manipulation (Ethics??)
- Gene modulation (regulation)
- Gene repair
- Gene excision
- Gene replacement/transplantation
- Gene improvement
18
Celera's Whole Genome Shotgun Strategy
- Does not use BAC clones; cuts whole DNA into millions of pieces which are sequenced
- Computer assembles pieces together
- Achieves high accuracy with 6X coverage
- Lots of relatively short gaps
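The link between coverage and leftover gaps can be sketched with a simple Poisson model. This is an idealization that the slide does not state: it assumes reads land uniformly at random across the genome, in which case the expected fraction of bases hit by no read at coverage c is e^(-c). The genome size is the 3 x 10^9 figure quoted in this lecture; the function name is only illustrative.

```python
import math

GENOME_SIZE = 3e9  # base pairs, the figure quoted in this lecture


def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases hit by zero reads under a Poisson model."""
    return math.exp(-coverage)


for c in (1, 4, 6, 8):
    frac = uncovered_fraction(c)
    print(f"{c}x coverage: {frac:.4%} uncovered, about {frac * GENOME_SIZE:,.0f} bases")
```

Under these assumptions, 6X coverage leaves roughly a quarter of one percent of bases unsampled, which is consistent with the slide's "lots of relatively short gaps". Real libraries are not uniform, so actual gaps will differ.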
19
Strategy to Sequence Human Genome
- Construct a human genomic library in an appropriate vector (BAC)
- Assemble overlapping BAC clones in order to obtain full coverage of the distance (restriction map)
[Diagram: DNA and overlapping BAC clones]
- Start sequencing each BAC until you finish the job
20
How are these BACs Sequenced? Shotgun Sequencing
- BAC clone is broken down to small pieces which have overlapping ends
- Small pieces are sequenced and a computer assembles the pieces based on the overlapping sequence information
- Construct contigs (contiguous areas of sequence)
- Larger contigs
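The assembly step described above can be illustrated with a toy greedy merger. This is a minimal sketch, not the software used by either sequencing effort: it assumes error-free reads from one strand, ignores repeats and fragments contained inside other fragments, and the function names and the `min_len` threshold are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair with the largest overlap; return the remaining contigs."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

With these three made-up reads, the overlapping ends join into a single contig. Fragments that share no overlap of at least `min_len` bases stay as separate contigs, which is the toy equivalent of a gap.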
21
Other Important Genomic Technologies
- Recombinant DNA (cloning)
- PCR
- Pulsed Field Gel Electrophoresis (PFGE)
- Chromosome microdissection
- Somatic hybrid cell lines (mapping): rodent x human
- Radiation hybrid cell lines: rodent x human
- DNA sequencing
22
Annotation
What is annotation?
- Make sense out of a linear sequence: identify genes, intron/exon boundaries, regulatory sequences, predict protein structure, identify motifs, predict function, etc.
- Annotation will likely go on for a few years.
- Major annotation tool => BIOINFORMATICS (hardware, software)
23
Celera Genomics
- The publicly funded project started around 1990 with a goal to produce a highly accurate sequence by 2005
- Celera started in 1998 and within 2 years sequenced more DNA than the publicly funded consortium!
Why?
- No bureaucracy
- Facility (300 sequencers x 24h/day)
- Powerful supercomputer
- Lots of money
- More efficient sequencing approach (no BACs necessary)
- Use of data from the publicly funded project
24
Cloning Vectors
Replicable units of DNA which can carry exogenously inserted DNA; size of insert varies with vector type:
Vector                                             Insert size
plasmid                                            5-10 kb
lambda phage                                       20 kb
cosmid                                             45 kb
PAC/BAC (P1- or bacterial artificial chromosome)   100-200 kb
YAC (yeast artificial chromosome)                  1,000 kb
25
Human Genome
- 3 x 10^9 base pairs
- Approximately 100,000 genes
- < 10% of DNA encodes for genes; the rest represents introns/repetitive elements
- Importance of non-coding sequences currently not understood
26
Quality of Sequencing
Clones are sequenced many times over to verify the sequence:
- x 4 coverage: rough draft, 1 error per 100 bases
- x 8-11 coverage: finished draft, 1 error per 10,000 bases
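To put the two error rates in perspective, they can be scaled to the 3 x 10^9 base pair genome size quoted elsewhere in this lecture. This is plain arithmetic under the simplifying assumption that the quoted rate applies evenly to every base; the function name is illustrative.

```python
GENOME_SIZE = 3_000_000_000  # base pairs, the figure quoted in this lecture


def expected_errors(genome_size: int, bases_per_error: int) -> int:
    """Expected number of wrong bases if one error occurs every `bases_per_error` bases."""
    return genome_size // bases_per_error


rough_draft = expected_errors(GENOME_SIZE, 100)
finished = expected_errors(GENOME_SIZE, 10_000)
print(f"rough draft: about {rough_draft:,} errors")
print(f"finished:    about {finished:,} errors")
```

On these figures a rough draft would carry on the order of 30 million miscalled bases against about 300,000 for finished sequence, a hundredfold difference.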
27
The Next Race
- It will not be who has the sequence
- It will be how you can use the sequence to arrive at products: DIAGNOSTICS, THERAPEUTICS
28
Genomics and Drug Discovery
Genomic technologies are involved in all aspects of the drug discovery process, from target validation through to the marketed drug, which include:
- Molecular target identification
- Drug target characterization and validation
- Lead discovery
- Lead optimization
- Clinical candidate to marketed drug
29
Key Corporate Players in Proteomics
Company                        Location              Approach
Celera                         Rockville, MD         Databases
Incyte Pharmaceuticals         Palo Alto, CA         Databases
GeneBio                        Geneva, Switzerland   Databases
Proteome Inc.                  Beverly, MA           Databases
PE Biosystems                  Framingham, MA        Instrumentation
Ciphergen Biosystems           Palo Alto, CA         Protein arrays
Oxford GlycoSciences           Oxford, UK            2D gel/MS
Protana                        Odense, Denmark       2D gel/MS
Genomic Solutions              Ann Arbor, MI         2D gel/MS
Large Scale Proteomics Corp.   Rockville, MD         2D gel/MS
2D gel/MS: 2D gel electrophoresis and mass spectrometry
30
Pharmacogenetics and Pharmacogenomics in Drug Discovery
Aspect of Drug Development   Approach
Drug-drug interactions       Examine polymorphism in metabolic enzymes
Efficacy                     Differentiate responders from nonresponders
Side effects                 Examine variation in gene or genes involved in mediating the effects (may be mechanism related or unrelated)
Toxicity                     Gene expression profiling in cells treated with compound. Look for toxicity signatures.
31
The Biography of the Year 2000 (Francis Collins and J. Craig Venter)
32
Creating an Array of Contiguous BAC Clones
33
The .omics
40
Predicting the Future
- What is going to happen now that the human and other genomes are completed?
- How quickly will the next steps happen?
- What are the potential difficulties?
- Are we expecting too much?
(Course 1505S - Jan. 15/01)
41
Grand Plan
- Find all the genes
- Translate genes to proteins
- Compute function by similarity search and comparison to known proteins
- Compute structure
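The "translate genes to proteins" step can be sketched as a lookup in the standard genetic code. This is a minimal illustration, not a tool from the lecture: it assumes an already spliced, in-frame coding sequence on the forward strand, and the example sequence is made up.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding DNA sequence codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for codons with unknown bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))
```

The hard part of the grand plan is upstream of this lookup: deciding where the coding sequence starts, stops and is spliced, which the later slides identify as unreliable.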
42
Difficulties
- Gene prediction programs are unreliable
- Function inference by just similarity search may be fallacious
- Computation of structure is still unreliable
- Our databases may get contaminated with wrong information
43
Gene Prediction
- Programs were designed based on knowledge of already cloned genes (ORFs, splice sites, start/stop codons, etc.)
- These programs provide excellent clues for gene presence, but they never or rarely predict the complete gene structure
- The computer prediction must be taken as a starting point to experimentally clone a gene
- How many genes in the genome? Estimates range from 27,462 to 312,278!
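One of the signals listed above, an open reading frame bounded by start and stop codons, can be sketched as a simple scan. This is only an illustration of that single signal under stated assumptions: it looks at the forward strand alone, knows nothing about introns or splice sites, and the `min_codons` cutoff is a hypothetical parameter. Real gene prediction programs combine many more signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) indices of forward-strand ORFs running from ATG through a stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG after this stop
    return sorted(orfs)


sequence = "CCATGGCCAAGTAACC"  # made-up example
for begin, end in find_orfs(sequence):
    print(begin, end, sequence[begin:end])
```

A hit from a scan like this is a clue, not a gene model, which matches the slide's point that predictions are a starting point for experimental cloning.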
44
What is a Gene?
- Heritable unit corresponding to a phenotype?
- DNA that encodes for a protein?
- DNA that encodes RNA?
- What if RNA is not translated?
- What if a gene is not expressed?
45
Prediction of Function
What is function? This is not a simple term.
Function may be:
- a biological process (e.g. serine protease activity)
- a molecular event (e.g. proteolysis of a specific substrate)
- a cellular structure (e.g. membrane, chromatin, mitochondrion, etc.)
- relevance to a whole process (e.g. cell cycle)
- relevance to the whole organism (e.g. ovulation)
Some scientists have now initiated projects to compute function of whole organisms.
46
Pattern Recognition
Looks for motifs that may have functional relevance (family signatures):
- Membrane anchoring
- Catalytic site
- Nucleotide binding
- Nuclear localization signal
- Hormone response element
- Calcium binding, etc.
Protein family resources (being created now)
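A motif search of this kind can be sketched with a regular expression. The example below uses the Walker A (P-loop) nucleotide-binding consensus, written [AG]-x(4)-G-K-[ST] in PROSITE notation, as one instance of the "nucleotide binding" category; the choice of motif, the function name and the test peptide are illustrative and not taken from the lecture.

```python
import re

# Walker A / P-loop consensus [AG]-x(4)-G-K-[ST], rewritten as a regular expression.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_motif(protein: str, pattern: re.Pattern = P_LOOP) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit in the sequence."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


example = "MTEYKLVVVGAGGVGKSALT"  # short illustrative peptide
print(find_motif(example))  # [(9, 'GAGGVGKS')]
```

A short pattern like this will also match by chance in unrelated proteins, so a hit suggests a function to test and does not establish it.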
47
Homology
What is homology?
Definition: Two proteins are homologous if they are related by divergence from a common ancestor.
[Diagram: divergent evolution from ancestor A to B, C and D, which are homologous]
48
Analogy
What is analogy?
Definition: Two proteins are analogous if they acquired common structural and functional features via convergent evolution from unrelated ancestors.
[Diagram: convergent evolution; A, B, C, D; unrelated ancestors; analogous proteins (similar structure and/or function)]
49
Serine Proteases (Convergent Evolution)
- Trypsin-like and subtilisin-like: analogous proteins
- Each family has many homologous members
Trypsin and subtilisin share groups of catalytic residues with almost identical spatial geometries, but they have no other sequence or structural similarities.
50
Human Kallikrein Gene Family (Divergent Evolution)
- 15 homologous genes on human chromosome 19q13.4
- Divergence in tissue expression and substrate specificity
51
Orthologs
Proteins that usually perform the same function in different species (e.g. DNA polymerase, glucose 6-phosphate dehydrogenase, retinoblastoma gene, p53, etc.).
Paralogs
Proteins that perform different but related functions within one organism, usually formed by gene duplication and divergent evolution (e.g. the 15 kallikrein genes mentioned above).
52
Functional Annotation - Difficulties
- Who knows if the best matches in a database query are really orthologs or paralogs?
- Modules: building blocks of proteins. Finding a module in a protein does not mean that a function can be assigned, since these modules do not always perform the same function.
Aphorism: The properties of a system can be explained by, but not deduced from, those of its components.
53
Structure Prediction
- How proteins fold in 3D space
- We still cannot reliably compute structures of > 100 amino acid proteins (ab initio methods)
- Experiment and computation: crystallography, NMR
54
Future
- Lots of rigorous work needs to be done
- Holistic view
  -- regulation of gene expression
  -- metabolic pathways
  -- signaling cascades
Remember: Proteins do not work in isolation but within integrated networks.
55
The Importance of Accurate Functional Annotation
- Function in whole organisms is complex and interrelated
- Need for close collaboration between:
  - software developers
  - annotators
  - experimentalists
- Holistic approaches needed for optimal knowledge-based inference and innovation (drugs, diagnostics, etc.)
56
How Protein Structure is Elucidated
57
Protein Annotation
58
Protein Annotation
59
PLANT GENOMES
Common name           Species                    Genome size (base pairs)
Brassicas
Thale cress           Arabidopsis thaliana       1.0 x 10^8
Oilseed rape/canola   Brassica napus             1.2 x 10^9
Cereals
Rice                  Oryza sativa               4.2 x 10^8
Barley                Hordeum vulgare            4.8 x 10^9
Wheat                 Triticum aestivum          1.6 x 10^10
Maize/corn            Zea mays                   2.5 x 10^9
Legumes
Garden pea            Pisum sativum              4.1 x 10^9
Soya bean             Glycine max                1.1 x 10^9
Solanaceae
Potato                Solanum tuberosum          1.8 x 10^9
Tomato                Lycopersicon esculentum    1.0 x 10^9
For comparison
Human                 Homo sapiens               3.2 x 10^9