Title: TuTh
Slide 1: Fall 2009 MB437/537 (3 credits) Molecular Evolution
ADVANCES IN Molecular Evolution
What are the latest theories on the origins of life?
What are genome sequencing projects teaching us about evolutionary complexity?
What are the bioethical implications of your future research?
From the Big Bang to Bioinformatics and Beyond
- Tu/Th
- 11:00 AM - 12:15 PM
Teach Evolution! Learn Science!
Professor Marcie McClure marsmcclure_at_gmail.com
MB537 SYLLABUS
Lecture 1, 9/1/09: Comments, organization, introduction.
Lecture 2, 9/3/09: Evolution: the big picture.
Lecture 3, 9/8/09: The Big Bang and formation of the elements necessary for life.
Lecture 4, 9/10/09: Biogenesis I: The primitive earth and the prebiotic soup.
Lecture 5, 9/15/09: Biogenesis II: Self-assembly, energetics and the protocell.
Lecture 6, 9/17/09: Biogenesis III: More on protocellular formation.
Lecture 7, 9/22/09: Biogenesis IV: Protein or nucleic acids first? RNA or DNA?
Lecture 8, 9/24/09: The RNA world, the three domains of life, and LUCA or LUCC.
Lecture 9, 9/29/09: Origin of the genetic code and more on LUCC.
Lecture 10, 10/01/09: Last day of LUCA; begin Genomes: Content and Architecture. Chap 8
Lecture 11, 10/6/09: Open discussion.
Lecture 12, 10/8/09: Mutation: nucleotide substitutions and amino acid replacements. Chap 1, 3
Lecture 13, 10/13/09: Methods: analyzing sequences, rates/patterns. Chap 1, 3-4
Lecture 14, 10/15/09: Molecular Phylogeny I: History, terms, definitions, and limits. Chap 5
Lecture 15, 10/20/09: Molecular Phylogeny II: How to determine a phylogenetic tree.
Lecture 16, 10/22/09: Molecular Phylogeny III: Improvements and extensions to genome trees.
Lecture 17, 10/27/09: What's new? Bayesian and HMM approaches to phylogenetic reconstruction.
Lecture 18, 10/29/09: Deviation from tree-like behavior: horizontal transmission of information.
Lecture 19, 11/3/09: EXAM
Lecture 20, 11/5/09: Convergent evolution: the antifreeze story.
Lecture 21, 11/10/09: Evolution of viruses.
Lecture 22, 11/12/09: Retroid agents: eukaryotic hosts and disease states.
Lecture 23, 11/17/09: Do viral RNA polymerases share ancestry?
Lecture 24, 11/19/09: Bioethics of the Human Genome Project / Introduction to bioinformatics.
Lecture 25, 11/24/09: Open discussion.
11/25-27/09: THANKSGIVING HOLIDAY
Lecture 26, 12/1/09
Lecture 27, 12/3/09
Lecture 28, 12/8/09
Lecture 29
Slide 3: Last Lecture
Finished off the C-value paradox
So-called junk DNA is mainly repetitive in
What is mutation? Nucleotide substitutions: silent (synonymous); transition vs. transversion
Classifications of mutations
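The transition vs. transversion distinction can be sketched in a few lines of Python. This is only an illustration, not part of the lecture material: the function names `classify_substitution` and `count_ts_tv` are hypothetical, and the sketch assumes two equal-length, already aligned DNA sequences with no gaps. A transition swaps a purine for a purine (A<->G) or a pyrimidine for a pyrimidine (C<->T); anything else is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'none', 'transition', or 'transversion' for a single site."""
    ref, alt = ref.upper(), alt.upper()
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if ref == alt:
        return "none"
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_ts_tv(seq_a: str, seq_b: str) -> dict:
    """Count transitions and transversions between two aligned, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a, seq_b):
        kind = classify_substitution(a, b)
        if kind != "none":
            counts[kind] += 1
    return counts
```

For example, `count_ts_tv("ACGT", "GCTT")` sees one A->G change (a transition) and one G->T change (a transversion).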
Slide 4: What about amino acid replacements?
Slide 8: McClure, 1995
Slide 10: Strategy for Assessing Protein Sequence Homology
Protein Sequence Data
>25% identical: homology
<25% identical:
Support for homology: statistical tests
OSM present, functionally equivalent: likely homologue
Functional identification, phylogenetic analysis, structural prediction
Support for homology: gene order and size, common function
McClure, 2000
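The first branch of this strategy, the 25% identity cut-off, can be sketched as follows. This is a minimal illustration under stated assumptions, not the published procedure: the function names `percent_identity` and `triage` are hypothetical, the two sequences are assumed to be already aligned, and positions where either sequence has a gap are simply skipped (other conventions for the denominator exist). Only the 25% threshold comes from the slide.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def triage(aligned_a: str, aligned_b: str, threshold: float = 25.0) -> str:
    """Sort a pair into the two top-level branches of the strategy."""
    if percent_identity(aligned_a, aligned_b) > threshold:
        return "above_threshold"  # identity alone supports homology
    # Below the threshold, further support is needed: statistical tests,
    # motifs, gene order and size, common function.
    return "needs_support"
```

A pair landing in `needs_support` is not rejected; it is passed on to the additional lines of evidence listed on the slide.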
Slide 11: Phylogenetic Reconstruction
1) Rates and patterns
2) History, terms, definitions and limits
3) How to determine a phylogenetic tree
4) Improvements to trees
5) Extensions to genome trees
Slide 12: Let's consider some basic questions
1) How do we measure the rate of nucleotide substitution?
2) Do all genes, or a subset thereof, evolve at the same rate?
3) Do homologous genes in different organisms evolve at the same rate?
4) Are rates different in nuclear, organellar and viral genomes?
Slide 14: 1) How do we measure the rate of nucleotide substitution?
a) r = K/(2T), where T is an external measure of divergence time
b) In general, seqs < 1-2% different may be too close to analyze; on the other hand, once seqs have diverged by 20-30%, substitution reaches saturation => analyze amino acids
c) Relative rate test
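The three points on this slide can be sketched in Python. This is a hedged illustration only: the function names are hypothetical, T is assumed to be the time since the two sequences diverged, K the number of substitutions per site between them, and the default cut-offs (2% and 20%) are just one choice from the ranges quoted above. The factor of 2 reflects that substitutions accumulate along both lineages after the split. For the relative rate test, C is assumed to be an outgroup to A and B; under equal rates the difference K_AC - K_BC should be close to zero.

```python
def substitution_rate(k: float, t: float) -> float:
    """r = K / (2T): substitutions per site per unit time along one lineage."""
    if t <= 0:
        raise ValueError("T must be positive")
    return k / (2.0 * t)


def divergence_regime(p: float, too_close: float = 0.02, saturated: float = 0.20) -> str:
    """Rough triage of a pairwise nucleotide difference p (as a fraction of sites)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if p < too_close:
        return "too close"    # too few differences to analyze
    if p >= saturated:
        return "saturated"    # consider analyzing amino acids instead
    return "informative"


def relative_rate_difference(k_ac: float, k_bc: float) -> float:
    """K_AC - K_BC, with C an outgroup; near zero if A and B evolve at equal rates."""
    return k_ac - k_bc
```

As a usage example, `substitution_rate(0.1, 5e6)` treats two sequences that differ by 0.1 substitutions per site and split 5 million years ago. The relative rate test needs no external date at all, which is why it is listed separately from the formula in a).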
Slide 17: What causes the variation in the rate of change in a protein?
In general, a functional or selective constraint is anything that keeps a gene or protein in a specific evolutionary state.
Formally, the selective constraint defines the range of alternative nucleotides that are acceptable at a given site without negatively affecting the function or structure of the gene.
What are the types of selection that can act on a
Slide 18: Neutral, positive, or negative selection?
- 1) Changes at nonsynonymous (NS) sites have a higher probability of leading to a change, perhaps deleterious, in a protein. In classical terms, selection against such changes is called purifying or negative selection.
- 2) The converse is also true: NS changes have a chance of improving function. When this happens it is called positive selection.
- 3) Changes at synonymous (S) sites do not change the protein.
- When there is no change in the protein, the change is called neutral.
- Evolution under the neutral theory predicts, as the null hypothesis, dS = dN (or Ks = Kn).
- 1) When dS > dN, or dS/dN > 1, purifying or negative Darwinian selection has occurred.
- 2) When dS < dN, or dS/dN < 1, positive Darwinian selection has occurred.
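The comparison of dS and dN can be written as a small decision function. This is a minimal sketch, assuming dN and dS have already been estimated by some other method; the function name `classify_selection` and the `tolerance` parameter are hypothetical. In practice the call would rest on a statistical test of whether the two values differ, not on a raw comparison.

```python
def classify_selection(d_n: float, d_s: float, tolerance: float = 0.0) -> str:
    """Compare nonsynonymous (dN) and synonymous (dS) rates.

    dS == dN (within tolerance) -> 'neutral'   (the null hypothesis)
    dS >  dN                    -> 'purifying' (negative selection)
    dS <  dN                    -> 'positive'  (positive Darwinian selection)
    """
    if d_n < 0 or d_s < 0:
        raise ValueError("rates must be non-negative")
    if abs(d_s - d_n) <= tolerance:
        return "neutral"
    return "purifying" if d_s > d_n else "positive"
```

For instance, `classify_selection(0.1, 0.5)` has dS well above dN and falls in the purifying case, while swapping the two arguments gives the positive case.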
Slide 19: Do rates of change vary among and between plastid versus nuclear genes?
1) Mammalian mito. genome: structurally stable, little variation from mammal to mammal
a) circular dsDNA genome, 15-17K BP, about 1/10,000 of the smallest animal genome
b) doesn't code for much, only unique sequences: 13 protein-coding genes, two rRNA gene clusters, 22 tRNAs and control regions for RX and TX
c) S-site rate about 10X that of nuclear genes; NS-site rates vary among proteins but are much greater than in nuclear genes
in contrast to
2) Plant mito. genome: structurally unstable, undergoes frequent rearrangement, duplication and deletion of genes
a) circular, linear or subgenomic circles; genome varies from 40K to 2,500K BP
b) coding content at a minimum: 3 rRNA clusters, unknown number of tRNAs, 15-30 proteins
c) lower S and NS rates than nuclear genes
in contrast to
3) Chloroplast genomes in vascular plants: structurally stable
a) circular genome, 120K-220K BP; both strands encode genes
b) tobacco chloroplasts: 37 tRNA genes, 8 of which have single introns; 8 rRNA clusters; 45 proteins (five of which have one intron and 2 of which have 2 introns)
c) lower rates than nuclear genes
The ratio of rates of change in plant mito, chloro and nuclear genes is 1:3:12
Slide 20: Is there any correlation between organellar genome stability and rates of change?
1) In mammals, mito DNA evolves very rapidly, but there is little spatial or size variation among
2) In plants, mito DNA evolves quite slowly, but there is significant rearrangement of the genome.
3) In chloroplasts, the rates of both evolution and structural rearrangement are quite low.
Slide 21: What is the Molecular Clock?
The rate of amino acid replacement in all lineages over evolutionary time is approximately constant.
Slide 23: Why did the molecular clock hypothesis stir such
- 1) A constant rate of protein evolution did not fit with ideas of the erratic tempo of evolution observed at the higher levels of complexity, i.e. morphological and physiological.
- 2) As observed after gene duplication, rates of change accelerate, as they do during periods of adaptive radiation.
- 3) Many studies on viral evolution demonstrate that the rate of accumulated mutation is a function of environmental condition.
Slide 24: Do homologous genes evolve at the same rate in different organisms?
The rate of nucleotide substitution is higher in African apes than in humans.
The rates of both nucleotide substitution and insertion/deletion are higher in rodents than in humans.
The generation-time effect.
Slide 25: Evaluation of the MC hypothesis
- 1) DNA sequence analyses from several orders of mammals do not support a global clock for the class Mammalia. Significant variance in substitution rates is found both within and between orders.
- 2) Variation between species also exists. Flies evolve at 5-10x the rate of vertebrates.
- Bottom line: basically there is no Molecular Clock, but there may be local clocks, given enough data.
Slide 26: Molecular clocks: four decades of evolution, Sudhir Kumar, Nature Reviews Genetics 6, 654-662 (August 2005)
Slide 27: What is the Molecular Clock?
Old assumption:
The rate of amino acid replacement in all lineages over evolutionary time is approximately constant.
New assumption:
The rate of amino acid replacement is variable within and between lineages over evolutionary time.
Slide 28: Important points: new molecular clocks or
The fossil record gives ONLY the date when fossilization occurred; this is not an evolutionary date.
New methods that allow for rate variability, that is, that allow for uncertainty, make molecular clocks more reliable.
Relaxed, even laid-back, analysis in the Bayesian mode must have multiple external lower bounds from the fossil record.
Molecular dates of divergence (molecular clocks) are often hundreds of millions of years older than the date of fossilization.
Slide 29: Palaeontological calibrations for the lineage splits between animals whose genomes have been fully sequenced. From Donoghue and Benton (2007)
Slide 30: Fig. 1. Divergence time estimates (Mya) among
eukaryotes, based on a Bayesian relaxed molecular
clock applied to 30,399 amino acid positions. The
topology is the highest-likelihood one, with
branch lengths proportional to the absolute ages
of the subtending nodes. Dictyostelium rooted the
tree but was pruned from dating analyses. White
rectangles delimit 95% credibility intervals on
node ages. Stars indicate the six nodes under
prior paleontological calibration (lower bound
only for the white star; lower and upper bounds
for black stars). Gray areas encompass the bounds
between which calibration nodes stand a
posteriori during the Bayesian search. Primary
and secondary plastid endosymbioses are
indicated, respectively, by the circled 1 and 2.
Double horizontal arrows and dotted lines
indicate the displacement of 95% credibility
intervals for eight selected nodes after
rerooting the tree along the kinetoplastid
branch. Paleozoic, Mesozoic, and Cenozoic are
indicated by I, II, and III, respectively.
Transitions between Meso-/Neo-Proterozoic and
Cambrian are indicated by vertical dashed lines.
The timing of eukaryotic evolution: Does a relaxed molecular clock reconcile proteins and fossils? Emmanuel J. P. Douzery, Elizabeth A. Snell, Eric Bapteste, Frédéric Delsuc, and Hervé Philippe. PNAS, October 26, 2004, vol. 101, no. 43