The Neutral Mutation Theory of Molecular Evolution - PowerPoint PPT Presentation

1
The Neutral Mutation Theory of Molecular Evolution
Gila Kaplan, gilakaplan_at_bigfoot.com
4th Lecture, November 15th 2009
Itai Yanai Department of Biology Technion
Israel Institute of Technology
2
The Neutral Theory of Molecular Evolution
  • The Mouse Genome
  • The Neutral Theory
  • The speed of molecular evolution
  • The constancy of molecular evolution
  • Neutral mutations and functional constraint
  • Polymorphisms within populations

Introducing today's hero
Motoo Kimura
3
The Mouse Genome
4
A shuffled genome
Over 90% of the genomes can be partitioned into
regions of synteny
342 segments, each > 300 kb, with conserved synteny
in human are superimposed on the mouse genome
The mouse genome has 2.5 gigabases (14% smaller
than the human).
MGSC Nature (2002) 420 520-562
5
Chromosome rearrangements can be retraced
Transform one genome into the other
The X chromosome
Pevzner 2003 Genome Research
6
The mouse genome
  • The most recent common ancestor of human and
    mouse lived about 75 million years ago.
  • Human and mouse genome sequences differ at nearly
    one out of every two nucleotides.
  • Less than 1% of the 30,000 mouse protein-coding
    genes do not have homologues in humans.

MGSC Nature (2002) 420 520-562
7
Looking at a multiple sequence alignment
Y16R_MYCIO_9-103   TFAIHTLGCKVNLFESNSIKNDLIMNGLVEVPFDSKADVYIINTCTVTN
Y474_AQUAE_2-97    KVAFETLGCRMNQFDTDLLKNKFIQKGYEVVSFEDMADVYVINTCTVTV
YQEV_BACSU_3-98    TVAFHTLGCKVNHYETEAIWQLFKEAGYERRDFEQTADVYVINTCTVTN
Q9ZIA1_2-97        KVAFITLGCRVNSYESEAMAEKFIKSGWEIVDNDEKADAYVINTCTVTN
Y416_RICPR_11-98   RQEIVTFGCRLNIYESEIIRKNLELSGLDN--------VAIFNTCAVTK
Y830_THEMA_3-96    TVRIETFGCKVNQYESEYMAEQLEKAGYVVLP-DGNAAYYIVNSCAVTK
Y285_HELPJ_3-94    KVYFKTFGCRTNLFDTQVMGENLKDFSATLEE--QEADIIIINSCTVTN
Q9V812_72-162      KVFVKTWGCAHNNSDSEYMAGQLAAYGYRLSG-KEEADLWLLNSCTVKN
Y826_METTH_5-99    RVYIETFGCTFNQADSEIMAGVLREEGAVLTG-IDDADVIIINTCYVKH
Y867_METJA_13-102  RVYVEGYGCVLNTADTEIIKNSLKKHGFEVVNNLEEADIAIINTCVVRL
YI75_PYRHO_3-94    KVYIENYGCARNRADGEIMAALLYLSGHEIVESPEESEIVVVNSCAVKD
Y269_HELPJ_2-99    KVYIETMGCAMNSRDSEHLLSELSKLDYKETSDPKMADLILINTCSVRE
Q9WZC1_2-98        RFYIKTFGCQMNENDSEAMAGLLVKEGFTPASSPEEADVVIINTCAVRR
The #1 operational rule: the divergence is high
enough so that one can recognize many
functionally important elements by their greater
degree of conservation.
Mouse Genome Sequencing Consortium, Nature (2002)
8
"I consider it exceedingly unlikely that any
gene will remain selectively neutral for any
length of time." Ernst Mayr, 1963
http://library.mcz.harvard.edu/
9
The synthetic theory of evolution of the 1950s
was expected to hold even at the most basic level
of genetic sequences
G.G. Simpson (1964) as quoted by Jukes and King
(1969)
Thus: differences in sequence alignments are due
to selection
10
How evolution was thought to occur by the
Selectionist (Neo-Darwinian) Theory
Mutation space
deleterious
advantageous
  • Most mutations are deleterious: they decrease the
    fitness of the organism
  • They are removed from the population by purifying
    selection
  • Advantageous mutations increase the fitness of the
    organism
  • They are selected for by positive Darwinian (adaptive)
    selection

Fitness is judged according to survival rates and
fecundity
11
The number of substitutions observed
12
Haldane's cost of selection
How many deaths are required to bring about a
substitution?
Imagine a haploid population with two alleles, A
and a
Thus, each generation, a fraction sq of the population
(A-bearing individuals) will die and (1-s)q will survive,
where q is the frequency of A and s is the selection
coefficient against it.
J.B.S. Haldane, "The Cost of Natural Selection",
Journal of Genetics, Vol. 55, 1957 p511-524
13
Haldane's cost of selection
How many deaths are required to bring about a
substitution?
Independent of the selective weight!
$D = -2\ln(p)$
where D is the cumulative number of selective deaths,
expressed as a fraction of the population size, and p is
the initial frequency.
For an initial frequency of $10^{-6}$, D = 27.6.
Haldane used D = 30 as an estimate. This means
that 30 times the population will need to die in
order to bring about the substitution. A
population can sustain losing 10% of the
population each generation. A new allele may
therefore be substituted in a population roughly
every 300 generations.
J.B.S. Haldane, "The Cost of Natural Selection",
Journal of Genetics, Vol. 55, 1957 p511-524
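The arithmetic on this slide can be checked with a minimal sketch. The function names are illustrative only; the formula D = -2 ln(p), the estimate D = 30 and the 10% tolerable loss per generation are the values quoted above.

```python
import math


def haldane_cost(initial_frequency: float) -> float:
    """Cumulative selective deaths, as a multiple of population size: D = -2 ln(p)."""
    return -2.0 * math.log(initial_frequency)


def generations_per_substitution(cost: float, tolerable_loss: float) -> float:
    """Generations needed if the population can lose `tolerable_loss` of itself per generation."""
    return cost / tolerable_loss


# Cost for an initial frequency of 10^-6 (the slide quotes 27.6).
print(round(haldane_cost(1e-6), 1))

# Haldane's rounded estimate D = 30 with a 10% loss per generation.
print(round(generations_per_substitution(30, 0.10)))
```

The cost depends only on the initial frequency, which is why the slide stresses that it is independent of the selective weight.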
14
Motoo Kimura
In 1968 Kimura looks at the changes in
hemoglobin, cytochrome c, triosephosphate
dehydrogenase.
Kimura (1968) Nature 217 624-626
15
Rates of evolution in eight proteins
Applying the molecular clock, Kimura calculates
that, on average, one change occurs every $28 \times 10^{6}$
years in a protein of length 100 aa.
Kimura (1983) The neutral theory of molecular
evolution.
16
Kimura then calculates the substitution rate for
the entire genome
Terms in the calculation:
  • Years it takes for a change in a protein of 100 aa
    (years per substitution).
  • Number of proteins (of length 100 aa / 300 bp) in
    the human genome (!!!).
  • Correction: each amino acid change may require
    1.2 base-pair changes.
Kimura (1968) Nature 217 624-626
17
According to Kimura's calculation, a substitution
occurs in the genome every 1.8 years!
This was in sharp contrast to Haldane's
prediction of one substitution per 300
generations.
Kimura (1968) Nature 217 624-626
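A rough sketch of how the three terms combine is given below. The genome size of 4 × 10^9 base pairs is an assumption taken from Kimura's 1968 paper, not from this transcript; the other inputs (28 × 10^6 years per change in a 100-aa protein, 300 bp per protein, 1.2 base-pair changes per amino acid change) are the values quoted on the slides. Function and parameter names are illustrative.

```python
def years_per_genome_substitution(
    years_per_protein_change: float,
    genome_bp: float,
    protein_bp: float,
    bp_changes_per_aa_change: float,
) -> float:
    """Average time between nucleotide substitutions anywhere in the genome."""
    # Treat the whole genome as a collection of 100-aa (300-bp) proteins.
    protein_equivalents = genome_bp / protein_bp
    # With that many proteins evolving in parallel, changes arrive faster.
    years_per_aa_change = years_per_protein_change / protein_equivalents
    # Each amino acid change corresponds to about 1.2 base-pair changes.
    return years_per_aa_change / bp_changes_per_aa_change


years = years_per_genome_substitution(
    years_per_protein_change=28e6,
    genome_bp=4e9,  # assumed genome size (Kimura 1968), not stated in the transcript
    protein_bp=300,
    bp_changes_per_aa_change=1.2,
)
print(f"{years:.2f} years per substitution")
```

Under these assumptions the result is a little under two years, in line with the figure quoted on the slide.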
18
A modernized version of Kimura's calculation on
the human and mouse genomes
Years of separate evolution between man and
mouse (2X the divergence time).
Even larger amount of divergence!
$p = 0.5$ between human and mouse
$K = -\tfrac{3}{4}\ln\left(1 - \tfrac{4}{3}p\right) L = 2.64 \times 10^{9}$, where $L = 3.2 \times 10^{9}$.
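The numbers on this slide can be reproduced with a short sketch. It assumes the divergence time of 75 million years quoted earlier in the lecture, so that the two lineages have accumulated 150 million years of separate evolution; dividing that time by K is a reconstruction of the final step, which did not survive in the transcript. Names are illustrative.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Substitutions per site corrected for multiple hits: -(3/4) ln(1 - (4/3) p)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


GENOME_LENGTH = 3.2e9        # L, as given on the slide
OBSERVED_DIFFERENCE = 0.5    # p between human and mouse, as given on the slide
DIVERGENCE_TIME = 75e6       # years since the common ancestor (from the earlier slide)

# Total number of substitutions separating the two genomes.
total_substitutions = jukes_cantor_distance(OBSERVED_DIFFERENCE) * GENOME_LENGTH

# Both lineages evolve independently, hence twice the divergence time.
separate_years = 2 * DIVERGENCE_TIME
years_per_substitution = separate_years / total_substitutions

print(f"K = {total_substitutions:.3g} substitutions")
print(f"{years_per_substitution:.3f} years per substitution")
```

Under these assumptions a substitution occurs far more often than once every 1.8 years, which is the "even larger amount of divergence" the slide points to.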
19
Kimura proposes that most of the substitutions
are neutral, i.e. only a tiny fraction of changes
are due to positive selection
[Figure: two mutation-space diagrams, one labeled
deleterious / advantageous, the other labeled
deleterious / neutral]
A neutral mutation (or allele) is one that is of
equal fitness to the wildtype
Kimura (1968) Nature 217 624-626
20
The Neutral Mutation Theory of Molecular
Evolution: Most evolutionary change and most of
the variability within a species, at the
molecular level, are caused not by selection but
by random drift of mutant genes that are
selectively equivalent.
Graur & Li. Fundamentals of Molecular Evolution
(1999)
21
Rate of evolution
22
Rate of Neutral Evolution
  • The rate of evolution is determined by
  • the number of mutations produced each generation,
    and
  • the probability of fixation of each new mutation.
  • Rate of divergence = mutations × fixation

Divergence = number of differences between 2
sequences
Kimura (1968) Nature 217 624-626
23
There are 2Nm new mutations per generation
Given a mutation rate, m, of one mutation
per genome per generation
Each person has two genomes, thus 2 new mutations
per person.
There are five people in this population, so 10
new mutations
Mutations per generation = 2Nm
(Actual mutation rate is 60 mutations per
genome per generation)
24
What is the probability that a new mutation will
be fixed in the population?
Population = 2N (assuming a diploid population)
Initial frequency = 1/(2N)
Probability of fixation = ???
25
What is the probability of fixation for an
advantageous mutation?
From a theoretical perspective, Kimura showed this in 1964:
$P = \dfrac{1 - e^{-4Nsq}}{1 - e^{-4Ns}}$
Initial frequency of a mutation: $q = 1/(2N)$
$P = \dfrac{1 - e^{-2s}}{1 - e^{-4Ns}}$
As s becomes 0:
$P \approx \dfrac{2s}{4Ns} = \dfrac{1}{2N}$
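The fixation probability P = (1 - e^(-4Nsq)) / (1 - e^(-4Ns)) and its two limits can be explored numerically with a small sketch. The function name and the example values of N and s are illustrative assumptions; `math.expm1` is used only to keep the calculation accurate when s is tiny.

```python
import math
from typing import Optional


def fixation_probability(N: int, s: float, q: Optional[float] = None) -> float:
    """P = (1 - exp(-4*N*s*q)) / (1 - exp(-4*N*s)), with q = 1/(2N) by default."""
    if q is None:
        q = 1.0 / (2 * N)  # a single new mutant among 2N gene copies
    if s == 0:
        return q           # neutral limit: P equals the initial frequency
    # 1 - exp(-x) is computed as -expm1(-x); the minus signs cancel in the ratio.
    return math.expm1(-4 * N * s * q) / math.expm1(-4 * N * s)


N = 1000
print(fixation_probability(N, 0.0))    # exactly neutral: 1/(2N)
print(fixation_probability(N, 1e-9))   # nearly neutral: very close to 1/(2N)
print(fixation_probability(N, 0.01))   # advantageous: close to 2s
```

For a strictly neutral mutation the formula is undefined (0/0), so the sketch returns the limiting value 1/(2N) directly.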
26
Rate of Neutral Evolution
  • The rate of evolution is determined by
  • the number of mutations produced each generation:
    2Nm
  • the probability of fixation of each new mutation:
    1/(2N)
  • Rate of divergence (k) = mutations × fixation
    = 2Nm × 1/(2N) = m

Rate of divergence is equal to the mutation
rate!!! (independent of N!)
Kimura (1968) Nature 217 624-626
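The claim that a new neutral mutation fixes with probability 1/(2N) can be checked by simulation. The sketch below assumes a simple Wright-Fisher model (each generation, 2N gene copies are drawn with replacement from the previous generation); the model choice, function name, population size and number of trials are all illustrative assumptions.

```python
import random


def neutral_fixation_fraction(diploid_size: int, trials: int, seed: int = 0) -> float:
    """Fraction of single new neutral mutants that drift to fixation."""
    rng = random.Random(seed)
    copies = 2 * diploid_size          # 2N gene copies in a diploid population
    fixed = 0
    for _ in range(trials):
        count = 1                      # one new mutant copy
        while 0 < count < copies:
            freq = count / copies
            # Binomial resampling of the next generation (pure drift, no selection).
            count = sum(rng.random() < freq for _ in range(copies))
        if count == copies:
            fixed += 1
    return fixed / trials


estimate = neutral_fixation_fraction(diploid_size=10, trials=5000)
print(f"simulated: {estimate:.3f}, expected 1/(2N): {1 / 20:.3f}")
```

Multiplying this fixation probability by the 2Nm new mutations per generation cancels N, which is why the neutral rate of divergence equals the mutation rate.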
27
Rate of Neutral Evolution
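Assume that f is the proportion of neutral
mutations.
[Figure: mutation space divided into advantageous,
deleterious and neutral mutations; the neutral
fraction is f and the remainder is 1-f]
k = fm
A prediction of the neutral theory: divergence is
dependent only on the mutation rate.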
Kimura and Ohta (1971) JME
28
The neutral theory explains the molecular clock
k = fm
"The uniformity of the rate of mutant
substitution per year for a given protein may be
explained by assuming constancy of neutral
mutation rate per year over diverse
lines." Kimura & Ohta, Nature (1971)
Kimura (1983) The neutral theory of molecular
evolution.
29
What is the probability of fixation for an
advantageous mutation?
$P = \dfrac{1 - e^{-4Nsq}}{1 - e^{-4Ns}}$
With $q = 1/(2N)$:
$P = \dfrac{1 - e^{-2s}}{1 - e^{-4Ns}}$
For positive, small s and large N, P can be
approximated as
$P \approx 2s$
30
Under advantageous mutations
  • The rate of evolution is determined by
  • the number of mutations produced each generation:
    2Nm
  • the probability of fixation of each new mutation:
    2s
  • Rate of divergence (k) = mutations × fixation

k = 2Nm × 2s = 4Nsm
  • Under advantageous mutations, the rate of
    divergence depends upon
  • the population size, N,
  • the selective advantage of the mutation, s, and
  • the rate, m, at which the advantageous mutations
    are produced,
  • so that it is unlikely to be constant.

Kimura and Ohta (1971) Nature
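The contrast between the two rate formulas can be made concrete with a minimal sketch. It uses k = 2Nm × 1/(2N) for neutral mutations and k = 2Nm × 2s for advantageous ones, as on the slides; the numerical values of m, s and N are hypothetical and chosen only for illustration.

```python
def neutral_rate(m: float, N: float) -> float:
    """k = (2Nm new mutations) x (fixation probability 1/(2N)) = m."""
    return (2 * N * m) * (1 / (2 * N))


def advantageous_rate(m: float, N: float, s: float) -> float:
    """k = (2Nm new mutations) x (fixation probability 2s) = 4Nsm."""
    return (2 * N * m) * (2 * s)


m = 1e-8   # hypothetical mutation rate
s = 0.01   # hypothetical selective advantage
for N in (1_000, 1_000_000):
    print(N, neutral_rate(m, N), advantageous_rate(m, N, s))
```

The neutral rate stays at m for both population sizes, while the selective rate scales with N, which is the reason a constant molecular clock is hard to explain under selection alone.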
31
What does a neutral mutation look like?
32
The inverse relationship between the importance
of a protein or site within a protein and its
rate of evolution.
Proteins, and sites within proteins, differ
with regard to the stringency of their
requirements. Replacements can be viewed from
the viewpoint of the protein chemist, who sees
replacements as relating solely to function: 29
amino acids in cytochrome c are needed to combine
with the heme group; all others are probably
neutral.
King, J. L., and Jukes, T. H. 1969. Non-Darwinian
evolution, Science 164, 788-798.
33
The 20 amino acids have overlapping properties
Small change
big change
34
Relationship between physico-chemical difference
and relative substitution frequency
[Figure: relative substitution frequency plotted
against physico-chemical difference]
Minor changes are more frequent
Drastic changes are infrequent
Kimura (1983) The neutral theory of molecular
evolution.
35
Pseudogenes as a paradigm of neutral evolution
Pseudogenes show an extremely high rate of
nucleotide substitution.
Li, Gojobori and Nei (1981) Nature 292 237-239
36
Different areas of the protein evolve at
different rates
Kimura (1983) The neutral theory of molecular
evolution.
37
Conservation in a typical gene
Splice sites
Start of translation
Start of transcription
Polyadenylation site
On the basis of 3,165 human-mouse pairs
MGSC Nature (2002) 420 520-562
38
Degeneracy of the Genetic Code
nonsynonymous
synonymous
Colors represent amino acids
Each of the 61 sense codons can mutate in 9
different ways. 134 of the 549
possible changes are synonymous.
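The count of 134 synonymous changes out of 549 can be verified by enumerating every single-base change of every sense codon. The sketch below builds the standard genetic code from the usual compact 64-letter string; it uses the DNA alphabet (T instead of U), and the function and category names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_single_base_changes() -> dict:
    """Classify all 9 single-base changes of each of the 61 sense codons."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons are not sense codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == "*":
                    counts["nonsense"] += 1
                elif new_aa == aa:
                    counts["synonymous"] += 1
                else:
                    counts["nonsynonymous"] += 1
    return counts


counts = count_single_base_changes()
print(counts, "total:", sum(counts.values()))
```

Changes that create a stop codon are tallied separately here; they are not synonymous, so they do not affect the count of 134.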
39
Synonymous changes can be neutral mutations
  • If most DNA changes were due to adaptive
    evolution, then one would imagine that most
    changes would occur in the first and second codon
    positions.
  • If DNA divergence includes neutral mutations,
    then the third position should change more
    rapidly because synonymous mutations are more
    likely to be neutral.

King, J. L., and Jukes, T. H. 1969. Non-Darwinian
evolution, Science 164, 788-798.
40
Preponderance of changes in the 3rd position
The first 220 nucleotides of human and mouse
renin binding protein
The third position of each codon is marked
Of the 31 changes:
4 - 1st position
4 - 2nd position
23 - 3rd position
41
Estimating separately the rate of synonymous
change and non-synonymous change
  • KS = number of synonymous substitutions per
    synonymous site
  • KA = number of non-synonymous (altering)
    substitutions per non-synonymous site

One way of estimating KS and KA would be to
examine each change individually and check if it
is synonymous or not. In the following we present
a method for doing this in a systematic manner.
42
Nucleotide sites can be classified into 3 types
of degenerate sites
  • 2-fold degenerate: changes of this nucleotide
    relate to pairs of codons for the same AA
  • 4-fold degenerate: changes of this nucleotide
    relate to 4 codons for the same AA
  • 0-fold degenerate: no change at this
    nucleotide leaves the codon coding for the same AA
Legend: synonymous, altering
(AA = amino acid)
43
4-fold degenerate sites are found in 32 of the
3rd position of 61 codon sites
44
2-fold degenerate sites are found in 25 of the
3rd positions and 8 of the 1st position
45
0-fold degenerate sites are found in 2nd position
sites of all codons (61) and in 53 of the 1st
position sites
46
Classify each site in a sequence according to the
degeneracy of the sites.
47
Classify each site in a sequence according to the
degeneracy of the sites.
000002002002204002004204002004000002002004004004002002204002004004004
000002002002204002004204002004000002002004004002002002204002004004002
Counting the number of 4-, 2-, 0-fold sites (taking
the average between the two sequences)
L0 = (45+45)/2 = 45
L2 = (13+15)/2 = 14
L4 = (10+8)/2 = 9
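The site classification can be sketched as follows. This is an illustrative implementation, not the lecture's own code: it uses the standard genetic code with the DNA alphabet, assumes an in-frame sequence of sense codons, and follows the common convention of treating a site with one or two synonymous alternatives (such as the third position of isoleucine) as 2-fold degenerate. The example sequence is hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def site_degeneracy(codon: str, pos: int) -> int:
    """Return 0, 2 or 4 for one position of a sense codon."""
    aa = CODON_TABLE[codon]
    synonymous = sum(
        1
        for base in BASES
        if base != codon[pos]
        and CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == aa
    )
    if synonymous == 0:
        return 0
    if synonymous == 3:
        return 4
    return 2  # one or two synonymous alternatives (assumed convention)


def classify_sequence(cds: str) -> str:
    """Degeneracy string for an in-frame coding sequence (sense codons only)."""
    labels = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        labels.extend(str(site_degeneracy(codon, pos)) for pos in range(3))
    return "".join(labels)


labels = classify_sequence("ATGTTTCTGGGGATA")  # hypothetical example sequence
print(labels)
print({fold: labels.count(str(fold)) for fold in (0, 2, 4)})
```

Counting the labels in each of two aligned sequences and averaging the counts gives L0, L2 and L4 as on the slide.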
48
Classify the differences with another sequence by
a. transition (S) or transversion (V)
b. degeneracy (0, 2, 4)
49
The key simplification is the special
relationship between transition/transversion and
degeneracy
At 2-fold degenerate sites, transitions are
synonymous mutations and transversions are
non-synonymous mutations
(Exceptions: 1st position of arginine
(CGA, CGG, AGA, AGG), last position of
isoleucine (AUU, AUC, AUA)).
50
We distinguish between transitions and
transversions according to the Kimura model
[Figure: transitions are the changes A <-> G and
T <-> C; transversions are the changes between
{A, G} and {T, C}]
51
Use Kimura's 2-parameter model to estimate the
numbers of transitions (Ai) and transversions
(Bi) per i-th type site.
Calculate the proportions of transitional and
transversional differences:
$P_i = S_i/L_i$ (12/70)
$Q_i = V_i/L_i$ (3/70)
The Kimura model is used to correct for multiple hits
(6 times more transitions than transversions)
The Kimura model is similar to the Jukes-Cantor
model (from the previous lecture) but also takes
into consideration that transitions and
transversions occur at different frequencies
(0.242) (0.045)
$A_i = \tfrac{1}{2}\ln\dfrac{1}{1 - 2P_i - Q_i} - \tfrac{1}{4}\ln\dfrac{1}{1 - 2Q_i}$
$B_i = \tfrac{1}{2}\ln\dfrac{1}{1 - 2Q_i}$
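The two corrections can be written as a small function. This sketch implements A = (1/2) ln(1/(1 - 2P - Q)) - (1/4) ln(1/(1 - 2Q)) and B = (1/2) ln(1/(1 - 2Q)); the function name and the example proportions P = 0.10 and Q = 0.05 are hypothetical and chosen only for illustration.

```python
import math
from typing import Tuple


def kimura_two_parameter(P: float, Q: float) -> Tuple[float, float]:
    """Return (A, B): transitions and transversions per site, corrected for multiple hits.

    P is the observed proportion of transitional differences,
    Q the observed proportion of transversional differences.
    """
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences too divergent for the Kimura correction")
    a = 1.0 / (1 - 2 * P - Q)
    b = 1.0 / (1 - 2 * Q)
    A = 0.5 * math.log(a) - 0.25 * math.log(b)
    B = 0.5 * math.log(b)
    return A, B


# Hypothetical proportions, for illustration only.
A, B = kimura_two_parameter(P=0.10, Q=0.05)
print(f"A = {A:.4f}, B = {B:.4f}")
```

When both proportions are small the corrected values are close to P and Q themselves; the correction grows as the sequences diverge.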
52
Calculating KS and KA
KS = number of synonymous substitutions per synonymous site
$K_S = \dfrac{L_2 A_2 + L_4 A_4 + L_4 B_4}{L_2/3 + L_4}$
By convention, one third of 2-fold degenerate
sites are considered synonymous
KA = number of non-synonymous substitutions per
non-synonymous site
$K_A = \dfrac{L_0 B_0 + L_2 B_2 + L_0 A_0}{(2/3) L_2 + L_0}$
Similarly, two thirds of 2-fold degenerate sites
are considered non-synonymous
Li, Wu and Luo, MBE 1985
53
Calculating KS and KA
Because transitional substitutions tend to
occur more often than transversional
substitutions and because most transitional
changes at 2-fold degenerate sites are synonymous
changes, KS is overestimated and KA is underestimated.
To correct for this the following formulas are
typically used:
$K_S = \dfrac{L_2 A_2 + L_4 A_4}{L_2 + L_4} + B_4$
(the fraction is the weighted average of A2 and A4)
$K_A = A_0 + \dfrac{L_0 B_0 + L_2 B_2}{L_0 + L_2}$
(the fraction is the weighted average of B0 and B2)
Li, Pamilo and Bianchi method; Li, JME (1993) 36
96-99
54
An example
L (top) L (bottom)
L0 = (29+30)/2 = 29.5, S0 = (2+1)/2 = 1.5, V0 = 0
L2 = (7+5)/2 = 6, S2 = (1+0)/2 = 0.5, V2 = 0
L4 = (9+10)/2 = 9.5, S4 = 0, V4 = (4+4)/2 = 4
P0 = 1.5/29.5, Q0 = 0
P2 = 0.5/6, Q2 = 0
P4 = 0, Q4 = 4/9.5
Error: this should be 202, sorry
55
An example (continued)
L (top) L (bottom)
$A_i = \tfrac{1}{2}\ln\dfrac{1}{1 - 2P_i - Q_i} - \tfrac{1}{4}\ln\dfrac{1}{1 - 2Q_i}$
$B_i = \tfrac{1}{2}\ln\dfrac{1}{1 - 2Q_i}$
A0 = 0.054, B0 = 0
A2 = 0.091, B2 = 0
A4 = -0.188, B4 = 0.923
KS = B4 + (A2L2 + A4L4)/(L2 + L4) = 0.843
KA = A0 + (L0B0 + L2B2)/(L0 + L2) = 0.054
The As and Bs are the transition and transversion
frequencies, corrected for multiple hits
according to the Kimura model
The synonymous rate is much faster than the
non-synonymous rate in this example
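The worked example can be reproduced end to end with a short sketch. It takes the site counts and differences from the example (L0 = 29.5, L2 = 6, L4 = 9.5; S0 = 1.5, S2 = 0.5, S4 = 0; V0 = V2 = 0, V4 = 4), applies the Kimura two-parameter correction to each site class, and combines the results as KS = B4 + (L2A2 + L4A4)/(L2 + L4) and KA = A0 + (L0B0 + L2B2)/(L0 + L2). Function names and the dictionary layout are illustrative.

```python
import math
from typing import Dict, Tuple


def kimura_correction(P: float, Q: float) -> Tuple[float, float]:
    """Corrected transitions (A) and transversions (B) per site."""
    A = 0.5 * math.log(1 / (1 - 2 * P - Q)) - 0.25 * math.log(1 / (1 - 2 * Q))
    B = 0.5 * math.log(1 / (1 - 2 * Q))
    return A, B


def ks_ka(L: Dict[int, float], S: Dict[int, float], V: Dict[int, float]) -> Tuple[float, float]:
    """KS and KA from site counts (L), transitions (S) and transversions (V) per class."""
    A, B = {}, {}
    for fold in (0, 2, 4):
        A[fold], B[fold] = kimura_correction(S[fold] / L[fold], V[fold] / L[fold])
    ks = (L[2] * A[2] + L[4] * A[4]) / (L[2] + L[4]) + B[4]
    ka = A[0] + (L[0] * B[0] + L[2] * B[2]) / (L[0] + L[2])
    return ks, ka


# Values from the worked example, keyed by degeneracy class.
L = {0: 29.5, 2: 6.0, 4: 9.5}
S = {0: 1.5, 2: 0.5, 4: 0.0}
V = {0: 0.0, 2: 0.0, 4: 4.0}

ks, ka = ks_ka(L, S, V)
print(f"KS = {ks:.3f}, KA = {ka:.3f}, KA/KS = {ka / ks:.3f}")
```

The negative A4 in the example arises because the 4-fold sites show only transversions; it is carried through the weighted average unchanged.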
56
Synonymous rates are typically greater than
nonsynonymous rates
From Li, W-H, Molecular Evolution, who took it
from O'hUigin and Li, JME 1992 35 377-384
57
The Molecular Clock of Viral Evolution
Different rates
Relationship between the number of nucleotide
substitutions and the difference in the year of
isolation for the H3 hemagglutinin gene of human
influenza A viruses. All sequence comparisons
were made with the strain isolated in 1968.
Gojobori et al. 1990 PNAS 87 10015-10018
58
KA/KS ratios can be used to gauge the level of
selection
If KA/KS ≈ 1 -> Sequences are diverging
neutrally with no selection
If KA/KS > 1 -> The coding region is under
positive selection
If KA/KS << 1 -> The coding region is under
purifying selection and maybe also
positive selection
59
The neutral theory and function
  • Various alleles may be equally effective at
    promoting the survival and reproduction of the
    individual.
  • The mere existence of different functional forms
    is not evidence for the operation of natural
    selection
  • Selection can only be assessed through
    investigations of survival rates and fecundity

60
Polymorphisms and the neutral theory
61
Genetic variation is estimated using
electrophoresis
Hubby and Lewontin (1966)
62
Natural populations are highly polymorphic
Polymorphic loci
Heterozygosity
Heterozygosity is defined as the probability of
choosing an individual that is heterozygous at a
given location.
From Evolution (1996) by Ridley who took it from
Nevo (1988)
63
In the 1950s polymorphisms were believed to be
mostly due to balancing selection (selection that
favors heterozygotes)
64
Unification of Population Genetics and Molecular
Evolution
Protein polymorphism is a phase of molecular
evolution
In our view, protein polymorphism and molecular
evolution are not two separate phenomena, but
merely two aspects of a single phenomenon caused
by random frequency drift of neutral mutants in
finite populations.
Kimura and Ohta (1971) Nature
One time freeze shows the polymorphisms
65
What is the probability of choosing from the gene
pool two copies of the same allele?
The probability of choosing the same copy twice:
1/(2N)
The probability of choosing two different copies,
1 - 1/(2N), and having them be identical by state,
G
$G' = \dfrac{1}{2N} + \left(1 - \dfrac{1}{2N}\right) G$
where G' is the homozygosity in the next generation
and G is the homozygosity in the present generation
66
The change in genetic variance each generation
From previous slide: $G' = \dfrac{1}{2N} + \left(1 - \dfrac{1}{2N}\right) G$
Heterozygosity: $H = 1 - G$
Rearranging: $H' = \left(1 - \dfrac{1}{2N}\right) H$
Change in H: $\Delta H = H' - H = -\dfrac{1}{2N} H$
With only random drift acting (no selection or
mutation), the heterozygosity decreases each
generation by a fraction 1/(2N)
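The decay of heterozygosity under drift alone can be iterated directly. The sketch below applies H' = (1 - 1/(2N)) H generation by generation and compares the result with the closed form H0 (1 - 1/(2N))^t; the starting heterozygosity and population size are hypothetical values chosen for illustration.

```python
def heterozygosity_under_drift(h0: float, N: int, generations: int) -> float:
    """Apply H' = (1 - 1/(2N)) * H for the given number of generations."""
    h = h0
    for _ in range(generations):
        h = (1 - 1 / (2 * N)) * h
    return h


H0, N = 0.5, 50  # hypothetical starting heterozygosity and diploid population size
for t in (1, 10, 100):
    iterated = heterozygosity_under_drift(H0, N, t)
    closed_form = H0 * (1 - 1 / (2 * N)) ** t
    print(t, round(iterated, 4), round(closed_form, 4))
```

Without mutation the heterozygosity can only shrink, and it does so faster in small populations.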
67
Adding mutations into the picture
Ruling out the possibilities of convergent
mutations
The probability of choosing two copies of the
same allele AND not have them mutate in the next
generation.
m is the mutation rate
68
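[Equations on this slide were not preserved in the
transcript. Steps shown: from previous slide;
heterozygosity; rearranging; change in H, with a
drift term and a mutation term]
Assuming a steady state where over time the
heterozygosity is constant, at ΔH = 0
Kimura and Crow, The Infinite Alleles Model 1964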
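The balance between drift and mutation can be illustrated numerically. Because the slide's own equations are not in the transcript, the sketch below assumes the standard infinite-alleles recursion G' = (1 - m)^2 [1/(2N) + (1 - 1/(2N)) G], in which two copies stay identical only if neither mutates. It iterates this recursion until it settles and compares the result with 4Nm/(1 + 4Nm). The values of N and m are hypothetical.

```python
def steady_state_heterozygosity(N: int, m: float, generations: int, h0: float = 0.0) -> float:
    """Iterate the assumed drift-plus-mutation recursion and return H = 1 - G."""
    g = 1.0 - h0
    for _ in range(generations):
        # Identical by state next generation AND neither copy mutated.
        g = (1 - m) ** 2 * (1 / (2 * N) + (1 - 1 / (2 * N)) * g)
    return 1.0 - g


N, m = 1000, 1e-4  # hypothetical population size and mutation rate
simulated = steady_state_heterozygosity(N, m, generations=50_000)
approximation = 4 * N * m / (1 + 4 * N * m)
print(f"iterated: {simulated:.4f}, 4Nm/(1+4Nm): {approximation:.4f}")
```

The two numbers agree closely because terms of order m squared and m/N are negligible for realistic parameter values.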
69
An example
$H = \dfrac{4Nm}{1 + 4Nm}$
  • For example, for a given locus with
  • a mutation rate (m) of $10^{-7}$, in
  • a population size (N) of $2.5 \times 10^{5}$,
  • H = 0.09, i.e. 9% of the population is expected
    to be heterozygous at this locus

A prediction of the neutral theory: heterozygosity
increases with population size and mutation rate
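The example value can be checked with a one-line formula. The sketch below evaluates H = 4Nm/(1 + 4Nm) for the slide's values (m = 10^-7, N = 2.5 × 10^5) and for a few other population sizes, which are hypothetical and included only to show the trend.

```python
def expected_heterozygosity(N: float, m: float) -> float:
    """Equilibrium heterozygosity under the neutral model: H = 4Nm / (1 + 4Nm)."""
    theta = 4 * N * m
    return theta / (1 + theta)


# The slide's example.
print(round(expected_heterozygosity(2.5e5, 1e-7), 3))

# Heterozygosity rises with population size at a fixed mutation rate.
for N in (1e4, 1e5, 1e6, 1e7):
    print(f"N = {N:.0e}: H = {expected_heterozygosity(N, 1e-7):.3f}")
```

H approaches 1 only slowly as N grows, since it depends on the product 4Nm rather than on N alone.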
70
How are polymorphisms maintained?
Neutralist position: polymorphisms are
selectively neutral and are maintained in a
population through mutational input and random
extinction.
Selectionist position: polymorphism is actively
maintained by some form of balancing selection
(heterozygote advantage) or frequency-dependent
selection.
71
Darwin and neutral mutations
Darwin distinguished three kinds of variations:
advantageous, deleterious, and neutral.
From Bernardi. PNAS 2007 104 (20): 8385-8390
"This preservation of favourable variations and
the rejection of injurious variations I call
Natural Selection. Variations neither useful
nor injurious would not be affected by natural
selection."
Darwin C. On the Origin of Species. 1859