Title: Adaptive Evolution of Protein Coding Sequences
1Adaptive Evolution of Protein Coding Sequences
chris.creevey_at_may.ie
2What is Natural Selection?
3The Voyage of the H.M.S. Beagle
4The Galapagos finches
The beaks of the finches are adapted to different
jobs, in the same way that tools are.
5The Struggle for existence
- Observation 1
- Population sizes would increase exponentially if all individuals born survived.
- Observation 2
- Most populations are stable in size.
- Observation 3
- No two individuals in a population are exactly the same.
- Observation 4
- Much of this variation is heritable.
6Introduction
- Natural Selection
- "Natural selection is daily, hourly, scrutinising
the slightest variations, rejecting those that
are bad, preserving and adding up all those that
are good" (The Origin of Species)
7Levels of selection
- Species
- Population
- Individual
- Gene
Selection at the species level may lead to the
extinction of the species, generally following a
large environmental change.
Interspecific competition and predation can lead
to population decline, unless the population can
exploit new niches or find novel ways of avoiding
predation.
Intraspecific competition for shared resources
acts on the survival of progeny. Individuals that
can exploit their environment better than others
survive to pass their genes to their descendants.
The importance of the phenotypic characters
expressed by the genes decides how selection acts
on them.
8Selection on Genes
- Housekeeping Genes.
- Negative (Purifying) selection.
- Change is Bad.
- Genes that have a role in adaptation.
- Positive (Adaptive) selection.
- Change is Good.
- Selectively neutral genes
- Genetic drift.
9Simple Genic Selection
10Positive Selection
11Selected gene groups on which positive selection
may operate
- Genes involved in defensive systems or immunity
- Genes involved in evading the defensive systems or immunity
- Genes involved in reproduction
- Genes involved in digestion
12Genes involved in defensive systems or immunity
Class I Chitinase genes
- Used in plant defence against fungi.
- Researchers found that there were tissue-specific Class I chitinase genes.
- Molecular analysis revealed that these genes had evolved under positive selection.
13Genes involved in defensive systems or immunity
- Defensin family of genes
- Large family of broad-spectrum antimicrobial peptides.
- When expressed at the cell surface, they have been hypothesised to function as a biochemical barrier against microbial infection.
- They function by inhibiting colonisation of the epithelium by a wide range of pathogenic micro-organisms.
- In leukocytes, these peptides are stored in cytoplasmic granules and are released into phagolysosomes, where they contribute to the killing of engulfed micro-organisms.
14Genes involved in defensive systems or immunity
- Major Histocompatibility Complex (MHC) of vertebrates
- Multigene family whose products are cell-surface glycoproteins.
- They function to present peptides to cytotoxic T lymphocytes.
- Unlike other examples of adaptive evolution, these genes show an extremely high level of polymorphism.
- Class I MHC genes differ with respect to the antigens they can present.
- Thus, in a population exposed to an array of pathogens, it is advantageous for an individual to be polymorphic at the MHC loci, because a heterozygote will be able to present a broader array of antigens, and thus resist a broader array of pathogens.
15Genes involved in evading the defensive systems
or immunity
- Human immunodeficiency virus-1 envelope gene (env)
- Encodes the envelope glycoproteins gp120 and gp41.
- 5 hyper-variable regions were identified in gp120 (V1 - V5).
16Genes involved in reproduction
- Sperm lysin of abalone
- Species selectivity is very important in most marine invertebrates with external fertilisation.
- When lysin sequences were compared between species, they were found to be highly divergent.
- There were far more replacement than silent changes.
- The selection may be related to the establishment of barriers to heterospecific fertilisation.
17- How do you detect adaptive evolution at the
genetic level?
18Molecular Evolution
Degeneracy within the Universal Genetic code.
19Silent and Replacement Substitutions
- Silent substitution
- Sequence 1: UUU CAU CGU (Phe His Arg)
- Sequence 2: UUU CAC CGU (Phe His Arg)
- Replacement substitution
- Sequence 1: UUU CAU CGU (Phe His Arg)
- Sequence 2: UUU CAG CGU (Phe Gln Arg)
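The distinction above can be checked mechanically by translating each codon. The following is a minimal sketch, assuming the standard genetic code and two aligned, equal-length RNA sequences without gaps; the function name `classify_substitutions` is illustrative and not from the slides.

```python
from itertools import product

# Standard genetic code, RNA alphabet, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitutions(seq1, seq2):
    """Return (codon number, codon1, codon2, kind) for each codon that differs."""
    changes = []
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 != c2:
            # Same amino acid: silent. Different amino acid: replacement.
            kind = "silent" if CODE[c1] == CODE[c2] else "replacement"
            changes.append((i // 3 + 1, c1, c2, kind))
    return changes


silent_example = classify_substitutions("UUUCAUCGU", "UUUCACCGU")
replacement_example = classify_substitutions("UUUCAUCGU", "UUUCAGCGU")
```

Here CAU to CAC leaves histidine unchanged, while CAU to CAG replaces histidine with glutamine.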
20How do you detect positive selection?
- Dn = (number of replacement substitutions) / (number of replacement sites)
- Ds = (number of silent substitutions) / (number of silent sites)
- Dn/Ds > 1 → Positive Selection
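As a rough illustration of the ratio, the sketch below computes uncorrected Dn and Ds from raw counts. The counts are hypothetical, the function name `dn_ds` is an assumption for illustration, and no correction for multiple substitutions at the same site is applied.

```python
def dn_ds(replacement_subs, replacement_sites, silent_subs, silent_sites):
    """Return (Dn, Ds, Dn/Ds) from raw counts; the ratio is None when Ds is 0."""
    dn = replacement_subs / replacement_sites
    ds = silent_subs / silent_sites
    ratio = dn / ds if ds > 0 else None
    return dn, ds, ratio


# Hypothetical counts, chosen only to illustrate the arithmetic.
dn, ds, ratio = dn_ds(replacement_subs=12, replacement_sites=300,
                      silent_subs=3, silent_sites=100)
suggests_positive_selection = ratio is not None and ratio > 1
```

With these made-up numbers Dn = 0.04 and Ds = 0.03, so the ratio is above 1.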
21Calculating Pairwise distances
Example using the Li (1993) method
Step 1: Classify nucleotides into non-degenerate (L0),
twofold degenerate (L2) and fourfold degenerate (L4) sites.
22Example
              Sequence 1   Sequence 2
Codons        TTT CTA      TCT CTG
Degenerate?   002 204      004 204
How many of each type?
      Sequence 1   Sequence 2   Average
L0    3            3            3
L2    2            1            1.5
L4    1            2            1.5
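The classification in Step 1 can be sketched in code. This is a minimal illustration, assuming the standard genetic code and assuming that any site with one or two synonymous alternatives is treated as twofold degenerate; the helper names are illustrative and not part of the original method description.

```python
from collections import Counter
from itertools import product

# Standard genetic code, DNA alphabet, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def site_degeneracy(codon):
    """Return the degeneracy class (0, 2 or 4) of each of the three codon sites."""
    classes = []
    for pos in range(3):
        # Count the alternative bases at this site that keep the amino acid.
        synonymous = sum(
            1
            for base in BASES
            if base != codon[pos]
            and CODE[codon[:pos] + base + codon[pos + 1:]] == CODE[codon]
        )
        # 0 synonymous changes: non-degenerate; 3: fourfold; otherwise twofold.
        classes.append({0: 0, 3: 4}.get(synonymous, 2))
    return classes


def count_site_classes(sequence):
    """Count L0, L2 and L4 sites over all codons of a coding sequence."""
    counts = Counter()
    for i in range(0, len(sequence), 3):
        counts.update(site_degeneracy(sequence[i:i + 3]))
    return counts


c1 = count_site_classes("TTTCTA")
c2 = count_site_classes("TCTCTG")
averages = {d: (c1[d] + c2[d]) / 2 for d in (0, 2, 4)}
```

For the two example sequences this reproduces the averages in the table: L0 = 3, L2 = 1.5 and L4 = 1.5.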
23Step 2: Count how many transitions or transversions
have occurred between the two sequences at 0-fold,
2-fold or 4-fold degenerate sites.
Transitions are changes between T and C or between A and G;
all other changes are transversions.
i.e.
Sequence 1: TTT CTA
Sequence 2: TCT CTG
TTT to TCT has 1 transitional difference at a 0-fold degenerate site.
CTA to CTG has 1 transitional difference at a 4-fold degenerate site.
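A small sketch of the counting in Step 2 follows. It assumes the degeneracy class of every site is supplied as a list, and it uses the classes of sequence 1 (0, 0, 2 for TTT and 2, 0, 4 for CTA); how to treat sites whose class differs between the two sequences is left aside here. The function names are illustrative.

```python
from collections import Counter

PURINES = {"A", "G"}


def is_transition(a, b):
    """A transition stays within purines (A, G) or within pyrimidines (C, T)."""
    return (a in PURINES) == (b in PURINES)


def count_changes(seq1, seq2, degeneracy):
    """Count differences by (degeneracy class, transition or transversion)."""
    counts = Counter()
    for a, b, d in zip(seq1, seq2, degeneracy):
        if a != b:
            kind = "transition" if is_transition(a, b) else "transversion"
            counts[(d, kind)] += 1
    return counts


# Degeneracy classes taken from sequence 1 of the example (assumption).
counts = count_changes("TTTCTA", "TCTCTG", [0, 0, 2, 2, 0, 4])
```

The result contains one transition at a 0-fold site and one transition at a 4-fold site, matching the example.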
24Step 3 The Kimura method.
[Figure: number of differences between two sequences plotted against time since the sequences shared a common ancestor]
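The Kimura two-parameter correction estimates the number of substitutions per site from the observed proportions of transitional and transversional differences, allowing for multiple changes at the same site. The sketch below is a minimal illustration; the function name and the input proportions are hypothetical.

```python
import math


def kimura_two_parameter(p, q):
    """Return (A, B): estimated transitional and transversional substitutions
    per site, given the observed proportion of sites showing a transitional
    difference (p) and a transversional difference (q).
    Only defined while 1 - 2p - q > 0 and 1 - 2q > 0."""
    a = 1.0 / (1.0 - 2.0 * p - q)
    b = 1.0 / (1.0 - 2.0 * q)
    transitions = 0.5 * math.log(a) - 0.25 * math.log(b)
    transversions = 0.5 * math.log(b)
    return transitions, transversions


# Hypothetical proportions, for illustration only.
A, B = kimura_two_parameter(p=0.10, q=0.05)
corrected_distance = A + B  # larger than the observed proportion p + q = 0.15
```

The corrected distance exceeds the raw proportion of differences because some substitutions are hidden by later changes at the same site.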
25Step 4
Dn = changes at 0-fold sites + transversions at
2-fold sites.
Ds = changes at 4-fold sites + transitions at
2-fold sites.
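One way to combine the per-class estimates is sketched below, using the weighting as it is usually stated for the Li (1993) method; treat the exact formulas as an assumption to check against the original paper. Here L[i] is the number of i-fold sites, and A[i] and B[i] are the Kimura-corrected transitional and transversional substitutions per i-fold site. The A and B values are hypothetical.

```python
def li93_rates(L, A, B):
    """Return (Dn, Ds) from dicts keyed by degeneracy class 0, 2 and 4."""
    # Silent: transitions at 2-fold and 4-fold sites (weighted by site counts)
    # plus transversions at 4-fold sites.
    ds = (L[2] * A[2] + L[4] * A[4]) / (L[2] + L[4]) + B[4]
    # Replacement: transitions at 0-fold sites plus transversions at 0-fold
    # and 2-fold sites (weighted by site counts).
    dn = A[0] + (L[0] * B[0] + L[2] * B[2]) / (L[0] + L[2])
    return dn, ds


sites = {0: 3.0, 2: 1.5, 4: 1.5}   # averages from the Step 1 example
A = {0: 0.10, 2: 0.20, 4: 0.30}    # hypothetical transition estimates
B = {0: 0.01, 2: 0.02, 4: 0.03}    # hypothetical transversion estimates
dn, ds = li93_rates(sites, A, B)
```

The site counts are the averages from the worked example; only the substitution estimates are invented for the illustration.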
26Where Dn/Ds Ratios fail, a cautionary tale!
- Protamines are small, DNA-binding proteins found in sperm.
- The proportion of arginine residues in P1 is about 50% in mammals.
- However, the total number of amino acids and the positions of arginine residues have changed considerably during the course of mammalian evolution.
- This evolutionary pattern suggests that protamine P1 is under an unusual form of negative selection, in which the high proportion of arginine residues is maintained but the positions may vary.
- Selection allows amino acid replacements as long as the proportion of arginine residues stays the same, so this is purifying selection, even though Dn/Ds ratios identify positive selection.
27Worked Example Primate Lysozyme
28Lysozyme and foregut fermentation
- Lysozymes - enzymes that catalyse the break-up of some bacterial cell walls.
- An important defence against bacteria.
- Differences in gastric lysozymes
- They are most active at low pH.
- They are unusually resistant to cleavage by pepsin.
29Colobine monkeys (colobus and langurs)
30New World Monkeys
31Cercopithecines
32Hominoids
33Pair-wise distance analysis of Lysozyme data
34Walter Messier and Caro-Beth Stewart 1997
35Results
36Maximum Likelihood analysis of Lysozyme data
37Maximum Likelihood Ziheng Yang
Problems with the Messier and Stewart analysis:
1) The reconstruction of ancestral sequences involves random errors and systematic biases, so inferences based on such pseudo-data may be unsafe.
2) The transition-transversion ratio and non-uniform codon usage are factors that are not properly accommodated in approximate methods of pairwise comparison.
Maximum likelihood has the advantage of being based on more realistic models of sequence evolution.
38Ziheng Yang 1998 Heterogeneous model of
evolution
39Based on the phylogeny given, and from the
results of Messier and Stewart, we can formulate
the hypotheses that can be tested using maximum
likelihood.
ω0 is the background Dn/Ds ratio
ωC is the Dn/Ds ratio of Branch C
ωH is the Dn/Ds ratio of Branch H
Then test the interesting hypotheses:
Every ωi is different: free-ratio model
ω0 = ωC = ωH: one-ratio model
ω0 = ωC, ωH: two-ratio model
ω0, ωC, ωH: three-ratio model
Etc.
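Nested models like these are commonly compared with a likelihood ratio test: twice the difference in log-likelihood is compared with a chi-square distribution whose degrees of freedom equal the number of extra free ratios. The sketch below assumes that approach; the log-likelihood values are hypothetical and are not taken from Yang (1998).

```python
# Chi-square critical values at the 5% level for 1 to 3 degrees of freedom.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}


def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (test statistic, whether the simpler model is rejected at 5%)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, statistic > CHI2_CRITICAL_5PCT[extra_params]


# Hypothetical log-likelihoods: one-ratio model (null) versus a
# two-ratio model with one extra free ratio (alternative).
statistic, reject_one_ratio = likelihood_ratio_test(
    lnl_null=-1043.8, lnl_alt=-1041.2, extra_params=1
)
```

With these invented values the statistic is about 5.2, which exceeds 3.841, so the one-ratio model would be rejected in favour of the two-ratio model.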
40- Results from Ziheng Yang 1998
- 1) The Dn/Ds ratios in the primate lysozyme genes are highly variable among evolutionary lineages, indicating that the evolution of primate lysozyme is incompatible with a neutral model.
- 2) The Dn/Ds ratio of the lineage leading to the hominoids was significantly greater than 1.
- 3) The Dn/Ds ratio of the lineage leading to the colobines was significantly greater than the background Dn/Ds ratio, but was not greater than 1.
41Relative rate analysis of Lysozyme data
42Fixed and polymorphic mutations
- McDonald-Kreitman test
- Counts the number of fixed and polymorphic substitutions that were either replacement or silent.
McDonald, J. H. & Kreitman, M. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652-654 (1991)
43Fixed and polymorphic substitutions
- Fixed substitutions occur on a between-species
branch and appear in all the descendant alleles.
44Fixed and polymorphic substitutions
- Polymorphic substitutions occur on a within-species
branch and appear in only some of the alleles of
that species.
45Fixations versus Polymorphisms
- Count the number of
- Replacement substitutions that are fixed
- Replacement substitutions that are polymorphic
- Silent substitutions that are fixed
- Silent substitutions that are polymorphic
- Examine the ratios
- Replacement-fixed : Silent-fixed
- Replacement-polymorphic : Silent-polymorphic
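The comparison of the two ratios can be sketched as a 2x2 contingency table. The example below is a minimal illustration with hypothetical counts; it uses a two-sided Fisher's exact test, which is one common choice for such a table and is an assumption here rather than something the slides specify.

```python
from math import comb


def mk_ratios(rep_fixed, rep_poly, sil_fixed, sil_poly):
    """Return the replacement:silent ratio for fixed and for polymorphic changes."""
    fixed = rep_fixed / sil_fixed if sil_fixed else float("inf")
    poly = rep_poly / sil_poly if sil_poly else float("inf")
    return fixed, poly


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts. Rows: replacement, silent. Columns: fixed, polymorphic.
fixed_ratio, poly_ratio = mk_ratios(8, 2, 5, 15)
p_value = fisher_exact_two_sided(8, 2, 5, 15)
```

A fixed ratio well above the polymorphic ratio, together with a small p-value, would point to an excess of fixed replacement substitutions.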
46Negative versus Positive selection
- Negative selection
- Very few Replacement substitutions
- No difference expected whether fixed or not.
- Ratios the same.
- Positive selection
- Replacement substitutions that are beneficial are kept within the population.
- More fixed than polymorphic replacement substitutions.
- Ratios differ.
47Adaptation to environmental change
Crandall, K. A. et al. Mol. Biol. Evol. 16 (3),
372-382 (1999)
48Counting Replacement-fixed substitutions
96 HIS GLN
150 SER ARG
298 PHE LEU
243 ILE MET
230 ARG LYS
62 ALA VAL
49Counting Replacement-Polymorphisms
167 GLN ARG
107 LEU PRO
120 HIS GLN
171 CYS GLY
50Counting Silent-fixed substitutions
27 CUU CUC
51Counting Silent-polymorphisms
258 CCC CCG
111 GAA GAG
99 CGC CGA
127 CUA UUA
51 AAU AAC
267 UCU UCC
282 CGU CGC
108 AUC AUA
42 UAU UAC
168 AAU AAC
52Results
53Results for Primate Lysozyme sequences
Fore-gut fermenters
Messier, W. & Stewart, C.-B. (1997) Nature 385,
151-154