Title: Evolutionary Biology Concepts
1Evolutionary Biology Concepts
- Molecular Evolution
- Phylogenetic Inference
BIO520 Bioinformatics Jim Lund
2Evolution
Evolution is a process that results in heritable
changes in a population spread over many
generations. "In fact, evolution can be
precisely defined as any change in the frequency
of alleles within a gene pool from one generation
to the next." - Helena Curtis and N. Sue Barnes,
Biology, 5th ed. 1989 Worth Publishers, p.974
3Levels of Evolution
- Changes in allele frequencies within a species.
- Speciation.
- Molecular changes
  - Single bp changes.
  - Genomic changes (alterations in large DNA segments).
4Branching Descent
Evolutionary Tree
Family Tree
5Phylogeny
- Branching diagram showing the ancestral relations
among species.
  - Tree of Life
- History of evolutionary change
- FRAMEWORK for INFERENCE
6The framework for phylogenetics
- How do we describe phylogenies?
- How do we infer phylogenies?
7Inheritance
DNA → RNA → Protein → Function
8Common Phylogenetic Tree Terminology
[Figure: rooted tree with terminal taxa A, B, C, D, E]
- Terminal Nodes: represent the TAXA (genes, populations, species,
etc.) used to infer the phylogeny
- Branches or Lineages
- Internal Nodes or Divergence Points: represent
hypothetical ancestors of the taxa
- Ancestral Node or ROOT of the Tree
9Phylogenetic trees diagram the evolutionary
relationships between the taxa
((A,(B,C)),(D,E)) The above phylogeny as
nested parentheses
These say that B and C are more closely related
to each other than either is to A, and that A, B,
and C form a clade that is a sister group to the
clade composed of D and E. If the tree has a
time scale, then D and E are the most closely
related.
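The nested-parentheses notation can be read mechanically. The following is a minimal sketch, assuming a topology-only string with no spaces, branch lengths, or trailing semicolon; `parse_newick` and `leaf_sets` are hypothetical helper names, not part of any library.

```python
def parse_newick(text):
    """Parse a topology-only nested-parentheses string into nested tuples."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # consume "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # consume ","
                children.append(node())
            pos += 1                      # consume ")"
            return tuple(children)
        start = pos                       # otherwise read a taxon name
        while pos < len(text) and text[pos] not in "(),;":
            pos += 1
        return text[start:pos]

    return node()


def leaf_sets(tree, out):
    """Append the taxon set under every internal node (each clade) to out."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


tree = parse_newick("((A,(B,C)),(D,E))")
clade_list = []
leaf_sets(tree, clade_list)
for clade in clade_list:
    print(sorted(clade))
```

Each printed set is one clade: B and C group together first, A joins them, and D and E form the sister clade.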
10Two types of trees
[Figure: the same four taxa (A, B, C, D) drawn two ways.
Cladogram: branch lengths have no meaning.
Phylogram: branch lengths represent genetic change (branch labels in the figure: 6, 1, 1, 5).]
All show the same evolutionary relationships, or
branching orders, between the taxa.
11Rooted vs Unrooted Trees
12More Trees
13Trees-3
Polyphyletic Group
14Extinction
15Population Genetic Forces
Hardy-Weinberg Paradigm: $p + q = 1$, $p^2 + 2pq + q^2 = 1$
- Natural Selection (fitness)
- Drift (homozygosity by chance)
  - much greater in small populations
- Mutation/Recombination (variation)
- Migration
  - homogenizes gene pools
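The Hardy-Weinberg relations can be checked numerically. This is a small sketch for one locus with two alleles; the genotype counts are made-up illustration values, and the function names are hypothetical.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) from genotype counts; each individual carries two alleles."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles
    return p, 1 - p


def hardy_weinberg(p):
    """Expected genotype frequencies (p^2, 2pq, q^2) under Hardy-Weinberg."""
    q = 1 - p
    return p * p, 2 * p * q, q * q


# Illustrative counts only (not data from the lecture).
p, q = allele_frequencies(n_AA=49, n_Aa=42, n_aa=9)
expected = hardy_weinberg(p)
print(p, q, expected, sum(expected))
```

Selection, drift, mutation, and migration are the forces that move a real population away from these expected proportions.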
16Modes of speciation
- Geographic isolation.
- Reproductive isolation.
- Sexual selection.
- Behavioral isolation.
17DNA, protein sequence change
18Multiple Changes/No Change
..CCU AUA GGG..
..CCC AUA GGG..
..CCC AUG GGG..
..CCC AUG GGC..
..CCU AUG GGC..
..CCU AUA GGC..
5 mutations; net result: 1 DNA change, 0 amino acid changes
Enumerating bp/aa changes underestimates
evolutionary change
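The counts on this slide can be reproduced directly. The sketch below walks the six-sequence history, counts the mutations at each step, and compares only the first and last sequences, as an observer with present-day data would. `CODON_TABLE` covers just the codons used here and is not a full genetic code.

```python
# Partial genetic code: only the codons that appear in the slide's example.
CODON_TABLE = {
    "CCU": "Pro", "CCC": "Pro",
    "AUA": "Ile", "AUG": "Met",
    "GGG": "Gly", "GGC": "Gly",
}

HISTORY = [
    "CCUAUAGGG", "CCCAUAGGG", "CCCAUGGGG",
    "CCCAUGGGC", "CCUAUGGGC", "CCUAUAGGC",
]


def differences(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def translate(rna):
    """Translate an RNA string codon by codon using CODON_TABLE."""
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3)]


# Mutations that actually occurred along the lineage.
mutations = sum(differences(a, b) for a, b in zip(HISTORY, HISTORY[1:]))
# Differences visible when only the endpoints are compared.
net_dna = differences(HISTORY[0], HISTORY[-1])
net_aa = differences(translate(HISTORY[0]), translate(HISTORY[-1]))
print(mutations, net_dna, net_aa)
```

The gap between `mutations` and `net_dna` is the undercount caused by multiple changes at the same site.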
19Mechanisms of DNA Sequence Change
- Neutral Drift vs Natural Selection
  - Traditional selection model
  - Neutral (Kimura/Jukes)
  - Pan-neutralism
20Mutation rate varies Gene-to-Gene
21Rate varies Site-to-Site
22Rate varies Site-to-Site
From Evolution, Mark Ridley, 3rd Ed.
23Constraints on Silent Changes
- Codon biases (translation rates)
- Transcription elongation rates
  - polymerase pause sites
- Silent regulatory elements
  - select for or against presence/absence
- Overall genome structure
24Neutralism in Eukaryotes vs Prokaryotes-Slightly
deleterious mutations Models
Most non-coding sites are neutral? Coding/noncoding
can be flexible?
25DNA, Protein Similarity
- Similarity by common descent
  - phylogenetic
- Similarity by convergence (rare)
  - functional importance
- Similarity by chance
  - random variation not limitless
  - particular problem in wide divergence
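Because the alphabet is small, some similarity appears by chance alone. As a rough sketch, assuming independent sites, no gaps, and residues drawn from fixed background frequencies (the uniform frequencies below are an illustration, not measured values), the expected fraction of matching positions between two unrelated sequences is the sum of the squared frequencies:

```python
def chance_identity(frequencies):
    """Probability that two independently drawn residues match: sum of f_i^2."""
    return sum(f * f for f in frequencies.values())


# Uniform background frequencies, chosen only for illustration.
dna_uniform = {base: 0.25 for base in "ACGT"}
protein_uniform = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

dna_chance = chance_identity(dna_uniform)          # about 1 in 4 for DNA
protein_chance = chance_identity(protein_uniform)  # about 1 in 20 for protein
print(dna_chance, protein_chance)
```

Highly diverged homologs can drift down toward this background level, which is why chance similarity is a particular problem at wide divergence.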
26Homology-similar by common descent
27Inferring Trees and Ancestors
CCCAGG CCCAAG-> CCCAAG CCCAAA->
CCTAAA CCTAAA-> CCTAAC
Not always straightforward. The data don't
always give a single, correct answer.
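One common way to infer ancestral states on a given tree is Fitch's small-parsimony algorithm. The slide does not name a method, so this is an illustrative choice. The sketch uses the five distinct sequences from the slide on a hypothetical topology (an assumption, not the tree drawn in the lecture) and reports, for each site, the set of bases the root could have had.

```python
def fitch(tree):
    """Bottom-up Fitch pass on a nested-tuple binary tree.

    Returns (per-site candidate state sets at this node, minimum changes below it).
    """
    if isinstance(tree, str):                       # leaf: observed sequence
        return [{base} for base in tree], 0
    left_sets, left_cost = fitch(tree[0])
    right_sets, right_cost = fitch(tree[1])
    sets, cost = [], left_cost + right_cost
    for a, b in zip(left_sets, right_sets):
        shared = a & b
        if shared:                                  # children agree: keep overlap
            sets.append(shared)
        else:                                       # children disagree: one change
            sets.append(a | b)
            cost += 1
    return sets, cost


# Hypothetical topology over the slide's sequences.
TREE = (("CCCAGG", "CCCAAG"), ("CCCAAA", ("CCTAAA", "CCTAAC")))
root_sets, changes = fitch(TREE)
print(changes, [sorted(s) for s in root_sets])
```

A site whose root set holds more than one base has several equally parsimonious ancestors, which is one concrete sense in which the data do not give a single answer.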
28Homology, Orthology, Paralogy
Orthologs
29Paralogy Trap
30Improper Inference
Garbage in, garbage out!
31Our Goals
- Infer Phylogeny
  - Optimality criteria
  - Algorithm
- Phylogenetic inference
  - (interesting ones)
32Watch Out
- The danger of generating incorrect results is
inherently greater in computational phylogenetics
than in many other fields of science.
- The limiting factor in phylogenetic analysis is
not so much in the facility of software
application as in the conceptual understanding of
what the software is doing with the data.