Treethinking cont. Introduction to parsimony - PowerPoint PPT Presentation

About This Presentation
Title:

Treethinking cont. Introduction to parsimony

Description:

Mouse. Human. Is a newt more closely related to a fish or a human? Why do ... Mouse. Human. Is a crocodile more closely related to a lizard or a sparrow? ...

Slides: 53
Provided by: David51

Transcript and Presenter's Notes

1
Tree-thinking (cont.): Introduction to parsimony
2
The most important feature of a phylogenetic
tree is its topology (the order of branching)
[Figure: tree diagrams with tips A, B, C, D, E, F, G]
Draw this topology with the taxa in the order
E-G-F-C-D-A-B
3
Which of the following has a different topology?
[Figure: candidate trees with tips A, B, C, D, E]
4
Various types of trees you will see
[Figure: three tree diagrams, each with the root marked R]
5
Which topology is different?
[Figure: candidate trees with tips A, B, C, D, E, F and the root marked R]
6
Evolutionary relatedness
  • Evolutionary relatedness = recency of common
    ancestry
  • Topology contains the information needed to
    assess relative degree of relatedness
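
The idea that relatedness means recency of common ancestry can be sketched in code. This is a minimal illustration, not part of the original slides: the nested-tuple tree below is an assumed topology (the standard vertebrate relationships), and the helper names `tips`, `mrca`, and `closer_relative` are hypothetical.

```python
# Assumed rooted topology, written as nested tuples (not read from the slides).
TREE = ("Fish", ("Newt", ("Lizard", ("Mouse", "Human"))))

def tips(node):
    """Return the set of tip names at or below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))

def mrca(node, a, b):
    """Return the smallest subtree (clade) that contains both tips a and b."""
    if not isinstance(node, str):
        for child in node:
            if {a, b} <= tips(child):
                return mrca(child, a, b)
    return node

def closer_relative(tree, focal, x, y):
    """Return whichever of x or y shares the more recent ancestor with focal.

    Both common ancestors lie on the path from focal to the root, so their
    clades are nested: the smaller clade marks the more recent ancestor.
    Returns None when x and y are equally related to focal.
    """
    clade_x = tips(mrca(tree, focal, x))
    clade_y = tips(mrca(tree, focal, y))
    if clade_x == clade_y:
        return None
    return x if clade_x < clade_y else y

print(closer_relative(TREE, "Newt", "Fish", "Human"))    # Human
print(closer_relative(TREE, "Newt", "Lizard", "Human"))  # None (equally related)
```

Only the topology is consulted; the left-to-right order of the tips plays no role.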

7
[Figure: tree with tips Fish, Newt, Lizard, Mouse, Human]
Is a newt more closely related to a fish or a
human?
8
Why do people go wrong? Looking along the top
[Figure: tree with tips Fish, Newt, Lizard, Mouse, Human]
Is a newt more closely related to a fish or a
human?
9
[Figure: diagram with Fish, Newt, Lizard, Mouse, Human]
  • This is not how evolution happened
  • All these species are alive today. A living fish
    is not an ancestor of a newt
  • The order along the top can change without
    changing the content of the tree
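
The point that tip order can change without changing the tree can be checked mechanically. A minimal sketch, assuming trees are written as nested tuples (an illustrative encoding, not one used in the slides): two drawings have the same rooted topology exactly when they contain the same set of clades. The function names are hypothetical.

```python
def clade_sets(node):
    """Return (tips below node, set of every clade at or below node)."""
    if isinstance(node, str):
        tip = frozenset([node])
        return tip, {tip}
    tips, found = frozenset(), set()
    for child in node:
        child_tips, child_found = clade_sets(child)
        tips |= child_tips
        found |= child_found
    found.add(tips)
    return tips, found

def same_topology(tree_a, tree_b):
    """True if two rooted trees contain exactly the same clades."""
    return clade_sets(tree_a)[1] == clade_sets(tree_b)[1]

# Assumed example trees: the second is the first with every node rotated.
original = ("Fish", ("Newt", ("Lizard", ("Mouse", "Human"))))
rotated = (((("Human", "Mouse"), "Lizard"), "Newt"), "Fish")
# Here Newt and Lizard swap positions in the branching order.
different = ("Fish", ("Lizard", ("Newt", ("Mouse", "Human"))))

print(same_topology(original, rotated))    # True
print(same_topology(original, different))  # False
```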

10
Now, is a newt more closely related to a fish or
a human?
[Figure: tree with tips Fish, Newt, Lizard, Mouse, Human]
11
The tree has the same topology
12
Trees depict descent not similarity
[Figure: tree with tips Turtle, Lizard, Crocodile, Sparrow]
Is a crocodile more closely related to a lizard
or a sparrow?
13
Don't be distracted by similarity
[Figure: tree with tips Turtle, Lizard, Crocodile, Sparrow]
It doesn't matter how many changes occurred here,
the tree shape remains the same
14
[Figure: tree with tips Fish, Newt, Lizard, Mouse, Human]
Is a newt more closely related to a lizard or a
human?
15
The principle of phylogenetic inference
16
General procedure
  • We score tips for some variable characters
  • We have a model of how evolution might have given
    rise to the states we see
  • We identify the tree (etc.) that is most
    compatible with our data

17
A hypothetical example
AGTTGTACGTATGCCGA
18
O  AGTTGTAGGTATGCCGA
A  AGTAGTACGTATGCCGA
B  AGTAGTACGTATGCCTA
C  AGTAGCACGTATGACTA
19
Typical experimental strategy
Extract DNA
PCR with gene-specific primers
Sequence PCR product
Align DNA sequences from different species (data
matrix)
20
Data matrix
[Figure: data matrix with taxa O, A, B, C]
21
Typical experimental strategy
Extract DNA
PCR with gene-specific primers
Sequence PCR product
Align DNA sequences from different species (data
matrix)
Phylogeny
22
Matrix -> Tree
  • Use an algorithm
  • Apply an optimality criterion

23
The algorithmic approach
  • Make assumptions about how evolution works
  • Identify properties of the true tree under these
    assumptions
  • Develop an algorithm for finding the tree with
    these properties (can be very fast)
  • Two main ones:
  • UPGMA - Assumes ultrametricity
  • Neighbor-joining - Assumes additivity
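
To make the algorithmic approach concrete, here is a minimal sketch of UPGMA clustering. It is an illustration under stated assumptions, not the implementation of any particular package: the function name `upgma`, the nested-tuple output, and the distances (chosen to be ultrametric) are all hypothetical.

```python
from itertools import combinations

def upgma(tips, dist):
    """Cluster tips by repeatedly joining the two closest clusters.

    tips: list of tip names.
    dist: dict mapping frozenset({a, b}) to the distance between tips a and b.
    Returns the tree as nested tuples (branch lengths are not computed).
    """
    clusters = {tip: 1 for tip in tips}  # cluster -> number of tips it holds
    d = dict(dist)
    while len(clusters) > 1:
        # Find the pair of current clusters with the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average (UPGMA rule).
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

# Hypothetical ultrametric distances for four tips.
TIPS = ["A", "B", "C", "D"]
DIST = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 4,
    frozenset(("B", "C")): 4,
    frozenset(("A", "D")): 8,
    frozenset(("B", "D")): 8,
    frozenset(("C", "D")): 8,
}

print(upgma(TIPS, DIST))  # ('D', ('C', ('A', 'B')))
```

The procedure returns a single tree and gives no measure of how much better it is than the alternatives.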

24
The problem with algorithms
  • Even if the real world matches our model there is
    an element of chance in evolution
  • The true tree may not be the one found
  • We have no way of evaluating the degree of
    support for the algorithmic tree relative to
    other possible trees

25
Optimality criteria
  • Make assumptions about how evolution works
  • Identify properties that will tend to be
    maximized or minimized on true trees
  • Score that property for all possible trees
  • Trees with better scores will be more likely to
    be true (if the model is correct)
  • Trees can be compared based on their score

26
Example of an optimality criterion: Parsimony
  • Favor the tree that can explain the distribution
    of character states with the minimum number of
    character-state changes

27
A hypothetical example
AGTTGTAGGTATGCCGA
AGTAGTACGTATGCCTA
AGTAGTACGTATGCCGA
AGTAGCACGTATGACTA
AGTAGTACGTATGCCTA
AGTAGTACGTATGCCGA
AGTTGTACGTATGCCGA
28
Data matrix
            11111111
   12345678901234567
O  AGTTGTAGGTATGCCGA
A  AGTAGTACGTATGCCGA
B  AGTAGTACGTATGCCTA
C  AGTAGCACGTATGACTA
29
Remove invariant characters
Character  1  2  3  4  5
Site       4  6  8 14 16
O          T  T  G  C  G
A          A  T  C  C  G
B          A  T  C  C  T
C          A  C  C  A  T
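
Removing invariant characters can be sketched as follows. The sequences are the four from the data matrix; the function names `variable_sites` and `reduce_alignment` are hypothetical, chosen for illustration.

```python
# The aligned sequences from the data matrix.
ALIGNMENT = {
    "O": "AGTTGTAGGTATGCCGA",
    "A": "AGTAGTACGTATGCCGA",
    "B": "AGTAGTACGTATGCCTA",
    "C": "AGTAGCACGTATGACTA",
}

def variable_sites(alignment):
    """Return the 1-based positions where the taxa do not all share one state."""
    columns = zip(*alignment.values())
    return [pos for pos, column in enumerate(columns, start=1) if len(set(column)) > 1]

def reduce_alignment(alignment):
    """Keep only the variable sites of each sequence."""
    sites = variable_sites(alignment)
    return {
        taxon: "".join(seq[pos - 1] for pos in sites)
        for taxon, seq in alignment.items()
    }

print(variable_sites(ALIGNMENT))    # [4, 6, 8, 14, 16]
print(reduce_alignment(ALIGNMENT))  # {'O': 'TTGCG', 'A': 'ATCCG', 'B': 'ATCCT', 'C': 'ACCAT'}
```

Invariant sites add nothing to a parsimony comparison because they require zero changes on every tree.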
30
There are three possible arrangements that we
need to consider
[Figure: Tree 1, Tree 2, Tree 3, the three possible arrangements of taxa O, A, B, C]
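
Three arrangements for four taxa is the smallest case of a general count: the number of distinct unrooted, fully bifurcating trees for n labelled tips is 3 x 5 x ... x (2n - 5). A minimal sketch of that formula (the function name is hypothetical):

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted, fully bifurcating trees for n_taxa labelled tips."""
    count = 1
    # Multiply the odd numbers 3, 5, ..., (2n - 5); the product is empty for n <= 3.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10):
    print(n, num_unrooted_trees(n))
# 4 3
# 5 15
# 10 2027025
```

The count grows so quickly that scoring every possible tree becomes impractical beyond a modest number of taxa.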
31
These trees can be drawn without the root
[Figure: the three trees, each with the root marked R]
32
These trees can be drawn without the root
33
Map the characters onto tree 1
[Figure: characters 1-5 mapped onto tree 1 (tips O, A, B, C)]
Total cost (length): __ steps
34
Actually there are two ways to map character 5
[Figure: two alternative mappings of character 5 onto tree 1 (tips O, A, B, C)]
Either way the character contributes __ steps to
the overall cost
35
Map the characters onto tree 2
[Figure: characters 1-5 mapped onto tree 2 (tips O, A, B, C)]
Total cost: __ steps
36
Map the characters onto tree 3
[Figure: characters 1-5 mapped onto tree 3 (tips O, A, B, C)]
Total cost: __ steps
37
What was the cost of each tree?
38
The difference in tree length is all due to
character 5
[Figure: reduced data matrix (characters 1-5 for O, A, B, C), with character 5 marked parsimony informative and the other characters marked parsimony uninformative]
39
Parsimony informative characters
  • At least two states, each of which occurs in at
    least two taxa
  • A C G T T T A
  • T C G A T T A
  • G G G T T A G
  • G G A A A T ?
  • C A T G ? C G
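
The definition can be turned into a short test. A minimal sketch: the function name is hypothetical, and ignoring missing data ("?") and gaps ("-") is an assumption made here for illustration. The example columns are characters 1-5 of the reduced data matrix, with states listed in the order O, A, B, C.

```python
from collections import Counter

def is_parsimony_informative(states):
    """True if at least two states each occur in at least two taxa.

    Assumption: '?' (missing) and '-' (gap) are ignored.
    """
    counts = Counter(s for s in states if s not in "?-")
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2

# Characters 1-5 of the reduced data matrix, states in the order O, A, B, C.
columns = ["TAAA", "TTTC", "GCCC", "CCCA", "GGTT"]
print([is_parsimony_informative(col) for col in columns])
# [False, False, False, False, True]
```

Only character 5 passes, which is why it alone can change the ranking of the three trees.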

40
Redraw tree 2 with root in place
[Figure: tree 2 drawn with the root R in place]
This is the correct tree
41
Which rooted tree is correct?
[Figure: candidate rooted trees with tips A, B, C, D, E, F, G, H and O]
42
Many issues glossed over
  • What if characters disagree?
  • How is the tree score determined?
  • How can we root the trees?
  • How do we find the optimal tree?
  • How can we evaluate the robustness of our
    conclusions?

43
How does character conflict arise?
  • The tree is not divergent (ignore)
  • A particular character changes more than once
    (Homoplasy)

[Figure: tree with tips A, B, C, D, E, F, G; a G->A change is followed by an A->G change (reversal)]
44
How does character conflict arise?
  • The tree is not divergent (ignore)
  • A particular character changes more than once
    (Homoplasy)

[Figure: tree with tips A, B, C, D, E, F, G; two separate G->A changes (parallelism/convergence)]
45
Parsimony can still work
  • If characters are independent (a key assumption)
    homoplasy will be randomly distributed
  • Homoplasies will tend to cancel each other out
  • Non-homoplastic changes will tend to agree
  • Therefore, with enough characters the shortest
    tree is a good estimate of the true tree

46
The justification of parsimony
Good characters - mark real clades
Bad characters - the rest
Only bad characters contradict each other
47
Many issues glossed over
  • What if characters disagree?
  • How is the tree score determined?
  • How can we root the trees?
  • How do we find the optimal tree?
  • How can we evaluate the robustness of our
    conclusions?

48
Tree score calculation
$$L_{tot} = \sum_{i=1}^{n} L_i W_i$$
The tree score (L_tot) is the sum, over all n characters, of the minimum number
of steps for each character (L_i) multiplied by the weight of that
character (W_i)
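
As a worked example of the score, the sum of steps times weights can be written in a few lines. The function name and the weights are hypothetical; the step counts assume a four-taxon tree on which characters 1-4 each need one step and character 5 needs two.

```python
def tree_score(steps, weights=None):
    """Sum of per-character minimum steps, each multiplied by its weight."""
    if weights is None:
        weights = [1] * len(steps)  # equal weighting by default
    if len(steps) != len(weights):
        raise ValueError("steps and weights must have the same length")
    return sum(step * weight for step, weight in zip(steps, weights))

steps = [1, 1, 1, 1, 2]  # assumed minimum steps for characters 1-5 on one tree

print(tree_score(steps))                   # 6  (all weights equal to 1)
print(tree_score(steps, [1, 1, 1, 1, 3]))  # 10 (character 5 upweighted to 3)
```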
49
How is the minimum number of steps calculated?
  • Postorder traversal algorithm
  • The tree is arbitrarily rooted
  • Each internal node is inspected to see if there
    is an intersection in the possible states of its
    descendant nodes; if not, tree length is increased
    by one step
  • It is not necessary to identify all ancestral
    state reconstructions (this requires a preorder
    traversal)
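
The postorder pass described above can be sketched with state sets. This is a minimal version of the Fitch procedure for equally weighted, unordered states; the nested-tuple trees, the arbitrary root position, and the function names are assumptions made for illustration. The data are the five variable characters of the reduced data matrix.

```python
def fitch_length(tree, states):
    """Minimum number of changes for one character on a rooted binary tree.

    tree: nested 2-tuples of tip names; states: dict of tip name -> state.
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common      # the descendants agree on at least one state
        steps += 1             # no shared state: one change is required
        return left | right

    visit(tree)
    return steps

def tree_length(tree, matrix):
    """Sum the Fitch length over every character in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_length(tree, {taxon: seq[i] for taxon, seq in matrix.items()})
        for i in range(n_chars)
    )

MATRIX = {"O": "TTGCG", "A": "ATCCG", "B": "ATCCT", "C": "ACCAT"}

# The three four-taxon arrangements, each rooted arbitrarily on its internal branch.
TREES = {
    "OA|BC": (("O", "A"), ("B", "C")),
    "OB|AC": (("O", "B"), ("A", "C")),
    "OC|AB": (("O", "C"), ("A", "B")),
}

for name, tree in TREES.items():
    print(name, tree_length(tree, MATRIX))
# OA|BC 5
# OB|AC 6
# OC|AB 6
```

The one-step difference comes entirely from character 5, which needs a single change only on the arrangement that groups B with C.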

50
Why weight characters?
  • If we think some characters are less prone to
    homoplasy, we can upweight them
  • Character weights are multiplied by the character
    length

51
We can also weight character state transitions
  • Common examples
  • Ordered character states (morphology)

[Figure: step matrix giving the cost of a change from each state to each state]
52
We can also weight character state transitions
  • Common examples
  • Transitions vs. transversions

[Figure: step matrix giving the cost of a change from each state to each state]
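
One way to score a character under a step matrix is Sankoff's dynamic-programming algorithm, which is named here as an illustration and is not stated on the slide. In this minimal sketch the costs (transition 1, transversion 2), the nested-tuple tree, and the function names are all hypothetical.

```python
BASES = "ACGT"
PURINES = {"A", "G"}

def ts_tv_cost(from_state, to_state, ts=1, tv=2):
    """Hypothetical step matrix: transitions cost ts, transversions cost tv."""
    if from_state == to_state:
        return 0
    same_class = (from_state in PURINES) == (to_state in PURINES)
    return ts if same_class else tv

def sankoff_cost(tree, states, cost=ts_tv_cost):
    """Minimum weighted cost of one character on a rooted binary tree."""
    infinity = float("inf")

    def visit(node):
        # Returns, for each base, the cheapest cost of the subtree below this
        # node given that the node itself carries that base.
        if isinstance(node, str):
            return {b: (0 if b == states[node] else infinity) for b in BASES}
        child_tables = [visit(child) for child in node]
        return {
            parent: sum(
                min(cost(parent, to) + table[to] for to in BASES)
                for table in child_tables
            )
            for parent in BASES
        }

    return min(visit(tree).values())

tree = (("O", "A"), ("B", "C"))
print(sankoff_cost(tree, {"O": "T", "A": "T", "B": "T", "C": "C"}))  # 1 (T<->C transition)
print(sankoff_cost(tree, {"O": "G", "A": "G", "B": "T", "C": "T"}))  # 2 (G<->T transversion)
```

Because the cost function takes the from-state and the to-state separately, an asymmetric step matrix can be supplied the same way.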
53
We can also weight character state transitions
  • Common examples
  • Gains less likely than loss (restriction sites)

[Figure: asymmetric step matrix giving the cost of a change from each state to each state]
54
The weighting game
  • When should you weight characters/character-states?
  • If you think that they differ in evidential power
  • How much should you modify weights?
  • There is no simple formula
  • It is probably better to err on the side of less
    extreme weights
  • Often sensible to try a range of weights