Assessing Phylogenetic Hypotheses and Phylogenetic Data

Slides: 46

Provided by: bioinf6

1
Assessing Phylogenetic Hypotheses and
Phylogenetic Data

We use numerical phylogenetic methods because
most data includes potentially misleading
evidence of relationships
We should not be content with constructing
phylogenetic hypotheses but should also assess
what confidence we can place in our hypotheses
This is not always simple! (but do not despair!)

2
Assessing Data Quality

We expect (or hope) our data will be well
structured and contain strong phylogenetic signal
We can test this using randomization tests of
explicit null hypotheses
The behaviour of some measure of the quality of
our real data is contrasted with that of
comparable but phylogenetically uninformative
data generated by randomization

3
Random Permutation

Random permutation reduces any correlation among
characters to that expected by chance alone
It preserves number of taxa, characters and
character states in each character (and the
theoretical maximum and minimum tree lengths)

Original structured data with strong correlations
among characters

        CHARACTERS
TAXA    1 2 3 4 5 6 7 8
R-P     R P R P R P R P
A-E     A E A E A E A E
N-R     N R N R N R N R
D-M     D M D M D M D M
O-U     O U O U O U O U
M-T     M T M T M T M T
L-E     L E L E L E L E
Y-D     Y D Y D Y D Y D

Randomly permuted data with any correlation
among characters due to chance

        CHARACTERS
TAXA    1 2 3 4 5 6 7 8
R-P     N U D E R T O U
A-E     R E A P L E A D
N-R     M R M M A D N P
D-M     L T R E Y M D R
O-U     D E Y U D E Y M
M-T     O M O T O U L T
L-E     Y D N D M P M E
Y-D     A P L R N R R E

4
Matrix Randomization Tests

Compare some measure of data quality/hierarchical
structure for the real and many randomly permuted
data sets
This allows us to define a test statistic for the
null hypothesis that the real data are no better
structured than randomly permuted and
phylogenetically uninformative data
A permutation tail probability (PTP) is the
proportion of data sets with as good or better
measure of quality than the real data
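
The procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PAUP implementation: the function names are hypothetical, the quality measure used is the number of pairwise incompatibilities between binary characters (lower is better), the small data matrix is invented, and the real data set is counted along with the permuted ones when the proportion is formed.

```python
import random
from itertools import combinations


def permute_characters(matrix, rng):
    """Shuffle each character (column) independently among the taxa (rows).

    The number of taxa, the number of characters and the states present in
    each character are all preserved.
    """
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def incompatible_pairs(matrix):
    """Count pairs of binary characters that show all four state combinations."""
    columns = list(zip(*matrix))
    return sum(len(set(zip(a, b))) == 4 for a, b in combinations(columns, 2))


def ptp(matrix, statistic, n_perm=999, seed=1):
    """Permutation tail probability for a 'lower is better' statistic.

    Assumption: the real data set is included in the count, so the smallest
    possible value is 1 / (n_perm + 1).
    """
    rng = random.Random(seed)
    observed = statistic(matrix)
    as_good = sum(
        statistic(permute_characters(matrix, rng)) <= observed
        for _ in range(n_perm)
    )
    return (as_good + 1) / (n_perm + 1)


# Invented 6-taxon, 6-character binary matrix with no character conflict.
data = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print("PTP:", ptp(data, incompatible_pairs))
```

Any other measure (for example the length of the most parsimonious trees) could be passed as `statistic`, provided smaller values mean better data.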

5
Structure of Randomization Tests

Reject the null hypothesis if, for example, no more
than 5% of random permutations have as good or
better a measure than the real data

6
Matrix Randomization Tests

Measures of data quality include
1. Tree length for most parsimonious trees - the
shorter the tree length the better the data
(PAUP)
2. Numbers of pairwise incompatibilities between
characters (pairs of incongruent characters) -
the fewer character conflicts the better the data
3. Skewness of the distribution of tree lengths
(PAUP)

7
Matrix Randomization Tests
Ciliate SSUrDNA (Min 430, Max 927)
Real data: 1 MPT, L 618, CI 0.696, RI 0.714, PTP 0.01, PC-PTP 0.001
Significantly non random
Randomly permuted: 3 MPTs, L 792, CI 0.543, RI 0.272, PTP 0.68, PC-PTP 0.737
Not significantly different from random
[Figure: strict consensus trees for the real and the randomly permuted data]
8
Skewness of Tree Length Distributions

Skewness of tree length distributions can be used
as a measure of data quality in randomization
tests
It is measured with the G1 statistic in PAUP
Significance cut-offs for data sets of up to
eight taxa have been published based on randomly
generated data (rather than randomly permuted
data)
PAUP does not perform the more direct
randomization test
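
As a rough illustration, skewness can be computed from a list of tree lengths with the usual moment-based formula. This is a sketch only: the function name `g1` and the example lengths are invented, and the slides do not say whether PAUP applies any small-sample correction. A long tail towards short trees gives a negative value.

```python
def g1(tree_lengths):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(tree_lengths)
    mean = sum(tree_lengths) / n
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n
    return m3 / m2 ** 1.5


# Invented tree lengths: one much shorter tree and many long ones.
left_skewed = [10, 14, 15, 15, 16, 16, 16]
# Invented symmetric distribution of tree lengths.
symmetric = [12, 13, 14, 15, 16]

print("g1, skewed:", g1(left_skewed))
print("g1, symmetric:", g1(symmetric))
```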

9
Skewness of Tree Length Distributions

Studies with random (and phylogenetically
uninformative) data showed that the distribution
of tree lengths tends to be normal

[Figure: number of trees plotted against tree length, with the shortest tree marked]

In contrast, phylogenetically informative data
is expected to have a strongly skewed
distribution with few shortest trees and few
trees nearly as short

[Figure: number of trees plotted against tree length, with the shortest tree marked]
10
Skewness - example
11
Matrix Randomization Tests - use and limitations

Can detect very poor data - that provides no good
basis for phylogenetic inferences (throw it
away!)
However, only very little structure may be needed
to reject the null hypothesis (passing the test
does not mean great data)
Doesn't indicate the location of this structure
(more discerning tests are possible)
In the skewness test, significance levels for G1
have been determined for small numbers of taxa
only so that this test remains of limited use

12
Assessing Phylogenetic Hypotheses - groups on
trees

Several methods have been proposed that attach
numerical values to nodes in trees that are
intended to provide some measure of the strength
of support for that node
These methods include
character resampling methods - the bootstrap and
jackknife
decay analyses
additional randomization tests

13
Bootstrapping

Bootstrapping is a modern statistical technique
that uses computer intensive random resampling of
data to determine sampling error or confidence
intervals for some estimated parameter

14
Bootstrapping (non-parametric)

Characters are resampled with replacement to
create many bootstrap replicate data sets
Each bootstrap replicate data set is analysed
(e.g. with parsimony, distance, ML)
Agreement among the resulting trees is summarized
with a majority-rule consensus tree
Frequency of occurrence of groups, bootstrap
proportions (BPs), is a measure of support for
those groups
Additional information is given in partition
tables
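
The resampling and counting steps can be sketched as follows. This is a minimal illustration, not the code of any particular package: `infer_groups` is a hypothetical user-supplied function that analyses one replicate (by parsimony, distance or ML) and returns the set of groups found on the resulting tree, and all other names are assumptions too.

```python
import random
from collections import Counter


def bootstrap_replicate(matrix, rng):
    """Resample characters (columns) with replacement, keeping the same size."""
    n_chars = len(matrix[0])
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[i] for i in picks] for row in matrix]


def bootstrap_proportions(matrix, infer_groups, n_reps=100, seed=1):
    """Percentage of replicates in which each group is recovered.

    infer_groups is hypothetical: it stands in for a full tree search and
    must return an iterable of hashable groups (e.g. frozensets of taxa).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reps):
        counts.update(set(infer_groups(bootstrap_replicate(matrix, rng))))
    return {group: 100.0 * c / n_reps for group, c in counts.items()}


def majority_rule(proportions):
    """Keep only the groups found in more than half of the replicates."""
    return {group: bp for group, bp in proportions.items() if bp > 50.0}
```

A replicate that yields several equally good trees would need extra handling (for example weighting each tree), which this sketch leaves out.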

15
Bootstrapping
Randomly resample characters from the original
data with replacement to build many bootstrap
replicate data sets of the same size as the
original - analyse each replicate data set

Original data matrix
        Characters
Taxa    1 2 3 4 5 6 7 8
A       R R Y Y Y Y Y Y
B       R R Y Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y R R R R R R
Outgp   R R R R R R R R

Resampled data matrix
        Characters
Taxa    1 2 2 5 5 6 6 8
A       R R R Y Y Y Y Y
B       R R R Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y Y R R R R R
Outgp   R R R R R R R R

Summarise the results of multiple analyses with a
majority-rule consensus tree. Bootstrap
proportions (BPs) are the frequencies with which
groups are encountered in analyses of replicate
data sets
[Figure: trees for taxa A-D and the outgroup from the original and resampled data, and a majority-rule consensus tree labelled with bootstrap proportions]
16
Bootstrapping - an example
Partition Table
Ciliate SSUrDNA - parsimony bootstrap
Taxa: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8), Tetrahymena (9)
Freq (%) of partitions, in table order: 100.00, 100.00, 100.00, 100.00, 95.50, 84.33, 11.83, 3.83, 2.50, 1.00, 1.00
[Figure: majority-rule consensus tree with bootstrap proportions of 100, 100, 100, 100, 96 and 84]
17
Bootstrapping - random data
Partition Table
Randomly permuted data - parsimony bootstrap
Freq (%) of partitions, in table order: 71.17, 58.87, 26.43, 25.67, 23.83, 21.00, 18.50, 16.00, 15.67, 13.17, 12.67, 12.00, 12.00, 11.00, 10.80, 10.50, 10.00
[Figure: majority-rule consensus tree (with minority components)]
18
Bootstrap - interpretation

Bootstrapping was introduced as a way of
establishing confidence intervals for phylogenies
This interpretation of bootstrap proportions
(BPs) depends on the assumption that the original
data is a random sample from a much larger set of
independent and identically distributed data
However, several things complicate this
interpretation
Many systematists consider these assumptions
unreasonable making any statistical
interpretation of BPs invalid
Some theoretical work indicates that BPs are very
conservative, and may underestimate confidence
intervals - problem increases with numbers of
taxa
BPs can be high for incongruent relationships in
separate analyses - and can therefore be
misleading (misleading data -> misleading BPs)
With parsimony, BPs may be highly affected by
inclusion or exclusion of only a few characters

19
Bootstrap - interpretation

Bootstrapping is a very valuable and widely used
technique (it is demanded by some journals), but
requires a pragmatic interpretation
BPs depend on two aspects of the support for a
group - the numbers of characters supporting a
group and the level of support for incongruent
groups
BPs thus provide a reasonable index of the
relative support for groups provided by a set of
data

20
Bootstrap - interpretation

High BPs (e.g. > 85%) are indicative of strong
signal in the data
Provided we have no evidence of strong misleading
signal (e.g. base composition biases, great
differences in branch lengths) high BPs are
likely to reflect strong phylogenetic signal
Low BPs need not mean the relationship is false!
It's just poorly supported
Bootstrapping can be viewed as a way of exploring
the robustness of phylogenetic inferences to
perturbations in the balance of supporting
and conflicting evidence for groups

21
Jackknifing

Jackknifing is very similar to bootstrapping and
differs only in the character resampling strategy
Some proportion of characters (e.g. 50%) are
randomly selected and deleted
Replicate data sets are analysed and the results
summarised with a majority-rule consensus tree
Jackknifing and bootstrapping tend to produce
broadly similar results and have similar
interpretations
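
The only step that differs from bootstrapping is the resampling itself, which might look like the sketch below. The function name and the default deletion fraction are assumptions for illustration; each replicate would then be analysed and summarised in the same way as a bootstrap replicate.

```python
import random


def jackknife_replicate(matrix, rng, delete_fraction=0.5):
    """Delete a random fraction of characters (columns) without replacement.

    The surviving characters keep their original order and none is repeated.
    """
    n_chars = len(matrix[0])
    n_delete = int(n_chars * delete_fraction)
    kept = sorted(rng.sample(range(n_chars), n_chars - n_delete))
    return [[row[i] for i in kept] for row in matrix]


# Invented matrix: the entries are just the character numbers 1-8.
example = [list(range(1, 9)), list(range(1, 9))]
print(jackknife_replicate(example, random.Random(1)))
```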

22
Decay analysis

In parsimony analysis, a way to assess support
for a group is to see if the group occurs in
slightly less parsimonious trees also
The length difference between the shortest trees
including the group and the shortest trees that
exclude the group (the extra steps required to
overturn a group) is the decay index or Bremer
support
Total support (for a tree) is the sum of all
clade decay indices - this has been advocated as
a measure for an as yet unavailable matrix
randomization test
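
The arithmetic can be sketched as follows, assuming we already hold a collection of candidate trees that includes the shortest trees lacking each group of interest. The function names, the (length, groups) representation and the example values are all invented for illustration.

```python
def decay_indices(trees):
    """Decay index for each group present in all most parsimonious trees.

    trees: list of (length, groups) pairs, where groups is a set of
    hashable group labels found on that tree.
    """
    shortest = min(length for length, _ in trees)
    mpts = [groups for length, groups in trees if length == shortest]
    supported = set.intersection(*(set(groups) for groups in mpts))
    indices = {}
    for group in supported:
        lacking = [length for length, groups in trees if group not in groups]
        if lacking:  # the group is absent from at least one candidate tree
            indices[group] = min(lacking) - shortest
    return indices


def total_support(indices):
    """Sum of all clade decay indices for a tree."""
    return sum(indices.values())


# Invented candidate trees: (tree length, groups on the tree).
candidates = [
    (618, {"AB", "CD"}),
    (619, {"AB", "CE"}),
    (621, {"AC", "CD"}),
    (626, {"AC", "CE"}),
]
support = decay_indices(candidates)
print(support, total_support(support))
```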

23
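Decay analysis - example
Ciliate SSUrDNA data
Randomly permuted data
[Figure: trees for the ciliate SSUrDNA data and for randomly permuted data, with decay indices on the branches. Taxa: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, Gruberia. Values shown: 27, 1, 1, 45, 3, 8, 15, 10, 7]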
24
Decay analyses - in practice

Decay indices for each clade can be determined
by
Saving increasingly less parsimonious trees and
producing corresponding strict component
consensus trees until the consensus is completely
unresolved
analyses using reverse topological constraints to
determine shortest trees that lack each clade
with the Autodecay program (in conjunction with
PAUP)

25
Decay indices - interpretation

Generally, the higher the decay index the better
the relative support for a group
Like BPs, decay indices may be misleading if the
data is misleading
Unlike BPs decay indices are not scaled (0-100)
and it is less clear what is an acceptable decay
index
Magnitude of decay indices and BPs generally
correlated (i.e. they tend to agree)
Only groups found in all most parsimonious trees
have decay indices greater than zero

26
Decay indices - extensions

Traditional decay analysis is the determination
of decay indices of clades
Double decay analysis is the determination of
decay indices for all relationships - gives a
more comprehensive but potentially very
complicated summary of support
Analogues of parsimony decay indices are possible
for any optimality criterion (objective function)

27
Types of Cladistic Relationships
28
PTP tests of groups

A number of randomization tests have been
proposed for evaluating particular groups rather
than entire data matrices by testing null
hypotheses regarding the level of support they
receive from the data
Randomisation can be of the data or the group
These methods have not become widely used both
because they are not readily performed and
because their properties are still under
investigation
Topology dependent PTP tests are included in
PAUP but have serious problems (they don't work!)

29
Comparing competing phylogenetic hypotheses -
tests of two trees

Particularly useful techniques are those designed
to allow evaluation of alternative phylogenetic
hypotheses
Several such tests allow us to determine if one
tree is statistically significantly worse than
another
Winning sites test, Templeton test,
Kishino-Hasegawa test,
Shimodaira-Hasegawa test, parametric bootstrapping

30
Tests of two trees

All these tests are of the null hypothesis that
the differences between two trees (A and B) are
no greater than expected from sampling error
The simplest 'winning sites' test sums the number
of sites supporting tree A over tree B and vice
versa (those having fewer steps on, and better
fit to, one of the trees)
Under the null hypothesis characters are equally
likely to support tree A or tree B and a binomial
distribution gives the probability of the
observed difference in numbers of winning sites
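
A minimal sketch of this calculation, assuming we have the number of steps each character requires on each tree. The function name and example values are invented; sites that fit both trees equally well are ignored, and the probability is two-tailed.

```python
from math import comb


def winning_sites_test(steps_a, steps_b):
    """Two-tailed binomial (sign) test on the numbers of winning sites."""
    wins_a = sum(a < b for a, b in zip(steps_a, steps_b))  # fewer steps on A
    wins_b = sum(b < a for a, b in zip(steps_a, steps_b))  # fewer steps on B
    n = wins_a + wins_b
    k = min(wins_a, wins_b)
    # Probability of k or fewer wins out of n when each tree is equally likely
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)


# Invented example: nine sites favour tree A, none favour B, one is tied.
steps_on_a = [1] * 10
steps_on_b = [2] * 9 + [1]
print(winning_sites_test(steps_on_a, steps_on_b))
```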

31
The Templeton test

Templeton's test is a non-parametric Wilcoxon
signed ranks test of the differences in fits of
characters to two trees
It is like the winning sites test but also
takes into account the magnitudes of differences
in the support of characters for the two trees
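
The ranking step can be sketched as below. This is a simplified illustration, not the PAUP implementation: the function name and data are invented, tied differences receive average ranks, characters with no difference are dropped, and the probability comes from the normal approximation without a tie correction, which is rough when few characters differ.

```python
from math import erfc, sqrt


def templeton_test(steps_a, steps_b):
    """Wilcoxon signed-rank test on per-character step differences."""
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    # Rank the absolute differences, giving tied values their average rank
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)   # favour tree B
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)  # favour tree A
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return w_plus, w_minus, p_two_tailed


# Invented steps per character on two trees.
print(templeton_test([1, 1, 1, 1, 2, 3], [2, 2, 2, 3, 1, 1]))
```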

32
Templeton's test - an example
Recent studies of the relationships of turtles
using morphological data have produced very
different results, with turtles grouping either
within the parareptiles (H1) or within the
diapsids (H2), the result depending on
the morphologist. This suggests there may be
- problems with the data
- special problems with turtles
- weak support for turtle relationships
[Figure: tree with the alternative placements labelled 1 and 2. Taxa shown: Archosauromorpha, Lepidosauriformes, Diadectomorpha, Eosauropterygia, Younginiformes, Seymouriadae, Claudiosaurus, Captorhinidae, Araeoscelidia, Parareptilia, Paleothyris, Synapsida, Placodus]
Parsimony analysis of the most recent data
favoured H2. However, analyses constrained by H1
produced trees that required only 3 extra steps
(< 1% of tree length)
The Templeton test was used to evaluate the trees
and showed that the slightly longer H1 tree
found in the constrained analyses was not
significantly worse than the unconstrained H2
tree. The morphological data do not allow choice
between H1 and H2
33
Kishino-Hasegawa test

The Kishino-Hasegawa test is similar in using
differences in the support provided by individual
sites for two trees to determine if the overall
differences between the trees are significantly
greater than expected from random sampling error
It is a parametric test that depends on
assumptions that the characters are independent
and identically distributed (the same assumptions
underlying the statistical interpretation of
bootstrapping)
It can be used with parsimony and maximum
likelihood - implemented in PHYLIP and PAUP

34
Kishino-Hasegawa test
If the difference between trees (tree lengths or
likelihoods) is attributable to sampling error,
then characters will randomly support tree A or B
and the total difference will be close to
zero. The observed difference is significantly
greater than zero if it is greater than 1.96
standard deviations. This allows us to reject the
null hypothesis and declare the sub-optimal tree
significantly worse than the optimal tree (p <
0.05)
[Figure: distribution of step/likelihood differences at each site, with an expected mean of 0, sites favouring tree A on one side and sites favouring tree B on the other]
Under the null hypothesis the mean of the
differences in parsimony steps or likelihoods for
each site is expected to be zero, and the
distribution normal. From observed differences we
calculate a standard deviation
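
The calculation can be sketched as a paired test on per-site scores (parsimony steps or log-likelihoods). The sketch assumes independent sites and a normal approximation; the function name and example values are invented, and the code makes no attempt to reproduce the exact output of PHYLIP or PAUP.

```python
from math import erfc, sqrt


def kh_test(site_scores_a, site_scores_b):
    """Return the total difference, its standard deviation and a two-tailed p."""
    diffs = [a - b for a, b in zip(site_scores_a, site_scores_b)]
    n = len(diffs)
    total = sum(diffs)
    mean = total / n
    # Sample variance of the per-site differences
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    # Standard deviation of the summed difference over n independent sites
    sd_total = sqrt(n * variance)
    if sd_total == 0:
        return total, 0.0, (1.0 if total == 0 else 0.0)
    z = total / sd_total
    return total, sd_total, erfc(abs(z) / sqrt(2))


# Invented per-site step counts on two trees.
total, sd, p = kh_test([2, 2, 2, 3], [1, 1, 1, 1])
print(total, sd, p)
```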
35
Kishino-Hasegawa test - an example
Ciliate SSUrDNA - maximum likelihood tree
Parsimonious character optimization of the
presence and absence of hydrogenosomes suggests
four separate origins of hydrogenosomes within
the ciliates
Questions
- how reliable is this result?
- in particular, how well supported is the idea of multiple origins?
- how many origins can we confidently infer?
[Figure: maximum likelihood tree with the anaerobic ciliates with hydrogenosomes marked. Taxa: Ochromonas, Symbiodinium, Prorocentrum, Sarcocystis, Theileria, Plagiopyla n, Plagiopyla f, Trimyema c, Trimyema s, Cyclidium p, Cyclidium g, Cyclidium l, Glaucoma, Colpodinium, Tetrahymena, Paramecium, Discophrya, Trithigmostoma, Opisthonecta, Colpoda, Dasytrichia, Entodinium, Spathidium, Loxophylum, Homalozoon, Metopus c, Metopus p, Stylonychia, Onychodromous, Oxytrichia, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Blepharisma]
36
Kishino-Hasegawa test - an example
Ciliate SSUrDNA data - most parsimonious tree
Parsimony analysis yields a very similar tree -
in particular, parsimonious character
optimization indicates four separate origins
of hydrogenosomes within ciliates
Decay indices and BPs for parsimony and distance
analyses indicate relative support for
clades. Differences between the ML, MP
and distance trees generally reflect the less
well supported relationships
[Figure: most parsimonious tree for the same taxa, with decay indices and bootstrap proportions (given as pairs such as 99-100) on the branches]
37
Kishino-Hasegawa test - example
Parsimony analyses with topological constraints
were used to find the shortest trees that forced
hydrogenosomal ciliate lineages together and
thereby reduced the number of separate origins
of hydrogenosomes
Each of the constrained parsimony trees was
compared to the ML tree and the Kishino-Hasegawa
test used to determine which of these trees were
significantly worse than the ML tree
[Figure: two examples of the topological constraint trees]
38
Kishino-Hasegawa test
Test summary and results - origins of ciliate
hydrogenosomes (simplified)
Constrained analyses were used to find the most
parsimonious trees with fewer than four separate
origins of hydrogenosomes. These were tested
against the ML tree. Trees with 2 or 1 origin are all
significantly worse than the ML tree. We can
confidently conclude that there have been at
least three separate origins of hydrogenosomes
within the sampled ciliates

No.      Constraint        Extra  Difference  Significantly
Origins  tree              Steps  and SD      worse?
4        ML                10     -           -
4        MP                -      -13 ± 18    No
3        (cp,pt)           13     -21 ± 22    No
3        (cp,rc)           113    -337 ± 40   Yes
3        (cp,m)            47     -147 ± 36   Yes
3        (pt,rc)           96     -279 ± 38   Yes
3        (pt,m)            22     -68 ± 29    Yes
3        (rc,m)            63     -190 ± 34   Yes
2        (pt,cp,rc)        123    -432 ± 40   Yes
2        (pt,rc,m)         100    -353 ± 43   Yes
2        (pt,cp,m)         40     -140 ± 37   Yes
2        (cp,rc,m)         124    -466 ± 49   Yes
2        (pt,cp)(rc,m)     77     -222 ± 39   Yes
2        (pt,m)(rc,cp)     131    -442 ± 48   Yes
2        (pt,rc)(cp,m)     140    -414 ± 50   Yes
1        (pt,cp,m,rc)      131    -515 ± 49   Yes

39
Shimodaira-Hasegawa Test

To be statistically valid, the Kishino-Hasegawa
test should be of trees that are selected a
priori
However, most applications have used trees
selected a posteriori on the basis of the
phylogenetic analysis
Where we test the best tree against some other
tree the KH test will be biased towards rejection
of the null hypothesis
The SH test is a similar but more statistically
correct technique in these circumstances

40
Reliability of Phylogenetic Methods

Phylogenetic methods (e.g. parsimony, distance,
ML) can also be evaluated in terms of their
general performance, particularly their
consistency - approach the truth with more data
efficiency - how quickly (how much data)
robustness - how sensitive to violations of
assumptions
Studies of these properties can be analytical or
by simulation

41
Reliability of Phylogenetic Methods

There have been many arguments that ML methods
are best because they have desirable statistical
properties, such as consistency
However, ML does not always have these properties
if the model is wrong/inadequate
properties not yet demonstrated for complex
inference problems such as phylogenetic trees

42
Reliability of Phylogenetic Methods

Simulations show that ML methods generally
outperform distance and parsimony methods over a
broad range of realistic conditions
Whelan et al. 2001 Trends in Genetics
17: 262-272
Most simulations are very (unrealistically)
simple
few taxa (typically just four)
few parameters (standard models - JC, K2P etc)

43
Reliability of Phylogenetic Methods

Simulations with four taxa have shown
Model based methods - distance and maximum
likelihood perform well when the model is
accurate (not surprising!)
Violations of assumptions can lead to
inconsistency for all methods (a Felsenstein
zone) when branch lengths or rates are highly
unequal
Maximum likelihood methods are quite robust to
violations of model assumptions
Weighted parsimony can perform better than
standard parsimony (has a smaller Felsenstein
zone) in some cases

44
Reliability of Phylogenetic Methods

However
Generalising from four taxon simulations may be
dangerous as conclusions may not hold for more
complex cases
A few large scale simulations (many taxa) have
suggested that parsimony can be very accurate and
efficient
Most methods are accurate in correctly recovering
known phylogenies produced in laboratory studies
More study of methods is needed to help in choice
of method using more realistic simulations

45
HAPPY BIRTHDAY PATRICIA
