Assessing Phylogenetic Hypotheses and Phylogenetic Data - PowerPoint PPT Presentation

Slides: 51
Provided by: ncl8
1
Assessing Phylogenetic Hypotheses and
Phylogenetic Data
  • We use numerical phylogenetic methods because
    most data includes potentially misleading
    evidence of relationships
  • We should not be content with constructing
    phylogenetic hypotheses but should also assess
    what confidence we can place in our hypotheses
  • This is not always simple! (but do not despair!)

2
Assessing Data Quality
  • We expect (or hope) our data will be well
    structured and contain strong phylogenetic signal
  • We can test this using randomisation tests of
    explicit null hypotheses
  • The behaviour of some measure of the quality of
    our real data is contrasted with that of
    comparable but phylogenetically uninformative
    data generated by randomisation of the data

3
Random Permutation
  • Random permutation reduces any correlation among
    characters to that expected by chance alone
  • It preserves number of taxa, characters and
    character states in each character (and the
    theoretical maximum and minimum tree lengths)
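
A minimal sketch of the permutation step, assuming the data matrix is held as a list of rows (one per taxon) and that `permute_matrix` is a hypothetical helper name, not a function from any phylogenetics package. Each character (column) is shuffled independently among the taxa, so the states present in each character are preserved.

```python
import random

def permute_matrix(matrix, rng=None):
    """Shuffle the states of each character (column) independently among taxa (rows)."""
    rng = rng or random.Random()
    n_chars = len(matrix[0])
    columns = []
    for j in range(n_chars):
        column = [row[j] for row in matrix]
        rng.shuffle(column)  # same states, new assignment to taxa
        columns.append(column)
    # Transpose back to one row per taxon.
    return [[columns[j][i] for j in range(n_chars)] for i in range(len(matrix))]

# Four taxa scored for eight perfectly correlated characters (illustrative data).
original = [list("RPRPRPRP"), list("AEAEAEAE"), list("NRNRNRNR"), list("DMDMDMDM")]
permuted = permute_matrix(original, random.Random(1))
```

The number of taxa, the number of characters and the states in each character are unchanged; only the association between characters is randomised.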


Original structured data with strong correlations
among characters

TAXA    CHARACTERS
        1  2  3  4  5  6  7  8
R-P     R  P  R  P  R  P  R  P
A-E     A  E  A  E  A  E  A  E
N-R     N  R  N  R  N  R  N  R
D-M     D  M  D  M  D  M  D  M
O-U     O  U  O  U  O  U  O  U
M-T     M  T  M  T  M  T  M  T
L-E     L  E  L  E  L  E  L  E
Y-D     Y  D  Y  D  Y  D  Y  D

Randomly permuted data with correlation among
characters due to chance alone

TAXA    CHARACTERS
        1  2  3  4  5  6  7  8
R-P     N  U  D  E  R  T  O  U
A-E     R  E  A  P  L  E  A  D
N-R     M  R  M  M  A  D  N  P
D-M     L  T  R  E  Y  M  D  R
O-U     D  E  Y  U  D  E  Y  M
M-T     O  M  O  T  O  U  L  T
L-E     Y  D  N  D  M  P  M  E
Y-D     A  P  L  R  N  R  R  E
4
Matrix Randomisation Tests
  • Compare some measure of data quality
    (hierarchical structure) for the real and many
    randomly permuted data sets
  • This allows us to define a test statistic for the
    null hypothesis that the real data are no better
    structured than randomly permuted and
    phylogenetically uninformative data
  • A permutation tail probability (PTP) is the
    proportion of data sets with as good or better
    measure of quality than the real data
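
A minimal sketch of how a PTP could be computed once the quality measure has been obtained for the real data and for each permuted data set. The function name is hypothetical, and counting the real data set itself among the data sets is an assumed convention. The tree lengths other than 618 are made-up illustrative values.

```python
def permutation_tail_probability(real_score, permuted_scores, lower_is_better=True):
    """Proportion of data sets whose measure is as good as or better than the real data."""
    if lower_is_better:  # e.g. tree length, number of incompatibilities
        as_good = sum(1 for s in permuted_scores if s <= real_score)
    else:                # e.g. log-likelihood
        as_good = sum(1 for s in permuted_scores if s >= real_score)
    # Assumed convention: the real data set counts as one of the data sets.
    return (as_good + 1) / (len(permuted_scores) + 1)

# Real tree length 618 against nine hypothetical permuted-data tree lengths.
ptp = permutation_tail_probability(618, [792, 801, 779, 795, 788, 810, 799, 783, 790])
```

With only nine permutations the smallest attainable PTP is 0.1, which is why many permutations are needed before a 5% threshold can be reached.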

5
Structure of Randomisation Tests
  • Reject the null hypothesis if, for example, fewer
    than 5% of random permutations have as good or
    better a measure than the real data

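[Figure: frequency distribution of the measure of
data quality (e.g. tree length, ML, pairwise
incompatibilities) for randomly permuted data,
running from GOOD to BAD, with a 95% cutoff. Real
data beyond the cutoff at the GOOD end PASS the
test (reject null hypothesis); otherwise they FAIL
the test]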
6
Matrix Randomisation Tests
  • Measures of data quality include
  • 1. Tree length for most parsimonious trees - the
    shorter the tree length the better the data
    (PAUP)
  • 2. Any other objective function (Likelihood,
    Least Squares Fit, etc)
  • 3. Numbers of pairwise incompatibilities between
    characters (pairs of incongruent characters) -
    the fewer character conflicts the better the data
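
A minimal sketch of counting pairwise incompatibilities, assuming binary characters with no missing data, where two characters conflict when all four combinations of states occur among the taxa. The function names are hypothetical.

```python
from itertools import combinations

def incompatible(char_a, char_b):
    """Two binary characters conflict if all four state pairs occur among the taxa."""
    return len(set(zip(char_a, char_b))) == 4

def count_incompatibilities(characters):
    """Number of incompatible pairs among a list of characters."""
    return sum(incompatible(a, b) for a, b in combinations(characters, 2))

# Each string is one character scored over the same five taxa (illustrative data).
chars = ["11000", "11100", "01100"]
n_conflicts = count_incompatibilities(chars)  # only the first and third conflict
```

The same count on randomly permuted data gives the null distribution for a pairwise-incompatibility PTP.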

7
Matrix Randomization Tests
Ciliate SSUrDNA (tree length bounds: Min 430, Max 927)
Real data: 1 MPT, L = 618, CI = 0.696, RI = 0.714,
PTP = 0.01, PC-PTP = 0.001 - significantly non-random
Randomly permuted: 3 MPTs, L = 792, CI = 0.543,
RI = 0.272, PTP = 0.68, PC-PTP = 0.737 - not
significantly different from random
[Figure: strict consensus trees for the real and
randomly permuted data]
8
Matrix Randomisation Tests - use and limitations
  • Can detect very poor data - that provides no good
    basis for phylogenetic inferences (throw it
    away!)
  • However, only very little structure may be needed
    to reject the null hypothesis (passing the test
    does not mean great data)
  • Doesn't indicate the location of this structure
    (more discerning tests are possible)

9
Skewness of Tree Length Distributions
  • Studies with random and thus phylogenetically
    uninformative data showed that the distribution
    of tree lengths tends to be normal

[Figure: number of trees plotted against tree
length, a roughly symmetric distribution with the
shortest tree marked]
  • In contrast, phylogenetically informative data
    is expected to have a strongly skewed
    distribution with few shortest trees and few
    trees nearly as short

[Figure: number of trees plotted against tree
length, a strongly skewed distribution with the
shortest tree marked]
10
Skewness of Tree Length Distributions
  • Measured with the G1 statistic (PAUP)
  • Skewness of tree length distributions could be
    used as a measure of data quality in a
    randomisation test
  • Significance cut-offs for data sets of up to
    eight taxa have been published based on randomly
    generated data (rather than randomly permuted
    data)
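
A minimal sketch of a skewness calculation, assuming g1 is the usual moment coefficient of skewness (third central moment divided by the 1.5 power of the second). The function name and the tree lengths are illustrative only.

```python
def g1_skewness(lengths):
    """Moment coefficient of skewness of a list of tree lengths."""
    n = len(lengths)
    mean = sum(lengths) / n
    m2 = sum((x - mean) ** 2 for x in lengths) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in lengths) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = g1_skewness([1, 2, 3])                      # no skew
left_skewed = g1_skewness([10, 18, 19, 20, 20, 20])     # few short trees, many long ones
```

A distribution with only a few trees much shorter than the rest has a tail towards short lengths and so a negative g1.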

11
Skewness - example
REAL DATA Ciliate SSUrDNA: g1 = -0.951947
RANDOMLY PERMUTED DATA: g1 = -0.100478
12
Assessing Phylogenetic Hypotheses - groups on
trees
  • Several methods have been proposed that attach
    numerical values to internal branches in trees
    that are intended to provide some measure of the
    strength of support for those branches and the
    corresponding groups
  • These methods include
  • character resampling methods - the bootstrap and
    jackknife
  • comparisons with suboptimal trees - decay
    analyses
  • additional randomisation tests

13
Bootstrapping (non-parametric)
  • Bootstrapping is a modern statistical technique
    that uses computer intensive random resampling of
    data to determine sampling error or confidence
    intervals for some estimated parameter

14
Bootstrapping
  • Characters are resampled with replacement to
    create many bootstrap replicate data sets
  • Each bootstrap replicate data set is analysed
    (e.g. with parsimony, distance, ML)
  • Agreement among the resulting trees is summarized
    with a majority-rule consensus tree
  • Frequency of occurrence of groups, bootstrap
    proportions (BPs), is a measure of support for
    those groups
  • Additional information is given in partition
    tables
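
A minimal sketch of the resampling and summary steps, assuming the matrix is a list of rows (one per taxon) and that each replicate tree has already been reduced to a set of groups. The tree inference step itself is not shown, and all names and the example trees are hypothetical.

```python
import random
from collections import Counter

def bootstrap_replicate(matrix, rng):
    """Resample characters (columns) with replacement, keeping the matrix size."""
    n_chars = len(matrix[0])
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[j] for j in picks] for row in matrix]

def bootstrap_proportions(replicate_trees):
    """Percentage of replicate trees containing each group (frozenset of taxa)."""
    counts = Counter(group for tree in replicate_trees for group in tree)
    return {group: 100.0 * c / len(replicate_trees) for group, c in counts.items()}

AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
# Groups found in four hypothetical replicate analyses.
trees = [{AB, CD}, {AB, CD}, {AB}, {AC}]
bps = bootstrap_proportions(trees)  # AB: 75.0, CD: 50.0, AC: 25.0
```

Groups with a proportion above 50 would appear in the majority-rule consensus; the rest correspond to the lower rows of a partition table.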

15
Bootstrapping
Randomly resample characters from the original
data with replacement to build many bootstrap
replicate data sets of the same size as the
original - analyse each replicate data set

Original data matrix
Taxa    Characters
        1 2 3 4 5 6 7 8
A       R R Y Y Y Y Y Y
B       R R Y Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y R R R R R R
Outgp   R R R R R R R R

Resampled data matrix
Taxa    Characters
        1 2 2 5 5 6 6 8
A       R R R Y Y Y Y Y
B       R R R Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y Y R R R R R
Outgp   R R R R R R R R

Summarise the results of multiple analyses with a
majority-rule consensus tree. Bootstrap
proportions (BPs) are the frequencies with which
groups are encountered in analyses of replicate
data sets

[Figure: trees of taxa A, B, C, D and the outgroup
with characters mapped on branches, and a
majority-rule consensus tree labelled with the
values 96 and 66]
16
Bootstrapping - an example
Ciliate SSUrDNA - parsimony bootstrap
Taxa: Ochromonas (1), Symbiodinium (2),
Prorocentrum (3), Loxodes (4), Tracheloraphis (5),
Spirostomum (6), Gruberia (7), Euplotes (8),
Tetrahymena (9)
Partition Table (taxon membership patterns not
recoverable)
Freq: 100.00, 100.00, 100.00, 100.00, 95.50, 84.33,
11.83, 3.83, 2.50, 1.00, 1.00
[Figure: majority-rule consensus tree with
bootstrap proportions 100, 84, 96, 100, 100, 100
on its branches]
17
Bootstrapping - random data
Randomly permuted data - parsimony bootstrap
Partition Table (taxon membership patterns not
recoverable)
Freq: 71.17, 58.87, 26.43, 25.67, 23.83, 21.00,
18.50, 16.00, 15.67, 13.17, 12.67, 12.00, 12.00,
11.00, 10.80, 10.50, 10.00
[Figure: majority-rule consensus (with minority
components)]
18
Bootstrap - interpretation
  • Bootstrapping was introduced as a way of
    establishing confidence intervals for phylogenies
  • This interpretation of bootstrap proportions
    (BPs) depends on assuming that the original data
    is a random (fair) sample from independent and
    identically distributed data
  • However, several things complicate this
    interpretation
  • Perhaps the assumptions are unreasonable - making
    any statistical interpretation of BPs invalid
  • Some theoretical work indicates that BPs are very
    conservative, and may underestimate confidence
    intervals - problem increases with numbers of
    taxa
  • BPs can be high for incongruent relationships in
    separate analyses - and can therefore be
    misleading (misleading data -> misleading BPs)
  • with parsimony it may be highly affected by
    inclusion or exclusion of only a few characters

19
Bootstrap - interpretation
  • Bootstrapping is a very valuable and widely used
    technique - it (or some suitable alternative) is
    demanded by some journals, but it may require a
    pragmatic interpretation
  • BPs depend on two aspects of the support for a
    group - the numbers of characters supporting a
    group and the level of support for incongruent
    groups
  • BPs thus provide an index of the relative
    support for groups provided by a set of data
    under whatever interpretation of the data (method
    of analysis) is used

20
Bootstrap - interpretation
  • High BPs (e.g. > 85%) are indicative of strong
    signal in the data
  • Provided we have no evidence of strong misleading
    signal (e.g. base composition biases, great
    differences in branch lengths) high BPs are
    likely to reflect strong phylogenetic signal
  • Low BPs need not mean the relationship is false,
    only that it is poorly supported
  • Bootstrapping can be viewed as a way of exploring
    the robustness of phylogenetic inferences to
    perturbations in the balance of supporting
    and conflicting evidence for groups

21
Jackknifing
  • Jackknifing is very similar to bootstrapping and
    differs only in the character resampling strategy
  • Some proportion of characters (e.g. 50%) are
    randomly selected and deleted
  • Replicate data sets are analysed and the results
    summarised with a majority-rule consensus tree
  • Jackknifing and bootstrapping tend to produce
    broadly similar results and have similar
    interpretations
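
A minimal sketch of the jackknife resampling step, assuming the same row-per-taxon matrix layout as for the bootstrap. The function name and the default deletion fraction of 0.5 are illustrative assumptions.

```python
import random

def jackknife_replicate(matrix, rng, delete_fraction=0.5):
    """Delete a random fraction of characters (columns) without replacement."""
    n_chars = len(matrix[0])
    n_keep = n_chars - int(round(delete_fraction * n_chars))
    kept = sorted(rng.sample(range(n_chars), n_keep))  # keep original column order
    return [[row[j] for j in kept] for row in matrix]

# One taxon scored for eight characters labelled 0..7 (illustrative data).
replicate = jackknife_replicate([list(range(8))], random.Random(0))
```

Unlike a bootstrap replicate, no character appears more than once, and the replicate is smaller than the original.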

22
Decay analysis
  • In parsimony analysis, a way to assess support
    for a group is to see if the group occurs in
    slightly less parsimonious trees also
  • The length difference between the shortest trees
    including the group and the shortest trees that
    exclude the group (the extra steps required to
    overturn a group) is the decay index or Bremer
    support
  • Can be extended to any optimality criterion and
    to other relationships
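
A minimal sketch of the decay index calculation, assuming a collection of trees (optimal and suboptimal) is already available as pairs of tree length and the set of groups each tree contains. Finding those trees is the hard part and is not shown; names and lengths are hypothetical.

```python
def decay_index(group, trees):
    """Extra steps needed to lose a group: shortest tree without it minus shortest with it.

    trees: list of (length, set_of_groups) pairs. Raises ValueError if the
    group is present in every tree or in none of them.
    """
    with_group = [length for length, groups in trees if group in groups]
    without_group = [length for length, groups in trees if group not in groups]
    return min(without_group) - min(with_group)

AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
# Hypothetical tree lengths and the groups each tree contains.
trees = [(618, {AB, CD}), (620, {AB}), (621, {CD}), (625, {AC})]
di_ab = decay_index(AB, trees)  # 621 - 618
di_cd = decay_index(CD, trees)  # 620 - 618
```

A group that is absent from the shortest tree gets a negative value here, consistent with only groups in the most parsimonious trees having positive decay indices.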

23
Decay analysis - example
[Figure: trees for the Ciliate SSUrDNA data and the
randomly permuted data, with decay indices on the
branches. Taxa: Ochromonas, Symbiodinium,
Prorocentrum, Loxodes, Tracheloraphis,
Spirostomum, Gruberia, Euplotes, Tetrahymena.
Decay indices shown (assignment to branches not
recoverable): 27, 1, 1, 45, 3, 8, 15, 10, 7]
24
Decay analyses - in practice
  • Decay indices for each clade can be determined
    by
  • Saving increasingly less parsimonious trees and
    producing corresponding strict consensus trees
    until the consensus is completely unresolved
  • analyses using reverse topological constraints to
    determine shortest trees that lack each clade
  • with the Autodecay or TreeRot programs (in
    conjunction with PAUP)

25
Decay indices - interpretation
  • Generally, the higher the decay index the better
    the relative support for a group
  • Like BPs, decay indices may be misleading if the
    data is misleading
  • Unlike BPs decay indices are not scaled (0-100)
    and it is less clear what is an acceptable decay
    index
  • Magnitude of decay indices and BPs generally
    correlated (i.e. they tend to agree)
  • Only groups found in all most parsimonious trees
    have decay indices greater than zero

26
Trees are typically complex - they can be thought
of as sets of less complex relationships
27
Extending Support Measures
  • The same measures (BP, JP, DI) that are used for
    clades/splits can also be determined for triplets
    and quartets
  • This provides a lot more information because
    there are more triplets/quartets than there are
    clades
  • Furthermore....

28
The Decay Theorem
  • The DI of an hypothesis of relationships is equal
    to the lowest DI of the resolved triplets that
    the hypothesis entails
  • This applies equally to BPs and JPs as well as
    DIs
  • Thus a phylogenetic chain is no stronger than its
    weakest link!
  • and, measures of clade support may give a very
    incomplete picture of the distribution of support

29
Bootstrapping with Reduced Consensus
Data matrix
A  1111100000
B  0111100000
C  0011100000
D  0001100000
E  0000100000
F  0000010000
G  0000011000
H  0000011100
I  0000011110
J  0000011111
X  1111111111
[Figure: bootstrap consensus of taxa A-J including
X, with values of 50.5 on five branches, and
reduced consensus trees of A-J with X excluded,
with values of 98 to 100 on the branches]
30
Pinpointing Uncertainty
31
Leaf Stability
  • Leaf stability is the average of supports of the
    triplets/quartets containing the leaf
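
A minimal sketch of leaf stability computed from triplet supports, assuming supports are stored in a dictionary keyed by the three leaves of each resolved triplet. The function name and the support values are hypothetical.

```python
def leaf_stability(leaf, triplet_support):
    """Average support over all resolved triplets that contain the leaf."""
    values = [s for triplet, s in triplet_support.items() if leaf in triplet]
    return sum(values) / len(values)

# Hypothetical bootstrap supports for the four triplets of leaves A, B, C, X.
support = {
    frozenset("ABC"): 100,
    frozenset("ABX"): 50,
    frozenset("ACX"): 51,
    frozenset("BCX"): 49,
}
stability_a = leaf_stability("A", support)  # (100 + 50 + 51) / 3
stability_x = leaf_stability("X", support)  # (50 + 51 + 49) / 3
```

A leaf such as X, whose position is uncertain, pulls down every triplet it is part of and so has the lowest stability.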

32
PTP tests of groups
  • A number of randomization tests have been
    proposed for evaluating particular groups rather
    than entire data matrices by testing null
    hypotheses regarding the level of support they
    receive from the data
  • Randomisation can be of the data or the group
  • These methods have not become widely used both
    because they are not readily performed and
    because their properties are still under
    investigation
  • One type, the topology dependent PTP tests are
    included in PAUP but have serious problems

33
Comparing competing phylogenetic hypotheses -
tests of two (or more) trees
  • Particularly useful techniques are those designed
    to allow evaluation of alternative phylogenetic
    hypotheses
  • Several such tests allow us to determine if one
    tree is statistically significantly worse than
    another
  • Winning sites, Templeton, Kishino-Hasegawa,
    parametric bootstrapping (SOWH)
  • Shimodaira-Hasegawa, Approximately Unbiased

34
Tests of two trees
  • Tests are of the null hypothesis that the
    differences between two trees (A and B) are no
    greater than expected from sampling error
  • The simplest, the winning sites test, sums the
    number of sites supporting tree A over tree B and
    vice versa (those having fewer steps on, and
    better fit to, one of the trees)
  • Under the null hypothesis characters are equally
    likely to support tree A or tree B and a binomial
    distribution gives the probability of the
    observed difference in numbers of winning sites
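
A minimal sketch of the winning sites calculation as a two-sided binomial (sign) test, assuming sites that fit both trees equally well are ignored. The function name is hypothetical.

```python
from math import comb

def winning_sites_p(wins_a, wins_b):
    """Two-sided binomial probability of a split at least this uneven when p = 0.5."""
    n = wins_a + wins_b  # sites favouring neither tree are not counted
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_even = winning_sites_p(5, 5)    # no evidence for either tree
p_uneven = winning_sites_p(9, 1)  # 22 / 1024
```

Only the direction of each site's preference enters the test; the size of the difference at each site is ignored.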

35
The Templeton test
  • Templeton's test is a non-parametric Wilcoxon
    signed ranks test of the differences in fits of
    characters to two trees
  • It is like the winning sites test but also
    takes into account the magnitudes of differences
    in the support of characters for the two trees
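
A minimal sketch of a Wilcoxon signed ranks calculation on per-character step counts for two trees. It assumes a normal approximation with no tie or continuity correction, which is rough when few characters differ, and it needs at least one character that differs between the trees. The function name and data are hypothetical.

```python
from math import erfc, sqrt

def templeton_test(steps_a, steps_b):
    """Return (W+, W-, two-sided p) for per-character step differences."""
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]  # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank for tied magnitudes
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    return w_plus, w_minus, erfc(abs(z) / sqrt(2))

# Steps required by five characters on tree A and on tree B (illustrative data).
w_plus, w_minus, p_value = templeton_test([1, 2, 1, 3, 1], [1, 1, 2, 1, 1])
```

The character with a two-step difference gets the highest rank, which is how the magnitude of support enters the test.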

36
Templeton's test - an example
Recent studies of the relationships of turtles
using morphological data have produced very
different results, with turtles grouping either
within the parareptiles (H1) or within the
diapsids (H2), the result depending on
the morphologist. This suggests there may be
  • problems with the data
  • special problems with turtles
  • weak support for turtle relationships
[Figure: tree of Archosauromorpha,
Lepidosauriformes, Diadectomorpha,
Eosauropterygia, Younginiformes, Seymouriadae,
Claudiosaurus, Captorhinidae, Araeoscelidia,
Paleothyris, Parareptilia, Synapsida and Placodus,
with positions marked 1 and 2]
Parsimony analysis of the most recent data
favoured H2. However, analyses constrained by H1
produced trees that required only 3 extra steps
(< 1% of tree length).
The Templeton test was used to evaluate the trees
and showed that the slightly longer H1 tree
found in the constrained analyses was not
significantly worse than the unconstrained H2
tree. The morphological data do not allow choice
between H1 and H2.
37
Kishino-Hasegawa test
  • The Kishino-Hasegawa test is similar in using
    differences in the support provided by individual
    sites for two trees to determine if the overall
    differences between the trees are significantly
    greater than expected from random sampling error
  • It is a parametric test that depends on
    assumptions that the characters are independent
    and identically distributed (the same assumptions
    underlying the statistical interpretation of
    bootstrapping)
  • It can be used with parsimony and maximum
    likelihood - implemented in PHYLIP and PAUP
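
A minimal sketch of the normal-approximation form of the test, assuming a list of per-site differences (in steps or log-likelihoods) between two trees is already available. The standard deviation of the total is estimated from the variance among sites; the function name and data are hypothetical.

```python
from math import erfc, sqrt

def kh_test(site_diffs):
    """Return (total difference, SD of the total, two-sided p) under a normal approximation."""
    n = len(site_diffs)
    total = sum(site_diffs)
    mean = total / n
    var_site = sum((d - mean) ** 2 for d in site_diffs) / (n - 1)  # sample variance per site
    sd_total = sqrt(n * var_site)  # sites assumed independent and identically distributed
    z = total / sd_total
    return total, sd_total, erfc(abs(z) / sqrt(2))

# Sites that split evenly between the trees versus sites that mostly favour one tree.
balanced = kh_test([1, -1, 1, -1])
one_sided = kh_test([1, 1, 1, 1, 1, 1, 1, 1, 0, 0])
```

A total difference reported as, say, -21 with an SD of 22 lies within about two SDs of zero and would not be judged significant at the 5% level.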

38
Kishino-Hasegawa test
If the difference between trees (tree lengths or
likelihoods) is attributable to sampling error,
then characters will randomly support tree A or B
and the total difference will be close to
zero. The observed difference is significantly
greater than zero if it is greater than 1.96
standard deviations. This allows us to reject the
null hypothesis and declare the sub-optimal tree
significantly worse than the optimal tree
(p < 0.05).
[Figure: distribution of step/likelihood
differences at each site, with sites favouring
tree A on one side, sites favouring tree B on the
other, and an expected mean of 0]
Under the null hypothesis the mean of the
differences in parsimony steps or likelihoods for
each site is expected to be zero, and the
distribution normal. From observed differences we
calculate a standard deviation.
39
Kishino-Hasegawa test
Ciliate SSUrDNA
[Figure: maximum likelihood tree with anaerobic
ciliates with hydrogenosomes marked. Taxa:
Ochromonas, Symbiodinium, Prorocentrum,
Sarcocystis, Theileria, Plagiopyla n,
Plagiopyla f, Trimyema c, Trimyema s,
Cyclidium p, Cyclidium g, Cyclidium l, Glaucoma,
Colpodinium, Tetrahymena, Paramecium, Discophrya,
Trithigmostoma, Opisthonecta, Colpoda,
Dasytrichia, Entodinium, Spathidium, Loxophylum,
Homalozoon, Metopus c, Metopus p, Stylonychia,
Onychodromous, Oxytrichia, Loxodes,
Tracheloraphis, Spirostomum, Gruberia,
Blepharisma]
Parsimonious character optimization of the
presence and absence of hydrogenosomes suggests
four separate origins within the ciliates
Questions
  • how reliable is this result?
  • in particular, how well supported is the idea of
    multiple origins?
  • how many origins can we confidently infer?
40
Kishino-Hasegawa test
Parsimony analyses with topological constraints
found the shortest trees forcing hydrogenosomal
ciliate lineages together, thereby reducing the
number of separate origins of hydrogenosomes.
Each of the constrained parsimony trees was
compared to the ML tree and the Kishino-Hasegawa
test used to determine which of these trees were
significantly worse than the ML tree.
[Figure: two topological constraint trees for the
ciliate taxa]
41
Kishino-Hasegawa test
Test summary and results (simplified)
Constrained analyses used to find most
parsimonious trees with less than four separate
origins of hydrogenosomes. Tested against ML
tree. Trees with 2 or 1 origin are all
significantly worse than the ML tree. We can
confidently conclude that there have been at
least three separate origins of hydrogenosomes
within the sampled ciliates.

No.      Constraint       Extra  Difference  Significantly
Origins  tree             Steps  and SD      worse?
4        ML               10     -           -
4        MP               -      -13 ± 18    No
3        (cp,pt)          13     -21 ± 22    No
3        (cp,rc)          113    -337 ± 40   Yes
3        (cp,m)           47     -147 ± 36   Yes
3        (pt,rc)          96     -279 ± 38   Yes
3        (pt,m)           22     -68 ± 29    Yes
3        (rc,m)           63     -190 ± 34   Yes
2        (pt,cp,rc)       123    -432 ± 40   Yes
2        (pt,rc,m)        100    -353 ± 43   Yes
2        (pt,cp,m)        40     -140 ± 37   Yes
2        (cp,rc,m)        124    -466 ± 49   Yes
2        (pt,cp)(rc,m)    77     -222 ± 39   Yes
2        (pt,m)(rc,cp)    131    -442 ± 48   Yes
2        (pt,rc)(cp,m)    140    -414 ± 50   Yes
1        (pt,cp,m,rc)     131    -515 ± 49   Yes
42
Problems with tests of trees
  • To be statistically valid, the Kishino-Hasegawa
    test should be of trees that are selected a
    priori
  • However, most applications have used trees
    selected a posteriori on the basis of the
    phylogenetic analysis
  • Where we test the best tree against some other
    tree the KH test will be biased towards rejection
    of the null hypothesis
  • Only if null hypothesis is not rejected will
    result be safe from some unknown level of bias

43
Problems with tests of trees
  • The Shimodaira-Hasegawa test is a more
    statistically correct technique for testing trees
    selected a posteriori and is implemented in PAUP
  • However it requires selection of a set of
    plausible topologies - hard to give practical
    advice
  • Parametric bootstrapping (SOWH test) is an
    alternative - but it is harder to implement and
    may suffer from an opposite bias due to model
    mis-specification
  • The Approximately Unbiased test (implemented in
    CONSEL) may be the best option currently

44
Problems with tests of trees
45
Taxonomic Congruence
  • Trees inferred from different data sets
    (different genes, morphology) should agree if
    they are accurate
  • Congruence between trees is best explained by
    their accuracy
  • Congruence can be investigated using consensus
    (and supertree) methods
  • Incongruence requires further work to explain or
    resolve disagreements

46
Reliability of Phylogenetic Methods
  • Phylogenetic methods (e.g. parsimony, distance,
    ML) can also be evaluated in terms of their
    general performance, particularly their
  • consistency - approach the truth with more data
  • efficiency - how quickly (how much data)
  • robustness - sensitivity to violations of
    assumptions
  • Studies of these properties can be analytical or
    by simulation

47
Reliability of Phylogenetic Methods
  • There have been many arguments that ML methods
    are best because they have desirable statistical
    properties, such as consistency
  • However, ML does not always have these properties
  • if the model is wrong/inadequate (fortunately
    this is testable to some extent)
  • properties not yet demonstrated for complex
    inference problems such as phylogenetic trees

48
Reliability of Phylogenetic Methods
  • Simulations show that ML methods generally
    outperform distance and parsimony methods over a
    broad range of realistic conditions
  • Whelan et al. 2001 Trends in Genetics
    17: 262-272
  • But
  • Most simulations cover a narrow range of very
    (unrealistically) simple conditions
  • few taxa (typically just four!)
  • few parameters (standard models - JC, K2P etc)

49
Reliability of Phylogenetic Methods
  • Simulations with four taxa have shown
  • Model based methods - distance and maximum
    likelihood perform well when the model is
    accurate (not surprising!)
  • Violations of assumptions can lead to
    inconsistency for all methods (a Felsenstein
    zone) when branch lengths or rates are highly
    unequal
  • Maximum likelihood methods are quite robust to
    violations of model assumptions
  • Weighting can improve the performance of
    parsimony (reduce the size of the Felsenstein
    zone)

50
Reliability of Phylogenetic Methods
  • However
  • Generalising from four taxon simulations may be
    dangerous as conclusions may not hold for more
    complex cases
  • A few large scale simulations (many taxa) have
    suggested that parsimony can be very accurate and
    efficient
  • Most methods are accurate in correctly recovering
    known phylogenies produced in laboratory studies
  • More realistic simulations are needed if they are
    to help in choosing/understanding methods