Title: Support for trees and nodes
1Support for trees and nodes
2Assessing phylogenetic hypotheses
- We should not be content with constructing
phylogenetic hypotheses; we should also assess
how much confidence we can place in them
3Assigning confidence intervals to phylogenies
- Resampling methods
  - bootstrap (non-parametric)
  - jackknife
- Other methods
  - decay analyses (MP only)
  - posterior probability (ML Bayesian)
4Assigning confidence intervals to phylogenies
- The sampling error
  - we use samples in our studies
  - the values estimated from a sample of a population
    will be more or less close to the real value, but
    they will rarely coincide
- A way of calculating the sampling error is to take
  multiple samples from the population and
  compare the estimates obtained from each of
  them.
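The idea of taking multiple samples and comparing the estimates can be sketched as follows. This is only an illustration: the population (10,000 normally distributed values), the sample size and the number of samples are all assumed here, not taken from the slides.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated samples and return the mean estimated from each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population with a known "real" value (its mean).
rng = random.Random(42)
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Each sample gives a slightly different estimate of the real value.
estimates = sample_means(population, sample_size=30, n_samples=200)

# The spread of the estimates is a measure of the sampling error.
sampling_error = statistics.stdev(estimates)
print(f"real value = {true_mean:.2f}")
print(f"spread of the 200 estimates = {sampling_error:.2f}")
```

In phylogenetics we usually cannot draw new samples from the population, which is why resampling the data we already have is used instead.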
7Bootstrapping
- The frequency with which a given group appears
in the trees built from the resampled data is a
measure of the support for that group
- These values are shown on a consensus tree, and
some additional information is given in a table
8Bootstrapping
The pseudoreplicates are obtained by sampling characters
from the original matrix with replacement, to build a new
matrix of the same size as the original one.
Original matrix
Taxa      Characters
          1 2 3 4 5 6 7 8
A         R R Y Y Y Y Y Y
B         R R Y Y Y Y Y Y
C         Y Y Y Y Y R R R
D         Y Y R R R R R R
Outgroup  R R R R R R R R
Pseudoreplicate-1
Taxa      Characters
          1 2 2 2 5 5 8 8
A         R R R R Y Y Y Y
B         R R R R Y Y Y Y
C         Y Y Y Y Y Y R R
D         Y Y Y Y R R R R
Outgroup  R R R R R R R R
Pseudoreplicate-2
Taxa      Characters
          3 3 4 4 5 7 7 8
A         Y Y Y Y Y Y Y Y
B         Y Y Y Y Y Y Y Y
C         Y Y Y Y Y R R R
D         R R R R R R R R
Outgroup  R R R R R R R R
(and so on, up to Pseudoreplicate-n)
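Sampling characters with replacement can be sketched in a few lines. The matrix is the original one from this slide; the function names `take_columns` and `bootstrap_pseudoreplicate` and the random seed are assumptions made for illustration only.

```python
import random

# Original matrix from the slide: one string per taxon, one column per character.
ORIGINAL = {
    "A":        "RRYYYYYY",
    "B":        "RRYYYYYY",
    "C":        "YYYYYRRR",
    "D":        "YYRRRRRR",
    "Outgroup": "RRRRRRRR",
}

def take_columns(matrix, chars):
    """Build a matrix from the given 1-based character numbers (repeats allowed)."""
    return {taxon: "".join(row[c - 1] for c in chars) for taxon, row in matrix.items()}

def bootstrap_pseudoreplicate(matrix, rng):
    """Sample characters with replacement until the original size is reached."""
    n_chars = len(next(iter(matrix.values())))
    # Sorting is cosmetic: it only makes the result easier to compare by eye.
    chars = sorted(rng.randint(1, n_chars) for _ in range(n_chars))
    return chars, take_columns(matrix, chars)

# The first pseudoreplicate on the slide drew characters 1 2 2 2 5 5 8 8.
for taxon, row in take_columns(ORIGINAL, [1, 2, 2, 2, 5, 5, 8, 8]).items():
    print(f"{taxon:<9} {row}")

# A new random pseudoreplicate (arbitrary seed, only for reproducibility).
rng = random.Random(1)
chars, pseudo = bootstrap_pseudoreplicate(ORIGINAL, rng)
print(chars)
```

Because the draw is with replacement, some characters appear several times in a pseudoreplicate and others not at all, while the matrix keeps its original size.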
9Bootstrapping
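Real phylogeny estimate
Trees obtained from the 100 pseudoreplicates (figure: number of pseudoreplicates giving each tree)
Case-1: 30, 60, 10
Case-2: 31, 20, 49
Frequency of each group among the 100 trees
Group   Case-1   Case-2
AB      90       80
ABC     60       49
CD      40       51
BCD     10       20
Consensus bootstrap case-1: AB 90, ABC 60
Consensus bootstrap case-2: AB 80, CD 51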
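The step from tree counts to group frequencies and then to the bootstrap consensus can be sketched as below. The three topologies are an assumption: the transcript does not preserve the drawn trees, so they are chosen here only because they reproduce the group frequencies given on the slide. The function names are likewise hypothetical.

```python
from collections import Counter

# Assumed topologies, each written as the set of groups (clades) it contains.
TREES = {
    "((A,B),(C,D))": {"AB", "CD"},
    "(((A,B),C),D)": {"AB", "ABC"},
    "(A,(B,(C,D)))": {"CD", "BCD"},
}

def group_frequencies(tree_counts):
    """Percentage of pseudoreplicate trees in which each group appears."""
    total = sum(tree_counts.values())
    counts = Counter()
    for tree, n_trees in tree_counts.items():
        for group in TREES[tree]:
            counts[group] += n_trees
    return {group: 100 * n / total for group, n in counts.items()}

def majority_rule(freqs, threshold=50):
    """Keep only the groups found in more than `threshold` percent of the trees."""
    return {group: f for group, f in freqs.items() if f > threshold}

# Number of pseudoreplicates (out of 100) that gave each tree.
case1 = {"((A,B),(C,D))": 30, "(((A,B),C),D)": 60, "(A,(B,(C,D)))": 10}
case2 = {"((A,B),(C,D))": 31, "(((A,B),C),D)": 49, "(A,(B,(C,D)))": 20}

for name, case in (("case-1", case1), ("case-2", case2)):
    freqs = group_frequencies(case)
    print(name, freqs, "consensus:", majority_rule(freqs))
```

Under these assumed topologies the two cases differ by only a few trees, yet the consensus changes from AB + ABC to AB + CD, which shows how sensitive the summary is for groups near 50%.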
10Bootstrapping
Partition table
SSU rDNA, ciliates - parsimony
Taxa (in tree order): Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Euplotes (8), Tetrahymena (9), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7)
Partition frequencies (%), in table order (the partition patterns over taxa 123456789 are not recoverable): 100.00, 100.00, 100.00, 100.00, 95.50, 84.33, 11.83, 3.83, 2.50, 1.00, 1.00
Majority-rule consensus (figure): support values shown on the tree are 100, 84, 96, 100, 100, 100
11Interpretation of Bootstraps (BPs)
- Felsenstein (1985): the BP is a measure of repeatability
  - probability that the internal branch appears when
    a new analysis is done with an independent sample
- Felsenstein and Kishino (1993): the BP is a measure of
  accuracy
  - probability that the internal branch is in the
    real tree
12Interpretation of Bootstraps (BPs)
- BPs provide an index of the relative support for
groups provided by a set of data under a certain
method of analysis.
- High BPs are indicative of strong signal in the
data
- Provided we have no evidence of strong misleading
signal (e.g. base composition bias, great
differences in branch lengths), high BPs are
likely to reflect strong phylogenetic signal
13Interpretation of Bootstraps (BPs)
- Low BPs need not mean the relationship is false,
only that it is poorly supported (by these data)
14Interpretation of Bootstraps (BPs)
- Bootstrapping was introduced as a way of
establishing confidence intervals for phylogenies
- This interpretation of bootstrap proportions
(BPs) depends on the assumption that the original
data are a random sample from a much larger set of
independent and identically distributed data
15Interpretation of Bootstraps (BPs)
- However, several things complicate this
interpretation
- Perhaps the assumptions are unreasonable,
making any statistical interpretation of BPs
invalid
- Some theoretical work indicates that BPs are very
conservative and may underestimate confidence;
the problem increases with the number of
taxa
16Interpretation of Bootstraps (BPs)
- BPs can be high for incongruent relationships in
separate analyses, and can therefore be
misleading (misleading data -> misleading BPs)
- With parsimony, BPs may be strongly affected by the
inclusion or exclusion of only a few characters
17Jackknifing
- Jackknifing is similar to bootstrapping, the only
difference being in the way the data are
resampled
- A certain proportion of the data is eliminated at
random (e.g. 50%)
- The pseudoreplicates are analysed and the results
are summarised in a majority-rule consensus
tree
- Jackknifing and bootstrapping usually give similar
results and are interpreted in a similar way
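The difference in resampling can be sketched as follows: instead of drawing characters with replacement, a fraction of them is deleted at random. The matrix is the small one from the bootstrapping example; the function name, the 50% default and the seed are assumptions for illustration.

```python
import random

# Small character matrix from the bootstrapping example (one column per character).
MATRIX = {
    "A":        "RRYYYYYY",
    "B":        "RRYYYYYY",
    "C":        "YYYYYRRR",
    "D":        "YYRRRRRR",
    "Outgroup": "RRRRRRRR",
}

def jackknife_pseudoreplicate(matrix, rng, delete_fraction=0.5):
    """Delete a random fraction of the characters; no character is repeated."""
    n_chars = len(next(iter(matrix.values())))
    n_delete = round(n_chars * delete_fraction)
    deleted = set(rng.sample(range(n_chars), n_delete))
    kept = [i for i in range(n_chars) if i not in deleted]
    reduced = {taxon: "".join(row[i] for i in kept) for taxon, row in matrix.items()}
    # Return 1-based character numbers, as they are numbered on the slides.
    return [i + 1 for i in kept], reduced

rng = random.Random(7)  # arbitrary seed, only for reproducibility
kept_chars, reduced = jackknife_pseudoreplicate(MATRIX, rng)
print("characters kept:", kept_chars)
for taxon, row in reduced.items():
    print(f"{taxon:<9} {row}")
```

Unlike a bootstrap pseudoreplicate, the jackknifed matrix is smaller than the original and contains each retained character exactly once.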
18Assigning confidence intervals to phylogenies
- Resampling methods
  - bootstrap (non-parametric)
  - jackknife
- Other methods
  - decay analyses (MP only)
  - posterior probability (ML Bayesian)
19Posterior Probability
- In the Bayesian method of inference, a posterior
probability (BPP) is calculated for every node.
- This value has a direct statistical
interpretation
  - the probability that the given group is true,
    given a model, the priors and the data
    (Huelsenbeck, 2002)
- However, for the same data, BPP values tend
to be much higher than bootstrap values.
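For reference, the quantity behind a BPP can be written out with Bayes' theorem. The notation is assumed here rather than taken from the slides: $\tau_i$ is a tree, $X$ the data, $P(\tau_i)$ the prior on trees, and $g$ a group; the evolutionary model is taken as fixed.

```latex
P(\tau_i \mid X) = \frac{P(X \mid \tau_i)\, P(\tau_i)}{\sum_{j} P(X \mid \tau_j)\, P(\tau_j)},
\qquad
\mathrm{BPP}(g) = \sum_{i \,:\, g \in \tau_i} P(\tau_i \mid X)
```

In practice the sum over all trees cannot be computed directly, so the BPP of a group is usually approximated by the proportion of trees sampled by MCMC that contain it.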
20Posterior Probability vs. bootstrap
- Douady et al. (2003) proposed treating bootstrap
and posterior probability values as the lower and
upper limits, respectively, on support for a node.
- Alfaro et al. (2003) found that BPP can give high
values for branches that are very short (when
these are true)
- What is clear is that the two values are not
equivalent, and consequently it is not possible
to compare them.