Title: Molecular dating methods continued
1Molecular dating methods continued
2Why should we expect a clock?
- Under neutral evolution, but that is too fast for most (all?) data sets
- If there is reasonable constancy of population size, mutation rate, and patterns of selection
- Perhaps all we can hope is that rates of evolution will change slowly and/or rarely
3How do we test for clock like evolution?
- Relative rates tests
- Likelihood ratio tests
4The likelihood approach
- Consider two models of evolution
- The usual model
- The same model but
- A root is specified
- The summed branch lengths from any node to each of its descendant tips are the same
- Do a likelihood ratio test with n-2 degrees of freedom
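The test on this slide can be sketched in Python. The n-2 degrees of freedom follow from parameter counts: an unrooted tree of n taxa has 2n-3 free branch lengths, a clock tree has n-1 node heights, and the difference is n-2. The function names and the log-likelihood values below are hypothetical, and the chi-square tail is computed with a plain series expansion only to keep the sketch free of dependencies.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, half_x).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= half_x / (a + n)
        total += term
        n += 1
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a)
    p_lower = total * math.exp(log_prefactor)
    return max(0.0, 1.0 - p_lower)

def clock_lrt(lnl_no_clock, lnl_clock, n_taxa):
    """Likelihood ratio test of a strict clock against the unconstrained model."""
    stat = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)

# Hypothetical log-likelihoods for a 10-taxon data set (not from the slides).
stat, df, p_value = clock_lrt(-5230.4, -5236.9, n_taxa=10)
print(f"2*delta lnL = {stat:.2f}, df = {df}, p = {p_value:.3f}")
```

A small p-value would reject the clock; a large one means the clock model is not rejected, which is the case the next slide takes up.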
5If a clock model is not rejected
- Calculate rates and then extrapolate from known to unknown pairwise distances
D_OA = 0.4, D_AB = 0.1, T_OA = 90 Ma
T_AB = (0.1/0.4) × 90 = 22.5 Ma
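(Figure: tree of O, A and B with branch lengths 0.05, 0.05, 0.2 and 0.15; the root is dated at 90 Ma)
6(Figure: the same tree on a time scale, with the root at 90 Ma and the A-B split at 22.5 Ma)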
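The extrapolation on slide 5 can be written as a short Python sketch. The function name is hypothetical. It assumes a strict clock, so a pairwise distance is twice the rate times the divergence time.

```python
def extrapolate_age(d_unknown, d_known, t_known):
    """Date an undated split from a calibrated one under a strict clock."""
    # A pairwise distance accumulates along two lineages, hence the factor of 2.
    rate = d_known / (2.0 * t_known)  # substitutions per site per Ma
    return d_unknown / (2.0 * rate)

# Values from the slide: D_OA = 0.4, D_AB = 0.1, T_OA = 90 Ma.
t_ab = extrapolate_age(d_unknown=0.1, d_known=0.4, t_known=90.0)
print(f"T_AB = {t_ab:.1f} Ma")
```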
7Should obtain confidence intervals around date
estimates
- Look at the curvature of the likelihood surface (PAML)
- Use bootstrapping (parametric or non-parametric)
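As a rough illustration of the non-parametric bootstrap option, the sketch below resamples alignment columns with replacement, re-dates the A-B split for each replicate, and reports a percentile interval. Everything in it is an assumption made for illustration: the toy sequences are invented so that D_OA = 0.4 and D_AB = 0.1, uncorrected p-distances stand in for model-based distances, and the function names are hypothetical.

```python
import random

def p_distance(s1, s2):
    """Proportion of aligned sites that differ between two sequences."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

def bootstrap_dates(seq_o, seq_a, seq_b, t_oa, n_reps=1000, seed=1):
    """Re-estimate T_AB on alignments built by resampling columns with replacement."""
    rng = random.Random(seed)
    n_sites = len(seq_o)
    dates = []
    for _ in range(n_reps):
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        o = [seq_o[i] for i in cols]
        a = [seq_a[i] for i in cols]
        b = [seq_b[i] for i in cols]
        d_oa = p_distance(o, a)
        if d_oa == 0:
            continue  # the date is undefined for this replicate
        dates.append(p_distance(a, b) / d_oa * t_oa)
    return sorted(dates)

def percentile_interval(sorted_values, level=0.95):
    """Percentile confidence interval from sorted bootstrap estimates."""
    lo = int((1 - level) / 2 * len(sorted_values))
    hi = int((1 + level) / 2 * len(sorted_values)) - 1
    return sorted_values[lo], sorted_values[hi]

# Toy 40-site alignment (invented): 16 O-A differences, 4 A-B differences.
seq_a = "A" * 40
seq_b = "C" * 4 + "A" * 36
seq_o = "A" * 24 + "G" * 16

dates = bootstrap_dates(seq_o, seq_a, seq_b, t_oa=90.0)
lower, upper = percentile_interval(dates)
print(f"95% bootstrap interval for T_AB: {lower:.1f} to {upper:.1f} Ma")
```

With so few sites the interval is wide, which is the point of reporting it alongside the point estimate.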
8Calibrating the tree
- How does one attach a date to an internal node?
- How old is the fossil?
- Where does a fossil fit on the tree?
(Figure: tree with fossil F (90 Ma) and taxon O)
9What does that tell us?
(Figure: tree with fossil F (90 Ma) and taxon O; one node is labelled "This node is at least 90 Ma")
10What else?
(Figure: tree with F and O; one node is labelled "This node is less than 90 Ma" and another "This node is at least 90 Ma")
11The lineage leading to F could have been missed
(Figure: tree with F and O; one node is labelled "This node is at least 90 Ma")
12General issues
- Fossils generally provide only minimal ages
- The age is attached to the node below the lowest place on the tree that the fossil could attach
- Maximal or absolute ages can only be asserted when there is lots of fossil data
- Geological events can sometimes be used to obtain minimal ages
13What if a clock is rejected?
- Until recently three (bad) choices
- Give up on molecular dating
- Go ahead and use molecular dating anyway
- Delete extra-fast or extra-slow taxa
- Now one has several options
- Assume local clocks
- Non-parametric rate smoothing (NPRS)
- Penalized likelihood (PL)
- Model-based methods (Bayesian)
14Local clock
15Local clock
- We assign branches to rate categories but force a single rate per category (R1 to Rc)
- Find the set of rates that maximizes the likelihood
- Can do a likelihood ratio test against a strict clock (df = c-1)
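16Non-Parametric Rate-Smoothing (NPRS; Sanderson 1998)
(Figure: node k with ancestral branch a and descendant branches d1 and d2)
- Adjust node times so as to minimize overall roughness (the summed squared differences between the rate on each branch and the rates on its descendant branches)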
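The roughness that NPRS minimizes can be sketched as follows. This is a simplified version under stated assumptions: the tree encoding (a `parent` dictionary plus branch lengths and node ages) and the function names are hypothetical, squared differences are used as the penalty, and any special treatment of the root is left out. The example tree reuses the branch lengths and dates from slides 5 and 6.

```python
def branch_rates(parent, branch_length, node_time):
    """Rate on the branch above each non-root node: length divided by duration."""
    return {
        node: branch_length[node] / (node_time[anc] - node_time[node])
        for node, anc in parent.items()
    }

def roughness(parent, branch_length, node_time):
    """Sum of squared rate differences between each branch and its descendant branches."""
    rates = branch_rates(parent, branch_length, node_time)
    total = 0.0
    for node, anc in parent.items():
        if anc in rates:  # children of the root are skipped: no branch above the root
            total += (rates[anc] - rates[node]) ** 2
    return total

parent = {"O": "root", "AB": "root", "A": "AB", "B": "AB"}
branch_length = {"O": 0.2, "AB": 0.15, "A": 0.05, "B": 0.05}

# Node ages in Ma; tips are at time 0.
clock_times = {"root": 90.0, "AB": 22.5, "O": 0.0, "A": 0.0, "B": 0.0}
other_times = dict(clock_times, AB=45.0)

print(roughness(parent, branch_length, clock_times))
print(roughness(parent, branch_length, other_times))
```

With the clock-consistent dates every branch has the same rate, so the roughness is essentially zero; moving the A-B node to 45 Ma forces a rate change and the roughness increases. NPRS searches over node times for the smoothest such assignment.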
17Penalized Likelihood (Sanderson 2001)
- Semi-parametric likelihood approach
- The observed data are branch-lengths
(Figure: example tree with observed branch lengths 2, 1, 3, 10, 5, 4, 4, 7, 1, 2)
18Penalized Likelihood
- Given the duration of a branch and its rate of evolution, we can calculate the probability (Poisson) of a given branch length
- For a set of branching times and rates we can calculate the likelihood of obtaining this set of branch lengths
(Figure: example tree with observed branch lengths 2, 1, 3, 10, 5, 4, 4, 7, 1, 2)
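The Poisson calculation on slide 18 can be sketched as below. Branch lengths are treated as counts of substitutions with expected value rate × duration. The function names and the rates and durations in the example are hypothetical; only the counts are taken from the first three branch lengths listed on the slide.

```python
import math

def poisson_log_pmf(count, mean):
    """Log probability of observing `count` events when `mean` are expected."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)

def branch_log_likelihood(counts, rates, durations):
    """Log-likelihood of a set of branch lengths given per-branch rates and durations."""
    return sum(
        poisson_log_pmf(k, r * t) for k, r, t in zip(counts, rates, durations)
    )

counts = [2, 1, 3]            # observed substitutions on three branches
rates = [0.5, 0.5, 0.5]       # hypothetical: substitutions per Ma
durations = [4.0, 2.0, 6.0]   # hypothetical: branch durations in Ma

print(branch_log_likelihood(counts, rates, durations))
```

Setting each rate to count / duration makes every expected value match the observed count exactly, which is why the unpenalized maximum has a different rate on each branch.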
19Penalized likelihood
- What would be the maximum likelihood solution?
- A different rate on each branch
- To prevent this we penalize the likelihood by subtracting the roughness function (same as NPRS) adjusted by a smoothing parameter, λ
- Adjust the degree of penalization using λ
- How do you pick a value of λ?
20Penalized Likelihood
- Selects the optimal value of λ using cross-validation: pick the value that minimizes the errors made in predicting terminal branch lengths
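The penalized objective described on slides 19 and 20 can be written compactly. The notation is an assumption made here, not taken from the slides: X stands for the observed branch lengths, R for the branch rates, T for the node times, Φ for the roughness, λ for the smoothing parameter, and desc(k) for the branches descending from node k. Any special handling of the root is omitted.

```latex
\Psi(R, T \mid \lambda) = \log L(X \mid R, T) - \lambda \, \Phi(R),
\qquad
\Phi(R) = \sum_{k} \sum_{j \in \mathrm{desc}(k)} \left( r_k - r_j \right)^2
```

As λ approaches 0 the penalty vanishes and every branch is free to take its own rate; as λ grows, rate differences become costly and the solution approaches a strict clock. Cross-validation chooses a value between those extremes.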
22Penalized Likelihood
- More flexible than NPRS
- More difficult to implement
- Worth trying for non-clock-like data
23Bayesian dating
- Two main approaches
- Local clocks (a certain number of changes of rate are permitted and their position is kept track of during MCMC)
- Autocorrelation of rates (evolutionary rates tend to change gradually), somewhat analogous to NPRS and PL
24Thorne et al. (2001) method, Mol. Biol. Evol. 18:352-361
- Most widely used. See http://www.plant.ch/bayesiandating1.4.pdf
- Each node has a rate. A node's rate will tend to be correlated with that of its ancestral node
- The rate changes (as a function of time) via Brownian motion
- The rate of a node is normally distributed with a mean equal to the rate of the ancestral node and a variance estimated from the data
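A minimal simulation sketch of autocorrelated rates follows. It makes several assumptions that should be read as such: the draw is made on the logarithm of the rate (which keeps rates positive), the variance is taken as a parameter `nu` times the duration of the branch, and `nu` is simply supplied rather than estimated from data. The tree encoding and function name are hypothetical.

```python
import math
import random

def simulate_node_rates(parent, node_time, root_rate, nu, seed=0):
    """Draw a rate for every node, centred on the rate of its ancestral node."""
    rng = random.Random(seed)
    rates = {}
    # Visit nodes from oldest to youngest so each ancestor is assigned first.
    for node in sorted(node_time, key=node_time.get, reverse=True):
        anc = parent.get(node)
        if anc is None:
            rates[node] = root_rate
            continue
        duration = node_time[anc] - node_time[node]
        sd = math.sqrt(nu * duration)
        # Normal draw on the log scale, with the ancestor's log rate as the mean.
        rates[node] = math.exp(rng.gauss(math.log(rates[anc]), sd))
    return rates

parent = {"O": "root", "AB": "root", "A": "AB", "B": "AB"}
node_time = {"root": 90.0, "AB": 22.5, "O": 0.0, "A": 0.0, "B": 0.0}

rates = simulate_node_rates(parent, node_time, root_rate=0.002, nu=0.001)
for node, rate in rates.items():
    print(node, round(rate, 5))
```

Setting `nu` to 0 reproduces a strict clock, since every node then inherits its ancestor's rate unchanged; larger values let rates drift further over a given span of time.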