Title: Maximum likelihood
1. Maximum likelihood
2. The maximum likelihood criterion
- The optimal tree is the one that would be most likely to give rise to the observed data (under a given model of evolution)
3. How it differs from parsimony (from Swofford et al. 1996)
- What can we say about the placement of another taxon with state C?
4. How it differs from parsimony (from Swofford et al. 1996)
- Parsimony: the new taxon could attach in several places
5. How it differs from parsimony (from Swofford et al. 1996)
- ML: one place is favored
- State at ? most likely A
6. An outline of the ML approach
Consider one character, i
(It is useful to arbitrarily root the tree)
7. Sum across all possible histories for i
There are $4^{n-2}$ arrangements of ancestral states for n taxa (4 possible states at each of the n-2 internal nodes)
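8. For each tree we calculate the likelihood of getting the observed states, L(i)
[Figure: arbitrarily rooted four-taxon tree with tip states A, G, G, G, branch lengths t1 to t5, and state A at both internal nodes]
For the history shown (A at the root and at the other internal node):
$L(i) = P_A \times P_{A \to A}(t_1) \times P_{A \to G}(t_2) \times P_{A \to G}(t_3) \times P_{A \to A}(t_4) \times P_{A \to G}(t_5)$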
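The calculation on slides 7 and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Jukes-Cantor model (introduced later in the deck) supplies the change probabilities, the root state is equiprobable, the branch lengths are made-up values, and the way tips hang off the two internal nodes is a guess at the lost figure, chosen so that the history with A at both internal nodes reproduces the product above.

```python
import math
from itertools import product

STATES = "ACGT"

def jc_prob(i, j, t):
    """Jukes-Cantor probability of state j after time t, starting from state i.
    t is measured in expected substitutions per site (an assumption)."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if i == j else 0.25 - 0.25 * decay

def history_likelihood(root, node, tips, t):
    """Likelihood of one history, i.e. one choice of states at the two internal nodes.
    Assumed topology: the root joins tip1 (t4), tip2 (t5) and the internal node (t1);
    the internal node joins tip3 (t2) and tip4 (t3)."""
    tip1, tip2, tip3, tip4 = tips
    return (0.25  # equiprobable root state
            * jc_prob(root, node, t["t1"])
            * jc_prob(node, tip3, t["t2"])
            * jc_prob(node, tip4, t["t3"])
            * jc_prob(root, tip1, t["t4"])
            * jc_prob(root, tip2, t["t5"]))

def site_likelihood(tips, t):
    """Sum over all 4^(n-2) = 16 histories for n = 4 taxa."""
    return sum(history_likelihood(r, n, tips, t)
               for r, n in product(STATES, repeat=2))

# Hypothetical branch lengths, chosen only for illustration.
branch_lengths = {"t1": 0.1, "t2": 0.1, "t3": 0.1, "t4": 0.1, "t5": 0.1}
tips = ("A", "G", "G", "G")

one_history = history_likelihood("A", "A", tips, branch_lengths)
L_i = site_likelihood(tips, branch_lengths)
print(one_history, L_i)
```

The single history is one of the sixteen terms in the sum, so `one_history` is always smaller than `L_i`.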
9. Multiply across all sites (assume independence)
L will be very small (lnL will be a large negative number)
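A short sketch of why the log is used in practice: multiplying many small per-site likelihoods underflows ordinary floating point, while summing their logs does not. The per-site values below are invented for the demonstration.

```python
import math

def log_likelihood(site_likelihoods):
    """lnL for the whole alignment: the log of a product is the sum of the logs."""
    return sum(math.log(L) for L in site_likelihoods)

# Hypothetical data: 2000 sites, each with likelihood 0.001.
site_likelihoods = [1e-3] * 2000

raw_product = math.prod(site_likelihoods)  # underflows to 0.0 in double precision
lnL = log_likelihood(site_likelihoods)     # a large negative number, still representable

print(raw_product, lnL)
```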
10. Tree searching
- Search for the set of branch lengths that maximizes L (i.e., gives the lowest -lnL score)
- Record that score
- Search for tree topologies with the best score
Time consuming
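The first step, optimizing branch lengths for a fixed tree, can be illustrated on the smallest possible case: two sequences joined by a single branch. The sketch below assumes the Jukes-Cantor model, uses two made-up sequences, and uses a plain golden-section search as a stand-in for whatever optimizer a real program would use. The topology search is not shown.

```python
import math

def pair_neg_log_likelihood(t, n_sites, n_diff):
    """-lnL of two aligned sequences separated by branch length t under Jukes-Cantor."""
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * decay)  # root frequency times P(no net change)
    p_diff = 0.25 * (0.25 - 0.25 * decay)  # root frequency times P(one specific change)
    return -((n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff))

def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Hypothetical sequences differing at 2 of 20 sites.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGTACTTACGTACGA"
n_sites = len(seq_a)
n_diff = sum(x != y for x, y in zip(seq_a, seq_b))

t_hat = golden_section_min(
    lambda t: pair_neg_log_likelihood(t, n_sites, n_diff), 1e-6, 5.0)

# Known closed form for this special case (the Jukes-Cantor distance).
analytic = -0.75 * math.log(1.0 - 4.0 * (n_diff / n_sites) / 3.0)
print(t_hat, analytic)
```

On a real tree every branch length has to be optimized for every candidate topology, which is why the search is time consuming.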
11. Issues glossed over
- Where do we get $P_n$, the probability of state n at the arbitrary root node?
  - Equiprobable (25% each)
  - Empirical (frequency in the entire matrix)
  - Estimated (optimized by ML on each tree)
- Where do we get $P_{i \to j}(t)$, the probability of going from state i to state j in time t?
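Of the three options for the root-state probabilities, the empirical one is the easiest to show concretely. A minimal sketch, assuming the data matrix is a list of aligned DNA strings (the alignment below is invented) and that anything other than A, C, G, T is simply skipped:

```python
from collections import Counter

BASES = "ACGT"

def empirical_frequencies(alignment):
    """Base frequencies pooled over the entire data matrix.
    Characters other than A, C, G, T (gaps, ambiguity codes) are ignored."""
    counts = Counter(base for seq in alignment for base in seq if base in BASES)
    total = sum(counts.values())
    return {base: counts[base] / total for base in BASES}

# Hypothetical three-taxon alignment.
alignment = ["ACGTAC", "ACGTTC", "ACGAAC"]
freqs = empirical_frequencies(alignment)
print(freqs)
```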
12. Typical Simplifying Assumptions
- Stationarity
- Reversibility
- Site independence
- Markovian process (no memory)
13. The simplest model of molecular evolution
Jukes-Cantor
Instantaneous rate matrix (Q-matrix)
14. Calculating probabilities of change
- To convert the Q matrix into a matrix giving the probability of starting at state i and ending in state j, t time units later, use the formula $P(t) = e^{Qt}$
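The matrix exponential can be made concrete with a small pure-Python sketch. It assumes a Jukes-Cantor Q matrix in which every off-diagonal rate equals a hypothetical parameter `alpha`, and it evaluates the exponential with a truncated Taylor series. That is adequate for illustration with small entries; real programs typically use an eigendecomposition or a library routine instead.

```python
import math

def jc_rate_matrix(alpha):
    """Jukes-Cantor Q: every off-diagonal rate is alpha, and each row sums to zero."""
    return [[-3.0 * alpha if i == j else alpha for j in range(4)] for i in range(4)]

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=40):
    """Truncated Taylor series I + M + M^2/2! + ... for the matrix exponential."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds M^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[r + x for r, x in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def transition_matrix(q, t):
    """P(t) = exp(Q t)."""
    return mat_exp([[rate * t for rate in row] for row in q])

alpha, t = 1.0 / 3.0, 0.5  # hypothetical values
P = transition_matrix(jc_rate_matrix(alpha), t)
for row in P:
    print([round(x, 4) for x in row])
```

Each row of `P` sums to 1, because from any starting state the process must end in one of the four states.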
15. The simplest model of molecular evolution
Jukes-Cantor
Substitution probability matrix (P-matrix)
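The matrices shown on the original slides are not preserved here. For reference, one standard way to write them is given below, assuming every off-diagonal rate in Q equals a single parameter $\alpha$; the slides may have used a different parameterization.

```latex
Q =
\begin{pmatrix}
-3\alpha & \alpha & \alpha & \alpha \\
\alpha & -3\alpha & \alpha & \alpha \\
\alpha & \alpha & -3\alpha & \alpha \\
\alpha & \alpha & \alpha & -3\alpha
\end{pmatrix},
\qquad
P_{ij}(t) =
\begin{cases}
\tfrac{1}{4} + \tfrac{3}{4}\, e^{-4\alpha t}, & i = j \\[4pt]
\tfrac{1}{4} - \tfrac{1}{4}\, e^{-4\alpha t}, & i \neq j
\end{cases}
```

At $t = 0$ this gives the identity matrix (no change), and as $t \to \infty$ every entry tends to $1/4$, the equiprobable base frequencies.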
16. More complicated (realistic) models for DNA
- Allow deviation from equiprobable base frequencies
  - HKY85, F81, GTR
- Allow two substitution types, transitions (ti) and transversions (tv)
  - K2P, HKY85
- Allow for six substitution types
  - GTR
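How these models nest can be sketched by building each rate matrix from the same two ingredients: a set of exchangeability rates and a set of base frequencies. The construction below uses the common reversible form, rate from i to j = (exchangeability of the pair) x (frequency of j). The function names and the parameter `kappa` (transition/transversion rate ratio) are illustrative choices, and the matrices are left unnormalized.

```python
BASES = "ACGT"
TRANSITIONS = {frozenset("AG"), frozenset("CT")}  # purine-purine, pyrimidine-pyrimidine

def gtr_q(rates, freqs):
    """Reversible rate matrix: Q[i][j] = rates[{i, j}] * freqs[j] for i != j,
    with each diagonal entry set so that its row sums to zero."""
    q = {}
    for i in BASES:
        row = {j: rates[frozenset((i, j))] * freqs[j] for j in BASES if j != i}
        row[i] = -sum(row.values())
        q[i] = row
    return q

def two_type_rates(kappa):
    """Exchangeabilities with two substitution types: kappa for ti, 1 for tv."""
    return {frozenset((i, j)): (kappa if frozenset((i, j)) in TRANSITIONS else 1.0)
            for i in BASES for j in BASES if i < j}

equal = {b: 0.25 for b in BASES}
unequal = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}  # hypothetical frequencies

jc = gtr_q(two_type_rates(1.0), equal)       # one rate, equal frequencies
k2p = gtr_q(two_type_rates(2.0), equal)      # ti/tv, equal frequencies
f81 = gtr_q(two_type_rates(1.0), unequal)    # one rate, unequal frequencies
hky85 = gtr_q(two_type_rates(2.0), unequal)  # ti/tv, unequal frequencies

print(hky85["A"])
```

A full GTR matrix comes from the same `gtr_q` call with six distinct exchangeabilities instead of two.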
17. Relationship among models
18. Relationship between MP and ML
- One argument: MP is inherently nonparametric, so no direct comparison is possible
- Another argument: MP is an ML model that makes particular assumptions
19. The Goldman (1990) model (see Lewis 1998 for more)
- We force all branch lengths to be equal
- The likelihood for a character only includes the set of ancestral states that maximizes the likelihood
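The link to parsimony can be seen in a toy calculation. This is a sketch of the idea only, not Goldman's derivation: it assumes Jukes-Cantor change probabilities, a four-taxon tree with two internal nodes, an arbitrary common branch length, and tip states A, G, G, G. With equal branch lengths, the single history with the highest likelihood is also the one requiring the fewest changes.

```python
import math
from itertools import product

STATES = "ACGT"

def jc_prob(i, j, t):
    """Jukes-Cantor probability of state j after time t, starting from state i."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if i == j else 0.25 - 0.25 * decay

def histories(tips, t):
    """Yield (root, node, likelihood, changes) for every pair of internal states.
    Assumed topology: root joins tip1, tip2 and node; node joins tip3 and tip4.
    All five branches share the same length t."""
    tip1, tip2, tip3, tip4 = tips
    for root, node in product(STATES, repeat=2):
        edges = [(root, node), (node, tip3), (node, tip4), (root, tip1), (root, tip2)]
        likelihood = 0.25 * math.prod(jc_prob(a, b, t) for a, b in edges)
        changes = sum(a != b for a, b in edges)
        yield root, node, likelihood, changes

tips = ("A", "G", "G", "G")
all_histories = list(histories(tips, t=0.1))  # hypothetical branch length

summed = sum(h[2] for h in all_histories)       # standard ML: sum over histories
best = max(all_histories, key=lambda h: h[2])   # maximum over histories only
fewest_changes = min(h[3] for h in all_histories)

print(best, summed, fewest_changes)
```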
20. Why use MP
- The model is clearly less realistic, but
- We can do more thorough searches and data exploration (computational efficiency)
- Robust results will usually still be supported
21. Why use ML
- The model and its assumptions are explicit
- We can statistically compare alternative models
- We can conduct parametric statistical tests (under the assumption that we have used the correct model)
- But even the most complex model is still unrealistically simple