Comput. Genomics, Lecture 5b: Character Based Methods for Reconstructing Phylogenetic Trees: Maximum Parsimony



Slides: 22

Provided by: Shlo6

1
Comput. Genomics, Lecture 5b: Character Based
Methods for Reconstructing Phylogenetic
Trees: Maximum Parsimony
Based on presentations by Dan Geiger, Shlomo
Moran, and Ido Wexler. Modified by Benny
Chor. References: Durbin et al. 7.4, Gusfield
17.1-17.3, Setubal & Meidanis 6.1
2
Phylogenetic Trees - Reminder

Leaves represent objects (genes, species) being
compared
Internal nodes are hypothetical ancestral
objects
In a rooted tree, path from root to a node
corresponds to a path in evolutionary time
An unrooted tree specifies relationships among
objects, but not evolutionary time

3
Parsimony Based Approach

Input: Character data (aligned sequences)
Goal/Output: A labeled tree (labeled internal
nodes) that explains the data with a
minimal number of changes across edges

4
Parsimony: An Example

Various trees could explain the phylogeny
of the following four sequences:
AAG, AAA, GGA, AGA. For example,

Parsimony prefers the second tree to the first,
because it requires fewer substitution events
(three vs. four changes).
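
The three-versus-four count can be checked with a short Python sketch. The two topologies below are hypothetical stand-ins chosen to reproduce those totals, not necessarily the trees drawn on the slide, and the taxon names s1..s4 are made up for illustration. The counting rule is the set-based one attributed to Fitch later in the lecture, applied to one site at a time.

```python
def fitch_site(tree, states):
    """Return (candidate state set, number of changes) for one site.

    `tree` is a nested tuple of taxon names; `states` maps taxon -> character.
    """
    if isinstance(tree, str):                 # leaf
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:                                # children agree: no extra change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the per-site change counts over all aligned positions."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in seqs.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical taxon names for the four sequences on the slide.
seqs = {"s1": "AAG", "s2": "AAA", "s3": "GGA", "s4": "AGA"}
tree_a = (("s1", "s3"), ("s2", "s4"))   # assumed first topology
tree_b = (("s1", "s2"), ("s3", "s4"))   # assumed second topology

print(parsimony_score(tree_a, seqs), parsimony_score(tree_b, seqs))
```

Under these assumptions the first topology needs four changes and the second needs three, so parsimony prefers the second.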

5
Big and Small Parsimony

Usually the approaches to finding a maximum
parsimony tree have two separate components:
A search through the space of trees (BIG
parsimony)
Given a specific tree topology, find an
assignment of ancestral labels to internal
nodes so as to minimize the total number of
changes across tree edges (small parsimony)

6
Formally: Big Parsimony

Input: Character data (aligned sequences)
Goal/Output: A labeled tree (labeled internal
nodes) that minimizes the number of changes
across edges (over all trees and internal
labelings).

7
Formally: Small Parsimony

Input: Character data (aligned sequences)
and a tree with sequences at leaves.
Goal/Output: A labeling of internal nodes that
minimizes the number of changes across edges
(over all internal labelings).

8
Big, Small, and Weighted Parsimony

Small parsimony has a linear time solution
(Fitch's algorithm); linear in the number of
tree nodes, per site, for a fixed alphabet.
BIG parsimony is NP-hard
(easy reduction from vertex cover, VC).
Weighted small parsimony also has a linear
time solution
(Sankoff's algorithm, dynamic programming).

9
Small Parsimony: Fitch's Algorithm

Traverse the tree up, from leaves to root,
finding sets of possible ancestral states
(labels) for each internal node.
Traverse the tree down, from root to leaves,
determining ancestral states (labels) for
internal nodes.
Key observation: Different sites are
independent, so we can solve one site at a time.

10
Fitch's Algorithm: Step 1

Do a post-order (from leaves to root)
traversal of the tree.
Find the possible states R_i of internal node i
with children j and k:
R_i = R_j ∩ R_k if R_j ∩ R_k ≠ ∅,
otherwise R_i = R_j ∪ R_k.

11
Fitch's Algorithm: Step 1

# of changes = # of union operations

[Figure: example tree with leaf states C, T, G, T, A, T and internal-node state sets T, T, AGT, CT, GT]
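
The bottom-up pass can be sketched in Python as follows. The leaf states and internal sets are the ones that survive from the slide's figure, but the tree shape is not recoverable from this transcript, so the topology below is an assumption: it is one shape that yields exactly those sets. The function name `fitch_up` is illustrative only.

```python
def fitch_up(node, sets):
    """Post-order pass of Fitch's algorithm for a single site.

    `node` is a leaf state (a one-character string) or a pair of subtrees.
    Appends each internal node's state set to `sets` in post-order and
    returns (state set, number of union operations so far).
    """
    if isinstance(node, str):
        return {node}, 0
    r_j, cost_j = fitch_up(node[0], sets)
    r_k, cost_k = fitch_up(node[1], sets)
    if r_j & r_k:                       # non-empty intersection: no change
        r_i, cost = r_j & r_k, cost_j + cost_k
    else:                               # union: one change is charged
        r_i, cost = r_j | r_k, cost_j + cost_k + 1
    sets.append("".join(sorted(r_i)))
    return r_i, cost


# Assumed topology consistent with the state sets listed on the slide.
tree = (("C", "T"), ((("G", "T"), "A"), "T"))
internal_sets = []
root_set, changes = fitch_up(tree, internal_sets)
print(internal_sets)   # ['CT', 'GT', 'AGT', 'T', 'T']
print(changes)         # 3
```

Three union operations occur (CT, GT, AGT), which matches the three changes reported for the tree.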
12
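Fitch's Algorithm: Step 2

Do a pre-order (from root to leaves) traversal
of the tree.
Select state r_j of internal node j with parent
i: r_j = r_i if r_i ∈ R_j, otherwise r_j is an
arbitrary state in R_j. (For the root, pick
any state in its set.)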
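
A minimal two-pass sketch in Python, under the same assumed example tree as before (the topology is a guess that fits the slide's state sets). Where the rule allows an arbitrary choice, the sketch picks the alphabetically smallest state; that tie-breaking is an assumption made only to keep the output deterministic.

```python
def fitch_up(node):
    """Bottom-up pass: return (state set, annotated children or None)."""
    if isinstance(node, str):
        return ({node}, None)
    left, right = fitch_up(node[0]), fitch_up(node[1])
    common = left[0] & right[0]
    return (common or (left[0] | right[0]), (left, right))


def fitch_down(annotated, parent_state=None):
    """Top-down pass: keep the parent's state when it is allowed here."""
    states, children = annotated
    state = parent_state if parent_state in states else min(states)
    if children is None:
        return state                      # leaves are returned as plain strings
    return (state,
            fitch_down(children[0], state),
            fitch_down(children[1], state))


def count_changes(labeled):
    """Count edges whose two endpoints carry different states."""
    if isinstance(labeled, str):
        return 0
    state, left, right = labeled
    total = 0
    for child in (left, right):
        child_state = child if isinstance(child, str) else child[0]
        total += (child_state != state) + count_changes(child)
    return total


tree = (("C", "T"), ((("G", "T"), "A"), "T"))   # assumed topology
labeled = fitch_down(fitch_up(tree))
print(labeled)
print(count_changes(labeled))
```

Every internal node ends up labeled T here, and the three changes fall on the edges leading to the leaves C, G, and A.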

13
Fitch's Algorithm: Step 2
14
Weighted Version

Instead of assuming all state changes are unit
cost (i.e., equally likely), use different costs
S(a,b) for different changes.
The 1st step of the algorithm is to propagate
costs up through the tree.

15
Weighted Version of Fitch's Algorithm

Want to determine the min. cost R_i(a)
of assigning character a to node i.
For leaves:
R_i(a) = 0 if a is the character at leaf i,
and R_i(a) = ∞ otherwise.

16
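Weighted Version of Fitch's Algorithm

Want to determine the min. cost R_i(a)
of assigning character a to node i.
For internal nodes (node i with children j and k):
R_i(a) = min_b [ R_j(b) + S(a,b) ] + min_b [ R_k(b) + S(a,b) ]

[Figure: node i labeled a, with children j and k labeled b]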
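
The upward cost propagation can be sketched in Python as below. The transition/transversion cost function is a hypothetical choice of S(a,b) for illustration; the lecture does not fix specific costs. The unit-cost function is included to show that the weighted recurrence then reduces to plain change counting.

```python
import math

ALPHABET = "ACGT"
PURINES = {"A", "G"}


def unit_cost(a, b):
    """Every substitution costs 1 (the unweighted case)."""
    return 0 if a == b else 1


def ts_tv_cost(a, b):
    """Hypothetical S(a,b): transitions cost 1, transversions cost 2."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


def sankoff_up(node, cost):
    """Return the table R_i(a) for the subtree rooted at `node`."""
    if isinstance(node, str):                       # leaf
        return {a: (0 if a == node else math.inf) for a in ALPHABET}
    r_j = sankoff_up(node[0], cost)
    r_k = sankoff_up(node[1], cost)
    return {
        a: min(r_j[b] + cost(a, b) for b in ALPHABET)
           + min(r_k[b] + cost(a, b) for b in ALPHABET)
        for a in ALPHABET
    }


tree = (("A", "G"), ("C", "T"))
weighted = sankoff_up(tree, ts_tv_cost)
print(min(weighted.values()))    # 4: two transitions plus one transversion

unweighted = sankoff_up(tree, unit_cost)
print(min(unweighted.values()))  # 3: four distinct leaf states need three changes
```

Each node does a constant amount of work for a fixed alphabet (here 4 x 4 comparisons per child), which is why the running time stays linear in the number of nodes.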
17
Weighted Version of Fitch's Algorithm: Step 2

Do a pre-order (from root to leaves) traversal
of the tree.
Select the minimal cost character for the root.
For each internal node j, select the character
that produced the minimal cost at parent i.
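
Both passes of the weighted algorithm can be put together in a short Python sketch. As before, the cost function is a hypothetical S(a,b) (transitions 1, transversions 2), and ties are broken by alphabet order, which is an arbitrary choice made for determinism.

```python
import math

ALPHABET = "ACGT"
PURINES = {"A", "G"}


def cost(a, b):
    """Hypothetical S(a,b): transitions cost 1, transversions cost 2."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


def sankoff_up(node):
    """Bottom-up pass: return (cost table R_i, annotated children or None)."""
    if isinstance(node, str):
        return ({a: (0 if a == node else math.inf) for a in ALPHABET}, None)
    left, right = sankoff_up(node[0]), sankoff_up(node[1])
    table = {
        a: sum(min(child[0][b] + cost(a, b) for b in ALPHABET)
               for child in (left, right))
        for a in ALPHABET
    }
    return (table, (left, right))


def sankoff_down(annotated, parent=None):
    """Top-down pass: pick the character that achieved the parent's minimum."""
    table, children = annotated
    if parent is None:                               # root: cheapest character
        state = min(ALPHABET, key=lambda a: table[a])
    else:                                            # child: minimizer of R(b) + S(parent, b)
        state = min(ALPHABET, key=lambda b: table[b] + cost(parent, b))
    if children is None:
        return state
    return (state,
            sankoff_down(children[0], state),
            sankoff_down(children[1], state))


tree = (("A", "G"), ("A", "C"))
annotated = sankoff_up(tree)
print(min(annotated[0].values()))   # 3
print(sankoff_down(annotated))      # ('A', ('A', 'A', 'G'), ('A', 'A', 'C'))
```

The labeling places A at every internal node, paying 1 for the A to G transition and 2 for the A to C transversion.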

18
Big Parsimony: Exploring the Space of Trees

We've considered small parsimony: how to find
the minimum number of changes for a given tree
topology.
To solve big parsimony, we need some search
procedure for exploring the space of tree
topologies.
There are (2n-5)!! = 1·3·5···(2n-5)
unrooted binary trees on n leaves, and
(2n-3)!! rooted ones.
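
For very small inputs, the simplest search procedure is exhaustive enumeration. The Python sketch below is an illustration, not a method from the lecture: it builds every rooted binary tree by attaching one leaf at a time to every possible edge, scores each with the unweighted set-based count, and keeps the best. The taxon names and the reuse of the four sequences from the earlier example are assumptions.

```python
def insert_everywhere(tree, leaf):
    """Yield each tree obtained by attaching `leaf` to one edge of `tree`
    (including the edge above its root)."""
    yield (tree, leaf)
    if not isinstance(tree, str):
        left, right = tree
        for new_left in insert_everywhere(left, leaf):
            yield (new_left, right)
        for new_right in insert_everywhere(right, leaf):
            yield (left, new_right)


def all_rooted_trees(taxa):
    """Enumerate all rooted binary trees on the given taxon names."""
    trees = [taxa[0]]
    for leaf in taxa[1:]:
        trees = [t for tree in trees for t in insert_everywhere(tree, leaf)]
    return trees


def fitch_changes(tree, states):
    """Single-site Fitch count: return (state set, number of changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    (ls, lc), (rs, rc) = fitch_changes(tree[0], states), fitch_changes(tree[1], states)
    if ls & rs:
        return ls & rs, lc + rc
    return ls | rs, lc + rc + 1


def score(tree, seqs):
    """Total number of changes over all aligned sites."""
    n_sites = len(next(iter(seqs.values())))
    return sum(fitch_changes(tree, {t: s[i] for t, s in seqs.items()})[1]
               for i in range(n_sites))


seqs = {"s1": "AAG", "s2": "AAA", "s3": "GGA", "s4": "AGA"}   # hypothetical names
trees = all_rooted_trees(list(seqs))
best = min(trees, key=lambda t: score(t, seqs))
print(len(trees), score(best, seqs))   # 15 3
```

The enumeration grows as a double factorial in the number of taxa, so this approach stops being practical after roughly ten leaves.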

19
Exploring the Space of Trees

# taxa (n)    # rooted binary trees
4             15
5             105
6             945
8             135,135
10            34,459,425
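
These counts can be recomputed with a few lines of Python. The entries 15, 105, 945, and 135,135 equal (2n-3)!!, the number of rooted binary trees on n leaves; the number of unrooted binary trees on n leaves is (2n-5)!!. The function names below are illustrative.

```python
def odd_double_factorial(m):
    """Return 1 * 3 * 5 * ... * m for odd m (and 1 when m <= 1)."""
    result = 1
    for k in range(3, m + 1, 2):
        result *= k
    return result


def num_rooted_trees(n):
    """Number of rooted binary trees on n >= 2 labeled leaves: (2n-3)!!"""
    return odd_double_factorial(2 * n - 3)


def num_unrooted_trees(n):
    """Number of unrooted binary trees on n >= 3 labeled leaves: (2n-5)!!"""
    return odd_double_factorial(2 * n - 5)


for n in (4, 5, 6, 8, 10):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

For n = 10 this gives 2,027,025 unrooted and 34,459,425 rooted trees.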
20
Does This Imply Big MP is Hard?

# taxa (n)    # rooted binary trees
4             15
5             105
6             945
8             135,135
10            34,459,425

Not necessarily. There could be some smarter way
to zoom directly to the best topology. But we
will show hardness of Big MP by a (simple)
reduction from vertex cover (VC).
21
Big MP is NP-Hard!

First, define VC and VC for triangle-free
graphs. Then:

You will show a poly-time reduction from VC to
VC for triangle-free graphs as part of a home
assignment (easy).
In class, I will show a poly-time reduction from
VC for triangle-free graphs to Big MP
(old style, white board proof).
This establishes NP-hardness of Big MP.
