Molecular phylogenetics 2 - PowerPoint PPT Presentation

Slides: 15
Provided by: jimpr8
1
Molecular phylogenetics 2
  • Level 3 Molecular Evolution and Bioinformatics
  • Jim Provan

Page and Holmes Sections 6.2-4
2
Distance methods: goodness-of-fit measures
  • Distance methods are based on the idea that if we
    knew the evolutionary distances between all
    members of a set of sequences, we could
    reconstruct the evolutionary history of these
    sequences
  • In practice, distances are almost never exactly
    tree metrics; goodness-of-fit methods seek the
    metric tree that best accounts for the observed
    distances
  • If α = 1 then the criterion is Farris's f statistic
  • If α = 2 then F is the least-squares-fit criterion
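
The two criteria can be sketched in code. The formula on the slide did not survive in this transcript, so the sketch assumes the usual general form F = sum over pairs of w_ij * |d_ij - p_ij|^α, where d_ij is the observed distance, p_ij the path-length distance on the tree and w_ij a weight. The function name and the toy distances are made up for illustration.

```python
def goodness_of_fit(observed, tree, alpha, weights=None):
    """Sum of w * |observed - tree| ** alpha over all sequence pairs.

    alpha = 1 gives a Farris-style f statistic, alpha = 2 a least-squares fit
    (assuming the general form described in the lead-in).
    """
    total = 0.0
    for pair, d in observed.items():
        w = 1.0 if weights is None else weights[pair]
        total += w * abs(d - tree[pair]) ** alpha
    return total


# Hypothetical observed distances and tree (path-length) distances.
observed = {("a", "b"): 3.0, ("a", "c"): 7.0, ("b", "c"): 6.5}
tree = {("a", "b"): 3.0, ("a", "c"): 7.5, ("b", "c"): 6.0}

f_abs = goodness_of_fit(observed, tree, alpha=1)  # 1.0
f_sq = goodness_of_fit(observed, tree, alpha=2)   # 0.5
print(f_abs, f_sq)
```

A smaller value means the tree distances fit the observed distances more closely; with α = 2 large deviations are penalised more heavily than small ones.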

3
Goodness-of-fit measures
4
Minimum evolution
  • Given an unrooted metric tree for n sequences,
    there are (2n - 3) branches, each with length e_i
  • The sum of these branch lengths is the length L
    of the tree
  • The minimum evolution (ME) tree is the tree which
    minimises L
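
As a minimal sketch of the criterion, the snippet below sums branch lengths for a few candidate trees and picks the shortest. The candidate topologies and branch lengths are hypothetical; in practice the branch lengths would first have to be estimated from the distance data, which is not shown here.

```python
def expected_branches(n):
    """Number of branches in an unrooted, fully resolved tree for n sequences."""
    return 2 * n - 3


def tree_length(branch_lengths):
    """Length L of a tree: the sum of its branch lengths e_i."""
    return sum(branch_lengths)


# Hypothetical branch lengths for the three unrooted trees of four sequences.
candidates = {
    "((a,b),(c,d))": [1.0, 2.0, 0.5, 1.5, 2.0],
    "((a,c),(b,d))": [1.5, 2.0, 1.0, 1.5, 2.5],
    "((a,d),(b,c))": [2.0, 2.0, 0.5, 2.0, 2.5],
}

# Every candidate should have 2n - 3 = 5 branches for n = 4.
assert all(len(e) == expected_branches(4) for e in candidates.values())

# The minimum evolution tree is the candidate with the smallest L.
best = min(candidates, key=lambda name: tree_length(candidates[name]))
print(best, tree_length(candidates[best]))
```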

5
Minimum evolution
6
Algorithms for finding distance trees
  • Neighbourliness
  • Given four sequences, it is possible to label
    them a-d such that
  • d(a,b) + d(c,d) ≤ d(a,c) + d(b,d) = d(a,d) +
    d(b,c)
  • Given data for n sequences that are perfectly
    additive, the above condition can be used to
    identify additive subtrees for the data and from
    them construct the complete phylogeny
  • For data that are not perfectly additive, the
    optimum tree is the one with the most quartets for
    which the above condition holds
  • Neighbour joining (NJ): a clustering method that
    approximates the ME tree
  • Unweighted pair-group method with arithmetic
    averages (UPGMA): a clustering method that
    produces an ultrametric tree
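
The neighbourliness condition for a single quartet can be checked as follows. This is a sketch only: the function names are invented here, and the toy distances were generated from a hypothetical tree ((a,b),(c,d)) with terminal branches 1, 2, 3, 4 and an internal branch of 5, so they are perfectly additive.

```python
def pair_sums(dist, w, x, y, z):
    """The three ways of pairing four sequences, with their summed distances."""
    def d(p, q):
        return dist[frozenset((p, q))]
    return {
        ((w, x), (y, z)): d(w, x) + d(y, z),
        ((w, y), (x, z)): d(w, y) + d(x, z),
        ((w, z), (x, y)): d(w, z) + d(x, y),
    }


def is_additive_quartet(dist, w, x, y, z, tol=1e-9):
    """True if the two largest sums are equal (the four-point condition)."""
    sums = sorted(pair_sums(dist, w, x, y, z).values())
    return abs(sums[2] - sums[1]) <= tol


def neighbour_pairs(dist, w, x, y, z):
    """The pairing with the smallest sum; its members are neighbours."""
    sums = pair_sums(dist, w, x, y, z)
    return min(sums, key=sums.get)


# Hypothetical, perfectly additive distances.
dist = {
    frozenset("ab"): 3.0, frozenset("cd"): 7.0,
    frozenset("ac"): 9.0, frozenset("bd"): 11.0,
    frozenset("ad"): 10.0, frozenset("bc"): 10.0,
}

print(is_additive_quartet(dist, "a", "b", "c", "d"))  # True
print(neighbour_pairs(dist, "a", "b", "c", "d"))      # (('a', 'b'), ('c', 'd'))
```

For real data the equality rarely holds exactly, which is why the slide counts the quartets for which the condition holds rather than requiring all of them.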

7
Objections to distance methods
  • Distances lose information
  • If the data come from, say, hybridisation studies
    then there is no option but to use distances
  • Converting sequence data into distances means
    that the evolution of individual sites cannot be
    traced
  • Uninterpretable branch lengths
  • The length of the ME tree earlier was 331.5
    substitutions!
  • It is not possible to have 0.5 of a substitution
  • If we deal with expected change (averages etc.)
    then 0.5 is entirely feasible
  • The number of substitutions may not be
    biologically possible: tree lengths can be less
    than the total number of observed mutations

8
Maximum parsimony
  • Goal of maximum parsimony is to reconstruct the
    evolution of sequences whilst invoking the fewest
    changes
  1. ATATT
  2. ATCGT
  3. GCAGT
  4. GCCGT

9
Maximum parsimony
[Figure: two alternative reconstructions of site 3 (2 steps)]
10
Maximum parsimony
  • The total number of evolutionary changes is the
    length of the tree, i.e. the sum of the number of
    changes at each site
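
The tree length can be computed for the four sequences from slide 8. The sketch below uses Fitch's algorithm to count the minimum number of changes at each site; the choice of algorithm and the nested-tuple tree representation are assumptions made here, not something the slides specify.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    `tree` is a leaf name or a pair of subtrees; `states` maps leaf -> base.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_site(tree[0], states)
    right_set, right_steps = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state: one extra change is needed at this node.
    return left_set | right_set, left_steps + right_steps + 1


def site_steps(tree, alignment):
    """Minimum number of changes at each site of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return [
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    ]


def tree_length(tree, alignment):
    """Length of the tree: the sum of the changes over all sites."""
    return sum(site_steps(tree, alignment))


alignment = {"1": "ATATT", "2": "ATCGT", "3": "GCAGT", "4": "GCCGT"}
trees = [
    (("1", "2"), ("3", "4")),
    (("1", "3"), ("2", "4")),
    (("1", "4"), ("2", "3")),
]

for t in trees:
    print(t, site_steps(t, alignment), tree_length(t, alignment))
# ((1,2),(3,4)) needs 5 changes, ((1,3),(2,4)) needs 6, ((1,4),(2,3)) needs 7
```

On the tree ((1,2),(3,4)) site 3 requires 2 steps, and this tree has the smallest total length of the three, so it is the most parsimonious.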

11
Generalised parsimony
  • In the previous example, each substitution was
    weighted equally

12
Weighted parsimony
  • Not all sites are likely to be equally
    phylogenetically useful
  • Sites that evolve very quickly will rapidly
    become saturated
  • Such sites may become misleading
  • The relative value of different sites can be
    reflected by weighting; the length of the tree
    then becomes the weighted sum of the number of
    changes at each site
  • The greater the phylogenetic value of a
    character, the greater the weight (w) assigned to
    it
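
A minimal sketch of the weighted length, assuming it is the sum over sites of weight times number of changes. The per-site change counts are illustrative and the weights are hypothetical; how weights should be chosen is not covered here.

```python
def weighted_length(steps, weights):
    """Weighted tree length: sum of w_i * (changes at site i)."""
    if len(steps) != len(weights):
        raise ValueError("need exactly one weight per site")
    return sum(w * s for w, s in zip(weights, steps))


# Illustrative change counts for five sites on some tree.
steps = [1, 1, 2, 1, 0]

equal = weighted_length(steps, [1, 1, 1, 1, 1])          # 5
down_site3 = weighted_length(steps, [1, 1, 0.5, 1, 1])   # 4.0
print(equal, down_site3)
```

With equal weights this reduces to the unweighted length; down-weighting a fast-evolving site reduces its influence on which tree is shortest.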

13
Justification for parsimony
  • Advantages of parsimony
  • Straightforward to understand
  • Apparently makes few assumptions about
    evolutionary processes
  • Has been extensively studied mathematically
  • Powerful software implementations available
  • Justification for choosing the most parsimonious
    tree is more controversial
  • Any character that does not fit a given tree
    requires the postulation of homoplasy; the most
    parsimonious tree minimises the number of ad hoc
    hypotheses required and is thus preferred
  • If evolutionary change is rare, the tree that
    minimises change best reflects the evolutionary
    process

14
Objections to parsimony
  • Principal objection to parsimony is that under
    some models of evolution it is not consistent
    (i.e. even if we add more data we will still
    obtain the wrong tree)
  • Classic scenario is long branch attraction
  • Two unrelated sequences, each separated from its
    ancestor by a long edge
  • In order for parsimony to recover the correct
    tree ((A,B),(C,D)) there must be a majority of
    sites supporting the split {A,B} | {C,D}
  • Problem of chance homoplasy
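
The role of site support can be illustrated by counting, for four aligned sequences, how many sites favour each of the three possible splits. The alignment below is invented so that chance matches between A and C outnumber the sites supporting the correct split; it is a toy illustration, not data from the lecture.

```python
def split_support(seq_a, seq_b, seq_c, seq_d):
    """Count the sites that support each split of four sequences.

    A site supports a split only if it shows exactly two states, each shared
    by two sequences; all other sites are uninformative for four sequences.
    """
    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for a, b, c, d in zip(seq_a, seq_b, seq_c, seq_d):
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    return counts


# Hypothetical alignment; the true tree is taken to be ((A,B),(C,D)).
seq_a = "ATGCTACA"
seq_b = "ATATGAGC"
seq_c = "GCGCTAGC"
seq_d = "GCATGAGA"

support = split_support(seq_a, seq_b, seq_c, seq_d)
print(support)                        # {'AB|CD': 2, 'AC|BD': 3, 'AD|BC': 1}
print(max(support, key=support.get))  # AC|BD
```

Here parsimony would group A with C because the homoplasious sites outnumber the sites supporting the true split, which is the situation described for long branch attraction.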