Molecular evidence for endosymbiosis - PowerPoint PPT Presentation

Slides: 25
Provided by: UWPar4
Transcript and Presenter's Notes

Title: Molecular evidence for endosymbiosis


1
Molecular evidence for endosymbiosis
  • Perform blastp to investigate sequence similarity
    among domains of life
  • Found yeast nuclear genes exhibit more sequence
    similarity (closer in evolutionary time) with
    archaeal genes
  • Found yeast mitochondrial genes exhibit more
    sequence similarity with eubacterial genes

2
t-test and significance
  • A t-test determines whether two data sets come from
    the same population or whether their means differ
    significantly
  • Calculate the mean and standard deviation of each
    data set, then derive a pooled (weighted) standard
    deviation to be used in the t-test
  • Compare the resulting t statistic to the t-critical
    value obtained from a t-table or software
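
The steps on this slide can be sketched in a few lines of Python. This is a minimal sketch that assumes the equal-variance (pooled) form of the two-sample t-test; the function name and the example numbers are hypothetical and are not taken from the presentation.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a two-sample pooled t-test."""
    n1, n2 = len(a), len(b)
    # Sample variances (n - 1 denominator) of each data set.
    v1, v2 = variance(a), variance(b)
    # Pooled variance: each variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_err, n1 + n2 - 2


# Hypothetical percent-identity values, for illustration only.
group_1 = [42.0, 45.0, 39.0, 47.0, 44.0]
group_2 = [35.0, 33.0, 38.0, 31.0, 36.0]
t_stat, dof = pooled_t_statistic(group_1, group_2)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The absolute value of t is then compared with the tabulated critical value for that number of degrees of freedom and the chosen significance level.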

3
Origins of eukaryotic cells
4
Martin-Muller hypothesis
5
Evidence from phylogenetic relationships
6
M. leprae vs. M. tuberculosis
  • The M. leprae genome (3.2 Mb) is about 50% coding,
    contrasted with 4.4 Mb and 91% coding for
    M. tuberculosis
  • Comparing genomes using MUMmer
  • http://www.tigr.org/tigr-scripts/CMR2/webmum/mumplot

7
How MUMmer works
  • Uses suffix trees to create an internal
    representation of a genome sequence
  • Identifies maximal unique matches (MUMs): version
    2.0 uses streaming, whereas 1.0 adds sequence 2 to
    the suffix tree for sequence 1
  • Alignment via Smith-Waterman
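
To make the idea of a maximal unique match concrete, here is a small brute-force sketch. It deliberately does not use a suffix tree, so it is far slower than MUMmer and is not how MUMmer is implemented; the function names and the `min_len` parameter are hypothetical.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def naive_mums(a, b, min_len=3):
    """Return (pos_in_a, pos_in_b, match) for each maximal unique match."""
    mums = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip matches that could still be extended to the left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the two sequences agree.
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            match = a[i:i + k]
            # Keep the match only if it occurs exactly once in each sequence.
            if (k >= min_len
                    and count_occurrences(a, match) == 1
                    and count_occurrences(b, match) == 1):
                mums.append((i, j, match))
    return mums


print(naive_mums("ACGTTGCA", "TTACGTAG"))
```

A suffix tree finds the same matches in time roughly linear in the sequence lengths, which is what makes whole-genome comparison practical.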

8
Origin of species
  • Mitochondrial DNA and human evolution
  • Evolution of pathogens

9
Phylogeny data mining by biologists
  • Molecular phylogenetics uses clustering
    techniques to discern relationships between
    different biological sequences

10
Why phylogenetics?
  • Understand evolutionary history
  • Map pathogen strain diversity for vaccines
  • Assist in epidemiology (Dentist and HIV)
  • Aid in prediction of function of novel genes
  • Biodiversity
  • Microbial ecology

11
Changes can occur
12
Observing differences in nucleotides
  • The simplest measure of distance between two
    sequences is to count the number of sites where
    the two sequences differ
  • If not all sites are equally likely to change,
    the same site may undergo repeated substitutions
  • As time goes by, the number of observed differences
    between two sequences becomes a less and less
    accurate estimator of the actual number of
    substitutions that have occurred
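
The simple site-counting distance described here is often reported as a proportion (the p-distance). A minimal sketch, assuming the two sequences are already aligned to equal length; the function name is hypothetical.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    return differences / len(seq1)


# Two of the eight aligned sites differ.
print(p_distance("ACGTACGT", "ACGTTCGA"))
```

Because a site that changes twice still counts as at most one difference, this value underestimates the true number of substitutions as sequences diverge.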

13
The relationship between time and substitutions
is non-linear
14
Various models have been generated to more
accurately estimate distance and evolution
  • All use the following framework

Probability matrix: $p_{AC}$ is the probability that a
site starting with an A has a C at the end of
time interval $t$, etc.
Base composition of sequence: $f_A$ is the frequency of A
15
Jukes-Cantor Model
  • Distance between any two sequences is given by
    $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$
  • $p$ is the proportion of nucleotides that are
    different in the two sequences
  • All substitutions are equally probable
  • Each position in the matrix is $\alpha$, except the
    diagonal, which is $1 - \sum\alpha = 1 - 3\alpha$
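
The Jukes-Cantor correction is a one-line calculation. A minimal sketch follows; the function name is hypothetical, and the range check reflects the fact that the logarithm is undefined once p reaches 3/4.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor distance from the observed proportion of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - (4.0 / 3.0) * p)


# The corrected distance exceeds the raw proportion, and the gap grows with p.
for p in (0.05, 0.25, 0.50):
    print(p, round(jukes_cantor_distance(p), 4))
```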

16
Kimura's two-parameter model
  • $d = \frac{1}{2}\ln\frac{1}{1 - 2P - Q} + \frac{1}{4}\ln\frac{1}{1 - 2Q}$
  • $P$ and $Q$ are the proportions of sites at which the
    two sequences differ by transitions and
    transversions, respectively
  • Accounts for transition bias in sequences
    (transversions are rarer)
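
A minimal sketch of the two-parameter calculation, assuming aligned, gap-free DNA sequences written with uppercase A, C, G, T; the function name is hypothetical.

```python
import math

# Transitions are purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def kimura_2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq1, seq2):
        if x == y:
            continue
        if (x, y) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    P = transitions / len(seq1)    # proportion of transitional differences
    Q = transversions / len(seq1)  # proportion of transversional differences
    return (0.5 * math.log(1 / (1 - 2 * P - Q))
            + 0.25 * math.log(1 / (1 - 2 * Q)))


# One transition (A->G) and one transversion (A->C) in ten sites.
print(round(kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```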

17
Evolutionary models
18
Implementing models and building trees
19
Rooted vs. unrooted
  • Root: ancestor of all taxa considered
  • Unrooted: relationships without consideration of
    ancestry
  • Often specify the root with an outgroup
  • Outgroup: a distantly related species (e.g., an
    archaeal species for a tree of mammals)

20
Tree building
  • Get protein/RNA/DNA sequences
  • Construct multiple sequence alignment
  • Compute pairwise distances (if necessary)
  • Build tree topology and distances
  • Estimate reliability
  • Visualize

21
Distance methods
  • UPGMA
  • Neighbor joining

22
Unweighted pair-group method using arithmetic
averages (UPGMA)
  • Assumes a constant rate of substitution over
    evolutionary time (a molecular clock)
  • Clustering algorithm that measures distances
    between all sequences, merges the closest pair,
    recalculates distances to that new node as averages,
    then merges the next closest pair, and so on
  • Usually gives a rooted tree
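
The merge-and-average loop can be sketched as follows. This is a minimal illustration rather than a production implementation: the function name, the tuple-keyed input dictionary, and the nested-tuple output format are all hypothetical choices, and the distances in the example are made up.

```python
from itertools import combinations


def upgma(names, distances):
    """Build a rooted tree as nested tuples (left, right, node_height).

    distances maps a pair of leaf names, e.g. ("A", "B"), to their distance.
    """
    d = {frozenset(pair): value for pair, value in distances.items()}
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    while len(sizes) > 1:
        # Find and merge the closest pair of current clusters.
        x, y = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (x, y, d[frozenset((x, y))] / 2)
        nx, ny = sizes.pop(x), sizes.pop(y)
        # Distance from the new node to every other cluster is the
        # average over all leaf pairs (size-weighted average).
        for z in sizes:
            d[frozenset((merged, z))] = (
                nx * d[frozenset((x, z))] + ny * d[frozenset((y, z))]
            ) / (nx + ny)
        sizes[merged] = nx + ny
    (tree,) = sizes
    return tree


# Hypothetical pairwise distances, for illustration only.
example = {("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
           ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10}
print(upgma(["A", "B", "C", "D"], example))
```

Each node height is half the distance between the two clusters it joins, which is where the constant-rate assumption enters: both descendants are placed equally far from their common ancestor.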

23
Testing the reliability of trees
  • Interior branch test or bootstrap analysis
  • Bootstrap analysis: resample the alignment
    (subsequences, or site deletion or replacement),
    re-draw the trees, and ask how many times the same
    branching appears. Bootstrap values of 70% (95%)
    or greater are normally considered reliable
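
The resampling idea can be sketched without building full trees. In the simplified example below, "the same branching" is stood in for by a much weaker question: in what percentage of resampled alignments is a given pair of sequences the closest pair? That substitution, the function names, and the toy alignment are all assumptions made for illustration; a real analysis rebuilds the tree for every replicate.

```python
import random
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(seq1, seq2)) / len(seq1)


def bootstrap_support(alignment, pair, replicates=1000, seed=0):
    """Percent of replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    names = list(alignment)
    length = len(alignment[names[0]])
    hits = 0
    for _ in range(replicates):
        # Draw alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(alignment[n][c] for c in cols) for n in names}
        closest = min(
            combinations(names, 2),
            key=lambda p: p_distance(resampled[p[0]], resampled[p[1]]),
        )
        if set(closest) == set(pair):
            hits += 1
    return 100.0 * hits / replicates


# Toy alignment, for illustration only.
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAA", "C": "ACGAACTTGA"}
print(bootstrap_support(toy, ("A", "B")))
```

Ties are broken by the order of the names, so the percentage for very short alignments should be read with caution.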

24
Homework due on 10/6
  • Discovery questions in Chapter 2
  • 4, 25-27