1
BIN6002 project report
Phylogenomics based on a collection of
mitochondrial proteins
  • By Tetsu Ishii and Tan Wang
  • June 25, 2004

2
(No Transcript)
3
Our project task
  • We were given a dataset of mt genome sequences
    from over 100 species.
  • Through phylogenetic analysis (the phylogenomic
    approach), we want to place two species, Nuclearia
    and Emiliania, in the existing phylogenetic tree.

4
Our dataset
103 mitochondrial and α-proteobacterial genomes from
different species
5
Which species and which genes were included in
our analysis?
  • The cob and cox1 genes were used for a quick
    phylogenetic analysis with distance methods;
    species with long branches were eliminated.
  • Some species were eliminated simply because they
    were not of interest to us.
  • In total, 35 species, including the outgroup, were
    selected.
  • A common set of 11 genes was selected because
    these genes are present in all remaining species.
    The 11 genes were concatenated and used in the
    phylogenetic analysis.
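
The gene selection and concatenation step described above can be sketched roughly as follows. This is a minimal illustration, not the scripts used in the project: the species names, gene names, and sequence fragments are made up, and the sequences are assumed to be already aligned per gene.

```python
def common_genes(genomes):
    """Return the sorted names of genes present in every species."""
    gene_sets = [set(genes) for genes in genomes.values()]
    return sorted(set.intersection(*gene_sets))


def concatenate(genomes, genes):
    """Join each species' aligned sequences for the given genes, in order."""
    return {
        species: "".join(seqs[gene] for gene in genes)
        for species, seqs in genomes.items()
    }


# Hypothetical toy data: three species with aligned protein fragments.
genomes = {
    "species_A": {"cob": "MTN-IR", "cox1": "MFAD", "nad1": "MLL"},
    "species_B": {"cob": "MTNLIR", "cox1": "MFSD"},
    "species_C": {"cob": "MTKLIR", "cox1": "MF-D", "atp6": "MNE"},
}

shared = common_genes(genomes)              # genes found in all species
supermatrix = concatenate(genomes, shared)  # one concatenated row per species

for species, row in supermatrix.items():
    print(species, row)
```

Sorting the shared gene names keeps the gene order identical for every species, which is what makes the concatenated rows comparable column by column.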

6
The five steps in phylogenetic analysis
  • 1. Sequence data
  • 2. Align sequences (phylogenetic signal?
    evolutionary processes?)
  • 3. Distance methods (distance calculation: which
    model?) or character-based methods
  • 4. Choose a method: ML (phyml, puzzle), MP
    (protpars), MB, or distance (Bionj). Weighting
    (sites, changes)? Model? Calculate or estimate
    the best-fit tree.
  • 5. Test phylogenetic reliability
Modified from Hillis et al. (1993). Methods in
Enzymology 224, 456-487.
7
Which methods should we choose?
  • We simply tried all 5 methods that we had access to:
  • Maximum likelihood: phyml, puzzle
  • Distance method: Bionj
  • Maximum parsimony: phylip (protpars)
  • Bayesian inference: MrBayes

8
Which model should we choose ?
  • Models are described in terms of the tendency of
    one amino acid/base to change to another,
    site-to-site rate variation, and composition.
  • Don't assume a model. Rather, find a model that
    fits your data.
  • We tried JTT both with and without a gamma
    distribution.

9
Two example models
Site rate heterogeneity was modeled with a gamma
distribution
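
To illustrate what gamma-distributed site rates mean, the sketch below draws relative rates from a gamma distribution with mean 1. The shape values 0.5 and 5.0 are arbitrary illustrations, not estimates from this dataset, and phylogeny programs typically use a few discrete rate categories rather than random draws like these.

```python
import random
import statistics


def sample_site_rates(alpha, n_sites, seed=0):
    """Draw relative site rates from a gamma distribution with mean 1.

    random.gammavariate(shape, scale) has mean shape * scale, so a scale
    of 1 / alpha fixes the mean at 1 and gives a variance of 1 / alpha.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


# Small alpha: strong rate heterogeneity; large alpha: nearly equal rates.
for alpha in (0.5, 5.0):
    rates = sample_site_rates(alpha, 20000)
    print(
        f"alpha={alpha}: mean={statistics.mean(rates):.2f}, "
        f"variance={statistics.variance(rates):.2f}"
    )
```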
10
Is the JTT model with gamma distribution better
than JTT without gamma for these data
(puzzle)?
  • Model: ln likelihood
  • JTT w/gamma: -76158.97
  • JTT w/o gamma: -82133.12 (difference: 5974.15)

$2\delta = 2\,(\ln L_{\text{w/gamma}} - \ln L_{\text{w/o gamma}})$
$P \approx 0$
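
The likelihood ratio test on this slide can be reproduced with a few lines of standard-library Python. The sketch assumes the two models differ by a single free parameter (the gamma shape), so the statistic is compared against a chi-square distribution with 1 degree of freedom; that choice of reference distribution is an assumption of this example, not something stated on the slide.

```python
import math


def lrt_statistic(lnl_full, lnl_reduced):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_full - lnl_reduced)


def chi2_sf_1df(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom.

    For 1 df this equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


lnl_gamma = -76158.97     # JTT with gamma (from the slide)
lnl_no_gamma = -82133.12  # JTT without gamma (from the slide)

stat = lrt_statistic(lnl_gamma, lnl_no_gamma)
p_value = chi2_sf_1df(stat)
print(f"2*delta = {stat:.2f}, P = {p_value:.3g}")
```

The statistic is so large that the P value underflows to zero in double precision, which is consistent with the slide's conclusion that adding the gamma distribution fits these data far better.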
11
puzzle
JTT w/gamma
ln likelihood -76158.97
12
puzzle
JTT w/o gamma
ln likelihood -82133.12
13
[Figure: phylogenetic tree from the distance method (JTT w/gamma), with support values on the nodes. Labeled groups: α-Proteobacteria, Rhodophytes, Plants/Chlorophytes, Jakobids, Haptophyceae, Cercozoa, Stramenopile, Mycetozoa, Holozoa, Fungi.]
14
[Figure: phylogenetic tree from maximum likelihood (JTT w/gamma), with support values on the nodes. Labeled groups: Cercozoa, Stramenopile, Haptophyceae, Plants/Chlorophytes, Jakobids (labeled twice), Rhodophytes, Mycetozoa, Holozoa, Fungi, α-Proteobacteria.]
15
The conclusion
  • With the phylogenomic approach, we were able to
    place Nuclearia within Holozoa and Emiliania
    within Haptophyceae.
  • We had a few difficulties analyzing the large
    sequence dataset: Perl problems, limited time,
    project organization, etc.