1
GreenPhylDB: A phylogenomic database for plant
comparative genomics
Matthieu CONTE mconte_at_cirad.Fr
POSTER 145
2
PLAN
  • Sequencing projects and comparative genomics
  • GreenPhylDB: A phylogenomic platform for plant
    comparative genomics
  • GOST: GreenPhyl Orthologs Search Tool

3
Sequencing Projects
Genome projects are generating vast amounts of
sequences. The objective is now to determine the
function of predicted genes. Computational
methods are needed to help annotation transfer
and functional prediction.
More than 500 genomes fully sequenced
4
Comparative genomics
Predict gene function for one species using
information available from other species
Gene with unknown function
Model species
Gene with function X
5
Homologous genes: orthologous vs. paralogous
  • Orthologous genes are homologous genes that are
    descended from the last common ancestor through
    speciation and most probably encode proteins with
    a similar function in different species

Arabidopsis gene
Rice gene A
Rice gene B
  • Paralogous genes are homologous genes that
    evolved through duplication and may encode
    proteins with more divergent functions

6
How to predict homologous genes? Similarity vs.
Homology
Similarity and homology are not the same thing,
even if homology is inferred from certain types
of similarity.
Similar: having likeness or resemblance (an
observation).
Homologous: genetically connected (a historical
fact: a common ancestor).
7
Function prediction by similarity?
Popular similarity methods: BLASTp, BBMH/RBH
ADVANTAGES
  • Easy to use
  • Fast
  • Can be run directly on full genomes
DRAWBACKS
  • How should the E-value threshold for annotation
    transfer be set?
  • False positive and false negative rates
  • Two sequences can show some similarity
    without any evolutionary relationship
  • Real orthologs sometimes have a low similarity
    score
  • Cannot identify duplication events
  • One-to-many or many-to-many relationships are
    tricky to predict (InParanoid, OrthoMCL, KOG)
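
The one-to-many drawback listed above can be illustrated with a minimal sketch of reciprocal best hits (RBH). The gene names and bit scores are made up for illustration, and the score dictionaries stand in for parsed BLASTp output; this is not the code of any tool named on the slide.

```python
from typing import Dict, List, Tuple

# Hypothetical bit scores between proteins of two genomes: (query, subject) -> score.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Return, for each query, the subject with the highest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Keep only the pairs in which each gene is the other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Toy data: one Arabidopsis gene and two rice paralogs (A and B).
ath_vs_rice = {("At1", "OsA"): 250.0, ("At1", "OsB"): 240.0}
rice_vs_ath = {("OsA", "At1"): 250.0, ("OsB", "At1"): 240.0}

rbh_pairs = reciprocal_best_hits(ath_vs_rice, rice_vs_ath)
print(rbh_pairs)
```

In this toy case only the At1/OsA pair is reported. OsB, which is just as much a candidate ortholog of At1 if the rice duplication happened after speciation, is silently dropped.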

8
Function prediction by phylogeny?
ADVANTAGES
  • Efficient for detection of duplications and
    speciations (paralogs and orthologs)
  • Efficient for detecting complete relationships
    (one-to-many, many-to-many) when complete
    families are used
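
As a rough illustration of how a gene tree separates orthologs from paralogs, the sketch below applies the species-overlap rule: a node is treated as a duplication if its two subtrees share a species, and as a speciation otherwise. This is a simple heuristic chosen for illustration, not necessarily the method used by the GreenPhyl pipeline. The tree, the "species|gene" leaf labels and the function names are assumptions.

```python
from typing import List, Tuple, Union

# A rooted binary gene tree as nested 2-tuples; leaves are "species|gene" strings.
Tree = Union[str, Tuple["Tree", "Tree"]]
Pair = Tuple[str, str]


def leaves(node: Tree) -> List[str]:
    """List all leaf labels under a node."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def species(leaf: str) -> str:
    """Extract the species prefix from a 'species|gene' label."""
    return leaf.split("|")[0]


def classify_pairs(node: Tree, orthologs: List[Pair], paralogs: List[Pair]) -> None:
    """Label every gene pair by the type of its last common ancestor node."""
    if isinstance(node, str):
        return
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    shared = {species(x) for x in left_leaves} & {species(x) for x in right_leaves}
    # Shared species on both sides: duplication node, so the pairs are paralogs.
    target = paralogs if shared else orthologs
    for a in left_leaves:
        for b in right_leaves:
            target.append(tuple(sorted((a, b))))
    classify_pairs(left, orthologs, paralogs)
    classify_pairs(right, orthologs, paralogs)


# Toy tree: an Arabidopsis gene sister to two rice genes that duplicated after speciation.
gene_tree: Tree = ("Ath|At1", ("Osa|OsA", "Osa|OsB"))
orthologs: List[Pair] = []
paralogs: List[Pair] = []
classify_pairs(gene_tree, orthologs, paralogs)
print("orthologs:", orthologs)
print("paralogs:", paralogs)
```

Here both rice genes are reported as orthologs of the Arabidopsis gene (a one-to-many relationship), and the two rice genes are reported as paralogs of each other.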

9
PLAN
  • Sequencing projects and comparative genomics
  • GreenPhylDB: A phylogenomic platform for plant
    comparative genomics
  • GOST: GreenPhyl Orthologs Search Tool
  • GreenPhylDB: A database for plant comparative
    genomics
  • (in press in Nucleic Acids Research)

10
GreenPhylDB: A phylogenomic platform for plant
comparative genomics
Developed on two model plant species
  • Oryza sativa and Arabidopsis thaliana, the model
    plants for monocotyledons and dicotyledons
  • Full genomes available
  • Quality of gene annotation (TAIR release 7, TIGR
    release 5)
  • Most of the available functional evidence
  • Fully sequenced genomes of other plants exist,
    but their annotation is still in progress.
  • In the future, GreenPhylDB will integrate other
    plant genomes

11
GreenPhyl Pipeline
50,200 rice genes (TIGR)
30,500 Arabidopsis genes (TAIR)
Phylogenomics of full plant genomes: GreenPhyl,
a methodology for genome-wide search of orthologs
in plants (submitted)
12
GreenPhyl pipeline: An optimised phylogenetic
method for full-genome analysis
Including
1. A methodology for clustering full genomes
into gene families
2. A generic and optimised phylogenetic
pipeline for ortholog inference
3. A validation method using a test set of
orthologs and paralogs
13
GreenPhylDB: Two important aspects
  • A plant gene family database
  • Most important plant gene family database: 6,400
    families manually annotated
  • A database of pre-computed phylogenomic analyses
  • Total number of gene families analyzed: 4,400

14
Example of GreenPhylDB family entry
15
Family database statistics
Total number of clusters: 21,038
  • 64 TF families validated using DRTF/DATF
    databases
  • 492 validated using TAIR families
  • 1,903 validated using InterPro families
  • 984 validated using KEGG families
  • 702 rice specific
  • 117 Arabidopsis specific
Manually annotated: 6,400 (in progress)
16
Example of GreenPhylDB sequence entry
Present phylogenomic predictions ranked by
confidence
17
Phylogenomic database statistics
Total number of families analysed: 4,425
51,000 ortholog relationships with a score above 50
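
The score cut-off and the ranking by confidence can be sketched as follows. The record layout, the gene identifiers and the 0-100 score scale are assumptions made for illustration; the slides only state that predictions are ranked by confidence and that relationships with a score above 50 are counted.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    query: str
    ortholog: str
    score: float  # assumed confidence score on a 0-100 scale


def confident_orthologs(predictions: List[Prediction], min_score: float = 50.0) -> List[Prediction]:
    """Keep predictions scoring strictly above the cut-off, best first."""
    kept = [p for p in predictions if p.score > min_score]
    return sorted(kept, key=lambda p: p.score, reverse=True)


# Made-up predictions for a single query gene.
predictions = [
    Prediction("At1", "OsA", 72.0),
    Prediction("At1", "OsB", 95.0),
    Prediction("At1", "OsC", 50.0),
    Prediction("At1", "OsD", 31.0),
]

ranked = confident_orthologs(predictions)
for p in ranked:
    print(p.query, p.ortholog, p.score)
```

With a strict "above 50" cut-off, the prediction scoring exactly 50 is excluded along with the lower-scoring one.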
18
  • For more information on GreenPhylDB
  • Go to the Help page
  • http://greenphyl.cirad.fr

Rice and Arabidopsis: OK. But what if I'm working
on maize or banana?
19
PLAN
  • Sequencing projects and comparative genomics
  • GreenPhylDB: A phylogenomic platform for plant
    comparative genomics
  • GOST: GreenPhyl Orthologs Search Tool

20
GOST (GreenPhyl Orthologs Search Tool)
Objectives
  • Identify orthologous and paralogous genes for
    any plant gene using phylogenetic methods
  • Work on a larger set of plant gene families
    (GreenPhylDB)
  • Develop a tool as fast as a similarity search
    (BLASTp) by using pre-computed phylogenies from
    GreenPhylDB data sources

21
GOST (GreenPhyl Orthologs Search Tool): A
phylogenomic tool for plant comparative genomics
Two different use cases
22
Sequence submission
  • Requirements
  • Protein sequence

Note: Optimal performance with a COMPLETE sequence
23
Family identification and species selection
  • Requirement
  • Species must be indicated

24
Phylogenomic predictions for the query
25
Phylogenomic prediction for the query
26
Web accessibility
  • GOST is accessible via the GreenPhylDB website
    (http://greenphyl.cirad.fr)
  • Web services for automated genome annotation
    workflows on the GCP platform (e.g.
    http://dayhoff.generationcp.org)

27
THANKS
I am now seeking a postdoctoral position in
plant functional genomics. My CV is
at http://greenphyl.cirad.fr/mconte.html