Unlocated Arthropod genes and ways to find them

1
Unlocated Arthropod genes and ways to find them
  • Many bug genes are hard to find
  • - Daphnia's many tandem genes were lost for a bit
  • Duplicate genes, a bane and a boon
  • Genome tile expression picks out many more

April 2008
Don Gilbert
Genome Informatics Lab, Biology Dept., Indiana University
gilbertd_at_indiana.edu
2
Environ Stresses find Novels
  • Novel Daphnia genes show under stress
  • Novel Drosophila species genes are missed by
    prediction

3
Duplicate genes are common
  • Daphnia surpasses C. elegans in its rich set of
    tandem genes.
  • Bugs have many tandem genes

4
Duplicates confuse Finders
  • Prediction errors are common in duplicate gene
    regions.
  • None of 13 predictors found all 4 tandems of this
    Dwil P450 cluster, but each gene was properly
    predicted among them.
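
One way to make "tandem" concrete is to group same-family genes that lie close together on a scaffold and then check each group against a predictor's output. The sketch below is only an illustration, not the method used for the analyses in these slides. The Gene record, the family labels, and the max_gap cutoff are assumptions made for the example.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical record layout; "family" would come from a separate
# homology-clustering step that is not shown here.
Gene = namedtuple("Gene", "gene_id scaffold start end family")


def tandem_clusters(genes, max_gap=10_000, min_size=2):
    """Return lists of same-family genes lying close together on a scaffold.

    A gene joins the current run when it starts no more than max_gap bases
    after the previous same-family gene ends. max_gap is an assumed cutoff.
    Genes of other families lying in between are ignored.
    """
    def group_key(g):
        return (g.scaffold, g.family)

    ordered = sorted(genes, key=lambda g: (g.scaffold, g.family, g.start))
    clusters = []
    for _, members in groupby(ordered, key=group_key):
        run = []
        for gene in members:
            # Close the current run when the gap to the previous gene is too large.
            if run and gene.start - run[-1].end > max_gap:
                if len(run) >= min_size:
                    clusters.append(run)
                run = []
            run.append(gene)
        if len(run) >= min_size:
            clusters.append(run)
    return clusters
```

Comparing each cluster's members with one predictor's gene list would then show which tandem copies that predictor missed.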

5
Duplicates find Errors
The prediction cline is an artifact of Dmel training.
Retraining with Dmoj removes it.
  • Duplicates solve a prediction dilemma in
    Drosophila.

6
Odorant genes concur
Curation of Drosophila Obp genes also removes the
prediction cline.
Vieira et al. (2007) and my own further analysis
recovered genes using PSI-BLAST trained on
species Obp genes. Computational errors are
significantly more common in the Far- and Mid-mel groups.
Obp genes show no overall gain/loss across groups.
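
A cline of this kind can be checked by fitting a line to per-species gene counts against distance from the training species. A slope near zero after retraining or curation suggests the trend was an artifact. The function below is a minimal sketch under that assumption. Its name and inputs are hypothetical, and it uses ordinary least squares rather than any test reported in the cited analyses.

```python
def cline_slope(distances, counts):
    """Least-squares slope of per-species gene counts against distance.

    distances: evolutionary distance of each species from the training
               species (any consistent unit; assumed to be supplied).
    counts:    number of genes predicted or recovered in each species.
    """
    n = len(distances)
    if n != len(counts) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(distances) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    if sxx == 0:
        raise ValueError("distances must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, counts))
    return sxy / sxx
```

Running it once on the original predictions and once on the retrained or curated set gives two slopes to compare.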
7
Tile expression finds genes
Daphnia tile expression with gene finding calls
26% coding bases over the genome, compared to 17%
from gene predictions, or 5,000 - 10,000 new
genes.
Manak et al. (2006), with Drosmel, also found 24%
CDS/genome, up from 18% CDS/genome from the reference
gene set. Computational tools need to mature;
gene finding is preliminary.
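
The coding share of a genome is the number of bases covered by coding intervals divided by genome length, with overlapping intervals counted once. The sketch below shows that arithmetic. It assumes 1-based inclusive (start, end) intervals already pooled across scaffolds by the caller, and the function names are made up for the example.

```python
def covered_bases(intervals):
    """Count bases covered by the union of 1-based inclusive intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Current merged block is finished; add its length.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or directly adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def coding_fraction(intervals, genome_length):
    """Fraction of the genome covered by the given coding intervals."""
    return covered_bases(intervals) / genome_length
```

Computing this once for predicted CDS and once for tile-supported CDS gives the two figures being compared. Intervals from different scaffolds would need to be handled per scaffold and summed.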
8
Summary: Locating novel genes
  1. More genes are expressed in unusual environs, and
    are specific. Use many environmental,
    developmental and tissue conditions to see range
    of genes via expression. Understand the limits
    of gene homology.
  2. Duplicate genes are common, both a problem and an aid to
    finding genes. Examine duplicate genes carefully.
    Tools that distinguish these can be used to find
    paralogs missed by traditional methods.
  3. Near species training reduces errors and spurious
    effects. Use same-species and near-species data
    as much as possible in preparing automated
    annotations. Be aware of and control for
    informant species-distance as a source of bias.
  4. Genome-wide tile expression finds more genes. As
    an alternative to EST studies, it has advantages and
    drawbacks. Computational methods need to improve
    to use these data well.

9
Genome maps on your laptop
  • Genome data sets that I use are available for
    your computer.
  • Includes GMOD GBrowse software in a ready-to-run
    bundle
  • http://eugenes.org/gmod/genomeview-package2008/
  • This is fully configured for Intel-MacOSX 10.5,
    others need further installation.
  • See http://www.gmod.org/GBrowse
  • Map data (large) are at
    ftp://eugenes.org/eugenes/gbrowse/databases/
  • daphnia_pulex: Daphnia genome data from
    wfleabase.org
  • nasonia: Wasp gene predictions, homology,
    EST
  • tribcas: Tribolium basic gene set from
    NCBI genomes
  • drospege: 12 Drosophila genomes
  • drosmel: Dros. mel rel 5.5 genome with
    Affymetrix transcriptome data

10
End note
  • Acknowledgements
  • I am grateful for support from NSF (DBI-0640462)
    and the NIH, including a TeraGrid award, for making
    this work possible.
  • Daphnia sequencing and portions of the analyses
    were provided by DOE Joint Genome Institute and
    in collaboration with the Daphnia Genomics
    Consortium (DGC).
  • References
  • Gilbert, 2007. New and old genes in Drosophila
    genomes.
    http://insects.eugenes.org/DroSpeGe/about/analysis-doc/
  • Gilbert, 2007. Daphnia gene duplicates.
    http://wfleabase.org/genome-summaries/gene-duplicates/
  • Gilbert, 2008. Tandem genes lost found.
    http://insects.eugenes.org/DroSpeGe/about/analysis-doc/
  • Manak, J.R. et al., 2006. .. unannotated
    transcription in Dros. mel. Nature Genetics,
    doi:10.1038/ng1875
  • Vieira, F.G. et al., 2007. .. analysis of the
    Odorant-Binding genes in Drosophila genomes.
    Genome Biology, doi:10.1186/gb-2007-8-11-r235