Coding Domain Sequence Prediction and Alternative Splicing Detection in Human Malaria Gambiae

Description:

ABC transporter. Malonyl CoA Carrier. ATP-dependent RNA helicase. ATI, ExonS. Amino acid binding, metabolic process. peptidase M1, protein metabolism.

Slides: 7
Provided by: Jun107
Learn more at: https://www.iscb.org

Transcript and Presenter's Notes


1
Coding Domain Sequence Prediction and Alternative
Splicing Detection in Human Malaria Gambiae
  • Jun Li (1), Bing-Bing Wang (2), Jose M. Ribeiro (3),
    Kenneth D. Vernick (1,4)
  • (1) Dept. of Microbiology, University of Minnesota,
    St. Paul, MN. (2) Pioneer Hi-Bred International,
    Johnston, IA. (3) LMVR/NIAID, NIH, MD. (4) UGGIV,
    Institut Pasteur, Paris, France

2
Introduction
  • Nearly 2/3 of the world's population is at risk
    for malaria
  • 1.5 to 2.5 million children die annually
  • A. gambiae is the major malaria vector
  • Genome-wide research needs good CDS structure
    prediction and alternative splicing (AS) information.
  • The currently used A. gambiae CDS structures were
    predicted with comparative algorithms that
    are too conservative, so many genes are missing.
  • Comparative gene prediction algorithms also have
    problems predicting terminal exons; as a result,
    >40% of the CDSs predicted this way lack start
    and/or stop codons.
  • The purpose of this work is to create an A.
    gambiae-specific gene model, complete the incomplete
    CDSs, and provide AS information.

3
Combinational Gene Prediction Algorithm
  • Gold gene set to train GlimmerHMM
  • Open-Reading-Frame-Selection Algorithm
  • Exon-Gene-Union Algorithm

Here x is a base-pair position, A is the ab initio
predicted CDS, P is the comparative predicted
CDS, and C is the combinational CDS.
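
The formula that defined C from A and P is not preserved in this transcript. Purely as an illustration, and not as the authors' actual rule, the Python sketch below assumes the simplest reading of "union": a base pair x belongs to C if it belongs to A or to P. The function names and the half-open (start, end) coordinate convention are hypothetical, and the sketch ignores reading frame and the ORF selection step.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union_cds(ab_initio, comparative):
    """Assumed rule: x is in C if x is in A or x is in P."""
    return merge_intervals(list(ab_initio) + list(comparative))


# Hypothetical exon coordinates for one locus.
A = [(100, 200), (300, 400)]   # ab initio prediction
P = [(150, 250), (500, 600)]   # comparative prediction
C = union_cds(A, P)
print(C)
```

A plain union can break the reading frame, which is presumably one reason the slide lists a separate ORF selection step.
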
4
The combinational algorithm improves on single-algorithm
prediction
                          Sensitivity (%)  Specificity (%)  Complete rate (%)
GlimmerHMM                95               90               100
Ensembl                   92               99               60
Combinational algorithm   96               99               95
Comparison of CDS structure from the combinational
algorithm and Ensembl.
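
The slide does not say at which level (nucleotide, exon, or gene) sensitivity and specificity were measured. As a minimal sketch, assuming nucleotide-level scoring and the gene-prediction convention in which specificity is TP / (TP + FP), the two numbers can be computed as follows. The function names and coordinates are hypothetical.

```python
def coding_positions(exons):
    """Expand half-open (start, end) exons into a set of base positions."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end))
    return positions


def nucleotide_accuracy(predicted, reference):
    """Return (sensitivity, specificity) at the nucleotide level.

    sensitivity = TP / (TP + FN): fraction of reference coding bases found
    specificity = TP / (TP + FP): fraction of predicted coding bases correct
    """
    pred = coding_positions(predicted)
    ref = coding_positions(reference)
    tp = len(pred & ref)
    sensitivity = tp / len(ref) if ref else 0.0
    specificity = tp / len(pred) if pred else 0.0
    return sensitivity, specificity


# Hypothetical example: the second predicted exon is shifted by 10 bases.
reference = [(0, 100), (200, 300)]
predicted = [(0, 100), (210, 310)]
sn, sp = nucleotide_accuracy(predicted, reference)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```
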
5
Alternative splicing detection in A. gambiae
AS distribution in A. gambiae
EST-aided AS detection algorithm
  • Align ESTs to the genome, process the alignments,
    and extract exon/intron information
  • Upload to a MySQL DB
  • Quality control, build EST clusters, and merge introns
    and exons from individual alignments
  • Compare intron/intron and intron/exon pairs,
    find overlapping events, and classify AS events
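
The comparison step can be sketched roughly as below. This is not the authors' pipeline: the function names, event labels, and classification rules are assumptions chosen for illustration. Coordinates are half-open (start, end) on the forward strand, so an intron starts at its donor (5') site and ends at its acceptor (3') site.

```python
def classify_intron_pair(a, b):
    """Classify two introns from the same EST cluster (forward strand)."""
    if a == b:
        return "identical"
    if a[1] <= b[0] or b[1] <= a[0]:
        return "no overlap"
    if a[0] == b[0]:
        # Same donor site, different acceptor site.
        return "alternative acceptor"
    if a[1] == b[1]:
        # Same acceptor site, different donor site.
        return "alternative donor"
    return "other overlap"


def classify_intron_exon(intron, exon):
    """Classify an intron from one alignment against an exon from another."""
    if exon[0] <= intron[0] and intron[1] <= exon[1]:
        # The exon covers the whole intron: the intron was kept.
        return "intron retention"
    if intron[0] < exon[0] and exon[1] < intron[1]:
        # The exon lies strictly inside the intron: it was spliced out.
        return "exon skipping"
    return "no event"


# Hypothetical coordinates.
print(classify_intron_pair((100, 200), (100, 250)))
print(classify_intron_pair((100, 200), (150, 200)))
print(classify_intron_exon((120, 180), (100, 200)))
print(classify_intron_exon((100, 300), (150, 200)))
```

On the reverse strand the donor and acceptor labels would swap, which this sketch does not handle.
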
Conclusion: 1512 CDSs show alternative splicing.
Most AS events occur in the CDS region, which can
enrich protein structure and function. Manual
curation shows that the false-positive rate (due to
EST contamination) is low (10). The AS type
distribution indicates that the mosquito is
closer to plants than to mammals.
6
Software package and web presentation
The combinational CDS prediction and alternative
splicing detection pipelines have been integrated
into our open-source package (collaboration is
welcome). Results are also accessible
through the web.