Splicing Exons: A Eukaryotic Challenge to Gene Prediction

1
Splicing Exons: A Eukaryotic Challenge to Gene
Prediction
  • Ian McCoy

2
(No Transcript)
3
Gene Prediction
  • Genes must be identified to make the genome
    useful
  • Computational Problem: Take a seemingly random
    sequence of characters, millions or billions of
    bases long, and find the genes.

4
A Serious Complication
  • Only 3% of the human genome contains genes

5
Similarity-Based Approach
  • Instead of looking for a gene for a target
    protein directly, use a protein in a related
    organism.
  • Find all local similarities between a genomic
    sequence and the target protein sequence.
  • All substrings that exhibit a certain level of
    similarity will be called putative exons.
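
The idea of collecting putative exons can be sketched in a few lines of Python. This is only a toy illustration, not the method used in the talk: it assumes the genomic sequence has already been translated into amino acids in one reading frame, it compares fixed-length ungapped windows rather than all substrings, and the names `putative_exons`, `window`, and `min_matches` are hypothetical. Coordinates are half-open, so `r` is one past the last position.

```python
def putative_exons(translated, target, window=4, min_matches=3):
    """Return (l, r, w) triples for windows of `translated` that agree with
    some window of `target` in at least `min_matches` positions.

    Toy assumption: ungapped, fixed-length comparison; w is the match count.
    """
    best = {}
    for l in range(len(translated) - window + 1):
        chunk = translated[l:l + window]
        for t in range(len(target) - window + 1):
            # Count identical characters position by position.
            w = sum(a == b for a, b in zip(chunk, target[t:t + window]))
            if w >= min_matches:
                key = (l, l + window)
                # Keep the best score seen for this genomic window.
                best[key] = max(best.get(key, 0), w)
    return [(l, r, w) for (l, r), w in sorted(best.items())]


if __name__ == "__main__":
    print(putative_exons("XXMKTAYYY", "MKTA", window=4, min_matches=4))
```

A real pipeline would use a proper local alignment score instead of a match count, but the output has the same shape: a list of weighted intervals.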

6
Exon-Chaining Problem
  • Use brute force to generate a set of putative
    exons.
  • Represent each exon with three parameters
    (l, r, w): left position, right position, and
    weight.
  • Find a maximum-weight set of nonoverlapping
    putative exons.

7
Formulate as Graph Problem
  • Create a graph G with 2n vertices: n vertices
    are the starting (left) positions of exons and n
    vertices are the ending (right) positions of exons.
  • The set of left and right interval ends is sorted
    into increasing order.
  • There is an edge from l_i to r_i of weight
    w_i for each i from 1 to n, plus 2n-1 additional
    edges of weight 0 connecting adjacent vertices.
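
The graph construction described on this slide might be sketched as follows. This is a minimal sketch that assumes all 2n interval endpoints are distinct; the function name `build_chain_graph` and the tuple layouts are hypothetical choices for illustration. Vertices are numbered 1 to 2n in sorted order.

```python
def build_chain_graph(intervals):
    """Build the exon-chaining graph for a list of (l, r, w) intervals.

    Returns (endpoints, exon_edges, zero_edges), where edges are
    (from_vertex, to_vertex, weight) with 1-based vertex numbers.
    Assumes all endpoints are distinct.
    """
    endpoints = []
    for k, (l, r, _) in enumerate(intervals):
        endpoints.append((l, "L", k))  # left end of interval k
        endpoints.append((r, "R", k))  # right end of interval k
    endpoints.sort(key=lambda e: e[0])  # increasing position

    # Map each interval end to its vertex number in sorted order.
    index = {(kind, k): i for i, (_, kind, k) in enumerate(endpoints, start=1)}

    # One edge of weight w per interval, from its left end to its right end.
    exon_edges = [(index[("L", k)], index[("R", k)], w)
                  for k, (_, _, w) in enumerate(intervals)]
    # 2n-1 edges of weight 0 between adjacent vertices.
    zero_edges = [(i, i + 1, 0) for i in range(1, len(endpoints))]
    return endpoints, exon_edges, zero_edges


if __name__ == "__main__":
    _, exon_edges, zero_edges = build_chain_graph([(1, 5, 5), (2, 3, 3), (4, 8, 6)])
    print(exon_edges)
    print(len(zero_edges))
```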

8
Input: A set of weighted intervals (putative exons)
Output: The length (total weight) of a maximum chain
of intervals from this set
9
Dynamic Programming Algorithm
  • ExonChaining(G, n) // graph, number of intervals
  • for i ← 1 to 2n
  •   s_i ← 0
  • for i ← 2 to 2n
  •   if vertex v_i in G corresponds to the right end
      of an interval I
  •     j ← index of the vertex for the left end of
        interval I
  •     w ← weight of interval I
  •     s_i ← max{s_j + w, s_{i-1}}
  •   else
  •     s_i ← s_{i-1}
  • return s_2n
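
The dynamic program above can be written as a short runnable sketch in Python. It assumes each interval is an (l, r, w) triple with l < r and that all endpoints are distinct; the name `exon_chaining` and the input layout are illustrative choices, not taken from the slides. Instead of an explicit graph, it walks the sorted endpoints directly.

```python
def exon_chaining(intervals):
    """Return the maximum total weight of a chain of nonoverlapping intervals.

    `intervals` is a list of (l, r, w) triples with l < r and distinct endpoints.
    """
    events = []
    for k, (l, r, _) in enumerate(intervals):
        events.append((l, k, False))  # left end of interval k
        events.append((r, k, True))   # right end of interval k
    events.sort(key=lambda e: e[0])

    left_index = {}                # interval number -> vertex index of its left end
    s = [0] * (len(events) + 1)    # s[0] is a sentinel; vertices are 1..2n
    for i, (_, k, is_right) in enumerate(events, start=1):
        if is_right:
            j = left_index[k]
            w = intervals[k][2]
            # Either end the chain with interval k or skip it.
            s[i] = max(s[j] + w, s[i - 1])
        else:
            left_index[k] = i
            s[i] = s[i - 1]
    return s[-1]


if __name__ == "__main__":
    print(exon_chaining([(1, 5, 5), (2, 3, 3), (4, 8, 6)]))  # 9
```

In the example, the best chain is (2, 3, 3) followed by (4, 8, 6), for a total weight of 9. Sorting dominates the running time, so the sketch takes O(n log n) time for n intervals.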

10
(No Transcript)
11
Shortcomings
  • A large number of short exons will decrease the
    efficacy of our method for finding putative
    exons.
  • Exons may be out of order: the chain with the
    highest weight need not follow the order of the
    target protein.

12
Any Questions?
  • Jones, Neil C., and Pavel A. Pevzner. An
    Introduction to Bioinformatics Algorithms.
    Cambridge, MA: MIT Press, 2004, pp. 200-203.