A Splice Site Locating - PowerPoint PPT Presentation

About This Presentation
Title:

A Splice Site Locating

Description:

A Splice Site Locating & Analyzing Tool. December 6, 2002. BioE 142. Kelly Han. Diana Wong ... Locations of introns and exons are valuable. Splice sites are ...

Slides: 17
Provided by: diana49

Transcript and Presenter's Notes

Title: A Splice Site Locating


1
A Splice Site Locating & Analyzing Tool
  • December 6, 2002
  • BioE 142
  • Kelly Han
  • Diana Wong
  • Tyler Hillman

2
Background
  • Locations of introns and exons are valuable
  • Splice sites are mysterious and hard to predict

3
Existing Research
  • Gene prediction accuracy levels are poor:
  • 80% at the nucleotide level
  • 45% at the exon level
  • 20% at the whole-gene level
  • Many (15) papers on gene prediction
  • GenScan: uses a hidden Markov model (HMM)
  • GeneSplicer: a statistical method that integrates
    multiple sources of known evidence
  • FirstEF: predicts CpG-related and non-CpG-related
    first exons
  • Hexon: linear discriminant analysis
  • MZEF, FGENES, etc. The list goes on

4
Problems
  • Accuracy of gene prediction may slowly improve,
    but basic algorithms behind these various
    approaches have changed little since 1997
  • Most existing approaches look for a specific
    pattern or unique feature, which does not allow
    for generalization
  • Coupling algorithmic method with human heuristics

5
Project Overview
  • Part I
  • Created a useful tool for users to visualize
    possible splice sites by color coding nucleotides
  • Included basic analysis: pattern ranking (by frequency)
  • Allows the user to make educated decisions
  • Part II
  • Proposed experimental design
  • Stand alone, theoretical approach

6
Part I
  • Found all splice sites from an annotated genome,
    Drosophila melanogaster
  • 6 bases upstream and downstream of each site
  • Statistically analyzed splice site patterns
  • Ranked them
  • Use data to find possible splice sites in a
    sequence inputted by user
  • Be able to analyze any annotated genome the user
    wishes to input
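
The Part I steps above could be sketched roughly as follows. This is a minimal Python illustration, not the presenters' tool: the function names, the toy input, and the convention that each splice site is given as the 0-based index of the first base after the boundary are all assumptions made for this sketch.

```python
from collections import Counter

FLANK = 6  # bases taken upstream and downstream of each splice site (from the slide)

def site_windows(sequence, boundaries, flank=FLANK):
    """Return the 2*flank-base window centred on each boundary position.

    `boundaries` holds 0-based indices of the first base after each
    intron/exon boundary (an assumed convention for this sketch).
    Boundaries too close to either end of the sequence are skipped.
    """
    windows = []
    for pos in boundaries:
        if pos - flank >= 0 and pos + flank <= len(sequence):
            windows.append(sequence[pos - flank:pos + flank])
    return windows

def rank_patterns(windows):
    """Rank window patterns by frequency, most common first."""
    return Counter(windows).most_common()

def find_candidates(query, ranked, top_n=10):
    """Report (boundary position, pattern) hits of the top-ranked patterns in a query."""
    hits = []
    for pattern, _count in ranked[:top_n]:
        start = query.find(pattern)
        while start != -1:
            # The candidate boundary sits in the middle of the matched window.
            hits.append((start + len(pattern) // 2, pattern))
            start = query.find(pattern, start + 1)
    return sorted(hits)
```

Exact string matching is the simplest possible scoring rule; a real tool would more likely tolerate mismatches or weight each position separately.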

7
(No Transcript)
8
(No Transcript)
9
(No Transcript)
10
Entering search query
11
(No Transcript)
12
(No Transcript)
13
Part II
  • Available Data: Drosophila melanogaster gene
    sequences with annotated intron/exon boundaries
    determined by molecular biology bench work
  • Our Question
  • Are the sequences around the splice sites just
    RANDOM sequences? In other words, is there any
    uniqueness?

14
Our Approach: Sequence Reconstruction Method
  • Step 1: Cut the gene at known intron/exon
    boundaries
  • Step 2: Reconstruct it using overlaps
  • Step 3: Cut the same gene with a random pattern
  • Step 4: Reconstruct it
  • Compare the result of Step 2 with the result of Step 4
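
The two cutting steps (Steps 1 and 3) can be illustrated with a short Python sketch. The helper names are hypothetical, and the slides do not say how the random pattern was generated; here it is assumed to be the same number of cut points drawn uniformly at random, so that the random cut serves as a size-matched control. Reconstruction itself is not shown in this block.

```python
import random

def cut_at(sequence, cut_points):
    """Split `sequence` at the given 0-based cut points and return the fragments."""
    # Ignore duplicate points and points that fall on or outside the ends.
    points = sorted(set(p for p in cut_points if 0 < p < len(sequence)))
    edges = [0] + points + [len(sequence)]
    return [sequence[a:b] for a, b in zip(edges, edges[1:])]

def random_cut_points(sequence, n_cuts, seed=None):
    """Draw `n_cuts` distinct interior cut points at random (assumed control for Step 3)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, len(sequence)), n_cuts))
```

Passing a fixed `seed` makes the random control repeatable, which helps when comparing the two reconstructions across runs.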

15
Reconstruction: By Graph Theory
  • Reconstruction Method: Hamiltonian Path
  • Each vertex corresponds to a specific k-tuple
  • Arcs map the overlap among the k-tuples
  • If a k-1 nucleotide overlap is the suffix (last
    k-1 nucleotides) of a k-tuple t1, and it is also
    the prefix (first k-1 nucleotides) of a k-tuple
    t2, then an arc is drawn from t1 to t2.
  • All the vertices of the graph must be visited,
    therefore, mathematically, sequence reconstruction
    is a Hamiltonian Path problem
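
The graph construction described on this slide can be sketched in Python as below. This is an illustrative brute-force version under stated assumptions, not the presenters' code: the function names are hypothetical, all k-tuples are assumed to have the same length k, and the backtracking search is only practical for small inputs because the Hamiltonian path problem is NP-complete.

```python
def k_tuples(sequence, k):
    """Return every overlapping k-tuple of `sequence`, in order of occurrence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def overlap_graph(tuples):
    """Map each vertex index to the indices whose (k-1)-prefix matches its (k-1)-suffix."""
    return {
        i: [j for j, t2 in enumerate(tuples) if i != j and t1[1:] == t2[:-1]]
        for i, t1 in enumerate(tuples)
    }

def hamiltonian_path(graph):
    """Find one path visiting every vertex exactly once by backtracking, or None."""
    n = len(graph)

    def extend(path, visited):
        if len(path) == n:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                found = extend(path + [nxt], visited | {nxt})
                if found:
                    return found
        return None

    for start in graph:
        found = extend([start], {start})
        if found:
            return found
    return None

def reconstruct(tuples):
    """Spell the sequence along a Hamiltonian path, or return None if none exists."""
    path = hamiltonian_path(overlap_graph(tuples))
    if path is None:
        return None
    # The first k-tuple contributes k bases; each later one adds its last base.
    return tuples[path[0]] + "".join(tuples[i][-1] for i in path[1:])
```

For example, `reconstruct(["GTT", "ACG", "CGT"])` follows the arcs ACG to CGT to GTT. When several Hamiltonian paths exist, this sketch returns only the first one it finds, so the result may differ from the original sequence while still using every k-tuple.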

16
Interpretation of Results
  • +/+ inconclusive
  • -/- inconclusive
  • +/- significant
  • The pattern at the boundary is more
    organized/unique than a randomly selected sequence
  • (Possible errors: poorly chosen boundary
    sequence, reconstruction errors, etc.)
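
The decision rule on this slide can be written as a small lookup. This sketch rests on two assumptions that the slide does not state outright: the three outcomes are read as +/+, -/- and +/-, and the first symbol refers to the gene cut at known boundaries while the second refers to the randomly cut gene, with "+" meaning the reconstruction succeeded. The names are hypothetical.

```python
# Keys are (boundary_cut_reconstructed, random_cut_reconstructed).
VERDICTS = {
    (True, True): "inconclusive",    # assumed reading of +/+
    (False, False): "inconclusive",  # assumed reading of -/-
    (True, False): "significant",    # assumed reading of +/-
}

def interpret(boundary_ok, random_ok):
    """Return the slide's verdict, or a marker for the case the slide does not list."""
    return VERDICTS.get((bool(boundary_ok), bool(random_ok)), "not listed on the slide")
```

The -/+ case is deliberately left without a verdict because the slide gives none.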