Improving Free Energy Functions for RNA Folding


Title: Improving Free Energy Functions for RNA Folding


1
Improving Free Energy Functions for RNA Folding
  • RNA Secondary Structure Prediction

2
Why RNA is Important
  • Machinery of protein construction
  • Catalytic role in cells
  • May be possible to destroy specific sequences of
    RNA (to interrupt protein production)
  • RNase P (Cech/Altman c.1981)

3
RNA Structural Levels
Secondary: http://anx12.bio.uci.edu/hudel/bs99a/lecture21/lecture2_2.html
Tertiary: http://www.leeds.ac.uk/bmb/courses/teachers/trnballs.html
4
Abstracting the problem
[Figure: nucleotide labels A, G, C, G, C, A, U, C]
Zuker (1981) Nucleic Acids Research 9(1) 133-149
5
Why it is hard
  • Large search space (hard to enumerate)

Hofacker et al. (1994) Monatsh. Chem. 125 167-188
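To give a feel for the size of the search space, here is a minimal sketch that counts the secondary structures a sequence admits, using a standard counting recurrence. The pairing rules (Watson-Crick plus G-U wobble), the minimum hairpin loop of 3 unpaired bases, and the function name count_structures are assumptions made for illustration; they are not taken from the slides.

```python
from functools import lru_cache

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_structures(seq, min_loop=3):
    """Count pseudoknot-free secondary structures of seq (including the open chain)."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of structures on the inclusive interval seq[i..j].
        if j - i < 1:
            return 1  # empty interval or a single base: only the open chain
        total = count(i, j - 1)  # case 1: base j is unpaired
        # case 2: base j pairs with k, leaving at least min_loop bases between them
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                total += count(i, k - 1) * count(k + 1, j - 1)
        return total

    return count(0, len(seq) - 1)


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(n, count_structures("GC" * n))
```

The count grows quickly with sequence length, which is why explicit enumeration is impractical for long sequences.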
6
Why it is hard
  • Secondary structure does not exist.
  • Unlike proteins
  • Putative structures (prone to revision)
  • Quality of Energy Functions
  • Discussed later

7
Current Algorithms
  • Single-Strand
  • Minimum Free Energy (Zuker et al. 1981)
  • Partition Functions (McCaskill 1990)
  • Comparative Sequence Analysis
  • Max. Weighted Matching (Nussinov et al. 1978)
  • Stochastic CFG (Sakakibara et al. 1994)
  • Phylogenetic Trees (Gulko et al. 1995)
  • Statistical Significance (Noller & Woese, early
    80s)

See proposal for references
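As an illustration of the dynamic programming shape shared by these single-strand methods, the following is a minimal sketch of Nussinov-style base-pair maximization. It maximizes the number of pairs rather than minimizing a free energy, so it is a simplification of MFE folding and not the Zuker algorithm itself. The pairing rules, the min_loop value, and the function name are assumptions.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs seq can form."""
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count on seq[i..j]; short intervals stay at 0.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # base j left unpaired
            for k in range(i, j - min_loop):  # base j paired with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

The table is filled in O(n^3) time, which works only because each interval can be scored independently of the rest of the structure.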
8
MFE / Tinoco Hypothesis
The free energy of a secondary structure equals
the sum of the free energies of the loops and
stacked pairs
Tinoco et al. (1971) Nature 230 362-367.
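The additivity assumption can be sketched with a toy scoring function over dot-bracket strings. The energy values below (-2.0 per stacked pair, +4.0 per hairpin loop) are made-up placeholders, not measured parameters, and interior loops, bulges, and multiloops are deliberately scored as zero to keep the sketch short.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def toy_energy(dot_bracket, stack_e=-2.0, hairpin_e=4.0):
    """Sum per-element contributions; stack_e and hairpin_e are hypothetical values."""
    pairs = parse_pairs(dot_bracket)
    energy = 0.0
    for (i, j) in pairs:
        if (i + 1, j - 1) in pairs:
            energy += stack_e  # pair (i, j) stacks on pair (i+1, j-1)
        elif "(" not in dot_bracket[i + 1:j]:
            energy += hairpin_e  # pair (i, j) closes a hairpin loop
        # other loop types are ignored in this sketch
    return energy


if __name__ == "__main__":
    print(toy_energy("((((...))))"))
```

Because the total is a sum over independent local elements, the optimum can be found by dynamic programming.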
9
Proposed System
[Diagram: sequence AAUCG...CUUCUUCCA; boxes labeled MFE (E) and GA (E); steps numbered 1, 2, 3]
10
Step I - Calc MFE Structure
  • Given a sequence ? apply the MFE algorithm
  • Generates secondary structure S?

11
Step II - Structural Similarity
  • Given a database of experimentally verified RNA
    structures
  • Let Q? be the database structure most similar to
    S?
  • Based on RNase P Database (Brown 1999)
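The slides do not say which similarity measure is used. As one possible sketch, the code below scores two dot-bracket structures by the Jaccard overlap of their base-pair sets and picks the best match from a list. The measure, the function names, and the assumption that structures are compared position by position are all illustrative choices.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def similarity(a, b):
    """Jaccard overlap of base-pair sets, in [0, 1] (an assumed measure)."""
    pa, pb = base_pairs(a), base_pairs(b)
    if not pa and not pb:
        return 1.0  # two open chains are treated as identical
    return len(pa & pb) / len(pa | pb)


def most_similar(predicted, database):
    """Return the database structure with the highest similarity to predicted."""
    return max(database, key=lambda s: similarity(predicted, s))


if __name__ == "__main__":
    db = [".......", ".(...).", "((...))"]
    print(most_similar("((...))", db))
```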

12
Step III - Construct E
  • Create a new energy function

13
Discussion on E
  • E has global information
  • Global information precludes the use of dynamic
    programming (MFE, Partition)
  • Leaves (stochastic) combinatorial optimization
  • Gradient Descent (no ∂E/∂S)
  • Genetic Algorithms / Simulated Annealing

14
Step IV - Genetic Algorithm
  • RNA Structural Prediction by GA
  • Input sequence ?
  • Output structure that maximizes E for ?
  • Steady State Genetic Algorithm
  • Pseudoknots forbidden (conflicts)
  • Fitness = -E
  • Effect of Similarity(Q?, S?) diminishes with each
    generation (pseudo-SA).
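The form of the energy function is not given in the transcript. As a sketch under stated assumptions, suppose it combines an additive base energy with a similarity bonus whose weight decays geometrically per generation. The functional form, the names similarity_weight and combined_energy, and the constants w0 and decay are all hypothetical.

```python
def similarity_weight(generation, w0=10.0, decay=0.9):
    """Hypothetical schedule: the weight shrinks by a constant factor each generation."""
    return w0 * decay ** generation


def combined_energy(base_energy, similarity, generation, w0=10.0, decay=0.9):
    """Assumed form: base energy lowered by a decaying similarity bonus."""
    return base_energy - similarity_weight(generation, w0, decay) * similarity


def fitness(base_energy, similarity, generation, w0=10.0, decay=0.9):
    """Fitness is the negated energy, so lower energy means higher fitness."""
    return -combined_energy(base_energy, similarity, generation, w0, decay)


if __name__ == "__main__":
    for gen in (0, 10, 50):
        print(gen, fitness(-5.0, 0.5, gen))
```

Early generations are pulled toward the database structure; later generations are scored almost entirely by the base energy, which resembles a cooling schedule in simulated annealing.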

15
Genetic Algorithm - Repn.
  • Stem-loop representation (Chen et al. 2000)
  • Window method (EMBOSS Palindrome)
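A stem pool can be sketched as a scan for inverted repeats. The code below is a simple stand-in for a window-based palindrome search, not the EMBOSS implementation; the (start, end, length) tuple encoding, the pairing rules, and the thresholds min_stem and min_loop are assumptions.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def build_stem_pool(seq, min_stem=3, min_loop=3):
    """List maximal stems as (i, j, length): pairs (i+k, j-k) for k < length."""
    n = len(seq)
    pool = []
    for i in range(n):
        # Smallest j that leaves room for min_stem pairs and a min_loop hairpin.
        for j in range(i + 2 * min_stem + min_loop - 1, n):
            # Skip stems that could be extended outward; a longer one covers them.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Grow the stem inward while bases pair and the loop stays large enough.
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_stem:
                pool.append((i, j, length))
    return pool


if __name__ == "__main__":
    print(build_stem_pool("GGGAAAUCC"))
```

Each individual in the genetic algorithm can then be represented as a list of stems drawn from this pool.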

16
Genetic Algorithm - Operators
  • Mutation
  • Add stem from stem pool to a child
  • Crossover
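The operators can be sketched on individuals stored as lists of (start, end, length) stems. The mutation follows the slide (add a compatible stem from the pool); the crossover shown here, which keeps one parent's stems and adds the other's where they fit, is a hypothetical choice because the transcript gives no detail. The compatibility test rejects overlapping and crossing (pseudoknotted) stems.

```python
import random


def compatible(a, b):
    """True if stems a and b share no bases and do not cross (no pseudoknot)."""
    (i1, j1, l1), (i2, j2, l2) = a, b
    if j1 < i2 or j2 < i1:
        return True  # side by side
    if i1 + l1 - 1 < i2 and j2 < j1 - l1 + 1:
        return True  # b nested inside the loop of a
    if i2 + l2 - 1 < i1 and j1 < j2 - l2 + 1:
        return True  # a nested inside the loop of b
    return False


def mutate(child, stem_pool, rng):
    """Add one random compatible stem from the pool, if any exists."""
    candidates = [s for s in stem_pool
                  if s not in child and all(compatible(s, t) for t in child)]
    if not candidates:
        return list(child)
    return list(child) + [rng.choice(candidates)]


def crossover(parent_a, parent_b):
    """Hypothetical crossover: keep parent_a, then add compatible stems of parent_b."""
    child = list(parent_a)
    for stem in parent_b:
        if stem not in child and all(compatible(stem, t) for t in child):
            child.append(stem)
    return child


if __name__ == "__main__":
    pool = [(5, 15, 3), (10, 25, 3)]
    print(mutate([(0, 20, 3)], pool, random.Random(0)))
```

Rejecting conflicting stems at operator level keeps every individual a valid pseudoknot-free structure, so no repair step is needed before scoring.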

17
Preliminary Results
  • E does not lead to drastic speed up
  • Genetic algorithm is very slow
  • If the initial population is generated randomly
    from the stem pool.
  • Use suboptimal folding for initial population.

18
Preliminary Results Explained
  • The real structure is usually very similar to the
    Tinoco optimal structure.
  • View E as a way of choosing among the suboptimal
    structures.

19
Future Work
  • More testing on the entire RNase P Database (>
    400 structures)
  • Tune E
  • Accuracy comparison to MFE and Partition Function
    Algorithms
  • Parallelize genetic algorithm

20
  • END