Title: RNA Secondary Structure Prediction
1- RNA Secondary Structure Prediction
3 16S rRNA
4 RNA Secondary Structure
[Figure: an RNA secondary structure with its elements labeled: pseudoknot, dangling end, single-stranded region, interior loop, junction (multiloop), bulge, stem, hairpin loop. Image: Wuchty]
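5 RNA secondary structure
[Figure: a hairpin made of a loop on top of a stem. The stem shows the pairs A-U, U-G, C-G, A-U and G-C; U-G is labeled as a wobble pair and the others as canonical pairs.]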
6 RNA secondary structure representation
Legitimate structure
7 Non-canonical interactions of RNA secondary-structure elements
These patterns are excluded from the prediction schemes because their computation is too intensive:
- Pseudoknot
- Kissing hairpins
- Hairpin-bulge contact
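One way to see why these patterns are treated separately is that they introduce crossing base pairs. The sketch below assumes a structure is given as a list of 0-based index pairs (a representation chosen here for illustration, not taken from the slides) and reports every pair of base pairs (i, j), (k, l) with i < k < j < l.

```python
from itertools import combinations

def crossing_pairs(pairs):
    """Return all combinations of base pairs (i, j), (k, l) with i < k < j < l."""
    # Normalize so that each pair is (smaller index, larger index), then sort.
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        # After sorting, i <= k. The two pairs cross when k lies inside
        # the interval (i, j) while l lies outside it.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings

# Hypothetical example structures (index pairs only, no sequence needed).
nested_stem = [(0, 9), (1, 8), (2, 7)]
pseudoknot_like = [(0, 6), (1, 5), (3, 9), (4, 8)]

print(crossing_pairs(nested_stem))
print(len(crossing_pairs(pseudoknot_like)))
```

A structure with an empty result is nested, which is the case the prediction schemes in these slides handle.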
8 Rules for 2D RNA prediction
- Base pairs in stems: GOOD
  - Additional possible assumptions, e.g., GC better than AU
- Bulges, loops: BAD
- Canonical interactions (base pairs, stems, bulges, loops): OK
- Non-canonical interactions (pseudoknots, kissing hairpins): Forbidden
- The more interactions, the better
9 Predicting RNA secondary structure
- Allowed base-pairing rules: Watson-Crick pairs (AU, GC) and the wobble pair (GU)
- Sequences may form different structures
- A free energy value is associated with each possible structure
- Predict the structure with the minimal free energy (MFE)
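The pairing rules above can be written down directly. The following sketch checks whether two bases may pair and lists the candidate pairs in a sequence. The `min_loop` parameter (at least three unpaired bases between paired positions) and the example sequence are assumptions added for illustration; they do not come from the slides.

```python
# Watson-Crick pairs (AU, GC) plus the GU wobble pair, in both orientations.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

def can_pair(a, b):
    """True if bases a and b form an allowed pair."""
    return (a.upper(), b.upper()) in ALLOWED_PAIRS

def candidate_pairs(seq, min_loop=3):
    """List index pairs (i, j) that may pair, leaving at least
    min_loop unpaired bases between them (an assumed constraint)."""
    n = len(seq)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_loop + 1, n)
        if can_pair(seq[i], seq[j])
    ]

print(candidate_pairs("GGGAAAUCC"))  # hypothetical example sequence
```

Each possible structure is a compatible subset of these candidate pairs, which is why one sequence can form many different structures.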
10 Simplifying Assumptions for Structure Prediction
- RNA folds into one minimum free-energy structure.
- There are no non-canonical interactions (base pairs never cross).
- The energy of a particular base pair in a double-stranded region is sequence-independent.
  - Neighbors do not influence the energy.
The problem was solved by dynamic programming (Zuker and Stiegler, 1981).
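To illustrate how dynamic programming applies under these assumptions, here is a minimal sketch of a Nussinov-style base-pair maximization. It is a simpler stand-in, not the Zuker and Stiegler energy algorithm itself: every allowed pair scores the same, pairs never cross, and the structure with the most pairs wins. The `min_loop` constraint and the example sequences are assumptions.

```python
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq, min_loop=3):
    """Maximum number of non-crossing allowed base pairs in seq."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: base j stays unpaired.
            score = best[i][j - 1]
            # Case 2: base j pairs with some k, splitting the interval
            # into an independent left part and an enclosed inner part.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]

print(max_base_pairs("GGGAAAUCC"))  # hypothetical example sequence
```

The table has O(N^2) entries and each takes O(N) work, so the run time is polynomial even though the number of possible structures grows exponentially with N.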
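11 Sequence-dependent free energy (the nearest-neighbor model)
[Figure: two hairpin drawings, both ending in UCGAC 3', whose stems differ in one base pair.]
Example values (kcal/mol) for a G-C pair stacked on the following pair:
  G-C followed by A-U   -2.3
  G-C followed by G-C   -2.9
  G-C followed by C-G   -3.4
  G-C followed by U-A   -2.1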
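In the nearest-neighbor model the energy of a stem is the sum of the stacking energies of adjacent base pairs. The sketch below uses only the four example values from this slide. Reading them as "a G-C pair followed by A-U, G-C, C-G or U-A" is an interpretation of the slide layout, and the example helix is hypothetical.

```python
# Stacking energies in kcal/mol, keyed by (pair, following pair).
# Only the four example values from the slide are included.
STACKING = {
    ("GC", "AU"): -2.3,
    ("GC", "GC"): -2.9,
    ("GC", "CG"): -3.4,
    ("GC", "UA"): -2.1,
}

def helix_stacking_energy(pairs):
    """Sum the stacking energies over consecutive base pairs of a helix."""
    total = 0.0
    for upper, lower in zip(pairs, pairs[1:]):
        if (upper, lower) not in STACKING:
            raise KeyError(f"no example value for stack {upper}/{lower}")
        total += STACKING[(upper, lower)]
    # Round to one decimal to match the precision of the table.
    return round(total, 1)

# Hypothetical three-pair helix: G-C, G-C, A-U gives two stacks.
print(helix_stacking_energy(["GC", "GC", "AU"]))
```

A helix of n base pairs contributes n - 1 stacking terms, so the same base pair can contribute differently depending on its neighbor.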
12 Free energy computation
[Figure: a hairpin with a 4-nt loop, a 1-nt bulge and a 5' dangling end, annotated with the energy contribution of each element.]
Contributions (kcal/mol):
  4-nt hairpin loop      +5.9
  hairpin mismatch       -1.1
  stacking               -2.9
  1-nt bulge             +3.3
  stacking               -2.9
  stacking               -1.8
  stacking               -0.9
  stacking               -1.8
  stacking               -2.1
  5' dangling end        -0.3
Total: ΔG = -4.6 kcal/mol
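The total on this slide is simply the sum of the individual contributions. As a quick check, the sketch below adds the values listed on the slide, counting the -0.3 term once and assigning it to the 5' dangling end (an interpretation of the slide labels).

```python
# Energy contributions in kcal/mol, as listed on the slide.
contributions = [
    ("4-nt hairpin loop", 5.9),
    ("hairpin mismatch", -1.1),
    ("stacking", -2.9),
    ("1-nt bulge", 3.3),
    ("stacking", -2.9),
    ("stacking", -1.8),
    ("stacking", -0.9),
    ("stacking", -1.8),
    ("stacking", -2.1),
    ("5' dangling end", -0.3),
]

# Round to one decimal to avoid floating-point noise in the sum.
delta_g = round(sum(value for _, value in contributions), 1)
print(f"Delta G = {delta_g} kcal/mol")
```

Loops and bulges contribute positive (destabilizing) terms, while stacking, the hairpin mismatch and the dangling end contribute negative (stabilizing) terms.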
13 Prediction Programs
- Mfold
  - http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1.cgi
- Vienna RNA Secondary Structure Prediction
  - http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi
14 Mfold - Suboptimal Folding
- For any sequence of N nucleotides, the expected number of structures is greater than 1.8^N.
- A sequence of 100 nucleotides has about 3 × 10^25 possible folds. If a computer can calculate 1000 folds/second, it would take about 10^15 years (the age of the universe is about 10^10 years)!
- Mfold generates suboptimal folds whose free energy falls within a certain range of values. Many of these structures differ only in trivial ways. These suboptimal folds can still be useful for designing experiments.
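The numbers on this slide can be reproduced with a few lines of arithmetic. This sketch assumes a year of 365.25 days and uses the rate of 1000 folds per second stated above.

```python
import math

N = 100                      # sequence length in nucleotides
folds = 1.8 ** N             # lower bound on the number of structures
rate = 1000                  # folds evaluated per second
seconds_per_year = 365.25 * 24 * 3600

years = folds / rate / seconds_per_year

print(f"folds  ~ 10^{math.log10(folds):.1f}")
print(f"years  ~ 10^{math.log10(years):.1f}")
```

Exhaustive enumeration is therefore out of reach, which is why prediction programs rely on dynamic programming and report only a limited set of suboptimal folds.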
15 Example
16 Output