Title: RNA secondary structure
1 RNA secondary structure
- RNA is (usually) single-stranded
- The nucleotides want to pair with their Watson-Crick complements (AU, GC)
- They may settle for a wobble pair (GU)
- The set of such pairs is the secondary structure of the sequence
2 The problem
- Input: An RNA sequence R
- Output: A predicted secondary structure
- $S = \{(r_{i_1}, r_{j_1}), (r_{i_2}, r_{j_2}), \ldots, (r_{i_n}, r_{j_n})\}$
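A minimal sketch of how the output S could be represented and sanity-checked, assuming 0-based indices and that each nucleotide takes part in at most one pair. The names `ALLOWED_PAIRS` and `is_valid_structure` are illustrative and do not come from the slides.

```python
# Allowed pairs: Watson-Crick (AU, GC) plus the GU wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_valid_structure(seq, pairs):
    """Return True if `pairs` is a plausible secondary structure for `seq`.

    `pairs` is a set of 0-based index tuples (i, j) with i < j.
    Assumption: each nucleotide takes part in at most one pair.
    """
    used = set()
    for i, j in pairs:
        if not (0 <= i < j < len(seq)):
            return False  # indices out of range or not ordered
        if (seq[i], seq[j]) not in ALLOWED_PAIRS:
            return False  # not a Watson-Crick or wobble pair
        if i in used or j in used:
            return False  # a nucleotide would be paired twice
        used.update((i, j))
    return True


seq = "GGGAAAUCC"
structure = {(0, 8), (1, 7), (2, 6)}  # G-C, G-C, G-U
print(is_valid_structure(seq, structure))  # prints True
```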
3 Faustian bargain for RNA secondary structure prediction
- Assume an RNA strand folds to lowest energy
- Three types of base pair interactions (G-C, A-U, G-U)
- All structures are nested
- Energies are strictly additive
4 Lowest energy assumption
- There are exponentially many configurations for a nucleic acid (or protein) sequence
- A sequence assumes its stable configuration quickly (seconds)
- How does it find the global minimum quickly? (Levinthal paradox)
- Alternative: it finds a local minimum
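To get a feel for how fast the number of configurations grows, the sketch below counts nested secondary structures on n positions. It is only a rough upper bound under stated assumptions: any two positions are allowed to pair regardless of base, and a hairpin must enclose at least `min_loop` unpaired positions. The function name and the `min_loop` parameter are hypothetical.

```python
from functools import lru_cache


def count_structures(n, min_loop=3):
    """Count nested structures on n positions, assuming any two positions may pair."""

    @lru_cache(maxsize=None)
    def count(m):
        if m <= min_loop + 1:
            return 1  # too short for any pair: only the open chain
        total = count(m - 1)  # case 1: the last position is unpaired
        # case 2: the last position pairs with position k (1-based),
        # leaving k - 1 positions before the pair and m - k - 1 inside it
        for k in range(1, m - min_loop):
            total += count(k - 1) * count(m - k - 1)
        return total

    return count(n)


for n in (10, 20, 30, 40):
    print(n, count_structures(n))
```

With `min_loop=0` the counts are the Motzkin numbers (1, 1, 2, 4, 9, 21, ...), which is a quick way to check the recurrence.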
5 Base pair interactions
- Atomic interactions are too complex to calculate
- Base pair interactions are a reasonable compromise
6 Nested structures
- More than 95% of structures are nested, but ...
- Those that aren't may well be significant
- Non-nested structures include pseudoknots and kissing loops
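A structure is nested when no two pairs cross, that is, there are no pairs (i, j) and (k, l) with i < k < j < l. A minimal sketch of that check, assuming pairs are given as index tuples (i, j) with i < j; the function name `is_nested` is illustrative.

```python
def is_nested(pairs):
    """Return True if no two pairs cross, i.e. there is no i < k < j < l."""
    ordered = sorted(pairs)  # sort by opening position
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:  # (k, l) opens inside (i, j) but closes outside
                return False
    return True


print(is_nested({(0, 8), (1, 7), (2, 6)}))  # prints True: a simple stem
print(is_nested({(0, 5), (3, 8)}))          # prints False: crossing pairs
```

This pairwise check is quadratic in the number of pairs, which is fine for illustration; a stack-based scan would do the same in linear time.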
7 Pseudoknot
8 Kissing loops
9 Loops
- Nested structures are loops
- The size of a loop is the number of unpaired nucleotides it contains
- The arity of a loop is the number of interior pairs it contains
- Every loop has exactly one closing pair
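Since every loop has exactly one closing pair, a nested structure can be decomposed by treating each pair in turn as a closing pair and walking the positions it encloses. The sketch below reports arity and size for each loop; it assumes a nested structure given as 0-based (i, j) tuples with i < j, and the function name `loops` is illustrative.

```python
def loops(pairs):
    """Return [(closing_pair, arity, size), ...] for a nested structure."""
    partner = {}
    for i, j in pairs:
        partner[i] = j
        partner[j] = i

    result = []
    for i, j in sorted(pairs):
        arity = size = 0
        k = i + 1
        while k < j:
            if k in partner:         # k opens an interior pair (k, partner[k])
                arity += 1
                k = partner[k] + 1   # skip everything that pair encloses
            else:                    # k is unpaired and belongs to this loop
                size += 1
                k += 1
        result.append(((i, j), arity, size))
    return result


# A three-pair stem closed by a hairpin of size 3.
for closing, arity, size in loops({(0, 8), (1, 7), (2, 6)}):
    print(closing, "arity", arity, "size", size)
```

Unpaired nucleotides outside every pair are not enclosed by a closing pair, so this sketch does not report them.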
10 Hairpin loop
Arity 0 (always), size 4 (variable)
11 Hairpin loop - size
Unpaired nucleotides
12 Hairpin loop - closing pair
Closing pair
13 Bulge
Arity 1 (always), size 1 (variable)
14 Interior loop
Arity 1 (always), size 2
15 Multiloop
Arity 2 (variable), size 3 (variable)
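The loop types on these slides can be told apart from the closing pair and the interior pairs alone. A minimal sketch, assuming 0-based positions, a bulge has unpaired nucleotides on exactly one side of its single interior pair, an interior loop has them on both sides, and a stacked pair has none. The function name `classify_loop` is hypothetical.

```python
def classify_loop(closing, interior):
    """Classify a loop from its closing pair and its list of interior pairs."""
    i, j = closing
    if len(interior) == 0:
        return "hairpin"            # arity 0
    if len(interior) >= 2:
        return "multiloop"          # arity 2 or more
    k, l = interior[0]              # arity 1: exactly one interior pair
    left = k - i - 1                # unpaired nucleotides between i and k
    right = j - l - 1               # unpaired nucleotides between l and j
    if left == 0 and right == 0:
        return "stacked pair"
    if left == 0 or right == 0:
        return "bulge"
    return "interior loop"


print(classify_loop((2, 7), []))                    # hairpin, size 4
print(classify_loop((0, 9), [(2, 8)]))              # bulge, size 1
print(classify_loop((0, 10), [(2, 8)]))             # interior loop, size 2
print(classify_loop((0, 20), [(2, 8), (10, 18)]))   # multiloop, arity 2, size 3
```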
16 Composite RNA secondary structure
17 Nearest neighbor formula
- Energy is a function of closing base pair and its nearest neighbors
- Non-symmetric ( E(XYZW) ≠ E(WZYX) )
- Energies of nested loops are additive
- $E(R) = E\big((r_{i_1}, r_{i_1+1}, r_{j_1-1}), (r_{i_2+1}, r_{j_2-1}), \ldots, (r_{i_n+1}, r_{j_n-1})\big)$
18 Loop recurrence relation
$$E(S_{i,j}) = \min \begin{cases} E(S_{i+1,j}) \\ E(S_{i,j-1}) \\ \min_{i < k < j} \big[ E(S_{i,k}) + E(S_{k+1,j}) \big] \\ E(L_{i,j}) \end{cases}$$
where ...
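The recurrence can be filled in bottom-up over increasing subsequence length. The sketch below reads it as E(S_{i,j}) = min of E(S_{i+1,j}), E(S_{i,j-1}), the best split E(S_{i,k}) + E(S_{k+1,j}) for i < k < j, and E(L_{i,j}). It is not the full nearest-neighbor model. As a stand-in for the loop term E(L_{i,j}) it uses a flat energy per allowed pair plus the energy of the enclosed subsequence, a Nussinov-style simplification. The `pair_energy` and `min_loop` parameters and their default values are hypothetical placeholders, not measured energies.

```python
ALLOWED = {"AU", "UA", "GC", "CG", "GU", "UG"}


def min_energy(seq, pair_energy=-1.0, min_loop=3):
    """Minimum energy of seq under a simplified, pair-counting energy model."""
    n = len(seq)
    # E[i][j] holds E(S_{i,j}) for the subsequence seq[i..j], inclusive.
    E = [[0.0] * n for _ in range(n)]

    def S(i, j):
        return E[i][j] if i < j else 0.0  # empty or single-base spans cost 0

    for span in range(1, n):              # shorter subsequences first
        for i in range(n - span):
            j = i + span
            best = min(S(i + 1, j), S(i, j - 1))
            for k in range(i + 1, j):     # split into two adjacent substructures
                best = min(best, S(i, k) + S(k + 1, j))
            # Stand-in for E(L_{i,j}): i pairs with j around the best interior.
            if seq[i] + seq[j] in ALLOWED and j - i - 1 >= min_loop:
                best = min(best, pair_energy + S(i + 1, j - 1))
            E[i][j] = best
    return E[0][n - 1] if n else 0.0


print(min_energy("GGGAAAUCC"))  # prints -3.0 (three stacked pairs around AAA)
```

The triple loop makes this cubic in the sequence length; a real loop term would replace the single `pair_energy` line with a minimum over hairpin, stacked pair, bulge, and interior loop cases.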
19 $L_{i,j}$ recurrence relation
$E(L_{i,j}) = \min$ over the cases:
- $L_{i,j}$ is a hairpin
- $L_{i,j}$ is a stacked pair
- $L_{i,j}$ is an i-bulge
- $L_{i,j}$ is a j-bulge
- $L_{i,j}$ is an interior loop
20 Hairpin loop
Nearest neighbors