RNA Secondary Structure - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: RNA Secondary Structure


1
Predicting RNA Secondary Structures: A Lattice
Walk Approach to Modeling Sequences Within the
HIV-1 RNA Structure
  • Facing the Challenge of Infectious Diseases in
    Africa: The Role of Mathematical Modeling
  • University of the Witwatersrand
  • Johannesburg, South Africa
  • September 25-27, 2006
  • Asamoah Nkwanta, Ph.D.
  • Morgan State University
  • Nkwanta_at_jewel.morgan.edu

2
TOPICS
  • RNA Prediction and Molecular Biology
  • RNA Combinatorics
  • Certain Class of Random Walks
  • Matrix Theory
  • Connection Between Walks and RNA
  • Modeling HIV-1 RNA Sequences

3
RNA Secondary Structure Prediction
"The Human Genome Project and related efforts
have generated enormous amounts of raw biological
sequence data. However, understanding how
biological sequences encode structural
information remains a fundamental scientific
challenge. For instance, understanding the base
pairing, or secondary structure, of
single-stranded RNA sequences is crucial to
advancing knowledge of their novel biochemical
functions."
C. E. Heitsch, Combinatorics on Plane Trees,
Motivated by RNA Secondary Structure
Configuration (preprint, 2005)
4
What Is RNA Secondary Structure Prediction?
5
RNA Secondary Structure Prediction
  • Given a primary sequence, we want to find the
    biological function of the related secondary
    structure. To achieve this goal we predict
    (model) its secondary structure.
  • Most methods predict secondary structure rather
    than tertiary structure. The three-dimensional
    shape is important for biological function, but
    it is harder to predict.

6
Molecular Biology (Cont.)
3-D structure of Haloarcula marismortui 5S
ribosomal RNA in large ribosomal subunit
7
(No Transcript)
8
Molecular Biology
  • Central Dogma
  • DNA → RNA → Protein
  • Transcription / Translation

9
Molecular Biology (Cont.)
10
Molecular Biology (Cont.)
However, the "Central Dogma" has had to be
revised a bit.  It turns out that you CAN go back
from RNA to DNA, and that RNA can also make
copies of itself.  It is still NOT possible to go
from Proteins back to RNA or DNA, and no known
mechanism has yet been demonstrated for proteins
making copies of themselves.
11
Molecular Biology (cont.)
  • HIV is one of a group of atypical viruses called
    retroviruses that maintain their genetic
    information in the form of RNA. Retroviruses are
    capable of producing DNA from RNA.

12
Molecular Biology (Cont.)
13
Molecular Biology (cont.)
  • Ribonucleic acid (RNA) molecule: three main
    categories
  • mRNA (messenger): carries genetic information
    from genes (DNA) to the ribosomes
  • tRNA (transfer): carries amino acids to a
    ribosome (the cell's machinery for making proteins)
  • rRNA (ribosomal): part of the structure of a
    ribosome

14
Molecular Biology (cont.)
  • Other types of RNA molecules
  • snRNA (small nuclear RNA): takes part in splicing
    pre-mRNA in the nucleus
  • miRNA (micro RNA): regulates gene expression by
    binding to target mRNAs
  • iRNA (immune RNA): important for HIV studies

15
RNA Secondary Structure
  • RNA secondary structures are important in many
    biological processes and efficient structure
    prediction can give vital directions for
    experimental investigation.
  • B. Knudsen and J. Hein, Pfold: RNA secondary
    structure prediction using stochastic
    context-free grammars (Nucleic Acids Research,
    2003)
  • There are published examples involving tRNA,
    rRNA, and other types of RNA

16
RNA Secondary Structure (cont.)
  • A ribonucleic acid (RNA) molecule consists of a
    sequence of ribonucleotides (typically
    single-stranded)
  • Each ribonucleotide contains one of four bases:
    adenine (A), cytosine (C), guanine (G), and
    uracil (U)

17
Secondary Structure (cont.)
  • Note: U is replaced by thymine (T) in DNA
  • As the molecule folds, hydrogen bonds join A-U
    and C-G pairs. These are called the Watson-Crick
    pairs. The less stable G-U pair can also form.
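
The pairing rule on this slide can be written as a small lookup. This is a minimal sketch: the function name `can_pair` and the `allow_wobble` flag are illustrative choices, not part of the presentation.

```python
# Watson-Crick pairs, plus the less stable G-U pair mentioned on the slide.
WATSON_CRICK = {frozenset("AU"), frozenset("CG")}
GU_PAIR = {frozenset("GU")}


def can_pair(a: str, b: str, allow_wobble: bool = True) -> bool:
    """Return True if bases a and b may form a base pair."""
    pair = frozenset((a.upper(), b.upper()))
    if pair in WATSON_CRICK:
        return True
    # G-U is accepted only when the caller opts in.
    return allow_wobble and pair in GU_PAIR
```

Setting `allow_wobble=False` restricts the check to the two Watson-Crick pairs.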

18
Secondary Structure (cont.)
  • Primary Structure: the linear sequence of bases
    in an RNA molecule
  • Secondary Structure: the folding or coiling of
    the sequence due to bonded nucleotide pairs A-U,
    G-C
  • Tertiary Structure: the three-dimensional
    configuration of an RNA molecule

19
Primary RNA Sequence
  • CAGCAUCACAUCCGCGGGGUAAACGCU
  • Nucleotide length: 27 bases

20
Geometric Representation
  • Secondary structure is a graph defined on a set
    of n labeled points
  • (M.S. Waterman, 1978)
  • Biological
  • Combinatorial/Graph Theoretic
  • Random Walk
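
The graph view can be made concrete with a validity check. The sketch below assumes the conditions usually attached to Waterman's definition (each point in at most one base pair, no pair between backbone neighbors, no crossing pairs); the slide itself states only that the structure is a graph on n labeled points, and the function name is hypothetical.

```python
def is_secondary_structure(n: int, pairs: set) -> bool:
    """Check the graph conditions for a secondary structure on points 1..n.

    pairs holds the non-backbone edges as tuples (i, j) with i < j.
    """
    seen = set()
    for i, j in pairs:
        # Endpoints must be valid labels; neighbors i, i+1 are already
        # joined by the backbone, so they cannot also be base paired.
        if not (1 <= i < j <= n) or j - i < 2:
            return False
        # Each point takes part in at most one base pair.
        if i in seen or j in seen:
            return False
        seen.update((i, j))
    # No two pairs may cross: i < k < j < l is forbidden.
    for i, j in pairs:
        for k, l in pairs:
            if i < k < j < l:
                return False
    return True
```

For example, `{(1, 5), (2, 4)}` on 5 points is nested and passes, while `{(1, 3), (2, 4)}` crosses and fails.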

21
(No Transcript)
22
RNA COMBINATORICS
  • RNA Numbers: 1, 1, 1, 2, 4, 8, 17, 37, 82, 185, 423, 978, ...
  • These numbers count various combinatorial objects
    including RNA secondary structures of length n.

23
(No Transcript)
24
RNA COMBINATORICS (cont.)
  • The number of RNA secondary structures for the
    sequence [1, n] is counted by the coefficients of
    s(z)
  • Coefficients of the formal power series:
  • (1, 1, 1, 2, 4, 8, 17, 37, 82, 185, 423, 978, ...)

25
RNA COMBINATORICS (cont.)
  • The number of lattice paths of length n with unit
    steps R (right), U (up), and D (down) that start
    at (0,0), remain in the first quadrant of the
    coordinate plane, and return to the x-axis, under
    the restriction that there are never consecutive
    UD steps, is the nth RNA number
  • (1, 1, 1, 2, 4, 8, 17, 37, 82, 185, 423, 978, ...)
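
This lattice-path description can be checked by brute force for small n. The sketch below simply enumerates every word over R, U, D, so it is only practical for short paths; the function name is illustrative.

```python
from itertools import product


def count_paths(n: int) -> int:
    """Count length-n paths over steps R, U, D that start and end on the
    x-axis, never go below it, and never contain U directly followed by D."""
    step_height = {"R": 0, "U": 1, "D": -1}
    count = 0
    for steps in product("RUD", repeat=n):
        word = "".join(steps)
        if "UD" in word:
            continue
        height = 0
        for s in word:
            height += step_height[s]
            if height < 0:
                break  # path dipped below the x-axis
        else:
            if height == 0:
                count += 1
    return count


print([count_paths(n) for n in range(9)])
# [1, 1, 1, 2, 4, 8, 17, 37, 82]
```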

26
RNA COMBINATORICS (cont.)
  • The number of RNA sequences of length n that can
    be formed over the alphabet {A, U, G, C} such that
    the letters A and U are not adjacent is equal to
    (formula not transcribed)
  • What a remarkable formula for an integer: when
    n = 1 we get 4, and when n = 2 we get 14.
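
The two values quoted on this slide can be reproduced by direct counting. The sketch below reads "not adjacent" as "neither AU nor UA occurs as consecutive letters", which is an interpretation; it tracks words by their last letter and cross-checks against full enumeration. Function names are illustrative.

```python
from itertools import product


def count_no_au_adjacent(n: int) -> int:
    """Count length-n words over A, U, G, C with no A next to a U."""
    if n == 0:
        return 1
    # Words of length 1, grouped by last letter: A, U, or one of G/C.
    end_a, end_u, end_gc = 1, 1, 2
    for _ in range(n - 1):
        end_a, end_u, end_gc = (
            end_a + end_gc,                # append A after A, G, or C
            end_u + end_gc,                # append U after U, G, or C
            2 * (end_a + end_u + end_gc),  # append G or C after anything
        )
    return end_a + end_u + end_gc


def brute_force(n: int) -> int:
    """Same count by listing all 4**n words; only practical for small n."""
    return sum(
        1
        for w in map("".join, product("AUGC", repeat=n))
        if "AU" not in w and "UA" not in w
    )


print(count_no_au_adjacent(1), count_no_au_adjacent(2))  # 4 14
```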

27
Counting Sequence Database
  • The On-line Encyclopedia of Integer Sequences
    http://www.research.att.com/~njas/sequences/index.html
  • N.J.A. Sloane and S. Plouffe, The Encyclopedia of
    Integer Sequences, Academic Press, 1995.

28
RNA EQUATIONS
  • Recurrence Relations
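
A standard recurrence for the RNA numbers is S(n+1) = S(n) + Σ_{k=1}^{n-1} S(k-1) S(n-k), with S(0) = S(1) = S(2) = 1. Whether the slide used exactly this form is an assumption; the sketch below only shows that this recurrence reproduces the sequence listed in the presentation.

```python
def rna_numbers(count: int) -> list:
    """Return the first `count` RNA numbers S(0), S(1), ..."""
    s = [1, 1, 1]
    while len(s) < count:
        n = len(s) - 1  # the next value to compute is S(n + 1)
        # Point n+1 is either unpaired (S(n) ways) or paired with some
        # point k, splitting the rest into two independent pieces.
        s.append(s[n] + sum(s[k - 1] * s[n - k] for k in range(1, n)))
    return s[:count]


print(rna_numbers(12))
# [1, 1, 1, 2, 4, 8, 17, 37, 82, 185, 423, 978]
```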

29
RNA EQUATIONS (cont.)
  • Generating Function
  • 1, 1, 1, 2, 4, 8, 17, 37, 82, 185, 423, 978, ...
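
For reference, a standard closed form of the generating function for this sequence is given below, together with the quadratic equation it satisfies. The indexing convention s_0 = s_1 = s_2 = 1 is an assumption chosen to match the listed coefficients.

```latex
s(z) = \sum_{n \ge 0} s_n z^n
     = \frac{1 - z + z^2 - \sqrt{1 - 2z - z^2 - 2z^3 + z^4}}{2z^2},
\qquad
z^2 s(z)^2 - (1 - z + z^2)\, s(z) + 1 = 0 .
```

Expanding the quadratic equation coefficient by coefficient gives back 1, 1, 1, 2, 4, 8, 17, ...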

30
RNA EQUATIONS (cont.)
  • Exact Formula

31
RNA EQUATIONS (cont.)
  • s(n,k) is the number of structures of length n
    with exactly k base pairs. For n, k > 0,
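
One way to compute s(n,k) is to split on whether the last point is paired. This is a sketch of that standard decomposition and may differ in form from the formula on the slide; it assumes a base pair needs at least one unpaired point between its ends.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def s(n: int, k: int) -> int:
    """Number of secondary structures on n points with exactly k base pairs."""
    if k < 0:
        return 0
    if n <= 2:
        return 1 if k == 0 else 0
    # Case 1: point n is unpaired.
    total = s(n - 1, k)
    # Case 2: point n pairs with j (1 <= j <= n - 2). The j - 1 points
    # before j and the n - j - 1 points between j and n fold independently.
    for j in range(1, n - 1):
        for i in range(k):
            total += s(j - 1, i) * s(n - j - 1, k - 1 - i)
    return total


print([s(6, k) for k in range(4)])  # [1, 10, 6, 0]
```

Summing a row over k gives the RNA number for that length, for example 1 + 10 + 6 = 17 for n = 6.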

32
RNA EQUATIONS (cont.)
  • Asymptotic Estimate: as n grows without bound
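
The exponential growth rate of the RNA numbers is (3 + √5)/2 ≈ 2.618, the reciprocal of the smallest root of 1 - 3z + z², with a polynomial correction of order n^(-3/2). The sketch below checks the ratio S(n+1)/S(n) numerically; it assumes the standard recurrence for the RNA numbers and does not attempt to reproduce the constant factor in the estimate.

```python
from math import sqrt


def rna_numbers(count: int) -> list:
    """First `count` RNA numbers via the standard recurrence (assumed form)."""
    s = [1, 1, 1]
    while len(s) < count:
        n = len(s) - 1
        s.append(s[n] + sum(s[k - 1] * s[n - k] for k in range(1, n)))
    return s[:count]


growth = (3 + sqrt(5)) / 2
s = rna_numbers(402)
for n in (50, 100, 200, 400):
    # The ratio approaches the growth rate slowly, from below.
    print(n, s[n + 1] / s[n], growth)
```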

33
Random Walk
  • A random walk is a lattice path from one point to
    another such that steps are allowed in a discrete
    number of directions and are of a certain length

34
RNA Walk Type I
  • NSE Walks: unit-step walks starting at the
    origin (0,0) with steps up, down, and right
  • No walk passes below the x-axis and there are no
    consecutive NS steps

35
RNA Walk Type I (cont.)
  • N = (0,1), up
  • S = (0,-1), down
  • E = (1,0), right

36
Type I Walk Array (n x k)
37
RNA Walk Type II
  • NSE Walks: unit-step walks starting at the
    origin (0,0) with steps up, down, and right such
    that no walk passes below the x-axis and there are
    no consecutive SN steps

38
Type II Walk Array (n x k)
39
Examples
  • Type I: ENNESNESSE
  • Type II: NEEENSEEES
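
Both example words can be checked against the walk definitions. This is a minimal sketch; the helper names are illustrative, and the forbidden pair is passed as "NS" for Type I or "SN" for Type II.

```python
STEP_HEIGHT = {"N": 1, "S": -1, "E": 0}


def is_nse_walk(walk: str, forbidden: str) -> bool:
    """True if the walk never passes below the x-axis and never contains
    the two consecutive steps given in `forbidden`."""
    if forbidden in walk:
        return False
    height = 0
    for step in walk:
        height += STEP_HEIGHT[step]
        if height < 0:
            return False
    return True


def final_height(walk: str) -> int:
    """Height at which the walk ends."""
    return sum(STEP_HEIGHT[step] for step in walk)


print(is_nse_walk("ENNESNESSE", "NS"), final_height("ENNESNESSE"))  # True 0
print(is_nse_walk("NEEENSEEES", "SN"), final_height("NEEENSEEES"))  # True 0
```

Each example is valid only for its own type: the first contains SN and the second contains NS.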

40
RNA Walk Bijection
  • Theorem: There is a bijection between the set of
    NSE walks of length n + 1 ending at height k = 0
    and the set of NSE walks of length n ending at
    height k = 0.
  • Source: Lattice paths, generating functions, and
    the Riordan group, Ph.D. Thesis, Howard
    University, Washington, DC, 1997
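
A bijection implies equal counts, which can be checked numerically for small n. The sketch below reads the theorem as relating Type I walks (no consecutive NS) of length n + 1 to Type II walks (no consecutive SN) of length n; that reading is an assumption, and the check compares counts only rather than constructing the bijection.

```python
from itertools import product


def count_walks(n: int, forbidden: str) -> int:
    """Count NSE walks of length n that end at height 0, never pass below
    the x-axis, and avoid the consecutive pair `forbidden`."""
    delta = {"N": 1, "S": -1, "E": 0}
    total = 0
    for steps in product("NSE", repeat=n):
        word = "".join(steps)
        if forbidden in word:
            continue
        height = 0
        for s in word:
            height += delta[s]
            if height < 0:
                break
        else:
            if height == 0:
                total += 1
    return total


for n in range(8):
    # Type I at length n + 1 against Type II at length n.
    print(n, count_walks(n + 1, "NS"), count_walks(n, "SN"))
```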

41
Matrices Count Lattice Walks
  • Type I Walks
       1   0   0   0   0   0   0  ...
       1   1   0   0   0   0   0  ...
       1   2   1   0   0   0   0  ...
       2   3   3   1   0   0   0  ...
       4   6   6   4   1   0   0  ...
       8  13  13  10   5   1   0  ...
      17  28  30  24  15   6   1  ...
     ...
  • Type II Walks
       1   0   0   0   0   0   0  ...
       1   1   0   0   0   0   0  ...
       2   2   1   0   0   0   0  ...
       4   4   3   1   0   0   0  ...
       8   9   7   4   1   0   0  ...
      17  20  17  11   5   1   0  ...
      37  45  41  29  16   6   1  ...
     ...

The (i, j) entry is the number of walks of
length i ending at height j (rows and columns indexed from 0).
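
Arrays of this kind can be generated by a short dynamic program that remembers the last step taken, so the forbidden pair can be excluded. This is a sketch under the walk definitions given earlier (steps N, S, E, never below the x-axis); the function name is illustrative.

```python
def walk_array(rows: int, forbidden: str) -> list:
    """Row i, column j: number of NSE walks of length i ending at height j.

    forbidden is "NS" for Type I walks or "SN" for Type II walks.
    """
    first, second = forbidden  # the banned pair of consecutive steps
    delta = {"N": 1, "S": -1, "E": 0}
    # counts[(height, last_step)] = number of walks; "" marks the empty walk.
    counts = {(0, ""): 1}
    table = []
    for i in range(rows):
        row = [0] * (i + 1)
        for (h, _), c in counts.items():
            row[h] += c
        table.append(row)
        nxt = {}
        for (h, last), c in counts.items():
            for step in "NSE":
                if last == first and step == second:
                    continue  # would create the forbidden pair
                nh = h + delta[step]
                if nh < 0:
                    continue  # walk would pass below the x-axis
                nxt[(nh, step)] = nxt.get((nh, step), 0) + c
        counts = nxt
    return table


for row in walk_array(7, "NS"):
    print(row)
```

Calling `walk_array(7, "SN")` produces the Type II array in the same way.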
42
Type I Formation Rule (Recurrence)
43
The Connection Between RNA and the Walks
  • Theorem: There is a bijection between the set of
    RNA secondary structures of length n and the set
    of NSE walks ending at height k = 0.
  • Source: Lattice paths and RNA secondary
    structures, DIMACS Series in Discrete Math. and
    Theoretical Computer Science 34 (1997) 137-147.
    (CAARMS2 Proceedings)
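
One natural way to realize such a correspondence is to read a structure left to right and write N where a base pair opens, S where it closes, and E for an unpaired point. This is a sketch of that idea and not necessarily the construction used in the cited paper; the function names are illustrative. A pair between neighbors i and i + 1 would produce consecutive NS steps, so valid structures map to Type I walks ending at height 0.

```python
def structure_to_walk(n: int, pairs: set) -> str:
    """Map base pairs (i, j), i < j, on points 1..n to an NSE walk of length n."""
    walk = ["E"] * n        # unpaired points become E steps
    for i, j in pairs:
        walk[i - 1] = "N"   # opening point of a pair
        walk[j - 1] = "S"   # closing point of a pair
    return "".join(walk)


def walk_to_structure(walk: str) -> set:
    """Inverse map: match each S with the most recent unmatched N."""
    stack, pairs = [], set()
    for pos, step in enumerate(walk, start=1):
        if step == "N":
            stack.append(pos)
        elif step == "S":
            pairs.add((stack.pop(), pos))
    return pairs


walk = structure_to_walk(6, {(1, 5), (2, 4)})
print(walk)                     # NNESSE
print(walk_to_structure(walk))
```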

44
(No Transcript)
45
HIV-1 RNA Sequence Prediction
  • We want to construct a lattice walk method to
    predict secondary RNA sequences that code for
    regions of the SL2 and SL3 domains within the
    HIV-1 5' UTR RNA molecule.
  • These domains are important for HIV genomic
    packaging.

46
HIV-1 RNA Structural Components
47
Components of Secondary Structure
  • Base pairs
  • Bulges
  • Interior Loops
  • End loops
  • Hairpin
  • Multibranch loops: junctions where more than one
    hairpin or more complex secondary structure is
    appended.

48
HIV-1 Sequence (SL2 and SL3)
  • The following sequence was obtained from the NCBI
    website. The first 363 nucleotides were
    extracted from the entire HIV-1 RNA genomic
    sequence:
  • GGUCUCUCUGGUUAGACCAGAUCUGAGCCUGGGAGCUCUCUGGCUAACUA
    GGGAACCCACUGCUUAAGCCUCAAUAAAGCUUGCCUUGAGUGCUUCAAGU
    AGUGUGUGCCCGUCUGUUGUGUGACUCUGGUAACUAGAGAUCCCUCAGAC
    CCUUUUAGUCAGUGUGGAAAAUCUCUAGCAGUGGCGCCCGAACAGGGACC
    UGAAAGCGAAAGGGAAACCAGAGGAGCUCUCUCGACGCAGGACUCGGCUU
    GCUGAAGCGCGCACGGCAAGAGGCGAGGGGCGGCGACUGGUGAGUACGCC
    AAAAAUUUUGACUAGCGGAGGCUAGAAGGAGAGAGAUGGGUGCGAGAGCG
    UCAGUAUUAAGCG
  • Color key:
  • SL2: yellow
  • SL3: red

49
Known Sequence of the SL2 Domain
50
Lattice Walk Model
  • Start with an RNA primary sequence
  • Perform RNA combinatorial analysis on the given
    sequence
  • Connect lattice walks to the given sequence using
    Type I and II walks
  • Evaluate the identified candidates to find the
    minimum free energy
  • Predict the secondary structure
  • Conduct laboratory experiments for biological
    functionality
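
The energy step of this outline needs thermodynamic parameters that the slides do not give. As a stand-in, the sketch below uses Nussinov-style base-pair maximization, a different and much cruder objective than minimum free energy, to show the shape of such a dynamic program. The function name, the `min_loop` default of 3 unpaired bases per hairpin, and the inclusion of G-U pairs are all assumptions.

```python
def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq, with at least
    min_loop unpaired bases inside every hairpin."""
    allowed = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"),
               ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = best score for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # base j left unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    # j pairs with k: left part plus the enclosed part.
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# The 27-base primary sequence shown earlier in the presentation.
print(max_base_pairs("CAGCAUCACAUCCGCGGGGUAAACGCU"))
```

Replacing the "+ 1" per pair with energy contributions for stacks and loops is what turns this kind of recursion into a free-energy minimization.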

51
Acknowledgments
  • National Science Foundation, DIMACS, AIMS,
    Burroughs Wellcome, SACEMA, WITS
  • MATH. Modeling 561, Graduate Students
  • Collaborators: Dwayne Hill, Biology Dept., MSU,
    and Alvin Kennedy, Chemistry Dept., MSU