RNA surveillance and degradation: the Yin Yang of RNA - PowerPoint PPT Presentation

Slides: 23
Provided by: NewU311
Learn more at: http://www.mscs.mu.edu
Transcript and Presenter's Notes

1
RNA surveillance and degradation: the Yin Yang of RNA
[Diagram: RNA Pol II (production), polyadenylated RNA (AAAAAAAAAAA), ribosome, destruction (AAA)]
2
MODEL
[Diagram: Mtr4 and Trf4p; polyadenylation of hypomodified tRNAiMet by Trf4p (AAAAA); exosome subunits Rrp46p, Rrp43p, Csl4p, Rrp45p, Rrp42p, Rrp44p, Mtr3p, Rrp41p, Rrp4p, Rrp40p, Rrp6p; degradation of hypomodified tRNAiMet]
- Hypothetical diagram of the exosome
3
Workflow
4
Next-gen sequencing: PolyA-Seq
[Diagram: TRAMP complex (Papd5, Mtr4, ZCCHC7) with polyadenylated RNAs (AAAA); siRNA knockdown]
5
Library creation for NGS
6
Map paired end reads to genome
  • BWA (Burrows-Wheeler Aligner) algorithm used to
    map each pair of reads to the genome
  • Each pair of reads is reported as a single-nucleotide
    position within the genome where polyadenylation
    was detected in an RNA sample
  • Average insert size: 300
  • Read size: 45

7
Raw reads vs Mapped reads
Data type/kd type   Raw reads    Mapped reads   Positions
Replicate data
  Mtr4              15,135,078   10,853,534       651,551
  Ctrl              16,348,780   11,708,310       652,128
  Rrp6              15,971,926   12,388,266       705,173
Original data
  Mtr4              ND           34,204,534     1,124,968
  Ctrl              ND            7,195,942       582,256
  Rrp6              ND            8,241,505       597,672
Normalization of data: reads per million (rpm)
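
As a rough illustration of the normalization named on this slide, the sketch below takes the replicate-data totals listed above and computes the fraction of raw reads that mapped, plus a reads-per-million conversion. The dictionary layout and function names are assumptions made for this example. Scaling by mapped reads rather than raw reads is also an assumption, since the slides only say "total reads in sample".

```python
# Replicate-data counts copied from the slide: (raw reads, mapped reads).
REPLICATE = {
    "Mtr4": (15_135_078, 10_853_534),
    "Ctrl": (16_348_780, 11_708_310),
    "Rrp6": (15_971_926, 12_388_266),
}

def mapped_fraction(raw, mapped):
    """Fraction of raw reads that mapped to the genome."""
    return mapped / raw

def rpm(count, total_mapped):
    """Reads per million: a read count scaled by the sample's mapped total.

    Using the mapped total as the denominator is an assumption.
    """
    return count * 1_000_000 / total_mapped

for sample, (raw, mapped) in REPLICATE.items():
    print(f"{sample}: {mapped_fraction(raw, mapped):.1%} of raw reads mapped; "
          f"100 reads at one position = {rpm(100, mapped):.2f} rpm")
```

Because each sample is divided by its own total, the same raw count at a position gives a slightly different rpm value in each knockdown.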
8
Analysis
  • Start with the RefSeq database
  • Raw read counts converted to reads per million
  • Reads at position / total reads in sample
  • Remove all non-coding RNAs
  • From each sample, collect normalized reads mapping
    at the 3′ end ± 50 bases of each RefSeq entry
    encoding a protein
  • Dot plot of normalized reads on log scale: X
    axis = control, Y axis = mMtr4 KD
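
The collection step in the list above (normalized reads within 50 bases of each 3′ end) could be sketched as follows. The positions, rpm values, gene names and 3′ end coordinates are all hypothetical, and strand handling is left out for brevity.

```python
from bisect import bisect_left, bisect_right

def window_sum(positions, values, center, flank=50):
    """Sum values whose position lies in [center - flank, center + flank].

    positions must be sorted ascending; values[i] belongs to positions[i].
    """
    lo = bisect_left(positions, center - flank)
    hi = bisect_right(positions, center + flank)
    return sum(values[lo:hi])

# Hypothetical normalized (rpm) pA signal on one chromosome strand.
positions = [100, 940, 1010, 1049, 1051, 5000]
rpm_values = [0.5, 1.0, 2.0, 0.25, 4.0, 3.0]

# Hypothetical 3' ends of two protein-coding RefSeq transcripts.
three_prime_ends = {"geneA": 1000, "geneB": 5030}

per_gene = {name: window_sum(positions, rpm_values, end)
            for name, end in three_prime_ends.items()}
print(per_gene)
```

Running the same collection on the control and the knockdown sample gives one (x, y) pair per gene for the dot plot.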

9
mRNA polyadenylation does not change between Mtr4 and control KD
R² = 0.95141
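
One common way to obtain an R² like the one on this slide is the squared Pearson correlation of the log-scaled values. The sketch below assumes that approach; the slides do not say how the value was computed. The rpm values and the pseudocount are hypothetical.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Hypothetical per-gene rpm values (control KD vs. Mtr4 KD).
control = [1.0, 10.0, 100.0, 1000.0, 50.0]
mtr4_kd = [1.2, 9.0, 120.0, 900.0, 45.0]

# A pseudocount avoids log(0) for genes with no reads; its value is an assumption.
PSEUDO = 0.01
log_ctrl = [math.log10(v + PSEUDO) for v in control]
log_kd = [math.log10(v + PSEUDO) for v in mtr4_kd]
print(round(r_squared(log_ctrl, log_kd), 4))
```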
10
Problems encountered
  • Sequencing read depth very different in the
    original data
  • 34 million mapped reads in one sample, 8 million in another
  • Lack of 3 replicates for robust statistical
    analysis of data
  • Removal of internal A
  • Sequencing reads that map to an oligoadenylate tract in
    the genome
  • The algorithm developed misses many
  • Manual removal takes too much time

11
Remove Internal A
[Diagram: an AAAAAAAA stretch aligned against TTTTTTTTT sequences]
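
A simple internal-A filter can be sketched as a check of the genomic sequence next to each called pA site. This is not the algorithm the presenters developed, which the slides do not describe. The window length, the A-count cutoff and the function name are assumptions chosen for illustration.

```python
def is_internal_a(genome_seq, site, strand="+", window=10, min_a=7):
    """Flag a putative pA site that sits next to a genomic A-rich stretch.

    genome_seq: chromosome sequence as a string (0-based indexing).
    site: 0-based position of the last templated base before the A tail.
    For '+' strand transcripts the bases downstream of the site are checked
    for A; for '-' strand transcripts the bases before the site in genome
    coordinates are checked for T (the complement).
    """
    if strand == "+":
        flank = genome_seq[site + 1: site + 1 + window]
        base = "A"
    else:
        flank = genome_seq[max(0, site - window): site]
        base = "T"
    return flank.upper().count(base) >= min_a

# Hypothetical sequence with a genomic A tract after position 9.
genome = "GATTACAGGC" + "AAAAAAAAAA" + "CGTACGTAGC"
print(is_internal_a(genome, 9))   # site directly upstream of the A tract
print(is_internal_a(genome, 2))   # site with a mixed downstream sequence
```

A fixed cutoff like this will miss interrupted A tracts, which is one way an automated filter can fall short of manual inspection.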
12
How to mine the data based on a hypothesis
  • Hypothesis: polyA RNAs of unknown identity will
    accumulate upon depletion of mMtr4 vs. the
    control.
  • How can the transcriptome be queried?
  • How detailed should a query be?
  • Every pA position, or only those exhibiting
    greater than x number of raw/normalized reads?
  • How do we find significant differences with one
    sample, or possibly two?
  • How can repetitive elements be accounted for in
    the data?
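
One way to make the "greater than x reads" question concrete is to keep only pA positions above a minimum rpm in the knockdown and rank them by fold change over the control. The sketch below is only a filter under assumed cutoffs (`min_rpm`, `min_log2_fc`, `pseudo`). It does not answer the significance question raised above, and the site keys are hypothetical.

```python
import math

def candidate_sites(kd, ctrl, min_rpm=1.0, min_log2_fc=1.0, pseudo=0.1):
    """Return (site, log2 fold change) for sites enriched in the knockdown.

    kd and ctrl map a site key, e.g. (chrom, strand, position), to rpm.
    """
    hits = []
    for site, kd_rpm in kd.items():
        if kd_rpm < min_rpm:
            continue  # too few reads to consider
        ctrl_rpm = ctrl.get(site, 0.0)
        log2_fc = math.log2((kd_rpm + pseudo) / (ctrl_rpm + pseudo))
        if log2_fc >= min_log2_fc:
            hits.append((site, log2_fc))
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Hypothetical rpm values at three pA positions.
kd = {("chr12", "+", 100): 8.0, ("chr12", "+", 200): 0.5, ("chr12", "-", 300): 4.0}
ctrl = {("chr12", "+", 100): 1.0, ("chr12", "-", 300): 4.0}
hits = candidate_sites(kd, ctrl)
print(hits)
```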

13
Custom annotation to remove bias from existing
annotations
  • Data mapped with Bowtie to mouse genome mm10
    build
  • Mapped data from KD and control compared using
    Cufflinks to explore gene expression differences
    using a custom annotation
  • Custom annotation: 1000 base pair genes, each with a
    500 base pair overlap with the next gene
  • This did not work well
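
The tiling described above (1000 base pair pseudo-genes overlapping the next by 500 base pairs) can be generated along these lines. The tile names, the "custom" source label and the single "+" strand are assumptions for this sketch; the presenters' actual annotation file is not shown in the slides.

```python
def tile_chromosome(chrom, length, size=1000, step=500):
    """Yield (chrom, start, end, name) for overlapping pseudo-genes.

    Coordinates are 1-based and inclusive, as in GTF. The final tile is
    clipped at the chromosome end.
    """
    index = 0
    for start in range(1, length + 1, step):
        end = min(start + size - 1, length)
        index += 1
        yield chrom, start, end, f"{chrom}_tile{index}"
        if end == length:
            break

def gtf_line(chrom, start, end, name, strand="+"):
    """Format one tile as a 9-column GTF exon record."""
    attrs = f'gene_id "{name}"; transcript_id "{name}";'
    return "\t".join([chrom, "custom", "exon", str(start), str(end),
                      ".", strand, ".", attrs])

# Small hypothetical chromosome so the output stays readable.
tiles = list(tile_chromosome("chr12", 2500))
for tile in tiles:
    print(gtf_line(*tile))
```

With a 500 base step the number of tiles grows as roughly the chromosome length divided by 500, which is how a single chromosome ends up with a very large number of pseudo-genes.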

14
Problems with using custom annotation
  • The first real problem was that no computer could
    handle more than 5000 genes of the custom
    annotation at a time
  • One chromosome had 147K genes
  • There was a problem with assignment when the
    reads overlapped
  • Cuffdiff would randomly assign the reads to only
    one of the genes.
  • Overlaps split into two fasta files, but we could
    not capture differences in the data that we knew
    exist.
  • Cuffdiff collects data from the entire 1000 bp
    gene and compares between 2 samples
  • This method leads to false negatives for pA data
    where the focus is on one or a few positions as a
    pA event.

15
What next?
16
F-Seq
  • Uses tags to identify specific sequence features for
    different library preparations (ChIP-seq,
    DNase-seq and pA-seq).
  • Summarizes and displays individual sequence
    data as an accurate and interpretable signal by
    generating a continuous tag sequence density
    estimation.

17
Generating Peaks with FSeq
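  • 1. Estimate the kernel density to estimate the pdf
  • 2. Compute the threshold
  • Average number of features per window: n_w = nw/L
  • Evaluate the density at a fixed point x_c
  • Repeat step 2 k times
  • Threshold: s SDs above the mean
  • 2.1 The threshold output module is modifiable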
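
The two steps above can be sketched with a Gaussian kernel and a simulated background. This is an illustration of the idea, not F-Seq's own code. It assumes n is the total number of tags, w the window width and L the chromosome length. The bandwidth, k, s, the seed and the example tag positions are all hypothetical.

```python
import math
import random
import statistics

def kde_at(x, points, bandwidth, total):
    """Gaussian kernel density at x, normalized by the total tag count."""
    scale = total * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / scale

def density_threshold(n, w, L, bandwidth, k=200, s=4.0, seed=0):
    """Simulate k background windows and return mean + s * SD of the density."""
    rng = random.Random(seed)
    n_w = max(1, round(n * w / L))   # average number of tags per window
    x_c = w / 2                      # fixed point inside the window
    sims = []
    for _ in range(k):
        background = [rng.uniform(0, w) for _ in range(n_w)]
        sims.append(kde_at(x_c, background, bandwidth, total=n))
    return statistics.mean(sims) + s * statistics.stdev(sims)

# Hypothetical numbers: 10,000 tags on a 1 Mb chromosome, 1 kb window.
threshold = density_threshold(n=10_000, w=1_000, L=1_000_000, bandwidth=50)
cluster = [480, 490, 495, 500, 500, 505, 510, 520] * 5   # 40 tags near position 500
peak_density = kde_at(500, cluster, bandwidth=50, total=10_000)
print(peak_density > threshold)
```

Positions whose estimated density exceeds the threshold would be reported as peaks; raising s makes the call more conservative.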

18
Magnitude of data: one sample, both strands
51 million bases of Chromosome 12
12 thousand bases of Chromosome 12
Chromosome 12 is 121 million base pairs long
19
rRNA workflow
20
pA reads intersecting 45S pre-rRNA
18S
28S
5.8S
21
pA reads intersecting 45S pre-rRNA
18S
5.8S
28S
22
Accumulation of microRNA processed 5′ leader
upon depletion of Mtr4
  • Comparison of Mtr4 vs. control KD
  • Abundant polyA found near the 5′ end of annotated
    Mir322
  • Confirmed using a molecular technique