Genomics of microRNA genes and their target mRNAs

Provided by: Nsm50
1
Genomics of microRNA genes and their target mRNAs
  • Jan. 23, 2006

2
Readings
  • Bentwich (review for background)
  • Smalheiser and Torvik (experimental, for
    detailed analysis)
  • Hill et al. (points to the future)

3
(No Transcript)
4
Where are miR genes?
  • Intergenic: tendency to cluster
  • Primary transcripts are polycistronic
  • Promoter, (splicing), 5'-cap, poly-A tail
  • Intronic: most are in sense orientation
  • may also be in clusters
  • At least some are transcribed by pol II
  • No promoter needed?
  • Relation to genes they regulate? Chimeric
    transcripts, antisense orientation, origin from
    pseudogenes or inverted duplication of target
    mRNAs
  • miRs in viral genomes too

5
(No Transcript)
6
Rules for predicting hairpins as being miR
precursors
  • Stem-loop, stem is over 23 bases
  • LOTS of hairpins all over the genome
  • predicted loop often smaller than actual size
  • cross-species conservation and pattern of change
    over evolution (loop vs. arm vs. arm), drop of
    conservation just outside the hairpin
  • nucleotide composition, melting temperature,
    symmetrical bubbles
  • Clustering (better signal-to-noise ratio)
  • validation using small RNA cloning
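
The stem-length rule above can be tried out on toy sequences. The following is a minimal sketch, not the method used in any published miR gene finder: it counts only uninterrupted base pairs (Watson-Crick plus GU) around a loop, whereas real precursors contain bubbles and mismatches. The function names, the loop-size limits, and the choice to allow GU pairs are assumptions made for illustration.

```python
# Sketch only: counts uninterrupted base pairs around a loop.
# Real miR precursors contain bubbles and mismatches, which this ignores.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_length(seq, i, j):
    """Count consecutive pairs walking outward from i (leftwards) and j (rightwards)."""
    n = 0
    while i >= 0 and j < len(seq) and (seq[i], seq[j]) in PAIRS:
        n += 1
        i -= 1
        j += 1
    return n


def best_hairpin(seq, min_loop=3, max_loop=20):
    """Return (stem_len, loop_start, loop_end) for the longest uninterrupted stem.

    min_loop and max_loop are illustrative limits, not values from the slides.
    """
    best = (0, None, None)
    for loop_start in range(1, len(seq)):
        for loop_len in range(min_loop, max_loop + 1):
            loop_end = loop_start + loop_len  # index of the first base after the loop
            if loop_end >= len(seq):
                break
            n = stem_length(seq, loop_start - 1, loop_end)
            if n > best[0]:
                best = (n, loop_start, loop_end)
    return best


def passes_stem_rule(seq, min_stem=24):
    """Apply the 'stem is over 23 bases' rule to an RNA string (A, C, G, U)."""
    return best_hairpin(seq)[0] >= min_stem
```

As the slide notes, a filter like this passes LOTS of genomic hairpins, which is why conservation, clustering and cloning evidence are layered on top.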

7
Machine Learning
  • Training sets: positive and negative examples
  • Goal: recognize patterns, summarize features
  • Features to be encoded (may be redundant,
    nonlinear or non-independent)
  • Algorithm for learning or deciding something
    about the testing set. How to weight each
    feature, how to combine features?
  • Optimization criterion (when to stop analyzing??)
  • Form clusters, set cut-off values, optimize
    multiple parameters into a single ranked value,
    predict groups
  • Process is only as robust as the training sets!

8
Where do miR genes come from originally?
  • some come from inverted target duplications,
    genomic repeats, pseudogenes; these give
    automatic targets too in some cases
  • Very large pool of candidate hairpins!!
  • could evolve by change in hairpin sequences that
    allow Drosha or Dicer to cut it
  • or could acquire promoter to transcribe the
    hairpin

9
Computational Prediction of miR targets (in
Animals)
  • Machine learning with positive and negative
    examples
  • Prediction confidence vs. biological reality
  • Improve signal-to-noise ratio:
  • cross-species conservation
  • Binding energy of miR to target sequence
  • Multiple hits to same target
  • Multiple hits from different miRs on same target
  • Restrict attention to 3'-UTR
  • 5' seeds, 6-8 bases, with or without GU matches
    allowed
  • Mismatch between miR and target in central region
  • Validation by cloning, expression, genetics
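
The 5' seed criterion can be sketched as a scan along a 3'-UTR. This is a hedged illustration rather than any particular target-prediction program: the function name, the default of a 7-base seed starting at miR position 2, and the `allow_gu` switch are assumptions. The example sequences are the miR-28 and E2F6 strings from the miR-28 slide later in this deck, rewritten as RNA in the 5' to 3' direction.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(mir_base, target_base, allow_gu=False):
    """True if the two bases pair (Watson-Crick, optionally GU wobble)."""
    pair = (mir_base, target_base)
    return pair in WATSON_CRICK or (allow_gu and pair in WOBBLE)


def seed_sites(mirna, utr, seed_len=7, start=1, allow_gu=False):
    """Return 0-based UTR positions where the miR seed can pair.

    mirna and utr are RNA strings written 5'->3'. start=1 means the seed
    begins at miR position 2 (an assumed convention, not from the slides).
    """
    seed = mirna[start:start + seed_len]
    hits = []
    for pos in range(len(utr) - seed_len + 1):
        window = utr[pos:pos + seed_len]
        # Antiparallel pairing: seed base k faces window base seed_len-1-k.
        if all(can_pair(seed[k], window[seed_len - 1 - k], allow_gu)
               for k in range(seed_len)):
            hits.append(pos)
    return hits


# Sequences taken from the miR-28 / E2F6 slide, converted to RNA 5'->3'.
MIR28 = "AAGGAGCUCACAGUCUAUUGAG"
E2F6_SITE = "UUUACCAUUAGACUGUGAGCUCCUU"
strict_hits = seed_sites(MIR28, E2F6_SITE)
```

Setting `allow_gu=True` can only add sites, which is one reason seed-only predictions need the extra signal-to-noise filters listed above.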

10
What is the purpose of miR-target interaction?
  • Turn OFF genes that should not be expressed at a
    given time or cell type?
  • Lim paper and other recent papers showing
    OPPOSITE pattern of expression of miR and target
    mRNAs
  • Or the opposite, modulate levels up and down
    dynamically?
  • Examples in plants
  • miR-214 hits all five synaptic targets.
  • miR-134 in brain regulates a brain target, LIMK1
  • Combinatorial miRs: targets with
    CO-expression.

11
miRs are constrained by targets and vice versa
  • miRs often NOT co-expressed with target mRNAs (in
    time or tissue type)
  • housekeeping genes have short 3'-UTRs to avoid
    regulation by miRs (anti-targets)
  • so, miRs constrain target sequences
  • conversely, targets constrain miR seed sequences
    too!!!
  • (many different miRs that are otherwise unrelated
    share the same seed sequence)

12
Evolution of miR genes
  • duplication and divergence
  • keep one miR conserved (paired with target mRNA)
  • but the others may drift in terms of the pattern
    of their expression
  • or the sequence of their target mRNAs

13
(No Transcript)
14
Our Study of Human microRNA targets
(BMC Bioinformatics 2004)
  • No training set, no heuristics required.
  • Compare the characteristics of the ENTIRE SET of
    miRs vs. scrambled counterparts against ENTIRE
    human RefSeq mRNAs, look for statistical outliers
  • Identify outlying tails for exact hit length,
    gapped-BLAST score, proximity of hits from
    distinct microRNAs on same target
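
The comparison of real miRs against scrambled counterparts can be sketched as follows. This is an illustration of the idea only, not the code from the BMC Bioinformatics study: the function names, the single-shuffle control and the fixed random seed are assumptions, and only the "exact hit length" statistic is covered (not gapped-BLAST score or hit proximity).

```python
import random

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna):
    """Reverse complement of an RNA string (A, C, G, U)."""
    return "".join(COMPLEMENT[b] for b in reversed(rna))


def longest_exact_hit(mirna, mrna):
    """Length of the longest stretch of the miR with exact Watson-Crick
    complementarity (no GU pairs) somewhere in the mRNA."""
    site = revcomp(mirna)  # what a perfectly complementary mRNA would contain
    best = 0
    for i in range(len(site)):
        # Only try lengths that would beat the current best.
        for j in range(i + best + 1, len(site) + 1):
            if site[i:j] in mrna:
                best = j - i
            else:
                break
    return best


def scramble(seq, rng):
    """Shuffle a sequence, preserving its base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def hits_at_least(mirnas, mrnas, x):
    """Number of (miR, mRNA) pairs whose longest exact hit is x or greater."""
    return sum(1 for m in mirnas for t in mrnas if longest_exact_hit(m, t) >= x)


def compare_to_scrambled(mirnas, mrnas, x, seed=0):
    """Return (real count, scrambled count) at exact hit length >= x."""
    rng = random.Random(seed)  # fixed seed is an arbitrary choice
    scrambled = [scramble(m, rng) for m in mirnas]
    return hits_at_least(mirnas, mrnas, x), hits_at_least(scrambled, mrnas, x)
```

Sweeping `x` and looking for lengths where the real count clearly exceeds the scrambled count is the "statistical outlier" idea in miniature; a real analysis would use many shuffles and the full RefSeq set.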

15
[Figure: number of hits of exact hit length x or greater, plotted against exact hit length x, for microRNA sequences (nonredundant set) vs. scrambled sequences]
16
(No Transcript)
17
(No Transcript)
18
[Figure: number of microRNA sequences (y-axis, 0 to 35) plotted against number of distinct targets hit (x-axis, 2 to 16), for microRNA sequences (nonredundant set) vs. scrambled sequences]
19
Take-Home Lessons for Human Long Seed Targets
  • Longer exact hit length, better overall match,
    closer multiple hits all predict true targets
    (unlikely to arise by chance)
  • >10 bases, up to perfect and near-perfect
    complementarity, even in mammals
  • NO preference for 3'-UTR on target
  • NO preference for 5' end of miR to be a perfect
    6-base match

20
perfect microRNA-target interactions
  • A few examples of microRNAs exhibiting perfect
    complementarity have been described (miR-196,
    miR-127 and miR-136).
  • During our analysis of long-seed interactions we
    were struck by the existence of targets that had
    perfect complementarity to microRNAs along their
    full length.
  • Whereas long-seeds are defined as having exact
    Watson-Crick base pairing with no GU matches,
    recent data suggest that complementarity
    interactions that contain up to, say, 5-7 GU
    matches (but no frank mismatches) could still be
    deemed perfect in gene silencing.
  • In particular, we noted that human miR-95 was
    perfectly complementary (including up to 4 GU
    matches) with scores of human mRNAs and ESTs
    (fig. 6). Similarly, miR-151 was perfectly
    complementary to 6 transcripts, which was
    significantly above the level expected by chance.
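
The relaxed definition of "perfect" (GU matches allowed, no frank mismatches) can be written as a small classifier. This is a sketch under stated assumptions: the function names are hypothetical, and the miR-151 sequence is taken to be the reverse complement of the 22-mer shown on the "miR-151 perfect hits" slide. The two target sites are the C4.4A and MPL sequences from that slide.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def to_rna(seq):
    """Upper-case a sequence and write T as U."""
    return seq.upper().replace("T", "U")


def revcomp(seq):
    """Reverse complement, returned as RNA."""
    return "".join(COMPLEMENT[b] for b in reversed(to_rna(seq)))


def classify_pairing(mirna, site):
    """Count (Watson-Crick, GU, mismatch) positions for an ungapped duplex.

    mirna and site are equal-length strings written 5'->3'; the miR is
    paired antiparallel to the site.
    """
    mirna, site = to_rna(mirna), to_rna(site)
    if len(mirna) != len(site):
        raise ValueError("sketch handles ungapped, equal-length duplexes only")
    wc = gu = mismatch = 0
    for pair in zip(mirna, reversed(site)):
        if pair in WATSON_CRICK:
            wc += 1
        elif pair in WOBBLE:
            gu += 1
        else:
            mismatch += 1
    return wc, gu, mismatch


def is_perfect_with_gu(mirna, site, max_gu=7):
    """'Perfect' in the relaxed sense: no mismatches, at most max_gu GU pairs.

    max_gu=7 follows the 'up to, say, 5-7 GU matches' wording on the slide.
    """
    _, gu, mismatch = classify_pairing(mirna, site)
    return mismatch == 0 and gu <= max_gu


# Target sites as printed on the "miR-151 perfect hits" slide.
SITE_C44A = "TACTAGACTGTGAGCTCCTCGA"
SITE_MPL = "TACTAGATTGTGAGCTCCTTGA"
MIR151 = revcomp(SITE_C44A)  # assumption: miR taken as reverse complement of the site
```

Under these assumptions the C4.4A site pairs with no GU matches, while the MPL site differs at two positions that both pair as GU.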

21
   SW   perc  perc  perc  query          position in query      matching      repeat         position in repeat
score    div.  del.  ins.  sequence       begin   end  (left)    repeat        class/family    begin   end (left)  ID
  203   10.0   0.0   0.0  hsa-mir-151        2    31    (59)  C L2            LINE/L2          (94)  3178   3149   1
  193   28.0   8.0   0.0  hsa-mir-151        4    78    (12)    L2            LINE/L2          3102  3182   (90)   2
  202   17.6   0.0   0.0  hsa-mir-28         2    35    (51)  C L2            LINE/L2          (93)  3179   3146   3
  236   18.2   0.0   0.0  hsa-mir-28        43    86     (0)    L2            LINE/L2          3137  3180   (92)   3
  392    2.2   0.0   0.0  hsa-mir-321        2    47    (12)    tRNA-Arg-AGG  tRNA               28    73    (3)   4
  194   22.5   0.0   0.0  hsa-mir-325        3    42    (56)    L2            LINE/L2          3220  3259   (13)   5
  219   23.4   0.0   0.0  hsa-mir-325       52    98     (0)  C L2            LINE/L2          (18)  3295   3249   6
  252   10.5   0.0   0.0  hsa-mir-340        9    46    (49)  C MARNA         DNA/Mariner     (346)   240    203   7
  206   15.2   0.0   0.0  hsa-mir-95         2    34    (47)    L2            LINE/L2          3273  3305    (8)   8
  201   17.1   0.0   0.0  hsa-mir-95        47    81     (0)  C L2            LINE/L2           (7)  3306   3272   8
  195    7.1   0.0   0.0  mmu-mir-151        1    28    (40)  C L2            LINE/L2          (96)  3176   3149   1
  193   20.6   0.0   0.0  mmu-mir-28         2    35    (51)  C L2            LINE/L2          (93)  3179   3146   2
  214   20.4   0.0   0.0  mmu-mir-28        43    86     (0)    L2            LINE/L2          3137  3180   (92)   2
  304   21.9   2.7   0.0  mmu-mir-297-1      1    73     (3)    RMER12        LTR/ERVK          774   848  (474)   3
  185   22.8   0.0   3.4  mmu-mir-297-2      3    61     (3)    (TATATG)n     Simple_repeat       2    58    (0)   4
  392    2.2   0.0   0.0  mmu-mir-321        2    47    (12)    tRNA-Arg-AGG  tRNA               28    73    (3)   5
  202   20.0   0.0   0.0  mmu-mir-325        3    42    (56)    L2            LINE/L2          3220  3259   (13)   6
  240   21.3   0.0   0.0  mmu-mir-325       52    98     (0)  C L2            LINE/L2          (18)  3254   3208   6
  252   10.5   0.0   0.0  mmu-mir-340       12    49    (49)  C MARNA         DNA/Mariner     (346)   240    203   7
  402   15.1   0.0   0.0  mmu-mir-341       12    84    (12)    (CGGT)n       Simple_repeat       3    75    (0)   8
  203   10.0   0.0   0.0  rno-mir-151        7    36    (61)  C L2            LINE/L2          (94)  3178   3149   1
  194   10.3   2.4   4.9  rno-mir-151       38    78    (19)    L2            LINE/L2          3180  3219   (94)   2
  236   19.2   0.0   0.0  rno-mir-28        35    86     (0)    L2            LINE/L2          3137  3221   (92)   3
  486    8.8   0.0   0.0  rno-mir-297        1    68     (0)    (TATG)n       Simple_repeat       1    68    (0)   4
  392    2.2   0.0   0.0  rno-mir-321        2    47    (12)    tRNA-Arg-AGG  tRNA               28    73    (3)   5
  202   20.0   0.0   0.0  rno-mir-325        3    42    (56)    L2            LINE/L2          3220  3259   (13)   6
  240   21.3   0.0   0.0  rno-mir-325       52    98     (0)  C L2            LINE/L2          (18)  3254   3208   6
  451   23.1   2.2   0.0  rno-mir-327        4    94     (0)  C RodERV21      LTR/ERV1       (2811)  2180   2088   7
  476    5.0   0.0   0.0  rno-mir-333       36    95     (0)    B2_Rat1       SINE/B2             2    61  (127)   8
  234   13.2   0.0   0.0  rno-mir-340       12    49    (49)  C MARNA         DNA/Mariner     (346)   240    203   9
  402   15.1   0.0   0.0  rno-mir-341       12    84    (12)    (CGGT)n       Simple_repeat       3
(For rows marked C, the match is on the complementary strand and the repeat positions read (left), end, begin.)
22
(No Transcript)
23
<--22 mir-151 <--1    atgatctgacactcgaggagct
<--22 mir-95  <--1    acgagttatttatgggcaactt
<--22 mir-28  <--1    gagttatctgacactcgaggaa
1-->  mir-325 21-->   cctagtaggtgtccagtaagt

L2A consensus, 3188 --> 3305/3314:
tccactagaatgtaagctccatgagggcagggactttgtctgttttgt
tcactgctgtatccccagcgcctagmacagtgcctggcacatagtaggc

L2B consensus, 294 --> 411/419:
cccactggactgtgagctccgcgagggcagggac
tgtgtctgtcttgttcaccactgtatccccagcgcctagcacagtgcctg
gcacatagcaggcgctcagtaaatgtttgttgaa
24
(No Transcript)
25
miR-151 perfect hits
1) NM_014400.1 - Homo sapiens GPI-anchored
   metastasis-associated protein homolog (C4.4A), mRNA
   mir-151     22  TACTAGACTGTGAGCTCCTCGA  1
   hit
   C4.4A     1584  TACTAGACTGTGAGCTCCTCGA  1605/1698  3'UTR
   RepeatMasker predicts an L2 element within the
   mRNA at position 1575-1673.
2) NM_005373.1 - Homo sapiens myeloproliferative
   leukemia virus oncogene (MPL), mRNA
   mir-151     22  TACTAGACTGTGAGCTCCTCGA  1
   hit                    ?           ?
   MPL 5'    3526  TACTAGATTGTGAGCTCCTTGA  3547/3646  3'UTR
   RepeatMasker predicts an L2 element within the
   mRNA at position 3504-3646.
26
miR-28 showed an excellent hit upon transcription
factor E2F6
miR     3'        GAGT--T-ATCTGACACTCGAGGAA  5'   (w/GU matches, BLAST)
Target  5'  2207  TTTACCATTAGACTGTGAGCTCCTT  2231/2342
27
LINE-2 derived miRs
  • The LINE-2 repeat-derived microRNAs appear to
    recognize transcripts that share repeats in their
    3'-UTR regions.
  • When they bind with perfect or near-perfect
    complementarity, they may be expected to lead to
    degradation of the transcripts.
  • This could serve as a mechanism for detecting and
    neutralizing aberrant transcripts (having
    readthrough transcription from retained introns
    or neighboring genomic regions), as well as
    serving to regulate specific mRNAs.

28
Take-Home Lessons
  • miRs may not all have the same genomic origin,
    and may not all interact with their targets via
    the same rules.
  • Transposable elements have contributed to both
    miR genes and targets.
  • Helps explain why targets have generic tags, why
    many targets are in the 3'-UTR region, why miRs
    are not all conserved.
  • Alu repeats as a putative miR target!

29
CAGCACUUUGGGAG: 14-mer from Alu consensus, position 32-45

     sequence    length  microRNA
a)   CAGCACUUU   9       93, 372, 106b
     AGCACUUUG   9       17-5p, 20b, 520g,h
     CAGCACUU    8       512-3p
     AGCACUUU    8       520a,b,c,d,e, 526b
     GCACUUUG    8       519d
     AGCACUU     7       302a,b,c,d, 373
     UUGGGAG     7       150
b)   AGCACUUU    8       106a, 20a
     AGCACUU     7       520f
     GCACUUU     7       519a,b,c,e
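
The listing above can be checked mechanically: each listed sequence should be a substring of the Alu 14-mer, with the stated length. The sketch below does only that bookkeeping. The helper name `locate` is hypothetical, and mapping to Alu consensus coordinates assumes the 14-mer starts at position 32, as the slide states.

```python
ALU_14MER = "CAGCACUUUGGGAG"
ALU_START = 32  # the slide places the 14-mer at Alu consensus positions 32-45

# (sequence, stated length, microRNAs) as listed on the slide.
ROWS = [
    ("CAGCACUUU", 9, "93, 372, 106b"),
    ("AGCACUUUG", 9, "17-5p, 20b, 520g,h"),
    ("CAGCACUU", 8, "512-3p"),
    ("AGCACUUU", 8, "520a,b,c,d,e, 526b"),
    ("GCACUUUG", 8, "519d"),
    ("AGCACUU", 7, "302a,b,c,d, 373"),
    ("UUGGGAG", 7, "150"),
    ("AGCACUUU", 8, "106a, 20a"),
    ("AGCACUU", 7, "520f"),
    ("GCACUUU", 7, "519a,b,c,e"),
]


def locate(sub, ref=ALU_14MER):
    """Return the 1-based (begin, end) of sub within ref, or None if absent."""
    idx = ref.find(sub)
    if idx < 0:
        return None
    return idx + 1, idx + len(sub)


def alu_coordinates(sub):
    """Map a match in the 14-mer to Alu consensus coordinates."""
    span = locate(sub)
    if span is None:
        return None
    begin, end = span
    return ALU_START + begin - 1, ALU_START + end - 1


for seq, stated_len, mirs in ROWS:
    span = locate(seq)
    print(f"{seq:<10} len={len(seq)} (stated {stated_len}) at {span}  miRs: {mirs}")
```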
30
(No Transcript)
31
Hill et al.: do introns have miR-like effects?
  • Transfected isolated introns into HeLa cells
    (incorporated into genome)
  • Lots of genes went up and down, some pattern to
    the effects
  • Reminiscent of Lim et al. with miRs, but...
  • Did not show if Drosha/Dicer is involved.
  • Did not show if there is a sequence
    complementarity involved with targets.
  • Did not show if introns are processed to small
    RNAs
  • Much less whether they are 22 nt in length.
  • But, introns could be a superset of the small RNA
    world?!