Identification of Transposable Elements Using Multiple Alignments of Related Genomes



1
Identification of Transposable Elements Using
Multiple Alignments of Related Genomes
  • I690 Paper Presentation
  • Yin Wu

2
Transposable Elements
  • Transposable Elements (TEs) are the chief cause
    of gapped regions in up to 10% of currently
    sequenced genomes.
  • Gapped regions that have little or no alignment
    to other genomes leave signatures within
    multiple alignments that can be used to identify
    TEs.

3
Multiple Alignments Between Related Genomes
By aligning genomes of related species, it is
possible to identify TEs. Consider a speciation
event causing the recent divergence of genomes S1
and S2. We expect to see some gaps in the
alignment due to small insertions and deletions.
Gaps that are long and repeated, by contrast, are
likely to be TEs. When additional related genomes
are added to the alignment, the chance of
misalignment decreases.
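
The gap-finding idea above can be sketched in a few lines of Python. This is a minimal illustration, not the presented method itself: the function name, the "-" gap character, and the min_len cutoff are all assumptions chosen for the example. It scans a gapped pairwise alignment of S1 and S2 and reports the column spans where S1 has sequence and S2 has only gaps.

```python
def find_insertion_regions(aligned_s1, aligned_s2, min_len=30, gap="-"):
    """Return half-open (start, end) alignment-column spans where
    aligned_s1 has residues and aligned_s2 has only gaps.

    min_len is a hypothetical cutoff separating small indels from
    long gaps; it is not a value taken from the slides.
    """
    if len(aligned_s1) != len(aligned_s2):
        raise ValueError("aligned sequences must have equal length")

    regions = []
    start = None  # column where the current gap run began, if any
    for i, (a, b) in enumerate(zip(aligned_s1, aligned_s2)):
        in_insertion = (b == gap and a != gap)
        if in_insertion and start is None:
            start = i
        elif not in_insertion and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(aligned_s1) - start >= min_len:
        regions.append((start, len(aligned_s1)))
    return regions
```

With more than two genomes, the same scan could be applied to each pair, keeping only the spans that are gapped in every other genome.
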
4
Method
  • Multiple alignment of homologous regions of
    related genomes to find Insertion Regions (IR)
  • Local alignment of each set of IRs to find
    Repeated Insertion Regions (RIR)
  • Filter and assemble RIRs.
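
As a rough sketch of the second step, the code below scores every pair of IRs with a basic Smith-Waterman local alignment and marks an IR as repeated when it has a strong hit to another IR. The scoring values, the min_score threshold, and both function names are assumptions made for illustration; the slides do not say which aligner or parameters were used.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between a and b
    (linear gap penalty, score only, no traceback)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def repeated_insertion_regions(irs, min_score):
    """Return sorted indices of IRs whose local alignment to at least
    one other IR scores min_score or higher (hypothetical criterion)."""
    repeated = set()
    for i in range(len(irs)):
        for j in range(i + 1, len(irs)):
            if local_alignment_score(irs[i], irs[j]) >= min_score:
                repeated.update((i, j))
    return sorted(repeated)
```

The all-against-all loop is quadratic in the number of IRs, so a real pipeline would likely need an indexed or seeded search instead.
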

5
Types of Insertions
Micro-satellite (NOT TE)
Tandem Repeats
Nested Repeats
Concatenated Repeats
6
Filter and assemble RIRs
  • Micro-satellite regions: Short (<20 bp) repeats
    with close and sequential hits to self.
  • Tandem repeats: Long (>30 bp) repeats which
    sequentially align to both self and to
    subcomponents in other IRs.
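
One simple way to flag micro-satellite-like regions is sketched below. It swaps in a regular expression with a backreference in place of the self-alignment described above, and it reads the <20 bp limit as the length of the repeat unit. Both choices, along with the function name and the min_copies parameter, are assumptions for illustration only.

```python
import re


def find_microsatellites(seq, max_unit=19, min_copies=3):
    """Return (start, end, unit) for runs where a unit of at most
    max_unit bases is repeated back to back at least min_copies times.

    Regex stand-in for self-alignment; exact repeats only.
    """
    pattern = re.compile(r"(.{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]
```

Because the match is exact, diverged copies are missed, and backtracking can be slow on long sequences. Regions flagged this way would be filtered out as non-TE.
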

7
Filter and assemble RIRs (contd)
  • Nested repeats: Long (>30 bp) non-overlapping
    subcomponents that sequentially align to other
    IRs, where there is no intersection between the
    sets of IRs to which each subcomponent aligned.
  • Concatenated repeats: IRs within a certain
    genomic distance (<700 bp) that align
    sequentially to other insertion regions.
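
The distance criterion for concatenated repeats can be illustrated as follows. This sketch only chains IRs that lie within 700 bp of each other on the same sequence. It does not check the second condition, that the chained IRs align sequentially to other insertion regions. The function name and the (start, end) coordinate convention are assumptions.

```python
def chain_nearby_regions(regions, max_gap=700):
    """Group (start, end) intervals into chains where each interval
    starts fewer than max_gap bases after the previous one ends."""
    chains = []
    for start, end in sorted(regions):
        if chains and start - chains[-1][-1][1] < max_gap:
            chains[-1].append((start, end))
        else:
            chains.append([(start, end)])
    return chains
```

For example, regions at (100, 400) and (900, 1200) are 500 bp apart and fall into one chain, while a region at (5000, 5300) starts a new one.
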

8
Case Study
  • Case study on four Drosophila genomes
  • D. melanogaster, D. yakuba, D. pseudoobscura,
    and D. virilis
  • The results are compared against the BDGP natural
    TE annotation set
    (http://www.fruitfly.org/p_disrupt/TE.html).

9
Case Study (contd)
[Figure legend: Conserved Region; Insertion Region (gap); Annotated TEs]
10
Case Study (contd)
Chr arm   BDGP Trans   Not in Alignment   Not in RIR (false neg)   In RIR (true pos)
X         276          8                  52                       216
2L        305          16                 62                       227
2R        312          3                  52                       257
3L        288          13                 65                       210
3R        288          5                  64                       219
4         102          21                 33                       48
Total     1571         66                 349                      1156
%         100          4.2                22                       74
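
The percentage row follows directly from the totals row. The short check below recomputes it. The counts are copied from the table, while the dictionary keys and the function name are only illustrative.

```python
def as_percentages(counts, total):
    """Express each count as a percentage of total, to one decimal."""
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


annotated_total = 1571  # BDGP-annotated TEs across all chromosome arms
totals = {
    "not_in_alignment": 66,
    "not_in_rir_false_neg": 349,
    "in_rir_true_pos": 1156,
}

percentages = as_percentages(totals, annotated_total)
```

The three categories account for all 1571 annotated elements, and the recomputed shares agree with the rounded values 4.2, 22, and 74 shown in the table.
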
11
Case Study (contd)
  • Identification of new instances of known TE
    families
      • 355 instances of recent elements
      • 232 instances of ancient elements
  • Proposed new families in euchromatin
      • Define a cluster of RIRs to be a new family if
        the intracluster variability is within a
        certain threshold
      • Six new families of TEs are proposed, each
        containing more than five instances.
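
The family-proposal rule can be sketched as a greedy clustering in which a sequence joins a cluster only if it is close to every existing member, which keeps the variability inside each cluster bounded. The distance measure (based on difflib.SequenceMatcher), the max_distance threshold, and the function names are stand-ins chosen for this example; the slides do not specify how intracluster variability was measured.

```python
from difflib import SequenceMatcher


def distance(a, b):
    """Dissimilarity in [0, 1]; 0 means identical strings.
    A stand-in measure, not the one used in the presented work."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def propose_families(seqs, max_distance=0.2, min_instances=6):
    """Greedily cluster sequences so that every pair inside a cluster
    is within max_distance, then keep clusters with at least
    min_instances members (6 corresponds to 'more than five')."""
    clusters = []
    for s in seqs:
        for cluster in clusters:
            if all(distance(s, member) <= max_distance for member in cluster):
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return [c for c in clusters if len(c) >= min_instances]
```

The result of a greedy pass depends on input order, so a production pipeline would more likely use a standard hierarchical clustering with a complete-linkage cutoff.
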

12
Limitations
  • The proposed method is dependent on the
    annotation of homologous regions (to avoid gene
    rearrangements).
  • Inversions and translocations are not detected by
    this method.