Sequence Alignment I: Dot Matrices

Slides: 16
Provided by: ScottE156
Transcript and Presenter's Notes

1
Sequence Alignment I: Dot Matrices
2
Reading
  • Mount, Chapters 1, 2, and 3 (up to page 94)

3
Why compare sequences?
  • To find whether two (or more) genes or proteins
    are evolutionarily related to each other
  • To find structurally or functionally similar
    regions within proteins

4
Similar genes arise by gene duplication
  • Copy of a gene inserted next to the original
  • Two copies mutate independently
  • Each can take on separate functions
  • All or part of a gene can be transferred from one
    part of the genome to another

5
Sequence Comparison Methods
  • Dot matrix analysis
  • Dynamic Programming
  • Word or k-tuple methods (FASTA and BLAST)

6
Dot matrices
[Figure: example dot matrix; axis labels recovered from the slide: c g g a c a c a c g]
7
Dot matrix comparison
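
A minimal sketch of how such a matrix can be built, assuming plain identity matching with no filtering. The function names `dot_matrix` and `render` and the example sequences are illustrative and do not come from the slides.

```python
def dot_matrix(seq_a, seq_b):
    """Return a grid where grid[i][j] is True when seq_a[i] == seq_b[j]."""
    return [[a == b for b in seq_b] for a in seq_a]


def render(seq_a, seq_b):
    """Draw the matrix as text: seq_b across the top, seq_a down the side."""
    grid = dot_matrix(seq_a, seq_b)
    lines = ["  " + " ".join(seq_b)]
    for a, row in zip(seq_a, grid):
        # "*" marks a dot (identical characters), "." marks no match.
        lines.append(a + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sequences, chosen only to show a visible diagonal.
    print(render("acgtacg", "acgtcg"))
```

Every cell is filled independently, so the cost grows with the product of the two sequence lengths.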
8
Interpretation
  • Regions of similarity appear as diagonal runs of
    dots
  • Reverse diagonals (perpendicular to diagonal)
    indicate inversions
  • Reverse diagonals crossing diagonals (Xs)
    indicate palindromes
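
The patterns above can be detected programmatically. The sketch below scans for maximal runs of dots along forward or reverse diagonals. It assumes plain character identity, and `diagonal_runs` and its parameters are hypothetical names. For double-stranded DNA, inverted repeats are normally sought by comparing against the reverse complement, which this sketch does not do.

```python
def diagonal_runs(seq_a, seq_b, min_len=3, reverse=False):
    """Find maximal runs of matching dots along diagonals.

    Forward runs satisfy seq_a[i + k] == seq_b[j + k];
    reverse runs satisfy seq_a[i + k] == seq_b[j - k].
    Returns (i, j, length) tuples giving the start of each run.
    """
    step = -1 if reverse else 1
    n, m = len(seq_a), len(seq_b)

    def hit(i, j):
        return 0 <= i < n and 0 <= j < m and seq_a[i] == seq_b[j]

    runs = []
    for i in range(n):
        for j in range(m):
            # Only start counting at the first dot of a run.
            if not hit(i, j) or hit(i - 1, j - step):
                continue
            length = 0
            while hit(i + length, j + step * length):
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs


if __name__ == "__main__":
    s = "acgca"  # reads the same in both directions
    print(diagonal_runs(s, s, min_len=5))                # main diagonal
    print(diagonal_runs(s, s, min_len=5, reverse=True))  # crossing reverse diagonal
```

A sequence compared with itself that yields both a forward and a reverse run through the same cells produces the X pattern described above.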

9
Interpretation
  • Can link separate diagonals to form alignment
    with gaps
  • Each amino acid or base can be used only once
  • Can't double back
  • A gap is introduced by each vertical or
    horizontal skip
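
One way to express these rules in code is sketched below. It assumes each diagonal is stored as a hypothetical `(i, j, length)` tuple and treats the change in diagonal offset between consecutive segments as the gap size. That is a simplification for illustration and not a full alignment algorithm.

```python
def chain_gaps(segments):
    """Validate a chain of diagonal segments and report the gaps between them.

    Each segment is (i, j, length): it pairs seq_a[i:i+length] with
    seq_b[j:j+length]. Returns a list of gap sizes, one per junction.
    Raises ValueError if the chain reuses a residue or doubles back.
    """
    gaps = []
    for (i1, j1, len1), (i2, j2, _) in zip(segments, segments[1:]):
        end_i, end_j = i1 + len1, j1 + len1
        # The next segment must start at or after the end of the previous
        # one in both sequences, otherwise a residue would be used twice.
        if i2 < end_i or j2 < end_j:
            raise ValueError("segments overlap or double back")
        # Moving to a different diagonal (j - i changes) is a skip, i.e. a gap.
        gaps.append(abs((j2 - i2) - (j1 - i1)))
    return gaps


if __name__ == "__main__":
    # Two diagonals: the second sits two columns further right.
    print(chain_gaps([(0, 0, 4), (4, 6, 3)]))
```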

10
Filtering
  • Dot matrices for long sequences can be noisy due
    to insignificant matches
  • Solution: use a window and a threshold
  • Compare character by character within a window
    (the window size has to be chosen)
  • Require a certain fraction of matches within the
    window before displaying a dot

11
Dot plot comparison using windows
Window size 11, stringency 7
(Put a dot only if at least 7 of the next 11 positions
are identical.)
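
The window-and-stringency filter can be sketched as follows. This assumes the window starts at the position being tested ("the next 11 positions"). Some programs centre the window instead, and the function name `windowed_dot_plot` is illustrative.

```python
def windowed_dot_plot(seq_a, seq_b, window=11, stringency=7):
    """Return the set of (i, j) positions that earn a dot.

    A dot is placed at (i, j) when at least `stringency` of the `window`
    aligned positions starting there are identical.
    """
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                seq_a[i + k] == seq_b[j + k] for k in range(window)
            )
            if matches >= stringency:
                dots.add((i, j))
    return dots


if __name__ == "__main__":
    # Made-up sequences: 8 of the 11 aligned positions agree.
    print(windowed_dot_plot("acgtacgtacg", "acgtacgtttt"))
```

Setting `window=1` and `stringency=1` reduces this to the unfiltered dot matrix.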
12
Uses for dot matrices
  • Aligning two proteins or two nucleic acid
    sequences
  • Finding amino acid repeats within a protein by
    comparing a protein sequence to itself
  • Repeats appear as a set of diagonal runs stacked
    vertically and/or horizontally
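
A rough sketch of repeat detection by self-comparison, assuming identity matching. For each off-diagonal offset it records the longest unbroken run of matches. `repeat_offsets` and the toy sequence are made up for illustration.

```python
def repeat_offsets(seq, min_len=4):
    """Map each off-diagonal offset to the longest run of self-matches on it.

    Offset d compares seq[i] with seq[i + d]. Only offsets whose longest
    run reaches `min_len` are returned. The main diagonal (d = 0) is skipped
    because every sequence matches itself there.
    """
    found = {}
    for d in range(1, len(seq)):
        best = run = 0
        for i in range(len(seq) - d):
            run = run + 1 if seq[i] == seq[i + d] else 0
            best = max(best, run)
        if best >= min_len:
            found[d] = best
    return found


if __name__ == "__main__":
    # Toy sequence with the unit "HEART" at positions 0, 7 and 14.
    print(repeat_offsets("HEARTGGHEARTSSHEART"))
```

Offsets that are multiples of the repeat spacing correspond to the stacked diagonal runs described above.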

13
Repeats
Human LDL receptor protein sequence (GenBank
P01130), window 1, stringency 1 (Mount, Fig. 3.6)
14
Repeats
Window 23, stringency 7 (Mount, Fig. 3.6)
15
Using substitution matrices
  • Dots can have weights
  • Some matches are rewarded more than others,
    depending on likelihood
  • Use PAM or BLOSUM matrix (more on these later)
  • Put a dot only if a minimum total or average
    weight is achieved
  • See Mount, Fig. 3.5
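
A minimal sketch of weighted dots, assuming a summed score over a short window. The `TOY_SCORES` table is hypothetical and covers only four residues. Its values are not taken from any PAM or BLOSUM matrix, and a real analysis would load one of those instead.

```python
# Hypothetical toy weights, NOT real PAM or BLOSUM values.
TOY_SCORES = {
    ("L", "L"): 4, ("I", "I"): 4, ("L", "I"): 2,
    ("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2,
}


def score(a, b, default=-1):
    """Look up a symmetric pair score; unlisted pairs get `default`."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), default))


def weighted_dot_plot(seq_a, seq_b, window=3, min_total=8):
    """Place a dot at (i, j) when the summed window score reaches min_total."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            total = sum(
                score(seq_a[i + k], seq_b[j + k]) for k in range(window)
            )
            if total >= min_total:
                dots.add((i, j))
    return dots


if __name__ == "__main__":
    # "IDE" differs from "LDE" only by a conservative L/I substitution.
    print(weighted_dot_plot("LDE", "IDE"))
```

Using an average instead of a total only requires dividing `total` by `window` before the comparison.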