CLUSTALW - PowerPoint PPT Presentation

Description:

Distance matrix rows (columns: Spinach, Rice, Mosquito, Monkey, Human): Monkey 90.8, 122.4, 84.7, 0.0, 3.3; Human 86.3, 122.6, 80.8, 3.3, 0.0 ... PAM distance 3.3 (Human - Monkey) is the minimum. Simple average of distances: ...

Slides: 20
Provided by: agos
Transcript and Presenter's Notes

Title: CLUSTALW


1
CLUSTALW
  • A Multiple Sequence Alignment Algorithm
  • -Prashanth Athri

2
Importance
  • Essential prelude to Molecular Evolutionary
    Analysis
  • Detect conserved sequence regions

3
Improvements

  • Sequence weighting when comparing sequences
  • - down-weight near-duplicate sequences
  • - up-weight divergent ones
  • Choice of substitution matrix is dynamic; it depends
    on the divergence of the sequences
  • Variations in gap penalties

4
Algorithm
  • The Distance Matrix / Pairwise Alignments
  • The Guide tree
  • Progressive Alignment

5
Distance Matrix
  • Pairwise score = no. of matches in the best
    alignment / no. of positions compared
    (gap positions excluded)
  • Distance = 1 - score
  • Steps to calculate both values
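
The score above can be sketched in a few lines of Python. This is a minimal illustration, not ClustalW's own code: the function name `identity_distance` and the toy sequences are hypothetical, the two input strings are assumed to be already aligned (equal length, `-` for gaps), and converting the score to a distance as 1 minus the fraction of matches is an assumption that follows the usual description of the method.

```python
def identity_distance(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Distance = 1 - (matches / positions compared), skipping gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gap positions are not counted as comparisons
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 1.0  # assumption: nothing comparable means maximal distance
    return 1.0 - matches / compared


# Toy aligned pair (made up): 5 compared positions, 4 matches.
print(round(identity_distance("ACGT-A", "ACCTTA"), 2))
```

Filling a matrix with this value for every pair of sequences gives the distance matrix used to build the guide tree.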

6
Distance Matrix
7
The Guide Tree
  • Use the distance matrix and apply the
    Neighbour-Joining method

8
Neighbour-Joining Method
  • First Step
  • PAM distance 3.3 (Human - Monkey) is the minimum,
    so Monkey and Human are joined first (MonHum).
  • Simple average of distances to the new node:
  • Dist(Spinach, MonHum) = (Dist(Spinach, Monkey) +
    Dist(Spinach, Human))/2
  •   = (90.8 + 86.3)/2 = 88.55
  • Dist(Rice, MonHum) = (Dist(Rice, Monkey) +
    Dist(Rice, Human))/2
  •   = (122.4 + 122.6)/2 = 122.5
  • Dist(Mosquito, MonHum) = (Dist(Mosquito, Monkey) +
    Dist(Mosquito, Human))/2
  •   = (84.7 + 80.8)/2 = 82.75
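
The averaging step on this slide can be checked with a short Python sketch. The distances are the ones quoted in the presentation; the Mosquito to Monkey distance is taken as 84.7, the value consistent with the stated result of 82.75. The helper name `merge_average` is hypothetical. Note that a plain average like this is the update used in UPGMA-style clustering; the neighbor-joining algorithm as usually published applies an additional correction when choosing and merging pairs, which is not shown here.

```python
# Distances from Monkey and Human to the remaining species (from the slides).
monkey = {"Spinach": 90.8, "Rice": 122.4, "Mosquito": 84.7}
human = {"Spinach": 86.3, "Rice": 122.6, "Mosquito": 80.8}


def merge_average(dist_a: dict, dist_b: dict) -> dict:
    """Distance from each other taxon to the merged node: simple average."""
    return {taxon: (dist_a[taxon] + dist_b[taxon]) / 2 for taxon in dist_a}


mon_hum = merge_average(monkey, human)
for taxon, d in mon_hum.items():
    print(f"Dist({taxon}, MonHum) = {d:.2f}")
```

The merged node MonHum then replaces Monkey and Human in the matrix, and the same step repeats on the smaller matrix until the tree is complete.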

9
Neighbour Join Method
10
Intermediate Guide tree
11
Final Guide Tree
12
Progressive Alignment
  • Basic procedure: use a series of pairwise
    alignments to align larger and larger groups,
    following the branching order of the guide tree
    (leaves to root)
  • Dynamic programming alignment at each stage
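
A leaves-to-root schedule can be sketched as a post-order walk over the guide tree. Everything in this example is an assumption for illustration: the tree topology is hypothetical (the final guide tree is not in the transcript), the tree is encoded as nested tuples, and `alignment_schedule` is a made-up helper that only lists which groups would be aligned at each stage. The dynamic programming alignment itself is not shown.

```python
def alignment_schedule(tree):
    """Return the list of (left_group, right_group) merges, leaves to root."""
    steps = []

    def visit(node):
        if isinstance(node, str):
            return (node,)  # a leaf is a group of one sequence
        left, right = node
        left_group = visit(left)
        right_group = visit(right)
        # Both subtrees are already aligned; now align them to each other.
        steps.append((left_group, right_group))
        return left_group + right_group

    visit(tree)
    return steps


# Hypothetical guide tree for the five species in the example.
guide_tree = ((("Human", "Monkey"), "Mosquito"), ("Spinach", "Rice"))
for left, right in alignment_schedule(guide_tree):
    print(" + ".join(left), "<->", " + ".join(right))
```

Each printed line is one pairwise stage: two sequences, a sequence against an existing alignment, or two alignments against each other.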

13
Gap Penalties
  • Score between a position in one sequence/alignment
    and a position in a set of other sequences (figure)
  • Types
  • - initial gap penalties
  • - position-specific gap penalties

14
Gap Penalties
  • Gap Opening Penalty (GOP)
  • Gap Extension Penalty (GEP)
  • Factors affecting GOP
  • - dependence on weight matrix
  • - dependence on similarity of sequences
  • - dependence on sequence lengths
  • GOP → (GOP + log(MIN(N, M))) × (avg. residue
    mismatch score) × (percent identity scaling factor)
  • N, M: lengths of the two sequences
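
As a rough sketch, the initial GOP adjustment can be written as a one-line function, taking the formula on this slide to be GOP → (GOP + log(min(N, M))) × (average residue mismatch score) × (percent identity scaling factor). The function name, the parameter names, the sample numbers, and the use of the natural logarithm are all assumptions for illustration; the slide does not state the log base or how the two scaling factors are computed.

```python
import math


def initial_gop(base_gop: float, len_a: int, len_b: int,
                avg_mismatch: float, identity_scale: float) -> float:
    """Scale the user-supplied gap opening penalty for one pairwise stage."""
    # Longer sequences get a slightly higher opening penalty via the log term.
    length_term = math.log(min(len_a, len_b))  # assumption: natural log
    return (base_gop + length_term) * avg_mismatch * identity_scale


# Made-up inputs: base GOP 10, lengths 100 and 200, neutral scaling factors.
print(round(initial_gop(10.0, 100, 200, 1.0, 1.0), 3))
```

With both scaling factors set to 1.0 the function shows the length effect in isolation: only the shorter of the two sequences changes the result.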

15
Gap Penalties
  • Factors affecting GEP
  • - dependence on difference in lengths of sequences
  • GEP → GEP × (1.0 + |log(N/M)|)
  • N, M: lengths of the two sequences
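
The GEP adjustment can be sketched the same way, taking the formula on this slide to be GEP → GEP × (1.0 + |log(N/M)|). The function name, the sample values, and the choice of the natural logarithm are assumptions for illustration only.

```python
import math


def initial_gep(base_gep: float, n: int, m: int) -> float:
    """Raise the gap extension penalty as the two lengths diverge."""
    # abs() makes the result independent of which sequence is longer.
    return base_gep * (1.0 + abs(math.log(n / m)))  # assumption: natural log


# Made-up inputs: equal lengths leave GEP unchanged, unequal lengths raise it.
print(initial_gep(0.2, 150, 150))
print(round(initial_gep(0.2, 150, 300), 4))
```

The intent captured here is that a large length difference makes long gaps more expensive, which discourages stretching the shorter sequence with many extended gaps.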

16
Position Specific Gap Penalties
  • Lowered gap penalties at existing gaps (when
    aligning subsequent sequences)
  • Increased gap penalties near existing gaps (within
    a sequence)
  • Reduced penalties in hydrophilic runs

17
Progressive Alignment
  • The use of different weight matrices as the
    alignment progresses makes it possible to postpone
    aligning divergent sequences until the end, after
    all the closely related ones have been aligned.
