Title: Automated Methods for Structure Comparison
1 Automated Methods for Structure Comparison
- Basic problem: how can any two given structures be automatically compared in a meaningful way?
- How are distant relationships to be recognized?
- Program: method
- DALI: distance matrix comparison (basis for FSSP structural classification)
- SSAP: dynamic programming (used in CATH to classify topologies)
- VAST: convert secondary structures to vectors and align vectors
2 Structure comparison is pretty easy when two proteins are very similar
- When two proteins are so similar that their sequences can be reliably aligned (say, >35% identical), structure comparison can proceed from the sequence alignment.
- 1. Align the sequences
- sequence 1: YIREV-GKL
- sequence 2: YITQVRNKA
- 2. Superpose the structures to minimize the RMSD for equivalent residue pairs in the alignment
Note: the structures shown do not correspond to the sequences above
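As a rough illustration of the two steps above, the sketch below reads the equivalent residue pairs off the gapped alignment and computes the RMSD over those pairs. It assumes the two structures have already been superposed; finding the optimal rotation and translation (commonly done with the Kabsch algorithm) is not shown. The function names `equivalent_pairs` and `rmsd` are made up for this example, and `math.dist` requires Python 3.8 or later.

```python
import math

def equivalent_pairs(aligned_1, aligned_2, gap="-"):
    """Return (i, j) residue-index pairs for alignment columns without a gap."""
    if len(aligned_1) != len(aligned_2):
        raise ValueError("aligned sequences must have equal length")
    pairs = []
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(aligned_1, aligned_2):
        if a != gap and b != gap:
            pairs.append((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs

def rmsd(coords_1, coords_2, pairs):
    """RMSD over equivalent residue pairs; assumes prior superposition."""
    if not pairs:
        raise ValueError("no equivalent residue pairs")
    total = sum(math.dist(coords_1[i], coords_2[j]) ** 2 for i, j in pairs)
    return math.sqrt(total / len(pairs))

# The alignment from the slide: the gap in sequence 1 skips residue 5 of sequence 2.
pairs = equivalent_pairs("YIREV-GKL", "YITQVRNKA")
```

Here `pairs` holds eight index pairs, because the one gapped column contributes no equivalence.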
3 It is harder when the proteins are very different...
- If one cannot align the sequences reliably, how does one establish which residues, if any, play equivalent structural roles in the two proteins?
- The answer is to attempt to align the structures directly, in such a way that structural equivalences in the two proteins are revealed.
- We will discuss how the distance-matrix-based algorithm of DALI solves this problem.
4 Distance Matrices
- 2D representation of 3D structure
- plot sequence against itself
- identify pairs of residues which are close in space to each other
- usually the distance between C-alpha atoms is used
- closeness between residues shows up as dark parts of the matrix
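A minimal sketch of how such a matrix could be built from C-alpha coordinates is shown below. The coordinates are invented toy values, and the 8.0 angstrom contact cutoff is an assumption chosen for illustration, not a value taken from these slides.

```python
import math

def distance_matrix(ca_coords):
    """Pairwise distances between C-alpha positions, given as (x, y, z) tuples."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
            for i in range(n)]

def contact_map(dm, cutoff=8.0):
    """True where two residues are closer than `cutoff` (assumed value, in angstroms)."""
    return [[d < cutoff for d in row] for row in dm]

# Toy coordinates, invented for illustration only.
ca = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
dm = distance_matrix(ca)
contacts = contact_map(dm)  # the "dark parts" of the plotted matrix
```

The matrix is symmetric with zeros on the diagonal, which is why plots of it mirror across the diagonal.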
5 Distance matrices
6 Different substructures, such as secondary or supersecondary structures, give rise to distinct patterns in the matrix
e.g. antiparallel vs. parallel beta-sheets
In principle, one could recognize structural similarity in two proteins by comparing patterns in their distance matrices, but it's not that simple.
7 Problem: two structures with the same topology may differ in the precise location of secondary structure elements along the sequence, i.e. loop lengths may differ
same fold, different matrices
8 Or two structures with a common architecture may differ in connectivity (topology)...
both are three-stranded antiparallel beta-sheets
How might we compare their distance matrices to reveal this similarity?
9 DALI algorithm
- not useful to compare entire matrices
- instead, chop distance matrices into all possible submatrices of 6x6 amino acids
- compare this set of submatrices for pattern similarities rather than comparing entire matrices
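The chopping step can be sketched as follows. The mean absolute difference used to compare two blocks is a simple stand-in chosen for this example; it is not DALI's actual similarity score. The helper names are hypothetical, and the toy matrix uses |i - j| in place of real distances.

```python
def submatrices(dm, size=6):
    """Yield (i, j, block) for every size x size submatrix of a square matrix.

    i and j are the starting residue indices of the row and column segments.
    """
    n = len(dm)
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            block = [row[j:j + size] for row in dm[i:i + size]]
            yield i, j, block

def block_difference(block_a, block_b):
    """Mean absolute difference between two equally sized blocks (0 = identical)."""
    diffs = [abs(x - y)
             for row_a, row_b in zip(block_a, block_b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

# Toy 8-residue matrix: |i - j| stands in for real inter-residue distances.
toy = [[float(abs(i - j)) for j in range(8)] for i in range(8)]
blocks = list(submatrices(toy))
```

A chain of n residues gives (n - 5) x (n - 5) such blocks, so the toy matrix above yields 9.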
10 1. Identify a pair of matching submatrices within the two matrices
make an initial sequence alignment from this match...
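To make step 1 concrete, here is a brute-force sketch that finds the most similar pair of 6x6 submatrices in two distance matrices and reads off the residue equivalences that the match implies. Several things are assumptions of this example rather than DALI's actual procedure: the exhaustive search, the mean-absolute-difference score, and the restriction to non-overlapping segment pairs (column segment at least six residues after the row segment). The toy "structures" are points on a line, with protein B equal to protein A plus two extra leading residues.

```python
def block(dm, i, j, size):
    """size x size submatrix with row segment starting at i, column segment at j."""
    return [row[j:j + size] for row in dm[i:i + size]]

def difference(block_a, block_b):
    """Mean absolute difference between two blocks (stand-in score, 0 = identical)."""
    diffs = [abs(x - y)
             for row_a, row_b in zip(block_a, block_b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def segment_pairs(n, size):
    """All (i, j) starts with non-overlapping segments, i before j."""
    return [(i, j) for i in range(n - size + 1)
            for j in range(i + size, n - size + 1)]

def best_match(dm_a, dm_b, size=6):
    """Return (score, (i_a, j_a), (i_b, j_b)) for the most similar block pair."""
    best = None
    for i_a, j_a in segment_pairs(len(dm_a), size):
        for i_b, j_b in segment_pairs(len(dm_b), size):
            score = difference(block(dm_a, i_a, j_a, size),
                               block(dm_b, i_b, j_b, size))
            if best is None or score < best[0]:
                best = (score, (i_a, j_a), (i_b, j_b))
    return best

def initial_alignment(pos_a, pos_b, size=6):
    """Residue equivalences (index in A, index in B) implied by one block match."""
    (i_a, j_a), (i_b, j_b) = pos_a, pos_b
    pairs = {(i_a + k, i_b + k) for k in range(size)}
    pairs |= {(j_a + k, j_b + k) for k in range(size)}
    return sorted(pairs)

# Toy data: 1D positions; B is A with two extra residues at the start.
xs_a = [float(k * k) for k in range(12)]
xs_b = [-50.0, -20.0] + xs_a
dm_a = [[abs(p - q) for q in xs_a] for p in xs_a]
dm_b = [[abs(p - q) for q in xs_b] for p in xs_b]

score, pos_a, pos_b = best_match(dm_a, dm_b)
pairs = initial_alignment(pos_a, pos_b)
```

With this toy data the best match pairs the block at (0, 6) in A with the block at (2, 8) in B, so every residue k of A is aligned to residue k + 2 of B.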
11 2. Identify a second pair which overlaps the first (contains one common structural element)
12 3. Combine overlapping pairs
result: an overall alignment of structurally equivalent sequence regions
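Steps 2 and 3 can be sketched with a very simple data model, which is an assumption of this example and not DALI's internal representation: each submatrix match is a tuple of two segment mappings, where a mapping (a, b) says that the six-residue segment starting at residue a of protein A is equivalent to the one starting at residue b of protein B. Two matches overlap when they share one mapping, and combining them takes the union.

```python
def combine(match_1, match_2):
    """Merge two matches that share a segment mapping; return None otherwise."""
    if not set(match_1) & set(match_2):
        return None  # no common structural element, nothing to chain
    return tuple(sorted(set(match_1) | set(match_2)))

def residue_pairs(segment_mappings, size=6):
    """Expand segment mappings into residue-level equivalences."""
    return sorted({(a + k, b + k)
                   for a, b in segment_mappings
                   for k in range(size)})

# Hypothetical matches: both contain the mapping (10, 14).
match_1 = ((0, 2), (10, 14))
match_2 = ((10, 14), (20, 30))

combined = combine(match_1, match_2)
alignment = residue_pairs(combined)
```

Here `combined` contains three segment mappings, which expand to 18 residue equivalences.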
13 4. Rearrange and collapse the matrix according to the aligned regions of the sequence
Now the common structural elements are aligned, as are the structurally equivalent residues in the sequence!
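One way to picture the rearrange-and-collapse step: keep only the rows and columns of aligned residues, ordered by the alignment, so that both proteins give matrices of the same size that can be compared entry by entry. The sketch below uses invented 1D toy positions and a hypothetical alignment in which the segment order differs between the two chains.

```python
def collapse(dm, aligned_indices):
    """Keep only rows and columns of aligned residues, in alignment order."""
    return [[dm[r][c] for c in aligned_indices] for r in aligned_indices]

def mean_abs_difference(m_a, m_b):
    """Mean absolute difference between two equally sized matrices."""
    diffs = [abs(x - y)
             for row_a, row_b in zip(m_a, m_b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

# Toy 1D positions: residues 0, 1, 4, 5 of A match residues 2, 3, 0, 1 of B.
xs_a = [0.0, 1.0, 5.0, 9.0, 20.0, 22.0]
xs_b = [20.0, 22.0, 0.0, 1.0]
dm_a = [[abs(p - q) for q in xs_a] for p in xs_a]
dm_b = [[abs(p - q) for q in xs_b] for p in xs_b]

pairs = [(0, 2), (1, 3), (4, 0), (5, 1)]  # hypothetical alignment (A index, B index)
collapsed_a = collapse(dm_a, [a for a, _ in pairs])
collapsed_b = collapse(dm_b, [b for _, b in pairs])
residual = mean_abs_difference(collapsed_a, collapsed_b)
```

Although the original matrices differ in size and layout, the collapsed ones are identical here, so `residual` is zero.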
14 All together now...
15 The Power of DALI
- DALI is quite powerful because it can recognize architectural similarities even when topologies are different.
- It is also flexible because it can be made more topologically restrictive (i.e. no swapping of segments in the chain allowed) to focus on closer relationships.
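The topological restriction can be illustrated with a small check on an alignment expressed as (index in protein A, index in protein B) pairs: if the aligned residues appear in the same order along both chains, no segments have been swapped. The function name `is_sequential` is made up for this sketch; it shows the idea only and is not how DALI itself enforces the restriction.

```python
def is_sequential(pairs):
    """True if aligned residues occur in the same order in both chains."""
    ordered = sorted(pairs)  # sort by position in protein A
    return all(b1 < b2 for (_, b1), (_, b2) in zip(ordered, ordered[1:]))

same_topology = [(0, 2), (1, 3), (4, 6), (5, 7)]
swapped_segments = [(0, 2), (1, 3), (4, 0), (5, 1)]  # second segment comes first in B

print(is_sequential(same_topology))     # True
print(is_sequential(swapped_segments))  # False
```

Rejecting alignments that fail this check would correspond to the stricter mode, which accepts only matches with the same chain connectivity.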