Title: Structural alphabets, from protein structure description to comparison
1Structural alphabets, from protein structure
description to comparison
Manoj Tyagi LBGM Université de la Réunion
2Synopsis
- Classical method of protein structure analysis
- Structural alphabets: protein blocks
- Reduction of protein space
- Generation of PB substitution matrix
- Applications of PB matrix
- Structure comparison and mining
- Identification of rigid body movement or local
conformation changes
- Conclusion
3Analysis of protein structures
4Secondary structures 3 states
5Structural Alphabets / Protein Blocks
- A structural alphabet is a set (or library) of
small structural motifs/prototypes that
approximate every part of a protein structure.
- It is composed of a limited number of structural
elements that recur in protein space.
- Protein blocks (PBs) are a set of 16 short structural
motifs, each 5 consecutive residues (amino acids) long.
- Each motif is represented by a vector of 8
dihedral angles (φ, ψ).
- They are denoted by the letters a, b, ..., p.
6Structural Alphabets / Protein Blocks
- PBs can be characterized by their secondary
structure composition, e.g. PB m forms the central
part of a helix and PB d is ideal for a sheet.
- PBs a to c and d to f are mainly associated
with the N- and C-caps of sheets, respectively.
- Similarly, PBs k, l and n, o, p form the N- and C-caps
of helices, respectively.
- PBs g to j are mainly associated
with coils.
7 (ψn-2, φn-1) = (41.14, 75.53)
(ψn-1, φn) = (13.92, 99.80)
(ψn, φn+1) = (131.88, 96.27)
(ψn+1, φn+2) = (122.08, 99.68)
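To make the eight-angle description concrete, here is a minimal Python sketch of how a 5-residue window could be matched against a library of prototypes. It assumes that a window is assigned to the prototype with the smallest root-mean-square deviation on angular values, which the slides do not spell out. The two prototypes in `TOY_PROTOTYPES` are hypothetical placeholders, not the real 16 PB prototypes, and all function names are illustrative.

```python
import math

# Hypothetical two-letter library, ordered as
# psi(n-2), phi(n-1), psi(n-1), phi(n), psi(n), phi(n+1), psi(n+1), phi(n+2).
# These are NOT the real PB prototypes, only stand-ins for illustration.
TOY_PROTOTYPES = {
    "helix_like": [-45.0, -60.0] * 4,
    "strand_like": [130.0, -120.0] * 4,
}

def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def rmsda(v1, v2):
    """Root-mean-square deviation on angular values of two equal-length vectors."""
    if len(v1) != len(v2):
        raise ValueError("angle vectors must have the same length")
    return math.sqrt(sum(angle_diff(a, b) ** 2 for a, b in zip(v1, v2)) / len(v1))

def window_angles(phi, psi, n):
    """Eight dihedral angles of the 5-residue window centred on residue n."""
    return [psi[n - 2], phi[n - 1], psi[n - 1], phi[n],
            psi[n], phi[n + 1], psi[n + 1], phi[n + 2]]

def assign_block(angles, prototypes):
    """Name of the prototype closest to the given eight-angle vector."""
    return min(prototypes, key=lambda name: rmsda(angles, prototypes[name]))
```

Sliding `window_angles` along a chain and calling `assign_block` at each position would turn a 3D structure into a 1D string of letters, one per residue (except the two residues at each end).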
8Encoding 3D structure as 1D PB sequence
PB sequence
KBCCDDDDFBFKLMMMMMMMMMMNOPABDCDDFBFKLMMMMMMNGOIAB
DCDDFBDGHILMLMMMMMMMMPMKLMMPCCDDDDFBDCFKLMMMMMMNO
PABDCDDDDDFKLMMMMMMMMMMMNO
What did we find? PBs, when combined, give
back the regular structures of a protein and also
highlight the variable regions present between
regular structure elements
3D structure
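One simple way to see regular structure in a PB sequence is to look for long runs of the same letter (for example, stretches of M for helices or D for strands). The sketch below is only an illustration of that idea; the function name `pb_runs` and the `min_length` threshold are assumptions, not part of the original work.

```python
from itertools import groupby

def pb_runs(pb_sequence, min_length=4):
    """Return (block, start_index, length) for each run of identical PBs
    that is at least min_length long."""
    runs = []
    position = 0
    for block, group in groupby(pb_sequence):
        length = len(list(group))
        if length >= min_length:
            runs.append((block, position, length))
        position += length
    return runs

# Short made-up fragment in the same alphabet as the slide.
print(pb_runs("KBCCDDDDFBFKLMMMMMNOP"))
```

The stretches that fall between the reported runs correspond to the more variable regions linking regular elements.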
9What can we do with PB sequence ?
- PB sequence comparison to identify equivalent
regions
- Structure comparison based on a sequence alignment
algorithm
- Identification of conformational change or rigid
body displacement/shift in proteins
- Extension of the above to study active/inactive
states of enzymes
What do we need now? A substitution matrix!
10Generation of PBs substitution matrix
- Requires a large number of structurally aligned
proteins
- The PALI database was used
- Structure alignments encoded into PB sequence
alignments
- Calculation of the substitution frequency of each
PB in conserved regions
- Conversion to log-odds scores
Balaji S, Sujatha S, Kumar SS, Srinivasan N.
PALI - a database of Phylogeny and ALIgnment of
homologous protein structures. Nucleic Acids Res
2001;29(1):61-65.
11Substitution table calculation
PALI aa alignment -> PB alignment -> substitution counts -> log-odds scores -> PB matrix (16 x 16)
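The counting and log-odds steps can be sketched as follows. This is a generic log-odds calculation under stated assumptions (symmetric counting, base-2 logarithm, background frequencies taken from the aligned pairs themselves); the exact normalisation used for the published matrix is not given on the slides.

```python
import math
from collections import Counter

def log_odds_matrix(aligned_pairs, alphabet):
    """Build a symmetric log-odds table from (pb_a, pb_b) pairs taken at
    structurally equivalent positions. Unobserved pairs get None."""
    pair_counts = Counter()
    for a, b in aligned_pairs:
        # Count both orientations so that score(a, b) == score(b, a).
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1
    total = sum(pair_counts.values())
    if total == 0:
        raise ValueError("no aligned pairs supplied")
    # Background frequency of each PB among the aligned positions.
    background = Counter()
    for (a, _), count in pair_counts.items():
        background[a] += count
    scores = {}
    for a in alphabet:
        for b in alphabet:
            observed = pair_counts[(a, b)] / total
            expected = (background[a] / total) * (background[b] / total)
            scores[(a, b)] = math.log2(observed / expected) if observed > 0 else None
    return scores
```

With the full 16-letter alphabet the result has 16 x 16 entries; positive values mark substitutions seen more often than expected by chance.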
12PB substitution matrix
Tyagi M, Venkataraman SG, Srinivasan N, de
Brevern AG, Offmann B. A substitution matrix for
structural alphabet based on structural alignment
of homologous proteins and its applications.
Proteins, in press.
13Potential applications of PB substitution table
- Use of classical sequence alignment methods (dynamic
programming, DP) to align and compare two PB sequences
and identify structurally equivalent and
non-equivalent regions
- Extension of the above to structure mining in large
databases
- Combination of the substitution table with rigid body
superimposition to identify conformational
changes or rigid body displacement in homologous
proteins
- Extension of the above approach to study the
active/inactive forms of enzymes
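The first application, dynamic programming on PB sequences, can be illustrated with a minimal global (Needleman-Wunsch style) alignment. The linear gap penalty and the `toy_score` function are assumptions made for this sketch; in practice the 16 x 16 PB substitution matrix would supply the scores.

```python
def toy_score(a, b):
    """Hypothetical stand-in for the PB substitution matrix."""
    return 1.0 if a == b else -1.0

def global_align(seq_a, seq_b, score=toy_score, gap=-2.0):
    """Global alignment of two PB sequences with a linear gap penalty.
    Returns (total_score, aligned_a, aligned_b)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(dp[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]):
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return dp[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))
```

A local (Smith-Waterman style) variant would differ mainly in clamping cell values at zero and starting the traceback from the highest-scoring cell.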
14Structure comparison
3chy
2fox
?
15Structure comparison
3chy_ KBCCDDDDFBF--KLMMMMMMMMMMNOPABDCDDFBFKLMMMMMMNGOIABDCDDFBDGHI
2fox_ ---DDDDFKOMMMMMMMMMMMMMMMNOPACDDDD--FKLPCFKLMMM-PFBDCDDDDDEHJ
3chy_ LM--LMMMMMMMMPMKLMMPCCDDDDFBDCFKL-----MM-MMM-M-NOPA--BDCDDDD-
2fox_ LPACFKLMMMMMMMMMMMMGHIACDDEHIAFKLNOMMMMMMMMMMMMNOPACFBACDDDEH
3chy_ -DFKLMMMMMMMMMMMNO
2fox_ JACKLMMMMMMMMMMMMM
16Contd
- Metallohydrolase superfamily members 1qh5a and
1smla of equivalent length
17Contd
- 1bnka and 1fmtb, from the all-beta class,
FMT C-terminal domain-like superfamily
18Structure mining
Score based ranking
- Success rates of 98.2% and 89% in finding the true
class and fold, respectively, within the top 10 hits
19Structure mining
Based on 7259 x 7259 pairwise PB alignments
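A "within top 10 hits" success rate of this kind can be computed as sketched below. The data layout (a true label per query plus a list of scored hits) and the function names are assumptions for illustration, not the evaluation code used in the study.

```python
def rank_hits(scored_hits):
    """Labels of (label, score) hits, sorted from best to worst score."""
    return [label for label, _ in sorted(scored_hits, key=lambda hit: hit[1], reverse=True)]

def top_k_success_rate(queries, k=10):
    """Fraction of queries whose true label (e.g. SCOP class or fold)
    appears among the k best-scoring hits.

    queries: list of (true_label, [(hit_label, alignment_score), ...])."""
    if not queries:
        raise ValueError("no queries supplied")
    successes = sum(
        1 for true_label, scored_hits in queries
        if true_label in rank_hits(scored_hits)[:k]
    )
    return successes / len(queries)
```

Here each query's hit list would come from its pairwise PB alignments against the rest of the database, with the query itself excluded.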
20Protein Block Expert (PBE)
- Pairwise PB sequence alignment to compare two
protein structures
- Mining of structurally similar proteins from the SCOP
database (95% sequence identity set)
- Preprocessed pairwise PB alignments at family
and superfamily level
- Provides both local and global alignment
algorithm flavors
- Available at http://bioinformatics.univ-reunion.fr/PBE/
Tyagi M, Sharma P, Swamy CS, Cadet F, Srinivasan
N, de Brevern AG, Offmann B. Protein Block Expert
(PBE): a web-based protein structure analysis
server using a structural alphabet. Nucl Acids
Res, in press.
21Identification of conformational changes
- Rigid body superimposition methods, e.g. STAMP,
report a sequence alignment based on structurally
equivalent and variable regions
- A cutoff/threshold on the residue-residue RMSD is
used to define variable regions
- Are these high RMSD values due to a difference in
conformation or to rigid body displacement of
equivalent regions?
- We don't know from a simple rigid body structure
alignment
22Rigid body superimposition
Two proteins -> rigid body superimposition -> corresponding sequence alignment -> PB alignment
Study substitution scores in variable regions:
+ve scores -> rigid body displacement
-ve scores -> conformation change
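The decision step can be sketched as a small function that scores the PB pairs inside one variable region. Summing the scores over the region, skipping gapped positions, and using zero as the cutoff are assumptions of this sketch, and `toy_score` is a hypothetical stand-in for the PB substitution matrix. The reasoning it encodes is that similar local conformations (positive scores) in a high-RMSD region point to a displaced but otherwise unchanged segment.

```python
def toy_score(a, b):
    """Hypothetical stand-in for the PB substitution matrix."""
    return 1.0 if a == b else -1.0

def classify_variable_region(pbs_a, pbs_b, score=toy_score):
    """Label one high-RMSD region from its aligned PB strings.

    Positive total score: local conformations agree, so the deviation is
    attributed to rigid body displacement. Otherwise: conformation change."""
    if len(pbs_a) != len(pbs_b):
        raise ValueError("aligned regions must have the same length")
    total = sum(score(a, b) for a, b in zip(pbs_a, pbs_b)
                if a != "-" and b != "-")
    return "rigid body displacement" if total > 0 else "conformation change"
```

For example, `classify_variable_region("KLMMMM", "KLMMMM")` would be labelled as a rigid body displacement under the toy scores.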
23- Analysis of two distantly related tRNA synthetases,
1eiy and 1set
24- Example of two zinc metalloproteinases, 1hov and
1fbl
25Cyclic AMP dependent protein kinase
27Conclusion
- A simple PB sequence representation of 3D
structure lets us align and compare protein
structures in a simple and intuitive way
- PB alignment identifies structurally
equivalent regions and highlights subtle
differences
- Efficient and fast mining process -> structural
genomics
- The methodology is simple and can be used in other
applications, e.g. realignment of variable regions
from rigid body superimposition alignments
28Thank you