An Efficient Protein-Protein Docking Algorithm:
Physicochemical and Residue Conservation Approach
Yuhua Duan¹, Boojala Reddy² and Yiannis Kaznessis¹,²
¹Department of Chemical Engineering and Materials Science,
²Digital Technology Center, University of Minnesota
Structure panel labels (PDB IDs): 1BRS, 1BRC
Results
Structure panel label (PDB ID): 1ATN
Free Energy and Renormalized Rank
Introduction
  • With some approximation, the free energy change
    can be divided into several terms:
  • ΔG = ΔG_es + ΔG_cav + ΔG_bonding
       = ΔG_coulomb + ΔG_pol + Σ_k σ_k A_k + ΔG_bonding
  • The individual terms can be calculated
    separately.
  • ΔG_coulomb and ΔG_pol are calculated with the
    Generalized Born model with the Debye-Hückel
    approximation.
  • The desolvation term Σ_k σ_k A_k is obtained by
    calculating the solvent-accessible surface area
    A_k of each residue k and its optimized weight
    σ_k.
  • The bonding term ΔG_bonding is expressed in a
    self-consistent Lennard-Jones form, with
    parameters (ε, σ) taken from the AMBER and
    CHARMM force fields.
  • The normalized rank is obtained from the value
    span of each descriptor.
  • The global rank is a weighted combination of the
    normalized ranks, with weights obtained from
    correlation-coefficient calculations.
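
The normalized and global ranking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes min-max scaling by each descriptor's value span, that lower values are better, and a weighted sum with weights supplied by the caller. The function names, descriptor names, weights and numbers are all hypothetical.

```python
from typing import Dict, List


def normalize_by_span(values: List[float]) -> List[float]:
    """Scale one descriptor to [0, 1] by its value span (max - min).

    Assumption: min-max scaling. The text only states that the normalized
    rank is obtained from the value span of each descriptor.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All models share the same value, so the descriptor cannot rank them.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def global_rank_scores(descriptors: Dict[str, List[float]],
                       weights: Dict[str, float]) -> List[float]:
    """Weighted sum of the normalized descriptors for each docked model."""
    names = list(descriptors)
    n_models = len(descriptors[names[0]])
    normalized = {name: normalize_by_span(descriptors[name]) for name in names}
    return [sum(weights[name] * normalized[name][i] for name in names)
            for i in range(n_models)]


# Hypothetical values for three docked models (lower is better).
descriptors = {
    "free_energy": [-50.0, -30.0, -10.0],
    "shape_rank": [1.0, 3.0, 2.0],
}
weights = {"free_energy": 0.75, "shape_rank": 0.25}  # hypothetical weights

scores = global_rank_scores(descriptors, weights)
# Indices of the models sorted from best (lowest score) to worst.
order = sorted(range(len(scores)), key=scores.__getitem__)
```

In the poster the weights come from correlation-coefficient calculations; here they are simply passed in, since that calculation is not described in enough detail to reproduce.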

Structure panel labels (PDB IDs): 1KXQ, 1MEL
We have employed docking calculations and
atomistic simulations to determine the structure
and the binding affinity of protein-protein
complexes. By exploring the interaction
interface, we find that conservation information
can improve the docking rank. Here we present our
docking scheme and apply it to a benchmark of 59
complexes.
Structure panel labels (PDB IDs): 1FIN, 1SPB, 1STF, 1PPE
Docking Procedure and Energy Minimization
Figure 1. The improvement after filtering. The
results are: (1) 48 out of 60 complexes have
I_fact > 1; (2) 5 complexes (1AVW, 1BQL, 1EFU,
1FIN, 1GOT) have I_fact = 1 because FTDock did
not generate hits to begin with; (3) for 3
complexes (1FSS, 1IGC, 1MAH) our filters worsen
the results, with 1 > I_fact > 0 after filtering;
(4) for 4 complexes (1EO8, 1L0Y, 1NCA, 1QFU) our
combined filter failed, with I_fact = 0.0, since
all of the near-native structures (2, 1, 7 and 5,
respectively) were filtered out. After applying
only Filter II to these 7 complexes, the I_fact
values improved.
Structure panel labels (PDB IDs): 2SIC, 4HTC, 2MTA
  • We studied the 59 benchmark complexes suggested
    by Chen et al. [1].
  • For each protein complex, we ran docking
    calculations with the FTDock package [2,3] to
    generate 10,000 possible complexes, and obtained
    the shape-complementarity rank and the
    pair-potential rank.
  • For each possible complex, we used CHARMM
    molecular mechanics simulations [4] to minimize
    the side-chain structure, and obtained an
    estimate of the free energy of the generated
    complex.
  • We applied the residue conservation filter to
    reduce the number of possible structures to a
    small number.
  • We used the normalized ranking scheme over all
    descriptors to obtain a global rank for the
    subset of complexes [5,6].

Figure 3. Selected structures from our global
predictions. Red and blue indicate the
experimental co-crystal. Green and purple
indicate the best prediction with a rank below
10, as determined by equation (12). For 1FIN, the
bound-bound (green and purple) and
unbound-unbound (black and white blue) results
are shown in the same figure.
Conclusion
  • We described the considerable improvement in the
    ranking of FTDock-generated model complexes
    obtained with the residue conservation filter.
    Using conservation information, we significantly
    reduce the number of docking solutions (Table 2).
  • We also achieve a ranking improvement for
    low-RMSD structures with our renormalized ranking
    scheme. Because we determine residue conservation
    in functionally interacting natural proteins,
    such as enzyme-inhibitor complexes, we need to
    give higher ranks to models with a higher number
    of conserved positions in the interface region.
    In unnatural interactions such as
    antigen-antibody complexes, the interacting
    regions are highly variable, and we need to give
    higher ranks to models with low numbers of
    conserved positions (Table 1).
  • Our algorithms can be adopted with our docking
    software to improve the rank.

Our Filters
Residue Conservation
  • Filter I: Using homologous sequences, we
    calculated conservation indices for each docked
    model. We identified the top 8% (defined as
    group 1) and top 17% (defined as group 2) of
    highly conserved and well-exposed surface
    residues in each polypeptide chain of the
    interacting complex. We counted the total number
    of group 1 and group 2 positions in the interface
    region of each modeled complex. Using the group 1
    and group 2 conserved positions as a filter
    reduces the total number of docked models. For
    enzyme-inhibitor model complexes, we selected
    only the models that have at least 4 group 1
    positions or 6 group 2 positions in the interface
    region. For antigen-antibody complexes, we
    reversed the selection, keeping only models with
    2 or fewer group 1 positions and 4 or fewer
    group 2 positions (see Table 1). The conservation
    indices of all residues are summed, and the total
    is used as the conservation rank.
  • Filter II: If the rank of a complex is worse than
    1,200 in any of the four rankings (conservation
    rank, shape complementarity, pair potential, and
    desolvation energy), the corresponding model is
    filtered out of the set of putative near-native
    structures. Filter II is performed with only
    three ranks if conservation information is not
    available.
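
The two filters can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' software: the thresholds (4, 6, 2, 4 and 1,200) come from the text above, while the `DockedModel` record, its field names, the descriptor keys and the example models are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

RANK_CUTOFF = 1200  # Filter II threshold stated in the text


@dataclass
class DockedModel:
    """Hypothetical record for one docked model (not from the original work)."""
    name: str
    group1_in_interface: int  # group 1 conserved positions in the interface
    group2_in_interface: int  # group 2 conserved positions in the interface
    ranks: Dict[str, int]     # rank per descriptor, 1 = best


def passes_filter_1(model: DockedModel, complex_type: str) -> bool:
    """Filter I: counts of conserved positions in the interface region."""
    if complex_type == "enzyme-inhibitor":
        return model.group1_in_interface >= 4 or model.group2_in_interface >= 6
    if complex_type == "antigen-antibody":
        # Reversed selection: these interfaces are expected to be variable.
        return model.group1_in_interface <= 2 and model.group2_in_interface <= 4
    raise ValueError(f"unknown complex type: {complex_type}")


def passes_filter_2(model: DockedModel, cutoff: int = RANK_CUTOFF) -> bool:
    """Filter II: reject a model ranked worse than the cutoff in any ranking.

    If conservation information is unavailable, `ranks` simply holds the
    three remaining rankings.
    """
    return all(rank <= cutoff for rank in model.ranks.values())


def apply_filters(models: List[DockedModel], complex_type: str) -> List[DockedModel]:
    """Keep only the models that pass both filters."""
    return [m for m in models
            if passes_filter_1(m, complex_type) and passes_filter_2(m)]


# Hypothetical models for an enzyme-inhibitor complex.
models = [
    DockedModel("m1", 5, 3, {"conservation": 10, "shape": 200,
                             "pair_potential": 50, "desolvation": 900}),
    DockedModel("m2", 1, 2, {"conservation": 20, "shape": 100,
                             "pair_potential": 40, "desolvation": 300}),
    DockedModel("m3", 4, 7, {"conservation": 30, "shape": 1500,
                             "pair_potential": 60, "desolvation": 100}),
]
kept = [m.name for m in apply_filters(models, "enzyme-inhibitor")]
```

In this example only `m1` survives: `m2` has too few conserved interface positions for Filter I, and `m3` is ranked worse than 1,200 in shape complementarity, so Filter II removes it.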

Homologous Sequences
Using the FASTA3 sequence similarity search tool,
we obtained homologous sequences from an
annotated non-redundant protein sequence database
(SWALL). Homologous sequences with less than 30%
gaps in the sequence and greater than 35%
sequence identity to the parent sequence were
used for analysis.
Evolutionary Distance
The evolutionary distance among the sequences is
calculated using the structure-based amino acid
substitution matrix [7]. A similarity score S_ii
for sequence i is calculated by summing the
identical-substitution values of the residues a
and b from the substitution matrix M(a,b). An
evolutionary distance (ED_ij) between two
sequences is then calculated.
Conservation Index of Residue Position
As described above, evolutionary distances
between the reference sequence and its homologues
were used to calculate a residue conservation
index (CI_l) for each position l using the amino
acid substitution matrix, similar to the amino
acid conservation measure used by Valdar and
Thornton [8]. The conservation index (CI_l) is a
weighted sum of all pairwise similarities between
all residues present at the position. The CI_l
value is calculated for a given alignment and
takes a value in the range 0.0 to 1.0.
Current and Future Work
  • Combine our filters with the docking model
    generator to get more hits within the modeled
    structures.
  • By dissecting the structures of known
    repressor-operator (TetR/TetO) complexes, we use
    computationally efficient simulations to
    calculate the binding affinity of
    repressor-operator complexes and identify the
    protein residues that play a central role in
    binding and are amenable to mutation.

References
  1. Chen, R., Mintseris, J., Janin, J., and Weng, Z.,
     Proteins 52, 88-91 (2003)
  2. Gabb, H.A., Jackson, R.M., and Sternberg, M.J.,
     J. Mol. Biol. 272, 109-120 (1997)
  3. Moont, G., Gabb, H.A., and Sternberg, M.J.,
     Proteins 35, 364-373 (1999)
  4. Brooks, B.R., Bruccoleri, R.E., Olafson, B.D.,
     States, D.J., Swaminathan, S., and Karplus, M.,
     J. Comp. Chem. 4, 187-217 (1983)
  5. Reddy, B.V.B. and Kaznessis, Y., submitted to
     Protein Engineering, 2004
  6. Duan, Y., Reddy, B.V.B. and Kaznessis, Y.,
     Protein Science, in press, 2004
  7. Gonnet, G.H., Cohen, M.A., and Benner, S.A.,
     Science 256, 1443-1445 (1992)
  8. Valdar, W.S., and Thornton, J.M., Proteins 42,
     108-124 (2001)

In the conservation index, N is the number of
homologous sequences in the alignment; s_i(l) and
s_j(l) are the amino acids at alignment position
l of sequences s_i and s_j, respectively; and
ED(s_i) and ED(s_j) are the average evolutionary
distances of s_i and s_j from the remaining
homologues. Mut(a,b) measures the similarity
between amino acids a and b, as derived from the
amino acid substitution matrix M(a,b).
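
The weighted sum of pairwise similarities behind CI_l can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes a sum-of-pairs form in the style of Valdar and Thornton, with each sequence weighted by its average evolutionary distance and the sum divided by the total pair weight so that the result stays in the range 0 to 1. It also assumes Mut(a,b) is a min-max normalization of M(a,b), and for simplicity takes the minimum and maximum over the whole matrix. The toy matrix and weights are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Matrix = Dict[Tuple[str, str], float]


def mut(a: str, b: str, matrix: Matrix) -> float:
    """Min-max normalized similarity in [0, 1] (assumed form of Mut)."""
    low, high = min(matrix.values()), max(matrix.values())
    return (matrix[(a, b)] - low) / (high - low)


def conservation_index(column: List[str],
                       weights: List[float],
                       matrix: Matrix) -> float:
    """Weighted sum-of-pairs conservation for one alignment column.

    column[i] is the amino acid of sequence i at the position, and
    weights[i] stands in for its average evolutionary distance ED(s_i).
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(column)), 2):
        pair_weight = weights[i] * weights[j]
        numerator += pair_weight * mut(column[i], column[j], matrix)
        denominator += pair_weight
    return numerator / denominator


# Hypothetical two-letter substitution matrix (not a real scoring matrix).
toy_matrix: Matrix = {
    ("A", "A"): 2.0, ("G", "G"): 2.0,
    ("A", "G"): 0.0, ("G", "A"): 0.0,
}

ci_mixed = conservation_index(["A", "A", "G"], [1.0, 1.0, 2.0], toy_matrix)
ci_conserved = conservation_index(["A", "A", "A"], [1.0, 1.0, 2.0], toy_matrix)
```

With these toy inputs a fully conserved column scores 1.0, while the mixed column scores lower because the divergent, heavily weighted sequence disagrees with the other two.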
Here a and b are the pair of amino acids at a
given alignment position l. M(a,b)low is the
lowest value in the substitution matrix, and
M(a,b)max is the maximum value among all the
possible substitution pairs at that position.
Thus Mut(a,b) takes a value in the range 0 to 1.
We have identified the top 8% and 17% of highly
conserved residues that have solvent
accessibility greater than 25% of their total
surface area.
Acknowledgements
This work is partially supported by the Army High
Performance Computing Research Center (AHPCRC)
under the auspices of the Department of the Army,
Army Research Laboratory (ARL) under contract
number DAAD19-01-2-0014. We also thank the
University of Minnesota Digital Technology Center
for support.
Figure 2. Global ranking for 18 complexes. RMSD
versus rank score (eq. (12)) for the decoys
generated by FTDock that passed our filters.