Title: Performance of RosettaDock in CAPRI Rounds 3-5
1Performance of RosettaDock in CAPRI Rounds 3-5
Michael Daily, Arvind Sivasubramanian, David Masica, Sony Somarouthu, Lian Guo, Jeffrey J. Gray
Second CAPRI Evaluation Meeting
Gaeta, Italy, 8 December 2004
PROGRAM IN
Molecular Biophysics
2Biomolecular Nanoscale Modeling Lab
Prof. Jeffrey J. Gray, JHU ChemBE / PMCB
Protein-Surface Interactions
Allostery / Intramolecular Signal Transduction
Protein-Protein Docking
Therapeutic Antibodies
Genome-Scale Docking Predictions
3Docking Algorithm Overview
Random Start Position
Low-Resolution Monte Carlo Search
High-Resolution Refinement
10^5
Clustering
Predictions
4IBM BladeCenter Supercomputing Facility
60 CPUs
0.5 TB storage
1.5 GB RAM/node
1 GB network
Capable of producing 10^5 protein structures/day
5Low-Resolution Search
- Monte Carlo Search
- Rigid body translations and rotations
- Residue-scale interaction potentials
- Protein representation: backbone atoms + average centroids
- Mimics physical diffusion process
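The search described on this slide can be sketched as a Metropolis Monte Carlo walk over rigid-body degrees of freedom. This is a minimal illustration only, not the RosettaDock implementation: the 2D pose (two translations plus one rotation), the `toy_score` function, the step sizes, and `kT` are all assumptions chosen for the example.

```python
import math
import random


def metropolis_accept(delta, kT, rng):
    """Metropolis criterion: always accept downhill moves, sometimes accept uphill ones."""
    return delta <= 0 or rng.random() < math.exp(-delta / kT)


def toy_score(pose):
    """Hypothetical stand-in for a residue-scale score; its minimum is at (0, 0, 0)."""
    x, y, theta = pose
    return x * x + y * y + (1.0 - math.cos(theta))


def low_res_search(start, steps=2000, trans_step=0.5, rot_step=0.2, kT=0.1, seed=0):
    """Random rigid-body perturbations accepted or rejected by the Metropolis criterion."""
    rng = random.Random(seed)
    pose, score = start, toy_score(start)
    best_pose, best_score = pose, score
    for _ in range(steps):
        # Perturb the rigid-body pose: small random translation and rotation.
        trial = (
            pose[0] + rng.gauss(0.0, trans_step),
            pose[1] + rng.gauss(0.0, trans_step),
            pose[2] + rng.gauss(0.0, rot_step),
        )
        trial_score = toy_score(trial)
        if metropolis_accept(trial_score - score, kT, rng):
            pose, score = trial, trial_score
            if score < best_score:
                best_pose, best_score = pose, score
    return best_pose, best_score
```

Starting from a random position, `low_res_search((5.0, -3.0, 2.0))` returns the lowest-scoring pose visited. Occasional uphill acceptance is what lets the walk escape shallow local minima.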
6Low-Resolution Decoy
High-Resolution Refinement
Small Rigid-Body Move
Repack Side Chains
Rigid-Body Minimization
Monte Carlo Accept?
Filter
Reject
50x
Clustering
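The refinement flowchart above (small rigid-body move, repack side chains, rigid-body minimization, Monte Carlo accept or reject, repeated 50 times) can be sketched as a loop. This is a toy under stated assumptions: the one-dimensional `offset`, the three-state `ROTAMERS` list, the `score` function, and `kT` are hypothetical stand-ins, not the RosettaDock energy function or move set.

```python
import math
import random

ROTAMERS = [-1.0, 0.0, 1.0]  # hypothetical discrete side-chain states


def score(offset, rotamer):
    """Toy full-atom score: a rigid-body term plus a side-chain coupling term."""
    return (offset - 0.3) ** 2 + (ROTAMERS[rotamer] - offset) ** 2


def repack(offset):
    """Pick the best rotamer for the current rigid-body position."""
    return min(range(len(ROTAMERS)), key=lambda r: score(offset, r))


def minimize(offset, rotamer, step=0.1, iters=50):
    """Crude gradient descent on the rigid-body coordinate with the rotamer fixed."""
    for _ in range(iters):
        grad = 2.0 * (offset - 0.3) - 2.0 * (ROTAMERS[rotamer] - offset)
        offset -= step * grad
    return offset


def refine(offset, cycles=50, kT=0.5, seed=0):
    """Move, repack, minimize, then apply the Metropolis test; repeat `cycles` times."""
    rng = random.Random(seed)
    rot = repack(offset)
    current = (offset, rot, score(offset, rot))
    best = current
    for _ in range(cycles):
        trial_off = current[0] + rng.gauss(0.0, 0.2)   # small rigid-body move
        trial_rot = repack(trial_off)                  # repack side chains
        trial_off = minimize(trial_off, trial_rot)     # rigid-body minimization
        trial_score = score(trial_off, trial_rot)
        delta = trial_score - current[2]
        if delta <= 0 or rng.random() < math.exp(-delta / kT):  # Monte Carlo accept?
            current = (trial_off, trial_rot, trial_score)
            if trial_score < best[2]:
                best = current
    return best
```

Calling `refine(2.0)` returns a tuple of (offset, rotamer index, score) for the lowest-scoring state visited. The acceptance test is applied to the minimized score, so each cycle compares local minima rather than raw perturbed poses.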
7Full-Atom scoring
8Clustering
- Compare all top-scoring decoys pairwise
- Cluster decoys hierarchically
- Decoys within 2.5 Å form a cluster
Represents ENTROPY
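The clustering step above can be sketched as follows. This is a minimal example under assumptions not stated on the slide: decoys are compared by coordinate RMSD in a shared frame without superposition, and single linkage is used (any pair within the cutoff is merged), which corresponds to cutting a single-linkage dendrogram at 2.5 Å. The linkage actually used by RosettaDock is not given here.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two equal-length coordinate lists in a shared frame (no superposition)."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def cluster_decoys(decoys, cutoff=2.5):
    """Single-linkage clustering: merge any two decoys whose RMSD is within the cutoff."""
    parent = list(range(len(decoys)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare all decoys pairwise.
    for i, j in combinations(range(len(decoys)), 2):
        if rmsd(decoys[i], decoys[j]) <= cutoff:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(decoys)):
        clusters.setdefault(find(i), []).append(i)
    # Largest cluster first: cluster size serves as a rough proxy for the entropy of a binding mode.
    return sorted(clusters.values(), key=len, reverse=True)
```

Ranking by cluster size rather than by score alone favors broad energy wells, which is the sense in which cluster size stands in for entropy.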
9Target 15: colicin D / immD, model 7
- 88% contacts, rmsd 0.55 Å, interface rmsd 0.24 Å
Blue: colicin D
Red: predicted immD; Green: actual immD
Colicin H611 constrained at interface
Bound-bound, no SC
10Docking a Homology Model
- CAPRI T11/12: Cohesin / Dockerin
- Model 6 (T11): 42% contacts, 6.1 Å rmsd, 1.9 Å interface rmsd
- Dockerin coordinates modeled by homology via the Robetta server
Prediction using bound coordinates of dockerin
Prediction using 52% homology model of dockerin (1DAQ, NMR structure)
Red: predicted dockerin; Green: experimental dockerin
Blue: cohesin
Prediction by Mike Daily. Methods in Gray et al. 2003 JMB
Xtal by Romao, Carvalho, Fontes et al., Lisbon
11Target 19: prion / Fab, model 2
- 64% contacts, rmsd 3.64 Å, interface rmsd 1.27 Å
Red: predicted prion; Green: actual prion
Blue: Fab
Prion constructed manually from a 95% identical homologue
12Target 8: Laminin / Nidogen, model 2
- 53% contacts, rmsd 4.6 Å, interface rmsd 0.66 Å
Blue: Nidogen
Red: predicted laminin; Green: actual laminin
Laminin D800, N802, V804 constrained near interface
13Target 17: GH11 xylanase / XIP, model 5
Narrow active site groove difficult to penetrate with MC search
T17 XIP / xylanase, predicted / xtal: 7% contacts, 12.91 Å L_rmsd, 8.78 Å I_rmsd
Xylanase built from 60% identical xylanase
Active site thumb constrained near interface
14Target 13: SAG1 / Fab, model 12
- Not submitted
- Small cluster
- Misinterpreted epitope, discarded this structure
Blue: Fab
Red: predicted SAG1; Green: actual SAG1
15Target 18: GH11 xylanase / TAXI
Large target, small barrier to penetrating active site
T18 TAXI / xylanase: not predicted!
16Target 16: GH10 xylanase / XIP, model 7
- 14% contacts, rmsd 8.13 Å, interface rmsd 11.64 Å
Red: predicted XIP; Green: actual XIP
Blue: xylanase
A notable performance for a 575-residue target
Xylanase built from 67% identical xylanase
Xylanase active site constrained near interface
17Target 14: PP1 / MYPT1
Extended interface is difficult to predict with MC search
Blue: PP1
Green: MYPT1
18RosettaDock correctly predicts binding sites in
6/10 non-difficult targets
Legend: standard targets; homology targets; not submitted; NP = not predicted
19RosettaDock can predict small targets to pinpoint
accuracy
T12 cohesin / dockerin, predicted / xtal: 87% contacts, 0.99 Å L_rmsd, 0.51 Å I_rmsd
T15 colicin D / immD, predicted / xtal: 88% contacts, 0.55 Å L_rmsd, 0.24 Å I_rmsd
20Rosetta integrates comparative modeling and
docking in four targets
- Robetta produces adequate homology models
- RosettaDock can predict complexes from approximate monomers
- First step toward integrated folding/docking package
Figure panels (predicted / xtal) labeled 50%, 67%, 60%, 95%
21Large targets require a better low-resolution
search
T08 top model with full laminin
T13 top model with full SAG1
22Which model is correct? Could we reject false models before refinement?
T15 colicin D / immD: correct model (0.55 Å, model 7); other colors: 9 other immD models
23Typical requirements and assumptions of docking
predictions
- Monomer structures known, or known homolog (>50-80%?)
- Strong binding (Kd < µM)
- Small proteins (1-2 domains per partner)
- Little significant backbone movement
- Some experimental data are helpful
24RosettaDock past successes and future
improvements
- Past successes
  - High resolution accuracy (> 50% contacts) on many small targets
  - Can work with approximate starting structures, e.g. homology models
- Future improvements
  - Improve low-resolution search to deal with large complexes
  - Improve high-res scoring -> discrimination between true and false models
  - Flexible backbones
25An Introductory Flexible Docking Problem
CAPRI T01: HPr / HPr Kinase (Round 1, Sep 2001)
HPr
Kinase I
C-terminal helix α4
Terminal helix swings upon docking, nuzzling HPr
in a pocket
26Flexible Docking Results
With torsion angle perturbations and explicit minimizations
Plots: score vs. HPr rmsd; score vs. helix rmsd
18/36 contacts, translation 1.8 Å, rotation 18°
L. Guo
27Flexible loop docking
Loop model vs native
Docked hirudin
Decoy 119: score -232.02, RMSD 0.97 Å (hirudin)
28Acknowledgments
CAPRI Organizers and Evaluators
Crystallographers and NMR Spectroscopists
Experimentalists (for biological information)
NIH K01-HG02316
JHU, IBM, Gigatrend
graylab.jhu.edu
29(END)
30Low-Resolution Search
- Monte Carlo Search
- Rigid body translations and rotations
- Residue-scale interaction potentials
- Protein representation: backbone atoms + average centroids
- Mimics physical diffusion process
31Residue-scale scoring
32Low-Resolution Decoy
High-Resolution Refinement
Small Rigid-Body Move
Repack Side Chains
Rigid-Body Minimization
- Simultaneous rigid-body and side-chain
refinement
Monte Carlo Accept?
Filter
Reject
50x
Clustering
33Side Chain Packing
- Build amino acid side chains
- Choose side chains from Dunbrack's backbone-dependent rotamer library
- Vary χ1, χ2, χ3, χ4 angles
- Minimize a full-atom energy function w.r.t. all
rotamer combinations
Phenylalanine rotamers (Richardson, 2000)
(Brian Kuhlman and David Baker, Nature Struct. Biol. 2001)
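Minimizing an energy over rotamer combinations, as described on this slide, can be illustrated with a tiny exhaustive search. This is a sketch only: the two residue positions, the χ1 values in `ROTAMERS`, and the `SELF` and `pair_energy` terms are invented for the example, and real packers use stochastic or dead-end-elimination searches because exhaustive enumeration grows exponentially with the number of residues.

```python
from itertools import product

# Hypothetical chi1 choices (degrees) for two residue positions "A" and "B".
ROTAMERS = {"A": [-60, 60, 180], "B": [-60, 180]}

# Hypothetical one-body energies for each (position, rotamer) pair.
SELF = {
    ("A", -60): 0.0, ("A", 60): 1.0, ("A", 180): 0.5,
    ("B", -60): 0.0, ("B", 180): 0.8,
}


def pair_energy(chi_a, chi_b):
    """Toy two-body term: the two side chains clash when both adopt chi1 = -60."""
    return 5.0 if chi_a == chi_b == -60 else 0.0


def total_energy(assignment):
    """Sum of one-body terms plus all pairwise terms for a rotamer assignment."""
    names = sorted(assignment)
    energy = sum(SELF[(n, assignment[n])] for n in names)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            energy += pair_energy(assignment[a], assignment[b])
    return energy


def pack(rotamers):
    """Enumerate every rotamer combination and keep the lowest-energy one."""
    names = sorted(rotamers)
    best = None
    for combo in product(*(rotamers[n] for n in names)):
        assignment = dict(zip(names, combo))
        energy = total_energy(assignment)
        if best is None or energy < best[1]:
            best = (assignment, energy)
    return best
```

In this toy, each residue's individually best rotamer (-60 for both) is not the best combination because of the pairwise clash, which is why the search has to be over combinations rather than per residue.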
34Minimization
- Full atom rigid-body minimization
- Use a conjugate-gradient search to find the local
score minimum relative to a rigid body
translation and rotation
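A conjugate-gradient search over rigid-body coordinates, as described above, might look like the following pure-Python sketch. Everything specific here is an assumption for illustration: the 2D pose (two translations and one rotation), the `toy_score` function, the numerical gradient, the Polak-Ribière update with restarts, and the backtracking line search. It is not the minimizer used in RosettaDock.

```python
import math


def toy_score(pose):
    """Hypothetical smooth score with its minimum at tx=1, ty=-2, theta=0.5."""
    tx, ty, theta = pose
    return (tx - 1.0) ** 2 + (ty + 2.0) ** 2 + (1.0 - math.cos(theta - 0.5))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


def line_search(f, x, d, g, step=1.0, shrink=0.5, c=1e-4):
    """Backtracking line search along d until the Armijo decrease condition holds."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(g, d))
    while step > 1e-12:
        trial = [xi + step * di for xi, di in zip(x, d)]
        if f(trial) <= fx + c * step * slope:
            return trial
        step *= shrink
    return x


def conjugate_gradient(f, x0, iters=200, tol=1e-8):
    """Nonlinear conjugate gradient (Polak-Ribiere, restarted when d is not a descent direction)."""
    x = list(x0)
    g = numerical_gradient(f, x)
    d = [-gi for gi in g]
    for _ in range(iters):
        if sum(gi * gi for gi in g) < tol:
            break
        x = line_search(f, x, d, g)
        g_new = numerical_gradient(f, x)
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g))
                   / sum(go * go for go in g))
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        if sum(gn * di for gn, di in zip(g_new, d)) >= 0:
            d = [-gn for gn in g_new]  # restart with steepest descent
        g = g_new
    return x
```

For example, `conjugate_gradient(toy_score, [0.0, 0.0, 0.0])` moves the pose to the nearby local minimum. Like any gradient method, it finds only the local minimum of the starting basin, which is why it is paired with Monte Carlo moves in the refinement cycle.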
35Refinement Cycle
- Simultaneous rigid-body displacement and side
chain minimization
36Full-Atom scoring
37Scoring Weights
38Hydrogen Bonding Energy
- Based on statistics from high-resolution
structures in the Protein Data Bank (rcsb.org)
39Score correlates with Binding Energy
Filled symbols: targets with funnels
Open symbols: targets without funnels
? score for bound backbone docking
40Clustering
- Compare all top-scoring decoys pairwise
- Cluster decoys hierarchically
- Decoys within 2.5 Å form a cluster
Represents ENTROPY