Title: Lessons from CASP targets
2 Lessons from CASP targets
ShuoYong Shi, Lisa Kinch, Jimin Pei, Ruslan
Sadreyev, and Nick V. Grishin
http://prodata.swmed.edu/CASP8
Howard Hughes Medical Institute, Department of
Biochemistry, University of Texas Southwestern
Medical Center at Dallas
3
1. A few known folds are predicted no better than new folds: 460
2. Short motif recognition success: 465
3. Short motif recognition failure: 467
4. Structural changes not predicted: 510
5. Inspect your alignments carefully: 480
4 Known folds: some are predicted no better than new ones!
Example 1: T0460
First models for T0460: Gaussian kernel density
estimation of the GDT-TS scores of the first server
models, plotted at various bandwidths (standard
deviations). The GDT-TS scores are shown as a
spectrum along the horizontal axis; each bar
represents a first server model. The bars are
colored green, gray and black for the top 10, the
bottom 25 and the rest of the servers, respectively.
The family of curves with varying bandwidth is shown.
The bandwidth varies from 0.3 to 8.2 GDT-TS units
with a step of 0.1, which corresponds to the color
ramp from magenta through blue to cyan. The thicker
curves (red, yellow-framed brown and black) correspond
to bandwidths 1, 2 and 4, respectively.
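The family of curves in this caption can be sketched in a few lines of Python. This is only a minimal illustration of Gaussian kernel density estimation over a range of bandwidths, not the code used to produce the slide: the score values are made up, and the function names `gaussian_kde` and `bandwidth_family` are hypothetical.

```python
import math

def gaussian_kde(scores, grid, bandwidth):
    """Density on `grid`: the average of Gaussian kernels with
    standard deviation `bandwidth`, one centred on each score."""
    norm = 1.0 / (len(scores) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in scores)
        for x in grid
    ]

def bandwidth_family(start=0.3, stop=8.2, step=0.1):
    """Bandwidths from start to stop inclusive, indexed by integers
    so that repeated float addition does not drift."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 1) for i in range(n + 1)]

# Hypothetical GDT-TS scores of first server models (illustrative only).
scores = [12.5, 14.0, 15.2, 15.8, 16.1, 18.4, 21.0, 22.3, 30.7]

# Evaluation grid: 0 to 100 GDT-TS units in steps of 0.5.
grid_step = 0.5
grid = [i * grid_step for i in range(201)]

# One density curve per bandwidth, as in the family of curves on the slide.
bandwidths = bandwidth_family()
curves = {h: gaussian_kde(scores, grid, h) for h in bandwidths}

# The three bandwidths drawn as thicker curves in the figure.
highlighted = {h: curves[h] for h in (1.0, 2.0, 4.0)}
```

Small bandwidths give a spiky curve that follows individual models, while large bandwidths smooth the scores into one broad peak; comparing the whole family shows which groupings of server models persist across scales.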
5 T0460: very difficult target
Jumping through 20 NMR models of 2k4n
Cartoon diagram of 460: 2k4n, model 1, residues
1-52,67-10
6 T0460 is homologous to Nqo5
Cartoon diagram of NADH-quinone
oxidoreductase: 2fug, chain 5, residues 1-106
Cartoon diagram of 460: 2k4n, model 1, residues
1-52,67-10
7
1. A few known folds are predicted no better than new folds: 460, 407_2
2. Short motif recognition success: 465
3. Short motif recognition failure: 467
4. Structural changes not predicted: 510
5. Inspect your alignments carefully: 480
8 T0465: who found the template?
HHpred !!!
9 T0465 is a diverged FYSH domain
FYSH domain of hypothetical protein AF0491: 1t95,
chain A, residues 11-94
Cartoon diagram of T0465: 3dfd, chain A, residues
21-136
10 T0465 fold is predicted by HHpred
Cartoon diagram of T0465: 3dfd, chain A, residues
21-136
HHpred2 TS1
11
1. A few known folds are predicted no better than new folds: 460, 407_2
2. Short motif recognition success: 465
3. Short motif recognition failure: 467
4. Structural changes not predicted: 510
5. Inspect your alignments carefully: 480
12 T0467: most interesting target!
Bioinfo.pl provides these predictions
13 T0467: is bioinfo.pl correct?
14 You can say so (if you want)
Sso7d SH3-fold C-terminal fragment: 2bf4, chain
A, residues 30-64
T0467 OB-fold C-terminal fragment: 2k5q, model 1,
residues 64-97
15 However, only the local prediction is correct:
extending it to cover the domain results in a
wrong fold prediction!
T0467 OB-fold: 2k5q, model 1, residues 7-97
Sso7d SH3-fold: 2bf4, chain A
16
1. A few known folds are predicted no better than new folds: 460, 407_2
2. Short motif recognition success: 465
3. Short motif recognition failure: 467
4. Structural changes not predicted: 510
5. Inspect your alignments carefully: 480
17 T0510: server-only target with a twist
Cartoon diagram of the 510 domains (3doa). The N-, middle
and C-domains are shown in blue, green and red,
respectively.
Cartoon diagram of the MutM domains (1ee8_A). The N-,
middle and C-domains are shown in blue, green
and red, respectively.
18 A closer look at the N-domains reveals large
topological differences
N-domain of MutM: 1ee8, chain A, residues 1-121
N-domain of 510: 3doa, residues 1-165
The insertion close to the N-terminus is red.
The insertion in the middle of the domain is blue.
19 N-domains are nevertheless homologous
20
1. New folds: 397_1, 496_1
2. A few known folds are predicted no better than new folds: 460, 407_2
3. Short motif recognition success: 465
4. Short motif recognition failure: 467
5. Structural changes not predicted: 510
6. Inspect your alignments carefully: 480
21 T0480: easy alignment with templates
22 T0480: most predictions had an error
NADH pyrophosphatase intervening domain: 1vk6,
residues 94-127
Ribbon diagram of 480: 2k4x, model 1, residues
17-50. The zinc ion is shown in magenta, and the side
chains of its ligands (four Cys) are displayed.
23 T0480: unusual bulge
Jumping through 20 NMR models of 2k4x
Ribbon diagram of 480: 2k4x, model 1, residues
17-50. The zinc ion is shown in magenta, and the side
chains of its ligands (four Cys) are displayed.
24 T0480: bulge could have been predicted
25 Summary
1. New folds constitute less than 2% of newly solved non-redundant structures.
2. Many known folds cannot be predicted because templates are impossible to find.
3. Globalization of a correct local alignment may or may not yield a correct fold prediction.
4. Large structural changes happen in protein cores.
5. Careful inspection of alignments may solve some modeling problems.
26 Acknowledgement
Our group: Shuoyong Shi, Jing Tong, Ruslan Sadreyev,
Lisa Kinch, Jimin Pei, Ming Tang, Sasha Safronova,
Yuan Qi, Hua Cheng, Jamie Wrabl, Indraneel Majumdar,
Erik Nelson, Yong Wang, S. Sri Krishna, Bong-Hyun Kim,
Dorothee Staber
Collaborators: David Baker (U. Washington), Kimmen
Sjölander (UC Berkeley), William Noble (U. Washington)
HHMI, NIH, UTSW, The Welch Foundation