Title: Modeling the Structures of Proteins and Macromolecular Assemblies
1 Modeling the Structures of Proteins
and Macromolecular Assemblies
Andrej Šali http://salilab.org/
- Depts. of Biopharmaceutical Sciences and Pharmaceutical Chemistry
- California Institute for Quantitative Biomedical Research
- University of California at San Francisco
3/25/03
2
- Yeast and E. coli ribosomes: electron microscopy,
comparative modeling, and structural genomics.
- Yeast nuclear pore complex: low-resolution
modeling of large assemblies, bridging the gaps
between structural biology, proteomics, and
systems biology.
- Comments.
4/6/03
3 S. cerevisiae ribosome
Fitting of comparative models into a 15 Å
cryo-electron density map. 43 proteins could be
modeled on the basis of 20-56% sequence identity
to a known structure. The modeled fraction of the
proteins ranges from 34% to 99%.
C. Spahn, R. Beckmann, N. Eswar, P. Penczek, A.
Sali, G. Blobel, J. Frank. Cell 107, 361-372,
2001.
4 E. coli ribosome
H. Gao, J. Sengupta, M. Valle, A. Korostelev,
N. Eswar, S. Stagg, P. Van Roey, R. Agrawal,
S. Harvey, A. Sali, M. Chapman, and J. Frank. Cell,
in press.
Upon EF-G binding, the ribosome becomes less
compact. In contrast to mRNA, many protein
contacts undergo large conformational changes,
suggesting ribosomal proteins facilitate the
dynamics of translation.
5 MODPIPE: Large-Scale Comparative Protein
Structure Modeling
START
For each target sequence:
    Get profile for sequence (NR) [PSI-BLAST]
    Scan sequence profile against representative PDB chains
    Scan PDB chain profiles against sequence
    Expand match to cover complete domains
    For each template structure:
        Align matched parts of sequence and structure [MODELLER]
        Build model for target segment by satisfaction of spatial restraints
        Evaluate model
END
R. Sánchez & A. Šali, Proc. Natl. Acad. Sci. USA
95, 13597, 1998. N. Eswar, M. Marti-Renom, M.S.
Madhusudhan, B. John, A. Fiser, R. Sánchez, F.
Melo, N. Mirkovic, A. Šali.
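The control flow of the flowchart above can be sketched in Python as follows. This is only an illustrative skeleton, not the MODPIPE code: the function `run_pipeline`, the `Hit` record, and all the step callables (`build_profile`, `scan_templates`, `expand_to_domain`, `align`, `build_model`, `evaluate`) are hypothetical names, and the toy stand-ins at the bottom replace the real PSI-BLAST and MODELLER steps so that the loop structure can be run on its own.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Hit:
    """A matched region between a target and a template chain (hypothetical record)."""
    template_id: str
    start: int  # first matched target residue (0-based)
    end: int    # one past the last matched target residue


def run_pipeline(targets, build_profile, scan_templates, expand_to_domain,
                 align, build_model, evaluate):
    """Outer loop over target sequences, inner loop over template hits."""
    results = []
    for target_id, sequence in targets.items():
        profile = build_profile(sequence)
        for hit in scan_templates(profile):
            hit = expand_to_domain(hit, len(sequence))
            alignment = align(sequence, hit)
            model = build_model(alignment)
            results.append((target_id, hit, evaluate(model)))
    return results


# Toy stand-ins for the real steps (assumptions for illustration only).
def toy_profile(sequence):
    return sequence


def toy_scan(profile):
    # Pretend that only sequences of 8+ residues match a template.
    return [Hit("tmplA", 2, 8)] if len(profile) >= 8 else []


def toy_expand(hit, target_length, pad=2):
    # Widen the match by `pad` residues on each side, clipped to the target.
    return replace(hit, start=max(0, hit.start - pad),
                   end=min(target_length, hit.end + pad))


def toy_align(sequence, hit):
    return sequence[hit.start:hit.end]


def toy_build(alignment):
    return alignment


def toy_evaluate(model):
    return len(model)


demo_results = run_pipeline(
    {"seq1": "MKTAYIAKQRQISFVK", "seq2": "MKT"},
    toy_profile, toy_scan, toy_expand, toy_align, toy_build, toy_evaluate,
)
```

Passing the steps in as callables keeps the two nested loops visible and lets each stage be swapped for a real tool without touching the loop itself.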
6 http://salilab.org/modbase
Pieper et al., Nucl. Acids Res. 2002.
7 Comparative modeling of the TrEMBL database
Unique sequences processed: 733,239
Sequences with fold assignments or models: 415,937 (57%)
4/03/02: 4 weeks on 500 Pentium III CPUs
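The figures above imply a coverage fraction and a rough per-sequence cost, which can be checked with a few lines of Python. The cost estimate assumes that "4 weeks" means 28 days of wall-clock time with all 500 CPUs busy throughout; that is an assumption, not something stated on the slide.

```python
unique_sequences = 733_239    # from the slide
modeled_sequences = 415_937   # sequences with fold assignments or models
cpus = 500
wall_clock_days = 28          # assumption: "4 weeks" = 28 full days

coverage = modeled_sequences / unique_sequences
cpu_hours = cpus * wall_clock_days * 24
cpu_minutes_per_sequence = cpu_hours * 60 / unique_sequences

print(f"coverage: {coverage:.1%}")
print(f"total CPU time: {cpu_hours:,} CPU-hours")
print(f"average cost: {cpu_minutes_per_sequence:.1f} CPU-minutes per sequence")
```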
8 Model Accuracy
Marti-Renom et al. Annu. Rev. Biophys. Biomol. Struct.
29, 291-325, 2000.
[Figure: X-ray structure / model comparisons]
HIGH ACCURACY: NM23, seq. id. 77%
MEDIUM ACCURACY: CRABP, seq. id. 41%
LOW ACCURACY: EDN, seq. id. 33%
9 Future directions
- Make sure we have building blocks (structural
genomics).
- Develop methods for simultaneous fitting of
proteins into EM density and conformational
modeling (induced fit, comparative modeling, ab
initio).
- Need large computing (e.g., cluster of hundreds of
nodes with >1 GB memories).
10 Modeling of the yeast nuclear pore complex by
satisfaction of spatial restraints (MODELLER)
F. Alber, T. Suprapto, J. Kipper, W. Zhang, L.
Veenhoff, S. Dokudovskaya, M. Rout, B. Chait,
A. Šali
- Rockefeller University, New York
- UCSF
11 Modeling macromolecular assemblies by
satisfaction of spatial restraints
- Representation of a system.
- Scoring function (spatial restraints).
- Optimization.
There is nothing but points and restraints on
them.
Sali, Ernst, Glaeser, Baumeister. From words to
literature in structural proteomics. Nature 422,
216-225, 2003.
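The three ingredients listed above (representation, scoring function, optimization) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the MODELLER implementation: each component is a single point, every restraint is a harmonic penalty on a pairwise distance that falls outside a [lower, upper] interval, and optimization is a greedy random walk. The function names `score` and `optimize` and all numerical values are made up for illustration.

```python
import math
import random


def score(points, restraints):
    """Sum of squared violations; zero when every restraint is satisfied."""
    total = 0.0
    for i, j, lower, upper in restraints:
        d = math.dist(points[i], points[j])
        if d < lower:
            total += (lower - d) ** 2
        elif d > upper:
            total += (d - upper) ** 2
    return total


def optimize(points, restraints, steps=5000, max_move=1.0, seed=0):
    """Greedy random walk: move one point at a time, keep moves that do not worsen the score."""
    rng = random.Random(seed)
    points = [tuple(p) for p in points]
    best = score(points, restraints)
    for _ in range(steps):
        k = rng.randrange(len(points))
        trial = list(points)
        trial[k] = tuple(c + rng.uniform(-max_move, max_move) for c in points[k])
        trial_score = score(trial, restraints)
        if trial_score <= best:
            points, best = trial, trial_score
    return points, best


# Representation: four points. Scoring: consecutive pairs must be in contact
# (2 to 4 units apart); all other pairs only need to avoid overlap (>= 2 units).
restraints = [
    (0, 1, 2.0, 4.0), (1, 2, 2.0, 4.0), (2, 3, 2.0, 4.0),
    (0, 2, 2.0, math.inf), (0, 3, 2.0, math.inf), (1, 3, 2.0, math.inf),
]
start_rng = random.Random(1)
start = [tuple(start_rng.uniform(0.0, 20.0) for _ in range(3)) for _ in range(4)]

initial_score = score(start, restraints)
final_points, final_score = optimize(start, restraints)
print(f"score before: {initial_score:.2f}, after: {final_score:.2f}")
```

Because trial moves are accepted only when the score does not increase, the final score can never exceed the initial one. A realistic application would need a more capable optimizer and a richer representation than single points.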
12 Modeling of NPC
- Stoichiometry.
- Parse proteins into domains.
- Protein and sub-complex shapes from Stokes radii.
- Excluded volume of proteins.
- Symmetry of NPC (EM).
- Radial and axial localization of proteins (IEM).
- Protein-protein proximity (immuno-purification).
- Binary protein-protein contacts from overlay experiments.
- Modeling in the context of the nuclear envelope.
- Comparative models for some domains.
- Structural genomics of nucleoporins.
13 Schematic structure of yeast NPC
C.W. Akey, M. Rout
[Figure: top view and side view of the NPC; labels: cytosolic side, nuclear membrane, nuclear side, spoke, half-spoke]
A half-spoke contains 30 nucleoporin proteins
(NUPs). 480 NUPs in NPC.
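The two counts above imply 480 / 30 = 16 half-spokes. One way to generate that many copies from a single half-spoke is sketched below, assuming eightfold rotational symmetry about the pore axis (taken here as z) combined with a twofold axis in the plane of the membrane (taken here as x). The choice of axes and the function name `half_spoke_copies` are illustrative assumptions, not part of the slide.

```python
import math


def half_spoke_copies(point, n_fold=8):
    """Return the symmetry-related images of one point.

    Each image is obtained by an optional twofold rotation about the x axis,
    (x, y, z) -> (x, -y, -z), followed by a rotation of k * 360 / n_fold
    degrees about the z axis.
    """
    x, y, z = point
    copies = []
    for k in range(n_fold):
        angle = 2.0 * math.pi * k / n_fold
        c, s = math.cos(angle), math.sin(angle)
        for px, py, pz in ((x, y, z), (x, -y, -z)):
            copies.append((c * px - s * py, s * px + c * py, pz))
    return copies


nups_per_half_spoke = 30                    # from the slide
copies = half_spoke_copies((10.0, 2.0, 3.0))  # made-up coordinates
total_nups = nups_per_half_spoke * len(copies)
print(len(copies), "half-spokes,", total_nups, "NUPs")
```

Every image keeps the same distance from the pore axis and the same absolute height, which is what makes a single half-spoke sufficient to describe the whole symmetric assembly.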
14 Protein-protein contacts
For each protein pair within a half-spoke, the
upper bound on the center-center distance is the
estimated maximal complex diameter.
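As a sketch of how such restraints might be enumerated, the Python below builds one upper-bound restraint per protein pair in a half-spoke and scores a violation with a harmonic penalty. The harmonic form, the function names, the placeholder protein labels, and the diameter value are all assumptions for illustration.

```python
from itertools import combinations


def pair_upper_bounds(protein_ids, max_diameter):
    """One upper bound on the center-center distance for every unordered pair."""
    return {pair: max_diameter for pair in combinations(protein_ids, 2)}


def upper_bound_penalty(distance, upper):
    """Zero when the distance is within the bound, quadratic in the excess otherwise."""
    excess = max(0.0, distance - upper)
    return excess ** 2


proteins = [f"P{i:02d}" for i in range(1, 31)]            # 30 placeholder labels
bounds = pair_upper_bounds(proteins, max_diameter=300.0)  # made-up value, arbitrary units
print(len(bounds), "pairwise upper-bound restraints")
```

With 30 proteins per half-spoke, this yields 30 × 29 / 2 = 435 pairwise restraints.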
15 Successful optimization
11/18/02
16 NUP84-complex
[Figure panels: Experiment; Model]
17 Structural proteomics aims to characterize
structures of most macromolecular complexes, in
space and time.
On average, a domain may interact with a few
other domains. The function of a complex is
determined by its structure and dynamics. There
are too many complexes to be determined directly
by high-resolution experimental structure
determination. Thus, just like in structural
genomics, an efficient combination of experiment
and computation is required.
18 Structural Proteomics versus Structural Genomics
                          Structural Proteomics   Structural Genomics
Potential targets         not clear               clear
Target selection          not clear               clear
Scope                     not clear               clear
Structure determination   hybrid                  X-ray or NMR
Functional annotation     essential               not a major focus
There are additional sciences and technologies in
structural proteomics, relative to structural
genomics.
19 Structural Genomics
Sali. Nat. Struct. Biol. 5, 1029, 1998. Sali et
al. Nat. Struct. Biol. 7, 986, 2000. Sali. Nat.
Struct. Biol. 7, 484, 2001. Baker & Sali. Science
294, 93, 2001.
Characterize most protein sequences based on
related known structures.
There are 16,000 families at 30% seq. id.
(90%) (Vitkup et al. Nat. Struct. Biol. 8, 559,
2001)
20 Target selection for structural proteomics (scope)
- Comprehensive coverage.
- What targets:
- Stable assemblies? Transient complexes? Pairs of
domains?
- Can they be organized into groups?
- How many such groups are there?
21 Target selection for structural proteomics?
- There are 3,000 folds containing 90% of all
sequences.
- A target: binary domain-domain interface.
- There may be a finite number of domain-domain
interfaces, largely defined by the domain fold
types.
- Given pairwise interactions, a large assembly may
be reconstructed.
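The last point can be illustrated at the level of connectivity only: if each binary interface is treated as an edge between two domains, the domains that can be assembled together are the connected components of the resulting graph. The sketch below shows just this bookkeeping step, with made-up domain labels and a hypothetical function name; it says nothing about the geometry of the assembly, which would still have to come from the interface structures.

```python
from collections import defaultdict


def assemblies_from_pairs(pairs):
    """Group domains into connected components of the pairwise-interaction graph."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited domain.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


pairs = [("A", "B"), ("B", "C"), ("D", "E")]   # made-up binary interfaces
demo_assemblies = assemblies_from_pairs(pairs)
print(demo_assemblies)
```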
22 Protein and assembly structure by experiment and
computation
Sali, Ernst, Glaeser, Baumeister. From words to
literature in structural proteomics. Nature 422,
216-225, 2003.
23 Modeling macromolecular assemblies by
satisfaction of spatial restraints
- Representation of a system.
- Scoring function (spatial restraints).
- Optimization.
There is nothing but points and restraints on
them.
Sali, Ernst, Glaeser, Baumeister. From words to
literature in structural proteomics. Nature 422,
216-225, 2003.
24 Future directions
- Development of general/flexible/hierarchical
representation of assemblies.
- Quantifying spatial information from experiments.
- Optimization of structure.
- Toy models.
- Space, time.
25 Role of NIH
- Structural proteomics is timely and feasible.
- Humongous integration of concepts, sciences,
methods, tools, people.
- Thus, need research centers, but also R01
research.
- Significant computing.
- Bridging the gaps between structural biology,
proteomics, and systems biology.