Elucidating the Protein Folding Core
PARASOL Lab, Department of Computer Science, Texas A&M University, http://parasol.tamu.edu/
The folding core is the portion of the protein's structure that is first to form during folding and last to break during denaturation [Woodward93].
Rigidity analysis determines a structure's rigid and flexible components. It labels each bond in the protein as rigid, independently flexible, or part of a dependently flexible set. It also groups atoms into rigid clusters, where all atoms in a rigid cluster are fixed with respect to each other.
Protein Model
Each amino acid is modeled as a pair of torsional angles (φ, ψ). A protein conformation then ...
Potential Landscape
Landscape topology affects folding behavior.
The Pebble Game
The pebble game, a constraint counting algorithm, identifies independent and redundant constraints by assigning and rearranging pebbles [Jacobs95, Lee05].
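The pebble game idea can be illustrated with a minimal sketch of the generic 2D (k = 2, l = 3) pebble game in the style of [Jacobs95, Lee05]. This is not the 3D bond-bending variant used for proteins, and the function name `pebble_game` and its interface are assumptions made for illustration. Each vertex starts with k pebbles, and an edge is independent only if l + 1 free pebbles can be gathered on its two endpoints.

```python
def pebble_game(n_vertices, edges, k=2, l=3):
    """Classify edges of a simple graph as independent or redundant.

    Illustrative (k, l)-pebble game; assumes no self-loops or duplicate edges.
    """
    pebbles = [k] * n_vertices                # free pebbles on each vertex
    out = [set() for _ in range(n_vertices)]  # out[u]: edges covered by u's pebbles
    independent, redundant = [], []

    def collect(root, avoid):
        """Bring one free pebble to root without taking one from avoid."""
        parent = {root: None}
        stack = [root]
        while stack:
            x = stack.pop()
            for y in list(out[x]):
                if y in parent:
                    continue
                parent[y] = x
                if y != avoid and pebbles[y] > 0:
                    # Reverse the path root -> ... -> y, shifting one pebble back.
                    pebbles[y] -= 1
                    while parent[y] is not None:
                        p = parent[y]
                        out[p].remove(y)
                        out[y].add(p)
                        y = p
                    pebbles[root] += 1
                    return True
                stack.append(y)
        return False

    for u, v in edges:
        # Gather pebbles on both endpoints until l + 1 are free or none can move.
        while pebbles[u] + pebbles[v] < l + 1:
            if not (collect(u, v) or collect(v, u)):
                break
        if pebbles[u] + pebbles[v] >= l + 1:
            src, dst = (u, v) if pebbles[u] > 0 else (v, u)
            pebbles[src] -= 1                 # cover the new edge with a pebble
            out[src].add(dst)
            independent.append((u, v))
        else:
            redundant.append((u, v))
    return independent, redundant


# A complete graph on 4 vertices has 6 edges but only 2*4 - 3 = 5 can be independent.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
ind, red = pebble_game(4, k4)
```

In this sketch the redundant edges mark over-constrained (rigid) regions, while leftover free pebbles correspond to remaining degrees of freedom.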
The folding core can be determined experimentally from hydrogen exchange rates [Kim93, Woodward93] through continuous labeling and pulse labeling experiments. These experiments observe which parts of the structure are the most protected, the last to unfold, and thus part of the folding core. In this work, we combine rigidity analysis and our approximate energy landscape models to simulate relative hydrogen exchange rates and identify folding cores of several small proteins.
We first extract hundreds of feasible folding pathways from our landscape models using Map-based Monte Carlo simulations [Tapia07]. We then analyze the rigidity changes along those pathways to compute relative hydrogen exchange rates. Finally, we use the average relative exchange rates to determine folding core membership. We compare our results to experimental data and to other simulation methods [Hespenheide02, Rader04].
Modeling the Landscape
We approximate the landscape with a graph (or map) that captures its main features. It contains thousands of folding pathways.
  • Bias sampling around the known folded state and retain samples based on their energy.
  • Connect neighboring samples and assign each edge a weight that reflects its energetic feasibility.
Extracting Pathways
We can stochastically extract folding pathways using Map-based Monte Carlo simulation [Tapia07], a random walk on the landscape graph. We have used this technique to study population kinetics and relative folding rates.
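The map construction and pathway extraction can be sketched on a toy problem. Everything here is an assumption made for illustration: the names `build_map` and `monte_carlo_walk`, the k-nearest-neighbor connection rule, the edge weight (the uphill energy difference divided by a temperature), and the use of plain numbers as stand-ins for conformations. The sampling and energy-based retention step is assumed to have already produced `samples`.

```python
import math
import random


def build_map(samples, energy, distance, k=2, temperature=1.0):
    """Connect each sample to its k nearest neighbors with energy-based weights."""
    graph = {i: {} for i in range(len(samples))}
    for i, conf in enumerate(samples):
        others = [j for j in range(len(samples)) if j != i]
        nearest = sorted(others, key=lambda j: distance(conf, samples[j]))[:k]
        for j in nearest:
            # Store both directions; uphill moves get a larger weight.
            for s, t in ((i, j), (j, i)):
                uphill = max(energy(samples[t]) - energy(samples[s]), 0.0)
                graph[s][t] = uphill / temperature
    return graph


def monte_carlo_walk(graph, start, goal, rng, max_steps=10_000):
    """Random walk on the map, preferring low-weight (feasible) edges."""
    path, node = [start], start
    for _ in range(max_steps):
        if node == goal:
            break
        neighbors = list(graph[node])
        probs = [math.exp(-graph[node][n]) for n in neighbors]
        node = rng.choices(neighbors, weights=probs)[0]
        path.append(node)
    return path


# Toy landscape: "conformations" are numbers in [0, 1]; 0 is the folded state.
samples = [i / 10 for i in range(11)]
roadmap = build_map(samples, energy=lambda x: x,
                    distance=lambda a, b: abs(a - b), temperature=0.02)
pathway = monte_carlo_walk(roadmap, start=10, goal=0, rng=random.Random(0))
```

Repeating the walk with different random seeds yields a set of distinct pathways through the same map, which is what makes pathway-averaged statistics possible.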

The pebble game can be applied to 3D bond-bending networks such as molecules [Jacobs98].
In [Amato02, Amato03, Thomas07], we correctly identified the secondary structure formation order for several small proteins.
We monitor rigidity changes along a folding pathway and assign each residue a score at each pathstep. Amino acids that remain rigid the longest (high average scores) are more likely to experience slower exchange, and amino acids that are the most flexible (low average scores) are more likely to experience faster exchange. We then define the relative exchange rate of an amino acid as 1 minus its average score. Here we study two different scoring functions.
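As a small worked example of the definition above, the sketch below averages per-pathstep scores and converts them to relative exchange rates. The function names and the list-of-lists layout (`pathway_scores[step][residue]`) are assumptions made for illustration.

```python
def relative_exchange_rates(pathway_scores):
    """Return 1 - (average score) for each residue along one pathway.

    pathway_scores[s][r] is the score of residue r at pathstep s, in [0, 1].
    """
    n_steps = len(pathway_scores)
    n_residues = len(pathway_scores[0])
    rates = []
    for r in range(n_residues):
        average = sum(step[r] for step in pathway_scores) / n_steps
        rates.append(1.0 - average)
    return rates


def average_over_pathways(rates_per_pathway):
    """Average the per-residue rates over several pathways."""
    n_pathways = len(rates_per_pathway)
    return [sum(column) / n_pathways for column in zip(*rates_per_pathway)]


# Three residues over four pathsteps: residue 0 stays rigid longest.
scores = [
    [1.0, 1.0, 0.0],
    [1.0, 0.5, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
rates = relative_exchange_rates(scores)  # [0.25, 0.625, 1.0]
```

The residue that stays rigid longest gets the lowest relative exchange rate, matching the intuition that it is the most protected.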
We define the folding core as those residues
whose exchange rate (either simulated or
experimental) falls below a threshold t. To
select t, we partition the exchange rates into k
clusters by selecting the k-1 thresholds that
minimize the cluster variances. The folding core
threshold, t, is simply the smallest threshold in
the set of k-1 thresholds.
We select the number of clusters k using the elbow criterion. As additional clusters are used to partition a data set, the percentage of the data set variance explained by the clustering increases and eventually reaches 100%. At some point, adding more clusters does not significantly improve the variance explained. This point is called the elbow because there is a sharp slope change in the variance-explained plot. We select k as this elbow point.
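The threshold selection can be sketched as follows. The sketch makes several assumptions that the text does not specify: thresholds are placed midway between adjacent sorted rates, the partition is found exactly by dynamic programming, and the elbow is taken as the smallest k (at least 2) for which one more cluster adds less than a hypothetical `min_gain` fraction of explained variance.

```python
def partition(values, k):
    """Best split of the sorted values into k contiguous clusters.

    Returns (total within-cluster squared error, list of k-1 thresholds).
    """
    xs = sorted(values)
    n = len(xs)
    s1, s2 = [0.0], [0.0]                     # prefix sums of x and x*x
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def sse(i, j):                            # squared error of xs[i:j]
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j], cut[c][j] = candidate, i
    bounds, j = [], n
    for c in range(k, 1, -1):                 # walk back through the cut points
        j = cut[c][j]
        bounds.append(j)
    bounds.reverse()
    return cost[k][n], [(xs[b - 1] + xs[b]) / 2 for b in bounds]


def choose_k(rates, k_max=6, min_gain=0.05):
    """Elbow rule (assumed): stop when another cluster explains little more."""
    total, _ = partition(rates, 1)            # assumes the rates are not all equal
    k_max = min(k_max, len(rates) - 1)
    explained = [1.0 - partition(rates, k)[0] / total
                 for k in range(1, k_max + 2)]
    for k in range(2, k_max + 1):
        if explained[k] - explained[k - 1] < min_gain:
            return k
    return k_max


def folding_core(rates, k_max=6, min_gain=0.05):
    """Return (threshold t, indices of residues whose rate falls below t)."""
    k = choose_k(rates, k_max, min_gain)
    _, thresholds = partition(rates, k)
    t = min(thresholds)
    return t, [r for r, rate in enumerate(rates) if rate < t]


rates = [0.9, 0.1, 0.5, 0.12, 0.92, 0.52, 0.15, 0.55, 0.95]
t, core = folding_core(rates)
```

With these nine toy rates the values fall into three well-separated groups, so the elbow rule settles on k = 3 and the core consists of the three slowest-exchanging residues.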
Scoring By Residue Flexibility
We assign a rigidity score (RS) to each amino acid at each pathstep based on its rigidity classification: 0 (not colored) for independently flexible, 0.5 (green) for dependently flexible, and 1 (red) for rigid.
Scoring By Rigid Cluster Decomposition
We assign a folding core score (FCS) to each amino acid at each pathstep based on its rigid cluster membership: 0 (not colored) if it is not in a rigid cluster and 1 (black) if it is in a rigid cluster.
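The two scoring functions amount to simple lookups, sketched below for a single pathstep. The label strings, the function names, and the representation of rigid clusters as sets of residue indices are assumptions made for illustration; how a residue-level classification is derived from the bond and atom labels is not covered here.

```python
# Rigidity score (RS) for each residue-level classification.
RIGIDITY_SCORE = {
    "independently_flexible": 0.0,
    "dependently_flexible": 0.5,
    "rigid": 1.0,
}


def rigidity_scores(labels):
    """RS for every residue at one pathstep, given its classification label."""
    return [RIGIDITY_SCORE[label] for label in labels]


def folding_core_scores(n_residues, rigid_clusters):
    """FCS for every residue: 1 if it belongs to any rigid cluster, else 0."""
    in_cluster = set().union(*rigid_clusters)
    return [1.0 if r in in_cluster else 0.0 for r in range(n_residues)]


rs = rigidity_scores(["rigid", "dependently_flexible", "independently_flexible"])
fcs = folding_core_scores(5, [{0, 1}, {3}])
```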
[Amato02] N.M. Amato and G. Song. Using motion planning to study protein folding pathways. J. Comput. Biol., 9(2):149-168, 2002. Special issue of Int. Conf. Comput. Molecular Biology (RECOMB) 2001.
[Amato03] N.M. Amato, K.A. Dill, and G. Song. Using motion planning to map protein folding landscapes and analyze folding kinetics of known native structures. J. Comput. Biol., 10(3-4):239-256, 2003. Special issue of Int. Conf. Comput. Molecular Biology (RECOMB) 2002.
[Hespenheide02] B.M. Hespenheide, A. Rader, M. Thorpe, and L.A. Kuhn. Identifying protein folding cores from the evolution of flexible regions during unfolding. J. Mol. Graph. Model., 21:195-207, 2002.
[Jacobs95] D. Jacobs and M. Thorpe. Generic rigidity percolation: The pebble game. Phys. Rev. Lett., 75(22):4051-4054, 1995.
[Jacobs98] D. Jacobs. Generic rigidity in three-dimensional bond-bending networks. J. Phys. A: Math. Gen., 31:6653-6668, 1998.
[Kim93] K.S. Kim, J.A. Fuchs, and C.K. Woodward. Hydrogen exchange identifies native-state motional domains important in protein folding. Biochemistry, 32:9600-9608, 1993.
[Lee05] A. Lee and I. Streinu. Pebble game algorithms and sparse graphs. European Conf. on Combinatorics, Graph Theory and Applications, 2005.
[Rader04] A.J. Rader and I. Bahar. Folding core predictions from network models of proteins. Polymer, 45(2):659-668, 2004.
[Tapia07] L. Tapia, S. Thomas, and N.M. Amato. Kinetics analysis methods for approximate folding landscapes. Bioinformatics, 23(13):539-548, 2007.
[Thomas07] S. Thomas, X. Tang, L. Tapia, and N.M. Amato. Simulating protein motions with rigidity analysis. J. Comput. Biol., 2007. Special issue of Int. Conf. Comput. Molecular Biology (RECOMB) 2006.
[Thomas08] S. Thomas, L. Tapia, and N.M. Amato. Elucidating the protein folding core. Technical Report TR08-001, Parasol Lab, Dept. of Computer Science, Texas A&M University, Jan. 2008.
[Woodward93] C. Woodward. Is the slow exchange core the protein folding core? Trends Biochem. Sci., 18:359-360, 1993.
We would like to thank A.J. Rader from Indiana University-Purdue University for helpful discussions.
We studied 21 different proteins ranging in size from 54 to 155 residues (see [Thomas08] for a complete listing). Our simulated relative exchange rates compare favorably to experimental data (references provided in [Thomas08]). We also outperform existing computational methods for folding core identification in terms of sensitivity and specificity.
Folding Core Identification
We compare the sensitivity and specificity of our folding core identification method to four other methods: Gaussian Network Model (GNM)-based methods [Rader04] and Floppy Inclusions and Rigid Substructure Topography (FIRST)-based methods [Hespenheide02].
FIRST monitors rigidity changes as hydrogen bonds are iteratively removed from the weakest to the strongest. The backbone is fixed during simulation. (Images from Jacobs et al., Proteins 44(2):150-165, 2001.)
GNM models the protein as a mass-spring system subject to Gaussian fluctuations. Motions are limited to the vicinity of the native state. (Images from http://en.wikipedia.org/wiki/Gaussian_network_model)
Relative Exchange Rates
The plots compare simulated exchange rates from rigidity scores (red) and folding core scores (green) to experimental data (open/filled circles and diamonds). When available, normalized experimental data is also plotted. On the protein structures, residues are colored based on their simulated rate from fastest (blue) to slowest (red), or colored red if experimentally identified as in the folding core.
Performance has high variability due to noise and missing data in the experimental measurements (on average, <50% of the protein was measured). Overall performance statistics indicate that our methods perform better than FIRST and GNM in sensitivity and error, and similarly to GNM-G in specificity.
Error is the normalized distance to perfect sensitivity and specificity.
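These statistics can be computed as in the sketch below. The normalization is an assumption: the error is taken as the Euclidean distance from the point (sensitivity, specificity) to the ideal point (1, 1), divided by the square root of 2 so that it lies in [0, 1]. Restricting the comparison to residues with experimental measurements, and the name `core_metrics`, are likewise illustrative choices.

```python
import math


def core_metrics(predicted_core, experimental_core, measured):
    """Sensitivity, specificity, and error over the measured residues only.

    Assumes the measured set contains at least one core and one non-core residue.
    """
    tp = fp = tn = fn = 0
    for residue in measured:
        predicted = residue in predicted_core
        actual = residue in experimental_core
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Assumed normalization: distance to (1, 1) scaled into [0, 1].
    error = math.hypot(1.0 - sensitivity, 1.0 - specificity) / math.sqrt(2.0)
    return sensitivity, specificity, error


sens, spec, err = core_metrics(
    predicted_core={1, 2, 3},
    experimental_core={2, 3, 4},
    measured=range(1, 9),
)
```

A perfect prediction gives an error of 0, while a prediction that misses every core residue and flags every non-core residue gives an error of 1.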
This research was supported in part by NSF Grants EIA-0103742, ACR-0081510, ACR-0113971, CCR-0113974, ACI-0326350 and by the DOE. Thomas was supported in part by an IBM TJ Watson PhD Fellowship, a Department of Education GAANN Fellowship, a PEO Scholarship, and an NSF Graduate Research Fellowship. Tapia was supported in part by an NIH Molecular Biophysics Training Grant (T32GM065088) and a Department of Education GAANN Fellowship.