A dimensionality reduction approach to modeling protein flexibility

1
A dimensionality reduction approach to modeling
protein flexibility
  • By Miguel L. Teodoro, George N. Phillips Jr.,
  • and Lydia E. Kavraki
  • Rice University and University of
    Wisconsin-Madison
  • Presented by Zhang Jingbo

2
Outline
  • Motivation, Background and Our goal
  • Protein flexibility
  • Problems with current methods and the benefits
    of the method in this paper
  • Dimensionality reduction techniques
  • Obtaining conformational Data
  • Application to Specific Systems
  • Summary

3
Motivation
  • Introduce a method to obtain a reduced basis
    representation of protein flexibility.

4
Background
  • Proteins are involved either directly or
    indirectly in all biological processes in living
    organisms.
  • Conformational changes of proteins can critically
    affect their ability to bind other molecules.
  • Any progress in modeling protein motion and
    flexibility will contribute to the understanding
    of key biological functions.
  • Today there is a large body of knowledge
    available on protein structure and function and
    this amount of information is expected to grow
    even faster in the future.

5
Our method and goal
  • Method
  • A dimensionality reduction technique:
    Principal Component Analysis (PCA)
  • Goal
  • Transform the original high dimensional
    representation of protein motion into a lower
    dimensional representation that captures the
    dominant modes of motions of the protein.
  • Obtain conformations that have been observed in
    laboratory experiments.

6
The focus of this paper
  • How to obtain a reduced representation of protein
    flexibility from raw protein structural data

7
What is protein flexibility?
  • Definition: a crucial aspect of the relation
    between protein structure and function.
  • Proteins change their three-dimensional shapes
    when binding to or unbinding from other molecules.

10
Why do we want to model protein flexibility?
  • Several applications for our work
  • 1. Pharmaceutical drug development
  • 2. To model conformational changes that
    occur during protein-protein and protein-DNA/RNA
    interactions.

11
RII molecular "handshake" (donut with two holes).
Models for the binding of RII to the glycophorin
A receptor on red blood cells (erythrocytes).
Backbone of the RII dimer showing glycan binding
sites.
12
The problems in current methods
  • The computational complexity of explicitly
    modeling all the degrees of freedom of a protein
    is too high.
  • Modeling proteins as rigid structures limits the
    effectiveness of currently used molecular docking
    methods.

13
The benefit of our method in this paper (1)
  • Using the approximation makes it possible to
    include protein flexibility in the drug design
    process in a computationally efficient way.

14
The two most common experimental methods in
structural biology today
  • Protein X-ray crystallography
  • Nuclear magnetic resonance (NMR)
  • Limits

15
An alternative to experimental methods
  • Computational methods based on classical or
    quantum mechanics to approximate protein
    flexibility.
  • Limits

16
The benefit of our method in this paper (2)
  • Transform the basis of representation of
    molecular motion.
  • The new degrees of freedom will be linear
    combinations of the original variables.
  • Some degrees of freedom are significantly more
    representative of protein flexibility than
    others.
  • Consider only the most significant degrees of
    freedom (dof). The transformed dof are collective
    motions affecting the entire configuration of the
    protein.
  • There is a tradeoff between the loss of
    information and effectively modeling protein
    flexibility in a subspace of greatly reduced
    dimensionality.
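
The tradeoff in the last bullet can be made concrete by asking how much of the total variance the first few transformed dof capture. The sketch below is illustrative only: the toy ensemble (200 conformations, 3N = 30, two hidden collective motions plus noise) and the 95% threshold are assumptions, not values from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy ensemble: 200 conformations of a 10-atom molecule (3N = 30).
n_conf, n_dof = 200, 30

# Two hidden collective motions with large amplitude, plus small independent noise.
modes = np.linalg.qr(rng.normal(size=(n_dof, 2)))[0]        # 30 x 2, orthonormal columns
amplitudes = rng.normal(size=(n_conf, 2)) * np.array([5.0, 2.0])
X = amplitudes @ modes.T + 0.1 * rng.normal(size=(n_conf, n_dof))

# Center the data; rows are conformations (an assumed convention).
Xc = X - X.mean(axis=0)

# Squared singular values are proportional to the variance along each new dof.
s = np.linalg.svd(Xc, compute_uv=False)
variance_fraction = s**2 / np.sum(s**2)
cumulative = np.cumsum(variance_fraction)

# Smallest number of collective dof whose cumulative variance reaches 95%
# (the threshold is an arbitrary choice for this example).
k = int(np.searchsorted(cumulative, 0.95) + 1)
print(k, cumulative[:4])
```

In this toy case two of the thirty dof are enough, because the data were built from two dominant motions. Real protein data will not separate this cleanly.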

17
What do we actually do in this paper?
  • Start from initial coordinate information from
    different data sources
  • Apply the principal component analysis method of
    dimensionality reduction.
  • Obtain a new structural representation using
    collective degrees of freedom.
  • Here, we will focus on
  • a. the interpretation of the principal
    components as biologically relevant motions
  • b. how combinations of a reduced number of
    these motions can approximate alternative
    conformations of the protein.
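
Point b can be sketched as follows: an alternative conformation that was not part of the input data is approximated by the mean structure plus a combination of the first k principal components. Everything here is a hypothetical toy setup (8 atoms, three hidden motions, made-up amplitudes). The names `approximate` and `rmsd` are illustrative helpers, not functions from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_atoms = 8
n_dof = 3 * n_atoms

# Hypothetical hidden collective motions used to generate toy conformations.
hidden = np.linalg.qr(rng.normal(size=(n_dof, 3)))[0]       # 24 x 3
scales = np.array([4.0, 2.0, 1.0])

def sample(n):
    """Draw n toy conformations as rows of an (n, 3N) array."""
    amplitudes = rng.normal(size=(n, 3)) * scales
    return amplitudes @ hidden.T + 0.05 * rng.normal(size=(n, n_dof))

train = sample(100)        # conformations used to compute the principal components
target = sample(1)[0]      # an alternative conformation not used in the PCA

mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)  # rows of Vt are the PCs

def approximate(conf, k):
    """Approximate conf by the mean plus a combination of the first k PCs."""
    pcs = Vt[:k]                       # k x 3N
    coords = pcs @ (conf - mean)       # k collective coordinates
    return mean + coords @ pcs

def rmsd(a, b):
    """Root-mean-square deviation over atoms between two 3N vectors."""
    diff = (a - b).reshape(-1, 3)
    return float(np.sqrt((diff**2).sum(axis=1).mean()))

errors = [rmsd(target, approximate(target, k)) for k in range(7)]
print(errors)
```

Because the subspaces spanned by the first k components are nested, the error can only shrink or stay the same as k grows. How fast it shrinks depends on how well the input data cover the motion leading to the target conformation.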

18
Dimensionality reduction techniques
  • Aim: find a mapping between the data in a
    high-dimensional space and a lower-dimensional
    subspace.
  • Two methods
  • a. Multidimensional scaling (MDS)
  • b. Principal component analysis (PCA)
  • Merits
  • Limits

19
PCA of conformational data
  • Merits
  • 1) the most established method
  • 2) the most efficient algorithms
  • 3) guaranteed convergence of the computation
  • 4) an upper bound on how much we can reduce the
    representation of conformational flexibility in
    proteins
  • 5) the principal components have a direct
    physical interpretation
  • 6) high-dimensional data can readily be projected
    to a low-dimensional space and mapped back in the
    inverse direction, recovering a representation of
    the original data with minimal reconstruction
    error
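
The last merit, projecting to a low-dimensional space and mapping back, can be illustrated with a short round trip. This is a minimal sketch on made-up data (50 conformations, 3N = 12, k = 3 are arbitrary choices). It also checks a standard property of PCA: the squared reconstruction error equals the sum of the squared singular values that were discarded.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 50 conformations (rows) with 3N = 12 correlated coordinates.
X = rng.normal(size=(50, 12)) @ rng.normal(size=(12, 12))

mean = X.mean(axis=0)
Xc = X - mean
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3
W = Vt[:k].T                  # 12 x k basis of the reduced subspace

Y = Xc @ W                    # forward: project to k dimensions (50 x k)
X_rec = mean + Y @ W.T        # inverse: map back to the original 12 dimensions

# Total squared reconstruction error versus the discarded singular values.
err = np.sum((X - X_rec) ** 2)
discarded = np.sum(s[k:] ** 2)
print(err, discarded)
```

The two printed numbers agree up to rounding, which is the sense in which the error is minimal: no other k-dimensional linear subspace gives a smaller squared error for the same data.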

20
PCA of conformational data (continued)
  • Linear and non-linear

21
PCA of conformational data (continued)
  • Conformational Data
  • 1. The input data for PCA: several atomic
    displacement vectors (3N dimensions) corresponding
    to different structural conformations, each of
    the form
  • $[x_1, y_1, z_1, x_2, \ldots, z_N]^T$, where
    $[x_i, y_i, z_i]$ corresponds to Cartesian
    coordinate information for the ith atom.
  • 2. All atomic displacement vectors constitute
    the conformational vector set.
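
A minimal sketch of how such a vector set could be assembled, assuming each conformation is available as an (N, 3) array of Cartesian coordinates. The coordinates are made up, `to_vector` is a hypothetical helper, and storing one conformation per row is an assumed convention. In practice structures are usually superimposed first to remove rigid-body motion, which is not shown here.

```python
import numpy as np

# Hypothetical input: three conformations of a 4-atom molecule, each stored
# as an (N, 3) array of Cartesian coordinates (values are made up).
reference = np.array([[0.0, 0.0, 0.0],
                      [1.5, 0.0, 0.0],
                      [1.5, 1.5, 0.0],
                      [0.0, 1.5, 0.0]])
shift = np.array([0.0, 0.0, 0.2]) * np.arange(4)[:, None]   # atom i moves 0.2*i along z
conformations = [reference, reference + shift, reference - shift]

def to_vector(coords):
    """Flatten an (N, 3) array into [x1, y1, z1, ..., xN, yN, zN]."""
    return np.asarray(coords, dtype=float).reshape(-1)

# One 3N-dimensional vector per conformation, stacked as rows.
vectors = np.stack([to_vector(c) for c in conformations])   # shape (3, 12)

# Displacements are taken relative to the mean structure.
mean_structure = vectors.mean(axis=0)
displacements = vectors - mean_structure
print(vectors.shape)
print(displacements[1])
```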

22
Singular value decomposition (SVD)
  • We use the singular value decomposition (SVD) as
    an efficient computational method to calculate
    the principal components.
  • The SVD of a matrix, A, is defined as
    $A = U \Sigma V^T$
  • where U and V are orthonormal matrices and
  • $\Sigma$ is a nonnegative diagonal matrix whose
    diagonal elements are the singular values of A.
  • the columns of matrices U and V are called
    the left and right singular vectors,
    respectively.
  • the square of each singular value is proportional
    to the variance of the data in A along the
    corresponding singular vector.
  • The SVD of matrix A was computed using the ARPACK
    library.
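
The properties listed on this slide can be checked numerically. The sketch below uses NumPy's `numpy.linalg.svd`, which is a dense LAPACK-based routine and not the ARPACK library mentioned above, so it is a stand-in for illustration only. The matrix A is random toy data with one conformation per row (an assumed convention).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data matrix: 40 conformations (rows), 3N = 9 coordinates, centered.
A = rng.normal(size=(40, 9))
A = A - A.mean(axis=0)

# Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# 1. The factorization reproduces A.
reconstruction_ok = np.allclose(A, U @ np.diag(s) @ Vt)

# 2. The singular vectors are orthonormal.
orthonormal_ok = (np.allclose(U.T @ U, np.eye(9))
                  and np.allclose(Vt @ Vt.T, np.eye(9)))

# 3. The sample variance of the data along each right singular vector
#    equals the squared singular value divided by (number of rows - 1).
projected_variance = (A @ Vt.T).var(axis=0, ddof=1)
variance_ok = np.allclose(projected_variance, s**2 / (A.shape[0] - 1))

print(reconstruction_ok, orthonormal_ok, variance_ok)
```

With rows as conformations, the principal components are the right singular vectors. If conformations are stored as columns instead, the roles of U and V swap.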

23
Obtaining conformational Data
  • The most common data sources
  • 1. experimental laboratory methods
  • a. X-ray crystallography
  • b. NMR,
  • 2. computational sampling methods based on a
    force field, such as molecular dynamics.
  • Laboratory methods vs. computational methods
  • - laboratory methods generate less data
  • - computational methods have lower
    accuracy.

24
Application to Specific Systems
  • Now, let's look at specific systems

25
HIV-1 Protease
26
The advantages of using the PCA methodology to
analyze protein flexibility
  • Can be used at different levels of detail
  • 1. the overall motion of the backbone.
  • 2. the simplified flexibility of the
    protein as a whole.
  • 3. include only the atoms that
    constitute the binding site.
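
One way to work at different levels of detail is to keep only the coordinates of a chosen subset of atoms before running PCA. The sketch below is an assumption about how that selection could be done, not the presenters' procedure. The atom table, the residue chosen as "binding site", and the helper `select_columns` are all hypothetical.

```python
import numpy as np

# Hypothetical atom table: (residue index, atom name) for a two-residue fragment.
atoms = [(1, "N"), (1, "CA"), (1, "C"), (1, "O"), (1, "CB"),
         (2, "N"), (2, "CA"), (2, "C"), (2, "O"), (2, "CB"), (2, "CG")]
n_atoms = len(atoms)

# Toy conformational data: 20 conformations (rows), 3N coordinates (columns).
rng = np.random.default_rng(4)
X = rng.normal(size=(20, 3 * n_atoms))

def select_columns(keep):
    """Return the x, y, z columns of X for atoms where keep(atom) is true."""
    idx = [i for i, atom in enumerate(atoms) if keep(atom)]
    cols = np.array([3 * i + d for i in idx for d in range(3)])
    return X[:, cols]

# Backbone atoms only: overall backbone motion.
backbone = select_columns(lambda a: a[1] in {"N", "CA", "C", "O"})

# One atom per residue: a simplified view of the whole protein.
alpha_carbons = select_columns(lambda a: a[1] == "CA")

# All atoms of selected residues: a (hypothetical) binding site.
binding_site = select_columns(lambda a: a[0] in {2})

print(backbone.shape, alpha_carbons.shape, binding_site.shape)
```

Each reduced matrix can then be passed to the same PCA procedure. Fewer atoms mean a smaller 3N and a cheaper decomposition.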

27
The first situation
28
The second situation
  • The advantage of PCA

29
The last situation
  • As a validation of our method.

30
Another application: Aldose Reductase
31
Summary
32
  • Thank you