A Probabilistic Approach to Protein Backbone Tracing in Electron Density Maps


1
A Probabilistic Approach to Protein Backbone
Tracing in Electron Density Maps

Frank DiMaio, Jude Shavlik
Computer Sciences Department
George Phillips
Biochemistry Department
University of Wisconsin-Madison
USA

Presented at the Fourteenth Conference on
Intelligent Systems for Molecular Biology (ISMB
2006), Fortaleza, Brazil, August 7, 2006
2
X-ray Crystallography
[Diagram: X-ray beam, protein crystal, collection plate, FFT, electron density map (3D picture)]
3
Given: Sequence + Density Map
Sequence
Electron Density Map
4
Find Each Atom's Coordinates
5
Our Subtask: Backbone Trace
[Figure: backbone trace through consecutive Cα atoms]
6
The Unit Cell

3D density function ρ(x,y,z) provided over the unit cell
Unit cell may contain multiple copies of the protein

8
Density Map Resolution
[Figure: example density maps at 2 Å, 3 Å, and 4 Å resolution]
ARP/wARP (Perrakis et al. 1997)
TEXTAL (Ioerger et al. 1999)
Resolve (Terwilliger 2002)
Our focus
9
Overview of ACMI (our method)

Local Match
Algorithm searches for sequence-specific 5-mers
centered at each amino acid
Many false positives

Global Consistency
Use probabilistic model to filter false positives
Find most probable backbone trace

10
5-mer Lookup and Cluster

VKH V LVSPEKIEELIKGY

PDB
Cluster 1
Cluster 2
NOTE: can be done in a precompute step
wt = 0.67
wt = 0.33
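The lookup-and-cluster step might be sketched as below. This is only an illustration under stated assumptions: fragments are already superimposed lists of five Cα coordinates, clustering is a simple greedy pass with an RMSD cutoff, and each cluster's weight is taken to be the fraction of fragments it contains (which would give weights such as 0.67 and 0.33). The function names and the cutoff are hypothetical and not taken from the talk.

```python
import math

def rmsd(a, b):
    """Root-mean-square distance between two equal-length lists of 3D points."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))

def cluster_fragments(fragments, threshold):
    """Greedy clustering (an assumption, not the talk's stated method).

    Each fragment joins the first cluster whose representative (its first
    member) lies within `threshold` RMSD; otherwise it starts a new cluster.
    Returns a list of (representative, weight) pairs, where weight is the
    fraction of all fragments that fell into that cluster.
    """
    clusters = []
    for frag in fragments:
        for members in clusters:
            if rmsd(frag, members[0]) <= threshold:
                members.append(frag)
                break
        else:
            clusters.append([frag])
    n = len(fragments)
    return [(members[0], len(members) / n) for members in clusters]

# Toy fragments: two nearly straight 5-mers and one strongly bent 5-mer.
straight = [(3.8 * i, 0.0, 0.0) for i in range(5)]
near_straight = [(3.8 * i, 0.1, 0.0) for i in range(5)]
bent = [(3.8 * i, float(i * i), 0.0) for i in range(5)]
representatives = cluster_fragments([straight, near_straight, bent], threshold=1.0)
```

Because the clustering depends only on the sequence and the fragment library, it can run before any density map is examined, which matches the note about a precompute step.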
11
5-mer Search

6D search (rotation + translation) for representative structures in the density map
Compute similarity
Computed by Fourier convolution (Cowtan 2001)
Use a tune set to convert similarity score to probability
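The translational half of the search can be illustrated with NumPy. The sketch below scores one fixed orientation of a template at every grid translation using the correlation theorem; a full 6D search would repeat it over a set of rotated templates. Plain cross-correlation stands in for the similarity function here, which is an assumption: the talk cites Cowtan (2001) but the transcript does not give the exact score. The function name is hypothetical.

```python
import numpy as np

def translation_scores(density, template):
    """Cross-correlate `template` with `density` at every grid translation.

    Both arrays must have the same shape and are treated as periodic, as a
    unit cell is. The result satisfies
        scores[t] = sum over x of density[x + t] * template[x]
    and is computed with FFTs instead of an explicit double loop.
    """
    f_density = np.fft.fftn(density)
    f_template = np.fft.fftn(template)
    return np.real(np.fft.ifftn(f_density * np.conj(f_template)))

# Toy check: hide a small blob in an 8x8x8 grid and recover its offset.
template = np.zeros((8, 8, 8))
template[0:2, 0:2, 0:2] = 1.0
density = np.roll(template, shift=(2, 3, 1), axis=(0, 1, 2))
scores = translation_scores(density, template)
best = np.unravel_index(np.argmax(scores), scores.shape)
```

In this toy case `best` is the offset at which the template was planted, since the correlation peaks where template and density overlap completely.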

12
Convert Scores to Probabilities
5-mer representative
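One simple way to turn raw similarity scores into probabilities with a tune set is a binned frequency table. The sketch below is an assumption about how such a calibration could look, not the procedure used in the talk: it bins tune-set scores, counts how often a score in each bin corresponded to a true match, and applies add-one smoothing. All names are hypothetical.

```python
import bisect

def bin_index(score, edges):
    """Index of the bin [edges[i], edges[i+1]) containing `score`.

    Scores outside the covered range are clamped into the first or last bin.
    """
    i = bisect.bisect_right(edges, score) - 1
    return min(max(i, 0), len(edges) - 2)

def fit_score_to_prob(tune_scores, tune_labels, edges):
    """Estimate P(true match | score bin) from a labeled tune set.

    tune_labels[k] is True when tune_scores[k] came from a correct placement.
    Add-one smoothing keeps empty bins away from probabilities of 0 or 1.
    """
    n_bins = len(edges) - 1
    hits = [0] * n_bins
    totals = [0] * n_bins
    for score, is_match in zip(tune_scores, tune_labels):
        b = bin_index(score, edges)
        totals[b] += 1
        if is_match:
            hits[b] += 1
    return [(h + 1) / (t + 2) for h, t in zip(hits, totals)]

def score_to_prob(score, edges, table):
    """Look up the calibrated probability for a new score."""
    return table[bin_index(score, edges)]

edges = [0.0, 0.5, 1.0]
table = fit_score_to_prob(
    [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    [False, False, True, True, True, False],
    edges,
)
```

With this toy tune set the low-score bin has 1 match in 3 and the high-score bin has 2 in 3, so the smoothed table is 2/5 and 3/5.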
13
In This Talk

Where we are now
For each amino acid in the protein, we have a
probability distribution over the unit cell

Where we are headed
Find the backbone layout maximizing its probability given the density map

14
Pairwise Markov Field Models

A type of undirected graphical model
Represent joint probabilities as product of
vertex and edge potentials
Similar to (but more general than) Bayesian
networks

[Figure: graph with evidence node y and label nodes u1, u2, u3]
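Written out, the "product of vertex and edge potentials" takes the usual pairwise form. The symbols ψ_i and ψ_ij are assumed notation for the vertex and edge potentials; u_i (labels) and y (evidence) follow the slide.

```latex
P(U \mid y) \;\propto\; \prod_{i} \psi_i(u_i, y) \; \prod_{i<j} \psi_{ij}(u_i, u_j),
\qquad U = \{u_1, \dots, u_N\}
```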
15
Protein Backbone Model
ALA
GLY
LYS
LEU

Each vertex is an amino acid
Each label is location + orientation
Evidence y is the electron density map
Each vertex (or observational) potential comes from the 5-mer matching

16
Protein Backbone Model
ALA
GLY
LYS
LEU

Two types of edge (or structural) potentials
Adjacency constraints ensure adjacent amino acids
are 3.8Å apart and in the proper orientation

17
Protein Backbone Model
ALA
GLY
LYS
LEU

Two types of structural (edge) potentials
Adjacency constraints ensure adjacent amino acids
are 3.8Å apart and in the proper orientation
Occupancy constraints ensure nonadjacent amino
acids do not occupy same 3D space
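The two structural potentials could look roughly like the sketch below. It is a simplified illustration: orientation is ignored, the adjacency term is modeled as a Gaussian around the 3.8 Å Cα-Cα spacing, and the occupancy term is a hard cutoff. The Gaussian form, `sigma`, and `min_separation` are assumptions chosen for the example, not values from the talk.

```python
import math

CA_CA_DISTANCE = 3.8  # Å, spacing of adjacent amino acids stated on the slide

def adjacency_potential(xi, xj, sigma=0.5):
    """Potential for sequence-adjacent residues at positions xi and xj.

    Highest when the two positions are 3.8 Å apart and falling off as a
    Gaussian (assumed form; sigma is a hypothetical tolerance in Å).
    """
    d = math.dist(xi, xj)
    return math.exp(-((d - CA_CA_DISTANCE) ** 2) / (2.0 * sigma ** 2))

def occupancy_potential(xi, xj, min_separation=3.0):
    """Potential for non-adjacent residues: zero if they would overlap.

    min_separation is a hypothetical cutoff in Å.
    """
    return 0.0 if math.dist(xi, xj) < min_separation else 1.0
```

For example, `adjacency_potential((0, 0, 0), (3.8, 0, 0))` is at its maximum, while `occupancy_potential((0, 0, 0), (1, 0, 0))` rules the pair out entirely.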

18
Backbone Model Potential
Constraints between adjacent amino acids

x
19
Backbone Model Potential
Constraints between nonadjacent amino acids
20
Backbone Model Potential
Observational (amino-acid-finder) probabilities
21
Probabilistic Inference

Want to find the backbone layout that maximizes the probability of the layout given the density map

Exact methods are intractable
Use belief propagation (BP) to approximate
marginal distributions

22
Belief Propagation (BP)

Iterative, message-passing method (Pearl 1988)
A message from amino acid i to amino acid j indicates where i expects to find j
An approximation to the marginal (or belief) is given as the product of incoming messages
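The message-passing idea can be shown on a toy problem. The sketch below runs sum-product belief propagation on a short chain with a handful of discrete labels per node, which is far smaller than a unit cell of locations and orientations and contains only adjacency edges. It follows the common textbook formulation, in which a node's belief is its own observational potential times all incoming messages; the function name and data layout are hypothetical.

```python
def belief_propagation(vertex_pot, edge_pot, n_iters):
    """Sum-product BP on a chain of nodes 0..n-1.

    vertex_pot[i][a] : observational potential of node i taking label a
    edge_pot[a][b]   : potential for label a at node i and label b at node i+1
    Returns one normalized belief (approximate marginal) per node.
    """
    n = len(vertex_pot)
    k = len(vertex_pot[0])
    neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
    # msgs[(i, j)][b]: where node i expects to find node j (label b of j).
    msgs = {(i, j): [1.0] * k for i in neighbors for j in neighbors[i]}

    def pairwise(i, j, a, b):
        # a is the label of node i, b is the label of node j.
        return edge_pot[a][b] if j == i + 1 else edge_pot[b][a]

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            out = []
            for b in range(k):
                total = 0.0
                for a in range(k):
                    incoming = 1.0
                    for h in neighbors[i]:
                        if h != j:  # exclude the recipient's own message
                            incoming *= msgs[(h, i)][a]
                    total += vertex_pot[i][a] * pairwise(i, j, a, b) * incoming
                out.append(total)
            z = sum(out)
            new_msgs[(i, j)] = [v / z for v in out]
        msgs = new_msgs

    beliefs = []
    for i in range(n):
        belief = list(vertex_pot[i])
        for h in neighbors[i]:
            belief = [x * m for x, m in zip(belief, msgs[(h, i)])]
        z = sum(belief)
        beliefs.append([x / z for x in belief])
    return beliefs

# Three residues, two candidate labels each; only residue 0 has strong evidence.
vertex_pot = [[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]
edge_pot = [[0.8, 0.2], [0.2, 0.8]]
beliefs = belief_propagation(vertex_pot, edge_pot, n_iters=5)
```

On a chain the beliefs converge to the exact marginals after a few sweeps; in a model with occupancy edges between all non-adjacent residues the graph has loops, and the same updates give only an approximation.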

23
Belief Propagation Example
ALA
GLY
24
Technical Challenges

Representation of potentials
Store Fourier coefficients in Cartesian space
At each location x, store a single orientation r
Speeding up the O(N²X²) naïve implementation
X: the unit cell size (number of Fourier coefficients)
N: the number of residues in the protein

25
Speeding Up the O(N²X²) Implementation

O(X²) computation for each occupancy message
Each message must integrate over the unit cell
O(X log X) as multiplication in Fourier space
O(N²) messages computed and stored
Approximate N−3 occupancy messages with a single message
O(N) messages using a message product accumulator
Improved implementation: O(NX log X)
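Both speed-ups can be illustrated in one dimension with NumPy. The first pair of functions shows why an occupancy message, which sums a displacement-dependent potential against a belief over the whole cell, is a circular convolution and can therefore be computed as a product in Fourier space. The last function is one plausible reading of a "message product accumulator": multiply all messages once, then divide out a single message per recipient. The names, the 1D grid, and the exact accumulator form are assumptions for illustration only.

```python
import numpy as np

def message_direct(belief, kernel):
    """O(X^2): m[xj] = sum over xi of kernel[(xj - xi) mod X] * belief[xi]."""
    size = len(belief)
    return np.array([
        sum(kernel[(xj - xi) % size] * belief[xi] for xi in range(size))
        for xj in range(size)
    ])

def message_fft(belief, kernel):
    """O(X log X): the same circular convolution as a Fourier-space product."""
    return np.real(np.fft.ifft(np.fft.fft(belief) * np.fft.fft(kernel)))

def all_but_one_products(messages):
    """Product of all messages except one, for every choice of the one.

    Forms the full product once and divides out each message in turn, so N
    results cost O(N) array operations instead of O(N^2).
    Assumes every message entry is strictly positive.
    """
    total = np.prod(messages, axis=0)
    return [total / m for m in messages]

rng = np.random.default_rng(0)
belief = rng.random(16) + 0.1
kernel = np.ones(16)
kernel[[0, 1, 15]] = 0.0  # occupancy-style kernel: forbid nearby positions
direct = message_direct(belief, kernel)
fast = message_fft(belief, kernel)
```

The two message functions agree to floating-point precision, and the 3D case works the same way with `np.fft.fftn` and `np.fft.ifftn`.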

26
1XMT at 3Å Resolution
[Figure: trace colored by prob(AA at location); color scale HIGH to LOW, labeled 0.82 and 0.17]
1.12 Å RMSd, 100% coverage
27
1VMO at 4Å Resolution
[Figure: trace colored by prob(AA at location); color scale HIGH to LOW, labeled 0.25 and 0.02]
3.63 Å RMSd, 72% coverage
28
1YDH at 3.5Å Resolution
[Figure: trace colored by prob(AA at location); color scale HIGH to LOW, labeled 0.27 and 0.02]
1.47 Å RMSd, 90% coverage
29
Experiments

Tested ACMI against other map interpretation algorithms: TEXTAL and Resolve
Used ten model-phased maps
Smoothly diminished reflection intensities, yielding 2.5, 3.0, 3.5, and 4.0 Å resolution maps

30
RMS Deviation
[Plot: Cα RMS deviation vs. density map resolution; series: ACMI, TEXTAL, Resolve]
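For reference, a Cα RMS deviation and a coverage figure of the kind quoted in this talk could be computed as in the sketch below. It assumes predicted and reference Cα positions are keyed by residue index and already share a coordinate frame, and it defines coverage as the fraction of reference residues that received a prediction. The talk does not spell out its exact definitions, so treat both as assumptions; the function name is hypothetical.

```python
import math

def ca_rmsd_and_coverage(predicted, reference):
    """Cα RMS deviation over residues present in both models, plus coverage.

    predicted, reference : dict mapping residue index -> (x, y, z) in Å
    Returns (rmsd, coverage), where coverage is the fraction of reference
    residues that have a predicted position.
    """
    shared = [i for i in reference if i in predicted]
    if not shared:
        return float("nan"), 0.0
    squared = sum(math.dist(predicted[i], reference[i]) ** 2 for i in shared)
    return math.sqrt(squared / len(shared)), len(shared) / len(reference)

reference = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
             3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
predicted = {1: (0.0, 1.0, 0.0), 2: (3.8, -1.0, 0.0), 3: (7.6, 0.0, 1.0)}
rmsd_value, coverage = ca_rmsd_and_coverage(predicted, reference)
```

In this toy case each traced residue is 1 Å off and one of four residues is untraced, so the RMSd is 1 Å at 75% coverage.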
31
Model Completeness
[Plots: chain traced and residues identified vs. density map resolution; series: ACMI, TEXTAL, Resolve]
32
Per-protein RMS Deviation
[Plots: per-protein ACMI RMS error compared against TEXTAL RMS error and Resolve RMS error]
33
Conclusions