Title: Macromolecular Crystallography and Structural Genomics
1Macromolecular Crystallography and Structural Genomics: Recent Trends
Prof. D. Velmurugan
Department of Crystallography and Biophysics
University of Madras, Guindy Campus, Chennai 25.
2- Structural genomics aims to identify as many new folds as possible.
- This requires faster ways of determining three-dimensional structures, because structural information is not yet available for many known sequences.
- Although the molecular replacement technique is still used in crystallography for solving homologous structures, it fails when the homology is insufficient.
- Multiwavelength Anomalous Diffraction (MAD) techniques have taken over from the conventional Multiple Isomorphous Replacement (MIR) technique.
3- With the advent of high-energy synchrotron sources, powerful detectors for the diffracted intensities, and developments in the methodology of macromolecular structure determination, the number of macromolecular structures determined has risen steeply. On average, eight new structures are deposited in the PDB every day, and the PDB now holds around 29,000 entries.
- Instead of the three-wavelength strategy of MAD experiments, single-wavelength anomalous diffraction using sulphur anomalous scattering has recently been proposed. This reduces the data collection time to one third.
- The judicious use of radiation damage, both during redundant data measurements at second-generation synchrotron sources and during regular data collection at third-generation synchrotron sources, has also been pointed out recently (RIP and RIPAS).
4Protein Structure Determination
- X-ray crystallography
- NMR spectroscopy
- Neutron diffraction
- Electron microscopy
- Atomic force microscopy
5The number of available amino acid sequences far exceeds the number of available three-dimensional structures, so high throughput is essential in every aspect of X-ray crystallography.
7Procedure
Protein Crystal
9The 14 Bravais lattices
2 Monoclinic
1 Triclinic
(Blue numbers correspond to the crystal system)
10The 14 Bravais lattices
3 Orthorhombic
(Blue numbers correspond to the crystal system)
11The 14 Bravais lattices
4 Rhombohedral
5 Tetragonal
6 Hexagonal
(Blue numbers correspond to the crystal system)
12The 14 Bravais lattices
7 Cubic
(Blue numbers correspond to the crystal system)
15 Synchrotron radiation
More intense X-rays at shorter wavelengths mean higher resolution and much quicker data collection.
18Diffraction Apparatus
19Diffraction Principles
Bragg's law: $n\lambda = 2d\sin\theta$
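As a quick numerical illustration of Bragg's law, $n\lambda = 2d\sin\theta$, the sketch below converts between a scattering angle and an interplanar spacing. The function names and the example wavelengths are illustrative assumptions, not part of the original slides.

```python
import math

def d_spacing(wavelength, two_theta_deg, order=1):
    """Interplanar spacing d (same units as wavelength) from Bragg's law."""
    theta = math.radians(two_theta_deg / 2.0)
    return order * wavelength / (2.0 * math.sin(theta))

def two_theta(wavelength, d, order=1):
    """Scattering angle 2*theta in degrees for planes of spacing d."""
    s = order * wavelength / (2.0 * d)
    if s > 1.0:
        # sin(theta) cannot exceed 1: these planes cannot diffract
        raise ValueError("n*lambda exceeds 2d; no diffraction possible")
    return 2.0 * math.degrees(math.asin(s))

# Example values (assumed): a 1.0 A beam and a reflection at 2*theta = 60 deg
print("d spacing:", d_spacing(1.0, 60.0))
# Angle at which 2.0 A planes diffract with Cu K-alpha radiation (1.5418 A)
print("2*theta:", two_theta(1.5418, 2.0))
```

Shorter wavelengths push a given d spacing to smaller angles, which is one reason why short-wavelength synchrotron radiation helps in reaching high resolution on a detector of fixed size.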
20The diffraction experiment
21Atomic scattering factor: the ratio of the amplitude of the wave scattered by an atom to that scattered by a single electron.
Structure factor $F_{hkl}$: the ratio of the amplitude of the wave scattered by all the atoms in a unit cell to that scattered by a single electron. It is the vector (amplitude and phase) representing the overall scattering from a particular set of Bragg planes.
The structure factor magnitude $|F(hkl)|$ is represented by the length of a vector in the complex plane.
The phase angle $\alpha(hkl)$ is the angle, measured counterclockwise, between the positive real axis and the vector F.
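22Structure factor as the Fourier transform of the electron density (integration over the unit cell):
$$F(h,k,l) = V \int_{x=0}^{1}\int_{y=0}^{1}\int_{z=0}^{1} \rho(x,y,z)\,\exp[2\pi i(hx+ky+lz)]\,dx\,dy\,dz$$
$F(h,k,l)$ describes a reflection; $\rho(x,y,z)$ is the electron density.
$V$: the volume of the unit cell
$|F_{hkl}|$: the structure-factor amplitude (proportional to the square root of the reflection intensity)
$\alpha_{hkl}$: the phase associated with the structure-factor amplitude $|F_{hkl}|$
We can measure the amplitudes, but the phases are lost in the experiment. This is the phase problem.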
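To make the phase problem concrete, the following minimal sketch computes a structure factor for a handful of point atoms. It replaces the integral over the electron density by the equivalent sum over atoms, $F(hkl) = \sum_j f_j \exp[2\pi i(hx_j + ky_j + lz_j)]$, and assumes constant, real scattering factors; the atom list and function names are made up for illustration.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp[2*pi*i*(h*x_j + k*y_j + l*z_j)].

    atoms: list of (f, (x, y, z)) with fractional coordinates and a
    constant, real scattering factor f (a simplifying assumption).
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical two-atom "structure"
atoms = [(6.0, (0.10, 0.20, 0.30)), (8.0, (0.40, 0.15, 0.70))]

F = structure_factor((1, 2, 0), atoms)
amplitude = abs(F)                          # recoverable from the experiment
phase_deg = math.degrees(cmath.phase(F))    # lost in the experiment
intensity = amplitude ** 2                  # what the detector records

print("amplitude:", amplitude)
print("phase (deg):", phase_deg)
print("intensity:", intensity)
```

Only `intensity` is measured, so `amplitude` can be recovered but `phase_deg` cannot, which is exactly the information the phasing methods have to supply.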
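24Fourier transform requires both structure-factor amplitudes and phases
Electron density calculation:
$$\rho(x,y,z) = \frac{1}{V}\sum_h\sum_k\sum_l |F(hkl)|\,\exp[i\alpha(hkl)]\,\exp[-2\pi i(hx+ky+lz)]$$
The phases $\alpha(hkl)$ are unknown.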
28Patterson function
- Patterson space has the same dimensions as the real-space unit cell.
- The peaks in the Patterson map are expressed in fractional coordinates.
- To avoid confusion, the x, y and z dimensions of Patterson vector space are called (u, v, w).
29What does the Patterson function represent?
- It represents a density map of the vectors between scattering atoms in the cell.
- Patterson density is proportional to the square of the scattering power of the atoms; therefore electron-rich (heavy) atoms contribute more to the Patterson map than light atoms.
30Patterson function: no phase information required
Consider the phaseless terms $(h, k, l, |F|^2)$:
$$P(u,v,w) = \frac{1}{V}\sum_h\sum_k\sum_l |F(hkl)|^2 \cos[2\pi(hu+kv+lw)]$$
No phase term appears.
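A one-dimensional toy calculation can show that a Patterson synthesis built from intensities alone still peaks at the interatomic vectors. The sketch below assumes two point atoms on a line, omits the 1/V scale factor, and truncates the series at an arbitrary `h_max`; all names and numbers are illustrative.

```python
import cmath
import math

def patterson_1d(atoms, n_grid=100, h_max=20):
    """P(u) = sum_h |F(h)|^2 * cos(2*pi*h*u), sampled at u = i / n_grid.

    atoms: list of (f, x) with fractional coordinate x (point atoms).
    Only the intensities |F(h)|^2 enter; no phases are used.
    """
    intensities = {}
    for h in range(-h_max, h_max + 1):
        F = sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
        intensities[h] = abs(F) ** 2
    return [sum(I * math.cos(2 * math.pi * h * i / n_grid)
                for h, I in intensities.items())
            for i in range(n_grid)]

# Hypothetical heavy atom at x = 0.10 and lighter atom at x = 0.40
atoms = [(10.0, 0.10), (5.0, 0.40)]
P = patterson_1d(atoms)

# Skip the origin peak and search half the cell (the map is centrosymmetric)
peak = max(range(5, 51), key=lambda i: P[i])
print("strongest non-origin peak at u =", peak / 100)
```

The strongest non-origin peak falls at the interatomic vector (0.40 - 0.10), with its mirror image at 1 - 0.30, because a Patterson map always contains both the vector and its inverse.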
31Patterson map
32Patterson map symmetry
Patterson map with symmetry
P21 equivalent positions: (x, y, z) and (-x, y + 1/2, -z)
Harker vectors: (u, v, w) = (2x, 1/2, 2z)
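The Harker vector for P21 is simply the difference between an atom and its symmetry mate, reduced into the unit cell. A small sketch using exact fractions follows; the helper names and the example atom position are assumptions for illustration.

```python
from fractions import Fraction as Fr

def p21_mate(pos):
    """Symmetry mate of (x, y, z) under the P21 screw axis along b."""
    x, y, z = pos
    return (-x, y + Fr(1, 2), -z)

def harker_vector(pos):
    """Vector between an atom and its P21 mate, reduced into [0, 1)."""
    mate = p21_mate(pos)
    return tuple((a - b) % 1 for a, b in zip(pos, mate))

def xz_from_harker(u, w):
    """One possible (x, z) for a peak (u, 1/2, w) on the Harker section.

    x + 1/2 and z + 1/2 are equally valid, and y is not determined
    because the origin along b is arbitrary in P21.
    """
    return (u / 2, w / 2)

atom = (Fr(1, 10), Fr(1, 5), Fr(3, 10))   # hypothetical heavy-atom site
u, v, w = harker_vector(atom)
print("Harker vector:", u, v, w)
print("recovered x, z:", xz_from_harker(u, w))
```

Every heavy-atom site therefore produces a peak on the v = 1/2 section of the Patterson map, and halving the peak coordinates gives the x and z of the site.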
35Diffracting a Cat
Diffraction data with phase information
Real Diffraction Data
36Reconstructing a Cat
FT
Easy
FT
Hard
37The importance of phases
38Phasing Methods
All assume some prior knowledge of the electron density or structure
39The Phase Problem
- Diffraction data record only intensity, not phase information (half the information is missing).
- To reconstruct the image properly you need to have the phases (even approximately).
- Guess the phases (molecular replacement)
- Search phase space (direct methods)
- Bootstrap phases (isomorphous replacement)
- Use differing wavelengths (anomalous dispersion)
40Acronyms for phasing techniques
- MR
- SIR
- MIR
- SIRAS
- MIRAS
- MAD
- SAD
41Direct methods
- Based on the positivity and atomicity of electron density, which lead to phase relationships between the (normalized) structure factors (E).
- Used to solve small-molecule structures.
- Proteins up to 1000 atoms, at resolution better than 1.2 Å.
- Used in computer programs (SnB, SHELXD, SHARP) to find the heavy-atom substructure.
Jerome Karle and Herbert A. Hauptman: Nobel Prize in Chemistry, 1985
42Dm cycle
Density modification procedures (e.g. solvent
flattening and averaging) can be carried out as
part of a cyclic process
43Molecular Replacement (MR)
Used when a homology model is available (sequence identity > 25%).
- 1. Orientation of the model in the new unit cell (rotation function)
- 2. Translation
44Molecular Replacement (MR)
- MR works because the Fourier transform works in both directions.
- Reflections ↔ model (density)
- Have to be careful of model bias
New Protein
Coordinates in PDB
MR solution
45Isomorphous replacement
- Why isomorphous replacement, i.e. why make heavy-atom (HA) derivatives?
- Phase determination
- Calculating $F_H$
- $F_H = F_{PH} - F_P$
- If the HA position is known, $F_H$ can be calculated from $\rho(x_H, y_H, z_H)$ by inverse FT
- HA position determination: Patterson function
49HA shifts FP by FH
50Isomorphous Replacement (SIR, MIR)
- Collect data on native crystals (no metals).
- Soak heavy-metal compounds into the crystals; the heavy atoms go to specific sites in the unit cell.
- e.g. Hg, Pt, Au compounds
- The unit cell must remain isomorphous.
- Collect data on the derivatives.
- As a result, only the intensities of the reflections change, not their indices.
- Measure the reflection intensity differences between the native and derivative data sets.
- Find the positions of the heavy atoms in the unit cell from the intensity differences.
- Generate vector maps (Patterson maps).
- $F_{P+HA} = F_P + F_{HA}$
- Must have at least two heavy-atom derivatives.
- The main limitations in obtaining accurate phasing from MIR are non-isomorphism and incomplete incorporation (low occupancy) of the heavy-atom compound.
Native and heavy-atom derivative diffraction
patterns superimposed and shifted
vertically. Note intensity differences for
certain reflections. Note the identical unit
cell (reflection positions). This suggests
isomorphism.
51Isomorphous HA derivatives change only the intensities of the reflections, not their indices
Native crystal HA derivative crystal
52Harker diagram
Once we have a heavy-atom structure $\rho_H(\mathbf{r})$, we can use it to calculate $F_H(\mathbf{S})$. In turn, this allows us to calculate phases for $F_P$ and $F_{PH}$ for each reflection.
Harker construction for SIR
The phase probability distribution shows that SIR
results in a phase ambiguity
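The two-fold ambiguity of the SIR Harker construction can be reproduced numerically. With $F_{PH} = F_P + F_H$, the cosine rule gives $\alpha_P = \alpha_H \pm \arccos\big[(|F_{PH}|^2 - |F_P|^2 - |F_H|^2) / (2|F_P||F_H|)\big]$. The sketch below assumes error-free amplitudes and a known heavy-atom contribution; the numbers and the function name are invented for illustration.

```python
import cmath
import math

def sir_phase_candidates(fp_amp, fph_amp, fh):
    """Two protein phases (radians) consistent with |FP|, |FPH| and complex FH."""
    fh_amp, alpha_h = abs(fh), cmath.phase(fh)
    cos_delta = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    # With real data, measurement error can push this slightly outside [-1, 1]
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.acos(cos_delta)
    return alpha_h + delta, alpha_h - delta

# Hypothetical "true" values, used only to simulate the measured amplitudes
true_fp = cmath.rect(10.0, math.radians(40.0))
fh = cmath.rect(3.0, math.radians(100.0))
fph = true_fp + fh

cands = sir_phase_candidates(abs(true_fp), abs(fph), fh)
print("candidate phases (deg):", [round(math.degrees(c), 1) for c in cands])
```

The true phase (40 degrees in this made-up case) is one of the two candidates, placed symmetrically about the heavy-atom phase; a single derivative cannot tell which one is correct.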
55MIR
We can use a second derivative to resolve the
phase ambiguity
Harker construction for multiple isomorphous
replacement (MIR)
58 AS: Anomalous scattering leads to a breakdown of Friedel's law
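Friedel's law states that $|F(hkl)| = |F(\bar{h}\bar{k}\bar{l})|$. The one-dimensional sketch below illustrates how an imaginary component in one atom's scattering factor breaks that equality. The scattering-factor values are rough, illustrative numbers rather than tabulated ones, and the function names are assumptions.

```python
import cmath
import math

def structure_factor(h, atoms):
    """1D structure factor: sum_j f_j * exp(2*pi*i*h*x_j); f_j may be complex."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)

def bijvoet_difference(h, atoms):
    """|F(h)| - |F(-h)|; zero whenever Friedel's law holds."""
    return abs(structure_factor(h, atoms)) - abs(structure_factor(-h, atoms))

light = (6.0, 0.10)                      # normal scatterer
heavy_normal = (26.0, 0.35)              # heavy atom with the f'' term ignored
heavy_anomalous = (26.0 + 4.0j, 0.35)    # same atom with an assumed f'' = 4

d_normal = bijvoet_difference(1, [light, heavy_normal])
d_anom = bijvoet_difference(1, [light, heavy_anomalous])
print("without anomalous scattering:", d_normal)
print("with anomalous scattering:", d_anom)
```

The difference is small compared with the amplitudes themselves, which is why anomalous phasing depends on accurately measured data.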
59Anomalous scattering data can also be used to resolve the phase ambiguity
Note that the anomalous differences are very small, so very accurate data are necessary
62Steps in MAD
- Introduce an anomalous scatterer.
- Incorporate SeMet in place of Met.
- Incorporate a heavy atom (HA), e.g. Hg, Pt, etc.
- Take your crystals to a synchrotron beamline (tunable wavelength).
- Collect data sets at 3 separate wavelengths: the Se (or other HA) absorption peak, the edge, and a wavelength distant from the peak.
- Measure the differences between Friedel mates to get an estimate of the phases for the Se atoms.
- These differences are quite small, so one needs to collect a lot of data (completeness, redundancy) to get a good estimate of the error associated with each measurement.
- Use the Se positions to obtain phase estimates for the protein atoms.
Atomic scattering factor: 3 terms ($f = f_0 + f' + if''$)
63Advantages of MAD
- All data is collected from one crystal
- Perfect isomorphism
- Fast
- Easily interpretable electron density maps
obtained right away.
64 SAD
Single-wavelength anomalous diffraction (SAD) phasing has become increasingly popular in protein crystallography.
Two main steps: 1) obtaining the initial phases; 2) improving the electron density map calculated with the initial phases.
- The essential point is to break the intrinsic phase ambiguity.
- Two kinds of phase information enable the discrimination of phase doublets from SAD data prior to density modification:
- From heavy atoms (expressed by the Sim distribution)
- From direct-methods phase relationships (expressed by the Cochran distribution)
67Breaking the OAS phase ambiguity
68OASIS (CCP4 Supported Program)
DESCRIPTION
OASIS is a computer program for breaking phase ambiguity in One-wavelength Anomalous Scattering (OAS) or Single Isomorphous Replacement (Substitution) (SIR) protein data. The phase problem is reduced to a sign problem once the anomalous-scatterer or the replacing-heavy-atom sites are located. OASIS applies a direct-method procedure to break the phase ambiguity intrinsic to OAS or SIR data.
REFERENCES
Fan, H. F. and Gu, Y. X. (1985) Combining direct methods with isomorphous replacement or anomalous scattering data III. The incorporation of partial structure information, Acta Cryst. A41, 280-284.
Fan, H. F., Hao, Q., Gu, Y. X., Qian, J. Z., Zheng, C. D. and Ke, H. (1990) Combining direct methods with isomorphous replacement or anomalous scattering data VII. Ab initio phasing of the OAS data from a small protein, Acta Cryst. A46, 935-939.
Liu, Y.-D., Harvey, I., Gu, Y.-X., Zheng, C.-D., He, Y.-Z., Fan, H.-F., Hasnain, S. S. and Hao, Q. (1999) Is single-wavelength anomalous scattering sufficient for solving phases? A comparison of different methods for a 2.1 Å structure solution, Acta Cryst. D55, 1620-1622.
AUTHORS
Q. Hao (1, 2), Y. X. Gu, C. D. Zheng and H. F. Fan (2)
(1) School of Applied Sciences, De Montfort University, Leicester LE1 9BH, England.
(2) Institute of Physics, Chinese Academy of Sciences, Beijing 100080, P. R. China.
Email: qhao_at_dmu.ac.uk or fan_at_aphy.iphy.ac.cn
69The first example of solving an unknown protein by direct-method phasing of the 2.1 Å OAS data
Rusticyanin, MW 16.8 kDa, SG P21
a = 32.43, b = 60.68, c = 38.01 Å, β = 107.82°
Anomalous scatterer: Cu
70Comparison of OAS and MAD phasing
(data from Dr. S. Ealick)
Panels: MAD phasing; direct-method OAS phasing
Proteins: Ompdc, Pure
71Radiation damage Induced Phasing (RIP)
- Radiation damage has been a curse of macromolecular crystallography from its early days.
- X-ray radiation damage to crystals can be caused by the breakage of covalent bonds as an immediate consequence of the absorption of an X-ray quantum (a primary effect), or by the destructive effect of the propagation of radicals throughout the crystal (a secondary effect).
- Total dose and dose rate both play a role in the amount of radiation damage inflicted on a protein crystal.
72- The most pronounced structural changes observed were disulphide-bond breakage and associated main-chain and side-chain movements, as well as decarboxylation of aspartate and glutamate residues.
- The structural changes induced on the sulphur atoms were successfully used to obtain high-quality phase estimates through an RIP (Radiation damage Induced Phasing) procedure.
74Radiation damage Induced Phasing with Anomalous Scattering (RIPAS)
- A substructure solution and phasing procedure using a combination of anomalous scattering and radiation-damage-induced isomorphous differences.
- The RIPAS strategy is beneficial both for locating the substructure and for subsequent phasing.
75Experimental electron density before solvent flattening with SAD (left), RIP (middle) and RIPAS (right) phases for (a) the CS thaumatin data (thaumatin crystal soaked in a diluted N-iodosuccinimide solution) and (b) the IC thaumatin data (iodinated crystallized thaumatin)
76Methods of phase improvement
It is not always (!) possible to recognise features in a first electron density map. There are, however, ways of improving the map (phases):
- Solvent flattening
- Histogram matching
- Non-crystallographic symmetry (NCS) averaging
These methods can result in dramatic improvements in the clarity of the electron density map.
771. Solvent flattening. Protein crystals contain large amounts of solvent; this is in general disordered, and so does not contribute to the crystal diffraction. Knowing the protein content of the crystal, it is therefore possible to determine a threshold density below which the map is treated as noise: points with density below the threshold are set to a suitable average value. This is particularly useful for locating molecular boundaries.
2. Averaging. If the asymmetric unit contains more than one molecule, equivalencing the various copies can lead to dramatic improvement in the map and the phases.
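The thresholding idea behind solvent flattening can be sketched in a few lines. This toy version works on a flat list of map values and assumes that the lowest-density fraction of grid points, equal to the known solvent content, is solvent. Real programs first smooth the map to obtain a contiguous solvent envelope, which is not attempted here; the function name and the data are hypothetical.

```python
def flatten_solvent(density, solvent_fraction):
    """Set the lowest `solvent_fraction` of grid points to their mean value.

    density: flat list of map values on a grid.
    Returns (flattened_map, solvent_mask).
    """
    n_solvent = int(round(solvent_fraction * len(density)))
    # Indices sorted by density; the lowest ones are treated as solvent
    order = sorted(range(len(density)), key=lambda i: density[i])
    solvent = set(order[:n_solvent])
    mask = [i in solvent for i in range(len(density))]
    if not solvent:
        return list(density), mask
    mean_solvent = sum(density[i] for i in solvent) / n_solvent
    flattened = [mean_solvent if m else v for v, m in zip(density, mask)]
    return flattened, mask

# Hypothetical noisy map values for a crystal with 50% solvent
density = [0.1, 0.9, 0.2, 1.5, 0.0, 1.1, 0.3, 0.05, 2.0, 0.15]
flattened, mask = flatten_solvent(density, 0.5)
print(flattened)
print(mask)
```

In practice the flattened map is Fourier transformed back to obtain improved phases, which are combined with the experimental ones, and the cycle is repeated.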
78Improvement in electron density after solvent
flattening and histogram matching
Before
Green solvent envelope
After
79Interpretation of the Electron Density (Building the Model)
- Lots of fun!
- Trace the main chain.
- Try to recognize the amino acid sequence in the density.
- Programs: XtalView, O
80The effect of resolution on the quality of the electron density map
Panels: 2.0 Å, 1.5 Å, 1.2 Å
- 5.0 Å: see the shape of the molecule
- 3.0 Å: see the main chain and some side chains
- 2.5 Å: see main-chain carbonyls
- 1.5 Å: atomic resolution
81Resolution
1.2 Å
2 Å
3 Å
82Atomic resolution
83Fitting side chains, adding waters
- If the density is good enough you can recognize alternate conformations for side chains.
- Hydrogens are not seen in the density, except in ultra-high-resolution structures (< 1.0 Å).
- Ordered waters are seen on the surface and occasionally in the interior of the protein.
- At 2.0 Å resolution or better: 1 water per residue.
- Water molecules play a big role in protein stability and enzyme catalysis.
- Because the density depends on experimental phases, which have errors associated with them, the first model can have many errors.
- Therefore it is essential to refine the atomic positions and their thermal parameters.
84Chain Tracing
Electron density → Chain trace → Final model
85Map coefficients used to minimize model bias
- 2Fo − Fc: the most common map seen in papers.
- Fo − Fc (difference map): used with the above map to detect errors.
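For a single reflection, the two kinds of map coefficient combine the observed amplitude with the amplitude and phase calculated from the model. The sketch below assumes that Fo and Fc are already on the same scale and applies no weighting; production programs normally use weighted coefficients. The function name and numbers are illustrative.

```python
import cmath
import math

def map_coefficients(fo_amp, fc):
    """Return (2Fo-Fc, Fo-Fc) Fourier coefficients for one reflection.

    fo_amp: observed amplitude |Fo|.
    fc: complex structure factor calculated from the model.
    Both coefficients carry the model phase.
    """
    phase = cmath.phase(fc)
    fc_amp = abs(fc)
    two_fo_fc = cmath.rect(2.0 * fo_amp - fc_amp, phase)
    fo_fc = cmath.rect(fo_amp - fc_amp, phase)
    return two_fo_fc, fo_fc

# Hypothetical reflection: |Fo| = 10, model gives |Fc| = 8 at 90 degrees
two_fo_fc, fo_fc = map_coefficients(10.0, cmath.rect(8.0, math.radians(90.0)))
print("2Fo-Fc coefficient:", two_fo_fc)
print("Fo-Fc coefficient:", fo_fc)
```

Because both coefficients borrow the model phase, the resulting maps still carry some model bias.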
86Refinement Cycle
Refinement: improving the agreement between the model and the experimental density.
- Compare Fobs (from the reflection intensities) with Fcalc (calculated from the model)
- Least-squares minimization
- Simulated annealing / molecular dynamics
R-factor: a numerical indicator used to follow the progress of refinement (agreement between data and model)
87Refinement
88Refinement
Plot: R versus refinement iterations
$R = \sum \big|\,|F_o| - |F_c|\,\big| \,/\, \sum |F_o|$
Fc: calculated structure factor
Fo: observed structure factor
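The R-factor, $R = \sum ||F_o| - |F_c|| / \sum |F_o|$, is straightforward to compute once the observed and calculated amplitudes are on a common scale. The sketch below assumes that scaling has already been done and uses invented amplitudes.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    f_obs, f_calc: equal-length sequences of amplitudes (or complex
    structure factors), assumed to be on the same scale already.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Hypothetical amplitudes for three reflections
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
print("R =", r_factor(f_obs, f_calc))
```

A perfect model would give R = 0, and R is expected to fall as refinement proceeds.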
90Protein Data Bank growth
- Molecular biology: cloning of genes / overexpression of proteins
- Synchrotron radiation: MAD phasing, smaller crystals
- Cryo-cooling of crystals: collect data from 1 crystal, increase order
- Instrumentation and software improvements
- Increase in the number of labs using the technique
93- Owing to the advent of synchrotron radiation and the selenomethionine derivatization technique, the total number of protein structures deposited in the PDB from 1980 onwards has increased dramatically.
- The MAD technique played a major role in this. At present nearly 100 new structures are deposited every week.