Xray Crystallography Workshop - PowerPoint PPT Presentation

Provided by: laurie110
Transcript and Presenter's Notes

1
X-ray Crystallography Workshop
  • DAY 3
  • Software
  • Data processing from crystal
  • Lunch
  • Lecture on Molecular Replacement
  • Everybody run CCP4 programs

2
Before we solve the structure, we need to
convert the scaled output data (Intensities) to
Amplitudes
  • Scaled data are h, k, l, Intensity, and the
    standard deviation of the Intensity, σ(I)
  • A CCP4 program called Truncate does the
    following
  • Analyses the data by calculating a Wilson plot
    (calculates an absolute scale and temperature
    factor for a set of observed intensities, using
    the theory of A. C. Wilson). WE TELL THE PROGRAM
    THE APPROXIMATE NUMBER OF AMINO ACIDS IN THE
    ASYMMETRIC UNIT, AND IT USES A FORMULA TO FIND
    THE AVERAGE f². IF the atoms are randomly
    distributed through the asymmetric unit THEN
    ⟨f²⟩ should equal
    scale × ⟨Fobs²⟩ × exp[-2B(sin θ/λ)²].
    By fitting a least squares line through
    ln(⟨f²⟩/⟨Fobs²⟩) vs 2(sin θ/λ)², the program
    derives the scale and B value. For real
    structures the assumption that the atoms are
    randomly distributed is obviously incorrect. The
    effect of this is most obvious in the low
    resolution reflections: the Wilson plot will
    deviate from a straight line from about
    3.0 Å - 4.0 Å downwards. Although all the points
    on the Wilson plot are plotted, the scale and B
    are only determined from a limited resolution
    range.
  • Truncates the data by a method devised by
    French and Wilson based on Bayesian statistics.
    This has the effect of forcing all negative
    observations to be positive, and inflating the
    weakest reflections (less than about 3 sd),
    because an observation significantly smaller than
    the average intensity is likely to be
    underestimated. The F's are calculated using the
    prior knowledge of Wilson's distributions for
    acentric or centric data (calculated in shells of
    reciprocal space in a first pass through the
    data) and the mean intensity and standard
    deviation values. The F's output are all positive
    and follow Wilson's distribution.
  • Analyses the cumulative intensity distribution of
    the data to test for twinning or anisotropy in
    the data
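To see why Truncate does not simply take square roots, here is a minimal sketch of the naive conversion. It is not the French and Wilson procedure; the function name `naive_amplitude` and the sample reflections are made up for illustration.

```python
import math

def naive_amplitude(intensity, sigma_i):
    """Convert one intensity to an amplitude by a plain square root.

    Returns (F, sigF), or None when the intensity is zero or negative,
    because the square root (and the error propagation) is undefined there.
    """
    if intensity <= 0.0:
        return None
    f = math.sqrt(intensity)
    # First-order error propagation: F = sqrt(I)  =>  sigF = sigI / (2 F)
    sig_f = sigma_i / (2.0 * f)
    return f, sig_f

# Hypothetical scaled observations: (h, k, l, I, sigI)
reflections = [
    (1, 0, 0, 100.0, 10.0),   # strong reflection
    (2, 1, 0, 4.0, 3.0),      # weak reflection, less than 3 sd
    (3, 1, 1, -5.0, 4.0),     # negative after background subtraction
]

for h, k, l, i_obs, sig_i in reflections:
    print((h, k, l), naive_amplitude(i_obs, sig_i))
```

The negative observation cannot be converted at all this way, which is the gap the Bayesian treatment is meant to fill.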

3
Example of a Wilson Plot
  • Every atom's diffracting power is further reduced
    at higher resolution by any atomic vibration
    (i.e. the temperature factor)
  • The deviations from the scattering of a single
    atom in protein crystals come from the non-random
    distribution of atoms in the unit cell -
    including things like alpha-helices and beta
    sheets
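A Wilson plot fit can be sketched in a few lines. This is only an illustration under stated assumptions, not what Truncate actually runs: it assumes the common textbook form ⟨I_obs⟩ = K × Σf² × exp[-2B(sin θ/λ)²] in each resolution shell, so K and B are defined by that form only (sign and scale conventions differ between write-ups). The shell values are synthetic and the function names are hypothetical.

```python
import math

def s_squared(d_spacing):
    """(sin(theta)/lambda)^2 for a resolution d in Angstrom, from Bragg's law."""
    return 1.0 / (4.0 * d_spacing ** 2)

def wilson_fit(s2_values, mean_i_obs, sum_f2):
    """Least-squares line through ln(<I_obs>/sum_f2) vs (sin(theta)/lambda)^2.

    Assumes <I_obs> = K * sum_f2 * exp(-2 * B * s2) in each shell.
    Returns (K, B).
    """
    ys = [math.log(i / f2) for i, f2 in zip(mean_i_obs, sum_f2)]
    n = len(s2_values)
    mean_x = sum(s2_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in s2_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s2_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope / 2.0

# Synthetic shells between 3.0 and 1.8 A, generated with K = 0.5 and B = 20 A^2
shells_d = [3.0, 2.7, 2.4, 2.1, 1.8]
s2 = [s_squared(d) for d in shells_d]
sum_f2 = [1000.0, 900.0, 800.0, 700.0, 600.0]   # made-up sum of f^2 per shell
mean_i = [0.5 * f2 * math.exp(-2.0 * 20.0 * x) for f2, x in zip(sum_f2, s2)]

k_fit, b_fit = wilson_fit(s2, mean_i, sum_f2)
print(f"K = {k_fit:.3f}, B = {b_fit:.2f} A^2")
```

Only shells beyond about 3 Å are used here, mirroring the point that the low resolution part of a real Wilson plot is not a straight line.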

4
The CCP4 programs use a data file format called
mtz (filename.mtz), a binary file format
  • Truncate outputs an mtz file with h, k, l, F, σ(F),
    plus the original Intensity for each reflection,
    including anomalous data if we choose to
    (remember that h, k, l and -h, -k, -l are NOT
    equal in the case of heavy atoms)
  • This is the file we will use for the Molecular
    Replacement solution for lysozyme

5
The phase problem can be solved in several
different ways
  • If you have the structure of the same or a
    closely related protein, you can use it as a
    search model (molecular replacement)
  • Determine the orientation of the model in the
    unit cell of the new structure
  • Determine the three angles and the translations
    that will place the model correctly
  • Need some kind of target score to evaluate this
    placement
  • The original target uses the Patterson function
    (see
    http://www-structmed.cimr.cam.ac.uk/Course/MolRep/molrep.html)

6
Remember that the equation for electron density
includes the amplitude and the phase; the phase
comes from the positions of all atoms in the
unit cell relative to the origin
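The dependence of the phase on the origin can be checked numerically. The sketch below uses the standard structure factor sum F(hkl) = Σ f_j exp[2πi(hx_j + ky_j + lz_j)] with constant scattering factors and no temperature factors; the three atoms and the function names are invented for illustration.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)).

    atoms is a list of (f, x, y, z) with fractional coordinates; f is treated
    as a constant scattering factor and temperature factors are ignored.
    """
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, x, y, z in atoms
    )

def shift(atoms, dx, dy, dz):
    """Move every atom by the same fractional translation (a change of origin)."""
    return [(f, x + dx, y + dy, z + dz) for f, x, y, z in atoms]

# Hypothetical three-atom "structure"
atoms = [(6.0, 0.10, 0.20, 0.30), (7.0, 0.40, 0.15, 0.60), (8.0, 0.75, 0.55, 0.05)]
hkl = (1, 2, 3)

f_before = structure_factor(hkl, atoms)
f_after = structure_factor(hkl, shift(atoms, 0.25, 0.0, 0.0))

print("amplitude:", abs(f_before), abs(f_after))
print("phase (deg):", math.degrees(cmath.phase(f_before)),
      math.degrees(cmath.phase(f_after)))
```

Moving all atoms together leaves the amplitude unchanged but shifts the phase, which is why measured amplitudes alone cannot tell you where the molecule sits.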

7
The software we will use for
molecular replacement is called PHASER
  • Link to PHASER Website:
    http://www-structmed.cimr.cam.ac.uk/phaser/
  • This is part of the CCP4 software collection

8
PHASER uses a different target function than the
older programs: it takes a more
probabilistic (maximum likelihood) approach
  • If you are very mathematically inclined, here it
    is
  • Randy Read's paper

9
Let's go back to the main window that opens
when you start CCP4i
  • You see three sections
  • The left-hand one is where you select the program
    you want to run
  • The middle gives a list of jobs that have been or
    are being run under the current project, and
    their status (running, finished)
  • The right-hand section gives a list of tools

10
To run a particular program, select it from
the menu on the left-hand side
  • Select the Data Reduction menu option
  • Select the import scaled option; this will open a
    new window
  • These steps are the same for running any CCP4
    program

11
  • Now we enter the required data in the yellow
    boxes
  • The Browse button allows you to select a file
    rather than having to type the name in
  • We will NOT use anomalous data
  • This is the window for running the Truncate
    program discussed previously

12
Now let's run PHASER with the output mtz from
that last step, but first
  • We need a model for lysozyme that we can use as
    our search model for molecular replacement
  • We can go to the PDB to retrieve a model
  • http://www.rcsb.org/pdb/home/home.do
  • We used this one (1W6Z) successfully in our
    Spring Crystallography course

13
It's a good idea when you use a solved
structure to edit it to get rid of things like
waters and ions that may not be in your structure
in the same place
  • Use a text editor to remove the lines after the
    protein atoms (the protein atoms are the lines
    starting with ATOM). For example, the authors
    used holmium (HO) to solve the structure, and
    they also modeled some Cl ions (which we should
    also be able to do)
  • This will be your search model file
  • More on pdb files ???
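The same edit can be scripted instead of done by hand. This is a minimal sketch, assuming the waters and ions are stored as HETATM records (the usual PDB convention) and that no modified amino acids are stored that way; the function names and the example records, including their coordinates, are invented for illustration.

```python
def keep_protein_atoms(pdb_lines):
    """Keep only ATOM, TER and END records; drop headers, HETATM, etc."""
    wanted = {"ATOM", "TER", "END"}
    return [line for line in pdb_lines if line[:6].strip() in wanted]

def write_search_model(in_path, out_path):
    """Read a PDB file, keep the protein atoms, and write the search model."""
    with open(in_path) as src:
        kept = keep_protein_atoms(src.readlines())
    with open(out_path, "w") as dst:
        dst.writelines(kept)
    return len(kept)

# Made-up records in PDB layout (coordinates are not from any real entry)
example = [
    "HEADER    HYDROLASE\n",
    "ATOM      1  N   LYS A   1       3.287  10.092  10.329  1.00 12.00           N\n",
    "ATOM      2  CA  LYS A   1       2.445  10.457   9.182  1.00 12.00           C\n",
    "TER\n",
    "HETATM 1002 HO    HO A 201      10.000  12.000   5.000  0.50 20.00          HO\n",
    "HETATM 1003 CL    CL A 202      15.000   8.000   2.000  1.00 20.00          CL\n",
    "HETATM 1010  O   HOH A 301      20.000  18.000   7.000  1.00 25.00           O\n",
    "END\n",
]

for line in keep_protein_atoms(example):
    print(line, end="")
```

For a real file you would call something like `write_search_model("1w6z.pdb", "search_model.pdb")`; both file names are placeholders.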

14
Now let's go back to the PHASER window
  • You select your data file at the "MTZ in" line
  • You put your model file in the "PDB 1" line (I
    want you to use a special version of the file
    where I jumbled up the coordinates, i.e. moved
    the molecule around)
  • What about these other items?
  • Mode - automated search
  • Resolution range - only need to 2.5 Å
  • Component - fill in
  • Sequence identity - 1 for us
  • Search details - ensemble1
  • We will use the defaults for this simple problem

15
When you have filled in all the blanks, you click
on the Run or Run and view com file button
  • This shows you the script that the GUI has
    written into the tmp (for temporary) directory
  • These were the scripts that we used back before
    the GUI to run each program individually from the
    Unix terminal; now you are spared all that
    editing
  • Click continue and it will execute the program
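For orientation, the keyword part of such a script might look roughly like the text generated below. This is a sketch written from memory of the Phaser keyword style, not a copy of what the GUI produces, so compare it with the com file you actually see. The file names, the column labels F and SIGF, and the helper `phaser_keywords` are assumptions; the molecular weight is the approximate value for lysozyme.

```python
def phaser_keywords(mtz, pdb, identity, mol_weight, root,
                    f_label="F", sigf_label="SIGF", ensemble="ensemble1"):
    """Assemble a keyword script for an automated molecular replacement run.

    All arguments are placeholders supplied by the caller; nothing is
    checked against the actual files.
    """
    lines = [
        "MODE MR_AUTO",                                   # automated search
        f"HKLIN {mtz}",                                   # data file from Truncate
        f"LABIN F={f_label} SIGF={sigf_label}",           # amplitude columns
        f"ENSEMBLE {ensemble} PDBFILE {pdb} IDENTITY {identity}",
        f"COMPOSITION PROTEIN MW {mol_weight} NUM 1",     # one molecule expected
        f"SEARCH ENSEMBLE {ensemble} NUM 1",
        f"ROOT {root}",                                   # prefix for output files
    ]
    return "\n".join(lines) + "\n"

script = phaser_keywords(
    mtz="lysozyme_truncate.mtz",      # hypothetical file name
    pdb="search_model.pdb",           # hypothetical file name
    identity=1.0,
    mol_weight=14300,
    root="workshop_2",
)
print(script)
```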

16
When PHASER is finished running, you can look at
the various output files
  • Click on the right-hand side of the main CCP4
    window; you can look at the log file and the
    output files.
  • First, let's look at the workshop_2.sol

17
  • The one solution is shown on the last line - the
    three angles that the search model is rotated
    through to match the data, and the three
    translations (fractional part of the unit cell)
  • The scores are shown on the line above - LLG is
    large and positive, Z values are way above the
    benchmark of 7
  • We can also go through the log file
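What the three angles and three translations do to the model can be sketched as follows. Two assumptions are made loudly here: the angles are treated as Euler angles about z, then y, then z (the convention commonly used in CCP4; check the program documentation), and the cell axes are taken to be orthogonal so that a fractional translation is just a fraction of each cell edge. The cell edges and coordinates are invented.

```python
import math

def rot_z(angle_deg):
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle_deg):
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(alpha, beta, gamma):
    """Rotation matrix R = Rz(alpha) * Ry(beta) * Rz(gamma), angles in degrees."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))

def place(coords, euler, frac_shift, cell_edges):
    """Rotate each (x, y, z) in Angstrom, then add the translation.

    frac_shift is in fractions of the cell edges; cell_edges is (a, b, c)
    for a cell with all angles equal to 90 degrees.
    """
    r = euler_zyz(*euler)
    t = [f * edge for f, edge in zip(frac_shift, cell_edges)]
    placed = []
    for xyz in coords:
        rotated = [sum(r[i][k] * xyz[k] for k in range(3)) for i in range(3)]
        placed.append(tuple(rotated[i] + t[i] for i in range(3)))
    return placed

# One made-up atom, rotated 90 degrees about z and shifted by half a cell in x
print(place([(1.0, 0.0, 0.0)], (90.0, 0.0, 0.0), (0.5, 0.0, 0.0), (80.0, 80.0, 40.0)))
```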

18
What happens when we use a model that has all the
side chains replaced by Ala?
  • The scores are lower but basically it finds the
    same solution - the backbone is so similar that
    the side chains are not so important.
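A polyalanine model like this can be made by trimming each residue back to its backbone plus CB. The sketch below is one way to do it, assuming standard fixed-column PDB records; glycine is left as glycine because it has no CB. The helper `example_atom` only builds dummy records for the demonstration.

```python
BACKBONE_PLUS_CB = {"N", "CA", "C", "O", "CB"}

def to_polyalanine(pdb_lines):
    """Drop side-chain atoms beyond CB and rename non-glycine residues to ALA."""
    trimmed = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            trimmed.append(line)          # leave non-ATOM records untouched
            continue
        atom_name = line[12:16].strip()   # PDB columns 13-16
        res_name = line[17:20]            # PDB columns 18-20
        if atom_name not in BACKBONE_PLUS_CB:
            continue
        if res_name != "GLY":
            line = line[:17] + "ALA" + line[20:]
        trimmed.append(line)
    return trimmed

def example_atom(serial, atom_name, res_name, res_seq):
    """Build a column-aligned ATOM record with dummy coordinates (illustration only)."""
    return "ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f%6.2f%6.2f\n" % (
        serial, atom_name, res_name, res_seq, 0.0, 0.0, 0.0, 1.0, 20.0)

model = [
    example_atom(1, " N  ", "LYS", 1),
    example_atom(2, " CA ", "LYS", 1),
    example_atom(3, " CB ", "LYS", 1),
    example_atom(4, " CG ", "LYS", 1),   # side-chain atom, will be removed
    example_atom(5, " N  ", "GLY", 2),
    example_atom(6, " CA ", "GLY", 2),
]

for line in to_polyalanine(model):
    print(line, end="")
```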

19
What happens when we use a model that is related
but not 100% identical?
  • Try bob-white quail lysozyme
  • These numbers LOOK different but it's actually
    almost superimposed onto the solution with
    hen-egg-white lysozyme - the original model for
    bob-white quail is in an entirely different unit
    cell. I could have tried to superimpose the
    models first, but this molecule has only side
    chain differences and no insertions or main chain
    differences
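The sequence identity that PHASER asks for can be estimated from an alignment of the two sequences. This is a minimal sketch that assumes the sequences are already aligned with "-" for gaps; the two short strings are invented and are not the real hen or quail lysozyme sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over the aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned fragments differing at one position
target = "ACDEFGHIKL"
model = "ACDEYGHIKL"

identity = percent_identity(target, model)
print(f"{identity:.1f}% identical, i.e. {identity / 100.0:.2f} as a fraction")
```

The workshop entered the identity as a fraction (1 for an identical model), so a percentage would be divided by 100 first.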

20
What happens when we use a model that is not
related closely enough?
  • Try T4 lysozyme (phage)
  • The scores are a lot lower, and you get more than
    one top solution
  • Negative LLG values are a pretty good sign of a
    bad solution, as are low Z scores
  • If you look at the log file for this run, you
    will see that it is long and that the program
    made many, many attempts to fit SOMETHING
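The rules of thumb used in this workshop (LLG should be positive, Z scores should clear the benchmark of 7) can be written down as a small check. This is only a restatement of those rules, not PHASER's own logic; the function name, the three labels and the example scores are made up.

```python
def judge_solution(llg, z_score, z_benchmark=7.0):
    """Classify a molecular replacement result by the workshop rules of thumb."""
    if llg <= 0.0:
        return "unlikely"      # negative LLG: a pretty good sign of a bad solution
    if z_score >= z_benchmark:
        return "likely"        # positive LLG and Z at or above the benchmark
    return "uncertain"         # positive LLG but a low Z score

# Hypothetical scores for three runs
for name, llg, z in [("identical model", 850.0, 25.0),
                     ("distant model", 12.0, 4.5),
                     ("wrong model", -30.0, 3.0)]:
    print(name, "->", judge_solution(llg, z))
```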

21
PHASER outputs a pdb file that contains the
search model rotated and translated by the
appropriate amounts indicated in the solution
file.
  • It also outputs an mtz file with the original
    structure factor amplitudes, but also the
    coefficients for two types of electron density
    maps
  • 2Fo-Fc
  • Fo-Fc

2Fo-Fc (think of it as Fo + (Fo-Fc)) uses phases
calculated from the model and amplitudes of twice
the measured data minus the calculated data.
It gives you the model electron density PLUS the
differences between the REAL data and the
CALCULATED data.

Fo-Fc (difference map) uses
phases calculated from the model and amplitudes
from the REAL minus the CALCULATED data. It tells
you where you either need atoms (positive
difference density) or where you need to get rid
of atoms (negative density).
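For a single reflection, the two sets of map coefficients can be sketched like this. It uses the plain, unweighted forms named on the slide; real programs usually apply likelihood weights to Fo and Fc, which this sketch leaves out. The function name and the numbers are invented.

```python
import cmath
import math

def map_coefficients(f_obs, f_calc, phi_calc_deg):
    """Return the (2Fo-Fc, Fo-Fc) coefficients as complex numbers.

    f_obs and f_calc are amplitudes; the phase always comes from the model.
    """
    phase = cmath.exp(1j * math.radians(phi_calc_deg))
    two_fo_fc = (2.0 * f_obs - f_calc) * phase
    fo_fc = (f_obs - f_calc) * phase
    return two_fo_fc, fo_fc

# Hypothetical reflection where the model underestimates the measured amplitude
two_fo_fc, fo_fc = map_coefficients(f_obs=120.0, f_calc=100.0, phi_calc_deg=90.0)
print("2Fo-Fc amplitude:", abs(two_fo_fc))
print("Fo-Fc amplitude:", abs(fo_fc))
```

When Fo is smaller than Fc the difference coefficient changes sign, which is what produces negative density in the Fo-Fc map.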