Shape determination of proteins in solution using high throughput computing


Title: Shape determination of proteins in solution using high throughput computing


1
Shape determination of proteins in solution using high throughput computing
Donna Lammie
Structural Biophysics Group
School of Optometry and Vision Sciences
Cardiff University
2
Outline of Talk
  • Setting the scene
  • Data collection
  • Data analysis software
  • High throughput computing through Condor
3
Proteins are large organic compounds made of
amino acids arranged in a linear chain and joined
together by peptide bonds. Each protein has a
unique, genetically defined amino acid sequence
which determines its specific shape and
function. Proteins can work together to achieve
a particular function, and they often associate
to form stable complexes.
4
Roles and Functions
  • Enzymes
  • Structural
  • Hormones
  • Immunoglobulins
  • Involved in oxygen transport
  • Muscle contraction
  • Cell signalling
5
Techniques to investigate protein structures
X-ray crystallography is the science of determining the arrangement of atoms
within a crystal from the manner in which a beam of X-rays is scattered from
the electrons within the crystal.
Restrictions: X-ray crystallography requires good quality crystals, and many
proteins do not crystallise well. Therefore, a significant fraction of
proteins cannot be analysed.
6
Structure of haemoglobin - the iron-containing
oxygen-transport metalloprotein in the red blood
cells of vertebrates and other animals.
7
  • Importance of understanding structure of proteins
  • how individual components fit together to build
    complex systems
  • structure - function
  • Possibility of manipulation
  • Drug design
  • Drug therapies

8
  • Why scatter solutions?
  • Main advantage - the possibility to study the
    structure and structural dynamics of native
    particles in physiological solutions.
  • Broad range of sizes and conditions
  • Shape
  • Complexes

9
Research Interests
  • Structural organisation at the nanometer length scale
  • Systems in solution / in situ
X-ray scattering is ideal for investigating the structure and organisation of
particles/molecules in a system. It provides information about the size,
shape and arrangement of particles/molecules.
10
X-rays interact with molecules and are
deflected. We can interpret the deflection.
11
The shape and distribution of the scattering
provide information such as the size, shape and
arrangement of the scattering particles.
12
Synchrotron sources
13
The two-dimensional data were converted into one-dimensional linear profiles.
The profiles were background corrected: the buffer scattering was subtracted
from the protein scattering using PRIMUS.
Konarev, P.V., Volkov, V.V., Sokolova, A.V., Koch, M.H.J. and Svergun, D.I.
(2003) J. Appl. Cryst., 36, 1277-1282.
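The buffer subtraction described on this slide can be illustrated with a minimal Python sketch. This is not PRIMUS code: the function name `subtract_buffer`, the `scale` parameter and the sample intensities are assumptions made purely for illustration, and both profiles are assumed to be sampled on the same q grid.

```python
def subtract_buffer(protein, buffer, scale=1.0):
    """Return protein intensities minus scaled buffer intensities.

    Both inputs are lists of intensities on the same q grid (an assumption
    of this sketch; real data may need regridding first).
    """
    if len(protein) != len(buffer):
        raise ValueError("profiles must have the same number of points")
    return [p - scale * b for p, b in zip(protein, buffer)]


# Hypothetical intensities, for illustration only.
protein_profile = [120.0, 95.0, 60.0, 32.0]
buffer_profile = [20.0, 19.0, 18.0, 17.0]
corrected = subtract_buffer(protein_profile, buffer_profile)
```

In practice the subtraction would also carry the experimental errors through; that is left out here to keep the idea visible.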
14
Subtracted/corrected data
15
GNOM was used to estimate the particle distance distribution function, p(r),
from the experimental scattering data. The GNOM output is used as input to
DAMMIN and GASBOR.
Semenyuk, A.V. and Svergun, D.I. (1991) J. Appl. Cryst., 24, 537-540.
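To make the link between p(r) and the scattering profile concrete, the sketch below evaluates the standard forward relation I(q) = 4π ∫ p(r) sin(qr)/(qr) dr with the trapezoidal rule. It only illustrates the relationship; GNOM solves the harder inverse problem, and this is not its algorithm. The function name `intensity_from_pr` and the bump-shaped p(r) are assumptions for illustration.

```python
import math


def intensity_from_pr(q, r_values, p_values):
    """Trapezoidal estimate of I(q) = 4*pi * integral p(r) sin(qr)/(qr) dr."""
    def integrand(r, p):
        x = q * r
        # sin(x)/x tends to 1 as x tends to 0
        return p if x == 0.0 else p * math.sin(x) / x

    total = 0.0
    for i in range(len(r_values) - 1):
        width = r_values[i + 1] - r_values[i]
        left = integrand(r_values[i], p_values[i])
        right = integrand(r_values[i + 1], p_values[i + 1])
        total += 0.5 * (left + right) * width
    return 4.0 * math.pi * total


# Hypothetical p(r): a smooth bump that vanishes at r = 0 and r = Dmax.
d_max = 50.0
r_grid = [d_max * i / 200 for i in range(201)]
p_grid = [r * r * (d_max - r) ** 2 for r in r_grid]
i_zero = intensity_from_pr(0.0, r_grid, p_grid)
i_high = intensity_from_pr(0.2, r_grid, p_grid)
```

The forward scattering `i_zero` is the largest value the profile can take for a non-negative p(r), which is why the intensity falls away as q increases.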
16
Size and shape of molecules in solution can be extracted from the scattering
pattern using a series of computer algorithms. DAMMIN uses an ab initio
method to build models of the protein shape by simulated annealing using a
single-phase dummy-atom model (Svergun, 1999). GASBOR uses similar
parameters to DAMMIN; however, instead of the dummy-atom model, an ensemble
of dummy residues is used to form a chain-compatible model (Svergun et al.,
2001).
Svergun, D.I. (1999) Biophys. J., 76, 2879-2886.
Svergun, D.I., Petoukhov, M.V. and Koch, M.H.J. (2001) Biophys. J., 80,
2946-2953.
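Simulated annealing, the search method named on this slide, can be sketched on a toy problem. The code below is a generic illustration and not DAMMIN's implementation: the function `anneal`, its parameters, and the `mismatch` target (a count of beads differing from a hypothetical shape, standing in for a real discrepancy between calculated and measured scattering) are all assumptions.

```python
import math
import random


def anneal(energy, n_beads, steps=5000, t_start=2.0, cooling=0.999, seed=0):
    """Minimise energy(state) over 0/1 bead assignments by simulated annealing."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_beads)]
    current = energy(state)
    best_state, best = list(state), current
    temperature = t_start
    for _ in range(steps):
        i = rng.randrange(n_beads)
        state[i] ^= 1  # flip one bead between solvent (0) and particle (1)
        candidate = energy(state)
        delta = candidate - current
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if current < best:
                best_state, best = list(state), current
        else:
            state[i] ^= 1  # reject: undo the flip
        temperature *= cooling  # cool slowly so fewer bad moves are accepted
    return best_state, best


# Toy target function: number of beads that differ from a hypothetical shape.
target = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1]


def mismatch(state):
    return sum(a != b for a, b in zip(state, target))


model, score = anneal(mismatch, len(target))
```

Because the starting state and the accepted moves are random, different seeds can end in different models, which is one reason repeated runs are averaged.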
17
(No Transcript)
18
(No Transcript)
19
(No Transcript)
20
START
FINISH
21
Output files from DAMMIN and GASBOR are entered
into a series of programs (DAMAVER), which align
the models and produce an average of them.
In order to obtain a reliable representation of
the protein shape, DAMMIN and GASBOR need to be
run a number of times and the results averaged. The
greater the number of repetitions, the more
accurate the model produced.
22
Transglutaminases are a family of enzymes that are capable of introducing
isopeptide bonds in or between polypeptide chains.
The average shape of 20 independent simulations produced from DAMMIN.
23
Using Condor, DAMMIN and GASBOR can be run
multiple times for the same protein, and also
multiple times for a number of different proteins
simultaneously. Before Condor, 20 repeat runs
of approx. 36 mins each would have taken
approximately 12 h in total on one PC. Using Condor,
the 20 repeat runs were performed in approximately 36
mins, representing a significant gain in
performance and accessibility.
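The timing figures on this slide follow from simple arithmetic, worked through below. The sketch assumes that every run takes about the same time, that enough idle machines are available to start all runs at once, and that queueing and file transfer overheads are negligible.

```python
runs = 20
minutes_per_run = 36  # approximate duration of one run, from the slide

serial_minutes = runs * minutes_per_run  # one PC, runs back to back
parallel_minutes = minutes_per_run       # all runs at once on separate machines
speedup = serial_minutes / parallel_minutes

print(f"serial: {serial_minutes / 60:.0f} h, "
      f"parallel: {parallel_minutes} min, speedup: {speedup:.0f}x")
# prints "serial: 12 h, parallel: 36 min, speedup: 20x"
```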
24
A Submit Script Generator (SSG) was developed by
James Osborne to assist in running DAMMIN and GASBOR
using the Condor toolkit. The SSG asks the
user only once for the necessary information to
prepare and submit multiple jobs to Condor,
thereby reducing the time taken to submit and
process multiple proteins.
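The idea behind a submit script generator can be sketched as a function that writes one Condor submit description for many repeats. This is not the SSG itself, whose prompts and output are not shown in the slides: the function name `make_submit_description` and the file names `dammin.exe`, `run_$(Process).out` and `runs.log` are hypothetical, and how the program receives its answers is left out.

```python
def make_submit_description(executable, input_file, repeats):
    """Build the text of a Condor submit description for repeated runs."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"transfer_input_files = {input_file}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        # $(Process) expands to 0, 1, 2, ... so each run gets its own files
        "output = run_$(Process).out",
        "error = run_$(Process).err",
        "log = runs.log",
        f"queue {repeats}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names, chosen only for this illustration.
submit_text = make_submit_description("dammin.exe", "gnom.out", 20)
```

Writing `submit_text` to a file and passing that file to `condor_submit` would queue 20 jobs from a single description, so the user supplies the information only once.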
25
Running DAMMIN on Condor
1) Put your gnom.out files into the input directory.
2) Double click on make.bat. This runs makesubmit.exe, which will ask you
   some questions.
3) Double click on submit.bat. The jobs are submitted. You can check the
   progress of your jobs using "condor_q".
When your jobs are finished:
4) Copy the input and output directories somewhere safe.
5) Double click on clean.bat.
6) Go to 1.
26
(No Transcript)
27
(No Transcript)
28
Central Manager: master, collector, negotiator
Submit Nodes (30 Workstations): master, schedd, shadow
Execute Nodes (1600 Workstations): master, startd, starter
29
> Run > cmd > condor_q
30
Summary
  • User friendly
  • Easy to use
Overall, Condor has proved invaluable to our research since the work is
completed rapidly and efficiently.
31
Related Publications
Lammie D., Osborne J., Aeschlimann D., Wess T.J. (2007) Rapid shape
determination of tissue transglutaminase using high-throughput computing.
Acta Crystallographica Section D: Biological Crystallography, 63, 1022-1024.
Mankelow T.J., Burton N., Stefansdottir F.O., Spring F.A., Parsons S.F.,
Pedersen J.S., Oliveira C.L., Lammie D., Wess T., Mohandas N., Chasis J.A.,
Brady R.L., Anstee D.J. (2007) The Laminin 511/521-binding site on the
Lutheran blood group glycoprotein is located at the flexible junction of Ig
domains 2 and 3. Blood, 110, 3398-406.
Dyksterhuis L.B., Baldock C., Lammie D., Wess T.J., Weiss A.S. (2007)
Domains 17-27 of tropoelastin contain key regions of contact for
coacervation and contain an unusual turn-containing crosslinking domain.
Matrix Biology, 26, 125-135.
Baldock C., Siegler V., Bax D.V., Cain S.A., Mellody K.T., Marson A.,
Haston J.L., Berry R., Wang M.C., Grossmann J.G., Roessle M., Kielty C.M.,
Wess T.J. (2006) Nanostructure of fibrillin-1 reveals compact conformation
of EGF arrays and mechanism for extensibility. Proc Natl Acad Sci U S A,
103, 11922-7.
32
Acknowledgements
Cardiff University
Tim Wess
Daniel Aeschlimann
James Osborne
(E-mail: condor_at_cardiff.ac.uk)