Condor in CryoEM image processing - PowerPoint PPT Presentation

1
Condor in Cryo-EM image processing
Weimin Wu, Wen Jiang
Department of Biological Sciences, Purdue University
04/30/2008
2
Cryo-EM: low-temperature electron microscopy.
Image processing: obtaining the 3D reconstruction from 2D images.
Introduction
Viral infections have been and remain one of the major threats to human health. Viruses are large
assemblies of proteins and nucleic acids that rely on infection of hosts to complete their life
cycle and sustain their propagation. High-resolution 3D structures of virus particles will provide
important insights into these processes and support the development of effective prevention and
treatment strategies. Recently we demonstrated, in collaboration with researchers at Baylor College
of Medicine and MIT, the 3D reconstruction of the infectious bacterial virus Epsilon15 (e15) at
4.5 Å resolution, which allowed tracing of the polypeptide backbone of its major capsid protein
gp7 (Jiang et al., Nature 451(7182):1130-4, 2008).
3
For many of the tailed dsDNA viruses, for example the bacterial viruses T7, T3 and e15, one of the
12 icosahedral 5-fold vertices is occupied by a unique 12-fold portal protein complex. This
unique portal vertex is responsible for packaging the dsDNA genome into the protein shell
during assembly and for ejecting the dsDNA genome out of the virus and into the host cell
during infection. However, high-resolution structures of these virus particles, especially of
the non-icosahedrally organized components such as the portal complex, the tail and the
encapsulated dsDNA genome, are lacking. I am working on this kind of project without enforcing
any symmetry on the virus. We now have a sub-nanometre resolution result that enables us to
visualize the secondary structure of the portal, tail hub and tail spikes.
4
(A) Schematic diagram of the T7/T3 phage particle
assembly and dsDNA genome packaging pathway.
Adapted from (Serwer, 2004). (B) A cryo-EM
micrograph of T3 phage showing the particles
representing each of the major stages during
assembly and genome packaging.
5
Image processing is a critical step in generating the 3D structure of a macromolecule from
the 2D images taken with the cryo-EM technique. This step includes 2D alignment and 3D
reconstruction, and both need intensive computing power. The high performance computing (HPC)
resources supported by RCAC enable us to work on huge datasets, obtain high-resolution results
and therefore learn more details of the biological system.
6
Scientific needs
Two major steps are involved in cryo-EM image processing. One is the 2D alignment step, which
finds the orientation and center of the sample particles by matching the images (2D projections
of the sample particles) against the reference. The other is the 3D reconstruction step, which
generates the 3D map by collecting the orientation and center information of all the particles
and averaging them.
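The 2D alignment step described on this slide can be illustrated with a minimal sketch. It assumes a normalized cross-correlation score, cyclic pixel shifts as the "center" search, and a small dictionary of labelled reference projections standing in for orientations. The function names and the scoring choice are illustrative assumptions, not the software actually used in this work.

```python
import math

def normalized_cross_correlation(a, b):
    """Return the normalized cross-correlation of two equal-sized 2D images."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in flat_a)
                    * sum((y - mean_b) ** 2 for y in flat_b))
    return num / den if den else 0.0

def shift(image, dy, dx):
    """Cyclically shift a 2D image by (dy, dx) pixels."""
    n_rows, n_cols = len(image), len(image[0])
    return [[image[(r - dy) % n_rows][(c - dx) % n_cols] for c in range(n_cols)]
            for r in range(n_rows)]

def align(image, references, max_shift=1):
    """Return (score, reference label, shift) of the best match for one image.

    The label stands in for the particle orientation and the shift for its
    center; every image is compared against every reference projection.
    """
    best = (-2.0, None, (0, 0))
    for label, ref in references.items():
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                score = normalized_cross_correlation(shift(image, dy, dx), ref)
                if score > best[0]:
                    best = (score, label, (dy, dx))
    return best

# Hypothetical toy data: two 4x4 "projections" of the same particle.
references = {
    "view_a": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "view_b": [[0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
}
raw_image = shift(references["view_b"], 1, 0)  # an off-center copy of view_b
score, label, offset = align(raw_image, references)
print(label, offset)
```

Because each image is scored against every reference independently, the comparisons split naturally into many small jobs, which is what makes this step a good fit for Condor.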
1 raw image vs. 1 projection: 1 second
22K CPU hours
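As a rough back-of-the-envelope check, the two figures on this slide can be related to each other. The sketch below assumes that the 22K CPU hours are spent entirely on image-versus-projection comparisons at about 1 second each, and it borrows the figure of 3,000 concurrently running nodes mentioned later in the talk. Both are simplifying assumptions, and the ideal wall-clock time ignores queueing, file transfer and job eviction.

```python
SECONDS_PER_HOUR = 3600

def comparisons_for_budget(cpu_hours, seconds_per_comparison=1.0):
    """Number of image-vs-projection comparisons that fit in a CPU-hour budget."""
    return cpu_hours * SECONDS_PER_HOUR / seconds_per_comparison

def ideal_wall_clock_hours(cpu_hours, concurrent_jobs):
    """Wall-clock time if the work were spread perfectly over concurrent jobs."""
    return cpu_hours / concurrent_jobs

total = comparisons_for_budget(22_000)
print(f"{total:,.0f} comparisons")            # 79,200,000 comparisons
print(f"{ideal_wall_clock_hours(22_000, 3_000):.1f} hours on 3,000 nodes")  # 7.3 hours
```

On a single CPU the same budget corresponds to about two and a half years of continuous computing, which is why opportunistic cycles matter here.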
7
GroEL as an example to show the 3D reconstruction and the many iterations needed for high
resolution. For our e15 project, even though we started with an intermediate-resolution map
(7 Å), more than 10 further iterations were needed to achieve 4.5 Å.
Features as a function of resolution, showing how to evaluate the resolution qualitatively
from the density map.
8
Condor Performance
We feel lucky at Purdue to have so many resources supported by RCAC; otherwise our research
would take forever. Here I list the Condor jobs we submitted and the CPU hours we used.
Each job took about half an hour.
Each job took about one hour because of a different algorithm and other reasons.
9
Running jobs versus time. This was a long run, about 64 hours. There are three obvious major
peaks, which correspond to the overnight periods. During the daytime the number of running
jobs drops sharply because of owner use. The three peaks get progressively smaller, which
means the user priority is getting lower. Now that it is the summer holiday, I can get more
than 3,000 nodes for my Condor jobs.
10
We tried to use all the platforms to run our Condor jobs. How does the performance of the
different platforms compare?
The LINUX 64-bit machines were not as fast as we expected. Why?
11
We checked the remote hosts that the Condor jobs were submitted to in this test; 90 of the
LINUX 64-bit machines were from ccl00.cse.nd.edu.
The Condor jobs could go to nodes off campus, and the performance was only slightly worse.
This made us more confident about seriously considering the TeraGrid; we have tried the
TeraGrid, but so far we have still used the resources on campus. It remains a problem when
the files to be transferred are large, for example more than 700M.
12
High-quality alpha-helix, beta-sheet and side-chain densities, which enabled us to do the
modeling and get the backbone structure.
With icosahedral symmetry
13
  • Our problems/concerns about Condor
  • Operation: the best thing for us would be to submit the
    Condor jobs from our desktop and let Condor
    itself find the resources, but at present we need to
    specify where the jobs go when using the TeraGrid.
  • File transfer: for large files the network becomes
    the bottleneck, which can easily overload the head
    node and crash it, especially when the files go
    off campus. This is caused by the large number of
    reads from the only copy of a large dataset. It
    might be circumvented by adding a P2P client to
    Condor. In the 2D alignment step of our image
    processing, each image is compared to all the
    reference projections, and those projections may
    already have been sent to neighboring computers
    running other Condor jobs, so a new job could fetch
    the files from neighboring nodes instead. The number
    of reads from the original copy would then drop
    sharply, in theory to just a few, and the file
    transfer speed would increase dramatically.
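The effect of the proposed peer-to-peer transfer can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a description of any Condor feature: nodes sit on a ring, each job needs one randomly chosen reference file, a node keeps every file it has received, and a job reads from the original copy only when neither its own node nor any of its `num_neighbors` ring neighbors already holds the file. All names and parameters are hypothetical.

```python
import random

def count_origin_reads(num_nodes, num_files, num_jobs, num_neighbors, seed=0):
    """Count how many jobs must read their file from the single original copy."""
    rng = random.Random(seed)
    cache = [set() for _ in range(num_nodes)]  # files held by each node
    origin_reads = 0
    for _ in range(num_jobs):
        node = rng.randrange(num_nodes)    # node the job lands on
        wanted = rng.randrange(num_files)  # reference file the job needs
        if wanted in cache[node]:
            continue  # already on this node, no transfer needed
        neighbors = [(node + k) % num_nodes for k in range(1, num_neighbors + 1)]
        if not any(wanted in cache[n] for n in neighbors):
            origin_reads += 1  # nobody nearby has it: read the original copy
        cache[node].add(wanted)
    return origin_reads

if __name__ == "__main__":
    jobs = 5000
    print("no sharing at all:", jobs, "origin reads (one per job)")
    for k in (0, 2, 10, 99):
        reads = count_origin_reads(num_nodes=100, num_files=20,
                                   num_jobs=jobs, num_neighbors=k)
        print(f"sharing with {k} neighbors: {reads} origin reads")
```

When every other node counts as a neighbor, each file is read from the original copy at most once, which matches the "just a few times" expectation above. Real pools would also have to handle nodes disappearing when their owners reclaim them.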

14
Acknowledgment
Preston Smith, David Braun, Steve Wilson, Pia Mikeal, Bruce L. Fuller
  • References
  • Jiang et al., Nature, Vol 439, 2 February 2006 (nature04487)
  • Jiang et al., Nature, Vol 451, 28 February 2008 (nature06665)