High-Throughput Crystallography at Monash

1
High-Throughput Crystallography at Monash
  • Noel Faux
  • Dept of Biochemistry
  • and Molecular Biology
  • Monash University

2
Structural Biology Pipeline
Cloning → Expression → Purification → Crystallisation → X-ray diffraction → Structure determination
  • Australian synchrotron online in 2007
  • Data processing and structural determination:
    the major bottleneck
  • High-throughput robots and technologies: Tecan
    Freedom Evolution, ÄKTAxpress; trialling crystal
    storage and imaging facilities
  • Target tracking / LIMS
  • Data management
  • Phasing (CCP4/CNS, GRID computing)
3
The problems
  • Target tracking / data management
  • The process of protein structure determination
    creates a large volume of data.
  • Storage, security, traceability, management and
    backup of files are ad hoc.
  • Remote access to the files is limited and
    requires different media formats.
  • Structure determination
  • CPU intensive

4
  • Part of a National Project for the development
    of eResearch platforms for the management and
    analysis of data for research groups in
    Australia.
  • Aim: establish common, standardised software /
    middleware applications that are adaptable to
    many research capabilities.

5
Solution
  • Central repository of files
  • Attach metadata to the files
  • Worldwide secure access to the files
  • Automated collection and annotation of the files
    from in-house and synchrotron detectors
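As a minimal sketch of what "attach metadata to the files" could look like, the following writes a JSON "sidecar" file alongside each collected file. The sidecar convention, field names, and the `annotate` helper are hypothetical illustrations, not DART's actual scheme:

```python
import datetime
import hashlib
import json
import pathlib


def annotate(path, **extra):
    """Write a JSON metadata 'sidecar' next to a collected file.

    Records basic provenance (file name, checksum, collection time)
    plus any extra fields, e.g. a crystal or beamline identifier.
    All names here are illustrative, not part of any real system.
    """
    p = pathlib.Path(path)
    meta = {
        "file": p.name,
        "sha1": hashlib.sha1(p.read_bytes()).hexdigest(),
        "collected": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        **extra,
    }
    p.with_name(p.name + ".meta.json").write_text(json.dumps(meta, indent=2))
    return meta
```

A central repository such as the Storage Resource Broker could then index such annotations so files remain searchable and traceable after collection.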

6
The infrastructure
(diagram) Data sources collected via Kepler into the Storage Resource Broker:
  • X-ray images (collection PC)
  • Mounted-crystal streaming video (SV), lab SV,
    lab still pictures
  • Sensor data (lab PC): lab temperature, crystal
    temperature, instrument reports
  • Monash University ITS Sun Grid: 54 dual 2.3 GHz
    CPUs, 208.7 GB memory (3.8 GB per node), >10 TB
    storage capacity, running GridSphere
7
Central web portal
8
Central web portal
9
Automated X-ray data reduction
  • Automated processing of the diffraction data
  • Investigating the incorporation of Xia2
    Automated Data Reduction
  • New automated data reduction system designed to
    work from raw diffraction data and a little
    metadata, and produce usefully reduced data in a
    form suitable for immediately starting phasing
    and structure determination (CCP4)

1. (Graeme Winter) The CCP4 suite: programs for
protein crystallography. (1994). Acta
Crystallogr. D50, 760-763.
10
Divide and Conquer
  • A large number of CPUs available across different
    computer clusters at different locations
  • Monash ITS Sun Grid
  • VPAC
    • Brecca: 97 dual Xeon 2.8 GHz CPUs, 160 GB
      (2 GB per node) total memory
    • Edda: 185 POWER5 CPUs, 552 GB (8-16 GB per
      node) total memory
  • APAC
    • 1680 processors, 3.56 terabytes of memory,
      100 terabytes of disk
  • Personal computers

11
DART and CCP4
  • Aims: use the CCP4 interface locally but run the
    jobs remotely across a distributed system
  • Use Nimrod to distribute the CCP4 jobs across the
    different Grid systems
  • Investigating the possibility of incorporating
    the CCP4 interface into the DART web portal

12
Exhaustive Molecular Replacement
  • No phasing data
  • No sequence identity (<20%)
  • No search model
  • Is there a possible fold homologue?
  • Exhaustive Phaser scan of the PDB
  • Exhaustive searches with different parameters and
    search models

2. A. J. McCoy, R. W. Grosse-Kunstleve, L. C. Storoni
and R. J. Read. Likelihood-enhanced fast translation
functions. Acta Cryst. (2005). D61, 458-464.
13
Exhaustive Molecular Replacement
  • Proteins' building blocks are domains
  • Use a subset of SCOP as search models in a PHASER
    calculation.
  • The use of Grid computing will make this possible:
    ~1000 CPU-days for a typical run
  • Search at the family level
  • Take the highest-resolution structure
  • Mutate to poly-alanine, and delete loops and
    turns
  • Phaser
  • Families with z-score ≥ 6: search with each of
    their domain members
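The search-model selection step above (one highest-resolution structure per SCOP family) amounts to a group-by/minimum pass over the candidate list. A sketch, with invented family, PDB, and resolution values; the poly-alanine trimming itself would be done with a model-editing tool, not shown here:

```python
def best_per_family(entries):
    """For each SCOP family, keep the structure with the best
    (numerically lowest) resolution."""
    best = {}
    for family, pdb_id, resolution in entries:
        if family not in best or resolution < best[family][1]:
            best[family] = (pdb_id, resolution)
    return best


# Illustrative (invented) family / PDB / resolution tuples.
entries = [
    ("a.1.1", "1abc", 2.1),
    ("a.1.1", "2def", 1.6),  # better resolution: chosen for a.1.1
    ("b.2.3", "3ghi", 2.8),
]
```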

14
Exhaustive Molecular Replacement
  • Each node runs a Perl script that
  • Requests a job
  • Launches Phaser
  • Returns the results
  • Repeats until the list is exhausted
  • Database containing
  • The to-do list
  • Parameters
  • Results
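The per-node loop is a classic work-queue pattern. A sketch in Python rather than Perl, with the shared job database modelled as an in-memory queue and `run_phaser` as a stand-in for launching the real Phaser binary:

```python
import queue


def run_phaser(job):
    # Stand-in: a real worker would launch the phaser executable
    # (e.g. via a subprocess) with the job's search model and
    # parameters, then parse its scores from the output.
    return {"model": job["model"], "status": "done"}


def worker(todo, results):
    """Request a job, run Phaser, return the result;
    repeat until the to-do list is exhausted."""
    while True:
        try:
            job = todo.get_nowait()      # request a job
        except queue.Empty:
            break                        # list exhausted
        results.append(run_phaser(job))  # launch Phaser, return result


# Populate the to-do list with (invented) search-model names.
todo = queue.Queue()
for model in ["d1abc_", "d2xyz_"]:
    todo.put({"model": model})

results = []
worker(todo, results)
```

In the real system the queue and results live in a central database, so any number of nodes across the grids can pull from the same to-do list concurrently.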

ITS Sun Grid
56 dual AMD Opteron CPUs, 208.7 GB memory (3.8 GB
per node), >10 TB storage capacity
Will be extended to use Nimrod to gain access to
APAC and the Pacific Rim Grid (PRAGMA)
15
Final Pipeline
Cloning → Expression → Purification → Crystallisation → X-ray diffraction → Structure determination
  • High-throughput robotics and technologies
  • Xia2
  • Data collection, management, storage, and remote
    access: DART
  • Data processing, exhaustive experimental (e.g.,
    SAD, SIRAS, MIRAS) and MR phasing for final
    refinement: Grid computing, Nimrod, Phaser,
    AutoSHARP, CCP4, DART
16
Acknowledgments
  • Monash University
  • Anthony Beitz
  • Nicholas McPhee
  • James Whisstock
  • Ashley Buckle
  • James Cook University
  • Frank Eilert
  • Tristan King
  • DART Team