Title: Develop a Cost-effective Data Pre-processing System for the Secondary Storage
1 DART Storage Interoperability WP SI-7
- Develop a Cost-effective Data Pre-processing System for the Secondary Storage
- By:
- Lead Investigator: A.B.M. Russel
- Chief Investigator: Dr. Asad I. Khan
- Monash University
- 12th April, 2006
- Melbourne
2 DART Data Pre-processing System Goals
- Identify a cost-effective solution
  - Open standards-based
  - Open-source software wherever possible
- Develop a cost-effective infrastructure for:
  - Data processing
  - Data security
  - Data transfer
  - Data archiving
- Integration with other DART WPs
3 DART Data Pre-processing System Services
- Data processing service: Globus GRAM, SGE
- Data security service: CA, GSI, MyProxy
- Data transfer service: GridFTP, RFT
- Data archiving service: Metadata, OGSA-DAI
- Data digitization service: NetCDF, CIF, tar, zip
- Data replication service: RLS, SRB, LDR
- This work package would initially provide an infrastructure for pre-processing Protein Crystallography and Climate Modelling static data.
4 DART Pre-processing System Data Flow
1. User requests for data acquisition
2a. Acquiring CD/DVD static data
2b. Acquiring Instruments/Sensors dynamic data
2c. Acquiring SAN static/dynamic data
3. Storing raw data into Primary storage
4. Pre-processing raw data
5. Storing pre-processed data into Secondary storage
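
The five steps of the data flow can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Storage` class, the `acquire` and `preprocess` functions, and the `req-1` and `cd-dvd` identifiers are hypothetical stand-ins, not part of the DART system, and the storage tiers are simulated in memory.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Storage:
    """In-memory stand-in for a storage tier (hypothetical)."""
    name: str
    items: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.items[key] = data

    def get(self, key: str) -> bytes:
        return self.items[key]


def acquire(source: str) -> bytes:
    """Step 2: pretend to read raw bytes from a CD/DVD, instrument, or SAN."""
    # Placeholder payload; a real system would read from the device.
    return f"raw:{source}".encode()


def preprocess(raw: bytes) -> bytes:
    """Step 4: placeholder transformation standing in for the real job."""
    return raw.upper()


def run_pipeline(request_id: str, source: str,
                 primary: Storage, secondary: Storage) -> str:
    """Step 1 is the call itself: a user request identified by request_id."""
    raw = acquire(source)                            # step 2: acquire data
    primary.put(request_id, raw)                     # step 3: raw -> primary
    processed = preprocess(primary.get(request_id))  # step 4: pre-process
    secondary.put(request_id, processed)             # step 5: result -> secondary
    return request_id


primary = Storage("primary")
secondary = Storage("secondary")
run_pipeline("req-1", "cd-dvd", primary, secondary)
print(secondary.get("req-1"))  # b'RAW:CD-DVD'
```

Keeping the raw copy in primary storage and writing only the processed result to secondary storage mirrors the separation between steps 3 and 5.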
5 User Web Client interface for data pre-processing system
User logs in to the web interface
Grid status check
6 Grid status display
7 User Web Client interface for data pre-processing system
Retrieving proxy certificates from the online credential repository (MyProxy)
8 User Web Client interface for data pre-processing system
Proxy certificates retrieved for the session
9 User Web Client interface for data pre-processing system
User sets job submission parameters
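
As an illustration of what job submission parameters might look like, the sketch below collects a few of them and renders a job description in the style of the Globus RSL used by Pre-WS GRAM. The `JobParameters` class, the executable path, and the file names are hypothetical, and only a small subset of attributes is shown; the parameters actually exposed by the portal are not stated in the slides.

```python
from dataclasses import dataclass, field
from typing import List


def quote(value: str) -> str:
    """Wrap a value in double quotes; embedded quotes are rejected for simplicity."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return f'"{value}"'


@dataclass
class JobParameters:
    """Hypothetical container for the values a user sets before submission."""
    executable: str
    arguments: List[str] = field(default_factory=list)
    count: int = 1
    stdout: str = "job.out"

    def to_rsl(self) -> str:
        """Render the parameters as a conjunction of (attribute=value) relations."""
        parts = [f"(executable={quote(self.executable)})"]
        if self.arguments:
            joined = " ".join(quote(a) for a in self.arguments)
            parts.append(f"(arguments={joined})")
        parts.append(f"(count={self.count})")
        parts.append(f"(stdout={quote(self.stdout)})")
        return "&" + "".join(parts)


params = JobParameters(
    executable="/usr/local/bin/phaser",  # hypothetical install path
    arguments=["input.mtz"],             # hypothetical input file
    count=4,
    stdout="phaser.out",
)
rsl = params.to_rsl()
print(rsl)
# &(executable="/usr/local/bin/phaser")(arguments="input.mtz")(count=4)(stdout="phaser.out")
```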
10 User Web Client interface for data pre-processing system
Job submitted to the Grid for pre-processing
11 User Web Client interface for data pre-processing system
Job completed, output saved on Secondary storage
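
The web client screens above follow a fixed order: login, grid status check, proxy retrieval, parameter setting, submission, completion. A minimal sketch of that sequence as a state machine is given below. The `Stage` and `PortalSession` names are hypothetical and nothing here talks to a real grid; the point is only that a job cannot be submitted before the earlier stages have been passed.

```python
from enum import Enum, auto


class Stage(Enum):
    LOGGED_OUT = auto()
    LOGGED_IN = auto()
    GRID_CHECKED = auto()
    PROXY_RETRIEVED = auto()
    PARAMETERS_SET = auto()
    SUBMITTED = auto()
    COMPLETED = auto()


# Enum members iterate in definition order, which is the workflow order.
ORDER = list(Stage)


class PortalSession:
    """Tracks which stage of the workflow a user session has reached."""

    def __init__(self) -> None:
        self.stage = Stage.LOGGED_OUT
        self.history = [self.stage]

    def advance(self, target: Stage) -> None:
        """Move to the next stage; reject skipped or repeated stages."""
        position = ORDER.index(self.stage)
        if position + 1 >= len(ORDER) or ORDER[position + 1] is not target:
            raise ValueError(
                f"cannot move from {self.stage.name} to {target.name}"
            )
        self.stage = target
        self.history.append(target)


session = PortalSession()
for stage in ORDER[1:]:
    session.advance(stage)
print(session.stage.name)  # COMPLETED
```

Trying to call `advance(Stage.SUBMITTED)` on a fresh session raises `ValueError`, which models the portal requiring a retrieved proxy certificate before job submission.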
12 Sample Data Pre-processing Job Output
3D atomic structure of protein after processing
Protein crystallography raw data
13 DART Data Pre-processing System Technology
- Middleware
  - Globus Toolkit (GSI, GridFTP, Web Services / Pre-WS GRAM)
  - Sun N1 Grid Engine
  - MPI
  - SRB (DART WP SI-9)
  - CIMA (DART WP SI-4)
  - OGSA-DAI, MyProxy and Shibboleth
- Compilers
  - Java, Ant, GCC, gfortran
  - Intel compilers (C, Fortran)
- Software
  - PHASER, RAPPAER, BLAST, CCP, CNS
  - NetCDF, CIF
  - FastTCP (DART WP SI-8)
  - Ganglia
14 Thank you for your attention!
Links:
- http://grid.its.monash.edu.au:8080/gridsphere
- http://grid.its.monash.edu.au/ganglia