Title: High-Throughput Proteome Annotation in a Grid Portal Environment
1High-Throughput Proteome Annotation in a Grid
Portal Environment
- Steps towards routine use of the Grid
- Wilfred Li
- San Diego Supercomputer Center
- UCSD
2Growth of Non-Redundant (NR) Database
3Tools for Proteome Annotation
- Protein Annotation Toolbox
- A C library for structural annotation of proteins
- Integrative Genome Annotation Pipeline
- A wrapper for bioinformatics applications
- AppLeS Parameter Sweep Template
- A grid application environment
- Bioinformatics Workflow Management System
- A generic WMS for bioinformatics applications
- Grid Portal Environment
- Interactive Use and Monitoring of Grid Resources
4Protein sequences
structure info
sequence info
Prediction of signal peptides (SignalP, PSORT),
transmembrane regions (TMHMM, PSORT), coiled
coils (COILS), low-complexity regions (SEG)
NR, PFAM
SCOP, PDB
Step 1
Building FOLDLIB: PDB chains, SCOP domains, PDP
domains, CE matches (PDB vs. SCOP); 90% sequence
non-identical; minimum size 25 aa; coverage (90%,
gaps <30, ends <30)
Structural assignment of domains by WU-BLAST
Step 2
Structural assignment of domains by PSI-BLAST
profiles on FOLDLIB
Step 3
Structural assignment of domains by 123D on
FOLDLIB
Step 4
Functional assignment by PFAM, NR assignments
FOLDLIB
Step 5
Domain location prediction by sequence
Step 6
Data Warehouse
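The step sequence on this slide can be sketched as an ordered pipeline. This is a minimal illustration, not the iGAP implementation: the step keys, the `stub` helper, and the `annotate` function are hypothetical names, and each stub only stands in for the external tool named on the slide. The order is the one in which the steps appear above.

```python
from typing import Callable, Dict, List

# A step takes a protein sequence and returns a list of annotation strings.
Step = Callable[[str], List[str]]


def stub(label: str) -> Step:
    """Hypothetical stand-in for an external tool; returns a placeholder record."""
    def run(sequence: str) -> List[str]:
        return [f"{label}: {len(sequence)} aa scanned"]
    return run


# Step names are assumptions; the tools in the labels come from the slide.
PIPELINE: Dict[str, Step] = {
    "sequence_features": stub("SignalP/TMHMM/COILS/SEG"),
    "wu_blast": stub("WU-BLAST"),
    "psi_blast": stub("PSI-BLAST profiles on FOLDLIB"),
    "threading_123d": stub("123D on FOLDLIB"),
    "functional_assignment": stub("PFAM/NR"),
    "domain_location": stub("domain location by sequence"),
}


def annotate(sequence: str, pipeline: Dict[str, Step] = PIPELINE) -> Dict[str, List[str]]:
    """Run every step in insertion order and collect results keyed by step name."""
    return {name: step(sequence) for name, step in pipeline.items()}


if __name__ == "__main__":
    # The collected dictionary plays the role of one record sent to the data warehouse.
    for name, records in annotate("MKTAYIAKQR").items():
        print(name, records)
```

Keeping the steps in a single ordered mapping makes it straightforward to replace one stub with a real tool wrapper without touching the others.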
5Grid Middleware
MDS/NWS/Ganglia
SCP/GASS/SRB/FTP
SSH/GRAM/GASS PBS/Loadleveler/Condor
6BWMS
7Encyclopedia of Life: A Global Collaborative
Project
BeSC
TiTech
CNIC
JLU
SDSC/US
BII
YMU/NTU
UFCG
MU
8GridSpeed Architecture
10EOL Workflow Web Interface
12EOL Book Interface
13EOL Workflow
BWMS
Users
Routine Use of the Grid, one genome at a time,
one resource at a time
Status
Output
Tasks
Japst (Grail Lab)
Local Cluster
GridSpeed (Titech)
Prediction
Loading
Genome Database
GridMonitor (BII)
External Data Source
Web Services
Status update
Job Status Database
iGAP tasks
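The status-update path in this workflow (tasks report their state to a job status database) might look roughly like the sketch below. It is only an illustration using SQLite from the Python standard library. The table layout, column names, status values, and function names are all assumptions and are not taken from BWMS.

```python
import sqlite3


def open_status_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) job status table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS job_status (
               genome   TEXT NOT NULL,
               task     TEXT NOT NULL,
               resource TEXT,
               status   TEXT NOT NULL,
               PRIMARY KEY (genome, task)
           )"""
    )
    return conn


def update_status(conn, genome: str, task: str, resource: str, status: str) -> None:
    """Record the latest status of one task; a later update overwrites the earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO job_status (genome, task, resource, status) "
        "VALUES (?, ?, ?, ?)",
        (genome, task, resource, status),
    )
    conn.commit()


def pending_tasks(conn, genome: str) -> list:
    """Return the tasks of a genome that are not yet marked 'done'."""
    rows = conn.execute(
        "SELECT task FROM job_status WHERE genome = ? AND status != 'done' "
        "ORDER BY task",
        (genome,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_status_db()
    update_status(conn, "genome_A", "wu_blast", "local_cluster", "running")
    update_status(conn, "genome_A", "psi_blast", "gridspeed", "queued")
    update_status(conn, "genome_A", "wu_blast", "local_cluster", "done")
    print(pending_tasks(conn, "genome_A"))
```

Keying the table on (genome, task) matches the "one genome at a time, one resource at a time" usage: each task has exactly one current status row, whichever resource ran it.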
14PRAGMA Partners
- BII (Singapore)
- Grid middleware deployment
- Workflow web interface
- http://blast.bii-sg.org:8090/eol/
- Resource sharing
- Viper cluster-BII
- Scientific Exchange
- Titech (Japan)
- Condor pool
- GridSpeed
- A grid portal environment for speedy deployment of applications
- http://www.gridspeed.org
- New partners
- Australia, Brazil, Ireland, China.
15SDSC/UCSD Partners
- Integrative Biosciences Department
- ROCKS
- Rocks 3.1 Grid Roll
- The EOL cluster
- http://saxicolous.sdsc.edu/ganglia/
- DAKS
- DataStar
- Advanced Database Laboratory
- SRB/Data Matrix
- UCSD
- Life Sciences Initiatives
- Campus collaborations
16Acknowledgement
- SDSC
- Fran Berman
- Director
- Philip E. Bourne
- Mark Miller
- Project Coordinator
- Ilya N. Shindyalov
- CE
- Greg Quinn
- Web service
- Coleman Mosley
- Vicente Reyes
- Robert Byrnes
- Kim Baldridge
- iCC Director
- Jerry Greenberg
- CE portal
- SDSC
- Philip Papadopoulos
- Rocks
- Mason Katz
- Greg Bruno
- Chaitan Baru
- David Archbell
- Adam Birnbaum
- UCSD
- Peter Arzberger
- PRAGMA
- Henri Casanova
- Jim Hayes
- Ceres Inc.
- Nickolai Alexandrov
- 123D
- Richard Flavell
17Acknowledgement
- BII, Singapore
- Larry Ang
- Kishore Sakharkar
- Arun Krishnan
- Atif Shahab
- Other BII members
- Titech, Japan
- Satoshi Matsuoka
- Toyotaro Suzumura
- Kouji Tanaka
- Monash University, Australia
- David Abramson
- Colin Enticott
- Univ. Federal de Campina Grande, Brazil
- Zane Cirne
- Eliane Cristina de Araujo
- Queen's University, UK
- Terence J Harmer
- David R Simpson