Title: Advances in the biomedical applications of the EELA Project
1 Advances in the biomedical applications of the EELA Project
- Rafael Mayo
- CIEMAT
- HealthGrid Conference
- Geneva, 24-27.04.2007
2 Some of the slides have been taken from previous presentations made by Vicente Hernández (UPV) (EELA Biomed Task Leader)
3 Contents
- The EELA Project
- Biomedical Applications
- BiG (Blast in Grids)
- MrBayes
- GATE
- WISDOM (Wide In Silico Docking Of Malaria)
- Summary of Achievements
- Future Plans
4 AIMS
- BUILD A BRIDGE BETWEEN CONSOLIDATED E-INFRASTRUCTURE INITIATIVES IN EUROPE AND EMERGING ONES IN LATIN AMERICA
- REINFORCE COLLABORATION BETWEEN LATIN AMERICA AND EUROPE
5 PARTNERS
- Italy: INFN
- Portugal: LIP
- Spain: CIEMAT, CSIC, Red.ES, UC, UPV
- CERN
- Argentina: UNLP
- Brazil: CEDERJ, RNP, UFF, UFRJ
- Chile: REUNA, UDEC, UTFSM
- Cuba: CUBAENERGIA
- Mexico: UNAM
- Peru: SENAMHI
- Venezuela: ULA
- CLARA
6 MANAGEMENT STRUCTURE
- 4 WORK PACKAGES DIVIDED IN TASKS
- WP1 PROJECT ADMINISTRATIVE AND TECHNICAL MANAGEMENT
- WP2 PILOT TESTBED AND SUPPORT
- Task 2.1 Coordination of e-Infrastructure
- Task 2.2 Certification Authorities and Virtual Organizations
- Task 2.3 Pilot Testbed Operations
- Task 2.4 Network Support and Operation
- WP3 IDENTIFICATION AND SUPPORT OF GRID ENHANCED APPLICATIONS
- Task 3.1 Biomedical Applications
- Task 3.2 High Energy Physics Applications
- Task 3.3 Additional Applications (Education and Climate in the Grid Environment)
- WP4 DISSEMINATION ACTIVITIES
- Task 4.1 Dissemination
- Task 4.2 Knowledge Dissemination
7 THE EELA WEB SITE
http://www.eu-eela.org
8 THE EELA e-INFRASTRUCTURE
9 WORK PACKAGE 2 STATUS
10 EELA CAs
11 THE EELA PILOT TEST-BED
RC: Resource Center. T-I: t-Infrastructure Center.
12 WORK PACKAGE 4 STATUS
- 15 TUTORIALS: More than 1200 participant-days delivered so far in 2006!
- For grid users (total of participants: 300)
- Madrid - Spain (20-22 February 2006)
- Merida - Venezuela (27-29 April 2006)
- Itacuruçá - Brazil (26-28 June 2006)
- Mexico City - Mexico (28-30 August 2006)
- Santiago de Chile (6-7 September 2006)
- Madrid - Spain (16-18 October 2006)
- Merida - Spain (7-9 November 2006)
- La Plata - Argentina (11-12 December 2006)
- Bogota - Colombia (6-7 March 2007)
- For grid system administrators (total of participants: 160)
- Madrid - Spain (20-24 February 2006)
- Itacuruçá - Brazil (26-30 June 2006)
- Mexico City - Mexico (28 August - 01 September 2006)
- Madrid - Spain (19-20 October 2006)
13 WORK PACKAGE 4 STATUS
- 1st EELA GRID SCHOOL. Itacuruçá - Brazil (4-15 December 2006)
- Training for gridification of new applications to be run on EELA infrastructure (23 participants)
- 7/8 applications gridified
- Climate applications, proposed by UC-SP, SENAMHI-PE and UDEC-CL (EELA applications)
- Dist Simulation of Multiple Events, proposed by UNESP-BR, UNICAMP-BR and USP-BR (non-EELA application)
- EMBOSS, proposed by UNAM-MX (non-EELA application)
- LEMDist, proposed by UNAM-MX (EELA application)
- LMS in Grid Env, EELA application proposed by CITMATEL-CU (non-EELA partner) and CUBAENERGIA-CU (EELA partner)
- SATyrus, proposed by UFRJ-BR (EELA application)
- SegHidro, proposed by UFCG-BR (non-EELA application)
- VoD, proposed by UFRJ-BR and CEDERJ-BR (EELA application)
14 WORK PACKAGE 4 STATUS
- DISSEMINATION: 2 bulletins, 2 brochures, 4 posters, 17 press releases, 10 Inf. Sheets, 72 news
15 WORK PACKAGE 4 STATUS
- 5 WORKSHOPS
- 1 CONFERENCE
- Future events
- 2 WORKSHOPS (Cuba, Mexico)
- 1 CONFERENCE (with BELIEF, Brazil)
- Potential new EELA collaborators or partners: 51
16 WORK PACKAGE 4 STATUS
- Survey of Communities: results
- 46 LA communities
- 5 European communities from France, Luxembourg, Spain, UK
- [Chart legend: Brazil, Chile, Peru, Mexico, EU, Colombia, Ecuador, Nicaragua, Venezuela, El Salvador]
17 Biomedical Applications
WORK PACKAGE 3
18 Status of the Applications
- EELA Applications
- New Applications
- BiG (BLAST in Grid) is a Grid-Enabled BLAST Interface.
- BLAST (Basic Local Alignment Search Tool) is a Bioinformatics Procedure Applied to Identify Compatible Protein and Nucleotide Sequences in Protein and DNA Databases.
- MrBayes is a Tool for Phylogeny Studies.
- A Phylogeny is a Reconstruction of the Evolutionary History of a Group of Organisms.
- EGEE-Ported Applications
- GATE is an Environment for the Monte Carlo Simulation of Particle Physics Emission in the Medical Field.
- It is Focused Towards Thyroid Cancer and Treatment of Metastasis with P32.
- WISDOM (Wide In-Silico Docking Of Malaria) is a Deployment of a High-Throughput Virtual Screening Platform in the Perspective of In-Silico Drug Discovery for Neglected Diseases.
19 GATE
- Geant4 Application for Emission Tomography
20 GATE
- Gate is Installed on
- tochtli.nucleares.unam.mx: GATE-1.0.0-3
- ramses.dsic.upv.es: GATE-1.0.0-3
- grid012.ct.infn.it: GATE-1.0.0-3 and VO-biomed-GATE-2.2.0-3
- ce-eela.ciemat.es: VO-biomed-GATE-2.1.0-1
- ce01.eela.if.ufrj.br
- ce-eela.ciemat.es
- A demonstration was performed in the frame of the EU-LAC Summit, Lisbon, 28-29 April 2006
21 GATE
- User Community
- It is focused towards two main oncological problems:
- Thyroid cancer.
- Treatment of metastasis with P32.
- 9 centers in Cuba are interested
- Due to the Lack of Bandwidth from/to Cuba, a Local Configuration is Being Set Up.
- A dark fibre is settled to Venezuela
22 WISDOM
- Wide in Silico Docking on Malaria
23 First EELA WISDOM Data Challenge
- Objective and Means
- Docking of Two Targets of a Segment of the Plasmodium vivax Genome With Respect to the Non-Redundant Ligand Database.
- Execute a First Test on the Infrastructure.
- Using the WISDOM Scripts Developed in EGEE.
- UPV coordinated
- Outcome
- Last Job of the First Target Finished on January the 1st, 2007.
- First Data Challenge Experiment Reached 2421 of 2422 Jobs.
- 53 Gbytes of Results Obtained.
24 First EELA WISDOM Data Challenge
- Problems Faced and Lessons Learnt
- Difference of Scale Between EGEE and EELA. The Scripts were Not as Efficient for a Reduced Infrastructure in Which the Sites are Less Powerful.
- Automatic Resubmission of Scheduled Jobs Reduced Throughput as the Number of Remaining Jobs Decreased.
- Automatic Resubmission Delay was Too Short Considering the Number of Available Resources. Changes to the Scripts Were Made on-the-fly.
- Two Unexpected Power Cuts, one Internet Cut.
- A New Status was Created to Restart the System in a Controlled Way.
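The resubmission-delay lesson above can be illustrated with a minimal sketch. It is not the WISDOM script logic: the function names, the 10-minute base delay and the 6-hour cap are all hypothetical values chosen for illustration. The idea is only that the waiting time before a scheduled job is resubmitted should grow with the ratio of pending jobs to available slots, so that a small infrastructure is not flooded with resubmissions.

```python
def resubmission_delay(pending_jobs, available_slots,
                       base_delay_s=600, max_delay_s=6 * 3600):
    """Seconds a scheduled job should wait before being resubmitted.

    Hypothetical policy: one base delay per "wave" of jobs needed to drain
    the queue with the slots currently available, capped at max_delay_s.
    """
    if available_slots <= 0:
        # Nothing can run right now, so wait as long as the cap allows.
        return max_delay_s
    waves = -(-pending_jobs // available_slots)  # ceiling division
    return min(max_delay_s, base_delay_s * max(1, waves))


def should_resubmit(seconds_in_queue, pending_jobs, available_slots):
    """True when a job has waited longer than the computed delay."""
    return seconds_in_queue >= resubmission_delay(pending_jobs, available_slots)


if __name__ == "__main__":
    # 100 pending jobs on 10 slots: ten waves, so wait ten base delays.
    print(resubmission_delay(100, 10))
    print(should_resubmit(599, 5, 10))
```

With such a policy the delay shrinks again as the queue drains, which is one possible way to avoid both premature resubmission on a busy system and long idle periods near the end of a run.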
25 First EELA WISDOM Data Challenge
- Problems Faced and Lessons Learnt
- Lack of Response of BDII
- The BDII at CIEMAT Experienced Some Malfunctions that Reduced the Performance of the RB and the Catalogue.
- Expiration of Certificates in Hosts and Users, Change of the Certification Authority.
- The Launching User's Certificate Expired, and the Change of the CA Subject Produced Several Problems. The Scripts Needed to be Re-Launched.
- Certificates Expired on Several Hosts, Requiring the Process to be Paused for Some Hours.
- The CRLs of Spanish Sites were Outdated for Two Days.
- Corruption of Files in SEs
- During the Copying Process, Several Fragments of the ZINC Database were Unsuccessfully Copied.
- One File is Corrupted in the Original Repository!!
- No one has Reached 100% Completion so Far.
26 First EELA WISDOM Data Challenge
- Lessons Learnt for New DCs
- Scripts Must be Adapted to Deal with the Conditions of EELA Infrastructure.
- Double Check the Input Files, even if it Takes Several Days.
- Check the Evolution of the Process Several Times Per Day.
- Check Wall-Time in All Resources.
- Unavoidable Problems
- Unexpected Power Cuts or Internet Cuts.
- Other Site-Specific Administration Problems.
- All These Problems (Normal in a First Large-Scale Execution) were Solved with the Strong Support of the Infrastructure Managers of EELA and the Team of Vincent Breton.
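The advice to double check the input files could be automated along the following lines. This is a sketch under stated assumptions, not the procedure used in the data challenge: it assumes a table of reference SHA-256 checksums is available for the database fragments, and the names `sha256_of` and `find_corrupted` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(copy_dir, reference_checksums):
    """List files that are missing from copy_dir or differ from the reference.

    reference_checksums maps a file name to its expected SHA-256 digest.
    """
    bad = []
    for name, expected in reference_checksums.items():
        path = Path(copy_dir) / name
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return sorted(bad)
```

A check of this kind only detects copies that differ from the reference. A file that is already corrupted in the original repository would pass unless the reference checksums come from an independent, trusted source.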
27 First EELA WISDOM Data Challenge
- Statistics
- Starting Date: October the 23rd, 2006.
- Ending Date: January the 1st, 2007.
- Number of Original Jobs of the First Target: 2422.
- Number of Jobs Successfully Completed: 2421.
- Total Number of Submitted Jobs: 109551.
- Most of the Submitted Jobs Did Not Reach the Execution Queues.
- Average Computing Time per Job: 2065 Minutes.
- Total Effective Running Time: 228 CPU/Days.
- About Half of the Time the System was Not Producing Results. Throughput Decayed Significantly as the DC Progressed.
- Results Obtained: 53 Gbytes.
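A few ratios follow directly from the figures on this slide. The short script below is only a worked arithmetic example using the reported numbers; the variable names are illustrative.

```python
from datetime import date

# Figures reported on the statistics slide.
original_jobs = 2422
completed_jobs = 2421
submitted_jobs = 109551
start, end = date(2006, 10, 23), date(2007, 1, 1)

completion_ratio = completed_jobs / original_jobs
submissions_per_job = submitted_jobs / original_jobs
elapsed_days = (end - start).days

print(f"Completion: {completion_ratio:.2%}")                           # 99.96%
print(f"Submissions per original job: {submissions_per_job:.1f}")      # 45.2
print(f"Elapsed days: {elapsed_days}")                                 # 70
```

On average each original job was therefore submitted about 45 times over the 70 days, which is consistent with the remark that most submitted jobs never reached the execution queues.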
28 First EELA WISDOM Data Challenge
[Figure: progress of the data challenge over time. Vertical axis from 0 to 2500; horizontal axis from 23/10/06 to 01/01/07. Annotated events: Changes on the resubmission, CRL Problem, Script does not Restart Properly, CA Change, Power Cut.]
29 First EELA WISDOM Data Challenge
- Sites Used
- CIEMAT, Since Beginning.
- GRYCAP-UPV, Since Beginning.
- INFN, Since Beginning.
- UNAM, Since Beginning.
- UFRJ, Since Beginning.
- UNICAN, LIP and CERN During the DC Execution.
- More than 40% of the Jobs Were Effectively Executed in LA Sites.
31 Blast in Grids: Approach
- Use of MPI-Blast kernel
- Enhanced security through a MyProxy server
- Fault tolerant on the client and server side
- Embeddable in a stand-alone application or web portal
- Splitting of input sequences and reference databases into multiple jobs. Deals with multiple databases simultaneously
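The splitting of input sequences and reference databases into multiple jobs can be sketched as follows. This is an illustration only, not the BiG implementation: the helper names (`read_fasta`, `split_into_jobs`, `plan_jobs`) and the fragment names in the example are hypothetical.

```python
def read_fasta(text):
    """Parse multi-FASTA text into a list of (header, sequence) pairs."""
    records, header, lines = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(lines)))
            header, lines = line[1:], []
        elif header is not None:
            lines.append(line)
    if header is not None:
        records.append((header, "".join(lines)))
    return records


def split_into_jobs(records, sequences_per_job):
    """Group the records into chunks of at most sequences_per_job entries."""
    if sequences_per_job < 1:
        raise ValueError("sequences_per_job must be at least 1")
    return [records[i:i + sequences_per_job]
            for i in range(0, len(records), sequences_per_job)]


def plan_jobs(query_chunks, database_fragments):
    """Pair every query chunk with every database fragment (one job each)."""
    return [(index, fragment)
            for index in range(len(query_chunks))
            for fragment in database_fragments]


if __name__ == "__main__":
    fasta = ">seq1\nACGT\nAC\n>seq2\nGGCC\n>seq3\nTTAA\n"
    chunks = split_into_jobs(read_fasta(fasta), sequences_per_job=2)
    # Hypothetical fragment names; real fragment naming depends on the tool.
    print(plan_jobs(chunks, ["db.frag0", "db.frag1"]))
```

Three query sequences split two per chunk and searched against two database fragments yield four independent jobs, each of which can be submitted and retried on its own.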
32 Blast in Grids: Improvements on the Service
- Updates on the Grid Service
- Security Model
- In the Initial Design, the Private Key of Operation Users Must be Hard-Coded on the Portal, and This Represents a Security Risk.
- The New Approach is to Use a MyProxy Repository to Store Long-Living Credentials and Download Delegated Proxies on Demand.
- This Will Also Solve the Problem of Long-Living Jobs by Using Proxy Renewal.
- Postprocessing of the Output of the Execution
- Improve the Output Format of the Processing.
33 Blast in Grids: Improvements on the Security Model
- Main Problems Faced
- Uploading VOMS Credentials: A MyProxy Credential is Uploaded to a Proxy Server.
- It is Suggested that VOMS Attributes Should be Added After the Retrieval of a Delegated Copy of the Proxy -> It Does not Work.
- VOMS Attributes Could not be Uploaded With Standard MyProxy Commands -> Use a New Version from INFN.
- VOMS Credentials Duration and Proxy Renewal: A Delegated MyProxy Credential Needs to Be Renewed for a Long-Living Job.
- It Does Not Work with VOMS Credentials. VOMS Life-Time is 24 Hours -> Unsolved Problem for Long-Living Executions.
- Other Problems
- Incorrect Configuration of Automatic Renewal on RBs
- Not All Resource Brokers Have these Features Correctly Implemented -> EELA RB is Working Correctly.
- Missing Documentation
- Available Recipes are Incorrect.
- There is no Official Documentation; Important Assistance Came from Catania.
34 Blast in Grids: Improvements on the Postprocessing
- Fixed Number of Returned Hits: MPIBLAST Bug
- MPIBLAST has a Bug: it Does not Return the Selected Number of Hits but All of them Instead.
- This Bug has Been Fixed in the BiG Service.
- New BiG System Monitoring Methods
- Methods to Explore the Installed Databases.
- To Obtain the Input Arguments of a Query.
- A more Accurate Status of the Number of Alive Processes.
- New Postprocess Methods
- Four new Methods Added
- Two for Processing multiFASTA Results
- Return the Obtained Results (Total and Non-Downloaded Results) in multiFASTA Format.
- Two for Hits Summary.
- Similar to Above but Using Hits Definitions.
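One way to enforce a fixed number of returned hits in a postprocessing step is sketched below. It is not the fix applied in the BiG service, whose details are not given here. The sketch assumes BLAST tabular output (tab-separated, query ID in the first column, bit score in the twelfth); the function name `top_hits` is hypothetical.

```python
from collections import defaultdict


def top_hits(tabular_text, max_hits):
    """Keep at most max_hits best-scoring hit lines per query.

    Assumes tab-separated BLAST tabular output: column 1 is the query ID
    and column 12 is the bit score. Comment lines starting with '#' and
    blank lines are skipped.
    """
    hits = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits[fields[0]].append((float(fields[11]), line))

    result = {}
    for query, scored in hits.items():
        # Stable sort: hits with equal scores keep their original order.
        scored.sort(key=lambda item: item[0], reverse=True)
        result[query] = [line for _, line in scored[:max_hits]]
    return result
```

Truncating on the client or service side keeps the requested limit independent of how many hits the underlying kernel actually returns.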
35 Blast in Grids: Usage Report
- Period: Jul06-Dec06.
- Usage Statistics
- Number of Jobs: 284.
- CPU Consumed: 173 CPU/Days.
- Resources Used: ramses.dsic.upv.es:2119/jobmanager-pbs-biomedg.
- BiG is Being Used at the University of Los Andes to Work on the Complete Genome of the Plasmodium falciparum for the Identification of DHFR Antigenic Proteins.
36 MrBayes
- Bayesian Inference of Phylogeny
37 Phylogeny
- A Phylogeny is a Reconstruction of the Evolutionary History of a Group of Organisms.
- Most Approaches Depend Upon a Mathematical Model Describing the Evolution of Characters Observed in the Species Included, and are Usually Used for Molecular Phylogeny Where the Characters are Aligned Nucleotide or Amino Acid Sequences.
- Proposal: Use MrBayes
- MrBayes is a Widely Used Bayesian Inference Application for Phylogeny.
- It is MPI-Enabled.
- Consumes Much Memory and Computing Time.
38 Phylogeny
- A Grid-Service to Run Parallel MrBayes Executions is Being Implemented.
- It Requires MrBayes to be Installed at the Sites and Hides the Parallel Configuration for the Execution.
- The Structure is Similar to the BiG Service, Even Simplified.
- Portal Interface in Progress.
39 WP3 web page: http://www.eu-eela.org/eela_wp3.php
WP3 documents: http://documents.eu-eela.org
Contacts:
Vicente Hernández, vhernand_at_dsic.upv.es
Rafael Mayo, rafael.mayo_at_ciemat.es
40 Thanks for your attention