Title: Experiences through Grid Challenge Event
2. Grid Challenge
- A competition for programming on a Grid
- Main objectives
- For participants
- To provide opportunities to use a real Grid
- For us
- To understand the obstacles/problems in making a Grid production level (1000 CPUs are shared by many users)
- To have an opportunity to encourage participants to use our software (e.g. Ninf-G, GXP)
- 30 students/graduates participated in this event
- Provided a 960-CPU testbed for participants
- Schedule
- Preliminary round: Feb. 1 to Feb. 28
- Final round: Mar. 5 to Mar. 20
3. Grid Challenge
- Two categories
- Regular routine
- A problem is provided
- Graphic image analysis
- Count the number of objects
- Ranked by performance, i.e. which program is the fastest?
- Free routine
- Participants can do anything interesting
- Participants can gain experience running their own software on a real Grid
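The regular-routine task (count the objects in an image) can be illustrated with a minimal sketch. The slides do not say how participants solved it or what the input looked like, so the following assumes a small binary image held as a list of lists and uses connected-component labeling with 4-connectivity, one common way to count objects. The function name `count_objects` and the sample data are hypothetical.

```python
from collections import deque


def count_objects(image):
    """Count 4-connected groups of nonzero pixels in a 2D grid (list of lists)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # Found an unvisited object pixel: flood-fill the whole object.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Hypothetical 4x4 binary image containing three separate objects.
sample = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
print(count_objects(sample))  # 3
```

On a Grid, a natural (but here only assumed) extension is to split a large image into tiles, count per tile on separate nodes, and merge objects that cross tile borders.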
4. Software
- Software provided by the organizer
- ssh
- GXP
- GT2 batch jobmanager
- MPICH (p4)
- Ninf-G2
- Other software can be installed by participants
5. Contributed resources
Site               | Nodes/CPUs | IP addresses | Administered by
TITECH / Matsuoka  | 100/200    | Public       | Tanaka-san
TITECH / Aida      | 30/60      | Private      | Prof. Aida's students
Tokushima U.       | 50/100     | Private      | Prof. Ono's students
U. Tsukuba         | 20/40      | Public       | students
UEC                | 50/100     | Private      | AIST Support
U. Tokyo           | 40/40      | Public       | AIST Support
U. Tokyo           | 63/126     | Private      | AIST Support
U. Tokyo           | 107/214    | Public       | AIST Support
AIST               | 40/80      | Public       | AIST Support
Total              | 500/960    |              |
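As a quick cross-check of the totals in the table above, the per-site figures can be summed directly. The numbers are taken from the table; the variable names and the breakdown by address type are additions for illustration only.

```python
# (site, nodes, cpus, address type) copied from the contributed-resources table.
resources = [
    ("TITECH / Matsuoka", 100, 200, "Public"),
    ("TITECH / Aida", 30, 60, "Private"),
    ("Tokushima U.", 50, 100, "Private"),
    ("U. Tsukuba", 20, 40, "Public"),
    ("UEC", 50, 100, "Private"),
    ("U. Tokyo", 40, 40, "Public"),
    ("U. Tokyo", 63, 126, "Private"),
    ("U. Tokyo", 107, 214, "Public"),
    ("AIST", 40, 80, "Public"),
]

total_nodes = sum(nodes for _, nodes, _, _ in resources)
total_cpus = sum(cpus for _, _, cpus, _ in resources)
print(f"{total_nodes}/{total_cpus}")  # 500/960

# CPUs grouped by whether the nodes have public or private IP addresses.
cpus_by_address = {}
for _, _, cpus, address in resources:
    cpus_by_address[address] = cpus_by_address.get(address, 0) + cpus
print(cpus_by_address)
```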
6. Preparation (until Feb. 1)
- Administrators installed software at every site
- Participants sent their ssh public keys
- Administrators created accounts for all participants
- Participants tested each cluster
- login
- compile
- test run
- Participants obtained Globus certificates from the AIST GTRC CA (if necessary)
- Participants sent their Subject DNs, and administrators added the corresponding entries to the grid-mapfile
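The last step above (mapping a certificate Subject DN to a local account) can be sketched as follows. A grid-mapfile line consists of the quoted Subject DN followed by the local user name. The DN and account name below are hypothetical placeholders, not values from the event, and the helper `gridmap_entry` is an illustrative function, not part of Globus.

```python
def gridmap_entry(subject_dn, local_user):
    """Return one grid-mapfile line: the quoted Subject DN, a space, the local account."""
    if '"' in subject_dn:
        raise ValueError("Subject DN must not contain a double quote")
    if not local_user or " " in local_user:
        raise ValueError("local user name must be a single non-empty token")
    return f'"{subject_dn}" {local_user}'


# Hypothetical participant DN and local account name.
line = gridmap_entry("/C=JP/O=Example/OU=Grid Challenge/CN=Taro Example", "gc001")
print(line)
```

An administrator would append such a line to the grid-mapfile at each site where the participant has an account.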
7. Preparation (until Feb. 1) (contd)
- AIST provided
- A document for obtaining a Globus certificate
- A test script for Globus
- A how-to document and sample programs for Ninf-G2
- How to develop Ninf-G apps step-by-step
- Obtain a certificate
- Test Globus
- Develop and run Ninf-G apps
- A client configuration file for the Grid Challenge environment
8. Problems
- 30 participants shared 960 CPUs for one month
- Some used ssh for process invocation
- Some used GXP for process invocation
- Some used Ninf-G2 for process invocation
- We needed to handle (many) troubleshooting requests
- Some nodes went down
- The PBS daemon died
- Students usually ran experiments in the middle of the night
- Interactive use of backend nodes (via ssh/GXP) was allowed
- F32 prohibits interactive use
- AIST could not provide F32
9. Problems (contd)
- Participants expected that all processes would be launched immediately (co-allocation)
- ssh/GXP enables it
- With Ninf-G2, this could not be expected
- To keep things fair, we decided to change the configuration of the batch queuing system
- For each processor, set the max number of processes per user to 1
- Increased the max number of processes per processor to the number of participants (30)
- This is an unusual configuration!!
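The effect of the two limits above can be sketched with a toy admission check. This is not PBS configuration syntax and does not reproduce the actual queue settings; it only models the stated rule (at most one process per user per processor, at most 30 processes per processor). The constant names, the function `can_start`, and the user names are hypothetical.

```python
MAX_PER_USER_PER_CPU = 1   # each user may run one process on a given processor
MAX_PER_CPU = 30           # one slot per participant


def can_start(running_users, user):
    """Decide whether `user` may start one more process on a processor.

    `running_users` lists the owner of every process already on that processor.
    """
    if len(running_users) >= MAX_PER_CPU:
        return False
    return running_users.count(user) < MAX_PER_USER_PER_CPU


# One processor; the second request from "alice" is refused.
cpu = []
for user in ["alice", "bob", "alice"]:
    if can_start(cpu, user):
        cpu.append(user)
print(cpu)  # ['alice', 'bob']
```

Under this rule every participant can always get one slot on every processor, so jobs start immediately, at the cost of up to 30 processes sharing a single CPU.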
10. Insights valuable for PRAGMA
- A mixture of batch and interactive use introduces a problem
- Batch is expected to provide
- a dedicated environment
- load balancing
- Interactive use (via ssh) may disturb batch
- But some middleware/apps require interactive use
- Co-allocation / grid-level scheduling is a hard problem to solve
- (Basically) applications should not expect that all resources are available
- Application developers need to do extra work for this feature
- Possible solutions
- Make applications capable of using only the resources that are available, in an as-is strategy
- Implement co-allocation based on reservation
- No grid-level reservation system yet
- Should be done manually
- Do we have the same problem in PRAGMA routine-basis experiments?
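The first possible solution above (use only the resources that happen to be available) can be sketched as a simple task-assignment step. This is a minimal illustration under stated assumptions, not how Ninf-G or GXP implement it: the function `assign_tasks`, the `is_available` probe, and the site names are all hypothetical, and a real application would replace the probe with an actual connection or job-submission test.

```python
def assign_tasks(tasks, hosts, is_available):
    """Spread tasks round-robin over the hosts that respond, skipping the rest."""
    usable = [host for host in hosts if is_available(host)]
    if not usable:
        raise RuntimeError("no host is available")
    plan = {host: [] for host in usable}
    for index, task in enumerate(tasks):
        plan[usable[index % len(usable)]].append(task)
    return plan


# Hypothetical sites; "siteB" is treated as down or busy.
unavailable = {"siteB"}
plan = assign_tasks(
    tasks=list(range(5)),
    hosts=["siteA", "siteB", "siteC"],
    is_available=lambda host: host not in unavailable,
)
print(plan)  # {'siteA': [0, 2, 4], 'siteC': [1, 3]}
```

The application proceeds with whatever subset of hosts responds instead of waiting for all of them, which avoids the need for co-allocation at the price of extra logic in the application.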