Title: DUROC: Dynamically-Updated Request Online Co-allocator
1DUROC (Dynamically-Updated Request Online Co-allocator)
- August 8, 2002
- MPI Seminar
- Distributed Computing Communication Laboratory
- Jiyoung Song
2Introduction to DUROC
- What is DUROC?
- Dynamically-Updated Request Online Co-allocator
- Brings up the distributed pieces of a job
- Interface for obtaining resources and executing jobs across multiple management pools
- Similar to GRAM, with the addition of subjob-add, subjob-delete, and barrier-release operations for managing resources
3Globus components
4Structure of duroc runtime module
Duroc_runtime_module
Duct_runtime_module
Gram_myjob_module
Nexus_module
Globus_io_module
5Flow of duroc_runtime_module initialization
globus_duroc_runtime_activate()
s_duroc_runtime_activate()
s_intra_subjob_init()
s_inter_subjob_init()
s_inter_subjob_duct_init()
globus_duct_runtime_init()
6globus_duroc_runtime_activate()
- Call s_duroc_runtime_activate()
- Handle any error returned by the call
7s_duroc_runtime_activate()
- Call globus_module_activate() for each of:
- GLOBUS_COMMON_MODULE
- GLOBUS_THREAD_MODULE
- GLOBUS_NEXUS_MODULE
- GLOBUS_DUCT_RUNTIME_MODULE
- GLOBUS_GRAM_MYJOB_MODULE
- Call s_intra_subjob_init()
- Call s_inter_subjob_init()
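The activation order on this slide can be sketched as a loop over the modules. This is a minimal, self-contained illustration and not the Globus source: `sketch_activate_module()`, `sketch_deactivate_module()`, and `sketch_reset()` are hypothetical stand-ins for the real module activation calls, and rolling back already-activated modules on failure is an assumption about how the error path might behave.

```c
#include <assert.h>
#include <stddef.h>

/* Module names from the slide, in activation order. */
static const char *const sketch_modules[] = {
    "GLOBUS_COMMON_MODULE",
    "GLOBUS_THREAD_MODULE",
    "GLOBUS_NEXUS_MODULE",
    "GLOBUS_DUCT_RUNTIME_MODULE",
    "GLOBUS_GRAM_MYJOB_MODULE",
};
enum { SKETCH_MODULE_COUNT = sizeof sketch_modules / sizeof sketch_modules[0] };

static int sketch_active[SKETCH_MODULE_COUNT];
static int sketch_fail_at = -1; /* hypothetical: index of a module that fails, or -1 */

/* Hypothetical stand-in for activating one module. */
static int sketch_activate_module(int i)
{
    if (i == sketch_fail_at) return -1;
    sketch_active[i] = 1;
    return 0;
}

static void sketch_deactivate_module(int i) { sketch_active[i] = 0; }

/* Clear all state and choose which module (if any) should fail. */
void sketch_reset(int fail_at)
{
    assert(fail_at < SKETCH_MODULE_COUNT);
    for (int i = 0; i < SKETCH_MODULE_COUNT; i++) sketch_active[i] = 0;
    sketch_fail_at = fail_at;
}

/* Activate modules in order; on failure, undo the ones already activated. */
int sketch_runtime_activate(void)
{
    for (int i = 0; i < SKETCH_MODULE_COUNT; i++) {
        if (sketch_activate_module(i) != 0) {
            while (--i >= 0) sketch_deactivate_module(i);
            return -1;
        }
    }
    /* The real routine would now call s_intra_subjob_init()
       and s_inter_subjob_init(). */
    return 0;
}

int sketch_active_count(void)
{
    int n = 0;
    for (int i = 0; i < SKETCH_MODULE_COUNT; i++) n += sketch_active[i];
    return n;
}
```

Activating in a fixed order and undoing in reverse keeps a half-initialized runtime from leaking into later calls.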
8s_intra_subjob_init()
- Call globus_hashtable_init()
- Create the hash table named s_tagged_gram_myjob_hasht
9s_inter_subjob_init()
- Check gram_rank
- If gram_rank is 0, then
- Create the hash table named s_inter_subjob_tagged_duct_hasht
- Call s_inter_subjob_duct_init()
- Else
- Just return 0
10DUROC co-allocation
- Figure: one DUROC job co-allocated as four subjobs (SUBJOB_INDEX 0 to 3)
- Subjob sizes shown: 4, 2, 2, 4
- Process ranks start at 0 within each subjob (0 to 3 for size 4, 0 to 1 for size 2)
- Hosts shown: Grid4.sogang.ac.kr, Grid1.sogang.ac.kr, Grid2.sogang.ac.kr, Grid3.sogang.ac.kr
11s_inter_subjob_duct_init()
- Get the environment variables
- GLOBUS_DUROC_DUCT_CONTACT, GLOBUS_DUROC_DUCT_ID
- Initialize the FIFO queue
- s_inter_subjob_duct_fifo
- Initialize the mutex and condition variable
- s_inter_subjob_duct_mutex
- s_inter_subjob_duct_cond
- Call globus_duct_runtime_init()
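The first step, reading the two environment variables, might look like the sketch below. The variable names come from the slide; everything else is an assumption made for illustration: the `sketch_duct_config_t` type, both function names, and in particular the idea that GLOBUS_DUROC_DUCT_ID holds a decimal integer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical holder for the two values read from the environment. */
typedef struct {
    const char *contact; /* value of GLOBUS_DUROC_DUCT_CONTACT */
    long id;             /* value of GLOBUS_DUROC_DUCT_ID (assumed decimal) */
} sketch_duct_config_t;

/* Validate the raw strings; returns 0 on success, -1 if either is unusable. */
int sketch_duct_config_parse(const char *contact, const char *id_str,
                             sketch_duct_config_t *out)
{
    char *end;
    long id;

    assert(out != NULL);
    if (contact == NULL || *contact == '\0') return -1;
    if (id_str == NULL || *id_str == '\0') return -1;

    errno = 0;
    id = strtol(id_str, &end, 10);
    if (errno != 0 || *end != '\0') return -1; /* overflow or trailing junk */

    out->contact = contact;
    out->id = id;
    return 0;
}

/* Read both variables from the process environment. */
int sketch_duct_config_from_env(sketch_duct_config_t *out)
{
    return sketch_duct_config_parse(getenv("GLOBUS_DUROC_DUCT_CONTACT"),
                                    getenv("GLOBUS_DUROC_DUCT_ID"), out);
}
```

Splitting the parsing from the getenv() calls keeps the validation testable without touching the real environment.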
12globus_duct_runtime_init()
- Get the startpoint to the server (probably subjob_index 0, gram_rank 0)
- globus_duct_runtime_make_startpoint()
- URL or LSP (linearized startpoint), depending on (const char *) GLOBUS_DUROC_DUCT_CONTACT
- Use the globus_io module
- Set the arguments
- globus_duct_runtime_t s_inter_subjob_duct_runtime
- data_port, config_port, callback function
- Make the endpoint (data_port, config_port) used to communicate
- Bind the startpoint (sp) and endpoint (ep)
- Send the RSR (check-in message) to the server
- Handler id: CHECKIN_MSG_ID
13Other functions of duroc runtime module
- globus_duroc_runtime_intra_subjob_rank()
- globus_duroc_runtime_intra_subjob_size()
- globus_duroc_runtime_inter_subjob_structure()
- globus_duroc_runtime_inter_subjob_send()
- globus_duroc_runtime_inter_subjob_receive()
- globus_duroc_runtime_intra_subjob_send()
- globus_duroc_runtime_intra_subjob_receive()
14globus_duroc_runtime_intra_subjob_rank()
- Obtain the rank of the calling process within its local subjob
- Wrapper function
- gram_myjob_rank()
- Error handling
15globus_duroc_runtime_intra_subjob_size()
- Obtain the size of the local subjob (its number of processes)
- Wrapper function
- gram_myjob_size()
- Error handling
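The "wrapper plus error handling" pattern used by the rank and size routines can be sketched as follows. Nothing here is the real API: `stub_gram_myjob_rank()` is a hypothetical stand-in for gram_myjob_rank(), and the error codes are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; the real DUROC codes are not shown on the slides. */
enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_INVALID_PARAMETER = 1,
    SKETCH_ERROR_PROTOCOL_FAILED = 2
};

static int stub_rank = 2; /* rank the stub reports */
static int stub_fail = 0; /* set to 1 to make the stub fail */

/* Hypothetical stand-in for gram_myjob_rank(). */
static int stub_gram_myjob_rank(int *rankp)
{
    assert(rankp != NULL); /* the wrapper has already checked this */
    if (stub_fail) return -1;
    *rankp = stub_rank;
    return 0;
}

/* Wrapper: validate the argument, delegate, and translate the error. */
int sketch_intra_subjob_rank(int *rankp)
{
    if (rankp == NULL) return SKETCH_ERROR_INVALID_PARAMETER;
    if (stub_gram_myjob_rank(rankp) != 0) return SKETCH_ERROR_PROTOCOL_FAILED;
    return SKETCH_SUCCESS;
}
```

A size wrapper would follow the same shape with the size call substituted for the rank call.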
16globus_duroc_runtime_inter_subjob_structure()
- The DUROC inter-subjob communication routines can only be called on the subjob node where globus_duroc_runtime_intra_subjob_rank() reports the rank as zero
- Arguments
- int *local_addressp
- int **remote_addressesp
- Initializes *local_addressp with the local subjob's communication address
- Initializes *remote_addressesp with a freshly allocated array containing the remote subjobs' communication addresses
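Because the remote addresses come back in a freshly allocated array, the caller presumably owns that memory and has to free it. The sketch below illustrates that calling convention only. `sketch_inter_subjob_structure()` is a hypothetical stand-in, the `remote_countp` output is an assumed extra parameter (the slide lists only the two address arguments), and using the subjob indices 0 to 3 as addresses is also an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in: reports address 0 as local and the rest as remote. */
int sketch_inter_subjob_structure(int *local_addressp,
                                  int *remote_countp,
                                  int **remote_addressesp)
{
    static const int all[] = {0, 1, 2, 3}; /* assumed: one address per subjob */
    const int total = (int)(sizeof all / sizeof all[0]);
    const int local = 0;
    int *remote;
    int n = 0;

    if (!local_addressp || !remote_countp || !remote_addressesp) return -1;

    remote = malloc((size_t)(total - 1) * sizeof *remote);
    if (remote == NULL) return -1;

    for (int i = 0; i < total; i++) {
        if (all[i] != local) remote[n++] = all[i];
    }
    assert(n == total - 1);

    *local_addressp = local;
    *remote_countp = n;
    *remote_addressesp = remote; /* caller must free() this array */
    return 0;
}
```

A caller would check the return value, iterate over the `remote_countp` entries, and call free() on the array when done.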
17Flow of globus_duroc_runtime_inter_subjob_structure()
globus_duroc_runtime_inter_subjob_structure()
s_inter_subjob_duct_structure()
globus_duct_runtime_structure()
static globus_duct_runtime_t s_inter_subjob_duct_runtime (in globus_duroc_runtime.c)
18globus_duroc_runtime_intra_subjob_send()
- Send a byte-vector to another process in the DUROC subjob
- Format the message according to GLOBUS_DUROC_RUNTIME_INTRA_SEND_PROTOCOL_VERSION
- Call gram_myjob_send()
- Arguments
- int dst_rank
- The rank of the destination process
- const char *tag
- A NUL-terminated string that must match the one provided to the receive call on the destination process
- Example: "globus_duroc_runtime run status"
- globus_byte_t *msg
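One way to picture "version plus tag plus payload" framing is the sketch below. The wire layout is purely an assumption chosen for illustration (eight hex digits of version, the NUL-terminated tag, then the raw bytes); the slides do not give the actual DUROC format, and `SKETCH_PROTOCOL_VERSION`, `sketch_pack()`, and `sketch_unpack()` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PROTOCOL_VERSION 1u /* hypothetical value */

/* Write "<8 hex digits><tag>\0<payload>" into buf.
   Returns the number of bytes written, or -1 if buf is too small. */
int sketch_pack(unsigned char *buf, size_t cap, const char *tag,
                const unsigned char *msg, size_t msg_len)
{
    size_t tag_len, need;

    assert(tag != NULL);
    tag_len = strlen(tag);
    need = 8 + tag_len + 1 + msg_len;
    if (need > cap) return -1;

    /* 8 hex digits plus a NUL that the tag copy then overwrites. */
    snprintf((char *)buf, 9, "%08x", SKETCH_PROTOCOL_VERSION);
    memcpy(buf + 8, tag, tag_len + 1);           /* tag with its NUL */
    memcpy(buf + 8 + tag_len + 1, msg, msg_len); /* raw payload */
    return (int)need;
}

/* Check the version, then point tagp/msgp into buf. Returns 0 or -1. */
int sketch_unpack(const unsigned char *buf, size_t len, const char **tagp,
                  const unsigned char **msgp, size_t *msg_lenp)
{
    char hex[9];
    unsigned version;
    const unsigned char *nul;

    if (len < 9) return -1;
    memcpy(hex, buf, 8);
    hex[8] = '\0';
    if (sscanf(hex, "%x", &version) != 1) return -1;
    if (version != SKETCH_PROTOCOL_VERSION) return -1;

    nul = memchr(buf + 8, 0, len - 8); /* end of the tag */
    if (nul == NULL) return -1;

    *tagp = (const char *)(buf + 8);
    *msgp = nul + 1;
    *msg_lenp = len - (size_t)(nul + 1 - buf);
    return 0;
}
```

Carrying the tag inside the message is what lets the receiver match a message to the tag requested by its receive call.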
19globus_duroc_runtime_intra_subjob_receive()
- Receive a byte-vector sent by another process in the DUROC subjob
- Messages that arrive with a tag other than the one requested by the receiving call are queued and found later by hash table lookup, so delivery order can differ from arrival order
- Call gram_myjob_receive()
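The tag-matching behavior described on this slide can be illustrated with a small sketch. It is not the DUROC implementation: a linked list stands in for the hash table, `sketch_transport_receive()` is a hypothetical stand-in for gram_myjob_receive() that replays a fixed sequence of messages, and all names are invented for the example.

```c
#include <assert.h>
#include <string.h>

typedef struct sketch_msg {
    const char *tag;
    const char *body;
    struct sketch_msg *next;
} sketch_msg_t;

/* Hypothetical transport: messages arrive in this fixed order. */
static sketch_msg_t sketch_incoming[] = {
    {"status", "s1", NULL},
    {"data", "d1", NULL},
    {"status", "s2", NULL},
};
static size_t sketch_next_incoming = 0;

/* Messages received while waiting for a different tag. */
static sketch_msg_t *sketch_pending_head = NULL;
static sketch_msg_t *sketch_pending_tail = NULL;

static sketch_msg_t *sketch_transport_receive(void)
{
    size_t count = sizeof sketch_incoming / sizeof sketch_incoming[0];
    if (sketch_next_incoming >= count) return NULL;
    return &sketch_incoming[sketch_next_incoming++];
}

/* Remove and return the oldest pending message with this tag, if any. */
static sketch_msg_t *sketch_take_pending(const char *tag)
{
    sketch_msg_t *prev = NULL;
    for (sketch_msg_t *m = sketch_pending_head; m != NULL; prev = m, m = m->next) {
        if (strcmp(m->tag, tag) == 0) {
            if (prev) prev->next = m->next; else sketch_pending_head = m->next;
            if (sketch_pending_tail == m) sketch_pending_tail = prev;
            m->next = NULL;
            return m;
        }
    }
    return NULL;
}

/* Return the body of the next message with this tag, or NULL if none. */
const char *sketch_receive(const char *tag)
{
    sketch_msg_t *m;

    assert(tag != NULL);
    m = sketch_take_pending(tag);
    if (m != NULL) return m->body;

    while ((m = sketch_transport_receive()) != NULL) {
        if (strcmp(m->tag, tag) == 0) return m->body;
        /* Wrong tag: park it at the tail of the pending queue. */
        m->next = NULL;
        if (sketch_pending_tail) sketch_pending_tail->next = m;
        else sketch_pending_head = m;
        sketch_pending_tail = m;
    }
    return NULL;
}
```

With the arrival order above, asking for "data" first returns d1 and parks s1, which a later request for "status" then receives before s2.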
20globus_duroc_runtime_inter_subjob_send()
- Send a byte-vector to another subjob in the DUROC job
- The DUROC inter-subjob communication routines can only be called on the subjob node where globus_duroc_runtime_intra_subjob_rank() reports the rank as zero
21globus_duroc_runtime_inter_subjob_send()
s_inter_subjob_duct_send()
globus_duct_runtime_send()
22globus_duct_runtime_send()
- Look up the hash table
- Hash table: runtimep->remote_data_spst, key: dst_addr
- Obtain the startpoint to communicate with
- Send the RSR (message) to the context
- Handler id: DATA_MSG_ID
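The lookup-then-send step can be sketched as below. A small linear table stands in for the hash table keyed by dst_addr, the startpoints are plain placeholder strings, and `sketch_duct_send()` only records which startpoint it would have used; all of these are assumptions for illustration rather than the duct runtime's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mapping from a destination address to its startpoint. */
typedef struct {
    int addr;
    const char *startpoint;
} sketch_route_t;

static const sketch_route_t sketch_routes[] = {
    {1, "startpoint-for-1"},
    {2, "startpoint-for-2"},
    {3, "startpoint-for-3"},
};

/* Records the startpoint chosen by the most recent successful send. */
static const char *sketch_last_startpoint = NULL;

/* Linear search standing in for the hash table lookup. */
static const char *sketch_lookup(int dst_addr)
{
    size_t count = sizeof sketch_routes / sizeof sketch_routes[0];
    for (size_t i = 0; i < count; i++) {
        if (sketch_routes[i].addr == dst_addr) return sketch_routes[i].startpoint;
    }
    return NULL;
}

/* Returns 0 if a startpoint exists for dst_addr, -1 otherwise. */
int sketch_duct_send(int dst_addr, const char *msg)
{
    const char *sp;

    assert(msg != NULL);
    sp = sketch_lookup(dst_addr);
    if (sp == NULL) return -1;   /* unknown destination */
    sketch_last_startpoint = sp; /* a real send would issue the RSR here */
    return 0;
}
```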
23globus_duroc_runtime_inter_subjob_receive()
- The same as globus_duroc_runtime_intra_subjob_receive() except for the hash table name
- s_inter_subjob_tagged_duct_hasht