File-based MPI Initialization Tutorial

1
File-based MPI Initialization (Tutorial)
  • Yonsei Univ.
  • Kyung-Lang Park
  • (2005.4.23)

2
Contents
  • Introduction
  • DUROC-based Initialization (MPICH-G2)
  • File-based Initialization (MPICH-GX)
  • Details of Modification
  • New File Format

3
What is MPI initialization?
  • A procedure performed in each process
  • To understand the topology
    • What rank am I?
    • How many processes are in the program?
  • To get information about the other processes
    • Protocol type
    • Hostname
    • Listening port
  • It is done by MPI_Init(&argc, &argv)
  • Every MPI program should start by calling MPI_Init

4
DUROC-based Initialization
  • MPICH-G2 uses DUROC when initializing MPI processes
  • What is DUROC?
    • Dynamically-Updated Request Online Coallocator
    • Allocates a job across multiple resource managers
  • What does DUROC do?
    • Request processing (requestor side, in globusrun)
    • Runtime communication support (process side, in cpi)
    • Based on NEXUS
  • Each MPI process obtains topology information by
    using the DUROC API
    • DUROC_INTRA_SUBJOB_RANK()
    • DUROC_INTRA_SUBJOB_SIZE()
  • MPI processes exchange information by using the
    DUROC API
    • DUROC_INTER_SUBJOB_SEND()
    • DUROC_INTRA_SUBJOB_SEND()
  • Before the initialization, processes do not know
    the protocol information of the other processes

5
DUROC-based Initialization Steps
Declare variables
Mpich_globus2_debug_init()
Module activation
Build_channels()
Duroc_barrier
Create my_commworld_id
Tcp buffer size
Distribute_byte_array (from my_commworld_id to commworld_id_vector)
Check the size of MPIR_SHANDLE
Getting MyGlobusJobContact
Check G2_MAXHOSTNAMELEN
Getting rank_in_my_subjob, my_subjob_size
Distribute_byte_array (from MyGlobusJobContact to GramJobContactsVector)
Get_topology()
Create_my_miproto()
Distribute_byte_array (from miproto to miproto_vector)
Depends on the DUROC API

6
File-based Initialization
  • Each process obtains all necessary information
    from a given file
    • Topology information
    • Protocol information
  • Removes the dependency on the Globus Toolkit
    • MPI programs can run without globusrun and DUROC

7
File-based Initialization Steps
Declare variables
Mpich_globus2_debug_init()
Module activation
Build_channels()
Duroc_barrier
Create my_commworld_id
Create_my_commm_world_id_vector()
Tcp buffer size
Distribute_byte_array (from my_commworld_id to commworld_id_vector)
Check the size of MPIR_SHANDLE
Getting MyGlobusJobContact
Socket_Barrier
Check G2_MAXHOSTNAMELEN
Getting rank_in_my_subjob, my_subjob_size
Distribute_byte_array (from MyGlobusJobContact to GramJobContactsVector)
Get topology from the file ()
Get_topology()
Create_my_miproto() (using a given port)
Distribute_byte_array (from miproto to miproto_vector)
Get mi_protos_vectors_from_file()
Depends on the file

8
Modified Initialization Steps

9
File Format
First part: topology information, one line per process
Fields, in order: rank_in_my_subjob, my_subjob_size, MPID_MyWorldSize, nsubjobs, MPI process listen port, unique value for commworld_id, barrier_port, hostname, front node's hostname
0 2 4 2 33501 9292 36000 dccsaturn dccsun.sogang.ac.kr
1 2 4 2 33501 9292 36000 dccneptune dccsun.sogang.ac.kr
0 2 4 2 33501 9292 36000 cluster203.yonsei.ac.kr cluster203.yonsei.ac.kr
1 2 4 2 33501 9292 36000 cluster202.yonsei.ac.kr cluster202.yonsei.ac.kr
Second part: protocol information, one line per process
Fields, in order: s_nprotos, s_tcptype, hostname, port, lan_id length (labelled wan_id_length in the figure), lan_id, localhost_id, front node's hostname
1 0 dccsaturn 33501 12 LAN_ID_foo 0 dccsun.sogang.ac.kr
1 0 dccneptune 33501 12 LAN_ID_foo 0 dccsun.sogang.ac.kr
1 0 cluster203.yonsei.ac.kr 33501 12 LAN_ID_foo 1 cluster203.yonsei.ac.kr
1 0 cluster202.yonsei.ac.kr 33501 12 LAN_ID_foo 1 cluster202.yonsei.ac.kr

10
File Format Description
  • First part (topology information)
    • rank_in_my_subjob: the rank of the process within
      its subjob.
    • my_subjob_size: the size of the process's subjob.
    • MPID_MyWorldSize: the total number of processes in
      the MPI job.
    • nsubjobs: the number of subjobs.
    • MPI process listen port: the listening port of the
      MPI process; it must have the same value as the
      port in the second part.
    • unique value for commworld_id: a unique id used to
      construct COMMWORLD.
    • barrier_port: the listening port used for
      synchronization between the processes in COMMWORLD.
    • hostname: the hostname of the computational node
      running the MPI process.
    • front node's hostname: in an environment that uses
      private IP addresses, the hostname of the front
      node to which the computational node running the
      MPI process is connected. Otherwise, the hostname
      of the computational node itself. For instance, the
      first line of Figure 3 describes a process whose
      execution node is dccsaturn, which has a private IP
      address and sits behind the front node
      dccsun.sogang.ac.kr, while the third line of
      Figure 3 describes a process whose computational
      node, cluster203.yonsei.ac.kr, has a public IP
      address.
  • Second part (protocol information)
    • s_nprotos: the number of protocols used.
    • s_tcptype: the type of the protocol used; 0 is tcp,
      1 is mpi, and 2 is unknown. Currently only the tcp
      type is supported.
    • hostname: the hostname of the computational node
      running the MPI process.
    • port: the listening port of the MPI process.
    • lan_id_lng: the length of lan_id.
    • lan_id: identifies the LAN that the node is on.
    • localhost_id: identifies the intra-machine area
      that the node is in, as opposed to a LAN or WAN.
    • front node's hostname: same meaning as in the first
      part.
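
The topology fields described above can be read with a single sscanf call. The following is a minimal sketch, assuming the field order rank_in_my_subjob, my_subjob_size, MPID_MyWorldSize, nsubjobs, listen port, commworld id, barrier port, hostname, front node's hostname. The struct topo_entry, the function parse_topo_line and the constant HOSTNAME_BUF are hypothetical names introduced for illustration and are not part of MPICH-GX.

```c
#include <assert.h>
#include <stdio.h>

#define HOSTNAME_BUF 256 /* hypothetical stand-in for G2_MAXHOSTNAMELEN */

/* One topology line of the init file (illustrative layout only). */
struct topo_entry {
    int rank_in_my_subjob;
    int my_subjob_size;
    int world_size;              /* MPID_MyWorldSize */
    int nsubjobs;
    unsigned short listen_port;  /* MPI process listen port */
    int commworld_id;            /* unique value for commworld_id */
    unsigned short barrier_port;
    char hostname[HOSTNAME_BUF];
    char front_name[HOSTNAME_BUF];
};

/* Parse one topology line. Returns 0 on success, -1 if a field is missing. */
static int parse_topo_line(const char *line, struct topo_entry *e)
{
    assert(line != NULL && e != NULL);
    /* %hu matches unsigned short; %255s bounds the hostname copies. */
    int n = sscanf(line, "%d %d %d %d %hu %d %hu %255s %255s",
                   &e->rank_in_my_subjob, &e->my_subjob_size,
                   &e->world_size, &e->nsubjobs, &e->listen_port,
                   &e->commworld_id, &e->barrier_port,
                   e->hostname, e->front_name);
    return n == 9 ? 0 : -1;
}
```

A caller would pass one line at a time, for example the line "0 2 4 2 33501 9292 36000 dccsaturn dccsun.sogang.ac.kr", and check the return value before using the entry.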

11
Global Declaration
  • #define MAX_PENDING 5
  • #define PROXY_MSG_SIZE 128
  • struct gp_guid_t *my_guid;
  • unsigned short Proxy_Port = 12269;
  • extern globus_bool_t g2_proxy_connect(struct
    gp_guid_t *dest_gp_guid, struct gp_guid_t
    *src_gp_guid, unsigned short dest_port,
    globus_io_attr_t *attr, globus_io_handle_t *handle);
  • static unsigned short user_port = (unsigned short)0;
  • char master_hostname[G2_MAXHOSTNAMELEN];
  • char front_name[G2_MAXHOSTNAMELEN];
  • int get_topology(char *file_contents, int
    *MPID_MyWorldRank, int *rank_in_my_subjob, int
    *my_subjob_size, int *MPID_MyWorldSize, int
    *nsubjobs, unsigned short *user_port, int
    *channel_id, unsigned short *barrier_port, char
    *front_name);
  • void get_mi_protos_vector(char *file_contents,
    int MPID_MyWorldSize, globus_byte_t
    **mi_protos_vector, int *mi_protos_vector_lengths,
    char *master_hostname);
  • void get_commworld_id(char *master_hostname, int
    channel_id, int MPID_MyWorldSize, globus_byte_t
    *my_commworld_id_vector);

12
Local Declaration in globus_init()
  • int i;
  • globus_byte_t *my_miproto;
  • globus_byte_t **mi_protos_vectors;
  • int *mi_protos_vector_lengths;
  • int nbytes;
  • int rank_in_my_subjob;
  • int my_subjob_size;
  • int nsubjobs;
  • int *subjob_addresses;
  • int rc;
  • Additional variables for reading the file:
    • FILE *ifp;
    • char *file_name;
    • int channel_id, string_count;
    • unsigned short barrier_port;
    • char *file_contents;
  • Create_my_gp_guid()
    • Obtains the front name and my hostname

13
Create_my_gp_guid()
  • void create_my_gp_guid(struct gp_guid_t **my_guid)
  • struct gp_guid_t *tmp_guid;
  • char *front;
  • /* Allocating my_guid */
  • tmp_guid = (struct gp_guid_t *)
    globus_libc_malloc(sizeof(struct gp_guid_t));
  • tmp_guid->compute_name = (char *)
    globus_libc_malloc(sizeof(char) * G2_MAXHOSTNAMELEN);
  • /* Front_name querying: getenv based */
  • front = globus_libc_getenv("FRONT_NAME");
  • if (front != GLOBUS_NULL)
  • strcpy(tmp_guid->front_name, front);
  • globus_libc_gethostname(tmp_guid->compute_name,
    G2_MAXHOSTNAMELEN);
  • else
14
Module activation barrier
  • Activate GLOBUS_DUROC_RUNTIME_MODULE
  • globus_duroc_runtime_barrier() --> deleted
  • Activate GLOBUS_COMMON_MODULE
  • Activate GLOBUS_IO_MODULE
  • if (sizeof(MPIR_SHANDLE) >
    globus_dc_sizeof_u_long(1)) ERROR
  • if (G2_MAXHOSTNAMELEN < MAXHOSTNAMELEN) ERROR

15
File open
  • file_name = globus_libc_getenv("FILENAME");
  • while (!(ifp = fopen(file_name, "r")))
    • globus_libc_fprintf(stderr, "Cannot open");
    • globus_libc_usleep(1000000);
  • int input_c; int count = 0;
    /* int rather than char, so that EOF can be detected */
  • if (!(file_contents = (char *)
    globus_libc_malloc(5120 * sizeof(char))))
    • globus_libc_fprintf(stderr, "ERROR: failed
      malloc of %d bytes for input_string\n",
      (int)(5120 * sizeof(char)));
    • exit(1);
  • while ((input_c = fgetc(ifp)) != EOF)
    • *(file_contents + count++) = input_c;
    • if (count > 5119)
      • globus_libc_fprintf(stderr, "fail: process
        information file is too big.\n");
      • exit(1);
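
The read loop above can be written as a small standalone helper. The sketch below uses only the C standard library; the function name read_init_file and the choice to return NULL instead of calling exit() are assumptions made for illustration. It does not retry the fopen call, so a caller that wants the slide's once-per-second retry would wrap it in a loop.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a NUL-terminated heap buffer of at
 * most `cap` bytes (including the terminator). Returns NULL if the file
 * cannot be opened, does not fit, or memory cannot be allocated.
 * The caller frees the result. */
static char *read_init_file(const char *path, size_t cap)
{
    assert(path != NULL && cap > 0);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return NULL;

    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t n = 0;
    int c; /* int, so that EOF is distinguishable from any byte value */
    while ((c = fgetc(fp)) != EOF) {
        if (n + 1 >= cap) { /* keep one byte for the terminator */
            free(buf);
            fclose(fp);
            return NULL;
        }
        buf[n++] = (char)c;
    }
    buf[n] = '\0';
    fclose(fp);
    return buf;
}
```

With cap set to 5120 this matches the buffer size used on the slide, while also terminating the buffer so that sscanf and strcmp can be applied to it safely.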

16
Getting basic information
  • globus_duroc_runtime_intra_subjob_rank(
    &rank_in_my_subjob);
  • globus_duroc_runtime_intra_subjob_size(
    &my_subjob_size);
    • These statements can be removed because the
      process reads the information from the file
  • get_topology(rank_in_my_subjob,
    my_subjob_size, &subjob_addresses,
    &MPID_MyWorldSize, &nsubjobs,
    &MPID_MyWorldRank);
    • subjob_addresses is dynamic information that can
      be obtained only from DUROC. If we remove all
      DUROC communication, it can be removed as well.
    • The others are obtained from the file.
  • string_count = get_topology(file_contents,
    &MPID_MyWorldRank,
    &rank_in_my_subjob, &my_subjob_size,
    &MPID_MyWorldSize, &nsubjobs,
    &user_port, &channel_id,
    &barrier_port, front_name);

17
Get_topology()
  • int get_topology(char *file_contents, int
    *MPID_MyWorldRank, int *rank_in_my_subjob, int
    *my_subjob_size, int *MPID_MyWorldSize, int
    *nsubjobs, unsigned short *user_port, int
    *channel_id, unsigned short *barrier_port, char
    *front_name)
  • int i, j, k;
  • char my_hostname[G2_MAXHOSTNAMELEN];
  • char *p_myinfo; char *file_index; char
    s_rank_in_my_subjob[5];
  • char s_my_subjob_size[5]; char
    s_MPID_MyWorldSize[5]; char s_nsubjobs[5];
  • char s_user_port[10]; char s_channel_id[10]; char
    s_barrier_port[10];
  • file_index = file_contents;
  • sscanf(file_index, "%s %s %s %s %s %s %s %s %s",
    s_rank_in_my_subjob, s_my_subjob_size,
    s_MPID_MyWorldSize, s_nsubjobs, s_user_port,
    s_channel_id, s_barrier_port, my_hostname,
    front_name);
  • sscanf(s_MPID_MyWorldSize, "%d",
    MPID_MyWorldSize);
  • for (j = 0; j < *MPID_MyWorldSize; j++, i = 0)
    • sscanf(file_index, "%d %d %d %d %hu %d %hu %s %s",
      rank_in_my_subjob, my_subjob_size,
      MPID_MyWorldSize, nsubjobs, user_port,
      channel_id, barrier_port, my_hostname,
      front_name);
      /* %hu, because the port arguments are unsigned short */
    • if ((0 == strcmp(my_hostname,
      my_guid->compute_name)) && (0 ==
      strcmp(front_name, my_guid->front_name)))
      • *MPID_MyWorldRank = j;   /* finding my info */
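
The "finding my info" step amounts to a linear search over the topology lines for the line whose hostname and front name match the local process. A minimal sketch of that search follows, assuming one process per line and the nine-field layout used by the sscanf call above; the function name find_my_world_rank is hypothetical, and the step that advances to the next line is an assumption, since the transcript does not show it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the 0-based index of the topology line whose hostname and front
 * name match, or -1 if none of the first `world_size` lines matches. */
static int find_my_world_rank(const char *contents, int world_size,
                              const char *my_host, const char *my_front)
{
    assert(contents != NULL && my_host != NULL && my_front != NULL);
    const char *p = contents;
    for (int j = 0; j < world_size && *p != '\0'; j++) {
        char host[256], front[256];
        /* %*d and %*u skip the seven numeric fields without storing them,
         * so sscanf reports 2 when both names were read. */
        if (sscanf(p, "%*d %*d %*d %*d %*u %*d %*u %255s %255s",
                   host, front) == 2
            && strcmp(host, my_host) == 0
            && strcmp(front, my_front) == 0)
            return j;
        const char *nl = strchr(p, '\n'); /* advance to the next line */
        if (nl == NULL)
            break;
        p = nl + 1;
    }
    return -1;
}
```

Matching on both names matters because several compute nodes behind different front nodes may share the same private hostname.
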
18
Create_my_miproto
  • Getting TCP information
    • Hostname
    • struct in_addr net_addr, net_mask, if_addr
    • globus_io_tcp_create_listener(port, ...) //
      passive socket ready
    • The port number can be read from the file
  • Create char *my_mi_proto
    • s_tcptype
    • hostname
    • globus_lan_id <- GLOBUS_LAN_ID
    • globus_wan_id <- GLOBUS_WAN_ID
    • localhost_id <- GLOBUS_DUROC_SUBJOB_INDEX
  • Can be removed because it will be made from the
    file

19
distribute_byte_array(my_mi_proto, mi_protos_vector)
  • Exchanges my_mi_proto with the other processes
    using the DUROC API
  • The exchanged my_mi_proto values are gathered into
    mi_protos_vector
  • These statements can be removed because
    mi_protos_vector is obtained from the file

20
Get_mi_protos_vector()
  • void get_mi_protos_vector(char *file_contents,
    int MPID_MyWorldSize, globus_byte_t
    **mi_protos_vector,
    int *mi_protos_vector_lengths, char
    *master_hostname)
  • int count = 0, i = 0;
  • char *temp = file_contents;
  • while (*temp)
  • if (temp[count] != '\n')
  • else
  • mi_protos_vector_lengths[i] = count;
  • mi_protos_vector[i] = (globus_byte_t *)
    globus_libc_malloc(mi_protos_vector_lengths[i]);
  • memcpy(*(mi_protos_vector + i), temp,
    mi_protos_vector_lengths[i]);
  • if (i == 0)
  • sscanf(temp + 4, "%s", master_hostname);
  • if (i != MPID_MyWorldSize)
  • temp = temp + count;
  • else
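
The transcript drops several statements of this function, so the following is only a sketch of the technique it appears to use: cut the protocol part of the file at newlines and store one copy and one length per process. The function name split_lines is hypothetical, and unlike the slide code this version stores NUL-terminated copies, which is an added assumption to make the results usable as C strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy up to `max_lines` newline-separated lines of `text` into freshly
 * allocated, NUL-terminated strings. lengths[i] receives the length of
 * lines[i] without the terminator. Returns the number of lines stored,
 * or -1 if an allocation fails (nothing is leaked in that case). */
static int split_lines(const char *text, int max_lines,
                       char **lines, int *lengths)
{
    assert(text != NULL && lines != NULL && lengths != NULL);
    int n = 0;
    const char *p = text;
    while (*p != '\0' && n < max_lines) {
        size_t len = strcspn(p, "\n"); /* characters before the newline */
        lines[n] = malloc(len + 1);
        if (lines[n] == NULL) {
            while (n > 0)
                free(lines[--n]);
            return -1;
        }
        memcpy(lines[n], p, len);
        lines[n][len] = '\0';
        lengths[n] = (int)len;
        n++;
        p += len;
        if (*p == '\n')
            p++; /* step over the separator */
    }
    return n;
}
```

With protocol lines of the form "1 0 hostname port ...", the hostname of the first process starts at offset 4 of the first stored line, which is consistent with the `temp + 4` offset in the slide code.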

21
Build_channels
  • Builds the channel structure using mi_protos_vector
  • -> Build_channel_with_gp_guid()
  • Mostly the same, but it adds the guid to
    channel_table
  • g_malloc(tp->gp_guid, struct gp_guid_t *,
    sizeof(struct gp_guid_t));
  • sscanf(cp, "%s", tp->gp_guid->front_name);
  • tp->gp_guid->compute_name = tp->hostname;
  • tp->gp_guid->process_id = channel_id;

22
Create my_commworld_id dist..
  • Create my_commworld_id (root only)
    • hostname + globus_libc_getpid()
    • Now my_commworld_id is read from the file.
    • We use an assigned number instead of the dynamic
      PID.
  • Distribute_byte_array(my_commworld_id)
    • Distributes my_commworld_id to the other processes
  • The above two statements can be removed (we read
    the value from the file)
  • Insert the channel structure into
    CommWorldChannelsTable[0] with my_commworld_id

23
MyGlobusGramJobContact and dist..
  • MyGlobusGramJobContact <-
    getenv("GLOBUS_GRAM_JOB_CONTACT")
  • Distribute_byte_array(..)
    • Exchanges MyGlobusGramJobContact with the other
      processes
  • MyGlobusGramJobContact is dynamic information
    created by the globus-job-manager in Globus 2.x
    environments, so it cannot be obtained from the
    file.
  • However, I think it is not essential information,
    so it can be deleted.

24
Globus_barrier_with_proxy()
  • We move the barrier from the beginning of the
    initialization to the end
  • The barrier should also go through the proxy
  • So, namul changed the yonsei-barrier to a
    proxy-barrier

25
Modifying the format
Annotations on the old file format (figure):
  • Mandatory information for each process: don't
    change
  • Shared information: move it to the first line
  • Give a global rank for ease of understanding
  • Duplicated information: delete
  • Mandatory information: move it to the upper part
  • Optional information: read from environment
    variables
  • Constant: delete
First part (topology information):
0 2 4 2 33501 9292 36000 dccsaturn.sogang.ac.kr dccsun.sogang.ac.kr
1 2 4 2 33501 9292 36000 dccneptune.sogang.ac.kr dccsun.sogang.ac.kr
0 2 4 2 33501 9292 36000 cluster203.yonsei.ac.kr cluster203.yonsei.ac.kr
1 2 4 2 33501 9292 36000 cluster202.yonsei.ac.kr cluster202.yonsei.ac.kr
Second part (protocol information):
1 0 dccsaturn 33501 12 LAN_ID_foo 0 dccsun.sogang.ac.kr
1 0 dccneptune 33501 12 LAN_ID_foo 0 dccsun.sogang.ac.kr
1 0 cluster203 33501 12 LAN_ID_foo 1 cluster203.yonsei.ac.kr
1 0 cluster202 33501 12 LAN_ID_foo 1 cluster202.yonsei.ac.kr

26
New File Format
# This is an init file example (you can use # for commenting something)
# Shared information (MyWorldSize, nsubjobs, unique value)
4 2 9292
# Protocol information of each process
2 0 0 33501 36000 dccsaturn.sogang.ac.kr dccsun.sogang.ac.kr
2 1 1 33501 36000 dccneptune.sogang.ac.kr dccsun.sogang.ac.kr
2 0 2 33501 36000 cluster203.yonsei.ac.kr cluster203.yonsei.ac.kr
2 1 3 33501 36000 cluster202.yonsei.ac.kr cluster202.yonsei.ac.kr
Field labels for each process line: My_subjob_size, Rank_in_my_subjob, Global_rank, Listen port, Barrier port, hostname, Front hostname
(The comment character and the order of the first three fields are inferred from the values; the figure lists the labels as unordered callouts.)
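
Reading the shared-information line of the new format can be sketched as follows. This assumes that comment lines start with '#' (the comment character is not visible in the transcript) and that the first non-comment line holds MyWorldSize, nsubjobs and the unique value, in that order. The names struct shared_info and parse_shared_info are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Shared information from the first data line of the init file. */
struct shared_info {
    int world_size;   /* MyWorldSize */
    int nsubjobs;
    int commworld_id; /* unique value */
};

/* Find the first line that is neither blank nor a '#' comment and parse
 * it as "world_size nsubjobs unique_value".
 * Returns 0 on success, -1 if no such line exists or it is malformed. */
static int parse_shared_info(const char *text, struct shared_info *out)
{
    assert(text != NULL && out != NULL);
    const char *p = text;
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");  /* length of the current line */
        size_t skip = strspn(p, " \t"); /* leading blanks on that line */
        if (skip < len && p[skip] != '#') {
            int n = sscanf(p, "%d %d %d", &out->world_size,
                           &out->nsubjobs, &out->commworld_id);
            return n == 3 ? 0 : -1;
        }
        p += len;
        if (*p == '\n')
            p++;
    }
    return -1;
}
```

Keeping the shared values on one line means every process can learn the world size before it scans the per-process lines, rather than reading it redundantly from each line as in the old format.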