1
A Brief Look At MPI's Point-To-Point Communication
  • Brian T. Smith
  • Professor, Department of Computer Science
  • Director, Albuquerque High Performance Computing
    Center (AHPCC)

2
Point To Point Communication
  • What is meant by this concept?
  • There is a sender and a receiver
  • The sender prepares a message in a package from
    the application storage area
  • The sender has a protocol for how it contacts and
    communicates with the receiver
  • The protocol is an agreement on how the
    communication is set up
  • The sender and receiver agree on when and how to
    communicate
  • The receiver receives the message package per its
    agreement with the sender
  • The receiver processes the package and installs
    the data in the application storage area
    (a minimal sketch of such a matched pair follows)
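
In MPI, this protocol is expressed as a matched pair of calls: the sender names a destination rank and a tag, and the receiver names the source rank and the same tag. A minimal sketch (the buffer name, count, and tag are illustrative, not from the slides):

  ! On the sender (e.g., rank 0): package data from application storage
  call MPI_Send( a, 100, MPI_REAL, 1, tag, MPI_COMM_WORLD, ierr )

  ! On the receiver (e.g., rank 1): install the data in application storage
  call MPI_Recv( a, 100, MPI_REAL, 0, tag, MPI_COMM_WORLD, status, ierr )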

3
Communication Models
  • Many models are feasible and have been
    implemented in various environments, past and
    current
  • MPI's goal is to be portable across all of the
    reasonable models
  • This means that essentially NO assumptions can be
    made, either
  • by the implementation, or
  • by the user,
  • as to which model is or can be used
  • Let's talk about two possible models
  • Models like these actually were used informally
    and differently by individual CPUs in our
    recent trial communications amongst the three
    institutions

4
MPI's Conventions
  • Messages have a format or a template
  • A message container, called a buffer, which is
    frequently assumed to be specified in user space:
    the storage set up by the user's code
  • A length, in terms of the number of objects of the
    message type
  • The type of objects in the message (basic type or
    user-defined type)
  • A message tag: a user-specified integer id for
    the message
  • Destination (for the sender) or source (for the
    receiver) of the message
  • The destination is the rank of the process in the
    process group
  • Communication world or group: a named arrangement
    established by calls to MPI (see the annotated
    call below)
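
These conventions map one-for-one onto the argument list of the Fortran MPI_SEND call; an annotated sketch (the variable names are illustrative):

  call MPI_Send( sendbuf,  & ! message buffer in user space
                 count,    & ! length, in objects of the message type
                 MPI_REAL, & ! type of the objects (basic or user-defined)
                 dest,     & ! rank of the destination in the process group
                 tag,      & ! user-specified integer id for the message
                 comm,     & ! communication world, e.g. MPI_COMM_WORLD
                 ierr )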

5
MPI's Conventions Continued
  • Kinds of communication
  • Blocking
  • Sender does not return from an MPI call until the
    message buffer (the user's container for the
    message) can be reused without corrupting the
    message that is being sent
  • Receiver does not return until the receiving
    message buffer contains all of the message
  • Non-blocking
  • Sender call returns after sufficient processing
    has been performed to allow the processor, in a
    separate and independent thread, to complete
    sending the message; in particular, changes to
    the sending task's message buffer may change the
    message sent
  • Receiver call returns after sufficient processing
    has been performed to allow the processor, in a
    separate and independent thread, to complete
    receiving the message; in particular, the receiving
    task's message buffer will likely change after the
    receiver call returns to the user's code
  • Other MPI procedures test or wait for the
    completion of sends and receives, as sketched below
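
A minimal non-blocking sketch; MPI_Wait is one of the completion procedures the last bullet refers to, and names such as dest, src, and count are illustrative:

  integer :: req_s, req_r, status(MPI_STATUS_SIZE)

  call MPI_Isend( sendbuf, count, MPI_REAL, dest, tag, comm, req_s, ierr )
  call MPI_Irecv( recvbuf, count, MPI_REAL, src,  tag, comm, req_r, ierr )
  ! ... unrelated computation may overlap the transfers here ...
  call MPI_Wait( req_s, status, ierr )  ! sendbuf may now be reused safely
  call MPI_Wait( req_r, status, ierr )  ! recvbuf now holds the whole message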

6
MPI Conventions Continued
  • Modes of communication (contact protocols and
    assumptions)
  • These are assumptions that may be made by the
    user; the implementation must honor them
  • Modes are determined by the name of the MPI SEND
    procedure used
  • E.g., MPI_BSEND specifies a buffered send
  • Standard (no letter)
  • Assumes no particular protocol is used; see the
    later modes for typical protocols
  • Because no protocol is assumed, the programmer
    must assume the most restrictive one is used,
    namely synchronous mode (no buffering may be
    available)
  • Non-local operation: another process may have
    to do something before this operation completes
  • Buffered (B letter)
  • Buffers used by the protocol are created and
    allocated in user space
  • Send can be started whether or not a receive has
    been posted
  • Local operation: another process does not have
    to do anything before this operation completes

7
Modes Continued
  • Synchronous (S letter)
  • Rendezvous semantics implemented
  • Sender starts but does not complete until the
    receiver has posted a receive
  • The buffer may be created in the receiver's space,
    or the transfer may be direct
  • Non-local operation
  • Ready (R letter)
  • Sender starts only if the matching receive has
    been posted
  • Erroneous if the receive has not been posted; the
    result is undefined
  • Non-local operation
  • Highest performance, as it can be a direct
    transfer with no buffer (a sketch of these modes
    follows)
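
A sketch of buffered mode, assuming 4-byte reals and an illustrative buffer size; the other modes differ only in the procedure name (MPI_SSEND for synchronous, MPI_RSEND for ready):

  real :: workspace(10000)
  integer :: bufsize, ierr

  ! Attach user-space storage for the protocol's buffers; MPI_Bsend can
  ! then start whether or not a matching receive has been posted
  call MPI_Buffer_attach( workspace, 4*10000 + MPI_BSEND_OVERHEAD, ierr )
  call MPI_Bsend( sendbuf, count, MPI_REAL, dest, tag, comm, ierr )
  call MPI_Buffer_detach( workspace, bufsize, ierr )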

8
MPI Conventions Continued
  • Communication worlds or communicators
  • Specifies the domain of the processes within the
    group
  • A process may be in more than one process
    group
  • Each process has a rank in each group
  • The rank of a particular process may be different
    in each group
  • The purpose of the groups is to arrange the
    processes so that it is convenient to
    send/receive messages within the particular group,
    and other processes do not see the message
  • Processes in a grid (north-south-east-west
    communication)
  • Processes distributed in a line or row or column
    of a grid
  • Processes in a circle
  • Processes in a hypercube configuration (see the
    group-creation sketch below)
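
A minimal sketch of creating such a group with MPI_Comm_split, assuming the processes form a grid with cols columns (the names cols, myrank, and row_comm are illustrative):

  integer :: row_comm, row_rank, ierr

  ! Processes passing the same color (here, the same grid-row index)
  ! land in the same new communicator, so a message sent in row_comm
  ! is never seen by processes in other rows
  call MPI_Comm_split( MPI_COMM_WORLD, myrank/cols, myrank, row_comm, ierr )
  call MPI_Comm_rank( row_comm, row_rank, ierr )  ! rank within the row group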

9
Pictures of Implementation Models
[Figure: buffered implementation model. User data passes through an
intermediate buffer on each side (send buffer used, receive buffer
used) between the sender and the receiver.]
10
Pictures of Implementation Models
[Figure: two further implementation models. In the first, user data
moves directly between sender and receiver (no send buffer used, no
receive buffer used). In the second, no send buffer is used but the
message lands in a receive buffer before reaching the receiver's user
data.]
11
Blocking Communication Operations
  • MPI_SEND and MPI_RECV
  • Let's look at three reasonable ways to perform
    communication between two processes that exchange
    messages
  • One always works
  • One always deadlocks
  • That is, both processors hang waiting for the
    other to communicate
  • One may or may not work depending on the actual
    protocols used by the MPI implementation

12
This One Always Works
  • Steps
  • Determine what rank the process is
  • If rank is 0
  • Send a message from send_buffer to process with
    rank 1
  • Receive a message into recv_buffer from process
    with rank 1
  • Else if rank is 1
  • Receive a message into recv_buffer from process
    with rank 0
  • Send a message from send_buffer to process with
    rank 0
  • Pattern of communication (doesn't matter who (0
    or 1) executes first)

13
Example Code Always Works
  call MPI_Comm_rank( comm, rank, ierr )
  if( rank == 0 ) then
    call MPI_Send( sendbuf, count, MPI_REAL, 1, tag, comm, ierr )
    call MPI_Recv( recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr )
  else if( rank == 1 ) then
    call MPI_Recv( recvbuf, count, MPI_REAL, 0, tag, comm, status, ierr )
    call MPI_Send( sendbuf, count, MPI_REAL, 0, tag, comm, ierr )
  endif
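
Filled out into a complete program, the same pattern might look like this minimal sketch (the buffer size, tag, and test values are illustrative); run it with two or more processes:

  program exchange
    use mpi
    implicit none
    integer, parameter :: count = 100, tag = 99
    real :: sendbuf(count), recvbuf(count)
    integer :: rank, ierr, status(MPI_STATUS_SIZE)

    call MPI_Init( ierr )
    call MPI_Comm_rank( MPI_COMM_WORLD, rank, ierr )
    sendbuf = real(rank)                 ! message contents: sender's rank

    if( rank == 0 ) then
      call MPI_Send( sendbuf, count, MPI_REAL, 1, tag, MPI_COMM_WORLD, ierr )
      call MPI_Recv( recvbuf, count, MPI_REAL, 1, tag, MPI_COMM_WORLD, &
                     status, ierr )
    else if( rank == 1 ) then
      call MPI_Recv( recvbuf, count, MPI_REAL, 0, tag, MPI_COMM_WORLD, &
                     status, ierr )
      call MPI_Send( sendbuf, count, MPI_REAL, 0, tag, MPI_COMM_WORLD, ierr )
    endif

    call MPI_Finalize( ierr )
  end program exchange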

14
This One Always Deadlocks
  • Steps
  • Determine what rank the process is
  • If rank is 0
  • Receive a message into recv_buffer from process
    with rank 1
  • Send a message from send_buffer to process with
    rank 1
  • Else if rank is 1
  • Receive a message into recv_buffer from process
    with rank 0
  • Send a message from send_buffer to process with
    rank 0
  • Pattern of communication (doesn't matter who (0
    or 1) executes first)

15
Example Code Always Deadlocks
  call MPI_Comm_rank( comm, rank, ierr )
  if( rank == 0 ) then
    call MPI_Recv( recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr )
    call MPI_Send( sendbuf, count, MPI_REAL, 1, tag, comm, ierr )
  else if( rank == 1 ) then
    call MPI_Recv( recvbuf, count, MPI_REAL, 0, tag, comm, status, ierr )
    call MPI_Send( sendbuf, count, MPI_REAL, 0, tag, comm, ierr )
  endif

16
This One May Or May Not Work: The Worst Of All
Possibilities
  • That is, it may work on one implementation and
    not work on another
  • Whether it works may depend on the size of the
    message or other unknown features of the
    implementation
  • It relies on buffering of the messages, which the
    code does not request: no MPI_BSEND is used and
    no MPI_Buffer_attach is called
  • Pattern of communication (doesn't matter who (0
    or 1) executes first)

17
Example Code May Fail
  call MPI_Comm_rank( comm, rank, ierr )
  if( rank == 0 ) then
    call MPI_Send( sendbuf, count, MPI_REAL, 1, tag, comm, ierr )
    call MPI_Recv( recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr )
  else if( rank == 1 ) then
    call MPI_Send( sendbuf, count, MPI_REAL, 0, tag, comm, ierr )
    call MPI_Recv( recvbuf, count, MPI_REAL, 0, tag, comm, status, ierr )
  endif

18
An Application Showing These Issues Very Close
To Your Code
  • Consider a 2-D Jacobi iteration (an n × n matrix)
    using a 5-point stencil
  • The data structure to be used here is a 1-D data
    structure
  • The coding illustrations are simpler here
  • However, this code does not scale well when the
    number of processors is large relative to the
    problem size n
  • The communication overhead is too large in this
    case
  • The algorithm or computation is
  • Given initial data for the matrix A, compute the
    average of the E-W-N-S neighbors of each point
    and assign it to the matrix B
  • Assign matrix B back to A and repeat until the
    iteration has converged

19
Serial Code
  real A(0:n+1,0:n+1), B(1:n,1:n)
  ! Main loop
  do while( .NOT. Converged(A) )
    do j = 1, n
      b(1:n,j) = 0.25*( a(0:n-1,j) + a(2:n+1,j) &
                      + a(1:n,j-1) + a(1:n,j+1) )
    enddo
    a(1:n,1:n) = b(1:n,1:n)
  enddo

20
Partitioning A and B Amongst The Processors
  • For simplicity of explaining the SEND/RECV
    commands, we use a 1-D partition

[Figure: 1-D column partition across processes. On each process, A is
dimensioned (0:n+1, 0:m+1), with ghost columns 0 and m+1 holding
neighbor data, and B is dimensioned (1:n, 1:m); Process 0 is leftmost.]
21
Code For This -- Unsafe
  real A(0:n+1,0:m+1), B(1:n,1:m)
  ! Call MPI to return p (number of processors) and myrank
  ! Assume n is an integral multiple of p, with m = n/p
  ! columns per process
  ! Main loop
  do while( .NOT. Converged(A) )
    ! Compute with A and store in B as in the serial code
    if( myrank > 0 ) then
      ! Send first column of B to last column of A of myrank-1
    endif
    if( myrank < p-1 ) then
      ! Send last column of B to first column of A of myrank+1
    endif
    if( myrank > 0 ) then
      ! Receive last column of B of myrank-1 into first column of A
    endif
    if( myrank < p-1 ) then
      ! Receive first column of B of myrank+1 into last column of A
    endif
  enddo

22
Unsafe Why?
  • All the sends are executed before any receive is
    posted
  • Assumes, as before, that the messages are buffered
  • This should not be assumed in standard mode
  • Solution
  • Divide the processors into two groups: even and
    odd processors
  • The odd processors send to the even processors
    first
  • Then the odd processors receive from the even
    processors
  • The even processors receive from the odd
    processors first
  • Then the even processors send to the odd
    processors
  • The effect is to interleave the send and receive
    commands so that no buffers are required to
    complete the communication
  • Buffers, of course, may still be used

23
Safe Communication
  do while( .NOT. Converged(A) )
    ! Compute with A and store in B as in the serial code
    if( mod(myrank,2) == 1 ) then      ! Odd ranked processors
      ! Send first column of B to last column of A of myrank-1
      ! If not the last processor, send the last column of B
      !   to processor myrank+1
      ! Receive into first column of A from processor myrank-1
      ! If not the last processor, receive into last column of A
      !   from processor myrank+1
    else                               ! Even ranked processors
      ! If not the first processor, receive the last column of B
      !   of myrank-1 into the first column of A
      ! If not the last processor, receive the first column of B
      !   of myrank+1 into the last column of A
      ! If not the first processor, send the first column of B
      !   to processor myrank-1
      ! If not the last processor, send the last column of B
      !   to processor myrank+1
    endif
  enddo
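
Filled in with actual calls, the odd-rank branch might look like the sketch below, assuming the per-process arrays A(0:n+1,0:m+1) and B(1:n,1:m) from the partitioning figure (tag, p, comm, and status are as in the earlier examples):

  if( mod(myrank,2) == 1 ) then     ! Odd ranks: send first, then receive
    call MPI_Send( B(1,1), n, MPI_REAL, myrank-1, tag, comm, ierr )
    if( myrank < p-1 ) &
      call MPI_Send( B(1,m), n, MPI_REAL, myrank+1, tag, comm, ierr )
    call MPI_Recv( A(1,0), n, MPI_REAL, myrank-1, tag, comm, status, ierr )
    if( myrank < p-1 ) &
      call MPI_Recv( A(1,m+1), n, MPI_REAL, myrank+1, tag, comm, &
                     status, ierr )
  endif

The even-rank branch is the mirror image, with the receives issued before the sends.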

24
Safe And Simpler Communications
  • Use the combined send/receive command,
    MPI_SENDRECV, for all but the first and last
    processors
  • Use null processes (MPI_PROC_NULL) to avoid the
    special cases of dealing with the first and last
    processors, as in the sketch below
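
A sketch of this approach, taking the send/receive command to be MPI_SENDRECV and the null process to be MPI_PROC_NULL (array shapes as in the partitioning figure):

  integer :: left, right

  ! A send or receive addressed to MPI_PROC_NULL completes immediately
  ! with no effect, so ranks 0 and p-1 need no special-case code
  left  = myrank - 1
  right = myrank + 1
  if( myrank == 0 )   left  = MPI_PROC_NULL
  if( myrank == p-1 ) right = MPI_PROC_NULL

  ! Each call interleaves its send and receive internally, so the
  ! exchange completes without assuming any buffering
  call MPI_Sendrecv( B(1,1),   n, MPI_REAL, left,  tag, &
                     A(1,m+1), n, MPI_REAL, right, tag, &
                     comm, status, ierr )
  call MPI_Sendrecv( B(1,m),   n, MPI_REAL, right, tag, &
                     A(1,0),   n, MPI_REAL, left,  tag, &
                     comm, status, ierr )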