Message Passing Programming with MPI - PowerPoint PPT Presentation



1
Message Passing Programming with MPI
  • Introduction to MPI
  • Basic MPI functions
  • Most of the MPI materials are obtained from
    William Gropp and Rusty Lusk's MPI tutorial at
    http://www.mcs.anl.gov/mpi/tutorial/

2
Message Passing Interface (MPI)
  • MPI is an industry standard that specifies the
    library routines needed for writing message
    passing programs.
  • Mainly communication routines
  • Also includes other features, such as topology.
  • MPI allows the development of scalable, portable
    message passing programs.
  • It is a standard supported by pretty much
    everybody in the field.

3
  • MPI uses a library approach to support parallel
    programming.
  • MPI specifies the API for message passing
    (communication-related routines).
  • MPI program = C/Fortran program + MPI
    communication calls.
  • MPI programs are compiled with a regular
    compiler (e.g. gcc) and linked with an MPI
    library.

4
MPI execution model
  • Separate (collaborative) processes are running
    all the time.
  • mpirun -machinefile machines -np 16 a.out → the
    same a.out is executed on 16 machines.
  • Different from the OpenMP model.
  • What about the sequential portion of an
    application?

5
MPI data model
  • No shared memory. Using explicit communications
    whenever necessary.
  • How to solve large problems (e.g. SOR)?
  • Logically partition the large array and
    distribute the pieces across processes.

6
Compiling, linking and running MPI programs
  • MPICH is installed on linprog
  • To run an MPI program, do the following:
  • Create a file called .mpd.conf in your home
    directory with the content secretword=cluster
  • Create a file hosts specifying the machines to
    be used to run MPI programs.
  • Boot the system: mpdboot -n 3 -f hosts
  • Check that the system is set up correctly:
    mpdtrace
  • Compile the program: mpicc hello.c
  • Run the program: mpiexec -machinefile hostmap
    -n 4 a.out
  • hostmap specifies the process-to-machine mapping.
  • -n 4 says to run the program with 4 processes.
  • Exit MPI: mpdallexit
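The hosts file is plain text with one machine name per line; for instance (hypothetical host names, assuming the linprog cluster):

```
linprog1.cs.fsu.edu
linprog2.cs.fsu.edu
linprog3.cs.fsu.edu
```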

7
  • MPI specification is both simple and complex.
  • Almost all MPI programs can be realized with six
    MPI routines.
  • MPI has a total of more than 100 functions and a
    lot of concepts.
  • We will mainly discuss the simple MPI, but we
    will also give a glimpse of the complex MPI.
  • MPI is about the right size: one can start
    using it after learning just the six routines,
    yet the flexibility is there when it is
    required.

8
The hello world MPI program (example1.c)
  • #include "mpi.h"
  • #include <stdio.h>
  • int main( int argc, char *argv[] )
  • {
  •   MPI_Init( &argc, &argv );
  •   printf( "Hello world\n" );
  •   MPI_Finalize();
  •   return 0;
  • }
  • mpi.h contains MPI definitions and types.
  • An MPI program must start with MPI_Init and end
    with MPI_Finalize.
  • MPI functions are just library routines that can
    be used on top of the regular C, C++, Fortran
    language constructs.

9
  • MPI uses the SPMD model (one copy of a.out).
  • How to make different processes do different
    things (MIMD functionality)?
  • Need to know the execution environment: one can
    usually decide what to do based on the number of
    processes in this job and the process id.
  • How many processes are working on this problem?
  • MPI_Comm_size
  • What is myid?
  • MPI_Comm_rank
  • Rank is with respect to a communicator (the
    context of the communication). MPI_COMM_WORLD is
    a predefined communicator that includes all
    processes (already mapped to processors).
  • See example2.c

10
Sending and receiving messages in MPI
  • Questions to be answered
  • To whom are the data sent?
  • What is sent?
  • How does the receiver identify the message?

11
  • Send and receive routines in MPI
  • MPI_Send and MPI_Recv (blocking send/recv)
  • Identify peer: peer rank (peer id)
  • Specify data: starting address, datatype, and
    count.
  • An MPI datatype is recursively defined as
  • predefined, corresponding to a data type from the
    language (e.g., MPI_INT, MPI_DOUBLE)
  • a contiguous array of MPI datatypes
  • a strided block of datatypes
  • an indexed array of blocks of datatypes
  • an arbitrary structure of datatypes
  • There are MPI functions to construct custom
    datatypes, in particular ones for subarrays
  • Identifying a message: sender id + tag

12
MPI blocking send
  • MPI_Send(start, count, datatype, dest, tag, comm)
  • The message buffer is described by (start, count,
    datatype).
  • The target process is specified by dest, which is
    the rank of the target process in the
    communicator comm.
  • When this function returns, the data has been
    delivered to the system and the buffer can be
    reused. The message may not have been received
    by the target process.

13
MPI blocking receive
  • MPI_Recv(start, count, datatype, source, tag,
    comm, status)
  • Waits until a matching (both source and tag)
    message is received from the system, and the
    buffer can be used
  • source is rank in communicator specified by comm,
    or MPI_ANY_SOURCE (a message from anyone)
  • tag is a tag to be matched on or MPI_ANY_TAG
  • receiving fewer than count occurrences of
    datatype is OK, but receiving more is an error
    (result undefined)
  • status contains further information (e.g. size of
    message, rank of the source)
  • See pi_mpi.c and jacobi_mpi.c for the use of
    MPI_Send and MPI_Recv.

14
  • The simple MPI (six functions that make most
    programs work):
  • MPI_INIT
  • MPI_FINALIZE
  • MPI_COMM_SIZE
  • MPI_COMM_RANK
  • MPI_SEND
  • MPI_RECV
  • Only MPI_Send and MPI_Recv are non-trivial.

15
The MPI PI program
Logically partition the domain.

h = 1.0 / (double) n;
sum = 0.0;
for (i = myid + 1; i <= n; i += numprocs) {
    x = h * ((double)i - 0.5);
    sum += 4.0 / (1.0 + x*x);
}
mypi = h * sum;
if (myid == 0) {
    for (i = 1; i < numprocs; i++) {
        MPI_Recv(&tmp, 1, MPI_DOUBLE, i, 0,
                 MPI_COMM_WORLD, &status);
        mypi += tmp;
    }
} else {
    MPI_Send(&mypi, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
}
/* see pi_mpi.c */

For comparison, the sequential kernel:
  • h = 1.0 / (double) n;
  • sum = 0.0;
  • for (i = 1; i <= n; i++) {
  •   x = h * ((double)i - 0.5);
  •   sum += 4.0 / (1.0 + x*x);
  • }
  • mypi = h * sum;

Explicit communication
16
More on MPI
  • Nonblocking point-to-point routines
  • Deadlock
  • Collective communication

17
Non-blocking send/recv routines
  • Non-blocking primitives provide the basic
    mechanisms for overlapping communication with
    computation.
  • Non-blocking operations return (immediately)
    request handles that can be tested and waited
    on.
  • MPI_Isend(start, count, datatype, dest, tag,
    comm, request)
  • MPI_Irecv(start, count, datatype, source, tag,
    comm, request)
  • MPI_Wait(request, status)

18
Overlapping communication with computation
  • MPI_Isend()
  • MPI_Irecv()
  • Computation
  • MPI_Wait()
  • MPI_Wait()

19
  • One can also test without waiting
  • MPI_Test(request, flag, status)
  • MPI allows multiple outstanding non-blocking
    operations.
  • MPI_Waitall(count, array_of_requests,
    array_of_statuses)
  • MPI_Waitany(count, array_of_requests, index,
    status)

20
Sources of Deadlocks
  • Send a large message from process 0 to process 1
  • If there is insufficient storage at the
    destination, the send must wait for memory space
  • What happens with this code?
  • This is called unsafe because it depends on the
    availability of system buffers

21
Some Solutions to the unsafe Problem
  • Order the operations more carefully

Supply receive buffer at same time as send
22
More Solutions to the unsafe Problem
  • Supply own space as buffer for send (buffer mode
    send)

Use non-blocking operations
23
MPI Collective Communication
  • Send/recv routines are also called point-to-point
    routines (two parties). Some operations require
    more than two parties, e.g. broadcast and reduce.
    Such operations are called collective operations,
    or collective communication operations.
  • Three classes of collective operations:
  • Synchronization
  • Data movement
  • Collective computation

24
Synchronization
  • MPI_Barrier( comm )
  • Blocks until all processes in the group of the
    communicator comm call it.

25
Collective Data Movement
Broadcast
Scatter
Gather
26
Collective Computation (see example3a.c)
27
MPI Collective Routines
  • Many Routines Allgather, Allgatherv, Allreduce,
    Alltoall, Alltoallv, Bcast, Gather, Gatherv,
    Reduce, Reduce_scatter, Scan, Scatter, Scatterv
  • All versions deliver results to all participating
    processes.
  • V versions allow the chunks to have different
    sizes.
  • Allreduce, Reduce, Reduce_scatter, and Scan take
    both built-in and user-defined combiner functions.

28
SOR sequential version
29
SOR MPI version
  • How to partition the arrays?
  • double grid[n+1][n/p+1], temp[n+1][n/p+1]

30
SOR MPI version
  • Receive grid[1..n][0] from process myid-1
  • Receive grid[1..n][n/p] from process myid+1
  • Send grid[1..n][1] to process myid-1
  • Send grid[1..n][n/p-1] to process myid+1
  • for (i=1; i<n; i++)
  •   for (j=1; j<n/p; j++)
  •     temp[i][j] = 0.25 * (grid[i][j-1] + grid[i][j+1]
  •       + grid[i-1][j] + grid[i+1][j]);

Local temp[i][j] and physical temp[i][j]?
31
Sequential Matrix Multiply
  • for (i=0; i<n; i++)
  •   for (j=0; j<n; j++) {
  •     c[i][j] = 0;
  •     for (k=0; k<n; k++)
  •       c[i][j] = c[i][j] + a[i][k] * b[k][j];
  •   }

MPI version? How to distribute a, b, and c? What
is the communication requirement?
32
Sequential TSP
  • init_q(); init_best();
  • while ((p = dequeue()) != NULL)
  •   for each expansion by one city {
  •     q = addcity(p);
  •     if (complete(q)) update_best(q);
  •     else enqueue(q);
  •   }

MPI version? Need to implement a distributed
queue!!
33
MPI discussion
  • Ease of use
  • Programmer takes care of the logical
    distribution of the global data structure
  • Programmer takes care of synchronizations and
    explicit communications
  • None of these are easy.
  • MPI is hard to use!!

34
MPI discussion
  • Expressiveness
  • Data parallelism
  • Task parallelism
  • There is always a way to do it if one does not
    care about how hard it is to write the program.

35
MPI discussion
  • Exposing architecture features
  • Forces one to consider locality, which often
    leads to more efficient programs.
  • The MPI standard does have some features that
    expose the architecture (e.g. topology).
  • Performance is a strength of MPI programming.
  • It would be nice to have the best of both
    worlds, OpenMP and MPI.