Message Passing Models
1
Message Passing Models
  • Miodrag Bolic

2
Overview
  • Hardware model
  • Programming model
  • Message Passing Interface

3
Generic Model of a Message-Passing Multicomputer [5]
  [Figure: nodes connected by a message-passing direct interconnection network]
  (Figure: Gyula Fehér)
4
Generic Node Architecture [5]
  [Figure: generic node - processor, local memory, internal channel(s),
   router/communication switch unit, and external channels]
  • Fat node: powerful processor, large memory, many chips;
    costly per node, moderate parallelism
  • Thin node: small processor, small memory, one or a few chips;
    cheap per node, high parallelism
  (Figure: Gyula Fehér)
5
Generic Organization Model [5]
  [Figure: processing modules (PM) with communication processors (CP)
   connected through switches (S) to a switching network;
   (b) decentralized and (c) centralized organizations]
  (Figure: Gyula Fehér)
6
Message Passing Properties [1]
  • Complete computer as the building block, including I/O
  • Programming model: direct access only to the private
    address space (local memory)
  • Communication via explicit messages (send/receive)
  • Communication is integrated at the I/O level, not the
    memory system, so no special hardware is needed
  • Resembles a network of workstations (which can actually
    be used as multiprocessor systems)

7
Message Passing Program 1
  • Problem Sum all of the elements of an array of
    size n.
  • INITIALIZE //assign proc_num and num_procs
  • if (proc_num 0) //processor with a proc_num of
    0 is the master,
  • //which sends out messages and sums the result
  • read_array(array_to_sum, size) //read the array
    and array size from file
  • size_to_sum size/num_procs
  • for (current_proc 1 current_proc lt num_procs
    current_proc)
  • lower_ind size_to_sum current_proc
  • upper_ind size_to_sum (current_proc 1)
  • SEND(current_proc, size_to_sum)
  • SEND(current_proc, array_to_sumlower_indupper_in
    d)
  • //master nodes sums its part of the array
  • sum 0
  • for (k 0 k lt size_to_sum k)
  • sum array_to_sumk
  • global_sum sum

8
Message Passing Program (cont.) [1]
  • Multiprocessor software functions provided:
  • INITIALIZE - assigns a number (proc_num) to each processor
    in the system and assigns the total number of processors
    (num_procs).
  • SEND(receiving_processor_number, data) - sends data to
    another processor.
  • BARRIER(n_procs) - when a BARRIER is encountered, a
    processor waits at that BARRIER until n_procs processors
    reach the BARRIER; only then can execution proceed.
    (A minimal MPI sketch of the whole pattern follows.)

9
Advantages and Disadvantages [1]
  • Advantages
  • Easier to build than scalable shared-memory machines
  • Easy to scale (but topology is important)
  • Programming model is more removed from basic hardware
    operations
  • Coherency and synchronization are the responsibility of
    the user, so the system designer need not worry about them.
  • Disadvantages
  • Large overhead: copying of buffers requires large data
    transfers (this will kill the benefits of multiprocessing
    if not kept to a minimum).
  • Programming is more difficult.
  • The blocking nature of SEND/RECEIVE can cause increased
    latency and deadlock (see the sketch after this list).
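An illustrative sketch (not from the slides) of the deadlock hazard just mentioned: if two ranks both call a blocking send first, each may wait for the other to post a receive. The commented-out ordering is the hazardous one; the if/else ordering below it is one safe fix. Exactly two ranks are assumed.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char **argv)
    {
        int rank, other, out = 1, in = 0;
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        other = 1 - rank;                     /* assumes exactly 2 ranks */

        /* Hazardous ordering: both ranks may block inside MPI_Send.
           MPI_Send(&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
           MPI_Recv(&in,  1, MPI_INT, other, 0, MPI_COMM_WORLD, &st);    */

        /* Safe ordering: make one side receive first. */
        if (rank == 0) {
            MPI_Send(&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
            MPI_Recv(&in,  1, MPI_INT, other, 0, MPI_COMM_WORLD, &st);
        } else {
            MPI_Recv(&in,  1, MPI_INT, other, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
        }
        printf("rank %d received %d\n", rank, in);
        MPI_Finalize();
        return 0;
    }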

10
Message-Passing Interface (MPI) [3]
  • Standardization - MPI is the only message passing
    library which can be considered a standard. It is
    supported on virtually all HPC platforms.
    Practically, it has replaced all previous message
    passing libraries.
  • Portability - There is no need to modify your
    source code when you port your application to a
    different platform that supports the MPI
    standard.
  • Performance Opportunities - Vendor
    implementations should be able to exploit native
    hardware features to optimize performance.
  • Functionality - Over 115 routines are defined.
  • Availability - A variety of implementations are
    available, both vendor and public domain.

11
MPI Basics [3]
  • Start Processes
  • Send Messages
  • Receive Messages
  • Synchronize
  • With these four capabilities, you can construct any
    program (a minimal skeleton follows).
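A minimal skeleton, assuming at least two ranks, showing where each of the four capabilities appears; the tag 0 and the integer payload are arbitrary illustrative choices.

    #include "mpi.h"

    int main(int argc, char **argv)
    {
        int rank, value = 0;
        MPI_Status st;

        MPI_Init(&argc, &argv);                   /* 1. start processes */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);      /* 2. send    */
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &st); /* 3. receive */
        }
        MPI_Barrier(MPI_COMM_WORLD);              /* 4. synchronize     */

        MPI_Finalize();
        return 0;
    }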

12
Communicators [3]
  • Provide a named set of processes for communication
  • The system allocates unique tags to processes
  • All processes can be numbered from 0 to n-1
  • Allow construction of libraries: the application creates
    communicators
  • MPI_COMM_WORLD
  • MPI uses objects called communicators and groups to define
    which collection of processes may communicate with each
    other.
  • Provide functions (split, duplicate, ...) for creating
    communicators from other communicators (illustrated below)
  • Functions (size, my_rank, ...) for finding out about all
    processes within a communicator
  • Blocking vs. non-blocking
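A sketch (an assumed example, not from the slides) of creating communicators from MPI_COMM_WORLD: MPI_Comm_split places even-ranked and odd-ranked processes into two separate sub-communicators, and size/rank are then queried within the new group.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char **argv)
    {
        int world_rank, sub_rank, sub_size;
        MPI_Comm sub_comm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* processes with the same "color" end up in the same communicator */
        MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm);

        MPI_Comm_rank(sub_comm, &sub_rank);   /* my_rank within the sub-group */
        MPI_Comm_size(sub_comm, &sub_size);   /* size of the sub-group        */
        printf("world rank %d -> sub rank %d of %d\n",
               world_rank, sub_rank, sub_size);

        MPI_Comm_free(&sub_comm);
        MPI_Finalize();
        return 0;
    }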

13
Hello World Example [3]
  • #include <stdio.h>
  • #include "mpi.h"
  • main(int argc, char **argv) {
  •   int my_PE_num;
  •   MPI_Init(&argc, &argv);
  •   MPI_Comm_rank(MPI_COMM_WORLD, &my_PE_num);
  •   printf("Hello from %d.\n", my_PE_num);
  •   MPI_Finalize();
  • }
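With a typical MPI installation this program would be compiled with the mpicc wrapper and launched with mpirun or mpiexec, for example "mpirun -np 8 ./hello", which starts eight copies of the same executable; the exact commands depend on the MPI implementation.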

14
Hello World Example: sample output [3]
  • Hello from 5.
  • Hello from 3.
  • Hello from 1.
  • Hello from 2.
  • Hello from 7.
  • Hello from 0.
  • Hello from 6.
  • Hello from 4.

15
MPMD [3]
  • Use MPI_Comm_rank to dispatch different code to different
    ranks (a complete sketch follows):
  • if (my_PE_num == 0)
  •   Routine1();
  • else if (my_PE_num == 1)
  •   Routine2();
  • else if (my_PE_num == 2)
  •   Routine3();
  • . . .
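A self-contained sketch of the same dispatch pattern; Routine1, Routine2, and Routine3 are hypothetical stand-ins for the distinct per-rank programs an application would supply.

    #include <stdio.h>
    #include "mpi.h"

    /* hypothetical application routines, one per rank */
    static void Routine1(void) { printf("rank 0: routine 1\n"); }
    static void Routine2(void) { printf("rank 1: routine 2\n"); }
    static void Routine3(void) { printf("rank 2: routine 3\n"); }

    int main(int argc, char **argv)
    {
        int my_PE_num;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_PE_num);

        if      (my_PE_num == 0) Routine1();
        else if (my_PE_num == 1) Routine2();
        else if (my_PE_num == 2) Routine3();

        MPI_Finalize();
        return 0;
    }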

16
Blocking Sending and Receiving Messages [3]
  • #include <stdio.h>
  • #include "mpi.h"
  • main(int argc, char **argv) {
  •   int my_PE_num, numbertoreceive, numbertosend = 42;
  •   MPI_Status status;
  •   MPI_Init(&argc, &argv);
  •   MPI_Comm_rank(MPI_COMM_WORLD, &my_PE_num);
  •   if (my_PE_num == 0) {
  •     MPI_Recv(&numbertoreceive, 1, MPI_INT, MPI_ANY_SOURCE,
  •              MPI_ANY_TAG, MPI_COMM_WORLD, &status);
  •     printf("Number received is %d\n", numbertoreceive);
  •   }
  •   else
  •     MPI_Send(&numbertosend, 1, MPI_INT, 0, 10, MPI_COMM_WORLD);
  •   MPI_Finalize();
  • }

17
Non-Blocking Message Passing Routines [4]
  • #include "mpi.h"
  • #include <stdio.h>
  • int main(int argc, char **argv) {
  •   int numtasks, rank, next, prev, buf[2], tag1 = 1, tag2 = 2;
  •   MPI_Request reqs[4];
  •   MPI_Status stats[4];
  •   MPI_Init(&argc, &argv);
  •   MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
  •   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  •   prev = rank - 1;  next = rank + 1;      // ring neighbours
  •   if (rank == 0) prev = numtasks - 1;
  •   if (rank == (numtasks - 1)) next = 0;
  •   MPI_Irecv(&buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0]);
  •   MPI_Irecv(&buf[1], 1, MPI_INT, next, tag2, MPI_COMM_WORLD, &reqs[1]);
  •   MPI_Isend(&rank, 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2]);
  •   MPI_Isend(&rank, 1, MPI_INT, next, tag1, MPI_COMM_WORLD, &reqs[3]);
  •   MPI_Waitall(4, reqs, stats);            // wait for all four transfers
  •   MPI_Finalize();
  •   return 0;
  • }

18
Collective Communications [3]
  • The communicator specifies the process group that
    participates in a collective communication
  • MPI implements various optimized functions:
  • Barrier synchronization
  • Broadcast
  • Reduction operations
  • with one destination or all processes in the group as
    destination
  • Collective operations are blocking (a short sketch follows)
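An illustrative sketch of the collectives named above over MPI_COMM_WORLD: a broadcast from rank 0, a barrier, and a sum reduction with a single destination. The broadcast value and the choice of reducing the rank numbers are arbitrary.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char **argv)
    {
        int rank, n = 0, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) n = 100;                        /* root chooses a value    */
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* broadcast               */

        MPI_Barrier(MPI_COMM_WORLD);                   /* barrier synchronization */

        /* each rank contributes its rank number; result goes to one destination */
        MPI_Reduce(&rank, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("broadcast value %d, sum of ranks %d\n", n, total);

        MPI_Finalize();
        return 0;
    }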

19
Comparison: MPI vs. OpenMP

  Feature                              | OpenMP              | MPI
  -------------------------------------|---------------------|----------------
  Apply parallelism in steps           | yes                 | no
  Scale to large number of processors  | maybe               | yes
  Code complexity                      | small increase      | major increase
  Runtime environment                  | expensive compilers | free
  Cost of hardware                     | very expensive      | cheap

20
References
  1. J. Kowalczyk, Multiprocessor Systems, Xilinx, 2003.
  2. D. Culler, J. P. Singh, Parallel Computer Architecture:
     A Hardware/Software Approach, Morgan Kaufmann, 1999.
  3. MPI Basics.
  4. Message Passing Interface (MPI).
  5. D. Sima, T. Fountain and P. Kacsuk, Advanced Computer
     Architectures: A Design Space Approach, Pearson, 1997.