1
Collective Communications
  • Paul Tymann
  • Computer Science Department
  • Rochester Institute of Technology
  • ptt@cs.rit.edu

2
Collective Communications
  • There are certain communication patterns that
    appear in many different types of applications
  • MPI provides routines that implement these
    patterns
  • Barrier synchronization
  • Broadcast from one member to all other members
  • Gather data from an array spread across
    processors into one array
  • Scatter data from one member to all members
  • All-to-all exchange of data
  • Global reduction (e.g., sum, min of "common" data
    elements)
  • Scan across all members of a communicator
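  • For reference (added here, not on the original slide), these patterns map onto the MPI-1 routines below:

int MPI_Barrier( MPI_Comm comm );
int MPI_Bcast( void *buf, int count, MPI_Datatype type,
               int root, MPI_Comm comm );
int MPI_Gather( void *sendbuf, int sendcnt, MPI_Datatype sendtype,
                void *recvbuf, int recvcnt, MPI_Datatype recvtype,
                int root, MPI_Comm comm );
int MPI_Scatter( void *sendbuf, int sendcnt, MPI_Datatype sendtype,
                 void *recvbuf, int recvcnt, MPI_Datatype recvtype,
                 int root, MPI_Comm comm );
int MPI_Alltoall( void *sendbuf, int sendcnt, MPI_Datatype sendtype,
                  void *recvbuf, int recvcnt, MPI_Datatype recvtype,
                  MPI_Comm comm );
int MPI_Reduce( void *sendbuf, void *recvbuf, int count,
                MPI_Datatype type, MPI_Op op, int root, MPI_Comm comm );
int MPI_Scan( void *sendbuf, void *recvbuf, int count,
              MPI_Datatype type, MPI_Op op, MPI_Comm comm );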

3
Characteristics
  • MPI collective communication routines differ in
    many ways from MPI point-to-point communication
    routines
  • Involve coordinated communication within a group
    of processes identified by an MPI communicator
  • Substitute for a more complex sequence of
    point-to-point calls
  • All routines block until they are locally
    complete
  • Communications may, or may not, be synchronized
    (implementation dependent)
  • In some cases, a root process originates or
    receives all data
  • Amount of data sent must exactly match amount of
    data specified by receiver
  • Many variations on the basic categories
  • No message tags are needed
  • MPI collective communication can be divided into three subsets: synchronization, data movement, and global computation

4
Data Movement
  • MPI provides several collective data movement routines
  • broadcast
  • gather
  • scatter
  • allgather
  • alltoall
  • Let's take a look at the functionality and syntax
    of these routines.
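  • Gather and scatter do not appear in code later in this deck, so here is a minimal added sketch (the buffer names and the assumption of 4 processes are illustrative):

int sendbuf[ 100 ];   /* root's full array (4 processes x 25) */
int piece[ 25 ];      /* each process's slice                 */
int recvbuf[ 100 ];   /* root's array of collected results    */

/* Root (rank 0) hands each process 25 consecutive elements */
MPI_Scatter( sendbuf, 25, MPI_INT,
             piece,   25, MPI_INT, 0, MPI_COMM_WORLD );

/* ... each process works on its 25-element slice ... */

/* Root collects the slices back, in rank order */
MPI_Gather( piece,   25, MPI_INT,
            recvbuf, 25, MPI_INT, 0, MPI_COMM_WORLD );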

5
Broadcast
  • Exactly what it says
  • Implementation will do whatever is most efficient for the given hardware (might use a reduction)
  • MPI_Bcast( buffer, count, datatype, root,
    communicator )
  • What might catch you by surprise is that the
    receiving process calls MPI_Bcast() as well
  • I often use broadcast to distribute parameters
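  • A minimal complete sketch (added; the parameter buffer and its values are illustrative):

#include <mpi.h>

int main( int argc, char *argv[] )
{
    double params[ 4 ];   /* hypothetical parameter block */
    int rank;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    if ( rank == 0 ) {           /* only the root fills the buffer */
        params[ 0 ] = -0.5;      /* center x                       */
        params[ 1 ] =  0.0;      /* center y                       */
        params[ 2 ] =  2.0;      /* width                          */
        params[ 3 ] =  1000.0;   /* max iterations                 */
    }

    /* Every process, root included, makes the same call; afterwards
       all ranks hold the values the root placed in params */
    MPI_Bcast( params, 4, MPI_DOUBLE, 0, MPI_COMM_WORLD );

    MPI_Finalize();
    return 0;
}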

6
Mandelbrot
The Mandelbrot set is a connected set of points
in the complex plane. Pick a point z0 in the
complex plane and calculate:

    z1 = z0^2 + z0
    z2 = z1^2 + z0
    z3 = z2^2 + z0
    . . .

If the sequence z0, z1, z2, z3, ... remains within
a distance of 2 of the origin forever, then the
point z0 is said to be in the Mandelbrot set. If
the sequence diverges from the origin, then the
point is not in the set.
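A minimal sketch of the escape-time test just described (added; the function name and the maxIters cutoff are illustrative, not from the original deck):

/* Returns the number of iterations before |z| exceeds 2, or
   maxIters if the point appears to be in the set */
int mandelbrot( double x0, double y0, int maxIters )
{
    double x = x0, y = y0;   /* z starts at z0 */
    int iter = 0;

    /* z = z^2 + z0 with z = x + iy:
       next x = x*x - y*y + x0, next y = 2*x*y + y0 */
    while ( x * x + y * y <= 4.0 && iter < maxIters ) {
        double xt = x * x - y * y + x0;
        y = 2.0 * x * y + y0;
        x = xt;
        iter = iter + 1;
    }
    return iter;
}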
7
Parallel Mandelbrot
  • Can be done using farmer/worker since the
    calculation of each pixel in the picture is
    independent of any other pixel value.
  • We need to distribute a number of parameters to
    each of the processors
  • The size of the window
  • Location of the center
  • Width
  • Maximum number of iterations

8
The Farmer
void manager( int numProcs, char *host )
{
    double msg[ WORK_SIZE ];
    int maxMessageSize = ( WINDOW_SIZE / ( numProcs - 1 ) +
                           WINDOW_SIZE % ( numProcs - 1 ) ) * WINDOW_SIZE;
    int result[ maxMessageSize ];
    int i;
    MPI_Status status;
    int count;

    /* Fill the parameter block that will be broadcast to the workers */
    msg[ _PIXELS ] = WINDOW_SIZE;
    msg[ _X ]      = X_CENTER - ( WIDTH / 2.0 );
    msg[ _Y ]      = Y_CENTER - ( WIDTH / 2.0 );
    msg[ _WIDTH ]  = WIDTH;
    msg[ _ITERS ]  = ITERATIONS;
9
The Farmer
    /* Send the parameters to every worker */
    MPI_Bcast( msg, WORK_SIZE, MPI_DOUBLE, 0, MPI_COMM_WORLD );

    /* Collect one tile from each worker and draw it */
    for ( i = 0; i < numProcs - 1; i = i + 1 ) {
        MPI_Recv( ... );   /* Parameters omitted */
        MPI_Get_count( &status, MPI_INT, &count );
        drawTile( win, WINDOW_SIZE, numProcs,
                  status.MPI_SOURCE, result );
    }
}
10
A Worker
void worker( int myRank, int numProcs, char *host )
{
    double msg[ WORK_SIZE ];
    int *result;
    MPI_Status status;
    double x;
    double pointsPerPixel;
    int colStart;
    int numCols;

    /* Obtain parameters from manager */
    MPI_Bcast( msg, WORK_SIZE, MPI_DOUBLE, 0, MPI_COMM_WORLD );

    /* Rest of the program has been omitted */
}
11
MPE
  • MPI Parallel Environment (MPE) is a software
    package that contains a number of useful tools
  • Profiling Library
  • Viewers for logfiles
  • Parallel X Graphics library
  • Debugger setup routines
  • MPE is not part of the SUN HPC package, but it
    works with it. I have compiled and installed it
    in my account

12
X Routines
  • MPE_Open_graphics - (collectively) opens an X
    Windows display
  • MPE_Draw_point - Draws a point on an X Windows
    display
  • MPE_Draw_points - Draws points on an X Windows
    display
  • MPE_Draw_line - Draws a line on an X11 display
  • MPE_Fill_rectangle - Draws a filled rectangle on
    an X11 display
  • MPE_Update - Updates an X11 display
  • MPE_Close_graphics - Closes an X11 graphics device
  • MPE_Xerror( returnVal, functionName )
  • MPE_Make_color_array - Makes an array of color indices
  • MPE_Num_colors - Gets the number of available colors
  • MPE_Draw_circle - Draws a circle
  • MPE_Draw_logic - Sets logical operation for
    laying down new pixels
  • MPE_Line_thickness - Sets the thickness of lines
  • MPE_Add_RGB_color( graph, red, green, blue,
    mapping )
  • MPE_Get_mouse_press - Waits for mouse button
    press
  • MPE_Iget_mouse_press - Checks for mouse button
    press
  • MPE_Get_drag_region - Gets a "rubber-band" box (or circle) region

13
Using X Routines
#include "mpe.h"
#include "mpe_graphics.h"

MPE_XGraph win;

MPE_Open_graphics( &win,                 // Display handle
                   MPI_COMM_SELF,        // Communicator
                   (char *)0,            // X display
                   -1, -1,               // Location on screen
                   500, 500,             // Size
                   MPE_GRAPH_INDEPENDENT );  // Collective flag

MPE_Draw_point( win,      // Display handle
                col, row, // Coordinates of the point
                color );  // Color to use

MPE_Close_graphics( &win );  // Display handle
14
Compiling
  • A little more is required to compile:
  • mpcc -I/home/fac/ptt/pub/mpe/include -L/home/fac/ptt/mpe/lib -o mandel Mandelbrot.c -lmpi -lm -lmpe -lX11
  • Run it the same way

15
Reduce
  • MPI provides functions that perform standard reductions across processors:

    int MPI_Reduce( void *operand, void *result,
                    int count, MPI_Datatype type,
                    MPI_Op operator, int root,
                    MPI_Comm comm )

Operation Name Meaning
MPI_MAX Maximum
MPI_MIN Minimum
MPI_SUM Sum
MPI_PROD Product
MPI_LAND Logical and
MPI_BAND Bitwise and
MPI_LOR Logical or
MPI_BOR Bitwise or
MPI_LXOR Logical xor
MPI_BXOR Bitwise xor
MPI_MAXLOC Max and location
MPI_MINLOC Min and location
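  • MPI_MAXLOC and MPI_MINLOC reduce (value, index) pairs; a minimal added sketch (myLocalMax and myRank are illustrative) using the predefined MPI_DOUBLE_INT pair type:

struct {
    double value;   /* the local value being compared */
    int    rank;    /* index carried along with it    */
} local, global;

local.value = myLocalMax;   /* hypothetical local maximum */
local.rank  = myRank;

/* After the call, global.value on the root holds the largest
   local.value and global.rank identifies the process that had it */
MPI_Reduce( &local, &global, 1, MPI_DOUBLE_INT,
            MPI_MAXLOC, 0, MPI_COMM_WORLD );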
16
Dot Product
  • The dot product of two vectors is defined as
  • x · y = x0y0 + x1y1 + x2y2 + ... + xn-1yn-1
  • Imagine having two vectors, each containing n elements, stored on p processors
  • Each processor will have N = n/p elements
  • Let's assume a block distribution of data, meaning
  • P0 has x0, x1, ..., xN-1 and y0, y1, ..., yN-1
  • P1 has xN, xN+1, ..., x2N-1 and yN, yN+1, ..., y2N-1

17
Serial_dot
float Serial_dot( float *x, float *y, int n )
{
    int i;
    float sum = 0.0;

    for ( i = 0; i < n; i++ )
        sum = sum + x[ i ] * y[ i ];

    return sum;
}
18
Parallel_dot
float Parallel_dot( float *local_x, float *local_y, int local_n )
{
    float local_dot;
    float dot = 0.0;

    local_dot = Serial_dot( local_x, local_y, local_n );
    MPI_Reduce( &local_dot, &dot, 1, MPI_FLOAT,
                MPI_SUM, 0, MPI_COMM_WORLD );

    return dot;  /* only process 0 will have the result */
}
19
MPI_Allreduce
  • Note that MPI_Reduce() leaves the result only in the root process
  • What if you wanted the result everywhere?
  • You could reduce and then broadcast
  • MPI_Allreduce() does both in one call; consider the modified reduction below
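  • A minimal sketch of Parallel_dot rewritten with MPI_Allreduce (the deck's own version is not in this transcript; the function name Parallel_dot_all is illustrative):

float Parallel_dot_all( float *local_x, float *local_y, int local_n )
{
    float local_dot;
    float dot = 0.0;

    local_dot = Serial_dot( local_x, local_y, local_n );

    /* Same arguments as MPI_Reduce, minus the root: every
       process receives the reduced value */
    MPI_Allreduce( &local_dot, &dot, 1, MPI_FLOAT,
                   MPI_SUM, MPI_COMM_WORLD );

    return dot;  /* every process has the result */
}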