1
Message Passing Interface
  • Outline
  • Introduction to Message passing library (MPI)
  • Basics of MPI implementation (blocking
    communication)
  • Basic input and output data
  • Basic nonblocking communication

2
Introduction
  • Basic concept of message passing
  • Most commonly used method of programming in
    distributed-memory MIMD systems
  • In message passing, the processes coordinate
    their activities by explicitly sending and
    receiving messages

3
Message Passing Interface
4
Introduction to MPI
  • Message Passing Interface (MPI)
  • A commonly used message passing library with
    static process allocation (the number of
    processes is set at the beginning of program
    execution, and no additional processes are
    created during execution)
  • Each process is assigned a unique integer rank
    in the range 0, 1, ..., p-1 (p is the total
    number of processes defined)
  • Basically, one can write a single program and
    execute it on different processes (SPMD)


5
Introduction to MPI
  • Message Passing Interface (MPI)
  • Selective execution is based on conditional
    branches within the source code.
  • Buffering in communication
  • Blocking and non-blocking communication


6
Introduction to MPI
  • A parallel computing utility library of
    subroutines/functions, not an independent
    language
  • MPI subroutines and functions can be called
    from Fortran and C, respectively
  • Compiled with FORTRAN or C compilers
  • MPI-1 doesn't support F90, but MPI-2 supports
    F90 and C++


7
Introduction to MPI (cont.)
  • Why do people use MPI?
  • to speed up computation
  • big demand for CPU time and memory
  • more portable and scalable than relying on an
    automatic "parallelizer", which might not work
  • good for distributed-memory computers, such as
    distributed clusters, network-based computers,
    or workstations


8
Introduction to MPI (cont.)
  • Why are people afraid of MPI?
  • more complicated than serial computing
  • more complicated to master the technique
  • synchronization can be lost
  • amount of time required to convert serial code
    to parallelized code


9
Introduction to MPI (cont.)
  • Alternative ways?
  • data parallel model using a high-level language
    such as HPF
  • advanced libraries (or interfaces), such as the
    Portable, Extensible Toolkit for Scientific
    Computation (PETSc)
  • Java multithreaded computing on internet-based
    distributed computation


10
Basics of MPI
  • The MPI header (library) file should be included
    in the user's FORTRAN or C code. The header file
    contains definitions of constants and prototypes.


include "mpif.h" for FORTRAN code include
"mpi.h" for C code
11
Basics of MPI
  • MPI is initiated by calling MPI_Init() before
    any other MPI subroutine or function is invoked.
  • MPI processing ends with a call to
    MPI_Finalize(), as in the sketch below.


12
Basics of MPI
  • The only difference between MPI subroutines
    (for FORTRAN) and MPI functions (for C) is the
    error reporting flag.
  • In FORTRAN, it is returned as the last member of
    the subroutine's argument list. In C, the integer
    error flag is returned through the function
    return value.


13
Basics of MPI
  • Consequently, MPI FORTRAN subroutines always
    contain one more argument in the argument list
    than their C counterparts, as the comparison
    below shows.


14
Basics of MPI (cont.)
  • C MPI function names start with MPI_ followed by
    a character string whose leading character is
    upper case while the rest are lower case
  • FORTRAN subroutines bear the same names but are
    case-insensitive.
  • On SGI's Origin2000 (NCSA), parallel I/O is
    supported.


15
Compilation and Execution (f77)
  • To compile and execute a f77 (or f90) code
    without MPI

f77 -o example example.f
f90 -o example example.f

/bin/time example
or
time example
16
  • To compile and execute a f77 (or f90) code with
    MPI

f77 -o example1_1 example1_1.f -lmpi
g77 -o example1_1 example1_1.f -lmpi
f90 -o example1_1 example1_1.f -lmpi
mpif77 -o example1_1 example1_1.f    (our cluster)
mpif90 -o example1_1 example1_1.f    (our cluster)

/bin/time mpirun -np 4 example1_1
time mpirun -np 4 example1_1
17
  • To compile and execute a C code without MPI

gcc -o exampleC exampleC.c -lm
or
cc -o exampleC exampleC.c -lm

exampleC
18
  • To compile and execute a C code with MPI

cc -o exampleC1_1 exampleC1_1.c -lm -lmpi
gcc -o exampleC1_1 exampleC1_1.c -lm -lmpi
mpicc exampleC1_1.c    (our cluster)

Execution:
/bin/time mpirun -np 10 exampleC1_1
time mpirun -np 10 exampleC1_1
19
Basic communication among processes
  • Example 0: basic communication between processes
  • p processes, ranked from 0 to p-1
  • process 0 receives messages from the other
    processes

[Diagram: processes 1, 2, and 3 each send a message to process 0]
20
Learning MPI by Examples
  • Example 0: mechanism
  • the system copies the executable code to each
    process
  • each process begins execution of the copied
    executable code, simultaneously
  • different processes can execute different
    statements by branching within the program based
    on their ranks (this form of MIMD programming is
    called single-program multiple-data (SPMD)
    programming)

21
/* greetings.c -- greetings program
 *
 * Send a message from all processes with rank != 0 to
 * process 0.  Process 0 prints the messages received.
 *
 * Input: none.
 * Output: contents of messages received by process 0.
 */
#include <stdio.h>
#include <string.h>
#include "mpi.h"        /* include MPI library */
22
/* Passing command-line parameters to the main function */
main(int argc, char* argv[]) {
    int         my_rank;       /* rank of process           */
    int         p;             /* number of processes       */
    int         source;        /* rank of sender            */
    int         dest;          /* rank of receiver          */
    int         tag = 0;       /* tag for messages          */
    char        message[100];  /* storage for message       */
    MPI_Status  status;        /* return status for receive */

    /* Start up MPI */
    MPI_Init(&argc, &argv);
23
    /* Find out (obtain) the process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    printf("my_rank is %d\n", my_rank);

    /* Find out the number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    printf("p, the total number of processes = %d\n", p);

    if (my_rank != 0) {
        /* other processes, but not process 0 */
        /* Create message */
        sprintf(message, "Greetings from process %d!", my_rank);
        dest = 0;   /* destination to which the message is sent */
24
        /* Use strlen+1 so that '\0' gets transmitted */
        MPI_Send(message, strlen(message)+1, MPI_CHAR,
                 dest, tag, MPI_COMM_WORLD);
    } else {
        /* my_rank == 0, process 0 */
        for (source = 1; source < p; source++) {
            MPI_Recv(message, 100, MPI_CHAR, source, tag,
                     MPI_COMM_WORLD, &status);
            printf("%s\n", message);
        }
    }
25
Learning MPI by Examples
    /* Shut down MPI */
    MPI_Finalize();
}   /* main */

Commands:
mpicc greetings.c
mpirun -np 8 a.out
26
Result:
mpicc greetings.c
mpirun -np 8 a.out

my_rank is 3
p, the total number of processes = 8
my_rank is 4
p, the total number of processes = 8
my_rank is 0
p, the total number of processes = 8
my_rank is 1
p, the total number of processes = 8
Greetings from process 1!
my_rank is 2
27
p, the total number of processes = 8
my_rank is 7
p, the total number of processes = 8
Greetings from process 2!
Greetings from process 3!
my_rank is 5
p, the total number of processes = 8
Greetings from process 4!
Greetings from process 5!
my_rank is 6
p, the total number of processes = 8
Greetings from process 6!
Greetings from process 7!
28
  • Example 0 (in Fortran)

c  greetings.f -- greetings program
c
c  Send a message from all processes with rank != 0 to
c  process 0.  Process 0 prints the messages received.
c
c  Input: none.
c  Output: contents of messages received by process 0.
c
c  Note: Due to the differences between character data in
c  Fortran and char in C, there may be problems in
c  MPI_Send/MPI_Recv
c
29
      program greetings
c
      include 'mpif.h'
c
      integer my_rank
      integer p
      integer source
      integer dest
      integer tag
      character*100 message
      character*10 digit_string
      integer size
      integer status(MPI_STATUS_SIZE)
      integer ierr
c
30
c  function
      integer string_len
c
      call MPI_Init(ierr)
c
      call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, p, ierr)
c
      if (my_rank.ne.0) then
          call to_string(my_rank, digit_string, size)
          message = 'Greetings from process ' //
     +        digit_string(1:size) // '!'
          dest = 0
          tag = 0
          call MPI_Send(message, string_len(message),
     +        MPI_CHARACTER, dest, tag, MPI_COMM_WORLD, ierr)
      else
31
          do 200 source = 1, p-1
              tag = 0
              call MPI_Recv(message, 100, MPI_CHARACTER, source,
     +            tag, MPI_COMM_WORLD, status, ierr)
              call MPI_Get_count(status, MPI_CHARACTER, size, ierr)
              write(6,100) message(1:size)
 100          format(' ',a)
 200      continue
      endif
c
      call MPI_Finalize(ierr)
      stop
      end
c
32
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
c  Converts the integer stored in number into an ascii
c  string.  The string is returned in string.  The number
c  of digits is returned in size.
      subroutine to_string(number, string, size)
      integer number
      character*(*) string
      integer size

      character*100 temp
      integer local
      integer last_digit
      integer i

      local = number
      i = 0
33
c  strip digits off, starting with the least significant
c  do-while loop
 100  last_digit = mod(local,10)
      local = local/10
      i = i + 1
      temp(i:i) = char(last_digit + ichar('0'))
      if (local.ne.0) go to 100
      size = i
c  reverse digits
      do 200 i = 1, size
          string(size-i+1:size-i+1) = temp(i:i)
 200  continue
c
      return
      end
34
c  end of to_string
c
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
c  Finds the number of characters stored in a string
c
      integer function string_len(string)
      character*(*) string
c
      character*1 space
      parameter (space = ' ')
      integer i
c
      i = len(string)
35
c  while loop
 100  if ((string(i:i).eq.space).and.(i.gt.1)) then
          i = i - 1
          go to 100
      endif
c
      if ((i.eq.1).and.(string(i:i).eq.space)) then
          string_len = 0
      else
          string_len = i
      endif
c
      return
      end
c  end of string_len
36
mpif77 greetings.f
mpirun -np 8 a.out
37
  • It is not necessary for MPI_Init to be the first
    statement of your code; it only has to be called
    before any other MPI routine.
  • Likewise, MPI_Finalize need not be the last
    statement of your code; no MPI call may come
    after it.
  • The MPI section should be inserted only where you
    need the code to run in parallel, as sketched
    below.

38
Numerical Integration
  • Example 1: numerical integration using the
    mid-point method
  • mathematical problem
  • numerical method
  • serial programming and parallel programming

39
  • Problem
  • Test: integration of cos(x) from 0 to π/2
    (the exact value is worked out below)
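The exact value is easy to check by hand, so a correct parallel result should approach 1.0:

    ∫ cos(x) dx from 0 to π/2  =  sin(π/2) - sin(0)  =  1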

41
Example of C serial program
/* serial.c -- serial version of trapezoidal rule
 *
 * Calculate definite integral using trapezoidal rule.
 * The function f(x) is hardwired.
 * Input: a, b, n.
 * Output: estimate of integral from a to b of f(x)
 *         using n trapezoids.
 *
 * See Chapter 4, pp. 53 & ff. in PPMPI.
 */
#include <stdio.h>
42
main() {
    float integral;     /* Store result in integral   */
    float a, b;         /* Left and right endpoints   */
    int   n;            /* Number of trapezoids       */
    float h;            /* Trapezoid base width       */
    float x;
    int   i;

    float f(float x);   /* Function we're integrating */

    printf("Enter a, b, and n\n");
    scanf("%f %f %d", &a, &b, &n);
43
    h = (b-a)/n;
    integral = (f(a) + f(b))/2.0;
    x = a;
    for (i = 1; i <= n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;

    printf("With n = %d trapezoids, our estimate\n", n);
    printf("of the integral from %f to %f = %f\n",
           a, b, integral);
}
44
float f(float x) {
    float return_val;
    /* Calculate f(x).  Store calculation in return_val. */
    return_val = x*x;
    return return_val;
}
45
Example of serial code in Fortran
C  serial.f -- calculate definite integral using
C  trapezoidal rule.
C
C  The function f(x) is hardwired.
C  Input: a, b, n.
C  Output: estimate of integral from a to b of f(x)
C          using n trapezoids.
C
C  See Chapter 4, pp. 53 & ff. in PPMPI.
C
      PROGRAM serial
      INCLUDE 'mpif.h'
      real integral
      real a
      real b
46
      integer n
      real h
      real x
      integer i
C
      real f
C
      print *, 'Enter a, b, and n'
      read *, a, b, n
C
      h = (b-a)/n
      integral = (f(a) + f(b))/2.0
      x = a
      do 100 i = 1, n-1
          x = x + h
          integral = integral + f(x)
 100  continue
47
      integral = integral*h
C
      print *, 'With n = ', n, ' trapezoids, our estimate'
      print *, 'of the integral from ', a, ' to ', b,
     +         ' = ', integral
      end
C
C
      real function f(x)
      real x
C  Calculate f(x).  Store calculation in f.
C
      f = x*x
      return
      end
48
  • To compile and execute serial.f
  • Result

g77 -o serial serial.f
time serial

The result = 1.000000
real 0.021   user 0.002   sys 0.013
49
  • Parallel programming with MPI: blocking
    Send/Receive
  • implementation-dependent, because the inputs are
    assigned inside the code
  • Using the following MPI functions
  • MPI_Init and MPI_Finalize
  • MPI_Comm_rank
  • MPI_Comm_size
  • MPI_Recv
  • MPI_Send

50
  • Parallel programming with MPI: blocking
    Send/Receive
  • the master process receives each partial result,
    based on subinterval integration, from the other
    processes
  • the master sums all of the sub-results together
  • the other processes are idle during the master's
    work (due to blocking communication)

51
Example of parallel programming in C (trap.c)
/* trap.c -- Parallel Trapezoidal Rule, first version
 *
 * Input: None.
 * Output: Estimate of the integral from a to b of f(x)
 *         using the trapezoidal rule and n trapezoids.
 *
 * Algorithm:
 *    1. Each process calculates "its" interval of
 *       integration.
 *    2. Each process estimates the integral of f(x)
 *       over its interval using the trapezoidal rule.
 *    3a. Each process != 0 sends its integral to process 0.
 *    3b. Process 0 sums the calculations received from
 *        the individual processes and prints the result.
52
 * Notes:
 *    1. f(x), a, b, and n are all hardwired.
 *    2. The number of processes (p) should evenly divide
 *       the number of trapezoids (n = 1024).
 *
 * See Chap. 4, pp. 56 & ff. in PPMPI.
 */
#include <stdio.h>

/* We'll be using MPI routines, definitions, etc. */
#include "mpi.h"
53
main(int argc, char* argv[]) {
    int   my_rank;      /* My process rank           */
    int   p;            /* The number of processes   */
    float a = 0.0;      /* Left endpoint             */
    float b = 1.0;      /* Right endpoint            */
    int   n = 1024;     /* Number of trapezoids      */
    float h;            /* Trapezoid base length     */
    /* local_a and local_b are the bounds for the
       integration performed in each individual process */
    float local_a;      /* Left endpoint my process  */
    float local_b;      /* Right endpoint my process */
    int   local_n;      /* Number of trapezoids for  */
                        /* my calculation            */
    float integral;     /* Integral over my interval */
54
    float total;        /* Total integral            */
    int   source;       /* Process sending integral  */
    int   dest = 0;     /* All messages go to 0      */
    int   tag = 0;
    MPI_Status status;

    /* Trap function prototype.  Trap is used to
       calculate the local integral */
    float Trap(float local_a, float local_b,
               int local_n, float h);

    /* Let the system do what it needs to start up MPI */
    MPI_Init(&argc, &argv);

    /* Get my process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
55
    /* Find out how many processes are being used */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    h = (b-a)/n;        /* h is the same for all processes */
    local_n = n/p;      /* So is the number of trapezoids  */

    /* Length of each process' interval of integration
       = local_n*h.  So my interval starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);

    if (my_rank == 0) {
        /* Add up the integrals calculated by each process */
        total = integral;   /* this is the integral
                               calculated by process 0 */
56
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        printf("The integral calculated by process %d is %f\n",
               my_rank, integral);
        MPI_Send(&integral, 1, MPI_FLOAT, dest, tag,
                 MPI_COMM_WORLD);
    }
57
    /* Print the result */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n",
               a, b, total);
    }

    /* Shut down MPI */
    MPI_Finalize();
}   /* main */
58
float Trap(
          float local_a   /* in */,
          float local_b   /* in */,
          int   local_n   /* in */,
          float h         /* in */) {

    float integral;       /* Store result in integral */
    float x;
    int   i;

    float f(float x);     /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
    x = local_a;
59
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
}   /* Trap */

float f(float x) {
    float return_val;
    /* Calculate f(x). */
    /* Store calculation in return_val. */
    return_val = x*x;
    return return_val;
}   /* f */
60
  • To compile a C code with MPI library
  • In our cluster system

cc -o trap trap.c -lmpi -lm

mpicc trap.c
mpirun -np 8 a.out
61
  • Result

With n = 1024 trapezoids, our estimate
of the integral from 0.000000 to 1.000000 = 0.333333
The integral calculated by process 3 is 0.024089
The integral calculated by process 4 is 0.039714
The integral calculated by process 7 is 0.110026
The integral calculated by process 5 is 0.059245
The integral calculated by process 1 is 0.004557
The integral calculated by process 2 is 0.012370
The integral calculated by process 6 is 0.082682
62
  • Example of parallel programming in Fortran
    (trap.f)

c  trap.f -- Parallel Trapezoidal Rule, first version
c
c  Input: None.
c  Output: Estimate of the integral from a to b of f(x)
c          using the trapezoidal rule and n trapezoids.
c
c  Algorithm:
c     1. Each process calculates "its" interval of
c        integration.
c     2. Each process estimates the integral of f(x)
c        over its interval using the trapezoidal rule.
c     3a. Each process != 0 sends its integral to 0.
c     3b. Process 0 sums the calculations received from
63
c         the individual processes and prints the result.
c
c  Notes:
c     1. f(x), a, b, and n are all hardwired.
c     2. Assumes number of processes (p) evenly divides
c        number of trapezoids (n = 1024).
c
c  See Chap. 4, pp. 56 & ff. in PPMPI.
c
      program trapezoidal
c
      include 'mpif.h'
c
      integer my_rank
      integer p
      real a
64
      real b
      integer n
      real h
      real local_a
      real local_b
      integer local_n
      real integral
      real total
      integer source
      integer dest
      integer tag
      integer status(MPI_STATUS_SIZE)
      integer ierr
c
      real Trap
c
65
      data a, b, n, dest, tag /0.0, 1.0, 1024, 0, 0/

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, p, ierr)

      h = (b-a)/n
      local_n = n/p
      local_a = a + my_rank*local_n*h
      local_b = local_a + local_n*h
      integral = Trap(local_a, local_b, local_n, h)

      if (my_rank .EQ. 0) then
          total = integral
66
          do 100 source = 1, p-1
              call MPI_RECV(integral, 1, MPI_REAL, source, tag,
     +            MPI_COMM_WORLD, status, ierr)
              total = total + integral
 100      continue
      else
          call MPI_SEND(integral, 1, MPI_REAL, dest,
     +        tag, MPI_COMM_WORLD, ierr)
      endif

      if (my_rank .EQ. 0) then
          write(6,200) n
 200      format(' ','With n = ',I4,' trapezoids, our estimate')
          write(6,300) a, b, total
 300      format(' ','of the integral from ',f6.2,' to ',f6.2,
     +        ' = ',f11.5)
      endif
67
      call MPI_FINALIZE(ierr)
      end
c
c
      real function Trap(local_a, local_b, local_n, h)
      real local_a
      real local_b
      integer local_n
      real h
c
      real integral
      real x
      integer i
c
      real f
68
      integral = (f(local_a) + f(local_b))/2.0
      x = local_a
      do 100 i = 1, local_n-1
          x = x + h
          integral = integral + f(x)
 100  continue
      Trap = integral*h
      return
      end
c
      real function f(x)
      real x
      real return_val

      return_val = x*x
      f = return_val
      return
      end
69
  • Example of parallel programming in Fortran
    (trap.f)

With n = 1024 trapezoids, our estimate
of the integral from 0.00 to 1.00 = 0.33333

To compile an f77 code with the MPI library in our
cluster system:

f77 -o trap trap.f -lmpi

mpif77 trap.f
mpirun -np 8 trap
70
  • Basic mechanism of message passing through
    buffering
  • Compose a message and put it in a buffer
  • Drop the message in a "mailbox": this is what a
    call to MPI_Send does
  • The destination of the message must be specified
  • An envelope must be created; it carries the
    destination of the message and the size of the
    message, and the source process is added to the
    envelope as well
  • Tags (message types) are standard in message
    passing
  • The tag is used to identify what the receiving
    process should do with the data
71
  • The message envelope contains at least the
    following information:
  • The rank of the receiver
  • The rank of the sender
  • A tag, like a project identification
  • A communicator: a collection of processes that
    can send messages to each other. The predefined
    MPI_COMM_WORLD, available on every MPI system,
    consists of all the processes running when
    execution of the program starts.
  • The message refers to the actual data being
    transmitted
  • The status holds information on the data that
    was actually received

72
  • MPI datatypes (and the corresponding C types)
  • MPI_CHAR: signed char
  • MPI_SHORT: signed short int
  • MPI_INT: signed int
  • MPI_LONG: signed long int
  • MPI_UNSIGNED_CHAR: unsigned char
  • MPI_UNSIGNED_SHORT: unsigned short int
  • MPI_UNSIGNED: unsigned int
  • MPI_UNSIGNED_LONG: unsigned long int
  • MPI_FLOAT: float
  • MPI_DOUBLE: double
  • MPI_LONG_DOUBLE: long double
  • MPI_BYTE
  • MPI_PACKED

73
int MPI_Send(
        void*         message   /* in */,
        int           count     /* in */,
        MPI_Datatype  datatype  /* in */,
        int           dest      /* in */,
        int           tag       /* in */,
        MPI_Comm      comm      /* in */);

int MPI_Recv(
        void*         message   /* out */,
        int           count     /* in  */,
        MPI_Datatype  datatype  /* in  */,
        int           source    /* in  */,
        int           tag       /* in  */,
        MPI_Comm      comm      /* in  */,
        MPI_Status*   status    /* out */);
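For instance, a minimal matched pair (an illustrative sketch, not taken from the slides; my_rank is assumed to hold the caller's rank): the envelope fields of the send and the receive (destination/source rank, tag, communicator) must agree.

float x = 3.14;
MPI_Status status;

if (my_rank == 1)          /* process 1 sends one float ... */
    MPI_Send(&x, 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD);
else if (my_rank == 0)     /* ... and process 0 receives it */
    MPI_Recv(&x, 1, MPI_FLOAT, 1, 0, MPI_COMM_WORLD, &status);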
74
  • Parallel programming with MPI: non-blocking
    Send/Receive
  • does not leave processes idle
  • Uses the following MPI functions (see the sketch
    after this list):
  • MPI_Init and MPI_Finalize
  • MPI_Comm_rank
  • MPI_Comm_size
  • MPI_Recv
  • MPI_Isend
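The point of MPI_Isend is that the call returns immediately, so the sender can keep computing while the message is in transit. A sketch (do_other_work is a hypothetical placeholder, not part of the course examples; the other variables are as in the trapezoid code):

MPI_Request req;
MPI_Status  status;

MPI_Isend(&integral, 1, MPI_FLOAT, dest, tag,
          MPI_COMM_WORLD, &req);
do_other_work();          /* hypothetical: overlaps with the transfer */
MPI_Wait(&req, &status);  /* block only when completion is required   */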

75
  • Basic input and output in MPI
  • Global and local variables
  • Some variables are significant on all the
    processes
  • Some variables are significant on individual
    processes
  • I/O on a parallel system
  • Many parallel systems provide standard I/O
    (keyboard input and terminal output) on process 0
  • Some systems allow all the processes to read and
    write
  • How do we deal with input?

76
  • If we want to input values such as a, b, and n
    from the keyboard, should we simply add
  • scanf("%f %f %d", &a, &b, &n) on every process?
  • Usually we assume that only process 0 can read
    and write
  • Modified parallel code:

77
/* get_data.c -- Parallel Trapezoidal Rule; uses a basic
 *               Get_data function for input.
 *
 * Input: a, b: limits of integration.
 *        n: number of trapezoids.
 * Output: Estimate of the integral from a to b of f(x)
 *         using the trapezoidal rule and n trapezoids.
 *
 * Notes:
 *    1. f(x) is hardwired.
 *    2. Assumes number of processes (p) evenly divides
 *       number of trapezoids (n).
 *
 * See Chap. 4, pp. 60 & ff in PPMPI.
 */
78
#include <stdio.h>

/* We'll be using MPI routines, definitions, etc. */
#include "mpi.h"

main(int argc, char* argv[]) {
    int   my_rank;      /* My process rank           */
    int   p;            /* The number of processes   */
    float a;            /* Left endpoint             */
    float b;            /* Right endpoint            */
    int   n;            /* Number of trapezoids      */
    float h;            /* Trapezoid base length     */
    float local_a;      /* Left endpoint my process  */
    float local_b;      /* Right endpoint my process */
    int   local_n;      /* Number of trapezoids for  */
                        /* my calculation            */
79
    float integral;     /* Integral over my interval */
    float total;        /* Total integral            */
    int   source;       /* Process sending integral  */
    int   dest = 0;     /* All messages go to 0      */
    int   tag = 0;
    MPI_Status status;

    /* function prototypes */
    void Get_data(float* a_ptr, float* b_ptr, int* n_ptr,
                  int my_rank, int p);
    float Trap(float local_a, float local_b, int local_n,
               float h);    /* Calculate local integral */

    /* Let the system do what it needs to start up MPI */
    MPI_Init(&argc, &argv);
80
    /* Get my process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Find out how many processes are being used */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    Get_data(&a, &b, &n, my_rank, p);

    h = (b-a)/n;        /* h is the same for all processes */
    local_n = n/p;      /* So is the number of trapezoids  */

    /* Length of each process' interval of integration
       = local_n*h.  So my interval starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);
81
    /* Add up the integrals calculated by each process */
    if (my_rank == 0) {
        total = integral;
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        MPI_Send(&integral, 1, MPI_FLOAT, dest, tag,
                 MPI_COMM_WORLD);
    }
82
    /* Print the result */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n",
               a, b, total);
    }

    /* Shut down MPI */
    MPI_Finalize();
}   /* main */

/*------------------------------------------------------------
 * Function Get_data
 * Reads in the user input a, b, and n.
 * Input parameters:
83
 *     1. int my_rank: rank of current process.
 *     2. int p: number of processes.
 * Output parameters:
 *     1. float* a_ptr: pointer to left endpoint a.
 *     2. float* b_ptr: pointer to right endpoint b.
 *     3. int* n_ptr: pointer to number of trapezoids.
 * Algorithm:
 *     1. Process 0 prompts user for input and
 *        reads in the values.
 *     2. Process 0 sends input values to other processes.
 */
void Get_data(
         float* a_ptr   /* out */,
         float* b_ptr   /* out */,
         int*   n_ptr   /* out */,
84
         int my_rank    /* in */,
         int p          /* in */) {

    int source = 0;     /* All local variables used by */
    int dest;           /* MPI_Send and MPI_Recv       */
    int tag;
    MPI_Status status;

    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%f %f %d", a_ptr, b_ptr, n_ptr);
        for (dest = 1; dest < p; dest++) {
            tag = 0;
85
            MPI_Send(a_ptr, 1, MPI_FLOAT, dest, tag,
                     MPI_COMM_WORLD);
            tag = 1;
            MPI_Send(b_ptr, 1, MPI_FLOAT, dest, tag,
                     MPI_COMM_WORLD);
            tag = 2;
            MPI_Send(n_ptr, 1, MPI_INT, dest, tag,
                     MPI_COMM_WORLD);
        }
    } else {
        tag = 0;
        MPI_Recv(a_ptr, 1, MPI_FLOAT, source, tag,
                 MPI_COMM_WORLD, &status);
        tag = 1;
86
        MPI_Recv(b_ptr, 1, MPI_FLOAT, source, tag,
                 MPI_COMM_WORLD, &status);
        tag = 2;
        MPI_Recv(n_ptr, 1, MPI_INT, source, tag,
                 MPI_COMM_WORLD, &status);
    }
}   /* Get_data */
/*------------------------------------------------------------*/
float Trap(
          float local_a   /* in */,
          float local_b   /* in */,
          int   local_n   /* in */,
          float h         /* in */) {
87
    float integral;       /* Store result in integral */
    float x;
    int   i;

    float f(float x);     /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
    x = local_a;
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
}   /* Trap */
88
/*------------------------------------------------------------*/
float f(float x) {
    float return_val;
    /* Calculate f(x). */
    /* Store calculation in return_val. */
    return_val = x*x;
    return return_val;
}   /* f */
89
Enter a, b, and n
0 1.0 1024
With n = 1024 trapezoids, our estimate
of the integral from 0.000000 to 1.000000 = 0.333333
90
  • Non-blocking Send/Receive

/* get_dataNonBlocking.c -- Parallel Trapezoidal Rule; uses a
 *         basic Get_data function for input.  It uses
 *         non-blocking MPI functions.
 *
 * Input: a, b: limits of integration.
 *        n: number of trapezoids.
 * Output: Estimate of the integral from a to b of f(x)
 *         using the trapezoidal rule and n trapezoids.
 *
 * Notes:
91
 *    1. f(x) is hardwired.
 *    2. Assumes number of processes (p) evenly divides
 *       number of trapezoids (n).
 *
 * See Chap. 4, pp. 60 & ff in PPMPI.
 */
#include <stdio.h>

/* We'll be using MPI routines, definitions, etc. */
#include "mpi.h"

main(int argc, char* argv[]) {
    int   my_rank;      /* My process rank           */
    int   p;            /* The number of processes   */
    float a;            /* Left endpoint             */
92
    float b;            /* Right endpoint            */
    int   n;            /* Number of trapezoids      */
    float h;            /* Trapezoid base length     */
    float local_a;      /* Left endpoint my process  */
    float local_b;      /* Right endpoint my process */
    int   local_n;      /* Number of trapezoids for  */
                        /* my calculation            */
    float integral;     /* Integral over my interval */
    float total;        /* Total integral            */
    int   source;       /* Process sending integral  */
    int   dest = 0;     /* All messages go to 0      */
    int   tag = 0;
    MPI_Status  status;
    MPI_Request req;
93
    /* function prototypes */
    void Get_data(float* a_ptr, float* b_ptr, int* n_ptr,
                  int my_rank, int p);
    float Trap(float local_a, float local_b, int local_n,
               float h);    /* Calculate local integral */

    /* Let the system do what it needs to start up MPI */
    MPI_Init(&argc, &argv);

    /* Get my process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Find out how many processes are being used */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    Get_data(&a, &b, &n, my_rank, p);
94
    h = (b-a)/n;        /* h is the same for all processes */
    local_n = n/p;      /* So is the number of trapezoids  */

    /* Length of each process' interval of integration
       = local_n*h.  So my interval starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);

    /* Add up the integrals calculated by each process */
    if (my_rank == 0) {
        total = integral;
95
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        MPI_Isend(&integral, 1, MPI_FLOAT, dest, tag,
                  MPI_COMM_WORLD, &req);
        MPI_Wait(&req, &status);
    }

    /* Print the result */
    if (my_rank == 0) {
96
printf("With n d trapezoids, our
estimate\n", n) printf("of the integral
from f to f f\n", a, b, total)
/ Shut down MPI /
MPI_Finalize() / main / /
/ / Function
Get_data Reads in the user input a, b, and n.
Input parameters 1. int my_rank rank
of current process. 2. int p number of
processes.
97
 * Output parameters:
 *     1. float* a_ptr: pointer to left endpoint a.
 *     2. float* b_ptr: pointer to right endpoint b.
 *     3. int* n_ptr: pointer to number of trapezoids.
 * Algorithm:
 *     1. Process 0 prompts user for input and reads in
 *        the values.
 *     2. Process 0 sends input values to other processes.
 */
void Get_data(
         float* a_ptr    /* out */,
         float* b_ptr    /* out */,
         int*   n_ptr    /* out */,
         int    my_rank  /* in  */,
         int    p        /* in  */) {
98
    int source = 0;     /* All local variables used by */
    int dest;           /* MPI_Send and MPI_Recv       */
    int tag;
    MPI_Status status;

    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%f %f %d", a_ptr, b_ptr, n_ptr);
        for (dest = 1; dest < p; dest++) {
            tag = 0;
            MPI_Send(a_ptr, 1, MPI_FLOAT, dest, tag,
                     MPI_COMM_WORLD);
            tag = 1;
99
            MPI_Send(b_ptr, 1, MPI_FLOAT, dest, tag,
                     MPI_COMM_WORLD);
            tag = 2;
            MPI_Send(n_ptr, 1, MPI_INT, dest, tag,
                     MPI_COMM_WORLD);
        }
    } else {
        tag = 0;
        MPI_Recv(a_ptr, 1, MPI_FLOAT, source, tag,
                 MPI_COMM_WORLD, &status);
        tag = 1;
        MPI_Recv(b_ptr, 1, MPI_FLOAT, source, tag,
                 MPI_COMM_WORLD, &status);
        tag = 2;
        MPI_Recv(n_ptr, 1, MPI_INT, source, tag,
                 MPI_COMM_WORLD, &status);
    }
}   /* Get_data */
100
/*------------------------------------------------------------*/
float Trap(
          float local_a   /* in */,
          float local_b   /* in */,
          int   local_n   /* in */,
          float h         /* in */) {

    float integral;       /* Store result in integral */
    float x;
    int   i;

    float f(float x);     /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
101
    x = local_a;
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
}   /* Trap */

/*------------------------------------------------------------*/
float f(float x) {
    float return_val;
    /* Calculate f(x). */
    /* Store calculation in return_val. */
    return_val = x*x;
    return return_val;
}   /* f */
102
  • Non-blocking Send/Receive (in Fortran)

      Program Example1_2
c#########################################################
c
c  example1_2.f
c  parallel programming in Fortran
c  to solve numerical integration using mid-point method
c  function selected is cos(x)
c  it demonstrates non-blocking communication
c
c  This is an MPI example on parallel integration
103
c  It demonstrates the use of:
c
c     MPI_Init
c     MPI_Comm_rank
c     MPI_Comm_size
c     MPI_Recv
c     MPI_Isend
c     MPI_Wait
c     MPI_Finalize
c
      implicit none
      integer n, p, i, j, k, ierr, master
      real h, a, b, integral, pi
      integer req(1)
104
include "mpif.h" !! This brings in
pre-defined MPI constants, ... integer Iam,
source, dest, tag, status(MPI_STATUS_SIZE)
real my_result, Total_result, result data
master/0/ cStarts MPI processes ...
call MPI_Init(ierr)
!! starts MPI call MPI_Comm_rank(MPI_COMM_WO
RLD, Iam, ierr)
!! get current proc id
call MPI_Comm_size(MPI_COMM_WORLD, p, ierr)

!! get number of procs pi acos(-1.0)
!! 3.14159... a 0.0 !!
lower limit of integration b pi/2.
!! upper limit of integration
105
      n = 500           !! number of increments within each process
      dest = master     !! the process that computes the final result
      tag = 123         !! tag to identify this particular job
      h = (b-a)/n/p     !! length of increment

      my_result = integral(a,Iam,h,n)
      write(*,*)'Iam',Iam,', my_result',my_result

      if(Iam .eq. master) then    ! the following is serial
          result = my_result
          do k=1,p-1      !! more efficient, less prone to deadlock
c             root receives my_result from any proc
              call MPI_Recv(my_result, 1, MPI_REAL,
     +            MPI_ANY_SOURCE, tag, MPI_COMM_WORLD,
     +            status, ierr)
              result = result + my_result
          enddo
106
      else
c         send my_result to the intended dest.
          call MPI_Isend(my_result, 1, MPI_REAL, dest, tag,
     +        MPI_COMM_WORLD, req, ierr)
          call MPI_Wait(req, status, ierr)  !! wait for nonblocking send ...
      endif

c**results from all procs have been collected and summed ...
      if(Iam .eq. 0) then
          write(*,*)'Final Result =', result
      endif

      call MPI_Finalize(ierr)   !! let MPI finish up ...
      stop
      end
107
      real function integral(a,i,h,n)
      implicit none
      integer n, i, j
      real h, h2, aij, a
      real fct, x

      fct(x) = cos(x)     !! kernel of the integral
      integral = 0.0      !! initialize integral
      h2 = h/2.
      do j=0,n-1          !! sum over all "j" increments
          aij = a + (i*n + j)*h     !! lower limit of increment "j"
          integral = integral + fct(aij+h2)*h
      enddo
      return
      end
108
Result:

Process 6 has the partial result of 0.056906
Process 1 has the partial result of 0.187593
Process 0 has the partial result of 0.195090
Process 2 has the partial result of 0.172887
Process 3 has the partial result of 0.151536
Process 4 has the partial result of 0.124363
Process 5 has the partial result of 0.092410
Process 7 has the partial result of 0.019215
The result = 0.9999998