1
MPI and OpenMP
  • Kevin Leung
  • Nathan Liang
  • Paul Maynard

2
What is MPI?
  • The Message Passing Interface (MPI) standard is a
    library of functions that can be called from C,
    C++, or Fortran
  • MPI is a programming paradigm used widely on
    parallel computers, e.g.
  • Scalable Parallel Computers (SPCs) with
    distributed memory,
  • Networks of Workstations (NOWs)
  • It was developed by a broadly based committee of
    vendors, implementers, and users

3
Why was MPI introduced?
  • Motivation for Parallel System
  • Hardware limits on single CPUs
  • Commodity computing
  • Problem
  • Coordinating use of multiple CPUs
  • Solution
  • Message passing
  • Problem
  • Proprietary Systems, Lack of Portability
  • Solution
  • MPI Consortium started in 1992

4
Goals of MPI
  • Design an application programming interface.
  • Allow efficient communication.
  • Allow data to be passed between processes in a
    distributed-memory environment.
  • Allow for implementations that can be used in a
    heterogeneous environment.
  • Allow convenient C and Fortran 77 bindings for
    the interface.
  • Provide a reliable communication interface; the
    user need not cope with communication failures.
  • Define an interface not too different from
    current practice (e.g. NX, PVM) that provides
    extensions allowing greater flexibility.
  • Define an interface that can be implemented on
    many vendors' platforms.

5
Versions of MPI
  • The original MPI standard was created by the
    Message Passing Interface Forum (MPIF).
  • The public release of version 1.0 of MPI was made
    in June 1994.
  • The MPIF began meeting again in March 1995.
  • Version 1.1 of the standard was released in June
    1995.
  • Since July 1997, the original MPI has been referred
    to as MPI-1 and the new effort is called MPI-2.

6
What is Included in the MPI Standard?
  • Bindings for Fortran 77 and C
  • Point-to-point communication
  • Collective operations
  • Process groups
  • Communication domains
  • Process topologies
  • Environmental Management and inquiry
  • Profiling interface

7
Language Binding
  • All MPI names have an MPI_ prefix,
  • In Fortran 77, all characters are upper case
  • In C, constants are in all capital letters, and
    defined types and functions have one capital
    letter after the prefix
  • Programs must not declare variables or functions
    with names beginning with the prefix MPI_ or
    PMPI_
  • The definitions of named constants, function
    prototypes, and type definitions must be supplied
    in the include files mpi.h (C) and mpif.h (Fortran)

8
MPI Functions
  • MPI is large (there are 128 MPI routines)
  • 6 basic functions
  • MPI_INIT(argc, argv)
  • Initiate an MPI computation.
  • MPI_FINALIZE()
  • Shut down a computation.
  • MPI_COMM_SIZE(comm, size)
  • Determine the number of processes in a
    computation.
  • MPI_COMM_RANK(comm, pid)
  • Determine the identifier (rank) of the current
    process.
  • MPI_SEND(buf, count, datatype, dest, tag, comm)
  • Send a message.
  • MPI_RECV(buf, count, datatype, source, tag,
    comm, status)
  • Receive a message.

9
How to use MPI
  • Include the MPI header file
  • e.g. #include <mpi.h>
  • Initialize the MPI environment
  • Write the code
  • Finalize the MPI environment
  • e.g. MPI_Finalize()

10
Hello World!!
include "mpi.h" int main(int argc, char
argv) int my_rank, p, source, dest, tag
0 char message100 MPI_Status
status MPI_Init(argc, argv)
MPI_Comm_rank(MPI_COMM_WORLD, my_rank)
MPI_Comm_size(MPI_COMM_WORLD, p) if
(my_rank ! 0) / Create message /
sprintf(message, Hello from process d!",
my_rank) dest 0 MPI_Send(message,
strlen(message)1, MPI_CHAR, dest, tag,
MPI_COMM_WORLD) else for(source 1
source lt p source) MPI_Recv(message,
100, MPI_CHAR, source, tag, MPI_COMM_WORLD,
status) printf("s", message)
MPI_Finalize()
11
Include File
(Program structure: Include → Initialize → Work → Terminate)
Include the MPI header files:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
12
Initialize MPI
Initialize the MPI environment:

int main(int argc, char *argv[])
{
    int numtasks, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    ...
13
Initialize MPI (cont.)
MPI_Init(&argc, &argv)
    No MPI functions may be called before this call.
MPI_Comm_size(MPI_COMM_WORLD, &nump)
    A communicator is a collection of processes that can send
    messages to each other. MPI_COMM_WORLD is a predefined
    communicator that consists of all the processes running
    when program execution begins.
MPI_Comm_rank(MPI_COMM_WORLD, &myrank)
    Lets a process find out its rank.
14
Work with MPI
Work: make message-passing calls (Send, Receive), e.g.

if (my_rank != 0) {
    MPI_Send(data, strlen(data)+1, MPI_CHAR, dest, tag,
             MPI_COMM_WORLD);
} else {
    MPI_Recv(data, 100, MPI_CHAR, source, tag,
             MPI_COMM_WORLD, &status);
}
15
Terminate MPI environment
Terminate the MPI environment:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    ...
    MPI_Finalize();
}

No MPI functions may be called after this call.
16
Compile and Run MPI
  • Compile
  • gcc -o hello.exe mpi_hello.c -lmpi
  • or simply: mpicc mpi_hello.c
  • Run
  • mpirun -np 5 hello.exe
  • Output

mpirun -np 5 hello.exe
Hello from process 1!
Hello from process 2!
Hello from process 3!
Hello from process 4!
17
Implementation
  • MPI's advantage over older message passing
    libraries is that it is both portable (because
    MPI has been implemented for almost every
    distributed memory architecture) and fast
    (because each implementation is optimized for the
    hardware it runs on).

18
Kinds of Commands
  • Point to Point Communication
  • Collective Communication
  • User Defined Datatypes and Packing
  • Groups and Communicators
  • Process Topologies

19
Point to Point Communication
  • The basic communication mechanism
  • handle data transmission between any two
    processors
  • one sends the data and the other receives it

20
Example C code: process 0 sends a message to process 1.

char msg[20];
int myrank, tag = 99;
MPI_Status status;
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);   /* find my rank */
if (myrank == 0) {
    strcpy(msg, "Hello there");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);
} else if (myrank == 1) {
    MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
}
21
  • Blocking Communication
  • MPI_SEND and MPI_RECV
  • The send function blocks until process 0 can
    safely overwrite the contents of msg
  • The receive function blocks until the receive
    buffer actually contains the incoming message

22
Deadlock
  • Example
  • Solutions
  • Reorder the communications
  • Use the MPI_Sendrecv
  • Use non-blocking ISend or IRecv
  • Use the buffered mode BSend

  Deadlock example   Process 0: Recv(1); Send(1)      Process 1: Recv(0); Send(0)
  Reordered          Process 0: Send(1); Recv(1)      Process 1: Recv(0); Send(0)
  Sendrecv           Process 0: Sendrecv(1)           Process 1: Sendrecv(0)
  Nonblocking        Process 0: Isend(1); Irecv(1);   Process 1: Isend(0); Irecv(0);
                                Waitall                          Waitall
  Buffered           Process 0: Bsend(1); Recv(1)     Process 1: Bsend(0); Recv(0)
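To make the MPI_Sendrecv solution concrete, here is a minimal sketch
(an assumed example, not taken from the slides) in which two processes
each exchange one integer with the other; it assumes the job is run
with exactly two processes.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, other, sendval, recvval;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;                 /* assumes exactly 2 processes */
    sendval = rank;

    /* Combined send and receive; the library orders the transfers
       so that the exchange cannot deadlock. */
    MPI_Sendrecv(&sendval, 1, MPI_INT, other, 0,
                 &recvval, 1, MPI_INT, other, 0,
                 MPI_COMM_WORLD, &status);

    printf("Process %d received %d\n", rank, recvval);
    MPI_Finalize();
    return 0;
}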
23
  • Nonblocking Communication
  • MPI_ISEND and MPI_IRECV
  • The call returns immediately; the process does not
    wait for the communication to complete
  • Allows communication and computation to overlap
    (concurrency); see the sketch below
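A minimal sketch (an assumed example, not from the slides) of the
nonblocking style: each of two processes posts an MPI_Irecv and an
MPI_Isend, may overlap computation with the transfer, and then waits
for both requests.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, other, sendval, recvval;
    MPI_Request reqs[2];
    MPI_Status stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;                 /* assumes exactly 2 processes */
    sendval = rank * 100;

    MPI_Irecv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... useful computation could overlap the communication here ... */

    MPI_Waitall(2, reqs, stats);      /* both calls are now complete */
    printf("Process %d received %d\n", rank, recvval);
    MPI_Finalize();
    return 0;
}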

24
User Defined Datatypes and Packing
  • All MPI communication functions take a datatype
    argument. In the simplest case this will be a
    primitive type, such as an integer or
    floating-point number.
  • An important and powerful generalization results
    by allowing user-defined types wherever the
    primitive types can occur.
  • The user can define derived datatypes, that
    specify more general data layouts
  • A sending process can explicitly pack
    noncontiguous data (an array, a structure, etc.)
    into a contiguous buffer and then send it
  • A receiving process can unpack the contiguous
    buffer and store it as noncontiguous data
    (see the derived-datatype sketch below)
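As one possible illustration of a derived datatype, the sketch below
(an assumed example, not from the slides; it assumes at least two
processes) builds an MPI_Type_vector describing one column of a
row-major matrix and sends that noncontiguous column as a single
message.

#include <mpi.h>
#include <stdio.h>

#define ROWS 4
#define COLS 4

int main(int argc, char *argv[])
{
    int rank, r;
    double a[ROWS][COLS], col[ROWS];
    MPI_Datatype column_t;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* ROWS blocks of 1 double, separated by a stride of COLS doubles:
       this describes one column of the row-major matrix a. */
    MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column_t);
    MPI_Type_commit(&column_t);

    if (rank == 0) {
        for (r = 0; r < ROWS; r++)
            a[r][2] = 10.0 * r;                  /* fill column 2 */
        MPI_Send(&a[0][2], 1, column_t, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* receive the noncontiguous column into a contiguous array */
        MPI_Recv(col, ROWS, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        for (r = 0; r < ROWS; r++)
            printf("col[%d] = %f\n", r, col[r]);
    }

    MPI_Type_free(&column_t);
    MPI_Finalize();
    return 0;
}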

25
Collective Communication
  • Collective communications transmit data among all
    processes in a group
  • Barrier Synchronization
  • MPI_Barrier synchronizes all processes in the
    communicator calling this function
  • Data movement (see the sketch below)
  • Broadcast from 1 -> all
  • Gather data from all -> 1
  • Scatter data from 1 -> all
  • All gather
  • All to all
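A minimal sketch (an assumed example) of two of the data-movement
operations: MPI_Bcast sends a value from one process to all, and
MPI_Gather collects one value from every process at the root. It
assumes the job runs with at most 64 processes (the MAXP bound is an
arbitrary choice for the sketch).

#include <mpi.h>
#include <stdio.h>

#define MAXP 64                       /* assumed upper bound on processes */

int main(int argc, char *argv[])
{
    int rank, size, n = 0, local, all[MAXP];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) n = 42;                        /* only root knows n   */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD); /* broadcast: 1 -> all */

    local = n + rank;                             /* per-process result  */
    MPI_Gather(&local, 1, MPI_INT, all, 1, MPI_INT, 0,
               MPI_COMM_WORLD);                   /* gather: all -> 1    */

    if (rank == 0) {
        int i;
        for (i = 0; i < size; i++)
            printf("value from process %d: %d\n", i, all[i]);
    }
    MPI_Finalize();
    return 0;
}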

26
Groups and Communicators
  • Division of processes
  • MPI_COMM_WORLD
  • MPI_COMM_SIZE
  • MPI_COMM_RANK
  • Avoiding Message Conflicts between Modules.
  • Expand the functionality of the message passing
    system
  • Safety
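To make the idea of separate communicators concrete, here is a minimal
sketch (an assumed example, not from the slides) that uses
MPI_Comm_split to divide MPI_COMM_WORLD into even-rank and odd-rank
groups, each with its own communicator; messages in one group cannot
conflict with messages in the other.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int world_rank, sub_rank, color;
    MPI_Comm subcomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    color = world_rank % 2;            /* 0 = even group, 1 = odd group */
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &subcomm);

    MPI_Comm_rank(subcomm, &sub_rank);
    printf("world rank %d has rank %d in group %d\n",
           world_rank, sub_rank, color);

    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}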

27
Process Topologies
  • The processes (ranks) of a group can be arranged in
    topological patterns such as two- or
    three-dimensional grids
  • A topology can provide a convenient naming
    mechanism for the processes of a group (within a
    communicator), and additionally, may assist the
    runtime system in mapping the processes onto
    hardware.

Figure: relationship between ranks and Cartesian coordinates for a
3x4 2D topology; the upper number in each box is the rank of the
process and the lower value is its (row, column) coordinates.
Figure: overlapping topologies; the upper values in each process are
the rank / (row, col) in the original 2D topology and the lower
values are the same for the shifted 2D topology.
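A minimal sketch (an assumed example) of how a 3x4 Cartesian topology
like the one in the figure could be created and queried with
MPI_Cart_create and MPI_Cart_coords; it assumes the job is run with at
least 12 processes (extra ranks receive MPI_COMM_NULL).

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, coords[2];
    int dims[2]    = {3, 4};       /* 3 rows x 4 columns           */
    int periods[2] = {0, 0};       /* no wrap-around in either dim */
    MPI_Comm grid;

    MPI_Init(&argc, &argv);
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &grid);

    if (grid != MPI_COMM_NULL) {   /* ranks beyond the 3x4 grid get no grid */
        MPI_Comm_rank(grid, &rank);
        MPI_Cart_coords(grid, rank, 2, coords);
        printf("rank %d -> (row %d, col %d)\n", rank, coords[0], coords[1]);
        MPI_Comm_free(&grid);
    }
    MPI_Finalize();
    return 0;
}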
28
OpenMP
  • Paul Maynard

29
What is it?
  • What does openMP stand for?
  • Open specifications for Multi Processing
  • It is an API with three main components
  • Compiler directives
  • Library routines
  • Variables
  • Used for writing multithreaded programs

30
What do you need?
  • What programming languages?
  • C/C++
  • FORTRAN (77, 90, 95)
  • What operating systems?
  • UNIX
  • Windows NT
  • Can I compile openMP code with gcc?
  • No, it requires a compiler with OpenMP support

31
Some compilers for openMP
  • SGI MIPSpro
  • Fortran, C, C++
  • IBM XL
  • C/C++ and Fortran
  • Sun Studio 10
  • Fortran 95, C, and C++
  • Portland Group Compilers and Tools
  • Fortran, C, and C++
  • Absoft Pro FortranMP
  • Fortran, C, and C++
  • PathScale
  • Fortran

32
What it does
  • Program starts off with a master thread
  • It runs for some amount of time
  • When the master thread reaches a region where the
    work can be done concurrently
  • It creates several threads
  • They all do work in this region
  • When the end of the region is reached
  • All the threads terminate
  • Except for the master thread
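The fork-join behaviour described above can be seen in a minimal
sketch like the following (an assumed example, not from the slides):

#include <omp.h>
#include <stdio.h>

int main(void)
{
    printf("master thread starts alone\n");

    #pragma omp parallel               /* fork: create a team of threads */
    {
        printf("hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }                                  /* join: team ends, master continues */

    printf("master thread finishes alone\n");
    return 0;
}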

33
Example
  • I get a job moving boxes
  • When I go to work I bring several friends
  • Who help me move the boxes
  • On pay day
  • I don't bring any friends and I get all the money

34
OpenMP directives
  • Format example
  • #pragma omp parallel for shared(y)
  • Always starts with
  • #pragma omp
  • Then the directive name
  • parallel for
  • Followed by an optional clause
  • shared(y)
  • At the end, a newline

35
Directives list
  • PARALLEL
  • The enclosed code is executed by multiple threads
  • DO/for
  • Iterations of the do or for loop are divided among
    the team's threads and executed in parallel
  • SECTIONS
  • The enclosed sections are divided among the
    threads; each section is executed by one thread
  • SINGLE
  • Only to be executed by one thread
  • PARALLEL DO/for
  • A parallel region that contains a single DO/for
    loop
  • PARALLEL SECTIONS
  • A parallel region that contains a single SECTIONS
    construct (see the sketch below)
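A minimal sketch (an assumed example, not from the slides) combining
the work-sharing directives listed above inside one parallel region:

#include <omp.h>
#include <stdio.h>

#define N 8

int main(void)
{
    int i, a[N];

    #pragma omp parallel
    {
        /* loop iterations are divided among the threads */
        #pragma omp for
        for (i = 0; i < N; i++)
            a[i] = i * i;

        /* each section is executed by one thread */
        #pragma omp sections
        {
            #pragma omp section
            printf("section 1 done by thread %d\n", omp_get_thread_num());
            #pragma omp section
            printf("section 2 done by thread %d\n", omp_get_thread_num());
        }

        /* only one thread executes this block */
        #pragma omp single
        printf("single block done by thread %d\n", omp_get_thread_num());
    }
    return 0;
}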

36
Work Sharing
37
Work Sharing
38
Work Sharing
39
Data scope attribute clauses
  • PRIVATE
  • Each thread has its own independent copy of the
    listed variables
  • SHARED
  • The listed variables are shared by all threads
  • DEFAULT
  • Sets the default scope for all variables in the
    block
  • FIRSTPRIVATE
  • PRIVATE, with each copy initialized from the
    original variable
  • LASTPRIVATE
  • PRIVATE, with the value from the last loop
    iteration (or section) copied back to the original
    object
  • COPYIN
  • Copies the master thread's value of a THREADPRIVATE
    variable into each thread's copy
  • REDUCTION
  • Combines the private copies of a variable (e.g. by
    summing) into a single shared result
    (see the sketch below)
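A minimal sketch (an assumed example) of the PRIVATE and REDUCTION
clauses on a parallel loop:

#include <stdio.h>

#define N 100

int main(void)
{
    int i;
    double tmp, sum = 0.0;

    #pragma omp parallel for private(tmp) reduction(+:sum)
    for (i = 0; i < N; i++) {
        tmp = i * 0.5;        /* private: one tmp per thread       */
        sum += tmp;           /* reduction: private sums combined  */
    }

    printf("sum = %f\n", sum);
    return 0;
}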

40
Directives and clauses
41
Synchronization
  • MASTER
  • Only the master thread can execute this block
  • CRITICAL
  • Only one thread can execute this block at a time
  • BARRIER
  • Causes each thread to wait at this point until all
    of the threads reach it
  • ATOMIC
  • The memory location is updated by one thread at a
    time
  • FLUSH
  • Forces a consistent view of memory across threads
  • ORDERED
  • The enclosed loop iterations are executed in the
    same order as in a serial loop
    (see the sketch below)
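A minimal sketch (an assumed example) of ATOMIC, BARRIER, and CRITICAL
working together:

#include <omp.h>
#include <stdio.h>

int main(void)
{
    int counter = 0;

    #pragma omp parallel
    {
        #pragma omp atomic           /* one thread updates at a time      */
        counter++;

        #pragma omp barrier          /* wait until every thread got here  */

        #pragma omp critical         /* only one thread prints at a time  */
        printf("thread %d sees counter = %d\n",
               omp_get_thread_num(), counter);
    }
    return 0;
}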

42
Environment Variables
  • OMP_SCHEDULE
  • How iterations of runtime-scheduled loops are
    divided among threads (schedule type and chunk
    size)
  • OMP_NUM_THREADS
  • The default number of threads
  • OMP_DYNAMIC
  • Whether dynamic adjustment of the number of threads
    is allowed
  • OMP_NESTED
  • Whether nested parallelism is allowed

43
Library Routines
  • OMP_SET_NUM_THREADS
  • OMP_GET_NUM_THREADS
  • OMP_GET_MAX_THREADS
  • OMP_GET_THREAD_NUM
  • OMP_GET_NUM_PROCS
  • OMP_IN_PARALLEL
  • OMP_SET_DYNAMIC
  • OMP_GET_DYNAMIC
  • OMP_SET_NESTED
  • OMP_GET_NESTED
  • OMP_INIT_LOCK
  • OMP_DESTROY_LOCK
  • OMP_SET_LOCK
  • OMP_UNSET_LOCK
  • OMP_TEST_LOCK
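A minimal sketch (an assumed example) using a few of these routines,
including an OpenMP lock to protect a shared total:

#include <omp.h>
#include <stdio.h>

int main(void)
{
    int total = 0;
    omp_lock_t lock;

    omp_set_num_threads(4);           /* request 4 threads            */
    omp_init_lock(&lock);

    #pragma omp parallel
    {
        omp_set_lock(&lock);          /* only one thread at a time    */
        total += omp_get_thread_num();
        omp_unset_lock(&lock);
    }

    printf("ran with up to %d threads, total = %d\n",
           omp_get_max_threads(), total);

    omp_destroy_lock(&lock);
    return 0;
}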

44
Example: http://beowulf.lcs.mit.edu/18.337/beowulf.html
#include <math.h>
#include <stdio.h>
#define N 16384
#define M 10

double dotproduct(int, double *);

double dotproduct(int i, double *x)
{
    double temp = 0.0, denom;
    int j;
    for (j = 0; j < N; j++) {
        // zero based!!
        denom = (i+j)*(i+j+1)/2 + i+1;
        temp = temp + x[j]*(1/denom);
    }
    return temp;
}

int main()
{
    double *x = new double[N];
    double *y = new double[N];
    double eig = sqrt(N);
    int i, k;

    for (i = 0; i < N; i++) x[i] = 1/eig;

    for (k = 0; k < M; k++) {
        for (i = 0; i < N; i++) y[i] = 0;   // compute y = Ax
        #pragma omp parallel for shared(y)
        for (i = 0; i < N; i++) y[i] = dotproduct(i, x);

        // find largest eigenvalue of y
        eig = 0;
        for (i = 0; i < N; i++) eig = eig + y[i]*y[i];
        eig = sqrt(eig);
        printf("The largest eigenvalue after %2d iteration is %16.15e\n",
               k+1, eig);

        // normalize
        for (i = 0; i < N; i++) x[i] = y[i]/eig;
    }
    return 0;
}

45
References
  • http://beowulf.lcs.mit.edu/18.337/beowulf.html
  • http://www.compunity.org/resources/compilers/index.php
  • http://www.llnl.gov/computing/tutorials/workshops/workshop/openMP/MAIN.html#ClausesDirectives