Title: Introduction to Parallel Computing
1. Introduction to Parallel Computing
2. What is MPI?
- Message Passing Interface (MPI) is a standardised interface; several implementations of this interface exist.
- The MPI standard specifies three forms of subroutine interfaces:
  - Language-independent notation
  - Fortran notation
  - C notation
3. MPI Features
- MPI implementations provide:
  - Abstraction of the hardware implementation
  - Synchronous communication
  - Asynchronous communication
  - File operations
  - Time measurement operations
4. Implementations
5. Programming with MPI
- What is the difference between programming using the traditional approach and the MPI approach?
  - Use of the MPI library
  - Compiling
  - Running
6. Compiling (1)
- Once a program is written, compiling it is done a little differently from the normal situation. Although the details differ between MPI implementations, there are two frequently used approaches.
7. Compiling (2)
- First approach: link the MPI library explicitly
  gcc myprogram.c -o myexecutable -lmpi
- Second approach: use the compiler wrapper provided by the implementation
  mpicc myprogram.c -o myexecutable
8. Running (1)
- To run an MPI-enabled application, we generally use the command mpirun:
  mpirun -np x myexecutable <parameters>
- Here x is the number of processes to use, and <parameters> are the arguments to the executable, if any.
9. Running (2)
- The mpirun program takes care of creating the processes on the selected processors.
- By default, mpirun decides which processors to use; this is usually determined by a global configuration file.
- It is possible to specify processors explicitly, but the specification may only be used as a hint.
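- Many implementations also let you suggest the machines to use through a machine file; the exact option name differs per implementation, so the -machinefile flag below is an assumption to check against your local documentation.
  mpirun -np 4 -machinefile machines.txt myexecutable
- Here machines.txt would simply list one host name per line.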
10. MPI Programming (1)
- Implementations of MPI support Fortran, C, or both. Here we only consider programming using the C library. The first step in writing a program using MPI is to include the correct header:
  #include "mpi.h"
11. MPI Programming (2)
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      MPI_Init (&argc, &argv);
      MPI_Finalize ();
      return 0;
  }
12. MPI_Init
- int MPI_Init (int *argc, char ***argv)
- The MPI_Init procedure should be called before any other MPI procedure (except MPI_Initialized). It must be called exactly once, at program initialisation. It removes the arguments that are used by MPI from the argument array.
13. MPI_Finalize
- int MPI_Finalize (void)
- This routine cleans up all MPI state. It should be the last MPI routine called in a program; no other MPI routine may be called after MPI_Finalize. Pending communication should be finished before finalisation.
14. Using multiple processes
- When running an MPI-enabled program with multiple processes, each process runs an identical copy of the program, so there must be a way to find out which process we are. This situation is comparable to programming with the fork statement. MPI defines two subroutines that can be used for this.
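- A minimal sketch of this single-program style (MPI_Comm_rank is introduced on the following slides): every process runs the same code and branches on its rank.
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int rank;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      if (rank == 0)
          printf ("I am the master process\n");   /* exactly one process takes this branch */
      else
          printf ("I am worker %d\n", rank);      /* all other processes take this branch  */
      MPI_Finalize ();
      return 0;
  }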
15. MPI_Comm_size
- int MPI_Comm_size (MPI_Comm comm, int *size)
- This call returns the number of processes involved in a communicator. To find out how many processes are used in total, call this function with the predefined global communicator MPI_COMM_WORLD.
16. MPI_Comm_rank
- int MPI_Comm_rank (MPI_Comm comm, int *rank)
- This procedure determines the rank (index) of the calling process in the communicator. Each process is assigned a unique number within a communicator.
17. MPI_COMM_WORLD
- MPI communicators are used to specify which processes a communication applies to. A communicator is shared by a group of processes. The predefined communicator MPI_COMM_WORLD applies to all processes. Communicators can be duplicated, created and deleted; for most applications, MPI_COMM_WORLD suffices.
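- As an illustration of creating a communicator, a minimal sketch using MPI_Comm_split; the colour value rank % 2 is an arbitrary choice that splits MPI_COMM_WORLD into two groups.
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int worldRank, subRank;
      MPI_Comm subComm;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &worldRank);
      /* Processes with the same colour end up in the same new communicator */
      MPI_Comm_split (MPI_COMM_WORLD, worldRank % 2, worldRank, &subComm);
      MPI_Comm_rank (subComm, &subRank);
      printf ("World rank %d has rank %d in its sub-communicator\n", worldRank, subRank);
      MPI_Comm_free (&subComm);
      MPI_Finalize ();
      return 0;
  }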
18. Example: Hello World!
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int size, rank;
      MPI_Init (&argc, &argv);
      MPI_Comm_size (MPI_COMM_WORLD, &size);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      printf ("Hello world! from processor (%d/%d)\n", rank + 1, size);
      MPI_Finalize ();
      return 0;
  }
19. Running Hello World!
- mpicc -o hello hello.c
- mpirun -np 3 hello
- Hello world! from processor (1/3)
- Hello world! from processor (2/3)
- Hello world! from processor (3/3)
20. MPI_Send
- int MPI_Send (void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
- Synchronously sends a message to dest. The data is found in buf, which contains count elements of datatype. To identify the send, a tag has to be specified. The destination dest is the process rank in communicator comm.
21. MPI_Recv
- int MPI_Recv (void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
- Synchronously receives a message from source. The buffer must be able to hold count elements of datatype. The status field is filled with status information. MPI_Recv and MPI_Send calls should match: equal tag, count and datatype.
22. Datatypes
- MPI_CHAR            signed char
- MPI_SHORT           signed short int
- MPI_INT             signed int
- MPI_LONG            signed long int
- MPI_UNSIGNED_CHAR   unsigned char
- MPI_UNSIGNED_SHORT  unsigned short int
- MPI_UNSIGNED        unsigned int
- MPI_UNSIGNED_LONG   unsigned long int
- MPI_FLOAT           float
- MPI_DOUBLE          double
- MPI_LONG_DOUBLE     long double
- (http://www-jics.cs.utk.edu/MPI/MPIguide/MPIguide.html)
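- The datatype argument tells MPI how to interpret the buffer. A minimal sketch (array size and tag chosen arbitrarily) that sends ten doubles from process 1 to process 0; run it with at least two processes.
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      double values[10];
      int i, rank;
      MPI_Status s;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      if (rank == 1)
      {
          for (i = 0; i < 10; i++) values[i] = i * 0.5;
          MPI_Send ((void *)values, 10, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
      }
      else if (rank == 0)
      {
          MPI_Recv ((void *)values, 10, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &s);
          printf ("Last value received: %f\n", values[9]);
      }
      MPI_Finalize ();
      return 0;
  }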
23. Example: send / receive
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      MPI_Status s;
      int size, rank, i, j;
      MPI_Init (&argc, &argv);
      MPI_Comm_size (MPI_COMM_WORLD, &size);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      if (rank == 0)    // Master process
      {
          printf ("Receiving data . . .\n");
          for (i = 1; i < size; i++)
          {
              MPI_Recv ((void *)&j, 1, MPI_INT, i, 0xACE5, MPI_COMM_WORLD, &s);
              printf ("%d sent %d\n", i, j);
          }
      }
      else              // Slaves send the square of their rank (matches the output on the next slide)
      {
          j = rank * rank;
          MPI_Send ((void *)&j, 1, MPI_INT, 0, 0xACE5, MPI_COMM_WORLD);
      }
      MPI_Finalize ();
      return 0;
  }
24. Running send / receive
- mpicc -o sendrecv sendrecv.c
- mpirun -np 4 sendrecv
- Receiving data . . .
- 1 sent 1
- 2 sent 4
- 3 sent 9
25. MPI_Bcast
- int MPI_Bcast (void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
- Synchronously broadcasts a message from root to all processes in communicator comm (including itself). The buffer is used as the source in the root process and as the destination in the others.
26. MPI_Barrier
- int MPI_Barrier (MPI_Comm comm)
- Blocks until all processes defined in comm have reached this routine. Use this routine to synchronise processes.
27. Example: broadcast / barrier
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int rank, i;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      if (rank == 0) i = 27;
      MPI_Bcast ((void *)&i, 1, MPI_INT, 0, MPI_COMM_WORLD);
      printf ("%d: i = %d\n", rank, i);
      // Wait for every process to reach this code
      MPI_Barrier (MPI_COMM_WORLD);
      MPI_Finalize ();
      return 0;
  }
28. Running broadcast / barrier
- mpicc -o broadcast broadcast.c
- mpirun -np 3 broadcast
- 0: i = 27
- 1: i = 27
- 2: i = 27
29. MPI_Sendrecv
- int MPI_Sendrecv (void *sendbuf, int sendcount, MPI_Datatype sendtype, int dest, int sendtag, void *recvbuf, int recvcount, MPI_Datatype recvtype, int source, int recvtag, MPI_Comm comm, MPI_Status *status)
- int MPI_Sendrecv_replace (void *buf, int count, MPI_Datatype datatype, int dest, int sendtag, int source, int recvtag, MPI_Comm comm, MPI_Status *status)
- Combined send and receive; the second variant uses a single buffer for both.
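- Neither routine is used elsewhere in these slides; as a minimal sketch, a ring shift with MPI_Sendrecv_replace (the neighbour calculation and tag value are my own choice).
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int rank, size, value, left, right;
      MPI_Status s;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      MPI_Comm_size (MPI_COMM_WORLD, &size);
      right = (rank + 1) % size;           /* destination of our value */
      left  = (rank - 1 + size) % size;    /* source of the new value  */
      value = rank;
      /* Send to the right neighbour and receive from the left one, reusing the same buffer */
      MPI_Sendrecv_replace ((void *)&value, 1, MPI_INT, right, 0, left, 0, MPI_COMM_WORLD, &s);
      printf ("%d received %d\n", rank, value);
      MPI_Finalize ();
      return 0;
  }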
30. Other useful routines
- MPI_Scatter
- MPI_Gather (see the sketch below)
- MPI_Type_vector
- MPI_Type_commit
- MPI_Reduce / MPI_Allreduce
- MPI_Op_create
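- Of these, MPI_Gather is the mirror of MPI_Scatter. A minimal sketch in which every process contributes its rank and the root collects them all (the MAXPROCS bound is an assumption of this sketch).
  #include <stdio.h>
  #include "mpi.h"

  #define MAXPROCS 64   /* assumed upper bound on the number of processes */

  int main (int argc, char *argv[])
  {
      int rank, size, ranks[MAXPROCS], i;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      MPI_Comm_size (MPI_COMM_WORLD, &size);
      /* Every process sends one int; the root (0) receives one int from each */
      MPI_Gather ((void *)&rank, 1, MPI_INT, (void *)ranks, 1, MPI_INT, 0, MPI_COMM_WORLD);
      if (rank == 0)
          for (i = 0; i < size; i++)
              printf ("Slot %d holds rank %d\n", i, ranks[i]);
      MPI_Finalize ();
      return 0;
  }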
31. Example: scatter / reduce
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int data[] = { 1, 2, 3, 4, 5, 6, 7 };   // Size must be >= number of processors
      int rank, i = -1, j = -1;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      MPI_Scatter ((void *)data, 1, MPI_INT,
                   (void *)&i, 1, MPI_INT,
                   0, MPI_COMM_WORLD);
      printf ("%d: Received i = %d\n", rank, i);
      MPI_Reduce ((void *)&i, (void *)&j, 1, MPI_INT,
                  MPI_PROD, 0, MPI_COMM_WORLD);
      printf ("%d: j = %d\n", rank, j);
      MPI_Finalize ();
      return 0;
  }
32. Running scatter / reduce
- mpicc -o scatterreduce scatterreduce.c
- mpirun -np 4 scatterreduce
- 0: Received i = 1
- 0: j = 24
- 1: Received i = 2
- 1: j = -1
- 2: Received i = 3
- 2: j = -1
- 3: Received i = 4
- 3: j = -1
33. Some reduce operations
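- Commonly used predefined operations include MPI_SUM, MPI_PROD, MPI_MAX and MPI_MIN. A minimal sketch of a sum over all ranks using MPI_Allreduce, which delivers the result to every process.
  #include <stdio.h>
  #include "mpi.h"

  int main (int argc, char *argv[])
  {
      int rank, sum;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      /* Every process contributes its rank; every process receives the total */
      MPI_Allreduce ((void *)&rank, (void *)&sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
      printf ("%d: sum of all ranks = %d\n", rank, sum);
      MPI_Finalize ();
      return 0;
  }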
34. Measuring running time
  double timeStart, timeEnd;
  ...
  timeStart = MPI_Wtime ();
  // Code to measure time for goes here.
  timeEnd = MPI_Wtime ();
  ...
  printf ("Running time: %f seconds\n", timeEnd - timeStart);
35. Parallel sorting (1)
- Sorting a sequence of numbers using the binary sort method. This method divides a given sequence into two halves (until only one element remains) and sorts both halves recursively. The two halves are then merged together to form a sorted sequence.
36. Binary sort pseudo-code
  sorted-sequence BinarySort (sequence)
  {
      if (number of elements in sequence > 1)
      {
          seqA = first half of sequence
          seqB = second half of sequence
          BinarySort (seqA)
          BinarySort (seqB)
          sorted-sequence = merge (seqA, seqB)
      }
      else
          sorted-sequence = sequence
  }
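- A sequential C sketch of this method (my own translation of the pseudo-code, using a temporary buffer for the merge and the example elements from the next slide):
  #include <stdio.h>
  #include <string.h>

  /* Sort a[lo..hi-1] by recursively sorting both halves and merging them */
  static void BinarySort (int a[], int tmp[], int lo, int hi)
  {
      int mid, i, j, k;
      if (hi - lo <= 1)
          return;                       /* one element: already sorted */
      mid = (lo + hi) / 2;
      BinarySort (a, tmp, lo, mid);     /* first half  */
      BinarySort (a, tmp, mid, hi);     /* second half */
      /* Merge the two sorted halves into tmp */
      i = lo; j = mid; k = lo;
      while (i < mid && j < hi)
          tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
      while (i < mid) tmp[k++] = a[i++];
      while (j < hi)  tmp[k++] = a[j++];
      memcpy (&a[lo], &tmp[lo], (hi - lo) * sizeof (int));
  }

  int main (void)
  {
      int data[8] = { 1, 7, 8, 4, 5, 6, 2, 3 }, tmp[8], i;
      BinarySort (data, tmp, 0, 8);
      for (i = 0; i < 8; i++) printf ("%d ", data[i]);
      printf ("\n");
      return 0;
  }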
37. Merge two sorted sequences
- (Figure: two sorted sequences are merged element by element into one sorted sequence; the example uses the elements 1, 7, 8, 4, 5, 6, 2, 3.)
38. Example binary sort
39. Parallel sorting (2)
- This way of dividing work and gathering the results lends itself naturally to a parallel implementation. Divide the work in two and give one half to each of two processors. Have each of these processors divide their work again, until either the data cannot be split any further or no more processors are available.
40. Implementation problems
- The number of processors may not be a power of two
- The number of elements may not be a power of two
- How to achieve an even workload?
- The data size may be less than the number of processors
41. Parallel matrix multiplication
- We use the following partitioning of the data (p = 4):
- (Figure: the matrices are partitioned into parts P1, P2, P3 and P4, one part per process.)
42. Implementation
- Master (process 0) reads data
- Master sends size of data to slaves
- Slaves allocate memory
- Master broadcasts second matrix to all other processes
- Master sends respective parts of first matrix to all other processes
- Every process performs its local multiplication
- All slave processes send back their result (see the sketch below)
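- A simplified sketch of these steps: it uses MPI_Bcast, MPI_Scatter and MPI_Gather instead of explicit sends and receives, and assumes the matrix size N is divisible by the number of processes.
  #include <stdio.h>
  #include <stdlib.h>
  #include "mpi.h"

  #define N 4   /* assumed matrix size for this sketch */

  int main (int argc, char *argv[])
  {
      double A[N][N], B[N][N], C[N][N];
      double *rowsA, *rowsC;
      int rank, size, rows, i, j, k;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      MPI_Comm_size (MPI_COMM_WORLD, &size);
      rows = N / size;                                 /* rows of A handled per process */
      rowsA = malloc (rows * N * sizeof (double));
      rowsC = malloc (rows * N * sizeof (double));
      if (rank == 0)                                   /* master fills the matrices */
          for (i = 0; i < N; i++)
              for (j = 0; j < N; j++) { A[i][j] = i + j; B[i][j] = (i == j); }
      /* Second matrix goes to everyone, first matrix is split row-wise */
      MPI_Bcast ((void *)B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
      MPI_Scatter ((void *)A, rows * N, MPI_DOUBLE,
                   (void *)rowsA, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
      /* Local multiplication of the assigned rows */
      for (i = 0; i < rows; i++)
          for (j = 0; j < N; j++)
          {
              rowsC[i * N + j] = 0.0;
              for (k = 0; k < N; k++)
                  rowsC[i * N + j] += rowsA[i * N + k] * B[k][j];
          }
      MPI_Gather ((void *)rowsC, rows * N, MPI_DOUBLE,
                  (void *)C, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
      if (rank == 0)
          printf ("C[0][0] = %f\n", C[0][0]);
      free (rowsA); free (rowsC);
      MPI_Finalize ();
      return 0;
  }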
43. Multiplication 1000 x 1000
44. Multiplication 5000 x 5000
45. Gaussian elimination
- We use the following partitioning of the data (p = 4):
- (Figure: the matrix rows are partitioned into parts P1, P2, P3 and P4, one part per process.)
46. Implementation (1)
- Master reads both matrices
- Master sends size of matrices to slaves
- Slaves calculate their part and allocate memory
- Master sends each slave its respective part
- Set sweeping row to 0 in all processes
- Sweep matrix (see next sheet)
- Slaves send back their result
47. Implementation (2)
- While the sweeping row is not past the final row, do:
  - Have every process decide whether it owns the current sweeping row
  - The owner sends a copy of the row to every other process
  - All processes sweep their part of the matrix using the current row
  - The sweeping row is incremented (a sketch of this loop follows below)
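- A hedged sketch of this sweep loop only: the example system, the block-row ownership and the absence of pivoting are simplifying assumptions; distributing the data and back substitution are omitted, and the number of processes is assumed to divide N.
  #include <stdio.h>
  #include "mpi.h"

  #define N 4                      /* assumed system size for this sketch */

  int main (int argc, char *argv[])
  {
      double a[N][N + 1];          /* augmented matrix, built on every process for simplicity */
      double pivotRow[N + 1];
      int rank, size, rows, first, last, r, i, j;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      MPI_Comm_size (MPI_COMM_WORLD, &size);
      rows = N / size;
      first = rank * rows;         /* this process sweeps rows first..last-1 */
      last  = first + rows;
      /* Every process builds the same example system (a real program would distribute it) */
      for (i = 0; i < N; i++)
          for (j = 0; j <= N; j++)
              a[i][j] = (i == j) ? 2.0 : 1.0;
      for (r = 0; r < N; r++)
      {
          int owner = r / rows;                    /* which process owns the sweeping row */
          if (rank == owner)                       /* the owner provides a copy of that row */
              for (j = 0; j <= N; j++) pivotRow[j] = a[r][j];
          MPI_Bcast ((void *)pivotRow, N + 1, MPI_DOUBLE, owner, MPI_COMM_WORLD);
          /* Eliminate the current column from the locally owned rows below r */
          for (i = (first > r + 1 ? first : r + 1); i < last; i++)
          {
              double factor = a[i][r] / pivotRow[r];
              for (j = r; j <= N; j++)
                  a[i][j] -= factor * pivotRow[j];
          }
      }
      if (rank == size - 1)
          printf ("Last diagonal element after sweeping: %f\n", a[N - 1][N - 1]);
      MPI_Finalize ();
      return 0;
  }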
48. Programming hints
- Keep it simple!
- Avoid deadlocks
- Write robust code, even at the cost of speed
- Design in advance; debugging is more difficult (printing output is different)
- Error handling requires synchronisation; you can't just exit the program
49. References (1)
- MPI Forum Home Page
  - http://www.mpi-forum.org/index.html
- Beginner's guide to MPI (see also /MPI/)
  - http://www-jics.cs.utk.edu/MPI/MPIguide/MPIguide.html
- MPICH
  - http://www-unix.mcs.anl.gov/mpi/mpich/
50. References (2)
- Miscellaneous
  - http://www.erc.msstate.edu/labs/hpcl/projects/mpi/
  - http://nexus.cs.usfca.edu/mpi/
  - http://www-unix.mcs.anl.gov/gropp/
  - http://www.epm.ornl.gov/walker/mpitutorial/
  - http://www.lam-mpi.org/
  - http://epcc.ed.ac.uk/chimp/
  - http://www-unix.mcs.anl.gov/mpi/www/www3/
51. Thank you for coming!