Title: Programming in AMPI
1. Programming in AMPI
- Laxmikant Kale
- CS320
- Spring 2003
- kale@cs.uiuc.edu
- http://charm.cs.uiuc.edu
- Parallel Programming Laboratory
- Department of Computer Science
- University of Illinois at Urbana-Champaign
2. Virtualization: Object-based Parallelization
- The user is concerned only with the interaction between objects (virtual processors, VPs).
[Figure: user view of interacting objects]
3. Object Arrays
- A collection of data-driven objects
- With a single global name for the collection
- Each member addressed by an index
- sparse 1D, 2D, 3D, tree, string, ...
- Mapping of element objects to processors handled by the system
[Figure: user's view of the array as elements A[0], A[1], A[2], A[3], ...]
4. Object Arrays (continued)
[Figure: the same user's view, plus the system view, in which elements such as A[0] and A[3] are placed on processors by the runtime]
6. Adaptive MPI
- A migration path for legacy MPI codes
- AMPI = MPI + virtualization
- Uses Charm++ object arrays and migratable threads
- Minimal modifications to convert existing MPI programs
- Conversion can be automated via AMPizer, based on the Polaris compiler framework
- Bindings for C, C++, and Fortran 90
7. AMPI
8. AMPI
- Implemented as virtual processors (user-level migratable threads)
9. Writing AMPI programs
- Same as MPI programming, except: do not use global variables.
- Why? All the virtual processors (threads) mapped to a physical processor share one address space, so a global variable would be shared among them.
- How to (see the sketch after this list):
- Move all global variables into a module in f90 (called globals, for example), or into a class or struct in C/C++
- Dynamically allocate one instance of globals
- Rewrite any reference to a (former) global variable x as globals%x (or globals->x in C++)
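A minimal Fortran 90 sketch of this transformation; the names globals_mod, globals_type, and the former globals n and x are invented for illustration:

MODULE globals_mod
  TYPE, PUBLIC :: globals_type
    INTEGER :: n               ! was a global INTEGER
    REAL(KIND=8) :: x          ! was a global REAL
  END TYPE globals_type
END MODULE globals_mod

SUBROUTINE do_work(globals)
  USE globals_mod
  IMPLICIT NONE
  TYPE(globals_type) :: globals
  globals%x = globals%x + globals%n    ! was:  x = x + n
END SUBROUTINE do_work

PROGRAM example
  USE globals_mod
  IMPLICIT NONE
  TYPE(globals_type), POINTER :: globals
  ALLOCATE(globals)            ! one private instance per virtual processor
  globals%n = 10
  globals%x = 0.0d0
  CALL do_work(globals)
  DEALLOCATE(globals)
END PROGRAM example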
10. Handling global variables in AMPI
- More refined methods for getting rid of global variables, especially when converting existing MPI programs
- Classes of global variables:
- Really local (live only inside subroutines)
- Read-only
- Truly global, but short-lived: move into functions and pass as parameters (see the sketch after this list)
- Others: move into one or more dynamically allocated modules
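A hedged sketch of the "truly global, but short-lived" case; the names scale_row, row, and scale are invented, and the point is only that the value travels as an argument instead of as a global:

SUBROUTINE scale_row(row, n, scale)
  IMPLICIT NONE
  INTEGER :: n, i
  REAL(KIND=8) :: row(n), scale      ! scale arrives as an argument, not a global
  DO i = 1, n
    row(i) = row(i) * scale
  END DO
END SUBROUTINE scale_row

PROGRAM pass_as_parameter
  IMPLICIT NONE
  REAL(KIND=8) :: row(4) = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0 /)
  CALL scale_row(row, 4, 0.5d0)      ! was:  scale = 0.5; call scale_row(row, 4)
  PRINT *, row
END PROGRAM pass_as_parameter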
11. Load Balancing
- Call the load balancer periodically (see the sketch below)
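For example, inside the iteration loop of the Jacobi example later in these slides, one might add the following fragment. This is a hedged sketch: it assumes the AMPI migration call of this era, MPI_Migrate() with no arguments (newer releases name it AMPI_Migrate), and the period of 50 iterations is arbitrary.

  ! at the end of each iteration of the DO iter loop:
  IF (MOD(iter, 50) .EQ. 0) THEN     ! every 50 iterations, for example
    CALL MPI_Migrate()               ! collective: every virtual processor must call it
  END IF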
12. Example code
- 2D Jacobi relaxation with 1D decomposition
- Repeat until convergence
- Note the use of modules to eliminate global variables
- Note that the program is a legal MPI program as well as a legal AMPI program
13.
MODULE chunkModule
  TYPE, PUBLIC :: chunk_type
    INTEGER :: h, w
    REAL(KIND=8), dimension(:,:), pointer :: T
    REAL(KIND=8), dimension(:,:), pointer :: Tnew
  END TYPE chunk_type
END MODULE chunkModule
14.
PROGRAM MAIN
  USE mpi
  USE chunkModule
  IMPLICIT NONE
  INTEGER :: i, j, iter, left, right, niter
  INTEGER :: tag, tagLeft, tagRight
  INTEGER, DIMENSION(MPI_STATUS_SIZE) :: status
  DOUBLE PRECISION :: error, tval, maxerr, starttime, endtime
  TYPE(chunk_type) :: chunk
  INTEGER :: thisIndex, ierr, nblocks, pupidx
  ! (continued on the next slide)
15.
  CALL MPI_Init(ierr)
  CALL MPI_Comm_rank(MPI_COMM_WORLD, thisIndex, ierr)
  CALL MPI_Comm_size(MPI_COMM_WORLD, nblocks, ierr)
  h = 100
  w = 10
  allocate(T(h, w+2))
  DO i = 1, w
    DO j = 0, h-1
      T(j+1, i+1) = 100*(i-1) + j
    ENDDO
  ENDDO
  call MPI_Barrier(MPI_COMM_WORLD, ierr)
  if (thisIndex .eq. 0) starttime = MPI_Wtime()
16.
  maxerr = 0.0
  left = mod((thisIndex-1+nblocks), nblocks)
  right = mod((thisIndex+1), nblocks)
  DO iter = 0, niter-1
    maxerr = 0.0
    call MPI_Send(T(1,2), h, MPI_DOUBLE_PRECISION, left, ??, MPI_COMM_WORLD, ierr)
    call MPI_Recv(T(1,w+2), h, MPI_DOUBLE_PRECISION, right, ??, MPI_COMM_WORLD, status, ierr)
    call MPI_Send(T(1,w+1), h, MPI_DOUBLE_PRECISION, right, ??, MPI_COMM_WORLD, ierr)
    call MPI_Recv(T(1,1), h, MPI_DOUBLE_PRECISION, left, ??, MPI_COMM_WORLD, status, ierr)
    ! compute (code on next slide)
  ENDDO
  CALL MPI_Finalize(ierr)
END PROGRAM
17.
  ! compute loop
  DO i = 2, w+1
    DO j = 2, h-1
      Tnew(j,i) = (T(j,i) + T(j,i+1) + T(j,i-1) + T(j+1,i) + T(j-1,i)) / 5.0
      error = abs(Tnew(j,i) - T(j,i))
      if (error .gt. maxerr) maxerr = error
    END DO
  END DO
  call MPI_Allreduce(maxerr, maxerr, 1, MPI_DOUBLE_PRECISION, MPI_MAX, MPI_COMM_WORLD, ierr)
  ! Swap Tnew and T
  if (thisIndex .eq. 0) then
    write(*,*) 'error ', maxerr, ' iter ', iter, ' time ', MPI_Wtime()
  endif
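The "Swap Tnew and T" comment is left unexpanded on the slide. With T and Tnew declared as pointers (as in chunkModule), a minimal sketch of the swap would be the following, where Ttmp is an extra pointer of the same type introduced only for illustration:

  ! hypothetical pointer swap
  Ttmp => T
  T    => Tnew
  Tnew => Ttmp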