Title: Shared Memory Parallel Programming
Gauss-Seidel Method

  for some number of timesteps/iterations
    for (i = 1; i < n; i++)
      for (j = 1; j < n; j++)
        grid[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);

Can be used instead of Jacobi
Gauss-Seidel Method

  for some number of timesteps/iterations
    for (i = 1; i < n; i++)
      for (j = 1; j < n; j++)
        grid[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);

Data reuse: here we have to access values before they are overwritten, because the grid is updated in place.
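
A minimal runnable sketch of this sweep in C; the grid size, boundary condition, and step count below are illustrative assumptions, not values from the lecture:

  #include <stdio.h>

  #define n 8          /* interior points are 1..n-1; rows/columns 0 and n are boundary (assumed) */
  #define STEPS 100    /* number of sweeps (assumed) */

  double grid[n + 1][n + 1];

  int main(void) {
    /* assumed boundary condition: top row held at 1.0, everything else 0.0 */
    for (int j = 0; j <= n; j++)
      grid[0][j] = 1.0;

    for (int t = 0; t < STEPS; t++)
      for (int i = 1; i < n; i++)
        for (int j = 1; j < n; j++)
          /* Gauss-Seidel: grid[i-1][j] and grid[i][j-1] were already updated
             this sweep, so new values are reused immediately */
          grid[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                               grid[i][j-1] + grid[i][j+1]);

    printf("grid[%d][%d] = %f\n", n / 2, n / 2, grid[n / 2][n / 2]);
    return 0;
  }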
Gauss-Seidel Method

[Figure: grid with axes i and j; points already visited this sweep hold new values, the remaining points still hold old values]
Reminder: Parallel Jacobi Method

  for some number of timesteps/iterations
    for (i = 1; i < n; i++)        // distribute iterations
      for (j = 1; j < n; j++)
        temp[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);
    // synchronization point
    for (i = 1; i < n; i++)        // distribute iterations
      for (j = 1; j < n; j++)
        grid[i][j] = temp[i][j];
    // synchronization point
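
A sketch of how this might look in OpenMP; the grid declarations and the choice to parallelize the outer i loop are assumptions, but the two implicit barriers correspond to the synchronization points above:

  #include <omp.h>

  #define n 8   /* grid extent; interior points are 1..n-1 (assumed size) */

  double grid[n + 1][n + 1], temp[n + 1][n + 1];

  void jacobi(int timesteps) {
    for (int t = 0; t < timesteps; t++) {
      /* distribute iterations of the i loop across threads */
      #pragma omp parallel for
      for (int i = 1; i < n; i++)
        for (int j = 1; j < n; j++)
          temp[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                               grid[i][j-1] + grid[i][j+1]);
      /* implicit barrier at the end of the parallel for: first synchronization point */

      #pragma omp parallel for
      for (int i = 1; i < n; i++)
        for (int j = 1; j < n; j++)
          grid[i][j] = temp[i][j];
      /* implicit barrier again: second synchronization point */
    }
  }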
Gauss-Seidel Data Parallel Computation

[Figure: grid with axes i and j, rows split between Thread 0 and Thread 1 - not a great idea: a point can be computed only after all points on the lower diagonal have been computed]
Gauss-Seidel in Parallel

- Method uses less memory and converges faster than Jacobi - doesn't require that second loop
- But dependences and anti-dependences in the loop nest
- Leads to wavefront execution in parallel

This implies non-trivial synchronization. It may still be worthwhile, as sketched below.
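
One possible wavefront formulation in OpenMP, under the same assumed grid layout as earlier: points on an anti-diagonal i + j = d depend only on earlier diagonals, so each diagonal can be distributed across threads with a barrier between diagonals.

  #include <omp.h>

  #define n 8   /* assumed grid extent; interior points are 1..n-1 */

  double grid[n + 1][n + 1];

  void gauss_seidel_wavefront(int timesteps) {
    for (int t = 0; t < timesteps; t++) {
      /* points with the same i + j are independent of each other;
         they depend only on points from earlier anti-diagonals */
      for (int d = 2; d <= 2 * (n - 1); d++) {
        #pragma omp parallel for
        for (int i = 1; i < n; i++) {
          int j = d - i;
          if (j >= 1 && j < n)
            grid[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                                 grid[i][j-1] + grid[i][j+1]);
        }
        /* implicit barrier: the next diagonal starts only after this one finishes */
      }
    }
  }

Note that early and late diagonals contain few points, so this sketch illustrates the dependence structure more than an efficient schedule.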
Red-Black Method

- Grid points partitioned into two sets like a chessboard - colored red and black
- Update in two steps:
  - Compute new values on red points using current values on neighboring black points
  - Compute new values on black points using current values on neighboring red points
- Doesn't require temp array
Red-Black Grid Points

[Figure: grid with axes i and j, points colored red and black in a chessboard pattern]
Red-Black Method

  for some number of timesteps/iterations
    // update red points
    for (i = 1; i < n; i += 2)
      for (j = 1; j < n; j += 2)
        grid[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);
    for (i = 2; i < n; i += 2)
      for (j = 2; j < n; j += 2)
        grid[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);
Red-Black Method

  for some number of timesteps/iterations
    // update black points
    for (i = 1; i < n; i += 2)
      for (j = 2; j < n; j += 2)
        grid[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);
    for (i = 2; i < n; i += 2)
      for (j = 1; j < n; j += 2)
        grid[i][j] = 0.25 *
          (grid[i-1][j] + grid[i+1][j] +
           grid[i][j-1] + grid[i][j+1]);
Parallel Red-Black

- Uses same amount of memory as Gauss-Seidel
- Converges more slowly than Gauss-Seidel but better than Jacobi
- A popular compromise for parallel computations
Red-Black Method

[Figure: grid with axes i and j - red points updated simultaneously]
Red-Black Method

[Figure: grid with axes i and j - black points updated simultaneously]
Parallel Red-Black

- Splits Gauss-Seidel computation into two loops (two pairs in this formulation)
- No dependences within any of the loops
- But dependences between update of red points and update of black points
- So need a barrier between the 2nd and 3rd loop, and at the end of each iteration - see the OpenMP sketch below
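
Putting this together, a red-black sweep in OpenMP might look like the sketch below; the grid declaration and the fusion of the two loop nests per color into one are assumptions, and the implicit barrier at the end of each parallel for provides the synchronization just described.

  #include <omp.h>

  #define n 8   /* assumed grid extent; interior points are 1..n-1 */

  double grid[n + 1][n + 1];

  void red_black(int timesteps) {
    for (int t = 0; t < timesteps; t++) {
      /* red points (i + j even): all can be updated in parallel */
      #pragma omp parallel for
      for (int i = 1; i < n; i++)
        for (int j = 1 + (i + 1) % 2; j < n; j += 2)
          grid[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                               grid[i][j-1] + grid[i][j+1]);
      /* implicit barrier: all red points done before black points start */

      /* black points (i + j odd) */
      #pragma omp parallel for
      for (int i = 1; i < n; i++)
        for (int j = 1 + i % 2; j < n; j += 2)
          grid[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j] +
                               grid[i][j-1] + grid[i][j+1]);
      /* implicit barrier before the next timestep */
    }
  }

Writing each color as a single loop nest rather than the two nests per color on the earlier slides is a stylistic choice, not part of the original formulation.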
OpenMP Overview
C$OMP FLUSH
#pragma omp critical
C$OMP THREADPRIVATE(/ABC/)
CALL OMP_SET_NUM_THREADS(10)
- OpenMP: An API for Writing Multithreaded Applications
- A set of compiler directives and library routines for parallel application programmers
- Greatly simplifies writing multi-threaded (MT) programs in Fortran, C and C++
- Standardizes last 20 years of SMP practice
C$OMP parallel do shared(a, b, c)
call omp_test_lock(jlok)
call OMP_INIT_LOCK (ilok)
C$OMP MASTER
C$OMP ATOMIC
C$OMP SINGLE PRIVATE(X)
setenv OMP_SCHEDULE "dynamic"
C$OMP PARALLEL DO ORDERED PRIVATE (A, B, C)
C$OMP ORDERED
C$OMP PARALLEL REDUCTION (+: A, B)
C$OMP SECTIONS
#pragma omp parallel for private(A, B)
!$OMP BARRIER
C$OMP PARALLEL COPYIN(/blk/)
C$OMP DO lastprivate(XX)
Nthrds = OMP_GET_NUM_PROCS()
omp_set_lock(lck)
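
As a minimal, hypothetical example of the C directive style listed above (the array, loop, and sizes are made up for illustration):

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
    double a[1000], b[1000];
    for (int i = 0; i < 1000; i++)
      b[i] = i;

    /* the directive splits the loop iterations across threads */
    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
      a[i] = 2.0 * b[i];

    printf("a[999] = %f, max threads = %d\n", a[999], omp_get_max_threads());
    return 0;
  }

Compiled with an OpenMP-aware compiler (for example, gcc -fopenmp), the loop runs in parallel; without OpenMP support the pragma is ignored and the program still runs sequentially.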
The name OpenMP is the property of the OpenMP
Architecture Review Board.