Title: Embarrassingly Parallel Computation
1. Embarrassingly Parallel Computation
- But not that embarrassing
2. Embarrassingly parallel computations
A computation that can obviously be divided into
a number of completely independent parts, each of
which can be executed by a separate processor.
3. MPI master/slave approach
All processes start together. The master send()s work to the slaves and recv()s their results; each slave recv()s its work and send()s back a result.
[Diagram: master process connected by send()/recv() pairs to a set of slaves]
4. Embarrassing(?) examples
- Low level image processing
- Mandelbrot set
- Monte Carlo computations
- Numerical integration - quadrature
But the obvious approach is not always the best
one.
5. Low-level image processing
[Diagram: the image mapped onto processes]
Square region for each process (can also use strips - actually easier)
6. Geometric transformation
[Diagram: point (x, y) mapped to (x', y') by transformation f]
Pixel values on the new grid are calculated by resampling on the old grid according to the transformation function.
7. Some geometrical operations
Shifting: objects shifted in the x and y directions by Δx and Δy:
  x' = x + Δx,  y' = y + Δy
Scaling: objects scaled in the x and y directions by factors Sx and Sy:
  x' = Sx * x,  y' = Sy * y
Rotation: object rotated through angle θ about the origin:
  x' = x cos θ + y sin θ,  y' = -x sin θ + y cos θ
8. Generalized image transformations
[Diagram: (x, y) mapped by a general function f to (x', y')]
9. Mandelbrot set
Set of points in the complex plane that are quasi-stable (will increase and decrease but not exceed some limit) when computed by the function
  z_{k+1} = z_k^2 + c
where z_{k+1} is the (k+1)th iteration of the complex number z = a + bi and c is the complex number representing the position of a given point in the complex plane. Initially z_0 = 0. Iterations are continued until the magnitude of z is greater than 2 or the number of iterations reaches an arbitrary limit. This number of iterations is assigned to the pixel corresponding to c.
10. Sequential routine

typedef struct { float real, imag; } complex;

int calc_pixel(complex c)
{
    int count, max;
    complex z;
    float temp, lengthsq;

    max = 256;
    z.real = 0; z.imag = 0;
    count = 0;
    do {
        temp = z.real * z.real - z.imag * z.imag + c.real;
        z.imag = 2 * z.real * z.imag + c.imag;
        z.real = temp;
        lengthsq = z.real * z.real + z.imag * z.imag;
        count++;
    } while ((lengthsq < 4.0) && (count < max));
    return count;
}
11. Mandelbrot image
[Image: the Mandelbrot set - the points shown are bounded under the Mandelbrot iteration]
12. Parallelizing Mandelbrot set computation
Static task assignment: simply divide the region into a fixed number of parts, with each computed by a separate processor.
Why is this not a good idea?
13. Dynamic task assignment - work pool/processor farms
Have each processor request a new region after computing the previous one.
[Diagram: a pool of tasks (region1 ... region5); each worker returns its results and requests a new task]
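The work-pool idea can be sketched on a shared-memory machine with pthreads (an illustrative stand-in for the MPI master/slave version; the image size, the pixel-to-plane mapping, and the function names are assumptions): idle workers repeatedly grab the next unclaimed image row from a shared counter, so rows that iterate longer simply cost their worker more time without idling the others.

```c
#include <pthread.h>

#define WIDTH   64
#define HEIGHT  64
#define MAXITER 256

static int image[HEIGHT][WIDTH];
static int next_row;                              /* the shared work pool */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* same iteration as the sequential routine, on raw floats */
static int calc_pixel(float cr, float ci) {
    float zr = 0.0f, zi = 0.0f, t;
    int count = 0;
    do {
        t  = zr * zr - zi * zi + cr;
        zi = 2.0f * zr * zi + ci;
        zr = t;
        count++;
    } while (zr * zr + zi * zi < 4.0f && count < MAXITER);
    return count;
}

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&pool_lock);
        int row = next_row++;                     /* request the next task */
        pthread_mutex_unlock(&pool_lock);
        if (row >= HEIGHT) break;                 /* pool exhausted */
        for (int col = 0; col < WIDTH; col++) {
            float cr = -2.0f + 3.0f * col / WIDTH;    /* map pixel to plane */
            float ci = -1.5f + 3.0f * row / HEIGHT;
            image[row][col] = calc_pixel(cr, ci);
        }
    }
    return NULL;
}

/* run the pool with nthreads workers; return how many pixels hit MAXITER */
int mandelbrot_pool(int nthreads) {
    pthread_t tid[16];
    if (nthreads > 16) nthreads = 16;
    next_row = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    int hits = 0;
    for (int r = 0; r < HEIGHT; r++)
        for (int c = 0; c < WIDTH; c++)
            if (image[r][c] == MAXITER) hits++;
    return hits;
}
```

Since each pixel is computed independently, the result is identical for any worker count; only the load balance changes.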
14. Other Mandelbrot sets
Can be generated using a number of different sequences:
  z_{k+1} = z_k^2 + 1/c
  z_{k+1} = z_k^3 + c
  z_{k+1} = z_k^3 + (c - 1) z_k - c
15. Monte Carlo methods
Use random number selections in calculations to
solve numerical and/or physical problems.
16. Example - calculate π
Points within a square are chosen randomly. Keep score of how many points lie within the inscribed circle; the fraction falling inside approximates π/4.
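A minimal serial sketch of this estimator (the small LCG is an illustrative choice added only to make the run reproducible; it is not part of the slides):

```c
/* small 64-bit LCG so the estimate is reproducible */
static unsigned long long pi_rng_state = 1ULL;

static double pi_uniform(void) {                  /* uniform in [0, 1) */
    pi_rng_state = 6364136223846793005ULL * pi_rng_state
                 + 1442695040888963407ULL;
    return (double)(pi_rng_state >> 11) / 9007199254740992.0;  /* / 2^53 */
}

/* throw n random points into the square [-1,1]^2 and count hits inside
   the unit circle; the hit fraction approximates pi/4 */
double estimate_pi(long n) {
    long inside = 0;
    for (long i = 0; i < n; i++) {
        double x = 2.0 * pi_uniform() - 1.0;
        double y = 2.0 * pi_uniform() - 1.0;
        if (x * x + y * y <= 1.0)
            inside++;
    }
    return 4.0 * (double)inside / (double)n;
}
```

The point trials are completely independent, which is exactly what makes the method embarrassingly parallel: each process can run its own batch and only the counts need combining.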
17. Monte Carlo integration
Toss in random pairs of numbers (x, y), with x between x1 and x2 and y between 0 and ymax; a point is counted if y < f(x).
[Diagram: curve f(x) over [x1, x2], bounded above by ymax]
18. Alternative (better) method
Use random values of x to compute f(x) and sum the values of f(x):
  Area ≈ (x2 - x1) * (1/N) * Σ f(xr)
where the xr are randomly generated values between x1 and x2.
Monte Carlo integration is very useful if the function cannot be integrated numerically - maybe having a large number of variables.
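The mean-value estimate above can be sketched directly (an illustrative version; the LCG and the sample integrand `square` are assumptions added for a self-contained, reproducible example):

```c
/* small LCG for reproducible samples (illustrative) */
static unsigned long long mc_rng_state = 12345ULL;

static double mc_uniform(void) {                  /* uniform in [0, 1) */
    mc_rng_state = 6364136223846793005ULL * mc_rng_state
                 + 1442695040888963407ULL;
    return (double)(mc_rng_state >> 11) / 9007199254740992.0;
}

/* Area ~= (x2 - x1) * (1/N) * sum of f(xr), xr random in [x1, x2] */
double mc_integrate(double (*f)(double), double x1, double x2, long n) {
    double sum = 0.0;
    for (long i = 0; i < n; i++) {
        double xr = x1 + (x2 - x1) * mc_uniform();
        sum += f(xr);
    }
    return (x2 - x1) * sum / (double)n;
}

double square(double x) { return x * x; }         /* sample integrand */
```

For example, mc_integrate(square, 0, 1, N) converges to 1/3; the error shrinks as 1/sqrt(N) regardless of how many variables the integrand has, which is why the method survives in high dimensions.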
19. Generating random numbers
Random numbers must be independent of each other. A common technique is to use a linear congruential generator:
  x_{i+1} = (a * x_i + c) mod m
E.g. a = 16807, m = 2^31 - 1, c = 0 provides a good generator.
We often require millions of random numbers. Is this embarrassingly parallel?
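The generator quoted above (a = 16807, m = 2^31 - 1, c = 0, Park and Miller's "minimal standard") can be sketched as:

```c
/* one step of the minimal-standard LCG: x' = (16807 * x) mod (2^31 - 1) */
long lcg_next(long x) {
    const long long a = 16807LL;                  /* 7^5 */
    const long long m = 2147483647LL;             /* 2^31 - 1 */
    return (long)((a * x) % m);                   /* 64-bit product avoids overflow */
}

/* iterate n steps from seed x0 */
long lcg_iterate(long x0, long n) {
    long x = x0;
    for (long i = 0; i < n; i++)
        x = lcg_next(x);
    return x;
}
```

Park and Miller's published check value: starting from seed 1, the value after 10,000 steps is 1043618065. Note that each value depends on the previous one, so the recurrence itself is inherently sequential - which is exactly the obstacle the next slides address.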
20. Parallel implementation
Obvious way - each slave has its own random number generator:
- not formally correct, because the random numbers are not drawn from the same pool
- risk of sequence duplication
How about a centralized random generator based in the master process?
- Why is this a bad idea?
21. Generating random numbers in parallel
It turns out that for
  x_{i+1} = (a * x_i + c) mod m
we also have
  x_{i+k} = (A * x_i + C) mod m
where A = a^k mod m, C = c * (a^{k-1} + a^{k-2} + ... + a^1 + a^0) mod m, and k is the jump constant.
- Given k processes
- Generate the first k numbers sequentially
- Each process then generates every kth subsequent number in parallel
See textbook p97 (p99 old edition).
22. [Diagram: leapfrog with k = 4 - the first numbers x1, x2, x3, x4 are generated sequentially; the process seeded with x1 then produces x5, x9, x13, x17, ..., and similarly for x2, x3, x4]
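A small sketch checking the jump identity above (the constants a = 5, c = 3, m = 101 and the k = 4 example are illustrative choices, picked only to keep the arithmetic visible):

```c
/* one step of x' = (a*x + c) mod m */
long lcg_step(long x, long a, long c, long m) {
    return (a * x + c) % m;
}

/* jump k steps at once: x_{i+k} = (A*x_i + C) mod m,
   with A = a^k mod m and C = c*(a^{k-1} + ... + a + 1) mod m */
long lcg_jump(long x, long a, long c, long m, int k) {
    long A = 1, C = 0;
    for (int i = 0; i < k; i++) {
        C = (a * C + c) % m;   /* accumulates c*(a^{i-1} + ... + 1) mod m */
        A = (A * a) % m;       /* accumulates a^i mod m */
    }
    return (A * x + C) % m;
}

/* apply k single steps, for comparison */
long lcg_steps(long x, long a, long c, long m, int k) {
    for (int i = 0; i < k; i++)
        x = lcg_step(x, a, c, m);
    return x;
}
```

In the leapfrog scheme, process p is seeded with the pth of the first k numbers and then repeatedly applies the k-step jump, so the k processes together reproduce the single sequential stream without communicating.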
23. Cautionary note
What, at first, looks like an embarrassingly
parallel problem, may not be the case!
24. Parallel Techniques
- Embarrassingly parallel computations
- Partitioning/divide and conquer
- Pipelined computations
- Synchronous computations
- Asynchronous computations
- Load balancing