Title: Adding numbers
4. Adding numbers
n data items, p processors
$t_s = O(n)$
$t_p = O(n/p)$ if data on each proc. $\Rightarrow S = t_s/t_p = O(p)$
$t_p = O(n + n/p)$ if data needs broadcasting $\Rightarrow S = t_s/t_p = O(1)$
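A minimal sketch of the case where the data already sits on each processor, simulated sequentially in Python. The function name `parallel_sum` and the contiguous block layout are assumptions for illustration; each list slice stands in for one processor's n/p items.

```python
def parallel_sum(data, p):
    """Simulate p processors that each add up a contiguous block of about n/p items."""
    n = len(data)
    size = -(-n // p)  # ceiling of n/p, so every item lands in some block
    # Each entry is the partial sum one (simulated) processor would compute.
    partial = [sum(data[i * size:(i + 1) * size]) for i in range(p)]
    # The p partial sums are combined at the end.
    return sum(partial), partial
```

Each simulated processor touches only its own block, which is where the O(n/p) term comes from.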
5. Sequential Recursion
7. Parallel Recursion
$t_{comm} = O(n/2 + n/4 + \dots + n/p) = O(n)$
$t_{comp} = O(n/2 + n/4 + \dots + n/p) = O(n)$
$S = O(1)$
8. $t_{comm} = O(1 + 1 + \dots + 1) = O(\log p)$
$t_{comp} = O(1 + 1 + \dots + 1) = O(\log p)$
$S = O(n / \log p)$
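The O(log p) terms can be illustrated with a pairwise (tree) combination of one value per processor. This is a sequential sketch under that assumption; `tree_sum` is a hypothetical name, and each loop iteration stands for one round in which pairs of processors exchange a single value and add.

```python
def tree_sum(values):
    """Combine one value per processor pairwise; return (total, number of rounds)."""
    vals = list(values)
    if not vals:
        return 0, 0
    rounds = 0
    while len(vals) > 1:
        # In one round, neighbours pair up: one message and one addition per pair.
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # an unpaired value passes through unchanged
        vals = nxt
        rounds += 1
    return vals[0], rounds
```

With 8 values the loop runs 3 times, matching log2(8) rounds of constant work each.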
9. Parallel Bucket Sort
10. Sequential
m buckets, n numbers
$t_s = O(n + m((n/m) \log(n/m))) = O(n \log(n/m))$
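A minimal sketch of sequential bucket sort with m buckets. It assumes the numbers are spread roughly uniformly over their range, so each bucket holds about n/m numbers; the equal-width bucket rule and the name `bucket_sort` are illustrative choices, not taken from the slides.

```python
def bucket_sort(nums, m):
    """Distribute n numbers into m equal-width buckets, sort each bucket, concatenate."""
    if not nums:
        return []
    lo, hi = min(nums), max(nums)
    width = (hi - lo) / m or 1.0  # avoid a zero width when all numbers are equal
    buckets = [[] for _ in range(m)]
    for x in nums:  # O(n) distribution pass
        idx = min(int((x - lo) / width), m - 1)  # clamp the maximum into the last bucket
        buckets[idx].append(x)
    out = []
    for b in buckets:  # m sorts of about n/m numbers each
        out.extend(sorted(b))
    return out
```

The distribution pass is the O(n) term and the per-bucket sorts are the m((n/m) log(n/m)) term.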
11. m buckets, n numbers, p = m processors
$t_p = O(n + (n/p) \log(n/p))$
13. $t_p = O(n/p + (n/p) \log(n/p)) = O((n/p) \log(n/p)) \Rightarrow S = O(p)$
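As a worked check of the speedup claim, the cost formulas can be evaluated numerically with all hidden constants set to 1 and m = p. That choice of constants, the base-2 logarithm, and the name `bucket_sort_costs` are assumptions made only for this arithmetic.

```python
import math

def bucket_sort_costs(n, p):
    """Return the two speedups t_s/t_p implied by the formulas, with m = p buckets."""
    t_s = n + n * math.log2(n / p)                    # sequential: n + m((n/m) log(n/m))
    t_p_first = n + (n / p) * math.log2(n / p)        # first parallel formula
    t_p_second = n / p + (n / p) * math.log2(n / p)   # second parallel formula
    return t_s / t_p_first, t_s / t_p_second
```

For n = 2^20 and p = 16 the first formula gives a speedup of 8.5 and the second gives 16, which is p. The leading n term in the first formula is what keeps it from scaling.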
15-19. Det. Sample Sort
- sort locally and create p-sample
- send all p-samples to processor 1
- proc. 1 sorts all received samples and computes global p-sample
- broadcast global p-sample
- bucket locally according to global p-sample
- send bucket i to proc. i
- re-sort locally
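The steps above can be sketched as a sequential simulation in which each list plays the role of one processor. Several details are assumptions, not taken from the slides: keys are distinct, n is divisible by p^2, a local p-sample is every (n/p^2)-th element of the sorted local data, and the global p-sample is every p-th element of the sorted samples. The name `det_sample_sort` is hypothetical.

```python
import bisect

def det_sample_sort(data, p):
    """Simulate deterministic sample sort on p processors; return one sorted bucket per processor."""
    n = len(data)
    assert n % (p * p) == 0, "sketch assumes n divisible by p^2"
    w = n // (p * p)   # spacing between local sample elements
    chunk = n // p     # items held by each processor
    # Sort locally and create a p-sample on each processor.
    local = [sorted(data[i * chunk:(i + 1) * chunk]) for i in range(p)]
    samples = [blk[::w] for blk in local]
    # Processor 1 gathers and sorts all p^2 samples, then picks the global p-sample.
    gathered = sorted(s for smp in samples for s in smp)
    splitters = gathered[p::p]  # p - 1 splitters
    # After the broadcast, bucket locally and send bucket j to processor j.
    received = [[] for _ in range(p)]
    for blk in local:
        cuts = [0] + [bisect.bisect_left(blk, g) for g in splitters] + [len(blk)]
        for j in range(p):
            received[j].extend(blk[cuts[j]:cuts[j + 1]])
    # Re-sort locally.
    return [sorted(r) for r in received]
```

Concatenating the returned buckets in processor order gives the sorted sequence. Under the stated assumptions, no bucket exceeds 2n/p items.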
20. Det. Sample Sort
- Lemma: each proc. receives at most $2n/p$ data items
[Figure labels: $n/p^2$; global sample]
21. Det. Sample Sort
[Figure labels: two rows numbered 1 to 8; eight blocks of size $n/p$]
- Post-Processing: Array Balancing
- 2 rounds
- Each proc. sends its received data size to all other procs.
- Move data to the right location via one h-relation
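One way to picture the balancing step: once every processor knows all received sizes, a prefix sum gives the global rank of each item, and the rank determines the destination processor. The sketch below computes such a transfer plan; `balance_plan`, its `(src, dst, count)` output format, and the assumption that n is divisible by p are illustrative choices.

```python
from itertools import accumulate

def balance_plan(sizes):
    """Given the data size on each processor, list (src, dst, count) moves so each ends with n/p."""
    p = len(sizes)
    n = sum(sizes)
    assert n % p == 0, "sketch assumes n divisible by p"
    target = n // p
    # starts[i] is the global rank of the first item held by processor i.
    starts = [0] + list(accumulate(sizes))
    plan = []
    for src in range(p):
        pos, end = starts[src], starts[src + 1]
        while pos < end:
            dst = pos // target  # processor that owns global rank pos after balancing
            take = min(end, (dst + 1) * target) - pos
            plan.append((src, dst, take))
            pos += take
    return plan
```

For sizes `[6, 1, 3, 2]` the plan moves three items from processor 0 to processor 1, one from processor 1 to processor 2, and one from processor 2 to processor 3 (indices counted from 0); the remaining entries keep data in place.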
22. Det. Sample Sort
- 5 MPI_Alltoallv
- for $n/p > p^2$
- $O((n/p) \log n)$ local comp.
- Goodrich (FOCS'98)
- $O(1)$ rounds
- for $n/p > p^{\epsilon}$
23. Performance: Det. Sample Sort
24. Numerical Integration
25. static assignment of processors to segments of [a, b]
area $= d\,(f(p) + f(q))/2$
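A minimal sketch of the static assignment, simulated sequentially. It assumes p and q are the endpoints of one strip and d is its width, so the area formula is the trapezoid rule; the function name `static_trapezoid` and the parameter `strips_per_proc` are hypothetical.

```python
def static_trapezoid(f, a, b, p, strips_per_proc=100):
    """Split [a, b] into p equal segments, one per processor, and sum trapezoid areas."""
    seg = (b - a) / p
    partial = []
    for i in range(p):  # each iteration stands for one processor's fixed segment
        lo = a + i * seg
        d = seg / strips_per_proc  # strip width
        area = 0.0
        for k in range(strips_per_proc):
            x0 = lo + k * d
            x1 = x0 + d
            area += d * (f(x0) + f(x1)) / 2  # trapezoid between x0 and x1
        partial.append(area)
    return sum(partial), partial
```

Every processor gets the same number of strips regardless of how the curve behaves on its segment.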
26. Problem: precision depends on the curve's shape
27. Adaptive Quadrature
Terminate when C is sufficiently small
Problem: different parts of the curve need different resolution
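A sketch of one common adaptive scheme. Since C is not defined in the surviving text, the code uses a stand-in: the difference between one trapezoid over a segment and the two trapezoids over its halves. That stand-in, the tolerance value, and the name `adaptive_trapezoid` are assumptions.

```python
def adaptive_trapezoid(f, p, q, tol=1e-6):
    """Integrate f over [p, q]; return (area, number of segments accepted)."""
    mid = (p + q) / 2
    whole = (q - p) * (f(p) + f(q)) / 2
    left = (mid - p) * (f(p) + f(mid)) / 2
    right = (q - mid) * (f(mid) + f(q)) / 2
    c = abs(whole - (left + right))  # stand-in for C (an assumption, see the text above)
    if c < tol:
        return left + right, 1       # sufficiently small: accept this segment
    # Otherwise split the segment and refine each half independently.
    a1, n1 = adaptive_trapezoid(f, p, mid, tol)
    a2, n2 = adaptive_trapezoid(f, mid, q, tol)
    return a1 + a2, n1 + n2
```

For f(x) = x^4, the half [0.5, 1] is split into more segments than [0, 0.5], which is the uneven resolution the slide refers to.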
28. [Figure labels: segment 1 to segment 5]
29. Gravitational N-Body Problem
32. $t_s = O(n^2)$ per time step
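The quadratic cost comes from evaluating every pair of bodies. A minimal sketch of one force evaluation in two dimensions, assuming point masses, Newtonian gravity, and no two bodies at the same position; `direct_forces` and the default `G=1.0` are illustrative.

```python
import math

def direct_forces(pos, mass, G=1.0):
    """Return the net gravitational force [fx, fy] on each body by summing over all pairs."""
    n = len(pos)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # n(n-1)/2 pairs, still O(n^2)
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r = math.hypot(dx, dy)
            f = G * mass[i] * mass[j] / r ** 3  # magnitude divided by r, so f*dx is a component
            forces[i][0] += f * dx
            forces[i][1] += f * dy
            forces[j][0] -= f * dx  # equal and opposite force on body j
            forces[j][1] -= f * dy
    return forces
```

Doubling the number of bodies roughly quadruples the number of pair evaluations per time step.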
39. for each time step
  for each object
    traverse tree to determine its forces
Problem: traversals have different lengths
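To illustrate why traversal lengths differ, here is a deliberately simplified one-dimensional sketch. The tree construction is not preserved in the transcript, so everything here is an assumption: a binary tree over an interval, a cell treated as a single mass when its size is less than `theta` times its distance, distinct body positions, and G = 1. The names `Cell`, `count_cells`, and `force_on` are hypothetical.

```python
class Cell:
    """Tree node over bodies given as (position, mass) pairs with distinct positions."""
    def __init__(self, lo, hi, bodies):
        self.lo, self.hi = lo, hi
        self.mass = sum(m for _, m in bodies)
        self.children = []
        if len(bodies) == 1:
            self.com = bodies[0][0]  # leaf: keep the exact position
        else:
            self.com = sum(x * m for x, m in bodies) / self.mass  # centre of mass
            mid = (lo + hi) / 2
            halves = [(lo, mid, [b for b in bodies if b[0] < mid]),
                      (mid, hi, [b for b in bodies if b[0] >= mid])]
            self.children = [Cell(a, c, part) for a, c, part in halves if part]

def count_cells(cell):
    return 1 + sum(count_cells(c) for c in cell.children)

def force_on(x, cell, theta, visits):
    """Force per unit mass at position x; visits[0] counts the cells touched."""
    visits[0] += 1
    d = cell.com - x
    if not cell.children:
        return 0.0 if d == 0 else cell.mass * d / abs(d) ** 3  # skip the body itself
    if d != 0 and (cell.hi - cell.lo) < theta * abs(d):
        return cell.mass * d / abs(d) ** 3  # far enough: treat the cell as one mass
    return sum(force_on(x, c, theta, visits) for c in cell.children)
```

A body far from the others can stop early at large cells, while a body inside a cluster has to descend to the leaves, so the per-object work is uneven.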
40. [Figure labels: object 1 to object 5]