Parallel Prefix and Data Parallel Operations

Provided by: cse6
Learn more at: http://www.cse.msu.edu

Transcript and Presenter's Notes

1
Parallel Prefix and Data Parallel Operations
  • Motivation: basic parallel operations that occur
    repeatedly.
  • Let $\oplus$ be an associative operation:
  • $(a_1 \oplus a_2) \oplus a_3 = a_1 \oplus (a_2 \oplus a_3)$
  • How to compute
  • $a_1 \oplus a_2 \oplus \cdots \oplus a_n$ in parallel in $O(\log n)$
    time?

2
Approach 1
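Input:        a0   a1   a2   a3   a4   a5   a6   a7
After d = 1:  Σ00  Σ01  Σ12  Σ23  Σ34  Σ45  Σ56  Σ67
After d = 2:  Σ00  Σ01  Σ02  Σ03  Σ14  Σ25  Σ36  Σ47
After d = 4:  Σ00  Σ01  Σ02  Σ03  Σ04  Σ05  Σ06  Σ07
(Σij denotes ai ⊕ ... ⊕ aj; d is the distance combined in each step.)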
Assume that n = 2^k.
for i = 0 to k-1
    for j = 0 to n-1-2^i do in parallel
        x[j + 2^i] = x[j] ⊕ x[j + 2^i]
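The doubling loop above can be simulated sequentially. The following is a minimal sketch, assuming each inner loop stands for one parallel round and that all reads in a round see the values from the previous round; the function name `prefix_doubling` is illustrative and not from the slides.

```python
from operator import add

def prefix_doubling(values, op=add):
    """Inclusive prefix by recursive doubling; one while-iteration is one parallel round."""
    x = list(values)
    n = len(x)
    d = 1  # d = 2**i, the distance combined in round i
    while d < n:
        prev = x[:]  # snapshot so every read uses the previous round's values
        for j in range(n - d):  # conceptually executed in parallel
            x[j + d] = op(prev[j], prev[j + d])
        d *= 2
    return x

print(prefix_doubling([1, 2, 3, 4, 5, 6, 7, 8]))  # [1, 3, 6, 10, 15, 21, 28, 36]
```

The snapshot is what makes the sequential simulation match the parallel semantics: without it, later iterations of the inner loop would read values already updated in the same round.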
3
How to do on Tree Architecture?
for each node
    if there is a signal from both left and right, S_t <- S_l ⊕ S_r
    if there is a signal R, send R to both its children
    if the node is a leaf and there is a signal R, X <- X ⊕ R
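One way to make the tree scheme concrete is the usual two-phase (upward, then downward) tree scan. This is a sketch under assumptions, not necessarily the slide's exact protocol: here the right child receives R combined with its left sibling's subtotal, which is what lets each leaf end up with the combination of everything to its left. The heap-style array layout and the name `tree_prefix` are illustrative.

```python
from operator import add

def tree_prefix(leaves, op=add):
    """Inclusive prefix on a complete binary tree; len(leaves) must be a power of two."""
    n = len(leaves)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two leaf count"
    # Heap layout: node t has children 2t and 2t+1; leaves occupy n .. 2n-1.
    s = [None] * n + list(leaves)
    for t in range(n - 1, 0, -1):      # upward phase: S_t <- S_l op S_r
        s[t] = op(s[2 * t], s[2 * t + 1])
    # r[t] is the combination of all leaves left of t's subtree (None = no signal).
    r = [None] * (2 * n)
    for t in range(1, n):              # downward phase
        r[2 * t] = r[t]                # left child inherits R unchanged
        r[2 * t + 1] = s[2 * t] if r[t] is None else op(r[t], s[2 * t])
    # Leaf step: X <- R op X when a signal R arrived.
    return [x if r[n + i] is None else op(r[n + i], x) for i, x in enumerate(leaves)]

print(tree_prefix([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Both phases visit each tree level once, so on a real tree machine the running time is proportional to the tree height, O(log n).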
4
How to do on a Hypercube
A complete binary tree can be embedded into a hypercube.
Simpler solution: each node computes the prefix and the total sum.
for i = 0 to k-1
    for j = 0 to n-1 do in parallel
        if the i-th bit of j = 1 then x[j] = x[j] + sum[j^(i)]
        sum[j] = sum[j] + sum[j^(i)]
where j^(i) and j have the same binary representation except in
their i-th bit, where the i-th bit of j^(i) is the complement of
the i-th bit of j.

5
Prefix on Hypercube
for i = 0 to k-1
    for j = 0 to n-1 do in parallel
        if the i-th bit of j = 1 then x[j] = x[j] + sum[j^(i)]
        sum[j] = sum[j] + sum[j^(i)]
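A sequential simulation of the hypercube loop may help to see why two values per node suffice. This is a minimal sketch, assuming n is a power of two, that node j's neighbour along dimension i is `j ^ (1 << i)`, and that values from lower-numbered nodes are combined on the left; the names `hypercube_prefix` and `total` are illustrative.

```python
from operator import add

def hypercube_prefix(values, op=add):
    """Inclusive prefix on a simulated hypercube; len(values) must be a power of two."""
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes n = 2**k"
    k = n.bit_length() - 1
    x = list(values)      # running prefix held by each node
    total = list(values)  # running total of the subcube seen so far
    for i in range(k):
        old_total = total[:]  # values exchanged along dimension i
        for j in range(n):    # conceptually executed in parallel
            partner = j ^ (1 << i)  # j with its i-th bit complemented
            if j & (1 << i):
                # The partner's subcube lies entirely before j: extend the prefix.
                x[j] = op(old_total[partner], x[j])
                total[j] = op(old_total[partner], old_total[j])
            else:
                total[j] = op(old_total[j], old_total[partner])
    return x

print(hypercube_prefix([1, 2, 3, 4, 5, 6, 7, 8]))  # [1, 3, 6, 10, 15, 21, 28, 36]
```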
6
Applications of Data Parallel Operations
  • Any associative operations
  • Examples
  • min, max, add
  • adding two binary numbers
  • finite state automata
  • radix sort
  • segmented prefix sum
  • routing
  • packing
  • unpacking
  • broadcast (copy-scan)
  • solving recurrence equations
  • straight line computation (parallel arithmetic
    evaluation)

7
Adding two n bit numbers as parallel prefix
  • $a = a_{n-1} \ldots a_0$
  • $b = b_{n-1} \ldots b_0$
  • $s = a + b$
  • Note that $s_i = a_i \oplus b_i \oplus c_{i-1}$
  • To compute $c_i$, define $g$ and $p$ as
  • $g_i = a_i \wedge b_i$, $p_i = a_i \oplus b_i$
  • Define $\otimes$ as $(g, p) \otimes (g', p') = (g \vee (p \wedge g'),\; p \wedge p')$
  • Then the carry bit $c_i$ can be computed by
  • $(G_i, P_i) = (g_i, p_i) \otimes (g_{i-1}, p_{i-1}) \otimes \cdots \otimes (g_0, p_0)$
  • and $G_i = c_i$
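The carry computation above can be sketched as follows, assuming bits are stored least significant first, generate is AND, and propagate is XOR. The scan is written as a sequential loop for brevity; because `carry_op` is associative, any parallel prefix scheme could replace it. The names `carry_op` and `add_bits` are illustrative.

```python
def carry_op(hi, lo):
    """(g, p) combined with (g', p') gives (g or (p and g'), p and p'); hi is the more significant pair."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)

def add_bits(a, b):
    """Add two equal-length bit lists (least significant bit first); returns n+1 bits."""
    n = len(a)
    gp = [(ai & bi, ai ^ bi) for ai, bi in zip(a, b)]  # (generate, propagate) per position
    scanned = []
    for i, pair in enumerate(gp):  # inclusive scan with carry_op
        scanned.append(pair if i == 0 else carry_op(pair, scanned[-1]))
    carries = [g for g, _ in scanned]  # carries[i] is the carry out of position i
    s = [gp[i][1] ^ (carries[i - 1] if i > 0 else 0) for i in range(n)]
    return s + [carries[-1] if n else 0]

# 11 + 6 = 17, written least significant bit first
print(add_bits([1, 1, 0, 1], [0, 1, 1, 0]))  # [1, 0, 0, 0, 1]
```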

8
Hardware circuit of recursive look-ahead adder
9
Parsing a regular language
$\delta(q_0, b) = q_2$, $\delta(q_0, c) = q_1$, $\delta(q_1, b) = q_0$,
$\delta(q_1, c) = q_r$, $\delta(q_2, b) = q_r$, $\delta(q_2, c) = q_0$
$q_r$: reject state
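Parsing fits the prefix framework because each input symbol induces a state-to-state map, and composition of such maps is associative. The sketch below uses the transitions listed above and assumes that the reject state is absorbing and that unknown symbols also lead to it; the helper names are illustrative, and the scan is written sequentially.

```python
STATES = ["q0", "q1", "q2", "qr"]
DELTA = {
    ("q0", "b"): "q2", ("q0", "c"): "q1",
    ("q1", "b"): "q0", ("q1", "c"): "qr",
    ("q2", "b"): "qr", ("q2", "c"): "q0",
}

def symbol_map(sym):
    """State map induced by one symbol, as a tuple of state indices (missing entries go to qr)."""
    return tuple(STATES.index(DELTA.get((q, sym), "qr")) for q in STATES)

def compose(f, g):
    """Apply f first, then g; this composition is the associative operation."""
    return tuple(g[f[s]] for s in range(len(f)))

def states_after_each_prefix(word, start="q0"):
    """Inclusive scan of the symbol maps, evaluated at the start state."""
    out, acc = [], None
    for m in (symbol_map(ch) for ch in word):
        acc = m if acc is None else compose(acc, m)
        out.append(STATES[acc[STATES.index(start)]])
    return out

print(states_after_each_prefix("bcbb"))  # ['q2', 'q0', 'q2', 'qr']
```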
10
Segmented Prefix operation
11
Segmented Prefix computation
Let $\oplus$ be any associative operation. For the
segmented operation of $\oplus$, define $\bar{\oplus}$ as
follows:
[definition not preserved in the transcript]
Then $\bar{\oplus}$ is associative and we can compute the
segmented operation in $O(\log n)$ time.
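The slide's own definition is not preserved, so the sketch below uses a common textbook construction instead, which may differ from it: each element is a (value, flag) pair, where flag 1 marks the start of a segment, and the right operand resets the running value whenever its flag is set. The names `seg_op` and `segmented_scan` are illustrative.

```python
from operator import add

def seg_op(left, right, op=add):
    """Segmented version of op on (value, flag) pairs; flag = 1 starts a new segment."""
    v1, f1 = left
    v2, f2 = right
    return (v2 if f2 else op(v1, v2), f1 | f2)

def segmented_scan(values, flags, op=add):
    """Inclusive segmented prefix; sequential here, but seg_op is associative."""
    out, acc = [], None
    for pair in zip(values, flags):
        acc = pair if acc is None else seg_op(acc, pair, op)
        out.append(acc[0])
    return out

print(segmented_scan([1, 2, 3, 4, 5, 6], [1, 0, 0, 1, 0, 1]))  # [1, 3, 6, 4, 9, 6]
```

Since `seg_op` is associative, the same pairs can be fed to any O(log n) parallel prefix scheme in place of the sequential loop.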
12
Enumerating
data          5 6 3 1 8 3 7 5 9 2
active procs  1 0 1 1 0 0 1 0 1 0
enumerated    0 x 1 2 x x 3 x 4 x
13
packing
  • data 5 6 3 1 8 3 7 5 9 2
  • active procs 1 0 1 1 0 0 1 0 1 0
  • enumerated 0 x 1 2 x x 3 x 4 x
  • packed data 5 3 1 7 9 x x x x x
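Enumerating is an exclusive prefix sum over the active flags, and packing then writes each active value to its enumerated position. A minimal sketch, using `None` where the slides write x; the function names are illustrative.

```python
def enumerate_active(active):
    """Exclusive prefix sum of the flags; inactive positions get None."""
    out, count = [], 0
    for a in active:
        out.append(count if a else None)
        count += a
    return out

def pack(data, active):
    """Move the values of active processors to the front, preserving order."""
    packed = [None] * len(data)
    for value, dest in zip(data, enumerate_active(active)):
        if dest is not None:
            packed[dest] = value
    return packed

data = [5, 6, 3, 1, 8, 3, 7, 5, 9, 2]
active = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
print(enumerate_active(active))  # [0, None, 1, 2, None, None, 3, None, 4, None]
print(pack(data, active))        # [5, 3, 1, 7, 9, None, None, None, None, None]
```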

14
Packing and Unpacking on Hypercube
  • Packing
  • adjust bit 0
  • adjust bit 1
  • adjust bit 2
  • ...
  • adjust bit k-1
  • Unpacking
  • adjust bit k-1
  • adjust bit k-2
  • ...
  • adjust bit 1
  • adjust bit 0
  • How about in the order of adjust bit 0, 1, ...,
    k-1 for packing?

15
Unpacking
address        0 1 2 3 4 5 6 7 8 9
data           6 2 3 5 9 x x x x x
active procs   1 0 1 1 0 0 1 0 1 0
enumerated     0 x 1 2 x x 3 x 4 x
destination    0 2 3 6 8 x x x x x
unpacked data  6 x 2 3 x x 5 x 9 x
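Unpacking is the inverse of packing: the r-th packed value goes to the address of the r-th active processor. A minimal sketch, again using `None` for x; the names `destinations` and `unpack` are illustrative.

```python
def destinations(active):
    """Address of the r-th active processor, for r = 0, 1, ..."""
    return [addr for addr, a in enumerate(active) if a]

def unpack(packed, active):
    """Scatter the first values of packed to the active addresses, in order."""
    out = [None] * len(active)
    for rank, addr in enumerate(destinations(active)):
        out[addr] = packed[rank]
    return out

packed = [6, 2, 3, 5, 9, None, None, None, None, None]
active = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
print(destinations(active))    # [0, 2, 3, 6, 8]
print(unpack(packed, active))  # [6, None, 2, 3, None, None, 5, None, 9, None]
```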
16
Copy Scan (broadcast)
address        0 1 2 3 4 5 6 7 8 9
data           6 2 3 5 9 4 1 7 8 10
segmented bit  1 0 1 1 0 0 1 0 1 0
result         6 6 3 5 5 5 1 1 8 8
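Copy scan can be read as a segmented prefix whose operation simply keeps the left operand, so each segment head is broadcast across its segment. A minimal sequential sketch with the illustrative name `copy_scan`:

```python
def copy_scan(data, seg_bits):
    """Broadcast the first value of each segment to the rest of the segment."""
    out = []
    for value, bit in zip(data, seg_bits):
        # A set bit (or the very first element) starts a new segment.
        out.append(value if bit or not out else out[-1])
    return out

data = [6, 2, 3, 5, 9, 4, 1, 7, 8, 10]
seg_bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
print(copy_scan(data, seg_bits))  # [6, 6, 3, 5, 5, 5, 1, 1, 8, 8]
```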
17
Radix Sort
for j = 0 to k-1          // x has k bits; bit 0 is the least significant
    for all i in 0 .. n-1 do in parallel
        if j-th bit of x[i] is 0
            y[i] = enumerate
        c = count
        if j-th bit of x[i] is 1
            y[i] <- enumerate + c
        x[y[i]] = x[i]

Radix sort, another code:
for j = 0 to k-1          // x has k bits; bit 0 is the least significant
    for all i in 0 .. n-1 do in parallel
        pack left x[i] if j-th bit of x[i] = 0
        pack right x[i] if j-th bit of x[i] = 1
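The radix sort pseudocode can be sketched as a stable split per bit. This sketch assumes non-negative integers of at most k bits and processes the least significant bit first, which is what makes a stable split per bit yield a sorted result; the enumerate and count steps are written as running counters, and the function names are illustrative.

```python
def split_by_bit(x, j):
    """Stable split: keys with bit j = 0 first, then keys with bit j = 1."""
    n = len(x)
    is_zero = [((v >> j) & 1) == 0 for v in x]
    c = sum(is_zero)  # count of keys whose j-th bit is 0
    y = [0] * n       # destination index of each key
    zeros_seen = ones_seen = 0  # running enumerate within each group
    for i in range(n):
        if is_zero[i]:
            y[i] = zeros_seen
            zeros_seen += 1
        else:
            y[i] = c + ones_seen
            ones_seen += 1
    out = [None] * n
    for i in range(n):  # scatter step: out[y[i]] = x[i]
        out[y[i]] = x[i]
    return out

def radix_sort(x, k):
    """Sort non-negative integers that fit in k bits."""
    for j in range(k):  # least significant bit first
        x = split_by_bit(x, j)
    return x

print(radix_sort([5, 6, 3, 1, 8, 3, 7, 5, 9, 2], 4))  # [1, 2, 3, 3, 5, 5, 6, 7, 8, 9]
```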
18
Quick Sort
  • 1. Pick a pivot p
  • 2. Broadcast p
  • 3. For all PE i, compare A[i] with p
  • if A[i] < p, pack left A[i] in the segment
  • if A[i] > p, pack right A[i] in the
    segment
  • 4. Mark the segment boundary
  • 5. Quicksort each segment recursively
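The steps above can be simulated round by round. This is a sketch under assumptions: the pivot is taken as the first element of each segment, keys equal to the pivot are kept between the two packed groups (the slide only covers < and >), and segments are tracked as half-open index ranges. The name `segment_quicksort` is illustrative.

```python
def segment_quicksort(a):
    """Quicksort expressed as repeated pack-left / pack-right within segments."""
    a = list(a)
    segments = [(0, len(a))]  # half-open [lo, hi) segment boundaries
    while segments:
        next_segments = []
        for lo, hi in segments:  # all segments belong to the same parallel round
            if hi - lo <= 1:
                continue
            p = a[lo]  # step 1: pick a pivot (assumed: first element)
            seg = a[lo:hi]
            left = [v for v in seg if v < p]    # pack left
            mid = [v for v in seg if v == p]    # keys equal to the pivot
            right = [v for v in seg if v > p]   # pack right
            a[lo:hi] = left + mid + right
            # step 4: mark the new segment boundaries
            next_segments.append((lo, lo + len(left)))
            next_segments.append((hi - len(right), hi))
        segments = next_segments
    return a

print(segment_quicksort([5, 6, 3, 1, 8, 3, 7, 5, 9, 2]))  # [1, 2, 3, 3, 5, 5, 6, 7, 8, 9]
```

Keeping the pivot-equal keys out of both sub-segments guarantees that every round shrinks each segment, so the loop terminates even with duplicate keys.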

19
Solving Linear Recurrence Equations
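  • $f_n = a_{n-1} f_{n-1} + a_{n-2} f_{n-2}$
  • $\begin{pmatrix} f_n \\ f_{n-1} \end{pmatrix} = \begin{pmatrix} a_{n-1} & a_{n-2} \\ 1 & 0 \end{pmatrix} \begin{pmatrix} f_{n-1} \\ f_{n-2} \end{pmatrix}$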
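A second-order recurrence of this kind can be turned into a prefix problem by writing each step as a 2x2 matrix, since matrix multiplication is associative. The sketch below assumes the standard companion-matrix form with rows (a[n-1], a[n-2]) and (1, 0), and computes the prefix products with a sequential loop; `mat_mul` and `solve_recurrence` are illustrative names.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def solve_recurrence(a, f0, f1):
    """Return [f_0, ..., f_len(a)] for f_n = a[n-1]*f_(n-1) + a[n-2]*f_(n-2), len(a) >= 1."""
    # One companion matrix per step n = 2 .. len(a).
    mats = [[[a[n - 1], a[n - 2]], [1, 0]] for n in range(2, len(a) + 1)]
    f = [f0, f1]
    acc = None
    for M in mats:  # inclusive scan; newer matrices multiply on the left
        acc = M if acc is None else mat_mul(M, acc)
        f.append(acc[0][0] * f1 + acc[0][1] * f0)  # first row applied to (f1, f0)
    return f

# With all coefficients equal to 1 this is the Fibonacci recurrence.
print(solve_recurrence([1] * 6, 0, 1))  # [0, 1, 1, 2, 3, 5, 8]
```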

20
Pointer Jumping and Tree Computation
How to compute a prefix on a linked list?
if NEXT[i] != NIL then
    X[i] <- X[i] + X[NEXT[i]]
    NEXT[i] <- NEXT[NEXT[i]]
How to obtain the order 1 3 6 10 15 21 28?
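Pointer jumping can be simulated with one snapshot per round. This is a minimal sketch, assuming the list holds the values 1 to 7 and using `None` for NIL: following NEXT pointers yields suffix sums, and one possible way to obtain 1 3 6 10 15 21 28 is to run the same loop on predecessor pointers instead. The name `pointer_jump` is illustrative.

```python
def pointer_jump(x, nxt):
    """Repeat X[i] <- X[i] + X[NEXT[i]]; NEXT[i] <- NEXT[NEXT[i]] until all pointers are None."""
    x, nxt = list(x), list(nxt)
    while any(p is not None for p in nxt):
        old_x, old_nxt = x[:], nxt[:]  # all nodes read the previous round's values
        for i in range(len(x)):        # conceptually executed in parallel
            p = old_nxt[i]
            if p is not None:
                x[i] = old_x[i] + old_x[p]
                nxt[i] = old_nxt[p]
    return x

values = [1, 2, 3, 4, 5, 6, 7]
succ = [1, 2, 3, 4, 5, 6, None]  # NEXT pointers: node i points to node i+1
pred = [None, 0, 1, 2, 3, 4, 5]  # predecessor pointers: node i points to node i-1
print(pointer_jump(values, succ))  # [28, 27, 25, 22, 18, 13, 7]
print(pointer_jump(values, pred))  # [1, 3, 6, 10, 15, 21, 28]
```

Each round doubles the distance a pointer spans, so a list of n nodes needs about log2(n) rounds.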
21
Application: Tree computation
Pre-order numbering
Can also be applied to in-order and post-order numbering, number of
children, depth, biconnected components, etc.
22
Recurrence Equation
  • Example: LU decomposition on a triangular matrix