Title: Examples of Two-Dimensional Systolic Arrays
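Obvious Matrix Multiply
- Each column of b is distributed to every PE in the corresponding column of the array.
- Each row of a is distributed to every PE in the corresponding row of the array.
- Each PE computes the row x column product for its own position.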
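A minimal sketch of the obvious scheme, assuming matrices are plain Python lists of lists and that PE (i, j) simply holds its own copy of row i of a and column j of b. The function name obvious_matmul is hypothetical and chosen only for illustration.

```python
def obvious_matmul(a, b):
    """Model one PE per output element: PE (i, j) gets row i of a and column j of b."""
    n, m, p = len(a), len(b), len(b[0])
    # Distribution step: row i of a goes to PE row i, column j of b to PE column j.
    rows = [a[i] for i in range(n)]
    cols = [[b[k][j] for k in range(m)] for j in range(p)]
    # Compute step: every PE forms its own row x column dot product independently.
    return [
        [sum(x * y for x, y in zip(rows[i], cols[j])) for j in range(p)]
        for i in range(n)
    ]


if __name__ == "__main__":
    print(obvious_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

In this scheme every PE needs a whole row and a whole column before it can start, which is the cost the systolic version avoids by streaming single elements.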
Systolic Matrix Multiplication
- Processors are arranged in a 2-D grid.
- Each processor accumulates one element of the product.
- The elements of the matrices to be multiplied are pumped through the array.
Multiplication
- Here the matrix B is transposed!
- Each PE's function is to first multiply and then add.
- PE ij computes Cij.
[Figure: grid of PEs with skewed inputs. B entries arrive staggered (B11; B12 B21; B13 B22 B31; ... Bn1). Each row of A feeds one row of PEs (A1n ... A12 A11; A22 A21; ... A31; An1).]
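The input skewing can be sketched in code. This is a minimal illustration, assuming square n x n matrices stored as lists of lists with 0-based indices, and assuming row i of a and column j of b are delayed by i and j time steps respectively. The names skew_rows and skew_cols and the pad value None are hypothetical.

```python
def skew_rows(a, pad=None):
    """Return streams where streams[i][t] is the element of row i entering at step t."""
    n = len(a)
    steps = 2 * n - 1  # the last row finishes entering at step 2n - 2
    return [
        [a[i][t - i] if 0 <= t - i < n else pad for t in range(steps)]
        for i in range(n)
    ]


def skew_cols(b, pad=None):
    """Return streams where streams[j][t] is the element of column j entering at step t."""
    n = len(b)
    steps = 2 * n - 1
    return [
        [b[t - j][j] if 0 <= t - j < n else pad for t in range(steps)]
        for j in range(n)
    ]


if __name__ == "__main__":
    m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for stream in skew_rows(m):
        print(stream)
```

The padding entries play the role of the empty slots in the staggered input diagram: they keep each stream aligned in time with its neighbors.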
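Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Initial state: all inputs are queued outside the array, aligned (skewed) in time. In each queue the element listed last enters first.
Columns of b:
  column 0: b2,0 b1,0 b0,0
  column 1: b2,1 b1,1 b0,1
  column 2: b2,2 b1,2 b0,2
Rows of a:
  row 0: a0,2 a0,1 a0,0
  row 1: a1,2 a1,1 a1,0
  row 2: a2,2 a2,1 a2,0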
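Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 1
Columns of b still queued:
  column 0: b2,0 b1,0
  column 1: b2,1 b1,1 b0,1
  column 2: b2,2 b1,2 b0,2
Rows of a still queued:
  row 0: a0,2 a0,1
  row 1: a1,2 a1,1 a1,0
  row 2: a2,2 a2,1 a2,0
PE contents:
  PE(0,0): inputs a0,0 and b0,0; sum = a0,0b0,0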
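Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 2
Columns of b still queued:
  column 0: b2,0
  column 1: b2,1 b1,1
  column 2: b2,2 b1,2 b0,2
Rows of a still queued:
  row 0: a0,2
  row 1: a1,2 a1,1
  row 2: a2,2 a2,1 a2,0
PE contents:
  PE(0,0): inputs a0,1 and b1,0; sum = a0,0b0,0 + a0,1b1,0
  PE(0,1): inputs a0,0 and b0,1; sum = a0,0b0,1
  PE(1,0): inputs a1,0 and b0,0; sum = a1,0b0,0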
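Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 3
Columns of b still queued:
  column 1: b2,1
  column 2: b2,2 b1,2
Rows of a still queued:
  row 1: a1,2
  row 2: a2,2 a2,1
PE contents:
  PE(0,0): inputs a0,2 and b2,0; sum = a0,0b0,0 + a0,1b1,0 + a0,2b2,0
  PE(0,1): inputs a0,1 and b1,1; sum = a0,0b0,1 + a0,1b1,1
  PE(0,2): inputs a0,0 and b0,2; sum = a0,0b0,2
  PE(1,0): inputs a1,1 and b1,0; sum = a1,0b0,0 + a1,1b1,0
  PE(1,1): inputs a1,0 and b0,1; sum = a1,0b0,1
  PE(2,0): inputs a2,0 and b0,0; sum = a2,0b0,0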
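Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 4
Columns of b still queued:
  column 2: b2,2
Rows of a still queued:
  row 2: a2,2
PE contents:
  PE(0,0): finished; sum = a0,0b0,0 + a0,1b1,0 + a0,2b2,0
  PE(0,1): inputs a0,2 and b2,1; sum = a0,0b0,1 + a0,1b1,1 + a0,2b2,1
  PE(0,2): inputs a0,1 and b1,2; sum = a0,0b0,2 + a0,1b1,2
  PE(1,0): inputs a1,2 and b2,0; sum = a1,0b0,0 + a1,1b1,0 + a1,2b2,0
  PE(1,1): inputs a1,1 and b1,1; sum = a1,0b0,1 + a1,1b1,1
  PE(1,2): inputs a1,0 and b0,2; sum = a1,0b0,2
  PE(2,0): inputs a2,1 and b1,0; sum = a2,0b0,0 + a2,1b1,0
  PE(2,1): inputs a2,0 and b0,1; sum = a2,0b0,1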
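Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 5 (all inputs have entered the array)
PE contents:
  PE(0,0): finished; sum = a0,0b0,0 + a0,1b1,0 + a0,2b2,0
  PE(0,1): finished; sum = a0,0b0,1 + a0,1b1,1 + a0,2b2,1
  PE(0,2): inputs a0,2 and b2,2; sum = a0,0b0,2 + a0,1b1,2 + a0,2b2,2
  PE(1,0): finished; sum = a1,0b0,0 + a1,1b1,0 + a1,2b2,0
  PE(1,1): inputs a1,2 and b2,1; sum = a1,0b0,1 + a1,1b1,1 + a1,2b2,1
  PE(1,2): inputs a1,1 and b1,2; sum = a1,0b0,2 + a1,1b1,2
  PE(2,0): inputs a2,2 and b2,0; sum = a2,0b0,0 + a2,1b1,0 + a2,2b2,0
  PE(2,1): inputs a2,1 and b1,1; sum = a2,0b0,1 + a2,1b1,1
  PE(2,2): inputs a2,0 and b0,2; sum = a2,0b0,2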
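Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 6
PE contents:
  PE(0,0): finished; sum = a0,0b0,0 + a0,1b1,0 + a0,2b2,0
  PE(0,1): finished; sum = a0,0b0,1 + a0,1b1,1 + a0,2b2,1
  PE(0,2): finished; sum = a0,0b0,2 + a0,1b1,2 + a0,2b2,2
  PE(1,0): finished; sum = a1,0b0,0 + a1,1b1,0 + a1,2b2,0
  PE(1,1): finished; sum = a1,0b0,1 + a1,1b1,1 + a1,2b2,1
  PE(1,2): inputs a1,2 and b2,2; sum = a1,0b0,2 + a1,1b1,2 + a1,2b2,2
  PE(2,0): finished; sum = a2,0b0,0 + a2,1b1,0 + a2,2b2,0
  PE(2,1): inputs a2,2 and b2,1; sum = a2,0b0,1 + a2,1b1,1 + a2,2b2,1
  PE(2,2): inputs a2,1 and b1,2; sum = a2,0b0,2 + a2,1b1,2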
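Systolic Matrix Multiplication Illustrated with two 3x3 matrices
Time step 7 (all sums complete)
PE contents:
  PE(0,0): finished; sum = a0,0b0,0 + a0,1b1,0 + a0,2b2,0
  PE(0,1): finished; sum = a0,0b0,1 + a0,1b1,1 + a0,2b2,1
  PE(0,2): finished; sum = a0,0b0,2 + a0,1b1,2 + a0,2b2,2
  PE(1,0): finished; sum = a1,0b0,0 + a1,1b1,0 + a1,2b2,0
  PE(1,1): finished; sum = a1,0b0,1 + a1,1b1,1 + a1,2b2,1
  PE(1,2): finished; sum = a1,0b0,2 + a1,1b1,2 + a1,2b2,2
  PE(2,0): finished; sum = a2,0b0,0 + a2,1b1,0 + a2,2b2,0
  PE(2,1): finished; sum = a2,0b0,1 + a2,1b1,1 + a2,2b2,1
  PE(2,2): inputs a2,2 and b2,2; sum = a2,0b0,2 + a2,1b1,2 + a2,2b2,2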
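The 3x3 walkthrough can be reproduced with a small simulation. This is a sketch under stated assumptions, not a definitive implementation: matrices are square lists of lists, a values move one PE along their row per step, b values move one PE along their column per step, and row i and column j enter i and j steps late. The function name systolic_matmul and the register names are hypothetical.

```python
def systolic_matmul(a, b):
    """Simulate an n x n systolic array; return (product, number of time steps)."""
    n = len(a)
    acc = [[0] * n for _ in range(n)]        # running sum held by each PE
    a_reg = [[None] * n for _ in range(n)]   # a value currently inside each PE
    b_reg = [[None] * n for _ in range(n)]   # b value currently inside each PE
    steps = 3 * n - 2                        # PE (n-1, n-1) finishes last
    for t in range(steps):
        new_a = [[None] * n for _ in range(n)]
        new_b = [[None] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j > 0:
                    new_a[i][j] = a_reg[i][j - 1]    # pass a along the row
                else:
                    k = t - i                        # row i is delayed by i steps
                    new_a[i][j] = a[i][k] if 0 <= k < n else None
                if i > 0:
                    new_b[i][j] = b_reg[i - 1][j]    # pass b along the column
                else:
                    k = t - j                        # column j is delayed by j steps
                    new_b[i][j] = b[k][j] if 0 <= k < n else None
        a_reg, b_reg = new_a, new_b
        # Every PE that received both operands multiplies and then adds.
        for i in range(n):
            for j in range(n):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, steps


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
    product, steps = systolic_matmul(a, b)
    print(steps)    # 7 for n = 3
    print(product)
```

With this skew, the operands a[i][k] and b[k][j] meet in PE (i, j) at step i + j + k (0-based), which is why the 3x3 case needs seven steps.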
Systolic Algorithm for Matrix Multiplication
Another visualization is very useful.
- Problem: multiply two n x n matrices A = (a_ij) and B = (b_ij). The product matrix will be R = (r_ij).
- The systolic solution uses a 2D array with n x n cells, 2 input streams and 2 output streams.
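Operation at each cell
- Each cell updates at each time step: it multiplies its two inputs, adds the product to its running sum, and passes both inputs on to its neighbors.
- The running sum is initialized to 0.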
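A single cell's update can be sketched as follows. This assumes the usual multiply-then-add rule described earlier in the slides, with each cell forwarding its two inputs unchanged on its two output streams. The class name Cell, the attribute r, and the use of None for an empty slot are hypothetical choices for illustration.

```python
class Cell:
    """One cell of the array: two input streams, two output streams, one running sum."""

    def __init__(self):
        self.r = 0  # running sum, initialized to 0

    def step(self, a_in, b_in):
        """Process one time step and return (a_out, b_out) for the neighboring cells."""
        if a_in is not None and b_in is not None:
            self.r += a_in * b_in  # multiply, then add
        return a_in, b_in          # inputs are forwarded unchanged


if __name__ == "__main__":
    cell = Cell()
    for a_val, b_val in zip([1, 2, 3], [4, 5, 6]):
        cell.step(a_val, b_val)
    print(cell.r)  # 32
```

Skipping the update when either input is empty models the idle steps a cell spends waiting for the skewed streams to reach it.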
Systolic Matrix Multiplication
[Figure: 4x4 array of cells P11 ... P44. The entries b11 ... b44 of B and a11 ... a44 of A are fed into the array as skewed streams, padded with -- where no value is present.]
Data Flow for Systolic MM
Programming Issues
- Performance of systolic algorithms is based on fine granularity (one update costs about the same as one communication) and regular dataflow.
- Can be done on asynchronous platforms with tagging, but must ensure that idle time does not dominate computation.
- Many systolic algorithms do not map well to more general MIMD or distributed platforms.