Title: Further Improvements in Interconnect-Driven High-Level Synthesis of DFGs Using 2-Level Graph Isomorphism
Michele Santoro michele.santoro_at_dresd.org
Advisor: Donatella Sciuto
Co-advisor: Marco D. Santambrogio
2 Motivation
- Problem statement: interconnects have a great impact on circuit design
  - on area: interconnect size
  - on circuit latency: signal propagation
  - on power consumption: parasitic capacitance and intrinsic resistance
- Solution: decrease the number of interconnects.
- Focus on the single steps or tasks of the HLS process.
3 Innovation
- The innovative contribution of this thesis focuses on the scheduling and allocation phases.
- The thesis combines different techniques in an innovative way to reduce the synthesis cost:
  - Coloring
  - Resource Sharing
  - Interconnection Sharing
- The analysis of the scheduling and allocation problem is divided into two phases:
  - Static
  - Dynamic
4 Outline
- Introduction
- Scheduling and Allocation problem definition
- State of the Art
- Implementations
  - MR-LCS
  - Coloring Aware
  - PushDown algorithm
  - Best Resource
- Results
  - Benchmark
  - Random DFGs
- Conclusion and Future Work
5 Introduction
- As briefly mentioned so far, the VLSI design flow turns high-level specifications into an actual device.
- High-Level Synthesis is part of the VLSI design flow and consists of several steps, such as:
  - Scheduling
  - Allocation
  - Placement
  - Floorplanning
- Scheduling
  - Selects a control step for each operation.
  - Determines the number and type of resources to allocate.
- Allocation
  - Maps operations to functional units.
  - Determines the total number of resources of every kind, including multiplexers and registers.
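The two steps can be illustrated with a minimal sketch. It is not the MR-LCS algorithm of this thesis: it uses plain ASAP scheduling and a greedy binding, and the DFG, the unit latency of every operation, and all names (`DFG`, `asap_schedule`, `allocate`) are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical DFG: node -> (operation type, list of predecessor nodes).
# Assumption: every operation takes exactly one control step.
DFG = {
    "a": ("mul", []),
    "b": ("mul", []),
    "c": ("add", ["a", "b"]),
    "d": ("mul", []),
    "e": ("add", ["c", "d"]),
}


def asap_schedule(dfg):
    """Scheduling: give each operation the earliest control step its predecessors allow."""
    step = {}

    def visit(node):
        if node not in step:
            preds = dfg[node][1]
            step[node] = 1 + max((visit(p) for p in preds), default=0)
        return step[node]

    for node in dfg:
        visit(node)
    return step


def allocate(dfg, step):
    """Allocation: map each operation to a functional unit of its type.

    Operations of the same type in the same control step get distinct units;
    units are reused across steps.
    """
    used = defaultdict(int)  # (control step, op type) -> units already taken
    binding = {}
    for node in sorted(dfg, key=lambda n: (step[n], n)):
        op = dfg[node][0]
        binding[node] = f"{op}{used[(step[node], op)]}"
        used[(step[node], op)] += 1
    return binding


steps = asap_schedule(DFG)
binding = allocate(DFG, steps)
units = sorted(set(binding.values()))
print(steps)   # control step of each operation
print(units)   # functional units that must be allocated
```

In this example ASAP places all three multiplications in step 1, so three multipliers are allocated. Moving `d` to step 2, which its consumer `e` still allows, would need only two, which shows how the schedule determines the number of resources.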
6 Coloring Aware MR-LCS
- Starting point: MR-LCS (ALAP generalization)
- Improvement in 2 phases:
  - Coloring pre-processing phase
  - Scheduling together with allocation and binding
- Identifying isomorphic 2-level sub-graphs:
  - Join patterns
  - Split patterns
  - Linear patterns
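One way to find isomorphic 2-level sub-graphs is sketched below. The pattern definitions are assumptions, since the slides do not state them: a join is taken to be a node with two or more predecessors, a split a node with two or more successors, and a linear pattern a single producer-consumer edge. Sub-graphs are treated as isomorphic when they have the same shape and the same operation types. The DFG and all names are hypothetical.

```python
from collections import defaultdict

# Hypothetical DFG: operation type per node and a list of data edges.
OPS = {"a": "mul", "b": "mul", "c": "add",
       "d": "mul", "e": "mul", "f": "add", "g": "sub"}
EDGES = [("a", "c"), ("b", "c"), ("d", "f"), ("e", "f"), ("c", "g"), ("f", "g")]


def two_level_patterns(ops, edges):
    """Group 2-level sub-graphs by a signature that is equal for isomorphic ones."""
    preds, succs = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)

    groups = defaultdict(list)
    for node, op in ops.items():
        if len(preds[node]) >= 2:   # join: several producers, one consumer
            sig = ("join", tuple(sorted(ops[p] for p in preds[node])), op)
            groups[sig].append((tuple(sorted(preds[node])), node))
        if len(succs[node]) >= 2:   # split: one producer, several consumers
            sig = ("split", op, tuple(sorted(ops[s] for s in succs[node])))
            groups[sig].append((node, tuple(sorted(succs[node]))))
        if len(succs[node]) == 1 and len(preds[succs[node][0]]) == 1:
            child = succs[node][0]  # linear: a simple producer-consumer chain
            groups[("linear", op, ops[child])].append((node, child))
    return dict(groups)


patterns = two_level_patterns(OPS, EDGES)
# Signatures matched more than once are candidates for receiving the same color.
repeated = {sig: found for sig, found in patterns.items() if len(found) > 1}
print(repeated)
```

Here the two mul-mul-add joins rooted at `c` and `f` share a signature, so a coloring pre-processing phase could mark them with one color and try to bind them to the same resources and wires.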
7 Estimated Available Time
- Scheduling priority is given to colored sub-graphs.
- The availability of nodes needs to be estimated (Estimated Available Time, EAT).
[Figure: two diagrams with labels P1, P2, n, x, y and R1, R2, R3; EAT = 6, ST = 5. Annotation: "P2 generates an overlapping".]
8 Pushdown algorithm
- The Pushdown algorithm exploits the Safe Range to find the best solution when an overlap occurs.
- It also manages resource utilization better.
- E.g.:
[Figure: example with nodes n1 to n5 and u, shown in four stages: initial situation; start backward from R_L; schedule the operations; final situation.]
9 Best Resource algorithm
- The current state of the scheduling can be exploited by keeping a record of all the existing interconnections.
[Figure: diagram with labels n, P1, P2, R1 to R6, C3, C4.]
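The idea of recording the existing interconnections can be sketched as follows. This is only an illustration under assumptions, not the Best Resource algorithm itself: the selection rule (fewest new wires), the tie-breaking, and the names `existing_wires` and `best_resource` are hypothetical, and the resource labels are borrowed from the figure purely as identifiers.

```python
# Hypothetical state: wires already present in the partial datapath,
# stored as (source resource, destination resource) pairs.
existing_wires = {("R1", "R3"), ("R2", "R3"), ("R1", "R4")}


def best_resource(pred_resources, candidates, wires):
    """Pick the candidate that needs the fewest wires not already in `wires`."""
    def new_wires(candidate):
        return {(src, candidate) for src in pred_resources} - wires

    # min() keeps the first candidate on ties, so the list order breaks ties.
    best = min(candidates, key=lambda c: len(new_wires(c)))
    return best, new_wires(best)


# Node n reads from predecessors bound to R1 and R2.
# Assumption: R4 and R3 are both compatible with n and free in its control step.
choice, added = best_resource(["R1", "R2"], ["R4", "R3"], existing_wires)
existing_wires |= added  # record the interconnections created by this binding
print(choice, added)
```

Binding `n` to R3 reuses both wires that already reach R3, whereas R4 would require one new wire from R2.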
10 Results
- To validate the results, the algorithms have been applied to the Media Benchmark and to randomly generated DFGs.
- The captured costs have been divided into:
  - Direct costs: number of interconnections, number of resources
  - Indirect costs: number of registers, number of multiplexers, max fan-out
  - Derived costs: wire length, total area
11 Benchmark Results
- fft: fast discrete Fourier transform
- convolve: convolution of 2 functions
- jdmerge: used in reconstructing JPEG images
- getblk: a kernel service that manages buffers
                           Wires                           Resources
Benchmark    Nodes  Edges  MR-LCS  CA     CA_PD_BR  BR     MR-LCS  CA     CA_PD_BR  BR
fft2         11     9      6       7      5         5      4       5      4         4
fft1         17     12     12      9      9         9      13      8      6         6
convolve2    18     10     8       9      8         8      7       10     8         8
convolve1    23     18     14      16     14        16     9       9      11        11
getblk       33     29     16      22     20        21     9       9      9         9
convolve0    49     41     30      31     29        33     15      14     13        14
jdmerge      79     65     60      54     44        44     32      22     19        21
Avg          32.86  26.29  20.86   21.14  18.43     19.43  12.71   11.00  10.00     10.43
Improv. (%)                -       -1.37  11.64     6.85   -       13.48  21.35     17.98
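The improvement row is consistent with the relative reduction of each column total with respect to MR-LCS, that is, 100 * (MR-LCS total - variant total) / MR-LCS total. The short check below recomputes it from the benchmark rows above; the names `WIRES`, `RESOURCES`, and `improvement` are chosen for this example only.

```python
# Wire and resource counts per benchmark, copied from the table above.
# Row order: fft2, fft1, convolve2, convolve1, getblk, convolve0, jdmerge.
WIRES = {
    "MR-LCS":   [6, 12, 8, 14, 16, 30, 60],
    "CA":       [7, 9, 9, 16, 22, 31, 54],
    "CA_PD_BR": [5, 9, 8, 14, 20, 29, 44],
    "BR":       [5, 9, 8, 16, 21, 33, 44],
}
RESOURCES = {
    "MR-LCS":   [4, 13, 7, 9, 9, 15, 32],
    "CA":       [5, 8, 10, 9, 9, 14, 22],
    "CA_PD_BR": [4, 6, 8, 11, 9, 13, 19],
    "BR":       [4, 6, 8, 11, 9, 14, 21],
}


def improvement(table, baseline="MR-LCS"):
    """Percentage reduction of each column total relative to the baseline total."""
    base = sum(table[baseline])
    return {name: round(100 * (base - sum(column)) / base, 2)
            for name, column in table.items() if name != baseline}


wire_improvement = improvement(WIRES)
resource_improvement = improvement(RESOURCES)
print(wire_improvement)      # {'CA': -1.37, 'CA_PD_BR': 11.64, 'BR': 6.85}
print(resource_improvement)  # {'CA': 13.48, 'CA_PD_BR': 21.35, 'BR': 17.98}
```

A negative value, as for CA on wires, means the variant uses more wires than MR-LCS on these benchmarks.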
12 Random DFGs: Direct Costs
Wires
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           32      35      32               32               32               32
300          263     240     217              216              217              225
550          512     460     432              431              433              444
800          759     670     639              635              635              662
1050         1010    881     847              845              842              884
1300         1260    1091    1066             1054             1054             1103
1550         1505    1297    1260             1246             1234             1323
1800         1761    1519    1500             1478             1476             1548
Total        7102    6194    5993             5937             5923             6221
Improv. (%)  -       12.8    15.6             16.4             16.6             12.4
The tables show an improvement of about 17% in wire sharing and of about 60% in resource sharing.
Resources
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           15      15      10               10               9                10
300          101     101     54               53               55               62
550          179     179     99               98               100              117
800          263     263     139              134              139              169
1050         333     333     168              168              172              217
1300         412     412     207              206              212              257
1550         488     488     233              228              226              296
1800         579     579     299              273              277              353
Total        2368    2368    1209             1170             1190             1480
Improv. (%)  -       22.9    60.7             61.9             61.3             51.8
13 Random DFGs: Indirect Costs
Registers
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           29      30      27               27               27               29
300          172     163     133              133              132              132
550          303     288     232              234              228              236
800          438     419     323              325              320              332
1050         580     544     407              411              398              423
1300         721     679     512              510              497              533
1550         841     798     588              605              581              628
1800         986     920     684              695              673              726
Total        4069    3841    2907             2941             2856             3039
Improv. (%)  -       5.6     28.6             27.7             29.8             25.3
Multiplexers
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           6       7       6                6                6                7
300          38      40      26               27               27               27
550          68      67      45               45               45               48
800          101     93      61               60               59               65
1050         134     123     74               72               73               82
1300         170     155     93               92               88               104
1550         194     180     104              106              102              118
1800         233     209     123              122              119              142
Total        943     873     531              530              519              593
Improv. (%)  -       7.5     43.6             43.8             44.9             37
Max Fan-out
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           9       8       7                7                7                7
300          43      29      25               24               25               27
550          48      32      32               31               33               33
800          56      37      36               36               36               40
1050         62      40      42               40               40               45
1300         65      44      45               44               44               48
1550         73      47      49               46               47               53
1800         71      49      51               49               48               55
Total        426     286     286              276              281              307
Improv. (%)  -       33      32.9             35.3             34.2             28
14 Random DFGs: Derived Costs
Wire Length
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           327     476     258              256              264              328
300          7225    4919    4859             5061             4425             5530
550          19195   16434   11289            11785            13346            13547
800          33625   23352   18485            18398            18506            24278
1050         53617   44911   30835            30872            29861            39128
1300         94362   47793   35643            34661            35451            41891
1550         101569  75465   53907            52420            53373            69657
1800         146467  90954   66617            64530            63814            83573
Total        456387  304304  221893           217983           219040           277932
Improv. (%)  -       33.3    51.4             52.2             52.0             39.1
Total Area
Nodes        MR-LCS  CA      CA_PD_BR(a=0.0)  CA_PD_BR(a=0.5)  CA_PD_BR(a=1.0)  BR
50           1344    1360    816              864              832              928
300          11680   8624    5376             5376             5760             6912
550          22672   15232   10752            10240            10816            12672
800          33792   23296   13376            14976            13728            17920
1050         47728   31520   20672            19712            18816            24960
1300         57792   33600   18816            19712            19712            20800
1550         67936   46144   26496            28288            25920            33408
1800         81312   55040   32000            28800            28800            34800
Total        324256  214816  128304           127968           124384           152400
Improv. (%)  -       33.8    60.4             60.5             61.6             53.0
The tables show a reduction of about 52% in total wire length and of about 60% in total area.
15 Conclusions and Future Work
- In this Master's thesis project, simple considerations have been used.
- The proposed algorithms have been shown to perform better than standard MR-LCS, achieving:
  - up to 17% improvement in interconnection sharing
  - around 68% improvement in resource sharing
  - a reduction of around 64% in overall cost
- Future work:
  - Recognize and exploit different topological patterns.
  - Multi-coloring pre-processing.
  - Reiterate the solution through the algorithm. This allows further improvements, because the algorithm will be aware of the upper bound of the solution.
16 Questions?