Further Improvements in Interconnect-Driven High-Level Synthesis of DFGs Using 2-Level Graph Isomorphism

1
Further Improvements in Interconnect-Driven High-Level Synthesis of DFGs Using 2-Level Graph Isomorphism
Michele Santoro (michele.santoro_at_dresd.org)
Advisor: Donatella Sciuto
Co-advisor: Marco D. Santambrogio
2
Motivation
  • Problem statement: interconnects have a great impact on circuit design
  • on area: interconnect size
  • on circuit latency: signal propagation
  • on power consumption: parasitic capacitance and intrinsic resistance
  • Solution: decrease the number of interconnects.
  • Focus on the individual steps (tasks) of the HLS process.

HLS
3
Innovation
  • The innovative contribution of this thesis concerns the scheduling and allocation phases.
  • The thesis combines different techniques in a novel way to reduce the synthesis cost:
  • Coloring
  • Resource sharing
  • Interconnection sharing
  • The analysis of the scheduling and allocation problem is divided into two phases:
  • Static
  • Dynamic
4
Outline
  • Introduction
  • Scheduling and Allocation problem definition
  • State of the Art
  • Implementations
  • MR-LCS
  • Coloring Aware
  • PushDown algorithm
  • Best Resource
  • Results
  • Benchmark
  • Random DFGs
  • Conclusion and Future Work

5
Introduction
  • The VLSI design flow turns a high-level specification into an actual device.
  • High-Level Synthesis (HLS) is part of the VLSI design flow and consists of several steps, such as:
  • Scheduling
  • Allocation
  • Placement
  • Floorplanning

Scheduling
Selects a control step for each operation.
Determines the number and type of resources to allocate.
Allocation
Maps operations to functional units.
Determines the total number of resources of every kind, including multiplexers and registers.
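
To make "selects a control step for each operation" concrete, here is a minimal ALAP (as-late-as-possible) scheduling sketch. It is not the thesis implementation: the function name `alap_schedule`, the unit latency of every operation, and the small example DFG are all assumptions made for illustration.

```python
def alap_schedule(successors, latency_bound):
    """Return the latest feasible control step (1-based) of every operation.

    Assumptions (not from the slides): each operation takes one control
    step, and `successors` maps a node to the nodes that consume its result.
    """
    # Collect every node that appears as a producer or as a consumer.
    nodes = set(successors)
    for targets in successors.values():
        nodes.update(targets)

    alap = {}

    def latest(node):
        if node not in alap:
            succ = successors.get(node, [])
            if not succ:
                # A sink can wait until the last allowed control step.
                alap[node] = latency_bound
            else:
                # Otherwise it must finish one step before its most urgent consumer.
                alap[node] = min(latest(s) for s in succ) - 1
        return alap[node]

    for node in nodes:
        latest(node)
    if alap and min(alap.values()) < 1:
        raise ValueError("latency bound too small for this DFG")
    return alap


# Hypothetical DFG: a and b feed c, c feeds d, e is independent.
dfg = {"a": ["c"], "b": ["c"], "c": ["d"], "e": []}
schedule = alap_schedule(dfg, latency_bound=3)
print(schedule)
```

A resource-aware scheduler would additionally limit how many operations of each type may share a control step; this sketch only computes the latest time bound for each node.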
6
Coloring Aware MR-LCS
  • Starting point: MR-LCS (a generalization of ALAP)
  • Improvement in 2 phases:
  • Coloring pre-processing phase
  • Scheduling together with allocation and binding
  • Identifying isomorphic 2-level sub-graphs:
  • Join patterns
  • Split patterns
  • Linear patterns
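
The slides do not define the three patterns precisely, so the following is only a sketch under stated assumptions: a join is taken to be several producers feeding one consumer, a split one producer feeding several consumers, and a linear pattern a single producer feeding a consumer that has no other input. Two sub-graphs are treated as isomorphic when they have the same pattern kind and the same operation types. The function name `two_level_patterns` and the example DFG are hypothetical.

```python
from collections import defaultdict


def two_level_patterns(op_type, edges):
    """Group 2-level sub-graphs by (pattern kind, operation types).

    op_type: dict mapping node name to operation type, e.g. "mul".
    edges:   list of (producer, consumer) pairs.
    Returns only the signatures that occur at least twice, since those are
    the candidates that could receive the same color.
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)

    groups = defaultdict(list)
    for node in op_type:
        if len(preds[node]) >= 2:
            # Join (assumed meaning): several producers, one consumer.
            sig = ("join", tuple(sorted(op_type[p] for p in preds[node])), op_type[node])
            groups[sig].append((tuple(sorted(preds[node])), node))
        if len(succs[node]) >= 2:
            # Split (assumed meaning): one producer, several consumers.
            sig = ("split", op_type[node], tuple(sorted(op_type[s] for s in succs[node])))
            groups[sig].append((node, tuple(sorted(succs[node]))))
        if len(succs[node]) == 1 and len(preds[succs[node][0]]) == 1:
            # Linear (assumed meaning): a one-to-one producer/consumer chain.
            child = succs[node][0]
            groups[("linear", op_type[node], op_type[child])].append((node, child))

    return {sig: members for sig, members in groups.items() if len(members) >= 2}


# Hypothetical DFG with two multiply-multiply-add joins.
ops = {"m1": "mul", "m2": "mul", "a1": "add",
       "m3": "mul", "m4": "mul", "a2": "add", "s1": "sub"}
dfg_edges = [("m1", "a1"), ("m2", "a1"), ("m3", "a2"), ("m4", "a2"), ("a2", "s1")]
patterns = two_level_patterns(ops, dfg_edges)
print(patterns)
```

In this example the two joins share one signature and would be colored alike, while the single add-to-sub chain is dropped because it has no isomorphic twin.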

7
Estimated Available Time
  • Scheduling priority is given to colored sub-graphs.
  • The availability of nodes must therefore be estimated.

[Figure: Estimated Available Time (EAT) example with nodes P1, P2, n, x, y and resources R1, R2, R3. Labels in the figure: EAT = 6, ST = 5; "P2 generates an overlap"; case 1: EAT(n) = 8; case 0: EAT(n) = ALAP(n).]
8
Pushdown algorithm
  • The Pushdown algorithm exploits the Safe Range to find the best solution when an overlap occurs.
  • It also manages resource utilization better.
  • Example:

[Figure: Pushdown example on operations n1 to n5. Panels: initial situation; start backward from R_L; schedule the operations; final situation.]
9
Best Resource algorithm
  • The current state of the schedule can be exploited by keeping a record of all existing interconnections.

[Figure: Best Resource example with node n, nodes P1 and P2, resources R1 to R6, and labels C3 and C4.]
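
One way to read "keeping a record of all existing interconnections" is sketched below: among the candidate units, pick the one that would require the fewest new wires given the wires already recorded. This is an illustrative assumption, not the selection rule from the thesis; the function `best_resource`, its parameters, and the example wiring are hypothetical.

```python
def best_resource(candidates, sources, destinations, wires):
    """Pick the candidate unit that adds the fewest new interconnections.

    candidates:   units able to execute the operation.
    sources:      units producing the operation's inputs.
    destinations: units consuming the operation's output.
    wires:        set of (from_unit, to_unit) pairs that already exist.
    """
    def new_wires(unit):
        needed = {(src, unit) for src in sources} | {(unit, dst) for dst in destinations}
        # Wires that are already recorded can be shared at no extra cost.
        return len(needed - wires)

    # Ties are broken by unit name to keep the choice deterministic.
    return min(candidates, key=lambda unit: (new_wires(unit), unit))


# Hypothetical state: R3 is already wired to both input units, R4 to only one.
existing = {("R1", "R3"), ("R2", "R3"), ("R1", "R4")}
choice = best_resource(["R3", "R4"], sources=["R1", "R2"],
                       destinations=["R5"], wires=existing)
print(choice)  # R3: one new wire (R3 -> R5) instead of two for R4
```

After binding, the newly created wires would be added to the record so that later decisions can reuse them.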
10
Results
  • To validate the results, the algorithms have been applied to Media Benchmark and to randomly generated DFGs.
  • The captured costs are divided into:
  • Direct costs: number of interconnections, number of resources
  • Indirect costs: number of registers, number of multiplexers, max fan-out
  • Derived costs: wire length, total area
11
Benchmark Results
  • fft: fast Fourier transform (fast computation of the discrete Fourier transform)
  • convolve: convolution of two functions
  • jdmerge: used in reconstructing JPEG images
  • getblk: a kernel service that manages buffers

                           Wires                         Resources
Benchmark    Nodes Edges   MR-LCS    CA CA_PD_BR    BR   MR-LCS    CA CA_PD_BR    BR
fft2            11     9        6     7        5     5        4     5        4     4
fft1            17    12       12     9        9     9       13     8        6     6
convolve2       18    10        8     9        8     8        7    10        8     8
convolve1       23    18       14    16       14    16        9     9       11    11
getblk          33    29       16    22       20    21        9     9        9     9
convolve0       49    41       30    31       29    33       15    14       13    14
jdmerge         79    65       60    54       44    44       32    22       19    21
Avg          32.86 26.29    20.86 21.14    18.43 19.43    12.71 11.00    10.00 10.43
Improv. (%)                       -1.37    11.64  6.85          13.48    21.35 17.98
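
The "Improv." row is consistent with a relative reduction against the MR-LCS baseline, 100 × (baseline − value) / baseline. The formula itself is not stated on the slide, so treat it as an inferred assumption; the short check below reproduces the wire figures from the per-benchmark columns.

```python
def improvement(baseline, value):
    """Relative reduction versus the baseline, in percent (assumed formula)."""
    return round(100.0 * (baseline - value) / baseline, 2)


# Wire counts per benchmark, copied from the table (fft2 ... jdmerge).
wires = {
    "MR-LCS":   [6, 12, 8, 14, 16, 30, 60],
    "CA":       [7, 9, 9, 16, 22, 31, 54],
    "CA_PD_BR": [5, 9, 8, 14, 20, 29, 44],
    "BR":       [5, 9, 8, 16, 21, 33, 44],
}

baseline = sum(wires["MR-LCS"])
results = {name: improvement(baseline, sum(values))
           for name, values in wires.items() if name != "MR-LCS"}
print(results)  # {'CA': -1.37, 'CA_PD_BR': 11.64, 'BR': 6.85}
```

The negative value for CA means that coloring alone uses slightly more wires than MR-LCS on these benchmarks.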
12
Random DFGs Direct Costs
Wires
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50                32      35        32        32        32      32
300              263     240       217       216       217     225
550              512     460       432       431       433     444
800              759     670       639       635       635     662
1050            1010     881       847       845       842     884
1300            1260    1091      1066      1054      1054    1103
1550            1505    1297      1260      1246      1234    1323
1800            1761    1519      1500      1478      1476    1548
Total           7102    6194      5993      5937      5923    6221
Improv. (%)             12.8      15.6      16.4      16.6    12.4

The table shows an improvement of about 17% for wire sharing.

Resources
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50                15      15        10        10         9      10
300              101     101        54        53        55      62
550              179     179        99        98       100     117
800              263     263       139       134       139     169
1050             333     333       168       168       172     217
1300             412     412       207       206       212     257
1550             488     488       233       228       226     296
1800             579     579       299       273       277     353
Total           2368    2368      1209      1170      1190    1480
Improv. (%)             22.9      60.7      61.9      61.3    51.8

The table shows an improvement of about 60% for resource sharing.
13
Random DFGs Indirect Costs
Registers
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50                29      30        27        27        27      29
300              172     163       133       133       132     132
550              303     288       232       234       228     236
800              438     419       323       325       320     332
1050             580     544       407       411       398     423
1300             721     679       512       510       497     533
1550             841     798       588       605       581     628
1800             986     920       684       695       673     726
Total           4069    3841      2907      2941      2856    3039
Improv. (%)              5.6      28.6      27.7      29.8    25.3

Multiplexers
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50                 6       7         6         6         6       7
300               38      40        26        27        27      27
550               68      67        45        45        45      48
800              101      93        61        60        59      65
1050             134     123        74        72        73      82
1300             170     155        93        92        88     104
1550             194     180       104       106       102     118
1800             233     209       123       122       119     142
Total            943     873       531       530       519     593
Improv. (%)              7.5      43.6      43.8      44.9      37

Max Fan-out
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50                 9       8         7         7         7       7
300               43      29        25        24        25      27
550               48      32        32        31        33      33
800               56      37        36        36        36      40
1050              62      40        42        40        40      45
1300              65      44        45        44        44      48
1550              73      47        49        46        47      53
1800              71      49        51        49        48      55
Total            426     286       286       276       281     307
Improv. (%)               33      32.9      35.3      34.2      28
14
Random DFGs Derived Costs
Wire Length
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50               327     476       258       256       264     328
300             7225    4919      4859      5061      4425    5530
550            19195   16434     11289     11785     13346   13547
800            33625   23352     18485     18398     18506   24278
1050           53617   44911     30835     30872     29861   39128
1300           94362   47793     35643     34661     35451   41891
1550          101569   75465     53907     52420     53373   69657
1800          146467   90954     66617     64530     63814   83573
Total         456387  304304    221893    217983    219040  277932
Improv. (%)             33.3      51.4      52.2      52.0    39.1

Total Area
Nodes         MR-LCS      CA  CA_PD_BR  CA_PD_BR  CA_PD_BR      BR
                                 a=0.0     a=0.5     a=1.0
50              1344    1360       816       864       832     928
300            11680    8624      5376      5376      5760    6912
550            22672   15232     10752     10240     10816   12672
800            33792   23296     13376     14976     13728   17920
1050           47728   31520     20672     19712     18816   24960
1300           57792   33600     18816     19712     19712   20800
1550           67936   46144     26496     28288     25920   33408
1800           81312   55040     32000     28800     28800   34800
Total         324256  214816    128304    127968    124384  152400
Improv. (%)             33.8      60.4      60.5      61.6    53.0

The tables show a reduction of about 52% in total wire length and of about 60% in total area.
15
Conclusions and Future Works
  • This Master's thesis project relies on simple considerations.
  • The proposed algorithms have been shown to perform better than standard MR-LCS, achieving:
  • up to 17% improvement in interconnection sharing
  • around 68% improvement in resource sharing
  • a reduction of around 64% in overall cost
  • Future work
  • Recognize and exploit different topological patterns.
  • Multi-coloring pre-processing.
  • Feed the solution back through the algorithm iteratively.
  • This allows further improvements, because the algorithm becomes aware of the upper bound given by the previous solution.

16
Questions?