Modulo Graph Embedding: Mapping Applications onto Coarse-Grained Reconfigurable Architectures

Slides: 23

Provided by: fank

Learn more at: https://cccp.eecs.umich.edu

Transcript and Presenter's Notes


1
Modulo Graph Embedding: Mapping Applications onto Coarse-Grained Reconfigurable Architectures

Hyunchul Park, Kevin Fan, Manjunath Kudlur, Scott Mahlke

Advanced Computer Architecture Lab, University of Michigan
2
Coarse-Grained Reconfigurable Architecture (CGRA)
[Figure: structure of a processing element (PE), with blocks labeled Config, FU, and LRF]

Array of processing elements (PEs) connected by a mesh-like interconnect
Characterized by array size, node functionalities, interconnect, and register file configurations
Executes compute-intensive kernels in multimedia applications

3
CGRA: Attractive Alternative to ASICs

Suitable for running multimedia applications on embedded systems
High computation throughput
Low power consumption and scalability
High flexibility with fast configuration
MorphoSys: 8x8 array with RISC processor; SIMD-style execution of loops
PipeRench: 1-D reconfigurable hardware; virtualizes the hardware pipeline
ADRES: 8x8 array with tightly coupled VLIW; modulo scheduling with simulated annealing

4
Scheduling in CGRA

Different from conventional VLIW
Sparse interconnect and distributed register files
No dedicated routing resources
Needs a good compiler to exploit the abundance of computing resources

[Figure: a CGRA (FU0 to FU3, each with its own LRF) next to a conventional VLIW (FU0 to FU3 sharing a central RF)]
5
Objectives of This Work

Modulo scheduling technique for CGRAs
Exploit loop-level parallelism by overlapping execution of iterations
Targeting low-cost CGRAs
Achieve quality schedules under the restrictions of the hardware
Fast compilation time

6
Modulo Scheduling Basics

Expose loop-level parallelism by overlapping execution of iterations
Initiation interval (II)
A new iteration is started every II cycles

[Figure: overlapped execution of loop iterations, with consecutive iterations offset by II]
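The effect of the initiation interval can be illustrated with a small sketch. The function names and the `schedule_length` parameter (the latency of one iteration) are assumptions introduced here for illustration; they do not come from the slides.

```python
from __future__ import annotations


def iteration_start_cycles(num_iterations: int, ii: int) -> list[int]:
    """Cycle at which each loop iteration starts when one starts every `ii` cycles."""
    return [i * ii for i in range(num_iterations)]


def total_cycles(num_iterations: int, ii: int, schedule_length: int) -> int:
    """Cycles needed to finish all iterations with overlapped execution.

    `schedule_length` is the latency of a single iteration (assumed input).
    """
    if num_iterations == 0:
        return 0
    # The last iteration starts at (n - 1) * II and then runs to completion.
    return (num_iterations - 1) * ii + schedule_length


starts = iteration_start_cycles(4, ii=2)
overlapped = total_cycles(4, ii=2, schedule_length=6)
sequential = 4 * 6  # no overlap: each iteration waits for the previous one
print(starts, overlapped, sequential)
```

With 4 iterations of length 6 and II = 2, the overlapped loop takes 12 cycles instead of 24, which is why a smaller II means higher throughput.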
7
Modulo Scheduling for CGRA

Mapping the data-flow graph (DFG) onto a 3-D scheduling space
Limited number of scheduling slots: (number of PEs) x II
Minimize routing cost (number of slots used for routing)
Sparse interconnect and distributed register files
Ensure routability of operands

[Figure: scheduling space of a 4x4 CGRA, with a time axis of length II]
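A minimal sketch of such a scheduling space follows, assuming the usual modulo-scheduling convention that an operation scheduled at cycle t occupies the slot at t mod II on its PE. The class `ModuloSchedulingSpace` and its methods are hypothetical names, not part of the presented tool.

```python
class ModuloSchedulingSpace:
    """Slots of a rows x cols PE array over II cycles (illustrative only)."""

    def __init__(self, rows, cols, ii):
        self.rows, self.cols, self.ii = rows, cols, ii
        self.slots = {}  # (row, col, time mod II) -> operation name

    @property
    def capacity(self):
        # Number of scheduling slots: (number of PEs) x II.
        return self.rows * self.cols * self.ii

    def place(self, op, row, col, time):
        """Try to put `op` on PE (row, col) at cycle `time`; False if the slot is taken."""
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError("PE coordinates outside the array")
        key = (row, col, time % self.ii)  # the time axis wraps around at II
        if key in self.slots:
            return False
        self.slots[key] = op
        return True


space = ModuloSchedulingSpace(rows=4, cols=4, ii=2)
first = space.place("a", 0, 0, time=0)
clash = space.place("b", 0, 0, time=2)  # 2 mod 2 == 0, same slot as "a"
other = space.place("b", 0, 0, time=3)  # 3 mod 2 == 1, a free slot
print(space.capacity, first, clash, other)
```

Because the time axis wraps at II, an operation at cycle 2 collides with one at cycle 0 on the same PE: they would execute in the same cycle once iterations overlap.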
8
Our Approach

Systematic approach to generate a good schedule in reasonable time
Minimize routing cost
Convert the scheduling problem into graph embedding
Leverage a graph embedding algorithm
Ensure routability of operands
Skewed scheduling space
Create a narrow but tall scheduling space

9
1. Minimize Routing Cost

Routing cost: number of PEs used for routing
Determined by positions of producer and consumer
Minimize distance between producers and consumers
Height-based list scheduling
Schedule operations in order of dependence height
Place consumers close to producers
Need to carefully place operations at the same height
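The height-based ordering above can be sketched as follows. The DFG is a hypothetical example, and defining height as the longest path to a sink (with sinks at height 1) is an assumption; the slides do not spell out the exact definition.

```python
from functools import lru_cache

# Hypothetical DFG: each operation maps to the list of its consumers.
DFG = {0: [4], 1: [4], 2: [5], 3: [5], 4: [6], 5: [6], 6: []}


def dependence_heights(consumers):
    """Height of each op: length of its longest path to a sink (sink = 1)."""

    @lru_cache(maxsize=None)
    def height(op):
        return 1 + max((height(c) for c in consumers[op]), default=0)

    return {op: height(op) for op in consumers}


def list_schedule_order(consumers):
    """Operations sorted by decreasing height; ties broken by operation id."""
    heights = dependence_heights(consumers)
    return sorted(consumers, key=lambda op: (-heights[op], op))


heights = dependence_heights(DFG)
order = list_schedule_order(DFG)
print(heights, order)
```

Operations 0 to 3 all share height 3 here, so the ordering alone does not say where to place them relative to each other; that tie is what the slide's last bullet refers to.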

10
Scheduling Example: Routing Cost
[Figure: a DFG with operations 0 to 6 scheduled in two ways on a 1x4 CGRA (PE 0 to PE 3, time 0 to 3); one schedule has routing cost 2, the other routing cost 0]
Common consumer information is important!
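To make the idea of routing cost concrete, here is a rough sketch under a simplified model: on a 1xN array, neighboring PEs exchange values directly, and every intermediate PE a value must cross counts as one routing slot. The DFG and both placements are invented for illustration and are not a reconstruction of the slide's figure; timing is ignored.

```python
# Hypothetical DFG: each operation maps to the list of its consumers.
DFG = {0: [4], 1: [4], 2: [5], 3: [5], 4: [6], 5: [6], 6: []}


def edge_routing_cost(producer_pe, consumer_pe):
    """Intermediate PEs between producer and consumer on a 1-D array (assumed model)."""
    return max(0, abs(producer_pe - consumer_pe) - 1)


def routing_cost(dfg, pe_of):
    """Sum of per-edge routing costs for a placement `pe_of` (op -> PE index)."""
    return sum(
        edge_routing_cost(pe_of[producer], pe_of[consumer])
        for producer, consumers in dfg.items()
        for consumer in consumers
    )


# Operations 0 and 1 feed the same consumer (4) but sit at opposite ends.
scattered = {0: 0, 1: 3, 2: 1, 3: 2, 4: 0, 5: 1, 6: 0}
# Operations with a common consumer sit next to each other.
clustered = {0: 0, 1: 1, 2: 2, 3: 3, 4: 1, 5: 2, 6: 1}

cost_scattered = routing_cost(DFG, scattered)
cost_clustered = routing_cost(DFG, clustered)
print(cost_scattered, cost_clustered)
```

In this toy case the only difference between the two placements is whether producers that share a consumer are neighbors, which is the information the affinity heuristic tries to capture.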
11
Affinity Graph Heuristic

Consider placement of operations with the same height together
Use common consumer information
Affinity value between operations
Measured by the distance to common consumers in the DFG
Construct affinity graph
Nodes: operations; edges: affinity values
Place operations with affinity edges close to each other
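One possible way to build such an affinity graph is sketched below. The weighting (1 divided by the distance to the nearest common consumer) is an assumption chosen for simplicity, not the formula used in the presented work, and the DFG is a hypothetical example.

```python
from collections import deque
from itertools import combinations

# Hypothetical DFG: each operation maps to the list of its consumers.
DFG = {0: [4], 1: [4], 2: [5], 3: [5], 4: [6], 5: [6], 6: []}


def distances_from(op, consumers):
    """Breadth-first distance from `op` to every operation that depends on it."""
    dist = {op: 0}
    queue = deque([op])
    while queue:
        current = queue.popleft()
        for nxt in consumers[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    del dist[op]  # keep descendants only
    return dist


def affinity_graph(ops, consumers):
    """Edges (a, b) -> affinity for ops that share a direct or indirect consumer."""
    reach = {op: distances_from(op, consumers) for op in ops}
    edges = {}
    for a, b in combinations(ops, 2):
        common = reach[a].keys() & reach[b].keys()
        if common:
            # Distance to the nearest common consumer (assumed metric).
            d = min(max(reach[a][c], reach[b][c]) for c in common)
            edges[(a, b)] = 1.0 / d  # closer common consumer, higher affinity
    return edges


same_height_ops = [0, 1, 2, 3]
affinity = affinity_graph(same_height_ops, DFG)
print(affinity)
```

Here operations 0 and 1 (and likewise 2 and 3) share an immediate consumer and get affinity 1.0, while pairs that only meet at operation 6 get 0.5, so a placer would try to keep the first kind of pair adjacent.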

12
Affinity Graph Example
[Figure: a DFG with operations 0 to 5 at heights 3, 2, and 1; its affinity graph; and a bad and a good mapping onto a 2x4 CGRA]
Drawing the affinity graph onto the scheduling space
13
Leveraging Graph Embedding