Title: A High Performance Application Representation for Reconfigurable Systems
1. A High Performance Application Representation for Reconfigurable Systems
- Wenrui Gong, Gang Wang, Ryan Kastner
- Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106-9560
- {gong, wanggang, kastner}@ece.ucsb.edu
- http://express.ece.ucsb.edu
- June 22, 2004
2. Outline
- Reconfigurable computing systems
- Compilation process
- Synthesizing to hardware
- Experimental results
- Concluding remarks
3. Outline
- Reconfigurable computing systems
  - Reconfigurable computing systems
  - Challenges of application representations
- Compilation process
- Synthesizing to hardware
- Experimental results
- Concluding remarks
4. Reconfigurable Computing Systems
- Standard programmable platforms
- Post-manufacturing customization
- Designs shift from physical chips to configuration files
- A software design flow
- Feature hardware speed with software flexibility
- Enable higher productivity
5. Application Representations
- A common application representation is needed to tame the complexity of system synthesis
- Requirements
  - Able to generate software code for microprocessors
  - Able to be easily translated to hardware configuration files
  - Allows a variety of transformations and optimizations to exploit performance
6. Parallelism Exploration
- Fine grain parallelism
  - Multiple functional units
  - Issuing an operation to a free functional unit
  - Operations executed independently
- Coarse grain parallelism
  - Executing multiple threads
  - With occasional synchronization
- Reconfigurable computing systems support both fine and coarse grain parallelism
7. PDG+SSA
- The PDG+SSA representation can be used for both hardware synthesis and software generation
- The PDG and SSA forms are common representations for software generation
- Here we concentrate on hardware synthesis
8. Outline
- Reconfigurable computing systems
- Compilation process
  - Overview
  - Constructing the PDG
  - Incorporating the SSA form
- Synthesizing to hardware
- Experimental results
- Concluding remarks
9. Overview
10. Program Dependence Graph
- PDG: Program Dependence Graph
- ENTRY node: the root node of a PDG
- PREDICATE nodes: produce predicate values from expressions
  - Diamond-shaped nodes 2, 3, and 4
- STATEMENTS nodes: an arbitrary set of operations
  - Circle nodes 1, 4, 6, 7, and 8
- REGION nodes: summarize all operations with the same control conditions
  - House-shaped nodes R2, R3, R4
  - R3: the predicate value of node 2 is True
- Edges represent dependencies
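As a rough illustration of the node kinds listed above, the following Python sketch models a PDG as typed nodes joined by control-dependence and data-dependence edges. The names `Kind`, `Node`, `PDG`, `add_control_edge`, and `controlled_by` are hypothetical and not taken from the authors' implementation, and the small graph is a made-up `if (p) s1 else s2` example, not the graph shown on the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    ENTRY = "entry"
    PREDICATE = "predicate"
    STATEMENTS = "statements"
    REGION = "region"


@dataclass
class Node:
    name: str
    kind: Kind
    # Control-dependence successors as (child name, label). The label is
    # True/False on edges leaving a PREDICATE node and None otherwise.
    control: list = field(default_factory=list)
    # Data-dependence successors: names of nodes that use a value defined here.
    data: list = field(default_factory=list)


class PDG:
    def __init__(self):
        self.nodes = {}

    def add(self, name, kind):
        self.nodes[name] = Node(name, kind)
        return self.nodes[name]

    def add_control_edge(self, src, dst, label=None):
        self.nodes[src].control.append((dst, label))

    def add_data_edge(self, src, dst):
        self.nodes[src].data.append(dst)

    def controlled_by(self, region):
        """Names of the nodes grouped under the given REGION node."""
        return [dst for dst, _ in self.nodes[region].control]


# Hypothetical example: if (p) { s1 } else { s2 }
g = PDG()
g.add("ENTRY", Kind.ENTRY)
g.add("p", Kind.PREDICATE)
g.add("R_true", Kind.REGION)
g.add("R_false", Kind.REGION)
g.add("s1", Kind.STATEMENTS)
g.add("s2", Kind.STATEMENTS)
g.add_control_edge("ENTRY", "p")
g.add_control_edge("p", "R_true", True)
g.add_control_edge("p", "R_false", False)
g.add_control_edge("R_true", "s1")
g.add_control_edge("R_false", "s2")
```

Grouping statements under REGION nodes means that everything sharing one control condition can be found with a single lookup, which is the property the slide attributes to REGION nodes.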
11. Constructing the PDG from the CDFG
- Implemented based on Ferrante's algorithm
- Using the post-dominator tree

    val = pred;
    for (i = 0; i < len; i++) {
        val += diff;
        if (val > 32767) val = 32767;
        else if (val < -32768) val = -32768;
    }
    return val;
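The slide names the post-dominator tree as the basis for building the PDG. A minimal Python sketch of that idea follows: it computes post-dominator sets with a simple iterative method, derives immediate post-dominators, and then walks up the post-dominator tree from each branch target to collect control dependences. The CFG and its node names (`loop`, `add`, `if_hi`, and so on) are an assumed encoding of the clamp loop above, and the code is a textbook-style illustration rather than the authors' implementation. It also omits the extra ENTRY-to-EXIT edge that some formulations add.

```python
def post_dominators(succ, exit_node):
    """Iteratively compute the post-dominator set of every CFG node.

    Assumes every node other than exit_node has at least one successor
    and can reach exit_node.
    """
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def immediate_post_dominators(pdom, exit_node):
    ipdom = {}
    for n, doms in pdom.items():
        if n == exit_node:
            continue
        strict = doms - {n}
        # The immediate post-dominator is the strict post-dominator whose
        # own post-dominator set is exactly the strict set of n.
        ipdom[n] = next(d for d in strict if pdom[d] == strict)
    return ipdom


def control_dependences(succ, exit_node):
    pdom = post_dominators(succ, exit_node)
    ipdom = immediate_post_dominators(pdom, exit_node)
    cd = {n: set() for n in succ}
    for a in succ:
        for b in succ[a]:
            # Every node on the tree path from b up to (but excluding)
            # ipdom(a) is control dependent on a.
            runner = b
            while runner != ipdom.get(a):
                cd[runner].add(a)
                runner = ipdom[runner]
    return cd


# Assumed CFG for the clamp loop (node names are illustrative only).
cfg = {
    "entry": ["loop"],            # val = pred; i = 0
    "loop": ["add", "ret"],       # i < len ?
    "add": ["if_hi"],             # val += diff
    "if_hi": ["set_hi", "if_lo"], # val > 32767 ?
    "set_hi": ["inc"],            # val = 32767
    "if_lo": ["set_lo", "inc"],   # val < -32768 ?
    "set_lo": ["inc"],            # val = -32768
    "inc": ["loop"],              # i++
    "ret": ["exit"],              # return val
    "exit": [],
}
cd = control_dependences(cfg, "exit")
```

In this encoding, `cd["set_hi"]` contains only `if_hi`, while the loop body nodes depend on the loop test, which is the grouping a REGION node would summarize.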
12. Constructing the PDG (cont'd)
13. The Static Single Assignment Form
- Each variable has exactly one assignment
- A variable is always referenced using the same name
- At join points of control flow, special Φ-nodes are inserted.

Original code:

    val += diff;
    if (val > 32767) val = 32767;
    else if (val < -32768) val = -32768;

SSA form:

    val_2 = val_1 + diff;
    if (val_2 > 32767) val_3 = 32767;
    else if (val_2 < -32768) val_4 = -32768;
    val_5 = phi(val_2, val_3, val_4);
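To make the renaming concrete, here is a small Python sketch that runs the clamp fragment both in its original form and in an SSA-style form where each name is assigned once and the join is an explicit selection. The function names `clamp_original` and `clamp_ssa` and the predicate names `p_hi` and `p_lo` are introduced here for illustration only.

```python
def clamp_original(val, diff):
    # The variable val is assigned up to three times.
    val = val + diff
    if val > 32767:
        val = 32767
    elif val < -32768:
        val = -32768
    return val


def clamp_ssa(val_1, diff):
    # Each name below is assigned exactly once.
    val_2 = val_1 + diff
    p_hi = val_2 > 32767
    p_lo = val_2 < -32768
    val_3 = 32767
    val_4 = -32768
    # phi(val_2, val_3, val_4): select the definition that reaches the join.
    val_5 = val_3 if p_hi else (val_4 if p_lo else val_2)
    return val_5
```

Both functions return the same result for any input; for example, `clamp_ssa(32000, 1000)` saturates at 32767.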
14. Extending the PDG with Φ-Nodes
15. The Program Representation
- Loop-independent Φ-nodes
  - Take two or more input values and a predicate value
  - Commit one of the inputs depending on this predicate
- Loop-carried Φ-nodes
  - Take the initial value, the loop-carried value, and a predicate value as inputs
  - Produce two outputs: one to the iteration body, the other to the loop exit
  - Direct the proper values to the proper outputs
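The behaviour of the two kinds of Φ-node can be sketched in Python as follows. This is an assumed software model, not the authors' definition: `phi_select` and `phi_loop` are hypothetical names, and the `first_iteration` flag used to choose between the initial and the loop-carried value is an assumption added here so the model can run.

```python
def phi_select(predicate, if_true, if_false):
    """Loop-independent phi: commit one input depending on the predicate."""
    return if_true if predicate else if_false


def phi_loop(first_iteration, initial, carried, continue_loop):
    """Loop-carried phi: merge the initial and carried values, then route
    the result either to the iteration body or to the loop exit."""
    current = initial if first_iteration else carried
    return ("body", current) if continue_loop else ("exit", current)


def accumulate(diffs, start):
    """Add every element of diffs to start, driven by the loop-carried phi."""
    carried = None
    i = 0
    while True:
        target, value = phi_loop(i == 0, start, carried, i < len(diffs))
        if target == "exit":
            return value
        carried = value + diffs[i]
        i += 1
```

With this model, `accumulate([1, 2, 3], 10)` routes 10, 11, and 13 to the body and finally 16 to the exit.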
16. Outline
- Reconfigurable computing systems
- Compilation process
- Synthesizing to hardware
  - Data-path elements
  - Φ-nodes
- Experimental results
- Concluding remarks
17. Synthesizing the Data-Path
- A one-to-one mapping is used
- Different resource allocation and binding algorithms can be used (on-going work)
- Each operation has an operator and several operands
- Operands are synthesized directly to wires in the circuit
- Each variable in the SSA form has only one definition point
- PREDICATE nodes are synthesized to Boolean logic signals that control next-stage transitions and direct multiplexers to commit the correct value.
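A one-to-one mapping can be pictured with the following Python sketch, which gives every SSA operation its own functional unit and treats every SSA name as a wire with a single driver. The tuple layout `(dest, operator, operands)`, the function name `map_one_to_one`, and the unit naming scheme are assumptions made for this example and do not describe the authors' tool.

```python
def map_one_to_one(operations):
    """Allocate one functional unit per operation, with no resource sharing.

    operations: list of (dest, operator, operands) tuples in SSA form.
    Returns the list of units and a map from each wire to its driving unit.
    """
    units = []
    driver = {}
    for index, (dest, operator, operands) in enumerate(operations):
        if dest in driver:
            # SSA guarantees a single definition point per variable.
            raise ValueError(f"{dest} has more than one definition")
        unit = f"{operator}_{index}"
        driver[dest] = unit
        units.append({"unit": unit, "inputs": tuple(operands), "output": dest})
    return units, driver


# The clamp fragment in SSA form (predicate names are illustrative).
clamp_ops = [
    ("val_2", "add", ("val_1", "diff")),
    ("p_hi", "gt", ("val_2", "32767")),
    ("p_lo", "lt", ("val_2", "-32768")),
]
units, driver = map_one_to_one(clamp_ops)
```

Because each name has exactly one driver, operands can be connected as plain wires without any arbitration logic, which is the point the slide makes about SSA variables.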
18. Synthesizing Φ-nodes
- A loop-independent Φ-node is synthesized to a multiplexer. The multiplexer selects among its input values depending on the predicate values.
- For a loop-carried Φ-node, an additional switch is generated to direct the loop-exiting values.
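One way to picture the multiplexer for a loop-independent Φ-node is to generate a nested conditional expression, here emitted as Verilog text from Python. The slides do not say which hardware description language the authors target, so the Verilog output, the helper names `mux_expression` and `emit_phi`, and the default bit width are all assumptions for this sketch.

```python
def mux_expression(inputs, default):
    """Build a priority multiplexer as a nested conditional expression.

    inputs: list of (predicate_signal, value_signal) in priority order.
    default: the signal selected when no predicate is true.
    """
    expr = default
    for predicate, value in reversed(inputs):
        expr = f"({predicate} ? {value} : {expr})"
    return expr


def emit_phi(dest, inputs, default, width=32):
    """Return Verilog text declaring dest and driving it from the mux."""
    return (
        f"wire [{width - 1}:0] {dest};\n"
        f"assign {dest} = {mux_expression(inputs, default)};"
    )


# phi(val_2, val_3, val_4) from the clamp example; p_hi and p_lo are the
# (illustrative) names of the two predicate signals.
text = emit_phi("val_5", [("p_hi", "val_3"), ("p_lo", "val_4")], "val_2")
```

The generated assignment reads `assign val_5 = (p_hi ? val_3 : (p_lo ? val_4 : val_2));`, a two-level cascade of two-input multiplexers.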
19. Synthesize to Hardware
- Simplifications and optimizations
  - Removing unnecessary control dependencies
  - Cascading/expanding multipliers to obtain better performance
- Flip-flops are inserted
  - Guarantee that correct values will be available no matter which execution path is taken
20. Outline
- Reconfigurable computing systems
- Compilation process
- Synthesizing to hardware
- Experimental results
  - Setup and benchmarks
  - Results
- Concluding remarks
21. Setup and Benchmarks
- Benchmark suites
  - Functions from the MediaBench suite
  - Profiled using sample data
  - Only report conservative results
- Estimated execution time
  - Aggressive predicated execution
  - Only report conservative results
- Area
  - One-to-one mapping without resource sharing
  - Reported in numbers of FPGA slices
22. Estimated Execution Time
23. Estimated Execution Time (cont'd)
24. Estimated FPGA Area
25. Outline
- Reconfigurable computing systems
- Compilation process
- Synthesizing to hardware
- Experimental results
- Concluding remarks
  - On-going/future work
26. Concluding Remarks
- The PDG+SSA form supports a variety of transformations and enables both coarse and fine grain parallelism
- A method to synthesize this form to hardware
- This form gives faster execution time using similar area when compared with CFG and PSSA forms
27. On-going/Future Work
- Investigate transformations to create coarse grained parallelism using the PDG+SSA form
- Augment the PDG+SSA form with architectural information to provide fast estimation
- Integrate resource sharing and other architectural synthesis techniques
28. Thank You
- Prof. Ryan Kastner and Gang Wang
- All audiences
29. Questions