A High Performance Application Representation for Reconfigurable Systems presentation

1
A High Performance Application Representation
for Reconfigurable Systems

Wenrui Gong, Gang Wang, Ryan Kastner
Department of Electrical and Computer Engineering
University of California, Santa Barbara, CA 93106-9560
{gong, wanggang, kastner}@ece.ucsb.edu
http://express.ece.ucsb.edu
June 22, 2004

2
Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Experimental results
Concluding remarks

3
Outline

Reconfigurable computing systems
Reconfigurable computing systems
Challenges of application representations
Compilation process
Synthesizing to hardware
Experimental results
Concluding remarks

4
Reconfigurable Computing Systems

Standard programmable platforms
Post-manufacturing customization
Designs shift from physical chips to configuration files
A software design flow
Feature hardware speed with software flexibility
Enable higher productivity

5
Application Representations

A common application representation is needed to tame the complexity of system synthesis
Requirements
Able to generate software code for microprocessors
Able to be easily translated to hardware configuration files
Allows a variety of transformations and optimizations to improve performance

6
Parallelism Exploration

Fine-grain parallelism
Multiple functional units
Issuing an operation to a free functional unit
Operations executed independently
Coarse-grain parallelism
Executing multiple threads
With occasional synchronization
Reconfigurable computing systems support both fine- and coarse-grain parallelism

7
PDG+SSA

The PDG+SSA representation can be used for both hardware synthesis and software generation
The PDG and SSA forms are common representations for software generation
Here we concentrate on hardware synthesis

8
Outline

Reconfigurable computing systems
Compilation process
Overview
Constructing the PDG
Incorporating the SSA form
Synthesizing to hardware
Experimental results
Concluding remarks

9
Overview
10
Program Dependence Graph

PDG: Program Dependence Graph
ENTRY node: the root node of a PDG
PREDICATE nodes: produce predicate values from expressions
Diamond-shaped nodes 2, 3, and 4
STATEMENT nodes: an arbitrary set of operations
Circle nodes 1, 4, 6, 7, and 8
REGION nodes: group all operations that share the same control conditions
House-shaped nodes R2, R3, R4
R3: the predicate value of node 2 is True
Edges represent dependencies
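
The node kinds listed above can be pictured as a small data structure. The following is a minimal sketch in C, not the authors' implementation: the type names, the fixed edge capacity, and the way control edges carry a True/False label are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The four node kinds named on the slide. */
typedef enum { PDG_ENTRY, PDG_PREDICATE, PDG_STATEMENT, PDG_REGION } PdgKind;

/* Hypothetical edge labels: control edges record the predicate outcome
   under which the target executes; data edges carry values. */
typedef enum {
    EDGE_CONTROL_TRUE,
    EDGE_CONTROL_FALSE,
    EDGE_CONTROL_ALWAYS,
    EDGE_DATA
} PdgEdgeKind;

#define PDG_MAX_EDGES 8  /* arbitrary capacity for this sketch */

typedef struct PdgNode {
    PdgKind kind;
    const char *label;
    size_t n_succ;
    struct PdgNode *succ[PDG_MAX_EDGES];
    PdgEdgeKind succ_kind[PDG_MAX_EDGES];
} PdgNode;

/* Creates a node with no outgoing edges. */
PdgNode pdg_make(PdgKind kind, const char *label) {
    PdgNode n;
    n.kind = kind;
    n.label = label;
    n.n_succ = 0;
    for (size_t i = 0; i < PDG_MAX_EDGES; i++) {
        n.succ[i] = NULL;
        n.succ_kind[i] = EDGE_DATA;
    }
    return n;
}

/* Adds a dependence edge from -> to; returns 0 on success, -1 if full. */
int pdg_add_edge(PdgNode *from, PdgNode *to, PdgEdgeKind kind) {
    assert(from != NULL && to != NULL);
    if (from->n_succ >= PDG_MAX_EDGES) return -1;
    from->succ[from->n_succ] = to;
    from->succ_kind[from->n_succ] = kind;
    from->n_succ++;
    return 0;
}

/* Counts the outgoing edges of a node that carry the given label. */
size_t pdg_count_succ(const PdgNode *n, PdgEdgeKind kind) {
    size_t count = 0;
    for (size_t i = 0; i < n->n_succ; i++)
        if (n->succ_kind[i] == kind) count++;
    return count;
}
```

With this encoding, the statement "R3: the predicate value of node 2 is True" becomes a single EDGE_CONTROL_TRUE edge from predicate node 2 to region node R3.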

11
Constructing the PDG from the CDFG

Implemented based on Ferrante's algorithm
Using the post-dominator tree

val = pred;
for (i = 0; i < len; i++) {
    val += diff;
    if (val > 32767) val = 32767;
    else if (val < -32768) val = -32768;
}
return val;
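
To make the role of post-dominators concrete, here is a hedged sketch in C that computes control dependences for the if / else-if part of the fragment above. It uses the set-based definition of control dependence over post-dominator sets rather than the tree walk of the published algorithm, and the five-node CFG, its numbering, and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define N 5          /* CFG nodes in this toy example */
#define EXIT_NODE 4

/* succ[n][k]: up to two successors per node, -1 means none. */
static const int succ[N][2] = {
    {1, 2},   /* 0: val += diff; if (val > 32767)   T -> 1, F -> 2 */
    {4, -1},  /* 1: val = 32767                                    */
    {3, 4},   /* 2: if (val < -32768)               T -> 3, F -> 4 */
    {4, -1},  /* 3: val = -32768                                   */
    {-1, -1}  /* 4: return val                                     */
};

/* pdom[n] is a bitset: bit m is set when m post-dominates n. */
static void compute_postdominators(uint32_t pdom[N]) {
    assert(N <= 32);
    const uint32_t all = (1u << N) - 1u;
    for (int n = 0; n < N; n++) pdom[n] = all;
    pdom[EXIT_NODE] = 1u << EXIT_NODE;
    int changed = 1;
    while (changed) {              /* iterate to a fixed point */
        changed = 0;
        for (int n = 0; n < N; n++) {
            if (n == EXIT_NODE) continue;
            uint32_t meet = all;
            for (int k = 0; k < 2; k++)
                if (succ[n][k] >= 0) meet &= pdom[succ[n][k]];
            uint32_t next = meet | (1u << n);
            if (next != pdom[n]) { pdom[n] = next; changed = 1; }
        }
    }
}

/* cd[x] is a bitset of the nodes that are control dependent on x. */
void compute_control_dependence(uint32_t cd[N]) {
    uint32_t pdom[N];
    compute_postdominators(pdom);
    for (int x = 0; x < N; x++) {
        uint32_t strict_pdom_x = pdom[x] & ~(1u << x);
        cd[x] = 0;
        for (int k = 0; k < 2; k++) {
            int s = succ[x][k];
            if (s < 0) continue;
            /* y depends on x when y post-dominates successor s
               but does not strictly post-dominate x itself */
            cd[x] |= pdom[s] & ~strict_pdom_x;
        }
    }
}
```

In this example the two assignments and the inner test end up control dependent on the predicates that guard them, while the return statement depends on neither, which is the grouping that REGION nodes summarize.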
12
Constructing the PDG (contd)
13
The Static Single Assignment Form

Each variable has exactly one assignment
A variable is always referenced using the same name
At join points of control flow, special φ-nodes are inserted.

val += diff;
if (val > 32767) val = 32767;
else if (val < -32768) val = -32768;

val_2 = val_1 + diff;
if (val_2 > 32767) val_3 = 32767;
else if (val_2 < -32768) val_4 = -32768;
val_5 = phi(val_2, val_3, val_4);
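
The renaming can be checked by writing both versions as compilable C. This is an illustrative sketch only: C has no phi operator, so the φ-node is modelled here as a selection driven by the two branch predicates, and the function names and the helper variables p_hi and p_lo are hypothetical.

```c
#include <assert.h>

/* Original form: val is assigned in several places. */
int clamp_original(int val, int diff) {
    val += diff;
    if (val > 32767) val = 32767;
    else if (val < -32768) val = -32768;
    return val;
}

/* SSA-style form: every val_k is assigned exactly once. */
int clamp_ssa(int val_1, int diff) {
    const int val_2 = val_1 + diff;
    const int p_hi = val_2 > 32767;    /* predicate of the first test  */
    const int p_lo = val_2 < -32768;   /* predicate of the second test */
    const int val_3 = 32767;
    const int val_4 = -32768;
    /* Models val_5 = phi(val_2, val_3, val_4) at the join point. */
    const int val_5 = p_hi ? val_3 : (p_lo ? val_4 : val_2);
    assert(val_5 >= -32768 && val_5 <= 32767);
    return val_5;
}
```

Because each name has a single definition, the SSA version can compute val_3 and val_4 unconditionally and leave the choice to the final selection.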
14
Extending the PDG with φ-Nodes
15
The Program Representation

Loop-independent φ-nodes
Take two or more input values and a predicate value
Commit one of the inputs depending on this predicate
Loop-carried φ-nodes
Take the initial value, the loop-carried value, and a predicate value as inputs
Output one value to the iteration body and the other to the loop exit
Direct the proper values to the proper outputs.
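
The behaviour of the two kinds of φ-node can be sketched as ordinary C functions. This is one assumed reading of the slide, not the authors' definition: the first_iter flag, the LoopPhiResult type, and the saturating-accumulate loop used to exercise the nodes are all introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Loop-independent phi: commit one of two inputs depending on a predicate. */
int phi_select(bool pred, int if_true, int if_false) {
    return pred ? if_true : if_false;
}

typedef struct {
    bool to_exit;  /* true: value leaves the loop; false: value feeds the body */
    int value;
} LoopPhiResult;

/* Loop-carried phi (hypothetical model): takes the initial value on the
   first iteration and the carried value afterwards, then routes the result
   to the body or to the loop exit according to the loop predicate. */
LoopPhiResult phi_loop(bool first_iter, int initial, int carried,
                       bool continue_pred) {
    LoopPhiResult r;
    r.value = first_iter ? initial : carried;
    r.to_exit = !continue_pred;
    return r;
}

/* Saturating accumulate: adds step to start, len times, clamped to 16 bits. */
int run_loop(int start, int step, int len) {
    assert(len >= 0);
    int carried = 0;   /* not read until the second iteration */
    int i = 0;
    bool first = true;
    for (;;) {
        LoopPhiResult v = phi_loop(first, start, carried, i < len);
        if (v.to_exit) return v.value;
        int sum = v.value + step;
        int hi = phi_select(sum > 32767, 32767, sum);
        carried = phi_select(sum < -32768, -32768, hi);
        i++;
        first = false;
    }
}
```

When len is 0 the loop predicate is false on the first evaluation, so the initial value is routed straight to the exit without entering the body.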

16
Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Data-path elements
φ-nodes
Experimental results
Concluding remarks

17
Synthesizing the Data-Path

A one-to-one mapping is used
Different resource allocation and binding algorithms can be used (on-going work)
Each operation has an operator and several operands
Operands are synthesized directly to wires in the circuit
Each variable in the SSA form has only one definition point
PREDICATE nodes are synthesized to Boolean logic signals that control next-stage transitions and direct multiplexers to commit the correct value.

18
Synthesizing φ-nodes

A loop-independent φ-node is synthesized to a multiplexer. The multiplexer selects among the input values depending on the predicate values.
For a loop-carried φ-node, an additional switch is generated to direct the loop-exiting values
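
At the gate level, the two elements named above can be modelled in a few lines of C. The sketch below assumes 16-bit buses and a single-bit select, and the names mux2, switch2, and SwitchOut are hypothetical; it illustrates the multiplexer and switch behaviour rather than the circuits the authors generate.

```c
#include <assert.h>
#include <stdint.h>

/* 2-to-1 multiplexer on 16-bit buses, written as AND/OR logic:
   sel == 1 passes a, sel == 0 passes b. */
uint16_t mux2(uint16_t sel, uint16_t a, uint16_t b) {
    assert(sel <= 1);
    uint16_t mask = (uint16_t)(0u - sel);   /* 0x0000 or 0xFFFF */
    return (uint16_t)((a & mask) | (b & (uint16_t)~mask));
}

/* Output of the switch used for a loop-carried value. */
typedef struct {
    uint16_t to_body;
    uint16_t to_exit;
    uint8_t body_valid;
    uint8_t exit_valid;
} SwitchOut;

/* Switch (demultiplexer): routes the input to the loop body while the
   loop predicate holds, and to the loop exit once it does not. */
SwitchOut switch2(uint16_t continue_loop, uint16_t in) {
    assert(continue_loop <= 1);
    SwitchOut out = {0, 0, 0, 0};
    if (continue_loop) {
        out.to_body = in;
        out.body_valid = 1;
    } else {
        out.to_exit = in;
        out.exit_valid = 1;
    }
    return out;
}
```

The valid flags stand in for whatever handshake or enable signal a real design would use to tell the consumer which output carries data.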

19
Synthesize to Hardware

Simplifications and optimizations
Removing unnecessary control dependencies
Cascading/expanding multipliers to obtain better performance
Flip-flops are inserted
Guarantee that correct values will be available no matter which execution path is taken

20
Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Experimental results
Setup and benchmarks
Results
Concluding remarks

21
Setup and Benchmarks

Benchmark suites
Functions from the MediaBench suite
Profiled using sample data
Only report conservative results
Estimated execution time
Aggressive predicated execution
Only report conservative results
Area
One-to-one mapping without resource sharing
Reported in numbers of FPGA slices

22
Estimated Execution Time
23
Estimated Execution Time (contd)
24
Estimated FPGA Area
25
Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Experimental results
Concluding remarks
On-going/future work

26
Concluding Remarks

The PDG+SSA form supports a variety of transformations and enables both coarse- and fine-grain parallelism
We presented a method to synthesize this form to hardware
This form gives shorter execution times using similar area when compared with the CFG and PSSA forms

27
On-going/Future work

Investigate transformations to create coarse-grain parallelism using the PDG+SSA form
Augment the PDG+SSA form with architectural information to provide fast estimation
Integrate resource sharing and other architectural synthesis techniques

28
Thank You

Prof Ryan Kastner and Gang Wang
All audiences

29
Questions
