Title: GanesanP9
1. Synthesis for Partially Reconfigurable Computing Systems
- Satish Ganesan, Abhijit Ghosh, Ranga Vemuri
- Digital Design Environments Laboratory
- Dept of ECECS, University of Cincinnati
- satish, ranga _at_ececs.uc.edu
This work is sponsored in part by the US Air
Force, Wright Laboratory, WPAFB, under contract
number F33615-97-C-1043
2. Synthesis System Overview
Input Specification (VHDL / C)
Translator
High-level Synthesis
Dynamic Reconfiguration Set Generation
Logic Elaboration
Layout Synthesis
Host-side Controller
PARTIALLY RECONFIGURABLE FPGA
3. Target Architecture Model
- Features
- Partially reconfigurable device: a portion of the device can be reconfigured while the remaining part is still operational
- Target device is split into two parts, P1 and P2
- Design is split into sequential blocks, which are loaded on the two portions of the device
- Reconfiguration of one block is overlapped with execution of another
[Figure: device divided into two portions, P1 and P2]
4. Input Specification
- Behavior specification in VHDL/C subset
- Translated into Intermediate Representation
- Intermediate Representation
- Behavior Block Input Format
- Single thread of control
- Each block performs a set of computations
- Data transfer through branch interface
- Supports control constructs
5. High-level Synthesis (HLS)
[Figure: Input Specification (Behavior Blocks), RTL Component Library, and Area / Timing Constraints feed the High-level Synthesis Engine (Scheduling, Allocation, Binding), which produces the Register-Transfer Level Design (RTL Blocks)]
6. High-level Synthesis (HLS)
- Each behavior block in the block graph is synthesized separately
7. RTL Model
[Figure: DESIGN block with external signals I/O, Clock, Reset, Start, Finish; inside it, the DATAPATH (net-list of components) sends Flags to the CONTROLLER (finite state machine), which returns Controls]
- Glushkovian Model
- Components in the datapath implement the operations specified in the behavior
- Controller (FSM) provides the necessary controls for execution
- HLS generates 4 signals: Clock (in), Reset (in), Start (in), Finish (out)
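The Start/Finish handshake on the controller can be illustrated with a small behavioral model. This is only a sketch under assumptions: the three states, the `cycles` parameter (a stand-in for the length of a block's schedule), and the names `BlockController` and `run_block` are hypothetical, and the datapath with its Flags and Controls is left out.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()  # waiting for Start
    RUN = auto()   # block is computing
    DONE = auto()  # Finish is asserted


class BlockController:
    """Toy controller FSM for one RTL block (hypothetical model)."""

    def __init__(self, cycles: int) -> None:
        if cycles < 1:
            raise ValueError("cycles must be at least 1")
        self.cycles = cycles  # assumed schedule length in clock cycles
        self.state = State.IDLE
        self.count = 0

    def clock(self, reset: bool, start: bool) -> bool:
        """Advance one clock edge and return the Finish output."""
        if reset:
            self.state, self.count = State.IDLE, 0
        elif self.state is State.IDLE and start:
            self.state, self.count = State.RUN, 0
        elif self.state is State.RUN:
            self.count += 1
            if self.count == self.cycles:
                self.state = State.DONE
        return self.state is State.DONE


def run_block(cycles: int) -> int:
    """Reset, pulse Start, and count clock edges until Finish is asserted."""
    ctrl = BlockController(cycles)
    ctrl.clock(reset=True, start=False)
    edges = 1
    finish = ctrl.clock(reset=False, start=True)
    while not finish:
        finish = ctrl.clock(reset=False, start=False)
        edges += 1
    return edges


print(run_block(3))  # 4: one edge to accept Start, three to compute
```

A host-side controller would observe Finish in the same way before reusing the portion of the device that the block occupies.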
8. Dynamic Reconfiguration
- Input
- RTL block graph, with each block having been separately synthesized
- Output
- Sequence of reconfiguration sets
- Each reconfiguration set has two blocks: one is reconfigured while the other executes
- Intermediate data between blocks is stored in board registers
9. Dynamic Reconfiguration Example
Step 1: RTL Block 1 is loaded on the device
Step 2: RTL Block 1 is executed; RTL Block 2 is configured
Step 3: RTL Block 1 completes execution; RTL Block 3 is reconfigured in place of RTL Block 1; RTL Block 2 is executed
Step 4: Repeat Steps 2 and 3 until all RTL blocks have been loaded and executed
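The steps above can be sketched as a generator of reconfiguration sets for a linear chain of blocks. This is a minimal illustration, not the DRSG algorithm itself: the `ReconfigSet` record, the function name, and the strict alternation between P1 and P2 are assumptions, and conditional branches are ignored.

```python
from typing import List, NamedTuple, Optional


class ReconfigSet(NamedTuple):
    executing: Optional[str]      # block running in this step (None in step 1)
    reconfiguring: Optional[str]  # block being loaded (None in the last step)
    target: Optional[str]         # device portion that receives the new block


def reconfiguration_sets(blocks: List[str]) -> List[ReconfigSet]:
    """Alternate blocks between P1 and P2 so loading overlaps execution."""
    portions = ("P1", "P2")
    if not blocks:
        return []
    # Step 1: the first block is loaded while nothing executes.
    sets = [ReconfigSet(None, blocks[0], portions[0])]
    for i, block in enumerate(blocks):
        nxt = blocks[i + 1] if i + 1 < len(blocks) else None
        # The next block goes to the portion the current block is not using.
        target = portions[(i + 1) % 2] if nxt is not None else None
        sets.append(ReconfigSet(block, nxt, target))
    return sets


for step in reconfiguration_sets(["Block1", "Block2", "Block3"]):
    print(step)
```

For three blocks this yields four sets; in the third, Block3 is loaded into P1 (in place of Block1) while Block2 executes in P2, matching Step 3.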
10. Latency Improvement
- Latency of design without DRSG approach
- $L_1 = \sum_{1 \le i \le n} (R_i + E_i)$
- Latency of design with DRSG approach
- $L_2 = R_1 + \sum_{1 \le i \le n} \max(R_{i+1}, E_i)$
- where
- $R_i$ = reconfiguration time of the $i$th block
- $E_i$ = execution time of the $i$th block
- It is easily seen that $L_2 < L_1$
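The two latency expressions can be checked numerically with a short sketch. The block times below are made-up values for illustration, and the code assumes that the reconfiguration term after the last block is zero, since no block follows block n.

```python
from typing import Sequence


def latency_sequential(R: Sequence[float], E: Sequence[float]) -> float:
    """Latency without overlap: each block is reconfigured, then executed."""
    if len(R) != len(E):
        raise ValueError("R and E must have the same length")
    return sum(r + e for r, e in zip(R, E))


def latency_overlapped(R: Sequence[float], E: Sequence[float]) -> float:
    """Latency with overlap: block i+1 is loaded while block i executes."""
    if len(R) != len(E):
        raise ValueError("R and E must have the same length")
    if not R:
        return 0
    # Assumption: nothing is loaded after the last block, so its term is 0.
    next_R = list(R[1:]) + [0]
    return R[0] + sum(max(rn, e) for rn, e in zip(next_R, E))


# Hypothetical reconfiguration and execution times in microseconds.
R = [100, 200, 50]
E = [150, 120, 300]
print(latency_sequential(R, E))  # 920
print(latency_overlapped(R, E))  # 720
```

In this example the overlapped schedule hides the loading of the third block entirely and part of the loading of the second block.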
11. Handling Conditional Constructs
- RTL Block 1 is a conditional block
- Either RTL Block 2 or RTL Block 3 is executed, due to single thread of control
- Two approaches to handle conditional branching
- Approach I: host polling
- The host waits for the conditional predicate to be evaluated, then loads the appropriate branch
- $L_1 = R_1 + \sum_{1 \le i \le n} \max(R_{i+1}, E_i) + R_j$
- where $R_j$ = reconfiguration time of the branch that is executed
12. Handling Conditional Constructs
- Approach II: branch prediction
- The host loads one of the branches based on a user-given profile
- Latency of the design if the correct branch was loaded
- $L_1 = R_1 + \sum_{1 \le i \le n} \max(R_{i+1}, E_i)$
- If the wrong branch was loaded,
- $L_2 = R_1 + \sum_{1 \le i \le n} \max(R_{i+1}, E_i) + R_j$
- where $R_j$ = reconfiguration time of the branch
- $L_1 < L_2$, always
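The two approaches to conditional branching can be compared with a small sketch. Here `base` stands for the overlapped term R1 plus the sum of the max terms, and `r_branch` stands for Rj; both names are hypothetical. The expected-latency function is an added illustration that assumes the profile can be summarized as a probability `p_hit` of predicting the right branch, which the slides do not state.

```python
def host_polling_latency(base: float, r_branch: float) -> float:
    """Approach I: the branch is always loaded after the predicate is known."""
    return base + r_branch


def predicted_latency(base: float, r_branch: float, hit: bool) -> float:
    """Approach II: the extra load is paid only when the prediction is wrong."""
    return base if hit else base + r_branch


def expected_predicted_latency(base: float, r_branch: float, p_hit: float) -> float:
    """Average latency of Approach II for a hit probability p_hit (assumption)."""
    if not 0.0 <= p_hit <= 1.0:
        raise ValueError("p_hit must lie in [0, 1]")
    return base + (1.0 - p_hit) * r_branch


# Hypothetical values in microseconds.
base, r_branch = 720, 80
print(host_polling_latency(base, r_branch))              # 800
print(predicted_latency(base, r_branch, hit=True))       # 720
print(predicted_latency(base, r_branch, hit=False))      # 800
print(expected_predicted_latency(base, r_branch, 0.75))  # 740.0
```

Under these formulas a wrong prediction costs the same as host polling, so prediction is never slower than polling in this model.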
13. Logic Elaboration
[Figure: RTL Component Library and Input RTL Specification -> Logic Elaboration (VELAB) -> Elaborated net-list file in EDIF format]
- Features
- Pre-placed component library to aid layout synthesis
- RTL specification obtained from HLS tool ASSERTA
- Net-list produced in EDIF format
14. Layout Synthesis
[Figure: Input Net-list Specification -> Layout Synthesis (XACT6000) -> FPGA bit-stream]
- Features
- Manual placement required to ensure place and route using XACT6000
- Replacement blocks are placed in the same location as the blocks they replace
- Bitmap files produced in cal format
15. Host-side Controller
[Figure: Bitmap files and Reconfiguration Set Sequence -> Host-side Controller -> RTR implementation of design]
- Features
- Manages the partially reconfigurable FPGA device
- Loads and executes bitmap files based on the reconfiguration sequence generated by the DRSG phase
- Device used is Xilinx 6200
16. Results: Percentage Configuration Time
Design        4x4 2D FFT   4x4 1D DCT   16-tap FIR
Total rec.    929 us       1416 us      338 us
Total exec    1025 us      2008 us      200 us
Overlap       678 us       1161 us      0 us
Latency       1276 us      2263 us      538 us
% conf        19.7         11.2         62.8
- Table presents the percentage of total time spent only in configuration using the synthesis flow
- The examples show significant improvements in overall latency
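The table rows appear to be related by simple arithmetic, which the sketch below checks. The two relations used (latency = total rec. + total exec - overlap, and % conf = (total rec. - overlap) / latency) are inferred from the numbers rather than stated on the slide, so treat them as assumptions; the dictionary keys are hypothetical names.

```python
# Values copied from the results table (times in microseconds).
results = {
    "4x4 2D FFT": {"rec": 929, "exec": 1025, "overlap": 678, "latency": 1276, "conf_pct": 19.7},
    "4x4 1D DCT": {"rec": 1416, "exec": 2008, "overlap": 1161, "latency": 2263, "conf_pct": 11.2},
    "16-tap FIR": {"rec": 338, "exec": 200, "overlap": 0, "latency": 538, "conf_pct": 62.8},
}


def derived_latency(row: dict) -> int:
    """Assumed relation: overlapped reconfiguration time is hidden."""
    return row["rec"] + row["exec"] - row["overlap"]


def derived_conf_pct(row: dict) -> float:
    """Assumed relation: share of latency spent in non-overlapped reconfiguration."""
    return 100.0 * (row["rec"] - row["overlap"]) / derived_latency(row)


for name, row in results.items():
    print(name, derived_latency(row), round(derived_conf_pct(row), 2))
```

The derived latencies match the table exactly, and the derived percentages agree with the reported ones to within 0.1 percentage points. Note that the 16-tap FIR row has zero overlap, so its latency equals the plain sum of reconfiguration and execution time.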
17. Conclusions and Future Work
- Conclusions
- Presented a synthesis system for partially reconfigurable FPGAs
- Proposed a dynamic reconfiguration set generation strategy to improve overall design latency by reducing reconfiguration time
- Results showed a considerable decrease in reconfiguration times
- Future work
- Automate the procedure of generating run-time reconfigurable designs for partially reconfigurable FPGAs