Title: Direct synthesis of large-scale asynchronous controllers using a Petri-net-based approach
1Direct synthesis of large-scale asynchronous
controllers using a Petri-net-based approach
- Ivan Blunno Politecnico di Torino
- Alex Bystrov Univ. Newcastle upon Tyne
- Josep Carmona Univ. Politècnica de Catalunya
- Jordi Cortadella Univ. Politècnica de Catalunya
- Luciano Lavagno Università di Udine
- Alex Yakovlev Univ. Newcastle upon Tyne
2Outline
- Motivation
- Design flow
- Verilog HDL specification
- Petri nets and trace expressions
- Synthesis process
- Conclusion
3Motivation
- Language-based design: key enabler of synchronous logic's success
- Use HDL as a single language for
- specification
- logic simulation and debugging
- synthesis
- post-layout simulation
- HDL must support multiple levels of abstraction
4Motivation
- HDL generates large asynchronous controllers: need direct synthesis
- Guarantee an implementation
- Automatic exploration of the design space
- Benefit from existing structural methods for logic synthesis
- Benefit (at the design stage) from existing performance estimation approaches
5Design flow
HDL specification
Synthesizable HDL (data)
Control/data splitting
STG (control)
Synthesis (Synopsys)
Logic delays
Synthesis (petrify)
Timing analysis (Synopsys)
HDL implementation
Logic implementation
Delay insertion
6Design flow
- What is available?
- simulators (no synchronous assumption)
- logic synthesis (from BFSM, STG, ...)
- layout (almost like synchronous)
- What is missing?
- translator from HDL to synthesis specification model
- translator from synthesis implementation model to HDL
7Other approaches
- Special-purpose languages
- pros: syntax and semantics can be tailored to asynchronous Models of Computation (STG, BFSM, process algebras)
- cons: not familiar to designers, no standard tool support
- Examples
- Tangram
- Communicating Hardware Processes
- Balsa
8Our approach
- General-purpose language
- pros: several tools available, broad user base
- cons: syntax and semantics oriented to gates (not STGs, BFSMs or process algebras)
- need to define a subset for synthesis (full language only good for simulation)
- Choice
- VHDL
- Verilog [Blunno & Lavagno, ASYNC'00]
9Outline
- Motivation
- Design flow
- Verilog HDL specification
- Petri nets and trace expressions
- Synthesis
- Conclusion
10Asynchronous Verilog subset
- Module and signal declaration
- module example(a, b, c, d);
- input a; input [7:0] b;
- output c, d;
- reg e, f; reg [11:0] g;
- Currently only single module supported
- always loop surrounds live behavior
- initial block defines initialization sequence
11Asynchronous Verilog subset
- Transitions
- input signals: wait statement
- wait(a) ... wait(!b)
- output signals: assignment statement
- c a b
- Each statement generates a trace expression and a
datapath fragment
12Asynchronous Verilog subset
- Causality relations: Verilog statements
- begin-end for sequencing
- fork-join for concurrency
- if-then-else for input choice
- Only structured mix of sequencing, concurrency
and choice can be specified
13Example simple filter
always begin
  wait(start);
  R SMP 3 RES SMP 4
  if (b[7] == 1) RES = 0;
  else begin
    if (b[6] == 1) RES = 1;
  end
  done = 1;
  wait(!start);
  done = 0;
end
14Control-data partitioning
- Splitting of asynchronous control and synchronous data path
- Automated insertion of bundling delays
[Figure: CONTROL UNIT and DATA PATH blocks; labels: request, delay, acknowledge]
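The constraint behind the inserted bundling delay can be sketched as follows: the request should not reach the data path's consumer before the data is stable, so the delay has to cover the slowest data-path output. This is a minimal sketch, not the flow's actual algorithm; the function name, the safety margin and the delay values are all illustrative assumptions.

```python
def bundling_delay(datapath_delays_ns, control_delay_ns, margin=1.2):
    """Return the extra delay (ns) to insert on the request wire.

    Assumption: the request must arrive no earlier than the slowest
    data-path output, scaled by a safety margin (hypothetical value).
    """
    required = max(datapath_delays_ns) * margin
    return max(0.0, required - control_delay_ns)


# Hypothetical logic delays, as might come out of timing analysis.
paths = [2.1, 3.4, 2.9]
extra = bundling_delay(paths, control_delay_ns=1.0)
```

If the control path is already slower than the data path, the function returns 0.0 and no delay element is needed.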
15Outline
- Motivation
- Design flow
- Verilog HDL specification
- Petri nets and trace expressions
- Synthesis
- Conclusion
16Controller design flow
HDL
Syntax-directed translation
Petri Net
Reductions
Transformations
PNTE
Synthesis
Circuit
17Design flow
Cost estimation
Transformations
Critical cycles
PNTE
Boolean equations
Area Estimation
Performance Estimation
18PNTE
- Free-choice Petri net
- Transitions are trace expressions
- Trace expressions represent well-structured event relations
- Causality
- Concurrency
- Choice
19Trace expressions (TE)
e
TE
TE TE
TE TE
TE TE
?TE
Trace expressions are a subset of CCS agent expressions [Milner 80]
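One way to picture a trace expression is as a tree whose leaves are events and whose internal nodes express causality, concurrency or choice. The sketch below is only an illustration: the node kinds ("seq", "par", "choice") and the printed operators (";", "||", "+") are assumed names, since the operator symbols of the slide did not survive.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Event:
    name: str


@dataclass(frozen=True)
class Op:
    kind: str                       # "seq", "par" or "choice" (assumed names)
    children: Tuple["TE", ...]


TE = Union[Event, Op]

SYMBOL = {"seq": ";", "par": "||", "choice": "+"}   # assumed notation


def show(te):
    """Render a trace expression as a fully parenthesised string."""
    if isinstance(te, Event):
        return te.name
    return "(" + f" {SYMBOL[te.kind]} ".join(show(c) for c in te.children) + ")"


def events(te):
    """Collect the names of all events occurring in the expression."""
    if isinstance(te, Event):
        return {te.name}
    return set().union(*(events(c) for c in te.children))


# a, then b and f concurrently, then d
example = Op("seq", (Event("a"), Op("par", (Event("b"), Event("f"))), Event("d")))
```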
20Trace expressions example
21From PN to PNTE
- Reductions to simplify the net structure
- Computing concurrency relations takes
- O(n^2) in trace expressions
- O(n^3) in free-choice systems [Kovalyov & Esparza]
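The reason concurrency is cheap to obtain from a trace expression is that it can be read off the tree: two events are concurrent when their lowest common ancestor is a parallel node. The following naive sketch illustrates the idea; it assumes a tuple encoding ("kind", child, ...) with strings as events, assumes each event occurs once, and makes no claim to match the complexity bound above.

```python
from itertools import combinations


def events(te):
    """Names of all events in a tuple-encoded trace expression."""
    if isinstance(te, str):
        return {te}
    return set().union(*(events(c) for c in te[1:]))


def concurrent_pairs(te):
    """All unordered event pairs whose lowest common ancestor is a 'par' node."""
    if isinstance(te, str):
        return set()
    pairs = set()
    for child in te[1:]:
        pairs |= concurrent_pairs(child)
    if te[0] == "par":
        # Events taken from two different branches of a parallel node are concurrent.
        for left, right in combinations(te[1:], 2):
            pairs |= {frozenset((x, y)) for x in events(left) for y in events(right)}
    return pairs


te = ("seq", "a", ("par", ("seq", "b", "c"), "f"), "d")
pairs = concurrent_pairs(te)
```

Here f is concurrent with both b and c, while b and c stay ordered because their common ancestor is a sequence node.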
22Reductions
[Figure: net fragment with transitions TE1 and TE2]
23Reductions
[Figure: net fragment with transitions TE1 and TE2]
24Example
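[Figure: example Petri net over events a, b, c, d, e, f, g, h and its reduction to a PNTE with trace expressions "d?a ( b f )" and "g h?e"]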
25Outline
- Motivation
- Design flow
- Verilog HDL specification
- Petri nets and trace expressions
- Synthesis
- Conclusion
26Exploration of the design space
- Kit of transformations at the Petri net level
- Concurrency reduction
- Increase of concurrency
- Event hiding
- Fast cost estimation
- Area (Boolean equations)
- Performance (critical cycles)
27Transformations at the net level
Concurrency reduction
[Figure: Petri net over events a, f, b, c, d]
28Transformations at the net level
Concurrency reduction
[Figure: Petri net over events a, f, b, c, d]
29Transformations at the net level
Concurrency reduction in TE
- Concurrency in TE: b and f have a common parallel ancestor
[Figure: trace-expression tree over events a, f, b, c, d]
30Transformations at the net level
Concurrency reduction in TE
- Concurrency reduction: replace the parallelizer with a sequencer
[Figure: trace-expression tree over events a, f, b, c, d]
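Concurrency reduction on the tree form can be sketched as a local rewrite: find the parallel node that separates the two events and turn it into a sequence node. The tuple encoding, the function name and the restriction to binary parallel nodes are assumptions made for this illustration.

```python
def events(te):
    """Names of all events in a tuple-encoded trace expression."""
    if isinstance(te, str):
        return {te}
    return set().union(*(events(c) for c in te[1:]))


def reduce_concurrency(te, first, second):
    """Order `first` before `second` by turning their common 'par' node into 'seq'.

    Assumption: parallel nodes are binary; other nodes are left untouched.
    """
    if isinstance(te, str):
        return te
    kind, children = te[0], te[1:]
    if kind == "par" and len(children) == 2:
        left, right = children
        if first in events(left) and second in events(right):
            return ("seq", left, right)
        if first in events(right) and second in events(left):
            return ("seq", right, left)
    return (kind,) + tuple(reduce_concurrency(c, first, second) for c in children)


before = ("seq", "a", ("par", "b", "f"), ("par", "c", "d"))
after = reduce_concurrency(before, "f", "b")
```

Only the node that makes b and f concurrent changes; the unrelated parallel node over c and d is preserved.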
31Transformations at the net level
Increase of concurrency
- c is ordered with f and b!
[Figure: Petri net over events a, f, b, c, d]
32Transformations at the net level
Increase of concurrency
- c, f and b are concurrent!
[Figure: Petri net over events a, b, c, f, d]
33Transformations at the net level
Increase of concurrency in TE
- Increase of concurrency: reorganize the subtree
[Figure: trace-expression trees over events a, f, b, c, d]
34Transformations at the net level
Increase of concurrency in TE
- Increase of concurrency: reorganize the subtree
[Figure: trace-expression trees over events a, f, b, c, d]
35Transformations at the net level
Increase of concurrency in TE
- Increase of concurrency: reorganize the subtree
[Figure: trace-expression trees over events a, f, b, c, d]
36Transformations at the net level
Event hiding
[Figure: Petri net over events a, f, b, c, d]
37Transformations at the net level
Event hiding
[Figure: Petri net over events a, f, c, d, with b hidden]
38Transformations at the net level
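Event hiding in TE
- Event hiding: delete the corresponding leaf ...
[Figure: trace-expression tree over events a, f, b, c, d]
39Transformations at the net level
Event hiding in TE
- Event hiding: delete the corresponding leaf ...
[Figure: trace-expression trees over events a, f, b, c, d, before and after deleting the leaf]
40Transformations at the net level
Event hiding in TE
- Event hiding: delete the corresponding leaf ... and simplify the tree structure
[Figure: trace-expression trees over events a, f, b, c, d, before and after simplification]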
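Event hiding on the tree form can be sketched in two steps, mirroring the bullets: delete the leaf, then simplify by dropping nodes left without children and collapsing nodes left with a single child. The tuple encoding and the function name are assumptions for this illustration.

```python
def hide(te, event):
    """Delete the leaf `event`, then simplify the remaining tree.

    Returns None when nothing is left of the expression.
    """
    if isinstance(te, str):
        return None if te == event else te
    kept = [c for c in (hide(child, event) for child in te[1:]) if c is not None]
    if not kept:
        return None                 # the whole node disappeared
    if len(kept) == 1:
        return kept[0]              # a one-child operator is redundant
    return (te[0],) + tuple(kept)


te = ("seq", "a", ("par", "f", ("seq", "b", "c")), "d")
hidden = hide(te, "b")
```

After hiding b, the inner sequence node has only c left, so it collapses and c becomes a direct branch of the parallel node.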
41Synthesis of control logic
- For large-scale controllers
- Direct translation from Petri net (or STG-h/s-refined) specifications
- Logic synthesis from fully refined STGs with pseudo-one-hot encoding, structural techniques and STG-level optimisations
42Why direct translation?
- Logic synthesis has problems with state space explosion, repetitive and regular structures (log-based encoding approach)
- Direct translation has linear complexity but can be area inefficient (inherent one-hot encoding)
- What about performance?
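The trade-off can be illustrated with a toy model of direct translation: one memory cell per place, with a transition firing by clearing its input cells and setting its output cells. This is a sketch under assumptions, not the translation used in this work; the 6-place ring and all names are hypothetical. For a purely sequential ring the number of states equals the number of places, which is why a log-based encoding would need far fewer state bits.

```python
from math import ceil, log2


def fire(marking, pre, post):
    """Fire one transition of a safe net: clear the input cells, set the output cells."""
    if not all(marking[p] for p in pre):
        raise ValueError("transition not enabled")
    nxt = dict(marking)
    for p in pre:
        nxt[p] = 0
    for p in post:
        nxt[p] = 1
    return nxt


places = ["p1", "p2", "p3", "p4", "p5", "p6"]      # hypothetical 6-place ring
marking = {p: int(p == "p1") for p in places}      # one cell per place, token in p1
marking = fire(marking, pre=["p1"], post=["p2"])

one_hot_cells = len(places)                        # grows linearly with the net
binary_bits = ceil(log2(len(places)))              # log-based encoding of the ring
```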
43Shifter Example
- (xyya) [Bystrov et al., 6th UK Async Forum, 99]
Control logic option                               Speed (ns)
Refined STG directly synthesized by Petrify        5.4
Circuit decomposition with two D-elements          4.2
Circuit decomposition and Petrify re-synthesis     3.3
Re-synthesis with relative timing                  1.7
44Direct Translation of Petri Nets
- Previous work dates back to the 70s
- Synthesis into event-based (2-phase) circuits (similar to micropipeline control)
- S. Patil, F. Furtek (MIT)
- Synthesis into level-based (4-phase) circuits (similar to synthesis from one-hot encoded FSMs)
- R. David (69, translation of FSM graphs to CUSA cells)
- L. Hollaar (82, translation from parallel flowcharts)
- V. Varshavsky et al. (90, 96, translation from PN into an interconnection of David Cells)
45David's original approach
[Figure: fragment of flow graph with states a, b, c, d and the CUSA for storing state b, with signals x1, x2, ya, yb, yc]
46Hollaar's approach
[Figure: fragment of flow-chart and one-hot circuit cell, with labels K, L, M, N, A, B and annotated signal values]
47Hollaar's approach
[Figure: fragment of flow-chart and one-hot circuit cell, with labels K, L, M, N, A, B and annotated signal values]
48Hollaar's approach
[Figure: fragment of flow-chart and one-hot circuit cell, with labels K, L, M, N, A, B and annotated signal values]
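49Varshavsky's Approach
[Figure: David cells for places p1 and p2 with a controlled operation ("To Operation"), annotated with signal values]
50Varshavsky's Approach
[Figure: David cells for places p1 and p2, annotated with signal transitions 0->1 and 1->0]
51Varshavsky's Approach
[Figure: David cells for places p1 and p2, annotated with signal transitions 1->0, 0->1 and 1->0->1]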
52Translation in brief
This method has been used for designing the control of a token ring adaptor [Yakovlev et al., Async. Design Methods, 1995].
The size of the control was about 80 David Cells with 50 controlled handshakes.
53Direct translation examples
- In this work we tried direct translation
- From STG-refined specification (VME bus controller)
- Worse than logic synthesis
- From a largish abstract specification with high degree of repetition (mod-6 counter)
- Considerable gain over logic synthesis
- From a small concurrent specification with dense coding space (butterfly circuit)
- Similar or better than logic synthesis
54Example 1 VME bus controller
Result of direct translation (DC unoptimised)
55VME bus controller
After DC-optimisation (in the style of Varshavsky et al., WODES'96)
56David Cell library
57VME bus controller
After DC-optimisation (in the style of Varshavsky et al., WODES'96)
58Data path control logic
Example of interface with a handshake control
(DTACK, DSR/DSW)
59Ex 2 Flat mod-6 Counter
- TE-like specification
- ((p? q!)^5 p? c!)
- Petri net (5-safe)
[Figure: Petri net with transitions p?, q!, c! and weights 5]
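The counter specification can be unrolled into its event trace to see the mod-6 behaviour: five inputs p? each answered by q!, and a sixth answered by the carry c!. Reading the "5" as a repetition count is an assumption based on the counter's name; the function name is illustrative.

```python
def mod6_trace(cycles=1):
    """Unroll ((p? q!)^5 p? c!) for the given number of counter cycles."""
    trace = []
    for _ in range(cycles):
        trace += ["p?", "q!"] * 5 + ["p?", "c!"]
    return trace


trace = mod6_trace()
```

Each cycle contains six p? events, which is the repetitive structure that makes this example awkward for state-based logic synthesis.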
60Flat mod-6 Counter
Refined (by hand) and optimised (by Petrify)
Petri net
61Flat mod-6 counter
Result of direct translation (optimised by hand)
62David Cells and Timed circuits
(a) Speed-independent
(b) With Relative Timing
63Flat mod-6 counter
(a) speed-independent
(b) with relative timing
64Butterfly circuit
[Figure: Initial Specification and STG after CSC resolution, with transitions of signals a, b, x, y, z]
65Butterfly circuit
Speed-independent logic synthesis solution
66Butterfly circuit
Speed-independent DC-circuit
67Butterfly circuit
DC-circuit with aggressive relative timing
68Comparison with logic synthesis
Example                                   Logic synthesis   DC-translation
VME-bus (overall operation cycle)         6ns               11ns
Mod-6 count (p->q/c, worst case cycle)    >5ns              1.6ns
Butterfly (with RT, operation cycle)      2ns               1.8ns
69DC control with Relative Timing
[Figure: three David cells (DC) with operations op1 and op2]
70DC control with Relative Timing
[Figure: three David cells (DC) with operations op1 and op2]
David Cell type                 Token shift time
Speed-independent               1.2ns
Mild RT (fast bkwd reset)       0.8ns
Aggressive RT (fast fwd set)    0.4ns
71Synthesis
- Encoding based on a David-cell approach
- Transformations to improve area and performance
- Structural methods to derive a circuit [Pastor et al., Transactions on CAD, Nov. 98]
72Synthesis
Next-state function of signal y ?
73Synthesis
Next-state function of signal y ?
y x z
74Synthesis example VME bus
[Figure: Read Cycle block diagram with Bus, VME Bus Controller, Data Transceiver and Device; signals DSr, DSw, DTACK, LDS, LDTACK, D]
75Synthesis example VME bus
READ CYCLE SPECIFICATION
[Figure: STG over transitions of DSr, LDS, LDTACK, D, DTACK]
76Synthesis example VME bus
[Figure: STG over transitions of DSr, LDS, LDTACK, D, DTACK]
77Synthesis example VME bus
78Synthesis example VME bus
79Cost estimation
- Heuristics
- AREA: number of literals in each excitation region
- PERFORMANCE: length of the critical cycle in the net
- Exploration of the design space guided by cost estimations
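Both heuristics are cheap to evaluate, which is what makes them usable inside an exploration loop. The sketch below is an illustration under assumptions, not the estimator used in this work: area is taken as the literal count of a cover per excitation region, and performance as the longest delay sum around a simple cycle of a small event graph, found by brute-force search. All names, covers and delays are hypothetical.

```python
def area(cover):
    """Literal count of covers given as lists of cubes, one cover per excitation region."""
    return sum(len(cube) for region in cover.values() for cube in region)


def critical_cycle(delays, succ):
    """Largest sum of node delays around a simple cycle (brute-force DFS)."""
    best = 0.0

    def walk(start, node, seen, length):
        nonlocal best
        for nxt in succ.get(node, []):
            if nxt == start:
                best = max(best, length)
            elif nxt not in seen and nxt > start:
                # Each cycle is explored only from its smallest node.
                walk(start, nxt, seen | {nxt}, length + delays[nxt])

    for s in succ:
        walk(s, s, {s}, delays[s])
    return best


cover = {"ER(y+)": [["x", "z"]], "ER(y-)": [["x'"], ["z'"]]}
delays = {"a": 1.0, "b": 2.0, "c": 0.5, "d": 1.0}
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}

literals = area(cover)
cycle = critical_cycle(delays, succ)
```

In this toy graph the cycle through a, b and d dominates the one through a, c and d, so a transformation that shortens the b branch would be the one to try first.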
80Performance estimation critical cycles
81Conclusions
- Fully automated design flow
- From HDLs (control / data splitting)
- Existing tools for data-path synthesis
- Direct synthesis guarantees implementation (HDL -> Petri net, Petri-net-based encoding)
- Synthesis of large controllers by efficient spec models (free-choice Petri nets + trace expressions)
- Exploration of the design space (optimization) by property-preserving transformations
- Logic synthesis by structural methods