Title: STG-based synthesis and Petrify
1STG-based synthesis and Petrify
J. Cortadella (Univ. Politècnica Catalunya)
Mike Kishinevsky (Intel Corporation)
Alex Kondratyev (University of Aizu)
Luciano Lavagno (Università di Udine)
Enric Pastor (Univ. Politècnica Catalunya)
Alexander Taubin (University of Aizu)
Alex Yakovlev (Univ. Newcastle upon Tyne)
2What is it about?
- This tutorial is about the synthesis of asynchronous circuits from behavioral specifications.
- STGs (based on Petri nets) can specify I/O concurrency.
- STGs specify behavior at a level at which logic synthesis techniques can be applied.
- Speed-independent and timed circuits can be derived.
3Design flow
4Outline
- Overview
- Synthesis steps
- Specification (STGs)
- State encoding
- Logic synthesis, decomposition and mapping
- Synthesis with relative timing
- Conclusions
5Signal Transition Graph (STG)
(Figure: signals x, y and z and the corresponding STG, whose transitions include x-, y- and z-.)
6(No Transcript)
7(No Transcript)
8Next-state functions
9(No Transcript)
10Design flow
- Specification (STG)
- Reachability analysis → State Graph
- State encoding → SG with CSC
- Boolean minimization → Next-state functions
- Logic decomposition → Decomposed functions
- Technology mapping → Gate netlist
11VME bus
12STG for the READ cycle
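(Figure: STG of the READ cycle with transitions DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, LDS-, LDTACK- and DTACK-, next to a block diagram of the VME Bus Controller with signals DSr, DTACK, LDS, LDTACK and D.)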
13Choice: Read and Write cycles
14Choice: Read and Write cycles
15Choice: Read and Write cycles
16Choice: Read and Write cycles
17Circuit synthesis
- Goal: derive a hazard-free circuit under a given delay model and mode of operation
18Modes of operation
- Fundamental mode
- Single-input changes
- Multiple-input changes
- Input / Output mode
- Concurrency between circuit and environment
(Figure labels: current state, next state.)
19Speed independence
- Delay model
- Unbounded gate / environment delays
- Certain wire delays shorter than certain paths in the circuit
- Conditions for implementability
- Consistency
- Complete State Coding
- Output persistency
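Consistency and output persistency can be checked mechanically once the state graph is explored. The following is a minimal Python sketch, not Petrify's implementation. It plays the token game on the READ-cycle STG of the VME bus controller. The arc list, the initial marking, the all-zero initial code and the input/output split are assumptions reconstructed from the usual presentation of this example, and the function names are hypothetical.

```python
from collections import deque

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
OUTPUTS = {"DTACK", "LDS", "D"}  # assumption: DSr and LDTACK are inputs
# Assumed STG: each arc (t, u) is an implicit place from transition t to u.
ARCS = [("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
        ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
        ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
        ("LDTACK-", "LDS+"), ("DTACK-", "DSr+")]
INITIAL = frozenset({("LDTACK-", "LDS+"), ("DTACK-", "DSr+")})  # marked places


def enabled(arcs, marking):
    """Transitions whose input places are all marked."""
    return {t for _, t in arcs
            if all(a in marking for a in arcs if a[1] == t)}


def check_stg(arcs, initial, signals, outputs):
    """Explore the state graph; return (consistent, output_persistent, n_states)."""
    codes = {initial: (0,) * len(signals)}  # assumption: all signals start at 0
    queue = deque([initial])
    consistent = persistent = True
    while queue:
        m = queue.popleft()
        en = enabled(arcs, m)
        for t in en:
            i, value = signals.index(t[:-1]), int(t.endswith("+"))
            if codes[m][i] == value:      # x+ fired at x = 1, or x- at x = 0
                consistent = False
            pre = {a for a in arcs if a[1] == t}
            nxt = frozenset((m - pre) | {a for a in arcs if a[0] == t})
            # An enabled output may only be disabled by firing itself.
            lost = en - enabled(arcs, nxt) - {t}
            if any(u[:-1] in outputs for u in lost):
                persistent = False
            code = codes[m][:i] + (value,) + codes[m][i + 1:]
            if nxt not in codes:
                codes[nxt] = code
                queue.append(nxt)
            elif codes[nxt] != code:      # same marking reached with two codes
                consistent = False
    return consistent, persistent, len(codes)


print(check_stg(ARCS, INITIAL, SIGNALS, OUTPUTS))
```

For the READ cycle as modelled here, the check reports a consistent and output-persistent specification with 14 reachable states. Complete State Coding is a separate property of the binary codes and is not tested by this sketch.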
20Other synthesis approaches
- Burst-mode machines
- Mealy-like FSMs
- Fundamental mode (slow environment)
- VLSI programming
- Syntax-directed translation from CSP (Communicating Sequential Processes)
- No logic synthesis
- Circuit size tracks the size of the specification
21Design flow
(Figure: the design flow diagram of slide 10, repeated.)
22STG for the READ cycle
(Figure: the READ-cycle STG and VME Bus Controller block diagram of slide 12, repeated.)
23State Graph (Read cycle)
(Figure: state graph of the READ cycle; arcs are labelled with transitions of DSr, DTACK, LDS, LDTACK and D.)
24Binary encoding of signals
(Figure: the state graph of the READ cycle from slide 23, before the binary codes are added.)
25Binary encoding of signals
(Figure: state graph of the READ cycle with states labelled by binary codes, among them 10000, 10010, 10110 and 01110; the code 10110 appears on two different states.)
Signal order in each code: (DSr, DTACK, LDTACK, LDS, D)
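The step from the STG to the binary-encoded state graph can be sketched as a breadth-first reachability analysis. This is a minimal illustration in Python, not Petrify's BDD-based algorithm. The arc list, the initial marking and the all-zero initial code are assumptions reconstructed from the usual presentation of the READ cycle; `state_graph` is a hypothetical helper.

```python
from collections import deque

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
# Assumed STG: each arc (t, u) is an implicit place from transition t to u.
ARCS = [("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
        ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
        ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
        ("LDTACK-", "LDS+"), ("DTACK-", "DSr+")]
INITIAL = frozenset({("LDTACK-", "LDS+"), ("DTACK-", "DSr+")})  # marked places


def state_graph(arcs, initial, signals):
    """Breadth-first reachability; maps each marking to its binary code."""
    transitions = sorted({t for arc in arcs for t in arc})
    codes = {initial: "0" * len(signals)}  # assumption: all signals start at 0
    queue = deque([initial])
    while queue:
        m = queue.popleft()
        for t in transitions:
            pre = {a for a in arcs if a[1] == t}
            if not pre <= m:
                continue  # t is not enabled in marking m
            nxt = frozenset((m - pre) | {a for a in arcs if a[0] == t})
            if nxt not in codes:
                i = signals.index(t[:-1])
                bit = "1" if t.endswith("+") else "0"
                codes[nxt] = codes[m][:i] + bit + codes[m][i + 1:]
                queue.append(nxt)
    return codes


codes = state_graph(ARCS, INITIAL, SIGNALS)
print(len(codes), "states")
for code in sorted(codes.values()):
    print(code)
```

Under these assumptions the exploration yields 14 states but only 13 distinct codes: 10110 is assigned to two different markings, which agrees with the repeated code on the slide.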
26Excitation / Quiescent Regions
27Next-state function
0 → 1
0 → 0
1 → 1
1 → 0
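The four cases above can be turned into a table of implied next values per binary code. The sketch below is a minimal Python illustration for signal LDS of the READ cycle: the next value is 1 where LDS+ is enabled, 0 where LDS- is enabled, and the current value otherwise. The STG arcs, the initial marking and the all-zero initial code are assumptions reconstructed from the usual presentation of this example; `next_state_function` is a hypothetical name.

```python
from collections import deque

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
# Assumed STG: each arc (t, u) is an implicit place from transition t to u.
ARCS = [("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
        ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
        ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
        ("LDTACK-", "LDS+"), ("DTACK-", "DSr+")]
INITIAL = frozenset({("LDTACK-", "LDS+"), ("DTACK-", "DSr+")})  # marked places


def next_state_function(arcs, initial, signals, signal):
    """Map each reachable code to the set of next values implied for `signal`."""
    transitions = sorted({t for arc in arcs for t in arc})
    codes = {initial: "0" * len(signals)}  # assumption: all signals start at 0
    queue = deque([initial])
    pos = signals.index(signal)
    values = {}
    while queue:
        m = queue.popleft()
        en = [t for t in transitions if {a for a in arcs if a[1] == t} <= m]
        if signal + "+" in en:
            nxt_value = 1                   # 0 -> 1: signal+ is enabled
        elif signal + "-" in en:
            nxt_value = 0                   # 1 -> 0: signal- is enabled
        else:
            nxt_value = int(codes[m][pos])  # 0 -> 0 or 1 -> 1: signal is stable
        values.setdefault(codes[m], set()).add(nxt_value)
        for t in en:
            pre = {a for a in arcs if a[1] == t}
            nxt = frozenset((m - pre) | {a for a in arcs if a[0] == t})
            if nxt not in codes:
                i = signals.index(t[:-1])
                bit = "1" if t.endswith("+") else "0"
                codes[nxt] = codes[m][:i] + bit + codes[m][i + 1:]
                queue.append(nxt)
    return values


lds = next_state_function(ARCS, INITIAL, SIGNALS, "LDS")
for code in sorted(lds):
    print(code, sorted(lds[code]))
```

A code that maps to both 0 and 1 has no well-defined next value. In this model that happens only for 10110, so the function of LDS cannot be derived until the encoding is fixed.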
28Karnaugh map for LDS
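(Figure: Karnaugh map of the next-state function of LDS, split into the halves LDS = 1 and LDS = 0. Most cells are don't cares (-); one cell is marked 0/1? because it has conflicting values.)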
29Design flow
(Figure: the design flow diagram of slide 10, repeated.)
30Concurrency reduction
(Figure: state graph fragment with transitions LDS+ and LDS-; two states carry the code 10110.)
31Concurrency reduction
(Figure: READ-cycle STG after concurrency reduction, with transitions DTACK-, DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, LDS- and LDTACK-.)
32State encoding conflicts
(Figure: state graph fragment with transitions LDS+, LDTACK-, LDS- and LDTACK+; two different states carry the same code 10110.)
33Signal Insertion
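(Figure: the same state graph fragment after inserting a new signal, with transitions LDTACK-, LDS+, LDS-, LDTACK+, D- and DSr-; the two conflicting states now carry the distinct codes 101101 and 101100.)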
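The effect of signal insertion can be sketched by comparing the state coding before and after adding a state signal. The Python example below is an illustration only. The STG arcs and initial markings are assumptions, and the placement of the new signal, here called csc (csc+ after DSr+ and LDTACK-, csc- between DSr- and D-), is chosen so that it reproduces the two codes 101101 and 101100 shown on the slide. It is not claimed to be the insertion Petrify computes.

```python
from collections import deque

INPUTS = {"DSr", "LDTACK"}  # assumption: the remaining signals are non-inputs
SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
# Assumed STG: each arc (t, u) is an implicit place from transition t to u.
ARCS = [("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
        ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
        ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
        ("LDTACK-", "LDS+"), ("DTACK-", "DSr+")]
INITIAL = frozenset({("LDTACK-", "LDS+"), ("DTACK-", "DSr+")})

# Hypothetical insertion of csc, appended as the last bit of each code.
SIGNALS_CSC = SIGNALS + ("csc",)
ARCS_CSC = [("DSr+", "csc+"), ("LDTACK-", "csc+"), ("csc+", "LDS+"),
            ("LDS+", "LDTACK+"), ("LDTACK+", "D+"), ("D+", "DTACK+"),
            ("DTACK+", "DSr-"), ("DSr-", "csc-"), ("csc-", "D-"),
            ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
            ("DTACK-", "DSr+")]
INITIAL_CSC = frozenset({("LDTACK-", "csc+"), ("DTACK-", "DSr+")})


def csc_conflicts(arcs, initial, signals, inputs):
    """Return (codes of all states, codes whose states enable different non-input events)."""
    transitions = sorted({t for arc in arcs for t in arc})
    codes = {initial: "0" * len(signals)}  # assumption: all signals start at 0
    queue = deque([initial])
    excited = {}  # code -> distinct sets of enabled non-input transitions
    while queue:
        m = queue.popleft()
        en = [t for t in transitions if {a for a in arcs if a[1] == t} <= m]
        excited.setdefault(codes[m], set()).add(
            frozenset(t for t in en if t[:-1] not in inputs))
        for t in en:
            pre = {a for a in arcs if a[1] == t}
            nxt = frozenset((m - pre) | {a for a in arcs if a[0] == t})
            if nxt not in codes:
                i = signals.index(t[:-1])
                bit = "1" if t.endswith("+") else "0"
                codes[nxt] = codes[m][:i] + bit + codes[m][i + 1:]
                queue.append(nxt)
    conflicts = sorted(c for c, sets in excited.items() if len(sets) > 1)
    return sorted(codes.values()), conflicts


before_codes, before = csc_conflicts(ARCS, INITIAL, SIGNALS, INPUTS)
after_codes, after = csc_conflicts(ARCS_CSC, INITIAL_CSC, SIGNALS_CSC, INPUTS)
print("before:", len(before_codes), "states, conflicts", before)
print("after: ", len(after_codes), "states, conflicts", after)
```

In this model the original state graph has one conflicting code (10110). With the extra signal the two states are told apart as 101101 and 101100, at the cost of two additional states.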
34Design flow
(Figure: the design flow diagram of slide 10, repeated.)
35Complex-gate implementation
36Design flow
(Figure: the design flow diagram of slide 10, repeated.)
37Hazards
38Hazards
(Figure: sequence of states with codes 1000, 1100, 1100, 0100 and 0110.)
39Decomposition
- Global acknowledgement
- Generating candidates
- Hazard-free signal insertion
- Event insertion
- Signal insertion
40Global acknowledgement
41How about 2-input gates ?
42How about 2-input gates ?
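(Figure: gate-level circuit with signals a, b, c, d, y and z.)
43How about 2-input gates ?
(Figure: the same circuit with two points annotated with the value 0.)
44How about 2-input gates ?
(Figure: gate-level circuit with signals a, b, c, d, y and z.)
45How about 2-input gates ?
(Figure: circuit with signals c, d, y and z.)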
46Strategy for logic decomposition
- Each decomposition defines a new internal signal
- Method: insert new internal signals such that
- After resynthesis, some large gates are decomposed
- The new specification is hazard-free
- Generation of candidates for decomposition
- Algebraic factorization
- Boolean factorization (boolean relations)
47Decomposition example
48(No Transcript)
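49(Figure: state graph fragment with transitions of y and s.)
50(Figure: state graph partitioned into the regions s=1 and s=0, with transition y- and the inserted transitions of s.)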
51(No Transcript)
52(Figure: state graph partitioned into the regions s=1 and s=0, with states coded 1001, 1011, 0011, 1000, 1010 and 0111 and transitions of w, x, y, z and s.)
z- is delayed by the new transition s-!
53(No Transcript)
54Signal insertion for function F
Insertion by input borders
State Graph
55Event insertion
56Properties to preserve
a is persistent
57Design flow
(Figure: the design flow diagram of slide 10, repeated.)
58Timing assumptions in design flow
- Speed-independent: wire delays after a fork shorter than fan-out gate delays
- Burst-mode: circuit stabilizes between two changes at the inputs
- Timed circuits: absolute bounds on gate / environment delays are known a priori (before physical design)
59Relative Timing Circuits
- Assumptions: "a before b" for concurrent and ordered events
- Used by the tool to derive a circuit and timing constraints that must be met in the physical design flow
- Applied to the design of the Rotating Asynchronous Pentium Processor(TM) Instruction Decoder (K. Stevens, S. Rotem et al., Intel Corporation)
60Lazy Transition Systems
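(Figure: state graph fragment showing the excitation region ER(LDS+), the firing region FR(LDS-) and the enabling region EnR(LDS-), with transitions LDS+, LDS- and DTACK-.)
Event LDS- is lazy: its firing region is a subset of its enabling region.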
61Timing assumptions
- (a before b) for concurrent events: concurrency reduction for firing and enabling
- (a before b) for ordered events: early enabling
- (a simultaneous to b wrt c) for triples of events: combination of the above
62Netlist with SI timing constraints
(Figure: READ-cycle STG next to the speed-independent netlist, with gates labelled map and csc and signals D, DTACK, LDS, DSr and LDTACK.)
63Adding timing assumptions (I)
(Figure: READ-cycle STG next to the netlist, with gates labelled map and csc and signals D, DTACK, LDS, DSr and LDTACK.)
64Adding timing assumptions (I)
(Figure: netlist with gates labelled map and csc and signals D, DTACK, LDS, DSr and LDTACK.)
65State space domain
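(Figure: state graph with the events DSr+ and LDTACK- highlighted.)
66State space domain
(Figure: state graph with the events DSr+ and LDTACK- highlighted.)
67State space domain
(Figure: state graph with the events DSr+ and LDTACK- highlighted.)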
Two more unreachable states
68Boolean domain
(Figure: the Karnaugh map for LDS of slide 28, repeated, including the cell marked 0/1?.)
69Boolean domain
(Figure: Karnaugh map for LDS after the timing assumption; the cell marked 0/1? on slide 68 now holds 1, and one cell that held 0 is now a don't care.)
One more DC vector for all signals
One state conflict is removed
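The effect of a "before" assumption on concurrent events can be sketched by adding an ordering arc to the STG and repeating the reachability analysis. The Python example below is an illustration under stated assumptions: the STG arcs and initial marking are reconstructed from the usual presentation of the READ cycle, and the timing assumption is read from slides 65 to 67 as "LDTACK- before DSr+". The helper name `explore` is hypothetical.

```python
from collections import deque

INPUTS = {"DSr", "LDTACK"}  # assumption: the remaining signals are outputs
SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
# Assumed STG: each arc (t, u) is an implicit place from transition t to u.
ARCS = [("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
        ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
        ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
        ("LDTACK-", "LDS+"), ("DTACK-", "DSr+")]
INITIAL = frozenset({("LDTACK-", "LDS+"), ("DTACK-", "DSr+")})

# Assumed timing assumption "LDTACK- before DSr+", modelled as an extra arc.
# Its place is marked initially because LDTACK is already low at the start.
TIMED_ARCS = ARCS + [("LDTACK-", "DSr+")]
TIMED_INITIAL = INITIAL | {("LDTACK-", "DSr+")}


def explore(arcs, initial, signals, inputs):
    """Return (number of reachable states, codes with a state coding conflict)."""
    transitions = sorted({t for arc in arcs for t in arc})
    codes = {initial: "0" * len(signals)}  # assumption: all signals start at 0
    queue = deque([initial])
    excited = {}  # code -> distinct sets of enabled non-input transitions
    while queue:
        m = queue.popleft()
        en = [t for t in transitions if {a for a in arcs if a[1] == t} <= m]
        excited.setdefault(codes[m], set()).add(
            frozenset(t for t in en if t[:-1] not in inputs))
        for t in en:
            pre = {a for a in arcs if a[1] == t}
            nxt = frozenset((m - pre) | {a for a in arcs if a[0] == t})
            if nxt not in codes:
                i = signals.index(t[:-1])
                bit = "1" if t.endswith("+") else "0"
                codes[nxt] = codes[m][:i] + bit + codes[m][i + 1:]
                queue.append(nxt)
    return len(codes), sorted(c for c, s in excited.items() if len(s) > 1)


print("speed-independent:", explore(ARCS, INITIAL, SIGNALS, INPUTS))
print("with assumption:  ", explore(TIMED_ARCS, TIMED_INITIAL, SIGNALS, INPUTS))
```

In this model the assumption makes two states unreachable (14 states become 12) and the conflict on code 10110 disappears, so no extra state signal is needed. The assumption then has to hold in the physical design.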
70Netlist with one constraint
(Figure: netlist with gates labelled map and csc and signals D, DTACK, LDS, DSr and LDTACK.)
71Netlist with one constraint
72Timing assumptions
- (a before b) for concurrent events: concurrency reduction for firing and enabling
- (a before b) for ordered events: early enabling
- (a simultaneous to b wrt c) for triples of events: combination of the above
73Ordered events: early enabling
(Figure: events a, b and c, with logic functions F and G.)
74Adding timing assumptions (II)
(Figure: READ-cycle STG next to the netlist with signals D, DTACK, LDS, DSr and LDTACK.)
75State space domain
(Figure: state graph fragment with events LDS-, D- and DSr-.)
Reachable space is unchanged
For LDS-, enabling can be changed in one state
76Boolean domain
(Figure: the Karnaugh map for LDS of slide 69, repeated.)
77Boolean domain
(Figure: Karnaugh map for LDS after early enabling; one cell that held 1 on slide 76 is now a don't care.)
One more DC vector for one signal: LDS
If used: LDS = DSr; otherwise: LDS = DSr + D
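The two alternatives for LDS can be checked against the state graph. The Python sketch below is a worked check under stated assumptions: the slide is read as LDS = DSr versus LDS = DSr + D, the STG arcs and the first timing assumption ("LDTACK- before DSr+") are reconstructions, and the state made a don't care by early enabling of LDS- is taken to be 01111 (after DSr-, before D-). All names are hypothetical.

```python
from collections import deque

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
# Assumed STG with the extra ordering arc LDTACK- -> DSr+ (timing assumption I).
ARCS = [("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
        ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
        ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
        ("LDTACK-", "LDS+"), ("DTACK-", "DSr+"), ("LDTACK-", "DSr+")]
INITIAL = frozenset({("LDTACK-", "LDS+"), ("DTACK-", "DSr+"),
                     ("LDTACK-", "DSr+")})
EARLY = "01111"  # assumed state in which LDS- is enabled early


def next_values(arcs, initial, signals, signal):
    """Map each reachable code to the next value implied for `signal`."""
    transitions = sorted({t for arc in arcs for t in arc})
    codes = {initial: "0" * len(signals)}  # assumption: all signals start at 0
    queue = deque([initial])
    pos = signals.index(signal)
    table = {}
    while queue:
        m = queue.popleft()
        en = [t for t in transitions if {a for a in arcs if a[1] == t} <= m]
        if signal + "+" in en:
            table[codes[m]] = 1
        elif signal + "-" in en:
            table[codes[m]] = 0
        else:
            table[codes[m]] = int(codes[m][pos])
        for t in en:
            pre = {a for a in arcs if a[1] == t}
            nxt = frozenset((m - pre) | {a for a in arcs if a[0] == t})
            if nxt not in codes:
                i = signals.index(t[:-1])
                bit = "1" if t.endswith("+") else "0"
                codes[nxt] = codes[m][:i] + bit + codes[m][i + 1:]
                queue.append(nxt)
    return table


def bit(code, name):
    """Value of signal `name` in a binary code."""
    return int(code[SIGNALS.index(name)])


table = next_values(ARCS, INITIAL, SIGNALS, "LDS")
# Without the extra don't care, LDS = DSr + D matches every reachable state.
full_cover = all(v == (bit(c, "DSr") | bit(c, "D")) for c, v in table.items())
# With state EARLY as a don't care, the simpler LDS = DSr is enough.
simple_cover = all(v == bit(c, "DSr") for c, v in table.items() if c != EARLY)
print(full_cover, simple_cover)
```

In this model both checks succeed, and LDS = DSr fails only in state 01111 when that state is not treated as a don't care. That is the one place where the second timing assumption is used.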
78Before early enabling
(Figure: READ-cycle STG next to the netlist with signals D, DTACK, LDS, DSr and LDTACK, before early enabling is applied.)
79Netlist with two constraints
(Figure: READ-cycle STG next to the final netlist with signals D, DTACK, DSr, LDS and LDTACK.)
Both timing assumptions are used for optimization
and become constraints
80Backannotation
- Timed circuits require post-verification
- Can synthesis tools help?
- Report the least stringent set of timing assumptions required for the correctness of the circuit
- Not all initial timing assumptions may be required
- Petrify reports a set of firing order constraints that guarantee the circuit correctness
81Experiments
- Assumption: delays are controllable in physical design
- 2-3x improvement in area/delay with respect to SI (K. Stevens, S. Rotem et al., Intel Corporation)
- Rotating Asynchronous Pentium Processor(TM)
- Instruction Decoder (Async99)
82Summary
- Synthesis of asynchronous circuits can be automated at gate level (logic synthesis)
- Timing assumptions/constraints are essential to compete with synchronous circuits
- Relative timing seems to be a promising approach for specification and synthesis
- High-level and logic synthesis can be combined (e.g. CSP → Petri net → circuit)
83Petrify
- The synthesis methodology presented in this tutorial is handled by petrify
- but also ...
- Concurrency reduction
- Automatic handshake expansion (2-4 phase)
- Noise isolation
- Synthesis with gC elements and gate libraries
- Synthesis of Petri nets (crucial for backannotation)
- ...
84Petrify implementation details
- 50,000 lines of code, plus SIS (data structures and logic synthesis), a BDD package (symbolic manipulation) and dot (graph visualization package from AT&T)
- BDD-based implementation
- Reachability analysis
- Manipulation of sets of states
- Boolean minimization
85Petrify
http://www.lsi.upc.es/jordic/petrify
- References
- Tutorial for the designer
- Binaries (for several platforms)