Logic design of asynchronous circuits

ASPDAC / VLSI 2002 - Tutorial on Logic Design of Asynchronous Circuits

Learn more at: https://www.cs.upc.edu

1
Logic design of asynchronous circuits
  • Part III
  • Advanced topics on synthesis

2
Outline
  • Logic decomposition
  • Hazard-free decomposition
  • Signal insertion
  • Technology mapping
  • Optimization based on timing information
  • Relative timing
  • Timing assumptions and constraints
  • Other synthesis paradigms
  • HDLs, CSP, burst-mode, ...

3
Design flow
Specification (STG)
Reachability analysis
State Graph
State encoding
SG with CSC
Boolean minimization
Next-state functions
Logic decomposition
Decomposed functions
Technology mapping
Gate netlist
4
No Hazards
5
Decomposition May Lead to Hazards
[Figure: decomposed circuit annotated with the state codes 1000, 1100, 1100, 0100, 0110]
6
Decomposition
  • Acknowledgement
  • Global acknowledgement
  • Generating candidates
  • Hazard-free signal insertion
  • Event insertion
  • Signal insertion

7
Global acknowledgement
8
How about 2-input gates?
9
How about 2-input gates?
[Figure: circuit with signals a, b, c, d, y, z]
10
How about 2-input gates?
[Figure: circuit with signals a, b, c, d, y, z; two nodes are annotated with the value 0]
11
How about 2-input gates?
[Figure: circuit with signals a, b, c, d, y, z]
12
How about 2-input gates?
[Figure: circuit with signals c, d, y, z]
13
Strategy for logic decomposition
  • Each decomposition defines a new internal signal
  • Method: insert new internal signals such that
  • after resynthesis, some large gates are
    decomposed
  • the new specification is hazard-free
  • Generate candidates for decomposition using
    standard logic factorization techniques:
  • Algebraic factorization
  • Boolean factorization (Boolean relations)
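
As an illustration of how algebraic factorization can produce decomposition candidates, the following is a minimal sketch of weak (algebraic) division on sum-of-products covers. The function names and the example cover F = ac + ad + bc + bd + e are assumptions made for this sketch; they do not come from the slides, and the sketch does not check hazard-freeness.

```python
def algebraic_divide(f, d):
    """Weak (algebraic) division of cover f by cover d.

    Covers are sets of cubes; each cube is a frozenset of literal names.
    Returns (quotient, remainder) with f = quotient * d + remainder.
    """
    quotient = None
    for d_cube in d:
        # Cubes of f that contain d_cube, with d_cube's literals removed.
        partial = {cube - d_cube for cube in f if d_cube <= cube}
        quotient = partial if quotient is None else quotient & partial
    quotient = quotient or set()
    # Cubes covered by quotient * d; everything else is the remainder.
    product = {q | d_cube for q in quotient for d_cube in d}
    remainder = set(f) - product
    return quotient, remainder


def cover(*cubes):
    """Hypothetical helper: build a cover from strings such as "ac"."""
    return {frozenset(c) for c in cubes}


# Illustrative function (an assumption): F = ac + ad + bc + bd + e
F = cover("ac", "ad", "bc", "bd", "e")
Q, R = algebraic_divide(F, cover("a", "b"))
# F = (a + b)(c + d) + e, so (a + b) is a candidate for a new internal signal.
```

Each divisor with a non-empty quotient is only a candidate; in the flow described above it still has to pass the hazard-free signal insertion step.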

14
Decomposition example
15
Decomposition example
[Figure: state graph over the signals w, x, y, z with states 1001, 1011, 1000, 0001, 1010, 0000, 0101, 0011, 0010, 0100, 0110, 0111 and transitions on w, x, y, z]
16
Decomposition example
[Figure: the state graph after inserting the internal signal s, with regions s=1 and s=0 and the new transitions of s]
17
Decomposition example
[Figure: the state graph with the inserted signal s (regions s=1 and s=0), shown together with the original transitions on w, x, y, z]
18
Decomposition example
[Figure: the original state graph with the regions yz=1 and yz=0 marked]
19
Decomposition example
[Figure: the state graph after inserting signal s for this decomposition (regions s=1 and s=0)]
z- is delayed by the new transition s-!
20
Decomposition example
[Figure: the state graph with the inserted signal s (regions s=1 and s=0), with the transitions of y highlighted]
21
Decomposition (Algebraic, Boolean relations)
F
22
Decomposition (Algebraic, Boolean relations)
F
until no more progress
Hazard-free? (Event insertion)
23
Signal insertion for function F
Insertion by input borders
State Graph
24
Event insertion
25
Event insertion
[Figure: state graph with the region SR(x), an event b and the inserted transitions of x]
26
Properties to preserve
a is persistent
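
A minimal sketch of how persistency could be checked on a state graph, assuming the usual definition: an event is persistent if, once enabled, it stays enabled after any other event fires. The dictionary encoding of the state graph and the example graphs are assumptions made for illustration.

```python
def is_persistent(graph, event):
    """Return True if `event` is never disabled by firing another event.

    graph: dict mapping state -> dict mapping event -> next state.
    """
    for state, enabled in graph.items():
        if event not in enabled:
            continue
        for other, next_state in enabled.items():
            if other == event:
                continue
            # After firing `other`, `event` must still be enabled.
            if event not in graph.get(next_state, {}):
                return False
    return True


# Hypothetical diamond: a and b are concurrent, so a stays enabled after b.
diamond = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"b": "s3"},
    "s2": {"a": "s3"},
    "s3": {},
}

# Hypothetical choice: firing b disables a, so a is not persistent.
choice = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {},
    "s2": {},
}
```

An insertion that turns a graph like `diamond` into one like `choice` for some event would violate the property.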
27
Boolean decomposition
f = F(x1, ..., xn)
f = G(H(x1, ..., xn))
Our problem: given F and G, find H
28
[Figure: gate G with inputs h1, h2 and output f]
This is a Boolean Relation
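
To make the "given F and G, find H" problem concrete, here is a minimal sketch that enumerates the Boolean relation for a purely combinational gate G: for every input vector it lists the (h1, h2) values that make G reproduce F. The example functions (a 3-input AND for F, a 2-input AND for G) are assumptions chosen for illustration; sequential gates, where the allowed values also depend on the current output, are not covered.

```python
from itertools import product


def boolean_relation(F, G, n_inputs, n_h):
    """Map each input vector x to the set of vectors h with G(h) == F(x)."""
    relation = {}
    for x in product((0, 1), repeat=n_inputs):
        relation[x] = {
            h for h in product((0, 1), repeat=n_h) if G(*h) == F(*x)
        }
    return relation


def F(a, b, c):
    """Illustrative target function: f = a b c."""
    return a & b & c


def G(h1, h2):
    """Illustrative library gate: 2-input AND."""
    return h1 & h2


REL = boolean_relation(F, G, n_inputs=3, n_h=2)


def H(a, b, c):
    """One candidate solution: h1 = a b, h2 = c."""
    return (a & b, c)


# H solves the relation if it picks an allowed vector for every input.
H_IS_COMPATIBLE = all(H(*x) in allowed for x, allowed in REL.items())
```

Any H that selects one allowed vector per input is a valid decomposition, which is why the problem is a relation rather than a function.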
29
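[Figure: block F with signals a, c, d, y]
30
[Figure: circuit with signals a, c, d, y]
31
[Figure: circuit with signals a, c, d, y]
32
[Figure: circuit with signals a, c, d, y]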
33
Technology mapping
  • Merging small gates into larger gates introduces
    no new hazards
  • Standard synchronous techniques can be applied,
    e.g. BDD-based Boolean matching
  • Handles sequential gates and combinational
    feedbacks
  • Because of hazards, there is no guarantee of finding
    a correct mapping (some gates cannot be decomposed)
  • Timing-aware decomposition can be applied in
    these rare cases

34
Design flow
Specification (STG)
Reachability analysis
State Graph
State encoding
SG with CSC
Boolean minimization
Next-state functions
Logic decomposition
Decomposed functions
Technology mapping
Gate netlist
35
Timing assumptions in design flow
  • Speed-independent: wire delays after a
    fork are smaller than fan-out gate delays
  • Burst-mode: the circuit stabilizes between two
    changes at the inputs
  • Timed circuits: absolute bounds on gate /
    environment delays are known a priori (before
    physical design)

36
Relative Timing Circuits
  • Assumptions: "a before b"
  • for concurrent events: reduces the reachable state
    space
  • for ordered events: permits early enabling
  • both increase the don't-care space for logic
    synthesis => simpler logic (better area and
    timing)
  • "Assume - if useful - guarantee" approach:
    assumptions are used by the tool to derive a
    circuit and the required timing constraints that must
    be met in the physical design flow
  • Applied to the design of the Rotating Asynchronous
    Pentium Processor(TM) Instruction Decoder
    (K. Stevens, S. Rotem et al., Intel Corporation)

37
Relative Timing Asynchronous Circuits
Speed-independent C-element
[Figure: C-element with signals a, b, c]
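
For reference, a minimal sketch of the standard next-state function of a two-input C-element, assuming a and b are the inputs and c the output in the figure above: the output rises when both inputs are 1, falls when both are 0, and otherwise holds its value.

```python
def c_element(a: int, b: int, c: int) -> int:
    """Next value of the output c of a two-input C-element.

    Implements c' = a b + c (a + b): set when both inputs are 1,
    reset when both are 0, hold the previous value otherwise.
    """
    return (a & b) | (c & (a | b))


# Hold behaviour: with inputs (1, 0) the output keeps its previous value.
HOLD_LOW = c_element(1, 0, 0)
HOLD_HIGH = c_element(1, 0, 1)
```

The feedback term c (a + b) is what makes the gate sequential.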
38
State Graph (Read cycle)
[Figure: state graph of the read cycle with transitions on DSr, LDS, LDTACK, D, DTACK]
39
Lazy Transition Systems
[Figure: state graph fragment showing the regions ER(LDS), ER(LDS-), FR(LDS-) and the event DTACK-]
Event LDS- is lazy: its firing region FR(LDS-) is a subset of its enabling region ER(LDS-)
40
Timing assumptions
  • (a before b) for concurrent events:
    concurrency reduction for firing and
    enabling
  • (a before b) for ordered events:
    early enabling
  • (a simultaneous to b wrt c) for triples of
    events: combination of the above
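
The effect of an "a before b" assumption on concurrent events can be sketched as a reachability computation in which b is not allowed to fire in any state where a is still enabled. The state graph encoding and the small diamond example are assumptions made for illustration only.

```python
def reachable(graph, initial, first=None, second=None):
    """States reachable from `initial`.

    graph: dict mapping state -> dict mapping event -> next state.
    If `first` and `second` are given, the assumption "first before second"
    is applied: `second` does not fire while `first` is enabled.
    """
    seen = {initial}
    stack = [initial]
    while stack:
        state = stack.pop()
        enabled = graph.get(state, {})
        for event, next_state in enabled.items():
            if event == second and first in enabled:
                continue  # firing removed by the timing assumption
            if next_state not in seen:
                seen.add(next_state)
                stack.append(next_state)
    return seen


# Hypothetical diamond: a+ and b+ are concurrent in state "00".
diamond = {
    "00": {"a+": "10", "b+": "01"},
    "10": {"b+": "11"},
    "01": {"a+": "11"},
    "11": {},
}

ALL_STATES = reachable(diamond, "00")
WITH_ASSUMPTION = reachable(diamond, "00", first="a+", second="b+")
```

State "01" is no longer reachable under the assumption, so its code can be treated as a don't care during Boolean minimization.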

41
Speed-independent Netlist
[Figure: STG of the read cycle (transitions on DSr, LDS, LDTACK, D, DTACK) and its speed-independent netlist with signals D, DTACK, LDS, DSr, LDTACK and nodes map, csc]
42
Adding timing assumptions (I)
[Figure: STG of the read cycle (transitions on DSr, LDS, LDTACK, D, DTACK) and the netlist with signals D, DTACK, LDS, DSr, LDTACK and nodes map, csc]
43
Adding timing assumptions (I)
[Figure: STG of the read cycle (transitions on DSr, LDS, LDTACK, D, DTACK) and the netlist with signals D, DTACK, LDS, DSr, LDTACK and nodes map, csc]
44
State space domain
[Figure: state graph with the events DSr and LDTACK- highlighted]
45
State space domain
[Figure: state graph with the events DSr and LDTACK- highlighted]
46
State space domain
[Figure: state graph with the events DSr and LDTACK- highlighted]
Two more unreachable states
47
Boolean domain
[Figure: Karnaugh maps for LDS = 1 and LDS = 0; one entry is marked "0/1?"]
48
Boolean domain
[Figure: Karnaugh maps for LDS = 1 and LDS = 0]
One more DC vector for all signals
One state conflict is removed
49
Netlist with one constraint
[Figure: STG of the read cycle and the netlist obtained with one timing constraint; signals D, DTACK, LDS, DSr, LDTACK and nodes map, csc]
50
Netlist with one constraint
[Figure: STG of the read cycle with transitions on DSr, LDS, LDTACK, D, DTACK]
51
Timing assumptions
  • (a before b) for concurrent events:
    concurrency reduction for firing and
    enabling
  • (a before b) for ordered events:
    early enabling
  • (a simultaneous to b wrt c) for triples of
    events: combination of the above

52
Ordered events: early enabling
[Figure: ordered events a, b, c and the logic functions F and G]
53
Adding timing assumptions (II)
[Figure: STG of the read cycle (transitions on DSr, LDS, LDTACK, D, DTACK) and the netlist with signals D, DTACK, LDS, DSr, LDTACK]
54
State space domain
[Figure: state graph with the events LDS-, D-, DSr- highlighted]
Reachable space is unchanged
For LDS-, enabling can be changed in one state
55
Boolean domain
[Figure: Karnaugh maps for LDS = 1 and LDS = 0]
56
Boolean domain
[Figure: Karnaugh maps for LDS = 1 and LDS = 0; one entry that was 1 is now a don't care]
One more DC vector for one signal: LDS
If used: LDS = DSr; otherwise: LDS = DSr + D
57
Before early enabling
[Figure: STG of the read cycle (transitions on DSr, LDS, LDTACK, D, DTACK) and the netlist with signals D, DTACK, LDS, DSr, LDTACK]
58
Netlist with two constraints
[Figure: STG of the read cycle (transitions on DSr, LDS, LDTACK, D, DTACK) and the netlist with signals D, DTACK, DSr, LDS, LDTACK]
Both timing assumptions are used for optimization
and become constraints
59
Value of Relative Timing
  • RT circuits provide up to 2-3x (1.3-2x)
    delay & area reduction with respect to SI circuits
    synthesized without (with) concurrency reduction
  • Automatic generation of timing assumptions =>
    foundation for automatic synthesis of RT circuits
    with area/performance comparable to or better than
    manual design
  • Back-annotation of timing constraints => minimal
    required timing information for the back-end
    tools
  • Timing-aware state encoding allows significant
    area/performance optimization

60
Design Flow with Timing
Specification (STG + user assumptions)
Reachability analysis
Lazy State Graph
Timing-aware state encoding
Automatic Timing Assumptions
Lazy SG with CSC
Boolean minimization
Next-state functions
Logic decomposition
Decomposed functions
Technology mapping
Required Timing Constraints
Gate netlist
61
FIFO example
[Figure: FIFO cell with the handshake signals li, lo, ro, ri]
62
Speed-Independent Implementation
Without concurrency reduction, 3 state signals are
required
63
SI implementation with concurrency reduction
[Figure: STG with transitions on li, lo, ri, ro and the state signal x, and its implementation with two generalized C-elements (gC)]
64
RT implementation
[Figure: RT netlist with signals li, ri, x, lo, ro]
65
RT implementation
[Figure: STG with transitions x, li, lo-, ro-, ri-, li-, lo, ro, ri, x-]
To satisfy the constraints:
Delay(x-) < Delay(ri)
and Delay(lo) + Delay(x-) < Delay(ro) + Delay(ri)
All constraints are either satisfied by default
or easy to satisfy by sizing
66
Other synthesis paradigms: outline
  • Synthesis from HDL (Verilog) [Lavagno et al.,
    Async00]
  • Subset for asynchronous specification
  • Data-path/control partitioning
  • Circuit architecture. Control generation
  • Synthesis from asynchronous HDL (CSP, Tangram)
  • CSP for control generation [A. Martin et al.,
    Caltech]
  • Tangram for silicon compilation [K. van Berkel et
    al., Philips]
  • Control synthesis using FSMs [K. Yun, S. Nowick]
  • Burst-mode machines
  • Comparison with STGs

67
Motivation
  • Language-based design: key enabler of synchronous
    logic success
  • Use HDL as single language for
  • specification
  • logic simulation and debugging
  • synthesis
  • post-layout simulation
  • HDL must support multiple levels of abstraction

68
Control-data partitioning
  • Splitting of asynchronous control and synchronous
    data path
  • Automated insertion of bundling delays

CONTROL UNIT
DATA PATH
request
delay
acknowledge
69
Design flow
HDL specification
Synthesizable HDL (data)
Control/data splitting
STG (control)
Synthesis (Synopsys)
Logic delays
Synthesis (petrify)
Timing analysis (Synopsys)
HDL implementation
Logic implementation
Delay insertion
70
Asynchronous Verilog subset by example
always begin
  wait(start)
  R SMP 3
  RES SMP 4 R
  if(RES7 1) RES 0
  else begin if(RES6 1) RES 1 end
  done 1
  wait(!start)
  done 0
end
[Figure: data path with SMP, R and RES, and a control unit (C.U.) with signals start and done]
  • begin-end for sequencing, fork-join for
    concurrency, if-else for input choice
  • Only a structured mix of sequencing, concurrency
    and choice can be specified

71
Synthesis from asynchronous HDL
  • CSP based languages
  • CSP: communicating sequential processes [Hoare]
  • Two synthesis techniques
  • based on program transformations [Caltech]
  • based on direct compilation [Philips]
  • Tools are more mature than for asynchronous
    synthesis from standard HDL
  • Complete shift in design methodology is required

72
Using CSP for control generation
  • After li goes high do full handshake at the
    right, then complete handshake at the left and
    iterate.

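[Figure: Q element with ports li, lo, ro, ri]
STG: li+ -> ro+ -> ri+ -> ro- -> ri- -> lo+ -> li- -> lo- (cyclic)
CSP: *[[li]; ro+; [ri]; ro-; [not ri]; lo+; [not li]; lo-]
  • ";": sequencing operator
  • ro+: ro goes high; ro-: ro goes low
  • [li]: wait until li is high; [not li]: wait
    until li is low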
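
The handshake sequence described above can be replayed with a small interpreter. This is only a sketch: the tuple encoding of the CSP program and the environment model (ri follows ro, li is the complement of lo) are assumptions made for illustration, not part of the tutorial.

```python
# One iteration of the Q element program: waits and output assignments.
PROGRAM = [
    ("wait", "li", 1), ("set", "ro", 1),
    ("wait", "ri", 1), ("set", "ro", 0),
    ("wait", "ri", 0), ("set", "lo", 1),
    ("wait", "li", 0), ("set", "lo", 0),
]


def label(name, value):
    """Event name in STG style, e.g. ro+ or ro-."""
    return name + ("+" if value else "-")


def environment(sig):
    """Hypothetical four-phase environment; returns the input events it fires."""
    events = []
    for name, target in (("ri", sig["ro"]), ("li", 1 - sig["lo"])):
        if sig[name] != target:
            sig[name] = target
            events.append(label(name, target))
    return events


def run_cycle(sig):
    """Execute PROGRAM once and return the trace of all signal events."""
    trace = []
    for kind, name, value in PROGRAM:
        if kind == "wait":
            trace.extend(environment(sig))
            if sig[name] != value:
                raise RuntimeError("deadlock waiting for " + label(name, value))
        else:
            sig[name] = value
            trace.append(label(name, value))
    return trace


TRACE = run_cycle({"li": 0, "lo": 0, "ro": 0, "ri": 0})
```

Under these assumptions the trace reproduces the order of the STG: a full handshake on the right, then completion of the handshake on the left.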

73
Using CSP for control generation
CSP: *[[li]; ro+; [ri]; ro-; [not ri]; lo+; [not li]; lo-]
Production rules:
li -> ro+
ri -> ro-
not ri -> lo+
not li -> lo-
[Figure: gate for ro with inputs li and ri and a weak feedback]
  • Conflict: ro+ and ro- are not mutually exclusive
    (since ri and li are not)
  • Eliminate the conflict by state signal insertion
    (CSC)

74
Conflict elimination
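CSP: *[[li]; ro+; [ri]; x+; [x]; ro-; [not ri]; lo+; [not li]; x-; [not x]; lo-]
Production rules:
not x and li -> ro+
x or not li -> ro-
x and not ri -> lo+
not x or ri -> lo-
ri -> x+
not li -> x-
[Figure: implementation with signals li, ro, x, not x, lo, ri and a flip-flop (FF)]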
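
A minimal sketch of the conflict check for the ro rules, before and after inserting x. It enumerates all value combinations of the guard variables rather than only the reachable states, which is a simplifying assumption; the function name and encoding are hypothetical.

```python
from itertools import product


def conflicts(up_guard, down_guard, names):
    """Assignments under which both the pull-up and pull-down guards hold."""
    bad = []
    for values in product((0, 1), repeat=len(names)):
        state = dict(zip(names, values))
        if up_guard(state) and down_guard(state):
            bad.append(state)
    return bad


# Before insertion: li -> ro+ and ri -> ro-
BEFORE = conflicts(
    lambda s: s["li"],
    lambda s: s["ri"],
    ("li", "ri"),
)

# After insertion of x: not x and li -> ro+ ; x or not li -> ro-
AFTER = conflicts(
    lambda s: (not s["x"]) and s["li"],
    lambda s: s["x"] or (not s["li"]),
    ("li", "x"),
)
```

With x inserted the two guards are complementary, so no assignment can drive ro up and down at the same time.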
75
Buffer example in Tangram
(a?byte & b!byte) . begin x0: var byte | forever do a?x0 ; b!x0 od end
[Figure: handshake circuit of the buffer with passive port a, active port b, a Q element and the data path; each circle is mapped to a netlist]
76
Summary
  • Tangram program is partitioned into data path and
    control
  • Data path is implemented as dual or single rail
  • Control is mapped to a composition of standard
    elements ( etc)
  • Each standard element is mapped to a circuit
  • Post-optimization is done
  • Composing islands of control elements and
    re-synthesis with STG can give more aggressive
    optimization
  • Philips made a few chips using Tangram, including
    a product: an 8051 microcontroller in the low-power
    pager Muna (25 weeks of battery life from one AAA
    battery)
  • Similar approach used in Balsa (Manchester Univ.)

77
Burst mode FSM
  • Close to synchronous FSMs with binary encoded I/O
  • Work in bursts
  • Input transitions fire
  • Output transitions fire
  • State signals change
  • Mostly limited to fundamental mode: the next input
    burst cannot arrive before stabilization at the
    outputs
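
The burst-by-burst behaviour listed above can be sketched as a small simulator: the machine collects input events until a complete input burst has arrived, then emits the output burst and changes state. The class, its arc encoding and the two-state example are assumptions for illustration; the example is not the machine shown in the figure, and fundamental mode is assumed (no input arrives while a burst is being processed).

```python
class BurstModeFSM:
    """Minimal burst-mode machine simulator (illustrative only)."""

    def __init__(self, initial, arcs):
        # arcs: state -> list of (input_burst, output_burst, next_state)
        self.state = initial
        self.arcs = arcs
        self.pending = set()

    def input_event(self, event):
        """Record one input transition; return the output burst, if any."""
        self.pending.add(event)
        for in_burst, out_burst, next_state in self.arcs.get(self.state, []):
            if self.pending == set(in_burst):
                # Complete input burst seen: fire outputs, change state.
                self.state = next_state
                self.pending = set()
                return list(out_burst)
        return []


# Hypothetical two-state specification.
ARCS = {
    "s1": [(("a+", "b+"), ("y+",), "s2")],
    "s2": [(("a-",), ("x+", "y-"), "s1")],
}

fsm = BurstModeFSM("s1", ARCS)
OUT_1 = fsm.input_event("a+")   # burst incomplete, nothing fires
OUT_2 = fsm.input_event("b+")   # burst {a+, b+} complete
OUT_3 = fsm.input_event("a-")   # single-event burst
```

The order of arrival within an input burst does not matter, which is the main difference from a synchronous FSM sampled on a clock.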

[Figure: burst-mode machine with states s1, s2, s3, s4 and arcs b-/x-, ab/y, a-/xy-, c-/y, c/y-]
78
Extended Burst mode
  • Directed don't cares (b): some concurrency is
    allowed for input transitions that do not
    influence an output burst
  • Conditional guards <b>: if b = 1 then

[Figure: extended burst-mode machine with states s1, s2, s3, s4 and arcs b-/x-, ab/y, <b>a-/xy-, c-/y, <b>c/y-]
79
Synthesis of XBM
  • Next state and output functions free of
    functional and logic hazards
  • Sequential feedbacks should not introduce new
    hazards
  • State assignment
  • one state of the BM spec to one layer of Karnaugh
    map
  • compatible layers are merged
  • layers are compatible if merging does not
    introduce CSC violations or hazards
  • Layers are encoded using race free encoding

80
XBM and STG
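[Figure: the XBM machine (states s1, s2, s3, s4; arcs b-/x-, ab/y, <b>a-/xy-, c-/y, <b>c/y-) next to the corresponding STG with transitions on a, b, c, x, y and a dummy transition eps]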