Scalable and Scalably-Verifiable Sequential Synthesis

Description:

Title: PLAyer: A Tool for Fast Mapping of Combinational Logic for Design Emulation Author: Alan Last modified by: Alan Created Date: 3/17/2006 1:04:40 AM – PowerPoint PPT presentation


Learn more at: https://people.eecs.berkeley.edu

1
Scalable and Scalably-Verifiable Sequential Synthesis

Alan Mishchenko, Mike Case, Robert Brayton
UC Berkeley

2
Overview

Introduction
Computations
SAT sweeping
Induction
Partitioning
Verification
Experiments
Future work

3
Introduction

Combinational synthesis:
Cuts at the register boundary
Preserves state encoding, scan chains, and test vectors
No sequential optimization; easy to verify
Sequential synthesis:
Runs retiming, re-encoding, use of sequential don't-cares, etc.
Changes state encoding; invalidates scan chains and test vectors
Some degree of sequential optimization; non-trivial to verify
Scalably-verifiable sequential synthesis:
Merges sequentially equivalent registers and internal nodes
Minor change to state encoding, scan chains, and test vectors
Some degree of sequential optimization; easy to verify!

4
Combinational SAT Sweeping

Naïve CEC approach: SAT solving
Build the output miter and call SAT
Works well for many easy problems

Better CEC approach: SAT sweeping
Based on incremental SAT solving
Detects possibly equivalent nodes using simulation: candidate constant nodes and candidate equivalent nodes
Runs SAT on the intermediate miters in a topological order
Refines the candidates using counterexamples
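
The simulation and refinement steps of SAT sweeping can be sketched in a few lines of Python. This is a minimal illustration and not the ABC implementation: the netlist encoding, the function names (`simulate`, `candidate_classes`, `add_pattern`) and the two hand-written counterexamples are assumptions made for the example. In a real flow the counterexamples would come from the SAT solver.

```python
from collections import defaultdict

def simulate(nodes, pi_words, mask):
    """Bit-parallel simulation of an AND-inverter netlist in topological order.

    Each node is ("pi", name) or ("and", (fanin, complemented), (fanin, complemented)).
    """
    vals = []
    for node in nodes:
        if node[0] == "pi":
            vals.append(pi_words[node[1]] & mask)
        else:
            (a, neg_a), (b, neg_b) = node[1], node[2]
            va = vals[a] ^ (mask if neg_a else 0)
            vb = vals[b] ^ (mask if neg_b else 0)
            vals.append(va & vb)
    return vals

def candidate_classes(vals, mask):
    """Group nodes whose simulation signatures agree up to complementation."""
    groups = defaultdict(list)
    for index, value in enumerate(vals):
        groups[min(value, value ^ mask)].append(index)
    constants = groups.pop(0, [])          # all-0 or all-1 signatures
    classes = [g for g in groups.values() if len(g) > 1]
    return constants, classes

def add_pattern(pi_words, width, pattern):
    """Append one input pattern (name -> 0/1) as an extra simulation bit."""
    words = {n: w | (pattern[n] << width) for n, w in pi_words.items()}
    return words, width + 1

# Hypothetical netlist: a, b, a&b, b&a, a&!a
NODES = [("pi", "a"), ("pi", "b"),
         ("and", (0, False), (1, False)),
         ("and", (1, False), (0, False)),
         ("and", (0, False), (0, True))]

words, width = {"a": 0b10, "b": 0b10}, 2   # two weak initial patterns
history = []
for cex in (None, {"a": 1, "b": 0}, {"a": 0, "b": 1}):
    if cex is not None:                    # counterexample refines the classes
        words, width = add_pattern(words, width, cex)
    mask = (1 << width) - 1
    history.append(candidate_classes(simulate(NODES, words, mask), mask))
```

With the two weak starting patterns all four non-constant nodes land in one candidate class. Each counterexample splits that class, and only the truly equivalent pair (nodes 2 and 3) survives.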

5
Sequential SAT Sweeping

Sequential SAT sweeping is similar to the combinational one in that it detects node equivalences
The difference is that the equivalences are sequential: they hold only in the reachable state space
Every combinational equivalence is a sequential one, but not vice versa
It makes sense to run combinational SAT sweeping beforehand
Sequential equivalence is proved by K-step induction: a base case and an inductive case
Efficient implementation of induction is key!
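
One way to write down the two cases of K-step induction is given below. The notation is an assumption introduced for this note and does not appear on the slides: $I(s)$ holds for the initial state, $T(s, x)$ is the next-state function, and $E(s, x)$ states that every candidate equivalence holds when the combinational logic is evaluated in state $s$ under input $x$.

```latex
% Base case: the candidates hold in the first K frames starting from the initial state
I(s_0) \;\wedge\; \bigwedge_{j=0}^{K-2} \bigl(s_{j+1} = T(s_j, x_j)\bigr)
  \;\Longrightarrow\; \bigwedge_{j=0}^{K-1} E(s_j, x_j)

% Inductive case: starting from an arbitrary (uninitialized) state s_0,
% if the candidates hold in frames 0 through K-1, they hold in frame K
\bigwedge_{j=0}^{K-1} \bigl(s_{j+1} = T(s_j, x_j)\bigr)
  \;\wedge\; \bigwedge_{j=0}^{K-1} E(s_j, x_j)
  \;\Longrightarrow\; E(s_K, x_K)
```

If either implication fails, the counterexample is used to refine the candidate classes and both cases are checked again.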

6
Base Case and Inductive Case

[Figure: unrolled timeframes for candidate equivalences {A,B} and {C,D}. Base case (initial state; inputs PI0, PI1): proving internal equivalences in initialized frames 0 through K-1. Inductive case (symbolic state; inputs PI0, PI1, PIk): assuming internal equivalences in uninitialized frames 0 through K-1, and proving internal equivalences in a topological order in frame K.]

7
Efficient Implementation

Two observations:
Both base and inductive cases of K-step induction are runs of combinational SAT sweeping
Tricks and know-how of combinational sweeping are applicable
The same integrated package can be used: it starts with simulation, performs node checking in a topological order, and benefits from counterexample simulation
Speculative reduction
Has to do with how the assumptions are made (see next slide)

8
Speculative Reduction

Inputs to the inductive case:
Sequential circuit
The number of frames to unroll (K)
Candidate equivalence classes
One node in each class is designated as the representative node
Currently the representatives are the first nodes in a topological order
Speculative reduction moves fanouts to the representative nodes
Makes 80% of the constraints redundant
Dramatically simplifies the resulting timeframes (observed 3x reductions)
Leads to saving 100-1000x in runtime during incremental SAT solving

[Figure: nodes A and B driving a comparison against constant 0, shown twice: adding assumptions with speculative reduction, and adding assumptions without speculative reduction.]

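
The effect of moving fanouts to the representatives can be illustrated with a small Python sketch of one speculated timeframe. It is a simplification under stated assumptions: complemented edges are ignored, the netlist encoding and the name `speculative_frame` are made up for the example, and the candidate classes are given by hand. A constraint is recorded only when a node's own copy differs structurally from its representative's copy, which is why many constraints become redundant.

```python
def speculative_frame(nodes, rep):
    """Build one timeframe in which fanouts of each candidate use its representative.

    nodes: topological list of ("pi", name) or ("and", fanin_a, fanin_b).
    rep:   dict mapping a node index to its (earlier) representative index.
    Returns (new_nodes, constraints); constraints are pairs of new indices
    that still have to be assumed (or proved) equal.
    """
    strash = {}        # structural hashing: node key -> new index
    new_nodes = []
    copy = {}          # old index -> new index seen by the fanouts
    constraints = []

    def make(key):
        if key not in strash:
            strash[key] = len(new_nodes)
            new_nodes.append(key)
        return strash[key]

    for i, node in enumerate(nodes):
        if node[0] == "pi":
            own = make(node)
        else:
            a, b = sorted((copy[node[1]], copy[node[2]]))
            own = make(("and", a, b))
        if i in rep:
            r = copy[rep[i]]
            if own != r:                 # otherwise the constraint is redundant
                constraints.append((own, r))
            copy[i] = r                  # fanouts are moved to the representative
        else:
            copy[i] = own
    return new_nodes, constraints

# Hypothetical example: candidates {b, c}, {a&b, a&c}, {(a&b)&c, (a&c)&b}
NODES = [("pi", "a"), ("pi", "b"), ("pi", "c"),
         ("and", 0, 1), ("and", 0, 2),
         ("and", 3, 2), ("and", 4, 1)]
REP = {2: 1, 4: 3, 6: 5}

FRAME, CONSTRAINTS = speculative_frame(NODES, REP)
```

Of the three candidate pairs, only the pair of inputs needs an explicit constraint in this example. The other two collapse to the same hashed node once their fanins are redirected, and the frame shrinks from seven nodes to five.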
9
Partitioning for Induction

A simple output-partitioning algorithm was implemented
One person-day of programming
CEC and induction became more scalable
Typical reduction in runtime is 20x for a 1M-gate design
Partitioning is meant to make SAT problems smaller
The same partitioning is useful for parallelization!
Partitioning algorithm:
Pre-processing: for all POs, finds the PIs they depend on
Main loop: for each PO, in a decreasing order of support size
Finds a partition by looking at the supports
Chooses the partition with the minimum linear combination of attraction and repulsion (determined by the number of common and new variables in this PO)
Imposes restrictions on the partition size
Post-processing: compacts smaller partitions
Complexity: O( numPis(AIG) * numPos(AIG) )
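
The three phases of the partitioning algorithm can be sketched as follows. The slides do not give the weights, the size limit, or the rule for opening a new partition, so those are assumptions here: the cost is `w_repulse * new - w_attract * common`, a PO opens a new partition when no existing partition shares a variable with it or has room for it, and partitions with fewer than `small` variables are merged during compaction. The supports are assumed to be precomputed.

```python
def partition_outputs(supports, max_vars, w_attract=1.0, w_repulse=1.0):
    """Greedy output partitioning; supports maps each PO to the set of PIs it depends on."""
    parts = []
    # Main loop: POs in decreasing order of support size (name breaks ties)
    for po in sorted(supports, key=lambda p: (-len(supports[p]), p)):
        sup = supports[po]
        best, best_cost = None, None
        for part in parts:
            common = len(sup & part["vars"])      # attraction
            new = len(sup - part["vars"])         # repulsion
            if common == 0 or len(part["vars"]) + new > max_vars:
                continue                          # unrelated, or size limit exceeded
            cost = w_repulse * new - w_attract * common
            if best_cost is None or cost < best_cost:
                best, best_cost = part, cost
        if best is None:
            best = {"pos": [], "vars": set()}
            parts.append(best)
        best["pos"].append(po)
        best["vars"] |= sup
    return parts

def compact(parts, max_vars, small):
    """Post-processing: merge partitions with fewer than `small` variables."""
    result = [p for p in parts if len(p["vars"]) >= small]
    merged = None
    for p in parts:
        if len(p["vars"]) >= small:
            continue
        if merged is not None and len(merged["vars"] | p["vars"]) <= max_vars:
            merged["pos"] += p["pos"]
            merged["vars"] |= p["vars"]
        else:
            merged = {"pos": list(p["pos"]), "vars": set(p["vars"])}
            result.append(merged)
    return result

# Hypothetical supports for five POs
SUPPORTS = {"o1": {"a", "b", "c"}, "o2": {"b", "c"}, "o3": {"x"},
            "o4": {"y"}, "o5": {"a", "d"}}
PARTS = compact(partition_outputs(SUPPORTS, max_vars=4), max_vars=4, small=2)
```

In this example the three POs that share inputs end up in one partition, and the two single-input POs are compacted into a second one.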

10
Partitioning Details

Currently induction is partitioned only for register correspondence
In this case, it is enough to partition only one timeframe!
In each iteration of induction:
The design is re-partitioned
Nodes in each candidate equivalence class are added to the same partition
Constant candidates can be added to any partition
Candidates are merged at the PIs and proved at the POs
After proving all partitions, the classes are refined
The partitioned induction has the same fixed point as the monolithic induction, while the number of iterations can differ (different counterexamples lead to different refinements)

[Figure: illustration for two candidate equivalence classes {A,B} and {C,D}, split into Partition 1 and Partition 2.]

11
Other Observations

Surprisingly, the following are found to be of little or no importance for speeding up the inductive prover:
The quality of initial equivalence classes
How much simulation (semi-formal filtering) was applied
AIG rewriting on speculated timeframes
Although the AIG can be reduced by 20%, incremental SAT runs the same
The quality of AIG-to-CNF conversion
Naïve conversion (3 clauses per AIG node) works just fine
Open question: given these observations, how to speed up this type of incremental SAT?
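
The naïve conversion mentioned above is the standard encoding of an AND gate into three clauses. The sketch below uses DIMACS-style signed integers for literals; the function names and the single-gate example are assumptions chosen for illustration, and the brute-force check only confirms that the three clauses agree with the gate on all eight assignments.

```python
from itertools import product

def and_to_cnf(out, a, b):
    """Three clauses encoding out <-> (a AND b); negative literals are complemented."""
    return [[-out, a], [-out, b], [out, -a, -b]]

def aig_to_cnf(and_nodes):
    """and_nodes: iterable of (out, a, b) literal triples, one per AND node."""
    clauses = []
    for out, a, b in and_nodes:
        clauses.extend(and_to_cnf(out, a, b))
    return clauses

def satisfied(clauses, assignment):
    """Evaluate a clause list under a dict mapping variable -> bool."""
    def lit_value(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value
    return all(any(lit_value(l) for l in clause) for clause in clauses)

# Hypothetical gate: x3 = x1 AND (NOT x2)
CLAUSES = aig_to_cnf([(3, 1, -2)])
CONSISTENT = all(
    satisfied(CLAUSES, {1: x1, 2: x2, 3: x3}) == (x3 == (x1 and not x2))
    for x1, x2, x3 in product([False, True], repeat=3)
)
```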

12
Verification after PSS

Poison and antidote are the same!
The same inductive prover is used:
during synthesis, to prove sequential equivalence of registers and nodes
during verification, to prove sequential equivalence of registers, nodes, and POs of two circuits
Verification is unbounded and general-case:
No limit on the input sequence is imposed (unlike BMC)
No information about synthesis is passed to the verification tool
The runtimes of synthesis and verification are comparable
Scales to 10K-register designs due to partitioning for induction
[Figure: the synthesis problem and the equivalence checking problem, side by side.]

13
Integrated SEC Flow

The following is the sequence of transformations currently applied by the integrated SEC in ABC (command dsec):
creating sequential miter (miter -c)
PIs/POs are paired by name; if some registers have don't-care init values, they are converted by adding new PIs and muxes; all logic is represented in the form of an AIG
sequential sweep (scl)
removes logic that does not fan out into POs
structural register sweep (scl -l)
removes stuck-at-constant and combinationally-equivalent registers
most forward retiming (retime -M 1) (disabled by switch -r, e.g. dsec -r)
moves all registers forward and computes new initial state
partitioned register correspondence (lcorr)
merges sequentially equivalent registers (completely solves SEC after retiming)
combinational SAT sweeping (fraig)
merges combinationally equivalent nodes before running signal correspondence
for ( K = 1; K ≤ 16; K = K * 2 )
    signal correspondence (ssw) // merges seq equivalent signals by K-step induction
    AIG rewriting (drw) // minimizes and restructures combinational logic
    most forward retiming // moves registers forward after logic restructuring
    sequential AIG simulation // targets satisfiable SAT instances
post-processing (write_aiger)

14
Example of PSS in ABC

abc 01> r iscas/blif/s38417.blif // reads in an ISCAS89 benchmark
abc 02> st; ps // shows the AIG statistics after structural hashing
s38417 : i/o = 28/ 106 lat = 1636 and = 9238 (exor = 178) lev = 31
abc 03> ssw -K 1 -v // performs one round of signal correspondence using simple induction
Initial fraiging time = 0.27 sec
Simulating 9096 AIG nodes for 32 cycles ... Time = 0.06 sec
Original AIG = 9096. Init 2 frames = 84. Fraig = 82. Time = 0.01 sec
Before BMC: Const = 5031. Class = 430. Lit = 9173.
After BMC: Const = 5031. Class = 430. Lit = 9173.
0 : Const = 5031. Class = 430. L = 9173. LR = 1928. NR = 3140.
1 : Const = 4883. Class = 479. L = 8964. LR = 1554. NR = 2978.
28 : Const = 145. Class = 177. L = 756. LR = 198. NR = 9099.
29 : Const = 145. Class = 176. L = 753. LR = 195. NR = 9090.
SimWord = 1. Round = 2025. Mem = 0.38 Mb.
LitBeg = 9173. LitEnd = 753. ( 8.21 %).
Proof = 5022. Cex = 2025. Fail = 0. FailReal = 0.
C-lim = 10000000. ImpRatio = 0.00
NBeg = 9096. NEnd = 8213. (Gain = 9.71 %).
RBeg = 1636. REnd = 1345. (Gain = 17.79 %).
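
The summary figures at the end of the log can be cross-checked with a few lines of arithmetic. The interpretation is an assumption inferred from the numbers themselves: the two Gain values match the relative reduction in AIG nodes and registers, and the 8.21 figure matches the share of candidate literals that remain at the end.

```python
def gain_percent(before, after):
    """Relative reduction, in percent."""
    return 100.0 * (before - after) / before

NODE_GAIN = round(gain_percent(9096, 8213), 2)   # NBeg -> NEnd
REG_GAIN = round(gain_percent(1636, 1345), 2)    # RBeg -> REnd
LITS_LEFT = round(100.0 * 753 / 9173, 2)         # LitEnd as a share of LitBeg
```

These evaluate to 9.71, 17.79 and 8.21, in agreement with the values in the log.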

15
Experimental Results

Public benchmarks:
25 test cases
ITC '99 (b14, b15, b17, b20, b21, b22)
ISCAS '89 (s13207, s35932, s38417, s38584)
IWLS '05 (systemcaes, systemcdes, tv80, usb_funct, vga_lcd, wb_conmax, wb_dma, ac97_ctrl, aes_core, des_area, des_perf, ethernet, i2c, mem_ctrl, pci_spoci_ctrl)
Industrial benchmarks:
50 test cases
Nothing else is known
Workstation:
Intel Xeon 2-CPU 4-core, 8 GB RAM

16
ABC Scripts

Baseline:
choice; if; choice; if; choice; if // comb synthesis and mapping
Register correspondence (Reg Corr):
scl -l // structural register sweep
lcorr // register correspondence using partitioned induction
dsec -r // SEC
choice; if; choice; if; choice; if // comb synthesis and mapping
Signal correspondence (Sig Corr):
scl -l // structural register sweep
lcorr // register correspondence using partitioned induction
ssw // signal correspondence using non-partitioned induction
dsec -r // SEC
choice; if; choice; if; choice; if // comb synthesis and mapping

17
Public Benchmarks
Columns Baseline, Reg Corr, and Sig Corr show geometric means.
18
ITC / ISCAS Benchmarks (details)
19
IWLS05 Benchmarks (details)
20
ITC / ISCAS Benchmarks (runtime)
21
IWLS05 Benchmarks (runtime)
22
Industrial Benchmarks
In case of multiple clock domains, optimization was applied only to the domain with the largest number of registers.
23
Future