Title: EEGN-CSCI 660 Introduction to VLSI Design Lecture 5
1 EEGN-CSCI 660 Introduction to VLSI Design, Lecture 5
2 Overview of Synthesis Flow
3 Fundamental Steps to a Good Design
- If you have a good start, the project will go smoothly
- Partitioning the design is a good start
- Partition by
  - Functionality
  - Don't mix two different clock domains in a single block
  - Don't make the blocks too large
  - Optimize for synthesis
4 Block diagram of the Framer Receiver direction
Is it partitioned well? Does it follow the suggestions of the previous slide?
5 Partitioning
6 Recommended rules for Synthesis
- Share resources whenever possible
- When implementing combinatorial paths, do not use hierarchy
- Register all outputs
- Do not implement glue logic between blocks; partition them well
- Separate designs on functional boundaries
- Keep blocks to a reasonable size
- Separate core logic, pads, clock and JTAG
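7 Resource Sharing
HDL description:
  if (select) then sum <= A + B
  else sum <= C + D
[Figure: two implementations of this description. One possible implementation uses two adders (A + B and C + D) followed by a mux controlled by select. Another implementation shares the resource: two muxes controlled by select choose between A and C and between B and D, and a single adder produces sum. The shared-resource implementation is area-efficient.]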
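The two implementations on this slide can be compared with a small behavioral sketch. This is only a model in Python, not synthesizable HDL; the function names and the 8-bit width are assumptions made for illustration.

```python
WIDTH = 8                      # assumed bus width, chosen only for illustration
MASK = (1 << WIDTH) - 1


def add(x, y):
    """One adder resource: WIDTH-bit addition that wraps on overflow."""
    return (x + y) & MASK


def mux(sel, when_true, when_false):
    """Two-input multiplexer."""
    return when_true if sel else when_false


def unshared(select, a, b, c, d):
    """One possible implementation: two adders followed by one mux."""
    return mux(select, add(a, b), add(c, d))


def shared(select, a, b, c, d):
    """Shared-resource implementation: two muxes pick the operands,
    then a single adder produces the sum."""
    return add(mux(select, a, c), mux(select, b, d))
```

Both functions return the same value for every input; the shared version calls `add` once instead of twice, which is the area saving the slide refers to.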
8 Sharable HDL Operators
- The following HDL (VHDL and Verilog) synthetic operators can result in shared implementations
  - + -
  - > >= < <=
  - /
- Within the same block, the operators can be shared (i.e. they are in the same process)
9 DesignWare Implementation Selection
- DesignWare implementation selection depends on area and timing goals
- The smallest implementation that meets the timing goals is selected
[Figure: a synthetic module can map to different implementations, from ripple carry (smallest) to carry look-ahead (fastest).]
10 Sharing Common Sub-Expressions
- Design Compiler tries to share common sub-expressions to reduce the number of resources necessary to implement the design -> area savings while timing goals are met
  SUM1 <= A + B + C
  SUM2 <= A + B + D
  SUM3 <= A + B + E
[Figure: A + B is computed once and shared; C, D and E are each added to it to produce SUM1, SUM2 and SUM3.]
11 Sharing Common Sub-Expressions: Limitations
- Sharable terms must be in the same order within each expression
  - sum1 <= A + B + C
  - sum2 <= B + A + D -> not sharable
  - sum3 <= A + B + E -> sharable
- Sharable terms must occur in the same position (or use parentheses to maintain ordering)
  - sum1 <= A + B + C
  - sum2 <= D + A + B -> not sharable
  - sum3 <= E + (A + B) -> sharable
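The ordering and position rules can be illustrated with a simplified model. The sketch below assumes expressions are parsed strictly left to right and that two additions are shared only when their operands match exactly, in the same order. It is not Design Compiler's actual algorithm, and all names are hypothetical.

```python
def parse_left_to_right(terms):
    """Build nested (left, right) pairs the way a left-to-right parse would.
    A term may itself be a tuple, which models explicit parentheses."""
    tree = terms[0]
    for term in terms[1:]:
        tree = (tree, term)
    return tree


def collect_adders(tree, adders):
    """Record every distinct two-input addition found in the tree."""
    if isinstance(tree, tuple):
        left, right = tree
        collect_adders(left, adders)
        collect_adders(right, adders)
        adders.add(tree)


def count_adders(expressions):
    """Number of adders needed when identical sub-expressions are shared."""
    adders = set()
    for terms in expressions:
        collect_adders(parse_left_to_right(terms), adders)
    return len(adders)


sum1 = ["A", "B", "C"]
print(count_adders([sum1, ["A", "B", "E"]]))     # A + B is shared
print(count_adders([sum1, ["B", "A", "D"]]))     # different order, nothing shared
print(count_adders([sum1, ["D", "A", "B"]]))     # different position, nothing shared
print(count_adders([sum1, ["E", ("A", "B")]]))   # parentheses restore sharing
```

In this model the sharable cases need three adders and the non-sharable cases need four.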
12 How to Infer Specific Implementation (Adder with Carry-In)
- The following expression infers an adder with carry-in
  - sum <= A + B + Cin
  - where A and B are vectors, and Cin is a single bit
[Figure: adder with inputs A, B and Cin and output sum.]
13 Operator Reordering
- Design Compiler can reorder the arithmetic operators to produce the fastest design
- For example
  - Z <= A + B + C + D (Z is time constrained)
  - Initially the ordering is from left to right
[Figure: left-to-right chain: A + B, then + C, then + D to produce Z.]
14 Reordering of the Operator for a Fast Design
- If the arrival time of all the signals A, B, C and D is the same, Design Compiler will reorder the operators using a balanced-tree architecture
[Figure: balanced tree: A + B and C + D are computed in parallel, then added to produce Z.]
15 Reordering of the Operator for a Fast Design
- If signal A arrives latest, Design Compiler will reorder the operators to accommodate the late-arriving signal
[Figure: C + B, then + D, then + A to produce Z, so the late-arriving A passes through only one adder.]
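The effect of the three orderings on the arrival time of Z can be worked through with a small sketch. The unit adder delay and the arrival times are hypothetical values chosen for illustration; real delays come from the technology library.

```python
ADDER_DELAY = 1.0   # hypothetical delay of one adder, in arbitrary time units


def arrival(tree, arrivals):
    """Arrival time at the output of a nested (left, right) adder tree."""
    if isinstance(tree, str):
        return arrivals[tree]
    left, right = tree
    return max(arrival(left, arrivals), arrival(right, arrivals)) + ADDER_DELAY


chain = ((("A", "B"), "C"), "D")       # initial left-to-right ordering
balanced = (("A", "B"), ("C", "D"))    # balanced tree
late_a = ((("C", "B"), "D"), "A")      # late-arriving A enters last

same_time = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.0}
a_is_late = {"A": 2.0, "B": 0.0, "C": 0.0, "D": 0.0}

for name, tree in [("chain", chain), ("balanced", balanced), ("late_a", late_a)]:
    print(name, arrival(tree, same_time), arrival(tree, a_is_late))
```

With equal arrival times the balanced tree gives the earliest Z; when A is late, the ordering that adds A last wins.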
16 Avoid hierarchical combinatorial blocks
- The path between reg1 and reg2 is divided between three different blocks.
- Due to hierarchical boundaries, optimization of the combinatorial logic cannot be achieved.
- Synthesis tools (Synopsys) maintain the integrity of the I/O ports, so combinatorial optimization cannot be achieved between blocks (unless grouping is used).
17 Recommended way to handle Combinatorial Paths
- All the combinatorial circuitry is grouped in the same block, whose output is connected to the destination flip-flop.
- This allows optimal minimization of the combinatorial logic during synthesis.
- It also allows a simplified description of the timing interface.
18 Register all outputs
- Simplifies the synthesis design environment.
- Inputs to the individual blocks arrive within the same relative delay (caused by wire delays).
- There is no real need to specify output requirements, since paths start at flip-flop outputs.
- Take care of fanouts: as a rule of thumb, keep the fanout to 16 (dependent on technology and on the components being driven by the output).
19 NO GLUE LOGIC between blocks
- Under time pressure, you may find a bug that can be fixed simply by adding some glue logic. RESIST THE TEMPTATION!!!
- At this level in the hierarchy, the glue logic cannot be absorbed into any lower-level block.
20 Separate designs with different goals
- reg1 may be driven by a time-critical function, and hence will have different optimization constraints.
- reg3 may be driven by slow logic, hence there is no need to constrain it for speed.
21 Optimization based on design requirements
- Use different entities to partition design blocks
- This allows different constraints during synthesis to optimize for area, speed or both.
22 Separate FSM from random logic
- Separating the FSM from the random logic allows you to use FSM-optimized synthesis
23 Maintain a reasonable block size
- Partition your design such that each block is between 1000 and 10000 gates (this is strictly tool and technology dependent)
- The larger the blocks, the longer the run time -> quick iterations cannot be done.
24 Partitioning of Full ASIC
- The top-level block includes the I/O pads and the Mid block instantiation
- Mid includes the clock generator, JTAG and CORE logic
- CORE logic includes all the functionality and internal scan circuitry
25 Synthesis Constraints
- Specifying an area goal
  - Area constraints are vendor/library dependent (e.g. 2-input NAND gates, square mils, grids etc.)
  - Design Compiler has the max area constraint as one of the constraint attributes.
26 Timing constraints for synchronous designs
- Define the timing paths within the design, i.e. paths leading into the design, internal paths and paths leading out of the design
- Define the clock
- Define the I/O timing relative to the clock
27 Define a clock for synthesis
- Clock source
- Period
- Duty cycle
- Defining the clock constrains the internal timing paths
28 Timing goals for synchronous design
- Define timing constraints for all paths within a design
- Define the clocks
- Define the I/O timing relative to the clock
29 Constraining input path
- Input delay is specified relative to the clock
- External logic uses some time within the clock period, i.e.
  - TclkToQ (clock-to-Q delay) + Tw (net delay) -> at input to B
- Example command for this in Synopsys Design Compiler
  - dc_shell> set_input_delay -clock clk 5 (where 5 represents the input delay)
30 Constraining output path
- Output delay is specified relative to the clock
- How much of the clock period does the external logic (shown by cloud b) use up?
  - Tb + Tsetup = the amount to be specified as the output delay
31 Generic statement for input and output delays
- Normally the input and output delay values are set using a rule-of-thumb value that depends on the fanout, the external logic and the technology being used
- Design Compiler (the synthesis tool) has to work with the remaining time (Tclk - Tin - Tout) during synthesis.
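The budget (Tclk - Tin - Tout) can be written as a short calculation. The function name and the numbers in the example call are illustrative assumptions, not values from a real design.

```python
def synthesis_budget(t_clk, t_in, t_out):
    """Time left for the block's own logic on an input-to-output path:
    Tclk - Tin - Tout. All values must use the same time unit."""
    budget = t_clk - t_in - t_out
    if budget <= 0:
        raise ValueError("input and output delays consume the whole clock period")
    return budget


# Hypothetical example: 40 ns clock, 5 ns input delay, 10 ns output delay.
print(synthesis_budget(40, 5, 10))
```

An overly pessimistic input or output delay shrinks this budget, so the rule-of-thumb values directly affect how hard the tool has to work.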
32 False and Multicycle paths
- False path
  - Very slow signals such as reset and test mode enable, which are not used under normal conditions, are classified as false paths
- Multicycle path
  - Paths that take more than one clock cycle are known as multicycle paths.
  - The multicycle paths have to be defined in the analyzer so that it takes those constraints into account when synthesizing
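A simplified setup check shows why a multicycle path must be declared. The sketch below ignores clock skew and clock-to-Q delay, and the function name and numbers are hypothetical.

```python
def setup_slack(arrival, t_clk, t_setup, cycles=1):
    """Setup slack of a path that is allowed `cycles` clock periods to settle.
    A negative result means the path violates timing."""
    required = cycles * t_clk - t_setup
    return required - arrival


# Hypothetical path with 55 ns of logic, a 40 ns clock and 2 ns setup time.
print(setup_slack(55, 40, 2))             # treated as single-cycle: fails
print(setup_slack(55, 40, 2, cycles=2))   # declared as two-cycle: passes
```

Without the multicycle declaration the tool would try to squeeze the path into one period; a false path, by contrast, is simply excluded from the check.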
33 Timing paths
34 Combinatorial logic may have multiple paths
- Static Timing Analysis uses the longest path to
calculate a maximum delay or the shortest path to
calculate a minimum delay.
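The longest-path and shortest-path calculation can be sketched on a small timing graph. The graph, node names and delays below are hypothetical, and the code assumes the graph is acyclic, as a combinatorial timing graph is.

```python
from functools import lru_cache

# Hypothetical timing graph: node -> list of (successor, delay) edges.
GRAPH = {
    "in": [("g1", 1.0), ("g2", 2.0)],
    "g1": [("g3", 1.5)],
    "g2": [("g3", 0.5), ("out", 3.0)],
    "g3": [("out", 1.0)],
    "out": [],
}


def path_delay(graph, start, end, pick):
    """Delay from start to end: pick=max gives the longest path,
    pick=min the shortest. Returns None if end is unreachable."""

    @lru_cache(maxsize=None)
    def visit(node):
        if node == end:
            return 0.0
        options = []
        for successor, delay in graph[node]:
            rest = visit(successor)
            if rest is not None:
                options.append(delay + rest)
        return pick(options) if options else None

    return visit(start)


print(path_delay(GRAPH, "in", "out", max))   # maximum delay
print(path_delay(GRAPH, "in", "out", min))   # minimum delay
```

The maximum delay is what a setup check uses and the minimum delay is what a hold check uses.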
35 Schematic converted into a timing graph
36 Calculating a path's delay
37 Selecting a Semiconductor vendor
- One of the first things that needs to be done when designing a chip is to select the semiconductor vendor and the technology one wants to use. The following issues need to be considered during the selection process
  - Maximum frequency of operation
  - Power restrictions
  - Packaging restrictions
  - Clock tree implementation
  - Floor planning
  - Back-annotation support
  - Design support for libraries, megacells and RAMs
  - Available cores
  - Available test methods and scans
38 Synthesis tool centric slides
Will be using design_vision
39 Understanding the library
- Design Compiler (DC) uses these libraries
  - Technology libraries
  - Symbol libraries
  - DesignWare libraries
- Will use Design Vision from Synopsys for synthesis
- Type design_vision to invoke the tool
40 Technology libraries
- Contain information about the characteristics and functions of each cell provided in a semiconductor vendor's library. The manufacturers maintain and distribute the technology libraries
- Cell characteristics include information such as cell name, pin names, area, delay arcs and pin loading.
- The technology library also defines the conditions that must be met for a functional design (e.g., the maximum transition time for nets). These conditions are called design rule constraints.
- Also specify the operating conditions and wire load models specific to that technology
- DC requires the technology libraries to be in .db format. These libraries are typically provided by the semiconductor manufacturer
41 Symbol libraries
- Symbol libraries contain definitions of the graphic symbols that represent library cells in the design schematics. Semiconductor vendors maintain and distribute the symbol libraries.
- Design Compiler uses symbol libraries to generate the design schematic. You must use Design Vision to view the design schematic.
- When you generate the design schematic, Design Compiler performs a one-to-one mapping of cells in the netlist to cells in the symbol library.
42 DesignWare Library
- A DesignWare library is a collection of reusable circuit-design building blocks (components) that are tightly integrated into the Synopsys synthesis environment.
- DesignWare components that implement many of the built-in HDL operators are provided by Synopsys. These operators include +, -, *, <, >, <=, >=, and the operations defined by if and case statements.
- You can develop additional DesignWare libraries at your site by using DesignWare Developer, or you can license DesignWare libraries from Synopsys or from third parties.
43 Specifying Libraries
- Use dc_shell variables to specify the libraries used by Design Compiler, as shown in the table below
44 Target Library
- Design Compiler uses the target library to build a circuit. During mapping, Design Compiler selects functionally correct gates from the target library. It also calculates the timing of the circuit, using the vendor-supplied timing data for these gates.
- Use the target_library variable to specify the target library.
- The syntax is
  - set target_library my_tech.db
45 Link Library
- Design Compiler uses the link library to resolve references. For a design to be complete, it must connect to all the library components and designs it references. This process is called linking the design or resolving references. During the linking process, Design Compiler uses the link_library system variable, the local_link_library attribute, and the search_path system variable to resolve references
- The syntax is
  - set link_library my_tech.db
46 Specifying DesignWare Library
- You do not need to specify the standard synthetic library, standard.sldb, that implements the built-in HDL operators. The software automatically uses this library.
- If you are using additional DesignWare libraries, you must specify these libraries by using the synthetic_library variable (for optimization purposes) and the link_library variable (for cell resolution purposes).
47 Describing environmental attributes
- set_max_capacitance, set_max_transition and set_max_fanout on input and output ports or the current design
- set_operating_conditions on the whole design
48 Environmental attributes
- The design environment consists of defining the process parameters, I/O port attributes and statistical wire load models.
- set_min_library <max_library filename> -min_version <min library filename>
  - dc_shell> set_min_library ex25_worst.db \
      -min_version ex25_best.db
- This command allows users to specify the best-case and worst-case libraries simultaneously. It can be used to fix setup and hold violations. The user should set both the min and the max values for the operating conditions
49 Setting operating conditions
- set_operating_conditions
  - Specifies the process, voltage and temperature conditions of the design.
  - A Synopsys library consists of WORST, TYPICAL and BEST cases. Each vendor has its own naming convention for the libraries!
  - By changing the value passed to the operating conditions command, the full range of process variations is covered.
50 Setting operating conditions
- set_operating_conditions
  - WORST is generally used during the pre-layout synthesis phase to optimize the maximum set-up time.
  - BEST is normally used to fix any hold violations.
  - TYPICAL is generally not used since it is covered when both WORST and BEST cases are used.
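The reason WORST is paired with setup and BEST with hold can be seen in a small calculation. The derating factors, the single-path model and the function name below are hypothetical; real corner delays come from the vendor's libraries.

```python
# Hypothetical delay scaling per operating condition (not real library data).
DERATE = {"WORST": 1.3, "TYPICAL": 1.0, "BEST": 0.7}


def corner_slacks(nominal_delay, t_clk, t_setup, t_hold):
    """Setup slack at the WORST corner and hold slack at the BEST corner
    for a single register-to-register path."""
    slow_delay = nominal_delay * DERATE["WORST"]   # longest the path can take
    fast_delay = nominal_delay * DERATE["BEST"]    # shortest the path can take
    return {
        "setup_slack": t_clk - t_setup - slow_delay,
        "hold_slack": fast_delay - t_hold,
    }


print(corner_slacks(nominal_delay=10.0, t_clk=40.0, t_setup=1.0, t_hold=0.5))
```

A path that passes setup at WORST and hold at BEST also passes both at TYPICAL in this model, since the TYPICAL delay lies between the two extremes.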
51 Setting operating conditions
- set_operating_conditions
  - It is possible to optimize the design with both WORST and BEST cases simultaneously
  - dc_shell> set_operating_conditions WORST
  - dc_shell> set_operating_conditions -min BEST -max WORST
52 Operating conditions
53 Modeling wire loads
- DC uses wire load models to estimate the capacitance, resistance and area of nets prior to floor planning or layout.
- The wire load model is based on the statistical average length of a net for a given fanout and a given block area
[Figure: two blocks of different area, 20 x 20 and 10 x 10.]
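A wire load model can be pictured as a lookup from fanout to estimated net length, scaled by a capacitance per unit length. The table values, the extrapolation slope and all names in this sketch are made up for illustration; real values come from the technology library.

```python
# Hypothetical wire load model for a 10 x 10 block (values are made up).
WIRE_LOAD_10X10 = {
    "fanout_length": {1: 1.0, 2: 2.0, 3: 3.2, 4: 4.5},  # fanout -> net length
    "slope": 1.5,             # extra length per fanout beyond the table
    "cap_per_length": 0.02,   # capacitance per unit length
}


def estimate_net_cap(model, fanout):
    """Estimate the capacitance of a net from its fanout alone."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    table = model["fanout_length"]
    largest = max(table)
    if fanout in table:
        length = table[fanout]
    else:
        # Beyond the table, extrapolate linearly using the slope.
        length = table[largest] + model["slope"] * (fanout - largest)
    return length * model["cap_per_length"]


print(estimate_net_cap(WIRE_LOAD_10X10, 2))
print(estimate_net_cap(WIRE_LOAD_10X10, 6))
```

A model for a larger block would carry longer lengths for the same fanout, which is why the block area matters when choosing a model.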
54 Wire load command
- DC uses wire load information to model the delay, which is a function of loading
- Synopsys provides wire load models in the technology library, each representing a particular size.
- Designers can create their own wire load models for better accuracy
- set_wire_load_model -name <wire-load model>
  - dc_shell> set_wire_load_model -name MEDIUM
55 Wire load mode
- There are 3 modes associated with set_wire_load_mode: top, enclosed and segmented
- top
  - Defines that all nets in the hierarchy will inherit the same wire load model as the top-level block. Use it when the plan is to flatten the design later for layout.
- enclosed
  - Specifies that all the nets (of the sub-blocks) inherit the wire load model of the block that completely encloses the sub-blocks. For example, if blocks X and Y are enclosed within block Z, then blocks X and Y will inherit the wire load models defined for block Z.
56 Wire load mode
- segmented
  - Used when wires cross hierarchical boundaries. From the previous example, the sub-blocks X and Y will inherit the wire load models specific to them, while nets between sub-blocks X and Y (which are contained within Z) will inherit the wire load model specified for block Z
  - Not used often, as the wire load models are specific to the net segments
- set_wire_load_mode <top|enclosed|segmented>
  - dc_shell> set_wire_load_mode top
- Using wire load models accurately is highly recommended, as this directly affects the synthesis runs. The wrong model can generate undesired results. Use slightly pessimistic wire load models. This will provide extra timing margin that may be absorbed later in test circuit insertion or layout
57 Wire load models across hierarchy
- mode top (ignores lower-level wire loads)
- mode enclosed (uses best-fitting wire loads)
- mode segmented (uses several wire loads)
[Figure: nested hierarchical blocks labeled with wire load model sizes 50x50, 40x40, 30x30 and 20x20.]
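The three modes can be compared with a simplified model of how a wire load model is chosen for one net. The hierarchy, the model names attached to each block and the function names are hypothetical; this sketches the selection rule only, not the tool's implementation.

```python
# Hypothetical hierarchy: block -> (parent, wire load model name).
BLOCKS = {
    "TOP": (None, "50x50"),
    "Z": ("TOP", "40x40"),
    "X": ("Z", "20x20"),
    "Y": ("Z", "30x30"),
}


def ancestors(block):
    """The block itself followed by its parents up to the top level."""
    chain = []
    while block is not None:
        chain.append(block)
        block = BLOCKS[block][0]
    return chain


def enclosing_block(blocks_crossed):
    """Lowest block in the hierarchy that contains the whole net."""
    common = ancestors(blocks_crossed[0])
    for block in blocks_crossed[1:]:
        others = set(ancestors(block))
        common = [b for b in common if b in others]
    return common[0]


def wire_load_for_net(mode, blocks_crossed):
    """Map each net segment's block to the wire load model it uses."""
    if mode == "top":
        return {"TOP": BLOCKS["TOP"][1]}
    enclosing = enclosing_block(blocks_crossed)
    if mode == "enclosed":
        return {enclosing: BLOCKS[enclosing][1]}
    if mode == "segmented":
        segments = set()
        for block in blocks_crossed:
            for level in ancestors(block):
                segments.add(level)
                if level == enclosing:
                    break
        return {b: BLOCKS[b][1] for b in segments}
    raise ValueError("mode must be top, enclosed or segmented")


for mode in ("top", "enclosed", "segmented"):
    print(mode, wire_load_for_net(mode, ["X", "Y"]))
```

For a net running from X to Y, top mode uses only the top-level model, enclosed mode uses Z's model, and segmented mode uses a separate model for the pieces in X, Y and Z.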
58 set_drive
- set_drive is used at the input ports of the block to specify the drive strength at the input port. It is typically used to model the external drive resistance to the ports of the block or chip. 0 signifies the highest strength and is normally used for clock or reset ports.
- set_drive <value> <object list>
  - dc_shell> set_drive 0 {clk rst}
59 set_driving_cell
- set_driving_cell is used to model the drive resistance of the driving cell to the input ports.
- set_driving_cell -cell <cell name> -pin <pin name> <object list>
  - dc_shell> set_driving_cell -cell BUFF1 -pin Z [all_inputs]
60 set_load
- set_load sets the capacitive load, in the units defined in the technology library (pF), on the specified ports or nets of the design. It typically sets capacitive loading on output ports of the blocks during pre-layout synthesis, and on nets for back-annotating the extracted post-layout capacitive information
- set_load <value> <object list>
  - dc_shell> set_load 1.5 [all_outputs]
  - dc_shell> set_load 0.3 [get_nets blockA/n1234]
61 Design rule constraints
- Design rule constraints consist of set_max_transition, set_max_fanout and set_max_capacitance. These rules are technology dependent and are generally set in the technology library. The DRC commands are applied to input ports, output ports or the current_design. If the technology library is not adequate or is too optimistic, these commands can be used to control the buffering in the design
- set_max_transition <value> <object list>
- set_max_capacitance <value> <object list>
- set_max_fanout <value> <object list>
  - dc_shell-t> set_max_transition 0.3 current_design
  - dc_shell-t> set_max_capacitance 1.5 [get_ports out1]
  - dc_shell-t> set_max_fanout 3.0 [all_outputs]
- (dc_shell-t> corresponds to DC operating in Tcl mode)
62 Some more design constraints
- dc_shell-t> create_clock -period 40 -waveform [list 0 20] CLK
- set_dont_touch_network is a very useful command and is usually used for clock and reset. It sets the dont_touch property on a port or a net, which prevents DC from buffering the net in order to meet DRCs.
  - dc_shell-t> set_dont_touch_network {clk rst}
63 Some more design constraints
- If a block generates a secondary clock from the primary clock, e.g. a byte clock from the serial clock, apply set_dont_touch_network on the generated clock output port of the block. This helps prevent DC from buffering it up. Clock trees can be inserted later to balance the clock skew.
64 Some more design constraints
- set_dont_touch is used to set a dont_touch property on the current design, cells, references or nets. This is frequently used during hierarchical compilation of the blocks.
  - dc_shell-t> set_dont_touch current_design
- Useful for telling DC not to touch the current design once it has been optimized to the designer's satisfaction. For example, if a spare-gates block is instantiated, DC will not touch or optimize it.
65 Summarizing: High-level synthesis is constraint driven
- Resource sharing, sharing of common sub-expressions and implementation selection all depend on design constraints and coding style
- Based on the timing constraints, Design Compiler decides what to share, how to implement it and what ordering to use.
- If no constraints are given, area-based optimization is performed (this may be a good start to get an idea of the synthesized circuit)
- It is imperative that realistic constraints be set prior to compilation
- High-level synthesis takes place only when optimizing an HDL description