Title: Synthesis
1Synthesis
2What is Synthesis?
- Transformation of an abstract description into a more detailed description
- The "+" operator is transformed into a gate netlist
- "if (VEC_A = VEC_B) then" is realized as a comparator which controls a multiplexer
- Transformation depends on several factors
- ???????? ???? (??? AND? OR? ??????) ?? ??????
????? ????? ?? ???? ??? ???????? ?????? ?? ???
??? ????? ?? ?????????? ??? ?? tool ????? ?? ????.
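The two transformations named on this slide can be sketched in a few lines of VHDL. This is only an illustration: the entity name SYNTH_DEMO, the port names X, Y, SUM and Z, and the 4-bit widths are assumptions, and the exact gates produced depend on the tool and the target technology.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical entity for illustration; names and widths are assumptions.
entity SYNTH_DEMO is
  port (VEC_A, VEC_B : in  unsigned(3 downto 0);
        X, Y         : in  unsigned(3 downto 0);
        SUM          : out unsigned(3 downto 0);
        Z            : out unsigned(3 downto 0));
end SYNTH_DEMO;

architecture RTL of SYNTH_DEMO is
begin
  -- "+" is typically mapped to an adder netlist (result truncated to 4 bits).
  SUM <= X + Y;

  -- The equality test typically becomes a comparator whose output
  -- drives the select input of a multiplexer choosing between X and Y.
  process (VEC_A, VEC_B, X, Y)
  begin
    if (VEC_A = VEC_B) then
      Z <= X;
    else
      Z <= Y;
    end if;
  end process;
end RTL;
```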
3Field Programmable Gate Array (FPGA)
4???? ? ????? ???? FPLD??
- ?????? ????? (????? ? ????? ?? ????? ??????
?????? ????? ?? ???) (????? ?? ??????? ??? ????)
- Debug ??? ????? ?????? ? ??????.
- ????? ???? ?????? ?????
- ??????? ?? ??? ????? ??????.
5Synthesizability
- Only a subset of VHDL is synthesizable
- Different Tools support different subsets
- records?
- arrays of integers?
- clock edge detection?
- sensitivity list?
- ...
6Different Language Support for Synthesis
7How to Do?
- Macrocells
- adder
- comparator
- Bus interface
- Constraints
- speed
- area
- power
- Optimizations
- Boolean level: mathematical
- gate level: technology-dependent
8Non-functional requirements
- Performance
- Clock speed is generally a primary requirement.
- Usually expressed as a lower bound.
- Design cycle and Timing Closure
- Size
- Determines manufacturing cost.
- If your design doesn't fit into one size of FPGA, you must use the next larger FPGA.
- Very large designs may need multiple FPGAs.
- Power/energy
- Power/Energy related to battery life and heat.
- May have more cost
- More expensive packaging to dissipate heat.
- More extreme measures (e.g. cooling fans).
- Many digital systems are power- or energy-limited.
9Mapping into an FPGA
- Must choose the FPGA
- Capacity.
- Pinout/package type.
- Maximum speed.
10Synthesis Process in Practice
11Path delay
- Combinational network delay is measured over paths through the network.
- Can trace a causality chain from inputs to the worst-case output.
12Path delay example
network
graph model
13Critical path
- Critical path: the path which creates the longest delay.
- Can trace transitions which cause delays that are
elements of the critical delay path.
14Critical path through delay graph
15Delay Paths in a design
16False paths
- Logic gates are not simple nodes: some input changes don't cause output changes.
- A false path is a path which is never exercised due to Boolean gate conditions.
- False paths cause pessimistic delay estimates.
17Placement and delay
- Placement helps determine routing.
- Routing determines wire length.
- Wire length determines capacitive load.
- Capacitive load determines delay.
18Example Adder placement and delay
- N-bit adder (optimal placement)
19Bad placement and routing
With no delay constraints.
routing
placement
20Bad placement and routing
- Adder has been distributed throughout the FPGA.
- I/O pins have been spread around the chip.
- Place-and-route (P&R) algorithms do not catch on to regularity.
21Better placement and routing
With delay constraints.
- Better but far from optimal (less spread out
horizontally but spread out vertically)
22How to improve?
- Use macros (optimized),
- Put constraints on the placement of objects,
- Hand place objects.
- Example later.
23Power Optimization
24Power optimization
- Transitions cause power consumption.
- Logic network design helps control power consumption:
- minimizing capacitance
- eliminating unnecessary glitches.
25Power optimization
- Leakage is significant in more advanced processes.
- It occurs even when the logic is idle.
- The only way to eliminate it is to disconnect the power supply from the logic when it is not needed for some time.
- It generally takes a considerable period (longer than a clock period) to reconnect power and let the circuits stabilize.
26Glitching example
27Glitching example behavior
- The NOR gate produces a 0 output at the beginning and at the end:
- at the beginning, the bottom input is 1;
- at the end, the NAND output is 1.
- The difference in delay between the application of the primary inputs and the generation of the new NAND output causes the glitch.
28Adder Chain Glitching
good
bad
29Explanation
- Unbalanced chain has signals arriving at different times at each adder.
- A glitch downstream propagates all the way upstream.
- Balanced tree introduces multiple glitches simultaneously, reducing total glitch activity.
30Factorization for low power
- Proper factorization reduces glitching.
[Figure: two factorizations of the same function, one labeled "bad" and one labeled "good"; input a has a high transition probability.]
31Factorization techniques
- In the example, a has a high transition probability; b and c have low probabilities.
- Reduce the number of logic levels through which high-probability signals must travel in order to reduce the propagation of glitches.
32Example (ALU)
- ALU output is not used for every cycle
- If the ALU inputs change in a cycle where the output is not used, energy is needlessly consumed.
33Example (ALU)
- The Control signal selects whether data is allowed to pass to the logic or the previous value is held to avoid transitions.
[Figure: Data enters a storage element (D, Q) enabled by Control; the output Q feeds the Logic block.]
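A minimal sketch of this idea, assuming the holding element is an edge-triggered register with an enable (the slide's figure may equally well show a latch). The entity name GUARDED_ALU, the 8-bit width, and the adder standing in for the ALU are all assumptions made for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical example; names, widths and the adder "ALU" are assumptions.
entity GUARDED_ALU is
  port (CLK     : in  std_ulogic;
        CONTROL : in  std_ulogic;            -- '1': let new data pass
        A, B    : in  unsigned(7 downto 0);
        RESULT  : out unsigned(7 downto 0));
end GUARDED_ALU;

architecture RTL of GUARDED_ALU is
  signal A_HELD, B_HELD : unsigned(7 downto 0) := (others => '0');
begin
  -- Operand registers with enable: while CONTROL = '0' the previous
  -- values are held, so the logic behind them sees no transitions.
  HOLD: process (CLK)
  begin
    if rising_edge(CLK) then
      if CONTROL = '1' then
        A_HELD <= A;
        B_HELD <= B;
      end if;
    end if;
  end process HOLD;

  -- Stand-in for the ALU logic (an adder, chosen only for illustration).
  RESULT <= A_HELD + B_HELD;
end RTL;
```

The extra registers cost area and one cycle of latency, so this trade-off is usually worthwhile only when the guarded logic is large and often idle.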
34Layout for low power
- Place and route to minimize capacitance of nodes
with high glitching activity. - Feed back wiring capacitance values to power
analysis for better estimates.
35State assignment for low power
36Case Study
- 16 x 16 multiplier example.
37The FPGA design process
- Xilinx ISE (Integrated Synthesis Environment)
- Translation from HDL.
- (Synthesis, Translation)
- Logic synthesis.
- (Mapping)
- Placement and routing.
- (Place and Route)
- Configuration generation.
- (Program File Generation)
38Design experiments
- Synthesize with no constraints.
- Synthesize with timing constraint.
- Tighten timing constraint.
- Synthesize with placement constraints.
- Power
- Many tools don't allow us to directly specify power consumption.
- We must rewrite our hardware description for better power consumption characteristics.
39Post-translation simulation model
- No timing or area constraints
- HDL model in terms of FPGA primitives.
- Example
X_LUT4 \p12_Madd__n0015_Mxor_Result_Xo<1>1 (
  .ADR0(x_7_IBUF),
  .ADR1(y_13_IBUF),
  .ADR2(c127),
  .ADR3(row128),
  .O(row137)
);
40Mapping report
- Design Summary
- Number of errors: 0
- Number of warnings: 0
- Logic Utilization:
- Number of 4 input LUTs: 501 out of 1,024 (48%)
- Logic Distribution:
- Number of occupied Slices: 255 out of 512 (49%)
- Number of Slices containing only related logic: 255 out of 255 (100%)
- Number of Slices containing unrelated logic: 0 out of 255 (0%)
- See NOTES below for an explanation of the effects of unrelated logic
- Total Number 4 input LUTs: 501 out of 1,024 (48%)
- Number of bonded IOBs: 64 out of 92 (69%)
- Total equivalent gate count for design: 3,006
- Additional JTAG gate count for IOBs: 3,072
- Peak Memory Usage: 64 MB
41Related vs. Unrelated Logic (Hidden)
- Related logic: logic that shares connectivity.
- Unrelated logic: logic that shares no connectivity.
- When assembling slices, the mapper gives priority to combining logic that is related, which gives the best results.
- The mapper will only begin packing unrelated logic into a slice once all of the slices are occupied.
42Static timing analysis report
- Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 99.999 uS
- 20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
- Maximum delay is 20.916ns.
After mapping, the delays are estimates (no information about interconnects).
43Static timing report delays along paths
- Data Sheet report
- All values displayed in nanoseconds (ns)
- Pad to Pad
Source Pad   Destination Pad   Delay
x<0>         p<0>               5.824
x<0>         p<10>             10.675
x<0>         p<11>             11.214
x<0>         p<12>             11.753
44Routing report
- Phase 1: 1975 unrouted; REAL time: 11 secs
- Phase 2: 1975 unrouted; REAL time: 11 secs
- Phase 3: 619 unrouted; REAL time: 12 secs
- Phase 4: 619 unrouted; (0) REAL time: 12 secs
- Phase 5: 619 unrouted; (0) REAL time: 12 secs
- Phase 6: 619 unrouted; (0) REAL time: 12 secs
- Phase 7: 0 unrouted; (0) REAL time: 12 secs
- The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0
- REAL time: routing algorithm run time.
45Static timing after routing
- Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 99.999 uS
- 20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
- Maximum delay is 38.424ns.
- (vs. 20.916 ns in the post-map report.) The increase is due to interconnect delays.
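As a rough worked figure, the two maximum delays quoted in these reports can be subtracted. The worst-case path after routing need not be the same path as after mapping, so this is an estimate of the interconnect contribution, not an exact per-path value.

```latex
\[
t_{\text{post-route}} - t_{\text{post-map}}
  = 38.424\,\text{ns} - 20.916\,\text{ns}
  = 17.508\,\text{ns}
\]
```

By this estimate, a little under half of the routed worst-case delay comes from interconnect.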
46Timing constraint
- Use timing constraint editor
47Post-map static timing report
- Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS
- 20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
- Maximum delay is 20.916ns.
The pad-to-pad delay hasn't changed, since this design has limited opportunities for logic synthesis to change delays by restructuring logic.
48Post-routing static timing report
- Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS
- 20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
- Maximum delay is 31.984ns.
Tools generally try to meet the delay goal as closely as possible to minimize area.
49Tighter timing constraints
- Tighten requirement to 25 ns.
- Post-place-route timing report
- Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 25 nS
- 20135312 items analyzed, 11 timing errors detected. (11 setup errors, 0 hold errors)
- Maximum delay is 31.128ns.
50Report on a violated path
- Slack: -6.128ns (requirement - data path)
- Source: y<0> (PAD)
- Destination: p<30> (PAD)
- Requirement: 25.000ns
- Data Path Delay: 31.128ns (Levels of Logic = 31)
Modify the logic and/or physical design to improve the delay.
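The slack figure follows directly from the two values in the report. The symbols below are introduced here only for the calculation.

```latex
\[
\text{slack} = t_{\text{requirement}} - t_{\text{data path}}
             = 25.000\,\text{ns} - 31.128\,\text{ns}
             = -6.128\,\text{ns}
\]
```

A negative slack means the path is slower than the constraint allows by that amount.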
51Power report
- Power summary:                        I(mA)   P(mW)
- Total estimated power consumption:              333
- Vccint 1.50V:                             0       0
- Vccaux 3.30V:                           100     330
- Vcco33 3.30V:                             1       3
- Inputs:                                   0       0
- Logic:                                    0       0
- Outputs:
- Vcco33:                                   0       0
- Signals:                                  0       0
- Quiescent Vccaux 3.30V:                 100     330
- Quiescent Vcco33 3.30V:                   1       3
- Thermal summary:
The thermal summary helps us determine whether we need additional cooling.
52Improving area
- Floorplanner window
- Floorplanner -> View/Edit Placed Design
[Figure: chip floorplan showing the LEs.]
- Green rectangles: components mapped to CLBs
53Rat's nest wiring
- If you click on a component in the design hierarchy window, its rat's nest is shown.
54Routing editor view
- FPGA Editor -> View/Edit Routed Design
55Editing constraints
- Use constraints editor to place constraints
- This tool allows you to constrain the placement of logic as well as the assignment of chip I/Os to IOBs (e.g., useful for PCB design).
56Design browser pane
57Drag and drop constraints
58Change the shape of constraints
59Full set of placement constraints
- We place the rows of the multiplier one below the
other to create the row structure of the
floorplan.
60Placement results
61New timing report
- After placement constraints
- 19742142 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
- Maximum delay is 29.934ns.
- Compares to 31 ns for unconstrained placement.
62Combinational Process Sensitivity List
library IEEE;
use IEEE.Std_Logic_1164.all;

entity IF_EXAMPLE is
  port (A, B, C, X : in  std_ulogic_vector(3 downto 0);
        Z          : out std_ulogic_vector(3 downto 0));
end IF_EXAMPLE;

architecture A of IF_EXAMPLE is
begin
  process (A, B, C, X)
  begin
    if (X = "1110") then
      Z <= A;
    elsif (X = "0101") then
      Z <= B;
    else
      Z <= C;
    end if;
  end process;
end A;
63Combinational Process Sensitivity List
process (A, B, SEL)
begin
  if SEL = '1' then
    Z <= A;
  else
    Z <= B;
  end if;
end process;
- If SEL is missing in the sensitivity list, what will the simulation behavior be?
- The sensitivity list is usually ignored during synthesis.
- For equivalent behavior of the simulation model and the hardware, all signals which are read must be entered into the sensitivity list.
- Use a complete if statement for the synthesis of combinational logic.
64Combinational ProcessIncomplete Assignments
library IEEE;
use IEEE.Std_Logic_1164.all;

entity INCOMP_IF is
  port (A, B, SEL : in  std_ulogic;
        Z         : out std_ulogic);
end INCOMP_IF;

architecture RTL of INCOMP_IF is
begin
  process (A, B, SEL)
  begin
    if SEL = '1' then
      Z <= A;
    end if;
  end process;
end RTL;
- What is the value of Z if SEL = '0'?
- What hardware would be generated during synthesis?
- A latch that is transparent while SEL = '1' (transparent latch).
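One common way to avoid the unintended latch is to assign the output on every path through the process, for example with a default assignment before the if statement. The sketch below assumes that Z should follow B when SEL is '0'; that choice, and the entity name COMPLETE_IF, are illustrative assumptions and not part of the original example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical variant of INCOMP_IF; the default value B is an assumption.
entity COMPLETE_IF is
  port (A, B, SEL : in  std_ulogic;
        Z         : out std_ulogic);
end COMPLETE_IF;

architecture RTL of COMPLETE_IF is
begin
  process (A, B, SEL)
  begin
    Z <= B;              -- default assignment: Z is driven on every path
    if SEL = '1' then
      Z <= A;            -- overrides the default when SEL is '1'
    end if;
  end process;
end RTL;
```

Because Z now has a value for every input combination, the process describes purely combinational logic (a 2-to-1 multiplexer) and no storage element is needed.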
65Modeling of Flip-Flops
library IEEE;
use IEEE.Std_Logic_1164.all;

entity FLOP is
  port (D, CLK : in  std_ulogic;
        Q      : out std_ulogic);
end FLOP;

architecture A of FLOP is
begin
  process
  begin
    wait until CLK'event and CLK = '1';
    Q <= D;
  end process;
end A;
66Description of Rising Clock Edge for Synthesis
- Standard for synthesis: IEEE 1076.6
- ... if condition:
  RISING_EDGE(clock_signal_name)    (not always supported)
  clock_signal_name'EVENT and clock_signal_name = '1'
  clock_signal_name = '1' and clock_signal_name'EVENT
  not clock_signal_name'STABLE and clock_signal_name = '1'
  clock_signal_name = '1' and not clock_signal_name'STABLE
67Description of Rising Clock Edge for Synthesis
- ... wait until condition:
  RISING_EDGE(clock_signal_name)
  clock_signal_name'EVENT and clock_signal_name = '1'
  clock_signal_name = '1' and clock_signal_name'EVENT
  not clock_signal_name'STABLE and clock_signal_name = '1'
  clock_signal_name = '1' and not clock_signal_name'STABLE
  clock_signal_name = '1'
68Description of Rising Clock Edge for Synthesis
- In Std_Logic_1164 package
process
begin
  wait until RISING_EDGE(CLK);
  Q <= D;
end process;

function RISING_EDGE (signal CLK : std_ulogic) return boolean is
begin
  if (CLK'event and CLK = '1' and CLK'last_value = '0') then
    return true;
  else
    return false;
  end if;
end RISING_EDGE;
69Gated Clock
- Designers avoid using gated clocks because of the problematic timing behavior of the circuit (gating adds skew).
- Low-power designs deliberately disable clocks to reduce or eliminate the power wasted by useless switching of transistors.
process
begin
  wait until RISING_EDGE(CLK);
  if (DGATE = '1') then
    Q <= D;
  end if;
end process;
[Figure: D flip-flop (DFF) clocked by CLK with a multiplexer in front of D; DGATE selects between the new data D and the current output Q.]
70Register Inference
library IEEE;
use IEEE.Std_Logic_1164.all;

entity COUNTER is
  port (CLK : in  std_ulogic;
        Q   : out integer range 0 to 15);
end COUNTER;

architecture A of COUNTER is
  signal COUNT : integer range 0 to 15;
begin
  process (CLK)
  begin
    if CLK'event and CLK = '1' then
      if (COUNT >= 9) then
        COUNT <= 0;
      else
        COUNT <= COUNT + 1;
      end if;
    end if;
  end process;
  Q <= COUNT;
end A;
BCD counter
- For all signals which receive an assignment in clocked processes, memory is synthesized.
- COUNT: 4 flip-flops (constrained integer)
- Q is not assigned in a clocked process, so no register is inferred for it.
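The counter above has no reset, so the power-up value of its flip-flops in hardware depends on the target device. A hedged sketch of the same counter with an asynchronous reset follows; the entity name COUNTER_RST and the port RST are assumptions added for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical variant of COUNTER with an asynchronous reset (RST is assumed).
entity COUNTER_RST is
  port (CLK, RST : in  std_ulogic;
        Q        : out integer range 0 to 15);
end COUNTER_RST;

architecture A of COUNTER_RST is
  signal COUNT : integer range 0 to 15;
begin
  process (CLK, RST)
  begin
    if RST = '1' then
      COUNT <= 0;                      -- known starting state
    elsif CLK'event and CLK = '1' then
      if COUNT >= 9 then
        COUNT <= 0;                    -- wrap after 9 (decimal digit)
      else
        COUNT <= COUNT + 1;
      end if;
    end if;
  end process;
  Q <= COUNT;
end A;
```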
71Asynchronous Set/Reset
library IEEE;
use IEEE.Std_Logic_1164.all;

entity ASYNC_FF is
  port (D, CLK, SET, RST : in  std_ulogic;
        Q                : out std_ulogic);
end ASYNC_FF;

architecture A of ASYNC_FF is
begin
  process (CLK, RST, SET)
  begin
    if (RST = '1') then
      Q <= '0';
    elsif SET = '1' then
      Q <= '1';
    elsif (CLK'event and CLK = '1') then
      Q <= D;
    end if;
  end process;
end A;
- if/elsif structure
- The last elsif has an edge
- No else
72Coding Style Influence
EXAMPLE1: process (SEL, A, B, C)
begin
  if SEL = '1' then
    Z <= A + B;
  else
    Z <= A + C;
  end if;
end process EXAMPLE1;
- Manual resource sharing:
EXAMPLE2: process (SEL, A, B, C)
  variable TMP : bit;
begin
  if SEL = '1' then
    TMP := B;
  else
    TMP := C;
  end if;
  Z <= A + TMP;
end process EXAMPLE2;
73Source Code Optimization
- An operation can be described very efficiently
for synthesis, e.g.
- In one description the longest path goes via five addition components, in the other via three.
- Some optimization tools automatically change the description according to the given constraints.
74Source Code Optimization
- If one of the inputs arrives later than the others, it can be assigned to IN6 in the left implementation.
- If power is a consideration, IN6 could be used for the signal that changes most frequently in the left implementation, since it passes through only one adder.
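The slide's figure is not reproduced here, so the following is a reconstruction under assumptions: six inputs summed either as a chain (taken to be the "left implementation") or as a tree built with parentheses. The entity name SUM6, the output names, and the 8-bit width are illustrative.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical example; names and widths are assumptions.
entity SUM6 is
  port (IN1, IN2, IN3, IN4, IN5, IN6 : in  unsigned(7 downto 0);
        SUM_CHAIN, SUM_TREE          : out unsigned(7 downto 0));
end SUM6;

architecture RTL of SUM6 is
begin
  -- Evaluated left to right: IN1 and IN2 pass through five adders,
  -- IN6 passes through only the last one.
  SUM_CHAIN <= IN1 + IN2 + IN3 + IN4 + IN5 + IN6;

  -- Parentheses describe a tree: the longest path goes via three adders.
  SUM_TREE <= ((IN1 + IN2) + (IN3 + IN4)) + (IN5 + IN6);
end RTL;
```

Both outputs compute the same value (modulo 256 at this width); only the described structure differs, and a synthesis tool may restructure either form under timing constraints.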
75???? ???????
- ???? ?? ????? ? ????? ???????? ?? (????? ??????
????????? ?? ????? logic cell?? (?? ????????? ??
ASIC)) ?????? ?? netlist ????? ?? ???. - ??? ??????? ????? ???? ?? ????? ????? ??? ??? (??
????? ???? ?? ???) - ?? ???? ???? ????? ?? ?? ???? ?? ?? VHDL
comment???? ???? ?? ????? Carry-Lookahead ??
Ripple Carry ?????? ???.
76Example Adder
entity ADD is
  port (A, B : in  integer range 0 to 7;
        Z    : out integer range 0 to 15);
end ADD;

architecture ARITHMETIC of ADD is
begin
  Z <= A + B;
end ARITHMETIC;

library VENDOR_XY;
use VENDOR_XY.p_arithmetic.all;

entity MVL_ADD is
  port (A, B : in  std_logic_vector (3 downto 0);
        Z    : out std_logic_vector (4 downto 0));
end MVL_ADD;

architecture ARITHMETIC of MVL_ADD is
begin
  Z <= A + B;  -- not allowed
end ARITHMETIC;
- Notice: advantages of a range declaration with integer types:
  a) during simulation: check for "out of range..."
  b) during synthesis: only 4 bit bus width
77IF Structure <-> CASE Structure
- Different descriptions may be synthesized differently
case IN is
  when 0 to 16 => OUT <= B;
  when 17      => OUT <= C;
  when others  => OUT <= A;
end case;

if (IN > 17) then
  OUT <= A;
elsif (IN < 17) then
  OUT <= B;
else
  OUT <= C;
end if;
- ????????? ?? ???? ??? optimize ????.
78Variables in Clocked Processes
- Registers are generated for all variables that
might be read before they are updated
VAR_1: process (CLK)
  variable TEMP : integer;
begin
  if (CLK'event and CLK = '1') then
    TEMP := INPUT * 2;
    OUTPUT_A <= TEMP + 1;
    OUTPUT_B <= TEMP + 2;
  end if;
end process VAR_1;

VAR_2: process (CLK)
  variable TEMP : integer;
begin
  if (CLK'event and CLK = '1') then
    OUTPUT <= TEMP + 1;
    TEMP := INPUT * 2;
  end if;
end process VAR_2;
- How many registers are generated?
79ELSE for Clock Checking
- ?? ????? ???? ????? ?? ??? ???? ????????? ????,
else ?? ??? ???? ????? ????? ??? ???? (??? ?????
???? ???? ?? ?? ???? ???).
process (CLK)
begin
  if (CLK'event and CLK = '1') then
    Q <= D;
  else
    Q <= A;
  end if;
end process;
80Dont Care
- ?????? ????? ?????? ?? - ?? ????? ?????? ?????
FALSE ?? ??? (?????? ????? ?????? - ??? ???)
81Synthesis Tips
- Real ? character ? time ??? ???? ????
- ????? ?? testbench ????? ? ???????.
82Synthesis Tips
83Finite State Machines and VHDL
- One- , two- or three-processes
- State Coding
- FSM Types
- Medvedev
- Moore
- Mealy
- Registered Output
84One-Process FSM
FSM_FF: process (CLK, RESET)
begin
  if RESET = '1' then
    STATE <= START;
  elsif CLK'event and CLK = '1' then
    case STATE is
      when START =>
        if X = GO_MID then
          STATE <= MIDDLE;
        end if;
      when MIDDLE =>
        if X = GO_STOP then
          STATE <= STOP;
        end if;
      when STOP =>
        if X = GO_START then
          STATE <= START;
        end if;
      when others =>
        STATE <= START;
    end case;
  end if;
end process FSM_FF;
85Two-Process FSM
FSM_LOGIC: process (STATE, X)
begin
  case STATE is
    when START =>
      if X = GO_MID then
        NEXT_STATE <= MIDDLE;
      end if;
    when MIDDLE => ...
    when others =>
      NEXT_STATE <= START;
  end case;
end process FSM_LOGIC;

FSM_FF: process (CLK, RESET)
begin
  if RESET = '1' then
    STATE <= START;
  elsif CLK'event and CLK = '1' then
    STATE <= NEXT_STATE;
  end if;
end process FSM_FF;
86How Many Processes?
- Structure and Readability
- Asynchronous combinational logic vs. synchronous storage elements => 2 processes
- Graphical FSM (without output equations) resembles one state process => 1 process
- Simulation
- Error detection is easier with two state processes due to access to intermediate signals => 2 processes
- Synthesis
- 2 state processes can lead to a smaller generic netlist and therefore to better synthesis results (depends on the synthesizer, but in general it is closer to hardware) => 2 processes
87State Encoding
type STATE_TYPE is (START, MIDDLE, STOP);
signal STATE : STATE_TYPE;
- State encoding is responsible for the safety of the FSM
- Binary encoding: START -> "00", MIDDLE -> "01", STOP -> "10"
- One-hot encoding: START -> "001", MIDDLE -> "010", STOP -> "100"
- Speed-optimized default encoding: one hot
88Encoding of CASE Statement
type STATE_TYPE is (START, MIDDLE, STOP);
signal STATE : STATE_TYPE;

case STATE is
  when START  => ...
  when MIDDLE => ...
  when STOP   => ...
  when others => ...
end case;
- Adding the "when others" choice
89Extension of Type Declaration
type STATE_TYPE is (START, MIDDLE, STOP, DUMMY);
signal STATE : STATE_TYPE;

case STATE is
  when START  => ...
  when MIDDLE => ...
  when STOP   => ...
  when DUMMY  => ...   -- or when others
end case;
- Adding dummy values
- Only for binary encoding
- Advantages
- Safe FSM after synthesis
90Hand Coding
subtype STATE_TYPE is std_ulogic_vector (1 downto 0);
signal STATE : STATE_TYPE;
constant START  : STATE_TYPE := "01";
constant MIDDLE : STATE_TYPE := "11";
constant STOP   : STATE_TYPE := "00";

case STATE is
  when START  => ...
  when MIDDLE => ...
  when STOP   => ...
  when others => ...
end case;
- Defining constants
- Control of encoding
- Safe FSM
- Portable design
- Disadvantage
- More effort (especially when design changes)
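A complete, hedged sketch of a hand-coded state machine using the constants from this slide. The entity name SAFE_FSM, the single input GO, and the port STATE_OUT are assumptions made so the example is self-contained; the transitions are a simple START -> MIDDLE -> STOP -> START cycle chosen for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical example; GO and STATE_OUT are assumed names.
entity SAFE_FSM is
  port (CLK, RESET, GO : in  std_ulogic;
        STATE_OUT      : out std_ulogic_vector(1 downto 0));
end SAFE_FSM;

architecture RTL of SAFE_FSM is
  subtype STATE_TYPE is std_ulogic_vector(1 downto 0);
  constant START  : STATE_TYPE := "01";
  constant MIDDLE : STATE_TYPE := "11";
  constant STOP   : STATE_TYPE := "00";
  signal STATE : STATE_TYPE;
begin
  process (CLK, RESET)
  begin
    if RESET = '1' then
      STATE <= START;
    elsif rising_edge(CLK) then
      case STATE is
        when START  => if GO = '1' then STATE <= MIDDLE; end if;
        when MIDDLE => if GO = '1' then STATE <= STOP;   end if;
        when STOP   => if GO = '1' then STATE <= START;  end if;
        -- The unused code "10" (and any metavalue) leads back to START.
        when others => STATE <= START;
      end case;
    end if;
  end process;

  STATE_OUT <= STATE;
end RTL;
```

Because the encoding is fixed in the source, the "when others" branch covers the one unused code explicitly; whether a given synthesis tool preserves that branch can depend on its optimization settings.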
91FSM Medvedev
- The output vector resembles the state vector: Y = S
Two Processes:
architecture RTL of MEDVEDEV is
  ...
begin
  REG: process (CLK, RESET)
  begin
    -- State Registers Inference
  end process REG;

  CMB: process (X, STATE)
  begin
    -- Next State Logic
  end process CMB;

  Y <= S;
end RTL;

One Process:
architecture RTL of MEDVEDEV is
  ...
begin
  REG: process (CLK, RESET)
  begin
    -- State Registers Inference with Logic Block
  end process REG;

  Y <= S;
end RTL;
92Medvedev Example (2-Process)
architecture RTL of MEDVEDEV_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE;
begin
  REG: process (CLK, RESET)
  begin
    if RESET = '1' then
      STATE <= START;
    elsif CLK'event and CLK = '1' then
      STATE <= NEXTSTATE;
    end if;
  end process REG;

  CMB: process (A, B, STATE)
  begin
    case STATE is
      when START =>
        if (A or B) = '0' then
          NEXTSTATE <= MIDDLE;
        end if;
      when MIDDLE =>
        if (A and B) = '1' then
          NEXTSTATE <= STOP;
        end if;
      when STOP =>
        if (A xor B) = '1' then
          NEXTSTATE <= START;
        end if;
      when others =>
        NEXTSTATE <= START;
    end case;
  end process CMB;

  -- concurrent signal assignments for output
  (Y, Z) <= STATE;
end RTL;
93Medvedev Example Waveform
- (Y,Z) = STATE => Medvedev machine
94FSM Moore
- The output vector is a function of the state vector: Y = f(S)
Three Processes:
architecture RTL of MOORE is
  ...
begin
  REG: -- Clocked Process
  CMB: -- Combinational Process
  OUTPUT: process (STATE)
  begin
    -- Output Logic
  end process OUTPUT;
end RTL;

Two Processes:
architecture RTL of MOORE is
  ...
begin
  REG: process (CLK, RESET)
  begin
    -- State Registers Inference with Next State Logic
  end process REG;

  OUTPUT: process (STATE)
  begin
    -- Output Logic
  end process OUTPUT;
end RTL;
95Moore Example
- Since the outputs depend only on the current state, no signals other than STATE appear in the sensitivity list.
architecture RTL of MOORE_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE;
begin
  REG: process (CLK, RESET)
  begin
    if RESET = '1' then
      STATE <= START;
    elsif CLK'event and CLK = '1' then
      STATE <= NEXTSTATE;
    end if;
  end process REG;

  CMB: process (A, B, STATE)
  begin
    case STATE is
      when START =>
        if (A or B) = '0' then
          NEXTSTATE <= MIDDLE;
        end if;
      when MIDDLE =>
        if (A and B) = '1' then
          NEXTSTATE <= STOP;
        end if;
      when STOP =>
        if (A xor B) = '1' then
          NEXTSTATE <= START;
        end if;
      when others =>
        NEXTSTATE <= START;
    end case;
  end process CMB;

  -- concurrent signal assignments for output
  Y <= '1' when STATE = MIDDLE else '0';
  Z <= '1' when STATE = MIDDLE or STATE = STOP else '0';
end RTL;
96Moore Example Waveform
- (Y,Z) changes simultaneously with STATE => Moore machine
97FSM Mealy
- The output vector is a function of the state vector and the input vector: Y = f(X,S)
Two Processes:
architecture RTL of MEALY is
  ...
begin
  MED: process (CLK, RESET)
  begin
    -- State Registers Inference with Next State Logic
  end process MED;

  OUTPUT: process (STATE, X)
  begin
    -- Output Logic
  end process OUTPUT;
end RTL;

Three Processes:
architecture RTL of MEALY is
  ...
begin
  REG: -- Clocked Process
  CMB: -- Combinational Process
  OUTPUT: process (STATE, X)
  begin
    -- Output Logic
  end process OUTPUT;
end RTL;
98Mealy Example
architecture RTL of MEALY_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE;
begin
  REG: -- clocked STATE process
  CMB: -- like Medvedev and Moore examples
  OUTPUT: process (STATE, A, B)
  begin
    case STATE is
      when START =>
        Y <= '0';
        Z <= A and B;
      when MIDDLE =>
        Y <= A nor B;
        Z <= '1';
      when STOP =>
        Y <= A nand B;
        Z <= A or B;
      when others =>
        Y <= '0';
        Z <= '0';
    end case;
  end process OUTPUT;
end RTL;
99Mealy Example (Another Code)
architecture RTL of MEALY_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE;
begin
  REG: -- clocked STATE process
  CMB: -- like Medvedev and Moore examples

  -- concurrent signal assignments for outputs
  Y <= '1' when (STATE = MIDDLE and (A or B) = '0')
             or (STATE = STOP and (A and B) = '0')
       else '0';
  Z <= '1' when (STATE = START and (A and B) = '1')
             or (STATE = MIDDLE)
             or (STATE = STOP and (A or B) = '1')
       else '0';
end RTL;
100Mealy Example Waveform
- (Y,Z) changes with the input => Mealy machine
- Note the "spikes" of Y and Z in the waveform
101Modeling Aspects
- Medvedev is too inflexible
- but needs less hardware (no combinational circuit for the output)
- More effort to calculate the state vector.
- Moore is preferred because of safe operation
- since the output depends only on the state vector,
- the next output values are stable long before the next clock edge.
- Mealy is more flexible, but there is a danger of
- spikes
- unnecessarily long paths (which limit the maximum clock frequency)
- combinational feedback loops
102Registered Output
- Avoiding long paths and combinational loops.
- With one additional clock period
103Registered Output Example (1)
architecture RTL of REG_TEST is
  signal Y_I, Z_I : std_ulogic;
  signal STATE, NEXTSTATE : STATE_TYPE;
begin
  REG: -- clocked STATE process
  CMB: -- like other examples
  OUTPUT: process (STATE, A, B)
  begin
    case STATE is
      when START =>
        Y_I <= '0';
        Z_I <= A and B;
      ...
  end process OUTPUT;

  -- clocked output process
  OUTPUT_REG: process (CLK)
  begin
    if CLK'event and CLK = '1' then
      Y <= Y_I;
      Z <= Z_I;
    end if;
  end process OUTPUT_REG;
end RTL;
104Reg. Output Example Waveform
- One clock period delay between STATE and output changes.
- Input changes at the clock edge result in an output change. (Danger of unintended values)
105Registered Output Example (2)
architecture RTL of REG_TEST2 is
  signal Y_I, Z_I : std_ulogic;
  signal STATE, NEXTSTATE : STATE_TYPE;
begin
  REG: -- clocked STATE process
  CMB: -- like other examples
  OUTPUT: process (NEXTSTATE, A, B)
  begin
    case NEXTSTATE is
      when START =>
        Y_I <= '0';
        Z_I <= A and B;
      ...
  end process OUTPUT;

  OUTPUT_REG: process (CLK)
  begin
    if CLK'event and CLK = '1' then
      Y <= Y_I;
      Z <= Z_I;
    end if;
  end process OUTPUT_REG;
end RTL;
106Reg. Output Example Waveform
- No delay between STATE and output changes.
- "Spikes" of original Mealy machine are gone!
107Case Study (Memory Controller)
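[Figure: memory controller block diagram. The FSM receives BUS_ID, Reset, READ_WRITE, READY, BURST, and CLK and drives OE, WE, ADDR1, and ADDR0 of the SRAM memory array; the Address and Data buses are also shown.]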
- ????????? ??? ??? ?? ????? bus id? mem_buffer
(F3) ?????? ?? ??? ?? ???? ?? ????.
108Case Study (Memory Controller)
- ?? ???? ???, READ_WRITE 1 ?? ??? ?? ????? ??
?? ?????? ?? ????? ????? ??? (?? 0 ???? ?????).
109Case Study (Memory Controller)
- ???? ?????? ???? ??? 4???? ?? (burst read)????
???? ?? ??? ????? ????, burst ???? ????.
110Case Study (Memory Controller)
- ?????? ?? 4 ??? ?? ???? ?????? ?? ???? (?? ??????
???? ??? ?? ???? ??????? ?????? ready ?????? ??
????).
111Case Study (Memory Controller)
- ?????? oe ?? ???? mem_buffer ?? ??? ?????? ????
?? ??? ? ?? ???? ????? ???? ?? ?? ???? burst
?????? ?? ???.
112Case Study (Memory Controller)
- ????? ????? ?? ???? ?? ???.
113Case Study (Memory Controller)
- ????? ????? we ???? ?? ??? ? data ?? ???address
????? ?? ???.
- ?????? ? ????? ?? ????? ready ????? ?? ????.
114??????? ????
[Figure: state diagram of the memory controller with a synchronous reset; states: idle, decision, read1, read2, read3, read4, write; transitions are labeled with conditions on ready, burst, and read_write.]
115Memory Controller
- ???? ??? ?????? ????? ???
ready
state
116VHDL Code (2-process)
library ieee;
use ieee.std_logic_1164.all;

entity memory_controller is
  port (reset, read_write, ready, burst, clk : in  std_logic;
        bus_id : in  std_logic_vector(7 downto 0);
        oe, we : out std_logic;
        addr   : out std_logic_vector(1 downto 0));
end memory_controller;

architecture state_machine of memory_controller is
  type StateType is (idle, decision, read1, read2, read3, read4, write);
  signal present_state, next_state : StateType;
117VHDL Code
begin
  state_comb: process (reset, bus_id, present_state, burst, read_write, ready)
  begin
    if (reset = '1') then
      oe <= '-'; we <= '-'; addr <= "--";
      next_state <= idle;
    else
      case present_state is
        when idle =>
          oe <= '0'; we <= '0'; addr <= "00";
          if (bus_id = "11110011" and ready = '1') then
            next_state <= decision;
          else
            next_state <= idle;
          end if;
        when decision =>
          oe <= '0'; we <= '0'; addr <= "00";
          if (read_write = '1') then
            next_state <= read1;
          else                        -- read_write = '0'
            next_state <= write;
          end if;
- Don't cares assigned to the outputs => the logic can be optimized.
- In every case, a value must be assigned to the outputs; otherwise, unwanted latches are inferred.
118
        when read1 =>
          oe <= '1'; we <= '0'; addr <= "00";
          if (ready = '0') then
            next_state <= read1;
          elsif (burst = '0') then
            next_state <= idle;
          else
            next_state <= read2;
          end if;
        when read2 =>
          oe <= '1'; we <= '0'; addr <= "01";
          if (ready = '1') then
            next_state <= read3;
          else
            next_state <= read2;
          end if;
        when read3 =>
          oe <= '1'; we <= '0'; addr <= "10";
          if (ready = '1') then
            next_state <= read4;
          else
            next_state <= read3;
          end if;
119VHDL Code
        when read4 =>
          oe <= '1'; we <= '0'; addr <= "11";
          if (ready = '1') then
            next_state <= idle;
          else
            next_state <= read4;
          end if;
        when write =>
          oe <= '0'; we <= '1'; addr <= "00";
          if (ready = '1') then
            next_state <= idle;
          else
            next_state <= write;
          end if;
      end case;
    end if;
  end process state_comb;
120VHDL Code
  state_clocked: process (clk)
  begin
    if rising_edge(clk) then
      present_state <= next_state;
    end if;
  end process state_clocked;
end;
121????? ??????? ?? ???????? Moore
- ????????? ?? ?? ?????? ???? ?? ??? ?????? ????
??? ??? (?? ???)
Inputs
Current-State
Next-State
State Registers
outputs
Next-State Logic
Output Logic
- ?????
- ?????? ??
- ??????? ??????
122????? ??????? ?? ???????? Moore
2) ????????? ?? ?? ????????? ????? ?? ??? ?????
???? ?? ????
outputs
Output Logic
Output Registers
Inputs
Current-State
Next-State
State Registers
Next-State Logic
- ?????? ?? ??????? ???? ?? ???? ???????? ?? ??????
????? ?? ?? ????? ?? ??? ????? ????.
123
architecture state_machine of memory_controller is
  type StateType is (idle, decision, read1, read2, read3, read4, write);
  signal present_state, next_state : StateType;
  signal addr_d : std_logic_vector(1 downto 0);  -- D-input to addr flip-flops
begin
  state_comb: process (bus_id, present_state, burst, read_write, ready)
  begin
    case present_state is             -- addr outputs not defined
      when idle =>
        oe <= '0'; we <= '0';         -- addr is absent.
        if (bus_id = "11110011" and ready = '1') then
          next_state <= decision;
        else
          next_state <= idle;
        end if;
      when decision =>
        oe <= '0'; we <= '0';
        if (read_write = '1') then
          next_state <= read1;
        else                          -- read_write = '0'
          next_state <= write;
        end if;
- ??? ??? ???? addr ??? ??? ?? ????? ?? ???? (????
we ? oe ???? ????? ??????)
124
      when read1 =>
        oe <= '1'; we <= '0';
        if (ready = '0') then
          next_state <= read1;
        elsif (burst = '0') then
          next_state <= idle;
        else
          next_state <= read2;
        end if;
      when read2 =>
        oe <= '1'; we <= '0';
        if (ready = '1') then
          next_state <= read3;
        else
          next_state <= read2;
        end if;
      when read3 =>
        oe <= '1'; we <= '0';
        if (ready = '1') then
          next_state <= read4;
        else
          next_state <= read3;
        end if;
125
      when read4 =>
        oe <= '1'; we <= '0';
        if (ready = '1') then
          next_state <= idle;
        else
          next_state <= read4;
        end if;
      when write =>
        oe <= '0'; we <= '1';
        if (ready = '1') then
          next_state <= idle;
        else
          next_state <= write;
        end if;
    end case;
  end process state_comb;

  with next_state select              -- D-input to addr flip-flops
    addr_d <= "01" when read2,        -- defined here.
              "10" when read3,
              "11" when read4,
              "00" when others;
126
  state_clocked: process(clk, reset) begin
    if reset = '1' then
      present_state <= idle;
      addr <= "00";  -- asynchronous reset for addr flops
    elsif rising_edge(clk) then
      present_state <= next_state;
      addr <= addr_d;  -- value of addr_d stored in addr
    end if;
  end process state_clocked;
end state_machine;
127????? ??????? ?? ???????? Moore
- ??????
- 2 FF ?????.
- ???? ?????? ?????? ???? ?? FF??? addr, ?? ?? ????
?????? ?? ?? ??? (??? ?? PLD ?? 2 ???? ???????
??? ?? ????? ?????? ??????? ?? ????? ???)
128????? ??????? ?? ???????? Moore
3) ????????? ?? ???????? ?? ?????? ???? ???? ???
??? (Medvedev) (????? ??????? ??)
[Figure: block diagram. Labels: Inputs, Next-State Logic, Next-State, State Registers, Current-State, outputs.]
- State encoding ???? ?? ??? ????? ???.
- FF??? ?????? ???? ????.
- ???? ????? ?? ???? ?????? ???? ????? (????
?????).
129State Encoding
- ??? ??? ???? addr ??? ??? ?? ????? ?? ???? (????
we ? oe ???? ????? ??????)
State      Addr(0)  Addr(1)  s2  s1
Idle          0        0      0   0
decision      0        0      1   0
Read1         0        0      0   1
Read2         1        0      x   x
Read3         0        1      x   x
Read4         1        1      x   x
Write         0        0      1   1
130State Encoding
- ??? ???? we ? oe ?? ??????? ?? ???? ???? encode
????
State      Addr(0)  Addr(1)  we  oe  s0
Idle          0        0      0   0   0
decision      0        0      0   0   1
Read1         0        0      0   1   0
Read2         1        0      0   1   0
Read3         0        1      0   1   0
Read4         1        1      0   1   0
Write         0        0      1   0   0
131VHDL Code
architecture state_machine of memory_controller is
  -- state signal is a std_logic_vector rather than an enumeration type
  signal state : std_logic_vector(4 downto 0);
  constant idle     : std_logic_vector(4 downto 0) := "00000";
  constant decision : std_logic_vector(4 downto 0) := "00001";
  constant read1    : std_logic_vector(4 downto 0) := "00100";
  constant read2    : std_logic_vector(4 downto 0) := "01100";
  constant read3    : std_logic_vector(4 downto 0) := "10100";
  constant read4    : std_logic_vector(4 downto 0) := "11100";
  constant write    : std_logic_vector(4 downto 0) := "00010";
begin
  state_tr: process(reset, clk) begin  -- One-process FSM
    if reset = '1' then
      state <= idle;
    elsif rising_edge(clk) then
      case state is  -- outputs not defined here
        when idle =>
          if (bus_id = "11110011") then
            state <= decision;
          end if;  -- no else; implicit memory
132VHDL Code
        when decision =>
          if (read_write = '1') then
            state <= read1;
          else  -- read_write = '0'
            state <= write;
          end if;
        when read1 =>
          if (ready = '0') then
            state <= read1;
          elsif (burst = '0') then
            state <= idle;
          else
            state <= read2;
          end if;
        when read2 =>
          if (ready = '1') then
            state <= read3;
          end if;  -- no else; implicit memory
133
        when read3 =>
          if (ready = '1') then
            state <= read4;
          end if;  -- no else; implicit memory
        when read4 =>
          if (ready = '1') then
            state <= idle;
          end if;  -- no else; implicit memory
        when write =>
          if (ready = '1') then
            state <= idle;
          end if;  -- no else; implicit memory
        when others =>
          state <= "-----";  -- don't care if undefined state
      end case;
    end if;
  end process state_tr;
  -- outputs associated with register values
  we <= state(1);
  oe <= state(2);
  addr <= state(4 downto 3);
end state_machine;
134 One-Hot Encoding
One-Hot Sequential State
000000000000000001 00000 State0
000000000000000010 00001 State1
000000000000000100 00010 State2
000000000000001000 00011 State3
000000000000010000 00100 State4
000000000000100000 00101 State5
000000000001000000 00110 State6
000000000010000000 00111 State7
000000000100000000 01000 State8
000000001000000000 01001 State9
000000010000000000 01010 State10
000000100000000000 01011 State11
000001000000000000 01100 State12
000010000000000000 01101 State13
000100000000000000 01110 State14
001000000000000000 01111 State15
010000000000000000 10000 State16
100000000000000000 10001 State17
135 One-Hot Encoding
??? ???? ?? FSM
136 One-Hot Encoding
???) Sequential Encoding
s4s3s2s1s0(????) cond1cond2cond3. s4s3s2s1s0(????)
state0
state1
01111 1 - - - - - - - - - 00010 state2
...
01111 - - 0 - - - - - - - - 01111 state15
state16
01111 - 1 - - - - - - - - 10001 state17
137 One-Hot Encoding
s4s3s2s1s0(????) cond1cond2cond3. s4s3s2s1s0(????)
state0
state1
01111 1 - - - - - - - - - 00010 state2
...
01111 - - 0 - - - - - - - - 01111 state15
state16
01111 - 1 - - - - - - - - 10001 State17
???) Sequential Encoding
? ???? ?????? ????? ????? ??????.
138 One-Hot Encoding
One-Hot State
000000000000000001 State0
000000000000000010 State1
000000000000000100 State2
000000000000001000 State3
000000000000010000 State4
000000000000100000 State5
000000000001000000 State6
000000000010000000 State7
000000000100000000 State8
000000001000000000 State9
000000010000000000 State10
000000100000000000 State11
000001000000000000 State12
000010000000000000 State13
000100000000000000 State14
001000000000000000 State15
010000000000000000 State16
100000000000000000 State17
- ???? ?????? ????? ????
- ??? ????? FF?? ????
- 18 ?????? ????? ???? ?? ??? 5 ?????? ????? ??????
- ? ???? ???? ???? ??? ????????? ????
- ? ?????? ??????
- ????? ???? FPGA.
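As a rough illustration of the one-hot column in the table above, the sketch below hand-codes a three-state machine with one flip-flop per state. The entity name onehot_fsm, the ports go and busy, and the constants S0 to S2 are hypothetical and do not come from the slides. Many synthesis tools can also choose a one-hot encoding automatically for an enumerated state type, but how that is requested is tool-specific.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are assumptions, not from the slides.
entity onehot_fsm is
  port (clk, reset : in  std_logic;
        go         : in  std_logic;
        busy       : out std_logic);
end onehot_fsm;

architecture rtl of onehot_fsm is
  -- one flip-flop per state; exactly one bit is '1' in a legal state
  constant S0 : std_logic_vector(2 downto 0) := "001";
  constant S1 : std_logic_vector(2 downto 0) := "010";
  constant S2 : std_logic_vector(2 downto 0) := "100";
  signal state : std_logic_vector(2 downto 0);
begin
  process (clk, reset) begin
    if reset = '1' then
      state <= S0;
    elsif rising_edge(clk) then
      case state is
        when S0     => if go = '1' then state <= S1; end if;
        when S1     => state <= S2;
        when S2     => state <= S0;
        when others => state <= S0;  -- return to S0 from any illegal code
      end case;
    end if;
  end process;
  -- a state is recognised by looking at a single bit, so output
  -- decoding stays shallow even when the number of states grows
  busy <= state(1) or state(2);
end rtl;
```

The price is one flip-flop per state (18 instead of 5 for the table above), which tends to suit register-rich FPGA architectures.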
139Power Reduction
- State assignment ????? ?? ????? ???? ????? ??
???? ???. - ????? One-hot ?? ?? ????? ??? 2 ????? ?????? ????
????. - ????? ????
- ????? ????? ?????? ?? ?????
- ???? ????? ????? ???? ????
- ? ???? ?????? ???.
- Gray Encoding ???? FSM??? ???? ??????? ?? ?????
??.
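Gray encoding, named above, assigns codes so that consecutive states differ in exactly one bit, so only one state flip-flop toggles per transition in a mostly sequential FSM. A minimal sketch, assuming a hypothetical four-state cyclic machine (the entity gray_fsm, the port step, and the constants A to D are illustrative names, not from the slides):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are assumptions, not from the slides.
entity gray_fsm is
  port (clk, reset, step : in  std_logic;
        state_out        : out std_logic_vector(1 downto 0));
end gray_fsm;

architecture rtl of gray_fsm is
  -- Gray sequence 00 -> 01 -> 11 -> 10 -> 00: one bit changes per step
  constant A : std_logic_vector(1 downto 0) := "00";
  constant B : std_logic_vector(1 downto 0) := "01";
  constant C : std_logic_vector(1 downto 0) := "11";
  constant D : std_logic_vector(1 downto 0) := "10";
  signal state : std_logic_vector(1 downto 0);
begin
  process (clk, reset) begin
    if reset = '1' then
      state <= A;
    elsif rising_edge(clk) then
      if step = '1' then
        case state is
          when A      => state <= B;
          when B      => state <= C;
          when C      => state <= D;
          when others => state <= A;  -- D, and any metavalue code
        end case;
      end if;
    end if;
  end process;
  state_out <= state;
end rtl;
```

With a plain binary count (00, 01, 10, 11) the 01 to 10 and 11 to 00 steps would each toggle two flip-flops; the Gray ordering avoids that, but only for FSMs whose transitions mostly follow one sequence.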
140Pipelining
- ???? ???? ?????? datapath ????? ?? ?? ?? ?? ????
???? ????? ?? ??? ?? ??? ??? ???? ?? ?? ??? ????
????? ?? ???? ????? ????
141Pipelining
- f ??????? 3 ????? ?? ??? (??? ??? ?? ??????? tco
? tsu ???? ????????? (pipeline - throughput 3 ????? ?? ??? ??? ??????? 3 ????
????? ???? ?? ???? latency - ? ??? ????? ?????? ???????? ?? ????.
- ????? FPGA?? ????? ?????? ??? ?? CPLD??
pipeline ???? ?? ??? ?? ???. - CPLD?? ?? ?? pass ?? logic array? ?????? ????? ??
?? ?????? ????? ????
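The pipelining idea above can be sketched as a datapath split by a register, so that each clock period only has to cover one stage plus the flip-flop overhead (tco and tsu). The example below is a minimal sketch under assumed names (entity pipe_add3, signals ab_r and c_r are hypothetical): it computes a + b + c in two stages, accepting new operands every clock and producing each result two clock edges after its operands are sampled.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names are assumptions, not from the slides.
entity pipe_add3 is
  port (clk     : in  std_logic;
        a, b, c : in  unsigned(7 downto 0);
        y       : out unsigned(7 downto 0));
end pipe_add3;

architecture rtl of pipe_add3 is
  signal ab_r : unsigned(7 downto 0);  -- stage 1 result: a + b
  signal c_r  : unsigned(7 downto 0);  -- c delayed one clock to stay aligned with ab_r
begin
  process (clk) begin
    if rising_edge(clk) then
      -- stage 1: first adder, result captured in a pipeline register
      ab_r <= a + b;
      c_r  <= c;
      -- stage 2: second adder works on the values registered last cycle
      y <= ab_r + c_r;  -- sum wraps modulo 256 (8-bit result)
    end if;
  end process;
end rtl;
```

Without the ab_r and c_r registers both adders would sit in one combinational path; with them the longest path is a single adder, which raises the achievable clock frequency at the cost of one extra cycle of latency.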
142Example AMD AM2901
143AMD AM2901
library ieee;
use ieee.std_logic_1164.all;
use work.numeric_std.all;
use work.am2901_comps.all;
entity am2901 is port(
  clk, rst     : in std_logic;
  a, b         : in unsigned(3 downto 0);          -- address inputs
  d            : in unsigned(3 downto 0);          -- direct data
  i            : in std_logic_vector(8 downto 0);  -- micro instruction
  c_n          : in std_logic;                     -- carry in
  oe           : in std_logic;                     -- output enable
  ram0, ram3   : inout std_logic;                  -- shift lines to ram
  qs0, qs3     : inout std_logic;                  -- shift lines to q
  y            : buffer unsigned(3 downto 0);      -- data outputs (3-state)
  g_bar, p_bar : buffer std_logic;                 -- carry generate, propagate
  ovr          : buffer std_logic;                 -- overflow
  c_n4         : buffer std_logic;                 -- carry out
  f_0          : buffer std_logic;                 -- f = 0
  f3           : buffer std_logic);                -- f(3) w/o 3-state
end am2901;
144
architecture am2901 of am2901 is
  alias dest_ctl : std_logic_vector(2 downto 0) is i(8 downto 6);
  alias alu_ctl  : std_logic_vector(2 downto 0) is i(5 downto 3);
  alias src_ctl  : std_logic_vector(2 downto 0) is i(2 downto 0);
  signal ad, bd  : unsigned(3 downto 0);
  signal q       : unsigned(3 downto 0);
  signal r, s    : unsigned(3 downto 0);
  signal alu_out : unsigned(3 downto 0);
begin
  -- instantiate and connect components
  u1: ram_regs port map(clk => clk, rst => rst, a => a, b => b,
                        alu_out => alu_out, dest_ctl => dest_ctl,
                        ram0 => ram0, ram3 => ram3, ad => ad, bd => bd);
  u2: q_reg port map(clk => clk, rst =>