ELEN 468 Advanced Logic Design

Provided by: Jian93
1
ELEN 468 Advanced Logic Design
  • Lecture 31
  • Midterm-2 Review

2
Design Cycles
System/Architectural Design
HDL
Logic Design
Verification/Simulation
Physical Design/Layout
Parasitic Extraction
Fabrication
Testing
3
Design and Technology Styles
  • Custom design
      • Mostly manual design, long design cycle
      • High performance, high volume
      • Microprocessors, analog, leaf cells, IP
  • Standard cell
      • Pre-designed cells, CAD, short design cycle
      • Medium performance, ASIC
  • FPGA/PLD
      • Pre-fabricated, fast automated design, low cost
      • Prototyping, reconfigurable computing

4
Primitives
5
Primitives
  • Pre-defined primitives
      • Total 26 pre-defined primitives
      • All combinational
      • buf and not primitives can have multiple outputs, others
        have a single output
  • User-Defined Primitives (UDP)
      • Combinational or sequential
      • Single output
  • UDP vs. modules
      • Used to model cell library
      • Require less memory
      • Simulate faster

6
Edge-sensitive Behavior
  • primitive d_flop( q, clock, d );
  • output q;
  • input clock, d;
  • reg q;
  • table
  • // clock d : state : q/next_state
  • (01) 0 : ? : 0; // Parentheses indicate signal transition
  • (01) 1 : ? : 1; // Rising clock edge
  • (0?) 1 : 1 : 1;
  • (0?) 0 : 0 : 0;
  • (?0) ? : ? : -; // Falling clock edge
  • ? (??) : ? : -; // Steady clock
  • endtable
  • endprimitive

[Figure: d_flop block symbol with inputs clock and d and output q]
7
Delay Model and Simulation
8
Asymmetric Delay Assignment
  • module nand1 ( O, A, B );
  • input A, B;
  • output O;
  • nand ( O, A, B );
  • specify
  • specparam
  • T01 = 1.13:3.09:7.75,
  • T10 = 0.93:2.50:7.34;
  • ( A => O ) = ( T01, T10 );
  • ( B => O ) = ( T01, T10 );
  • endspecify
  • endmodule

[Annotations: each specparam triple is min delay : typical delay : max delay; T01 is the rising time, T10 is the falling time]
9
Simulation with Delay
[Figure: circuit with signals A, B, C and D and gate delays 3 and 2; waveforms and event list over tsim = 0 to 50, starting from A = B = C = D = x]
10
Delay Models
  • Gate delay
      • Intrinsic delay
      • Layout-induced delay due to capacitive load
      • Waveform slope-induced delay
  • Net delay/transport delay
      • Signal propagation delay along interconnect wires
  • Module path delay
      • Delay between input port and output port

11
Inertial Delay
  • Delay is caused by charging and discharging node
    capacitors in circuit
  • Gate delay and wire delay
  • Pulse rejection
  • If pulse width is less than delay, the pulse is
    ignored

[Figure: example circuit with signals A, B, C and D]
12
Gate Delay
  • and (yout, x1, x2); // default, zero gate delay
  • and #3 (yout, x1, x2); // 3 units delay for all transitions
  • and #(2,3) G1(yout, x1, x2); // rising, falling delay
  • and #(2,3) G1(yout, x1, x2), G2(yout2, x3, x4); // Multiple instances
  • a_buffer #(3,5,2) (yout, x); // UDP, rise, fall, turnoff
  • bufif1 #(3:4:5, 6:7:9, 5:7:8) (yout, xin, enable); // min:typ:max / rise, fall, turnoff
  • Simulators simulate with only one of the min, typ and
    max delay values
  • Selection is made through compiler directives or
    user interfaces
  • Default delay is typ delay

13
Gate and Wire Model
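[Figure: gate modeled with resistance R and capacitance C; wire of length L modeled as a π-network with series resistance rL and capacitance cL/2 at each end, where r = resistance per unit length and c = capacitance per unit length]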
14
Example of Model
15
Delay Estimation
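[Figure: RC tree. Driver resistance R feeds node 0; R1 connects node 0 to node 1; R2 and R3 branch from node 1 to nodes 2 and 3; node i has capacitance Ci]
  • D0 = R ( C0 + C1 + C2 + C3 )
  • D1 = D0 + R1 ( C1 + C2 + C3 )
  • D2 = D1 + R2 C2
  • D3 = D1 + R3 C3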
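
The four delay expressions above can be evaluated directly. The following is a minimal sketch, assuming the tree topology implied by the formulas (R drives node 0, R1 leads to node 1, R2 and R3 branch to nodes 2 and 3); the function name and all numeric values are made up for illustration.

```python
# Delay estimation for the four-node RC tree, term by term as on the slide.
# The R and C values used in the example call are hypothetical.

def tree_delays(R, R1, R2, R3, C0, C1, C2, C3):
    """Return (D0, D1, D2, D3) for the RC tree."""
    d0 = R * (C0 + C1 + C2 + C3)     # driver resistance sees every capacitor
    d1 = d0 + R1 * (C1 + C2 + C3)    # R1 sees C1 plus both downstream branches
    d2 = d1 + R2 * C2                # branch to node 2
    d3 = d1 + R3 * C3                # branch to node 3
    return d0, d1, d2, d3

delays = tree_delays(R=2.0, R1=1.0, R2=1.0, R3=3.0,
                     C0=1.0, C1=2.0, C2=1.0, C3=1.0)
print(delays)
```

Each resistor contributes its resistance times the total capacitance downstream of it, which is why D2 and D3 share the D1 term.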

16
Net Delay
  • wire #2 y_tran;
  • and #3 (y_tran, x1, x2);
  • buf #1 (buf_out, y_tran);
  • and #3 (y_inertial, x1, x2);

[Figure: the two circuits and their waveforms for x1, x2, y_tran, buf_out and y_inertial]
17
Clock Scheduling
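[Figure: flip-flops i and j driven by a common clock and separated by logic; LD = logic delay, ti and tj = clock arrival times at i and j]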
18
Timing Constraints
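[Figure: timing diagram showing the setup and hold windows, clock arrival times ti and tj, and logic delays LDmin and LDmax]
  • skew_ij = ti - tj > hold_max - LD_min
  • skew_ij = ti - tj < CP - LD_max - setup_max
  • CP: clock period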
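
The two inequalities bound the permissible skew from below (hold) and from above (setup). A small sketch of that check follows; the function names and the numbers in the example are hypothetical, and all quantities are assumed to be in the same time unit.

```python
# Permissible skew window from the hold and setup constraints.
# Function names and example numbers are illustrative assumptions.

def skew_window(cp, ld_min, ld_max, setup_max, hold_max):
    """Return (lower, upper) bounds on skew_ij = ti - tj."""
    lower = hold_max - ld_min            # hold constraint
    upper = cp - ld_max - setup_max      # setup constraint
    return lower, upper

def skew_ok(ti, tj, cp, ld_min, ld_max, setup_max, hold_max):
    """True if the clock arrival times satisfy both constraints."""
    lower, upper = skew_window(cp, ld_min, ld_max, setup_max, hold_max)
    return lower < (ti - tj) < upper

params = dict(cp=10, ld_min=2, ld_max=7, setup_max=1, hold_max=1)
print(skew_window(**params))
print(skew_ok(ti=1, tj=0, **params), skew_ok(ti=3, tj=0, **params))
```

If the lower bound exceeds the upper bound, no clock schedule satisfies the pair at that clock period.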

19
Time Scales
  • Time scale directive: `timescale <time_unit>/<time_precision>
  • time_unit -> physical unit of measure, time
    scale of delay
  • time_precision -> time resolution/minimum step
    size during simulation
  • time_unit ≥ time_precision

20
Example of Time Scale
  • `timescale 1 ns / 10 ps
  • module modA( y, x1, x2 );
  • output y; input x1, x2;
  • nand #(3.225, 4.237) ( y, x1, x2 );
  • endmodule
  • `timescale 10 ns / 10 ns
  • module modB();
  • reg x1, x2; wire y;
  • modA M1(y, x1, x2);
  • initial begin
  • $monitor ( $time,
  • "%f x1=%b x2=%b y=%b",
  • $realtime, x1, x2, y );
  • end
  • initial begin
  • #5 x1 = 0; x2 = 0;
  • #5 x2 = 1;
  • #5 x1 = 1;
  • #5 x2 = 0;
  • end
  • endmodule
  • Simulation output:
  • t real_t x1 x2 y
  • -------------------------------------------------
  • 0 0.000000 x1=x x2=x y=x
  • 5 5.000000 x1=0 x2=0 y=x
  • 5 5.323000 x1=0 x2=0 y=1
  • 10 10.000000 x1=0 x2=1 y=1
  • 15 15.000000 x1=1 x2=1 y=1
  • 15 15.424000 x1=1 x2=1 y=0
  • 20 20.000000 x1=1 x2=0 y=0
  • 20 20.323000 x1=1 x2=0 y=1

21
Assignment
22
Blocking and Non-blocking Assignment
  • initial
  • begin
  • a = 1;
  • b = 0;
  • a = b; // a = 0
  • b = a; // b = 0
  • end
  • initial
  • begin
  • a = 1;
  • b = 0;
  • a <= b; // a = 0
  • b <= a; // b = 1
  • end
  • Blocking assignment (=)
      • Statement order matters
      • A statement has to be executed before the next
        statement
  • Non-blocking assignment (<=)
      • Concurrent assignment
      • Normally the last assignment at a given
        simulation time step
      • If it triggers other blocking assignments, it is
        executed before the blocking assignment it
        triggers
      • If there are multiple non-blocking assignments to
        the same variable in the same behavior, the latter
        overwrites the previous
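
The difference between the two initial blocks can be mimicked outside a simulator. This is a rough Python analogy, not a model of the Verilog scheduler: blocking assignments update immediately, while non-blocking assignments sample every right-hand side first and commit afterwards. The function names are hypothetical.

```python
# Rough analogy for the two initial blocks above (not a Verilog simulator).

def run_blocking():
    a, b = 1, 0
    a = b          # a becomes 0 immediately
    b = a          # b reads the updated a, so b = 0
    return a, b

def run_nonblocking():
    a, b = 1, 0
    new_a, new_b = b, a    # sample both right-hand sides first
    a, b = new_a, new_b    # then commit: a = 0, b = 1
    return a, b

print(run_blocking(), run_nonblocking())
```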

23
Procedural Continuous Assignment
  • Continuous assignment establishes static binding
    for net variables
  • Procedural continuous assignment (PCA)
    establishes dynamic binding for variables
  • assign ... deassign: for register variables only
  • force ... release: for both register and net
    variables

24
Intra-assignment Delay Blocking Assignment
  • // B = 0 at time 0
  • // B = 1 at time 4
  • #5 A = B; // A = 1
  • C = D;
  • A = #5 B; // A = 0
  • C = D;
  • A = @(enable) B;
  • C = D;
  • A = @(named_event) B;
  • C = D;
  • If timing control operator (#, @) on LHS
      • Blocking delay
      • RHS evaluated at (#, @)
      • Assignment at (#, @)
  • If timing control operator (#, @) on RHS
      • Intra-assignment delay
      • RHS evaluated immediately
      • Assignment at (#, @)

25
Example
initial begin a = #10 1; b = #2 0; c = #3 1; end
initial begin d <= #10 1; e <= #2 0; f <= #3 1; end

t   a b c d e f
0   x x x x x x
2   x x x x 0 x
3   x x x x 0 1
10  1 x x 1 0 1
12  1 0 x 1 0 1
15  1 0 1 1 0 1
26
Tell the Differences
always @ (a or b) y = a|b;
always @ (a or b) #5 y = a|b;
always @ (a or b) y = #5 a|b;
always @ (a or b) y <= #5 a|b;
Which one describes an OR gate?
Event control is blocked
27
Race Condition
  • always @ ( posedge clk ) // c will get previous b or new b ?
  • c = b;
  • always @ ( posedge clk )
  • b = a;

28
Avoid Race Condition
  • always @ ( posedge clk ) // Solution 1: merge always
  • begin
  • c = b; b = a;
  • end
  • always @ ( posedge clk ) // Solution 2: intra-assignment delay
  • c = #1 b;
  • always @ ( posedge clk )
  • b = #1 a;
  • always @ ( posedge clk ) // Solution 3: non-blocking assignment
  • c <= b;
  • always @ ( posedge clk )
  • b <= a;

29
Finite State Machine
30
FSM Example Speed Machine
31
Verilog Code for Speed Machine
  • // Explicit FSM style
  • module speed_machine ( clock, accelerator, brake, speed );
  • input clock, accelerator, brake;
  • output [1:0] speed;
  • reg [1:0] state, next_state;
  • parameter stopped = 2'b00;
  • parameter s_low = 2'b01; // named s_low to match its uses below
  • parameter s_medium = 2'b10;
  • parameter s_high = 2'b11;
  • assign speed = state;
  • always @ ( posedge clock )
  • state <= next_state;
  • always @ ( state or accelerator or brake )
  • if ( brake == 1'b1 )
  • case ( state )
  • stopped: next_state <= stopped;
  • s_low: next_state <= stopped;
  • s_medium: next_state <= s_low;
  • s_high: next_state <= s_medium;
  • default: next_state <= stopped;
  • endcase
  • else if ( accelerator == 1'b1 )
  • case ( state )
  • stopped: next_state <= s_low;
  • s_low: next_state <= s_medium;
  • s_medium: next_state <= s_high;
  • s_high: next_state <= s_high;
  • default: next_state <= stopped;
  • endcase
  • else next_state <= state;
  • endmodule

32
State Encoding Example
33
State Encoding
  • A state machine having N states will require at
    least a log2(N)-bit register to store the encoded
    representation of states
  • Binary and Gray encoding use the minimum number
    of bits for state register
  • Gray and Johnson code
      • Two adjacent codes differ by only one bit
      • Reduce simultaneous switching
      • Reduce crosstalk
      • Reduce glitch
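
The minimum register width and the one-bit-difference property of Gray code can be checked with a short sketch. It uses the standard binary-reflected Gray code; the helper names are hypothetical.

```python
# Minimum state-register width and binary-reflected Gray code.
# Helper names are illustrative assumptions.

def min_state_bits(n_states):
    """Smallest register width that can encode n_states distinct states."""
    return max(1, (n_states - 1).bit_length())

def gray_code(i):
    """Binary-reflected Gray code of index i."""
    return i ^ (i >> 1)

def bits_changed(x, y):
    """Number of bit positions in which two codes differ."""
    return bin(x ^ y).count("1")

n = 4
codes = [gray_code(i) for i in range(n)]
print(min_state_bits(n), [format(c, "02b") for c in codes])
# Adjacent codes (including the wrap-around) differ in exactly one bit.
print([bits_changed(codes[i], codes[(i + 1) % n]) for i in range(n)])
```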

34
One-hot Encoding
  • Employ one bit register for each state
  • Less combinational logic to decode
  • Consume greater area, does not matter for certain
    hardware such as FPGA
  • Easier for design, friendly to incremental change
  • case and if statement may give different result
    for one-hot encoding
  • Runs faster
  • `define state_0 3'b001
  • `define state_1 3'b010
  • `define state_2 3'b100

35
Synthesis
36
Unexpected and Unwanted Latch
  • Combinational logic must specify output value for
    all input values
  • Incomplete case statements and conditionals (if)
    imply
      • Output should retain value for unspecified input
        values
      • Unwanted latches

37
Example of Unwanted Latch
  • module myMux( y, selA, selB, a, b );
  • input selA, selB, a, b;
  • output y;
  • reg y;
  • always @ ( selA or selB or a or b )
  • case ( {selA, selB} )
  • 2'b10: y = a;
  • 2'b01: y = b;
  • endcase
  • endmodule

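[Figure: synthesized circuit in which a and b are selected by selA and selB and feed a latch with enable en derived from selA and selB; the latch output is y]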
38
Synthesis of Register Variables
  • A hardware register will be generated for a
    register variable when
  • It is referenced before value is assigned in a
    behavior
  • Assigned value in an edge-sensitive behavior and
    is referenced by an assignment outside the
    behavior
  • Assigned value in one clock cycle and referenced
    in another clock cycle
  • Multi-phased latches may not be supported in
    synthesis

39
Synthesis of Arithmetic Operators
  • If corresponding library cell exists, an operator
    will be directly mapped to it
  • Synthesis tool may select among different options
    in library cell, for example, when synthesizing an
    adder
      • Small wordlength -> ripple-carry adder
      • Long wordlength -> carry-look-ahead adder
      • Need small area -> bit-serial adder
  • Implementation of * and /
      • May be inefficient when both operands are
        variables
      • If a multiplier or the divisor is a power of two,
        can be implemented through shift register

40
Synthesis of fork ... join Blocks
  • Synthesis tools may
      • Either fail
      • Or require that the block contain no event or
        delay controls equal to or longer than a clock
        cycle, so that it is equivalent to a set of
        non-blocking assignments

41
Static Loops without Internal Timing Controls =>
Combinational Logic
  • module count1sA ( bit_cnt, data, clk, rst );
  • parameter data_width = 4; parameter cnt_width = 3;
  • output [cnt_width-1:0] bit_cnt;
  • input [data_width-1:0] data; input clk, rst;
  • reg [cnt_width-1:0] cnt, bit_cnt, i; reg [data_width-1:0] tmp;
  • always @ ( posedge clk )
  • if ( rst ) begin cnt = 0; bit_cnt = 0; end
  • else begin cnt = 0; tmp = data;
  • for ( i = 0; i < data_width; i = i + 1 )
  • begin
  • if ( tmp[0] ) cnt = cnt + 1;
  • tmp = tmp >> 1; end
  • bit_cnt = cnt;
  • end
  • endmodule

42
Static Loops with Internal Timing Controls =>
Sequential Logic
  • module count1sB ( bit_cnt, data, clk, rst );
  • parameter data_width = 4; parameter cnt_width = 3;
  • output [cnt_width-1:0] bit_cnt;
  • input [data_width-1:0] data; input clk, rst;
  • reg [cnt_width-1:0] cnt, bit_cnt, i; reg [data_width-1:0] tmp;
  • always @ ( posedge clk )
  • if ( rst ) begin cnt = 0; bit_cnt = 0; end
  • else begin
  • cnt = 0; tmp = data;
  • for ( i = 0; i < data_width; i = i + 1 )
  • @ ( posedge clk )
  • begin if ( tmp[0] ) cnt = cnt + 1;
  • tmp = tmp >> 1; end
  • bit_cnt = cnt;
  • end
  • endmodule

43
Non-Static Loops without Internal Timing Controls
=> Not Synthesizable
  • module count1sC ( bit_cnt, data, clk, rst );
  • parameter data_width = 4; parameter cnt_width = 3;
  • output [cnt_width-1:0] bit_cnt;
  • input [data_width-1:0] data; input clk, rst;
  • reg [cnt_width-1:0] cnt, bit_cnt, i; reg [data_width-1:0] tmp;
  • always @ ( posedge clk )
  • if ( rst ) begin cnt = 0; bit_cnt = 0; end
  • else begin
  • cnt = 0; tmp = data;
  • for ( i = 0; | tmp; i = i + 1 ) // loop count depends on data
  • begin if ( tmp[0] ) cnt = cnt + 1;
  • tmp = tmp >> 1; end
  • bit_cnt = cnt;
  • end
  • endmodule

44
Non-Static Loops with Internal Timing Controls =>
Sequential Logic
  • module count1sD ( bit_cnt, data, clk, rst );
  • parameter data_width = 4; parameter cnt_width = 3;
  • output [cnt_width-1:0] bit_cnt;
  • input [data_width-1:0] data; input clk, rst;
  • reg [cnt_width-1:0] cnt, bit_cnt, i; reg [data_width-1:0] tmp;
  • always @ ( posedge clk )
  • if ( rst ) begin cnt = 0; bit_cnt = 0; end
  • else begin: bit_counter
  • cnt = 0; tmp = data;
  • while ( tmp )
  • @ ( posedge clk ) begin
  • if ( rst ) begin cnt = 0; disable bit_counter; end
  • else begin cnt = cnt + tmp[0]; tmp = tmp >> 1; end
  • bit_cnt = cnt;
  • end
  • end
  • endmodule

45
VHDL
46
Example
  • -- eqcomp4 is a four bit equality comparator
  • -- Entity declaration
  • entity eqcomp4 is
  • port ( a, b : in bit_vector( 3 downto 0 );
  • equals : out bit ); -- equals is active high
  • end eqcomp4;
  • -- Architecture body
  • architecture dataflow of eqcomp4 is
  • begin
  • equals <= '1' when ( a = b ) else '0';
  • end dataflow;

47
Behavioral Descriptions
  • library ieee;
  • use ieee.std_logic_1164.all;
  • entity eqcomp4 is port (
  • a, b : in std_logic_vector( 3 downto 0 );
  • equals : out std_logic );
  • end eqcomp4;
  • architecture behavioral of eqcomp4 is
  • begin
  • comp: process ( a, b ) -- sensitivity list
  • begin
  • if a = b then equals <= '1';
  • else equals <= '0'; -- sequential assignment
  • end if;
  • end process comp;
  • end behavioral;

48
Dataflow Descriptions
  • library ieee;
  • use ieee.std_logic_1164.all;
  • entity eqcomp4 is port (
  • a, b : in std_logic_vector( 3 downto 0 );
  • equals : out std_logic );
  • end eqcomp4;
  • architecture dataflow of eqcomp4 is
  • begin
  • equals <= '1' when ( a = b ) else '0';
  • end dataflow;
  • -- No process
  • -- Concurrent assignment

49
Structural Descriptions
  • library ieee;
  • use ieee.std_logic_1164.all;
  • entity eqcomp4 is port (
  • a, b : in std_logic_vector( 3 downto 0 );
    equals : out std_logic );
  • end eqcomp4;
  • use work.gatespkg.all;
  • architecture struct of eqcomp4 is
  • signal x : std_logic_vector( 0 to 3);
  • begin
  • u0: xnor2 port map ( a(0), b(0), x(0) ); -- component instantiation
  • u1: xnor2 port map ( a(1), b(1), x(1) );
  • u2: xnor2 port map ( a(2), b(2), x(2) );
  • u3: xnor2 port map ( a(3), b(3), x(3) );
  • u4: and4 port map ( x(0), x(1), x(2), x(3), equals );
  • end struct;

50
Test and Design For Testability
51
Single Stuck-at Fault
  • Three properties define a single stuck-at fault
      • Only one line is faulty
      • The faulty line is permanently set to 0 or 1
      • The fault can be at an input or output of a gate
  • Example: XOR circuit has 12 fault sites and
    24 single stuck-at faults

[Figure: XOR circuit with lines a through k and output z, annotated with good circuit values and faulty circuit values in parentheses; test vector for the h s-a-0 fault]
52
Stuck-Open Example
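Vector 1: test for A s-a-0 (initialization vector)
Vector 2: test for A s-a-1
Two-vector s-op test can be constructed by ordering two s-at tests
[Figure: CMOS gate with pMOS FETs to VDD and nMOS FETs below, inputs A and B, output C, and one transistor stuck-open. Applied vectors: A = 1 then 0, B = 0 then 0. Output C = 0 then 1 in the good circuit, Z in the faulty circuit]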
53
Stuck-Short Example
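Test vector for A s-a-0
[Figure: CMOS gate with PFETs to VDD and NFETs below, inputs A = 1 and B = 0, output C, and one transistor stuck-short creating an IDDQ path in the faulty circuit. Good circuit state C = 0, faulty circuit state X]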
54
Test Pattern for Stuck-At Faults
Ygood = (a · b · c)
With a stuck-at-1 (SA1): Ya-SA1 = (b · c)
No need to enumerate all input combinations to
detect a fault
Test pattern: a,b,c = 011
55
Fault Simulation
  • Fault simulation problem. Given
      • A circuit
      • A sequence of test vectors
      • A fault model
  • Determine
      • Fault coverage: fraction (or percentage) of
        modeled faults detected by test vectors
      • Set of undetected faults
  • Motivation
      • Determine test quality and in turn product
        quality
      • Find undetected fault targets to improve tests
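
The fault simulation problem can be sketched on a tiny circuit. The example below assumes a 3-input AND gate y = a · b · c with single stuck-at faults on its four lines; the circuit choice, function names and vector sets are illustrative only.

```python
# Single stuck-at fault simulation for a 3-input AND gate (illustrative circuit).

LINES = ("a", "b", "c", "y")
ALL_FAULTS = [(line, value) for line in LINES for value in (0, 1)]

def and3(a, b, c, fault=None):
    """Evaluate y = a & b & c; fault is (line, stuck_value) or None."""
    vals = {"a": a, "b": b, "c": c}
    if fault is not None and fault[0] in vals:
        vals[fault[0]] = fault[1]          # force the faulty input line
    y = vals["a"] & vals["b"] & vals["c"]
    if fault is not None and fault[0] == "y":
        y = fault[1]                       # force the faulty output line
    return y

def detects(vector, fault):
    """A vector detects a fault if good and faulty outputs differ."""
    return and3(*vector) != and3(*vector, fault=fault)

def fault_coverage(vectors, faults):
    """Return (coverage fraction, sorted list of undetected faults)."""
    detected = {f for f in faults if any(detects(v, f) for v in vectors)}
    return len(detected) / len(faults), sorted(set(faults) - detected)

print(fault_coverage([(0, 1, 1)], ALL_FAULTS))
print(fault_coverage([(0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)], ALL_FAULTS))
```

The undetected list is what a test generator would target next to improve the vector set.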

56
Goal of Design for Testability (DFT)
  • Improve
  • Controllability
  • Observability
  • Predictability

57
Scan Storage Cell
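[Figure: scan storage cell (SSC) with data input D, scan input Si, normal/test select N/T, clock Clk, and output Q shared with scan output So]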
58
Integrated Serial Scan
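[Figure: combinational logic with primary inputs (PI) and primary outputs (PO); scan flip-flops (SFF) chained from SCANIN to SCANOUT under a common control signal]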
59
Interconnect Timing Optimization
60
Buffers Reduce Wire Delay
t_unbuf = R( cx + C ) + rx( cx/2 + C )
t_buf = 2R( cx/2 + C ) + rx( cx/4 + C ) + tb
t_buf - t_unbuf = RC + tb - rcx^2/4
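
The difference formula can be confirmed numerically. This sketch reads R as driver (and buffer) output resistance, C as load (and buffer input) capacitance, r and c as wire resistance and capacitance per unit length, x as wire length and tb as buffer intrinsic delay; those readings and the numbers are assumptions for illustration.

```python
# Numeric check of the buffered vs. unbuffered wire delay expressions.
# Symbol interpretations and example values are assumptions.

def t_unbuf(R, C, r, c, x):
    return R * (c * x + C) + r * x * (c * x / 2 + C)

def t_buf(R, C, r, c, x, tb):
    # One buffer in the middle: two half-length segments plus buffer delay.
    return 2 * R * (c * x / 2 + C) + r * x * (c * x / 4 + C) + tb

def delay_change(R, C, r, c, x, tb):
    """Closed form for t_buf - t_unbuf; negative means the buffer helps."""
    return R * C + tb - r * c * x ** 2 / 4

p = dict(R=1.0, C=1.0, r=1.0, c=1.0, x=4.0)
print(t_unbuf(**p), t_buf(tb=1.0, **p), delay_change(tb=1.0, **p))
```

Because the saving grows with x squared while the cost RC + tb is fixed, the buffer only pays off on sufficiently long wires.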
61
Buffers Improve Slack
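RAT = Required Arrival Time; Slack = RAT - Delay
Without buffer:
  • Sink 1: RAT = 300, Delay = 350, Slack = -50
  • Sink 2: RAT = 700, Delay = 600, Slack = 100
  • slackmin = -50
With buffer (decouples capacitive load from critical path):
  • Sink 1: RAT = 300, Delay = 250, Slack = 50
  • Sink 2: RAT = 700, Delay = 400, Slack = 300
  • slackmin = 50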
62
Candidate Solution Characteristics
  • Each candidate solution is associated with
      • vi: a node
      • ci: downstream capacitance
      • qi: RAT

63
Van Ginneken's Algorithm
  • Start from sinks
  • Candidate solutions are generated

64
Solution Pruning
  • Two candidate solutions
      • (v, c1, q1)
      • (v, c2, q2)
  • Solution 1 is inferior if
      • c1 > c2: larger load
      • and q1 < q2: tighter timing
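
The pruning rule can be applied to a whole candidate list at one node. This is a minimal quadratic-time sketch using exactly the strict inequalities stated above; the function names and candidate values are hypothetical, and practical implementations usually sort candidates to prune faster.

```python
# Pruning of inferior candidate solutions at a single node.
# Each candidate is (c, q): downstream capacitance and required arrival time.

def is_inferior(s1, s2):
    """s1 is inferior to s2 if it has larger load and tighter timing."""
    c1, q1 = s1
    c2, q2 = s2
    return c1 > c2 and q1 < q2

def prune(candidates):
    """Keep only candidates that are not inferior to any other candidate."""
    return [s for s in candidates
            if not any(is_inferior(s, t) for t in candidates)]

cands = [(10, 100), (20, 90), (20, 150), (5, 80)]
print(prune(cands))
```

Here (20, 90) is dropped because (10, 100) offers both a smaller load and a later required arrival time.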

65
Slew Constraints
  • When a buffer is inserted, assume ideal slew rate
    at its input
  • Check slew rate at downstream buffers/sinks
  • If slew is too large, candidate is discarded

66
Cost-Slack Trade-off
67
Continuous Wire Sizing
Min-delay wire shape: w(x) = a · e^(-bx), where x is the position along the wire
68
Wire Sizing Monotone Property
  • Ancestor edges cannot be narrower than downstream
    edges

69
Simultaneous Buffer Insertion and Wire Sizing
70
Area or Radius?
Radius: the longest source-sink path length
  • Dijkstra's shortest path tree
      • Short path to sinks
      • Large total wire length
  • Prim's minimum spanning tree
      • Small total wire length
      • Long path to sinks

71
Area Radius Trade-off
  • Find a solution in middle
  • Not too much area
  • Not too long radius
  • How to find an ideal point?

72
Prim's and Dijkstra's Algorithms
  • d(i,j): length of edge (i, j)
  • p(i): length of path from source to i
  • Prim: min d(i,j); Dijkstra: min d(i,j) + p(i)

[Figure: path of length p(i) from the source to tree node i, and candidate edge (i, j)]
73
The Prim-Dijkstra Trade-off
  • Prim: add edge minimizing d(i,j)
  • Dijkstra: add edge minimizing p(i) + d(i,j)
  • Trade-off: c·p(i) + d(i,j) for 0 ≤ c ≤ 1
  • When c = 0, trade-off = Prim
  • When c = 1, trade-off = Dijkstra
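
The trade-off rule amounts to a one-line change in the edge-selection key of a greedy tree builder. The sketch below assumes Manhattan distances between points, with point 0 as the source; the function names and coordinates are made up for illustration.

```python
# Prim-Dijkstra trade-off: grow a tree from the source, always adding the
# edge (i, j) that minimizes c * p(i) + d(i, j). Example points are made up.

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def prim_dijkstra(points, c):
    """Return (parent map, total wire length, radius); points[0] is the source."""
    parent = {0: None}
    path = {0: 0}                      # p(i): source-to-i path length in the tree
    while len(parent) < len(points):
        best = None
        for i in parent:               # i is already in the tree
            for j in range(len(points)):
                if j in parent:
                    continue
                d = manhattan(points[i], points[j])
                key = c * path[i] + d
                if best is None or key < best[0]:
                    best = (key, i, j, d)
        _, i, j, d = best
        parent[j] = i
        path[j] = path[i] + d
    wire_length = sum(manhattan(points[j], points[i])
                      for j, i in parent.items() if i is not None)
    return parent, wire_length, max(path.values())

pts = [(0, 0), (0, 4), (3, 3)]
print(prim_dijkstra(pts, 0)[1:])   # Prim end of the trade-off
print(prim_dijkstra(pts, 1)[1:])   # Dijkstra end of the trade-off
```

On these points the Prim end gives the shorter tree and the Dijkstra end gives the smaller radius; intermediate values of c move between the two.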

74
Spanning Tree ? Steiner Tree
75
P-Tree Abstract Tree
[Figure: abstract tree over nodes a through g]
76
P-Tree Embedding
[Figure: embedding of the abstract tree on the Hanan grid, with nodes a, b, c, d, h, i and j]
77
Gate Characteristics
78
I-V Characteristics
  • Cutoff region
      • Vgs < Vt
      • Ids = 0
  • Linear region
      • Vgs > Vt, 0 < Vds < Vgs - Vt
      • Ids = B [ (Vgs - Vt) Vds - Vds^2 / 2 ]
  • Saturation region
      • Vgs > Vt, 0 < Vgs - Vt < Vds
      • Ids = B (Vgs - Vt)^2 / 2
  • B ∝ W/L
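
The three regions can be written as one piecewise function. This is a sketch of the simple model stated above, with voltages in volts and B treated as a given constant; the function name and sample values are illustrative.

```python
# Piecewise drain current for the three operating regions.
# The function name and the sample operating points are illustrative.

def ids(vgs, vds, vt, beta):
    """Drain current for cutoff, linear and saturation regions."""
    if vgs <= vt:
        return 0.0                                   # cutoff
    vov = vgs - vt                                   # overdrive voltage
    if vds < vov:
        return beta * (vov * vds - vds ** 2 / 2)     # linear region
    return beta * vov ** 2 / 2                       # saturation region

print(ids(0.3, 1.0, vt=0.5, beta=2.0))   # cutoff
print(ids(1.5, 0.5, vt=0.5, beta=2.0))   # linear
print(ids(1.5, 2.0, vt=0.5, beta=2.0))   # saturation
```

At Vds = Vgs - Vt the linear and saturation expressions give the same current, so the curve is continuous across the region boundary.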

[Figure: MOSFET symbol with terminals d, g and s, and the Ids versus Vds characteristic]
79
Switching Characteristics
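[Figure: inverter switching waveforms: input Vin stepping to Vdd and output Vout versus time t, marking tfall and tdelay, alongside the Ids versus Vds trajectory]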
80
Falling and Rising Procedure
[Figure: inverter circuits for input rising and input falling; in each case the conducting transistor goes from saturation to linear region as out switches]
81
Falling Time
  • Falling time = t1 + t2
      • t1: Vout drops from 0.9Vdd to Vdd-Vt
      • t2: Vout drops from Vdd-Vt to 0.1Vdd
  • Falling time ≈ rising time ≈ k × C / (B × Vdd)
  • Delay ≈ Falling time / 2

82
Gate Power Dissipation
  • Leakage power
  • Dynamic power
  • Short circuit power

83
Leakage Power
  • Static
  • Leakage current ~ a · Vdd
  • Leakage current ~ b / Vt
  • Killer to CMOS technology

[Figure: leakage paths in an inverter for both output states]
84
Dynamic Power
  • Occurs at each switching
  • Pd = CL · Vdd^2 · fp
  • fp: switching frequency
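
A quick calculation shows the quadratic dependence on supply voltage. The load capacitance, voltages and frequency below are made-up example values, and the function name is hypothetical.

```python
# Dynamic power Pd = CL * Vdd^2 * fp; example values are made up.

def dynamic_power(c_load, vdd, f_switch):
    """Dynamic power in watts for C in farads, Vdd in volts, fp in hertz."""
    return c_load * vdd ** 2 * f_switch

p_full = dynamic_power(1e-12, 1.0, 1e9)   # 1 pF, 1.0 V, 1 GHz
p_half = dynamic_power(1e-12, 0.5, 1e9)   # same load at half the supply
print(p_full, p_half, p_half / p_full)
```

Halving Vdd cuts dynamic power to a quarter at the same load and switching frequency.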

[Figure: charging and discharging of the output node in an inverter]
85
Short Circuit Power
  • During switching, there is a short moment when
    both PMOS and NMOS are partially on
  • Ps = Q · (Vdd-Vt)^3 · tr · fp
  • tr: rising time

[Figure: short-circuit current path from Vdd through the inverter for input falling and input rising]