Title: Advanced VLSI Design
 1Advanced VLSI Design
Timing Issues 
 2Architecture of the Motorola DSP 56K family 
- 24-bit general purpose Digital Signal Processors 
- It has a dual Harvard architecture optimized for 
 MAC operations.
- It features a three stage instruction pipeline, 
 which is essentially invisible to the programmer
3Harvard architecture
- Harvard architecture refers to a memory structure 
 wherein the processor is connected to two
 independent memory banks via two independent sets
 of buses
- The key advantage of the Harvard architecture is 
 that two memory accesses can be made during any
 one instruction cycle
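As a rough illustration of why two simultaneous accesses matter for MAC-heavy code, the sketch below counts operand-fetch cycles for a multiply-accumulate loop. The cost model is an assumption made for illustration (two operands per tap, one operand per bank per cycle), and the function name `mac_fetch_cycles` is hypothetical; none of it is taken from the DSP56K documentation.

```python
def mac_fetch_cycles(taps, independent_banks):
    """Cycles spent fetching operands for a `taps`-long MAC loop.

    Assumed cost model: every tap needs two operands (a coefficient and
    a sample) and each memory bank delivers one operand per cycle.
    """
    operands_per_tap = 2
    # Ceiling division: cycles needed to fetch both operands of one tap.
    cycles_per_tap = -(-operands_per_tap // independent_banks)
    return taps * cycles_per_tap


single_bank = mac_fetch_cycles(64, independent_banks=1)
harvard = mac_fetch_cycles(64, independent_banks=2)
print(single_bank, harvard)  # 128 64
```

Under these assumptions the two-bank arrangement halves the fetch cycles of the loop.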
4Major components of the central processing module 
- Data Buses 
- Address Buses 
- Data Arithmetic Logic Unit (data ALU) 
- Address Generation Unit (AGU) 
- Program Control Unit (PCU) 
- Memory Expansion (Port A) 
- On-Chip Emulator (OnCE) circuitry 
- Phase-locked Loop (PLL) based clock circuitry
5Synchronization
- Well defined ordering of switching events for 
 circuit to operate correctly
- In synchronous system approach---all memory 
 elements are simultaneously updated using a
 global clock.
- Register based clocking (robust, reliable) , 
 latch based clocking
6Pipelining
- To accelerate the operation of data path, 
 pipelining is used
- Computation is performed in assembly line like 
 fashion
- Pipelined network outperforms original circuit 
 with respect to speed
- Macro pipeline, micro pipeline
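The speed advantage can be sketched with a simplified model: without pipelining the clock period must cover all logic stages, while with a register after every stage it only has to cover the slowest one. The stage delays and the register overhead below are made-up numbers, and `clock_period` is a hypothetical helper.

```python
def clock_period(stage_delays, register_overhead, pipelined):
    """Minimum clock period for a datapath under a simplified model.

    stage_delays: combinational delay of each logic stage (ns).
    register_overhead: assumed t_c-q + t_su of one register (ns).
    """
    if pipelined:
        # A register after every stage: the slowest stage sets the clock.
        return max(stage_delays) + register_overhead
    # No internal registers: a signal crosses all stages in one cycle.
    return sum(stage_delays) + register_overhead


stages = [2.0, 3.0, 2.5]  # hypothetical stage delays in ns
t_flat = clock_period(stages, 0.5, pipelined=False)
t_pipe = clock_period(stages, 0.5, pipelined=True)
print(t_flat, t_pipe)  # 8.0 3.5
```

The pipelined version accepts a new input every 3.5 ns instead of every 8 ns, at the cost of a latency of several cycles.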
7Pipeline PCU MACRO LEVEL 
 8Pipelined datapath MICRO-LEVEL 
 9Timing Parameters 
- Assume positive edge triggered system 
10Timing Definitions
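[Figure: timing diagram of a positive-edge-triggered register. D must be stable from t_su before the clock edge until t_hold after it; Q becomes stable t_c-q after the edge.]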
 11Timing constraint
Minimum cycle time: $T > t_{c-q} + t_{su} + t_{logic}$
Hold time constraint: $t_{hold} < t_{c-q,cd} + t_{logic,cd}$
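The two constraints can be checked numerically. The sketch below is a minimal illustration with made-up delay values in picoseconds; the function and parameter names are hypothetical, and the "cd" suffix denotes the contamination (minimum) delay.

```python
def min_cycle_time(t_cq, t_logic, t_su):
    """Lower bound on the clock period T (same unit as the inputs)."""
    return t_cq + t_logic + t_su


def hold_ok(t_hold, t_cq_cd, t_logic_cd):
    """True if the hold constraint t_hold < t_cq_cd + t_logic_cd is met."""
    return t_hold < t_cq_cd + t_logic_cd


# Hypothetical register and logic delays in picoseconds.
print(min_cycle_time(t_cq=100, t_logic=700, t_su=50))  # 850
print(hold_ok(t_hold=40, t_cq_cd=60, t_logic_cd=30))   # True
```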
 12Clock Non-idealities
- Clock skew 
- Spatial variation in arrival time of a clock 
 transition.
- It is caused by mismatches in clock path or clock 
 load
- It can be positive or negative depending upon 
 routing direction and position of clock source
- Clock skew does not result in clock period 
 variation
13Positive and Negative Skew 
 14Positive Skew
Launching edge arrives before the receiving edge 
 15Impact of positive clock skew
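Minimum cycle time: $T + \delta > t_{c-q} + t_{su} + t_{logic}$
Positive skew relaxes the cycle-time constraint because the receiving edge arrives late by $\delta$; the worst case for cycle time is when the receiving edge arrives early (negative skew).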
 16Race condition
- Hold time constraint 
- $t_{hold} + \delta < t_{c-q,cd} + t_{logic,cd}$
17Negative Skew
Receiving edge arrives before the launching edge 
 18Impact of negative clock skew
Minimum cycle time: $T - \delta > t_{c-q} + t_{su} + t_{logic}$
This is the worst case for cycle time: the receiving edge arrives early by $\delta$ (here $\delta$ is the magnitude of the skew).
 19No Race condition
- The probability of a race condition is reduced or eliminated
- $t_{hold} - \delta < t_{c-q,cd} + t_{logic,cd}$
- The system does not fail through a race: R2 has already stopped sampling (its clock edge arrived first) before new data latched into R1 can reach it
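The opposite effects of positive and negative skew can be seen by evaluating both constraints for a signed skew value. This sketch assumes the convention that `delta > 0` means the receiving edge arrives late (positive skew) and `delta < 0` means it arrives early (negative skew); the delay numbers are invented and in picoseconds.

```python
def skewed_constraints(delta, t_cq, t_logic, t_su, t_hold, t_cq_cd, t_logic_cd):
    """Return (minimum clock period, hold constraint met) for signed skew delta."""
    # Cycle time: T + delta > t_cq + t_logic + t_su
    t_min = t_cq + t_logic + t_su - delta
    # Hold (race) check: t_hold + delta < t_cq_cd + t_logic_cd
    race_free = t_hold + delta < t_cq_cd + t_logic_cd
    return t_min, race_free


delays = dict(t_cq=100, t_logic=700, t_su=50, t_hold=40, t_cq_cd=60, t_logic_cd=30)
print(skewed_constraints(+50, **delays))  # (800, False)
print(skewed_constraints(-50, **delays))  # (900, True)
```

With these numbers, positive skew shortens the usable clock period but violates the hold constraint, while negative skew costs cycle time and leaves the hold constraint comfortably met.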
20Clock Non-idealities
- Clock jitter 
- Temporal variation of the clock period at a given point on the chip, i.e. the clock period shrinks or expands on a cycle-by-cycle basis
- Absolute jitter ($t_{jitter}$): worst-case variation of a clock edge at a given location with respect to an ideal clock
- Worst case: $T_{clk}$ is reduced by $2t_{jitter}$
- Cycle-to-cycle jitter ($T_{jitter}$): deviation of a single clock period relative to an ideal clock
21Impact of Jitter---always slows down 
 22Clock Non-idealities
- Variation of the pulse width 
- Important for level sensitive clocking 
23Combined Impact
- Minimum time available (neg skew) 
- $T_{clk} - \delta - 2t_{jitter} > t_{c-q} + t_{logic} + t_{su}$
-  or 
- $T_{clk} > t_{c-q} + t_{logic} + t_{su} + \delta + 2t_{jitter}$
24Hold time constraint (pos skew)
$t_{hold} + t_{jitter} + \delta < -t_{jitter} + t_{c-q,cd} + t_{logic,cd}$
- Minimum time available (pos skew) 
- $T_{clk} + \delta - 2t_{jitter} > t_{c-q} + t_{logic} + t_{su}$
- $T_{clk} > t_{c-q} + t_{logic} + t_{su} - \delta + 2t_{jitter}$
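Taken together, the cycle-time and hold constraints bound the skew a design can tolerate at a given clock period. The sketch below solves both inequalities for a signed skew (positive when the receiving edge is late, which is an assumed convention); all numbers are hypothetical and in picoseconds.

```python
def skew_window(t_clk, t_cq, t_logic, t_su, t_hold, t_cq_cd, t_logic_cd, t_jitter):
    """Return (lower, upper): the design works for lower < skew < upper."""
    # From T_clk + skew - 2*t_jitter > t_cq + t_logic + t_su
    lower = t_cq + t_logic + t_su + 2 * t_jitter - t_clk
    # From t_hold + skew + 2*t_jitter < t_cq_cd + t_logic_cd
    upper = t_cq_cd + t_logic_cd - t_hold - 2 * t_jitter
    return lower, upper


window = skew_window(t_clk=1000, t_cq=100, t_logic=700, t_su=50,
                     t_hold=40, t_cq_cd=60, t_logic_cd=30, t_jitter=10)
print(window)  # (-130, 30)
```

In this example the design tolerates up to 130 ps of negative skew before the cycle time is violated, but only 30 ps of positive skew before a race occurs; jitter narrows the window from both sides.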
25Clock Skew and Jitter
[Figure: clock waveforms annotated with skew t_SK and jitter t_JS]
- Both skew and jitter affect the effective cycle 
 time
- Only skew affects race condition 
26Sources of skew and jitter
- Clock signal generation 
- Manufacturing device variations 
- Interconnect variations 
- Environmental variations 
- Capacitive coupling 
- Design clock distribution network carefully
27Latch-Based Design
 L2 latch is transparent when clk φ = 1
 L1 latch is transparent when clk φ = 0
[Figure: L1 latch, logic block, and L2 latch clocked by φ]
Latch is a soft barrier 
 28Performance Similar to 
 29Slack borrowing
- Enhanced performance due to flexible timing, yet 
 no design changes
- Possible for logic block to utilize time that is 
 left over from the previous logic block.
- Total logic delay can be more than one clock 
 cycle
 31Reg based vs. latch based--example
  32Less Tclk 
 33Maximum slack possible
- The maximum time that can be borrowed is 0.5 Tclk 
- So the maximum logic cycle delay can be 1.5 Tclk 
- But for n stages the overall delay is still limited to n Tclk
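Slack borrowing can be sketched with a simple event model of a two-phase latch pipeline. The model below is an assumption made for illustration: latch i is transparent from i·Tclk/2 to (i+1)·Tclk/2, setup time and latch delay are ignored, and data leaves latch 0 at t = 0. The function name and the delay values are hypothetical.

```python
def latch_pipeline_arrivals(logic_delays, t_clk):
    """Arrival time at each latch, or None if any arrival misses its window."""
    half = t_clk / 2.0
    depart = 0.0  # data leaves latch 0 when it opens at t = 0
    arrivals = []
    for i, delay in enumerate(logic_delays, start=1):
        arrive = depart + delay
        opens, closes = i * half, (i + 1) * half
        if arrive > closes:
            return None  # data arrives after latch i has closed
        arrivals.append(arrive)
        # Early data waits for the latch to open; late data passes through
        # at once, borrowing time from the next logic block.
        depart = max(arrive, opens)
    return arrivals


print(latch_pipeline_arrivals([7, 3, 6, 4], t_clk=10))  # [7.0, 10.0, 16.0, 20.0]
print(latch_pipeline_arrivals([7, 9], t_clk=10))        # None
```

In the first call, blocks with delays 7 and 6 exceed their nominal half cycle of 5 and borrow from their successors, yet the total of 20 still fits in two clock cycles. The second call fails because the two-block logic cycle of 16 exceeds 1.5 Tclk.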
34Drawbacks
- A two-phase clocking scheme has to be used 
- Glitches: power dissipation increases
35Asynchronous systems--Self timed approach
- Synchronous systems 
- Logical ordering of events is set by the clock, which provides a time base
- Physical timing constraint: the next edge comes only when all blocks have reached steady state
- Problem: a CLB has to wait even though it may finish earlier. The clock distribution network is a further problem
36Asynch. design: meeting constraints
- Advantage: the next block can start computation as soon as the previous block has finished.
- Problem: when should the output be latched? When is the output a correct value?
- Remedy: the system has to meet timing constraints 
37Local signals
- Logical ordering and physical timing -- 
- START, DONE, -- physical timing 
- REQUEST , ACKNOWLEDGE - Logical ordering 
38Self timed system
- The system generates its own timing signals 
39Self timed system --Hand shake protocol
- Handshaking: synchronize by mutual agreement 
- Advantages: timing signals are generated locally, so less propagation delay, high speed, no global clock routing
- Disadvantage: the handshaking circuit has to be designed
40Implementation of HS protocol-2 phase 
 41 4 phase protocol 
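The difference between the two handshake styles can be sketched by listing the (Req, Ack) levels after each event. The event order used here is the commonly described one (the sender changes Req, then the receiver changes Ack); it is an assumption, since the slides only name the protocols, and the function names are hypothetical.

```python
def four_phase_transfers(n):
    """(req, ack) after each event; signals return to zero after every transfer."""
    trace = []
    for _ in range(n):
        trace += [(1, 0), (1, 1), (0, 1), (0, 0)]  # Req up, Ack up, Req down, Ack down
    return trace


def two_phase_transfers(n):
    """(req, ack) after each event; every transition signals an event."""
    req = ack = 0
    trace = []
    for _ in range(n):
        req ^= 1                 # sender toggles Req: data is ready
        trace.append((req, ack))
        ack ^= 1                 # receiver toggles Ack: data is taken
        trace.append((req, ack))
    return trace


print(len(four_phase_transfers(1)), len(two_phase_transfers(1)))  # 4 2
print(two_phase_transfers(2) == four_phase_transfers(1))          # True
```

In this model a 2-phase transfer needs half as many signal events as a 4-phase transfer, so the same four events carry two 2-phase transfers.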
 42Dual rail protocol
- 1 bit of information is coded using two wires 
- Request is merged with data wires
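A minimal sketch of dual-rail coding follows. It assumes the common convention of a "true" rail and a "false" rail per bit, with both rails low as the spacer (no data) state; the wire order and the names are illustrative, not taken from the slides.

```python
SPACER = (0, 0)  # assumed "no data" state: both rails low


def encode_bit(bit):
    """Return (true_rail, false_rail) for one data bit."""
    return (1, 0) if bit else (0, 1)


def is_valid(pair):
    """Data is present when exactly one rail is high; this acts as the request."""
    return pair[0] != pair[1]


def decode_bit(pair):
    """Recover the bit from a valid dual-rail pair."""
    if not is_valid(pair):
        raise ValueError("no valid data on the wire pair")
    return pair[0]


word = [encode_bit(b) for b in (1, 0, 1)]
print(word)                            # [(1, 0), (0, 1), (1, 0)]
print(all(is_valid(p) for p in word))  # True
print(is_valid(SPACER))                # False
```

Because validity is visible on the data wires themselves, the receiver needs no separate request line.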
43Bundled data protocol 
 44Event Logic  The Muller-C Element 
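The behaviour of a Muller C-element can be sketched in a few lines: the output follows the inputs when they agree and holds its previous value when they differ. The class below is a behavioural model only, with an assumed initial output of 0; it says nothing about the transistor-level implementation.

```python
class MullerC:
    """Behavioural model of a two-input Muller C-element."""

    def __init__(self, state=0):
        self.out = state  # assumed initial output

    def update(self, a, b):
        if a == b:        # both inputs agree: output follows them
            self.out = a
        return self.out   # inputs differ: output holds its value


c = MullerC()
print([c.update(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]])  # [0, 1, 1, 0]
```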
 45 4-Phase bundled data Protocol--FIFO 
 46 2-Phase bundled data Protocol--FIFO 
 48 4-Phase dual rail Protocol--FIFO 
 49 2-Phase dual rail Protocol--FIFO
[Figure, slides 49-50: self-timed FIFO stages with signals start, data, Done / Req, and Ack]
 54Completion Signal Generation: no glitches 
 55Completion Signal in DCVSL 
 56Self-Timed Adder--example 
 57Bundled data protocol 
 58Memory element design 
 59PERFORMANCE PARAMETERS
- CLOCK LOAD 
- NO OF TRANSISTORS 
- CLOCKING SCHEME
60Latch versus Register
- Latch 
-  level sensitive: stores (holds) data while the clock is low (positive latch) or high (negative latch) 
- Register 
-  stores data when clock rises 
[Figure: latch and register symbols (D, Q, Clk) with the corresponding D and Q waveforms]
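The difference between the two elements shows up when D changes while the clock is high. The behavioural sketch below assumes a positive latch (transparent when the clock is 1), a rising-edge register, and an initial output of 0; the class names and the sample waveform are hypothetical.

```python
class PositiveLatch:
    """Level sensitive: Q follows D while clk is 1, holds while clk is 0."""

    def __init__(self):
        self.q = 0

    def step(self, clk, d):
        if clk == 1:
            self.q = d
        return self.q


class RisingEdgeRegister:
    """Edge triggered: Q samples D only on a 0 -> 1 clock transition."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def step(self, clk, d):
        if self._prev_clk == 0 and clk == 1:
            self.q = d
        self._prev_clk = clk
        return self.q


clk = [0, 1, 1, 0, 1]
d = [1, 1, 0, 0, 1]
latch, reg = PositiveLatch(), RisingEdgeRegister()
print([latch.step(c, x) for c, x in zip(clk, d)])  # [0, 1, 0, 0, 1]
print([reg.step(c, x) for c, x in zip(clk, d)])    # [0, 1, 1, 1, 1]
```

In the third sample D falls while the clock is still high: the latch output follows it, while the register keeps the value it captured on the rising edge.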
 61Storage Mechanisms
Dynamic (charge-based)
Static
[Figure: static and dynamic (charge-based) storage cells with D, Q, and CLK]
 62Static: Mux-Based Latch (1): $Q = \overline{CLK} \cdot Q + CLK \cdot D$ (positive latch form)
CLK LOAD: 4, 2-PHASE CLOCKING, 10 TRANSISTORS 
 63Mux-Based Latch(2)-LESS CLK LOAD , 
CLK LOAD-2, 2 PHASE CLOCKING, 6-TRANSISTORS 
 64Mux-Based Latch(3)-LESS CLK LOAD , Vt DEGRADATION
Non-overlapping clocks
NMOS only 
 65Master-Slave (Edge-Triggered) Register
Two opposite latches trigger on edge. Also called 
master-slave latch pair 
 66Master-Slave Register
Multiplexer-based latch pair 
 67TIMING METRICS
- T set up = I1 + T1 + I3 + I2 (sum of the delays of the named inverters I and transmission gates T)
- T CLK-Q = T3 + I6 
- T HOLD = 0 
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
68Reduced Clock Load Master-Slave Register: SIZING IMPORTANT (REVERSE CONDUCTION)
[Figure: reduced-clock-load master-slave register schematic; surviving labels include CLK, Q, T, and I]
I2 MUST BE WEAK WHEN SLAVE IS ON, BECAUSE OF REVERSE CONDUCTION 
 69TIMING METRICS
- T set up = T1 + I1 
- T CLK-Q = T2 + I3 
- T HOLD = 0 (OR T1) 
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
70Avoiding Clock Overlap
[Figure: (a) schematic diagram; (b) overlapping clock pairs]
 71Non overlapping phases 
 72TIMING METRICS
- T set up = T1 + I1 
- T CLK-Q = T2 + I3 
- T HOLD = 0 (OR T1) 
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
73Overpowering the Feedback Loop -Cross-Coupled 
Pairs
NOR-based set-reset 
 74Cross-Coupled NAND
Added clock
Cross-coupled NANDs
This is not used in datapaths any more, but is a 
basic building block for memory cells 
 75Dynamic registers 
 76TIMING METRICS
- T set up = T1 
- T CLK-Q = I1 + T2 + I2 
- T HOLD = 0 (OR T1) 
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION 
- IN OVERLAP--
77OVERLAPS 
 78Other Latches/Registers C2MOS
Keepers can be added to make circuit 
pseudo-static 
 79Insensitive to Clock-Overlap
[Figure: C2MOS register under clock overlap, (a) (0-0) overlap and (b) (1-1) overlap; transistors M1 to M8, supply VDD, internal node X, input D, output Q]
 80Dual edge registers 
 81Single phase clock Latches/Registers TSPC
Negative latch (transparent when CLK = 0)
Positive latch (transparent when CLK = 1) 
 82Including Logic in TSPC
Example logic inside the latch
AND latch 
 83Reduced complexity 
 84TSPC Register 
 85Pulse-Triggered Latches: An Alternative Approach
Ways to design an edge-triggered sequential cell
Master-Slave Latches
Pulse-Triggered Latch
[Figure: master-slave latch pair L1/L2 and a single pulse-triggered latch L, each drawn with Data, D, Q, and Clk]
 86Pulsed register-avoid race, single latch 
 87Pulsed Latches 
 88Sense amplifier based register