Title: The Design of High Performance Digital Systems Using Threshold Logic
1 The Design of High Performance Digital Systems Using Threshold Logic
Technical University Delft, February 2002
Peter Celinski, D. Abbott, S. F. Al-Sarawi
Centre for High Performance Integrated Technologies and Systems (CHiPTec), Adelaide University, Australia
J. F. López
Instituto Universitario de Microelectronica Aplicada (IUMA), Universidad de Las Palmas de G.C., Spain
2 Presentation Outline
- Brief introduction to Threshold Logic (TL)
- Problems with existing TL implementations
- Development of CRTL
- Development of STTL
- Proposed CLA architecture
- Novel Parallel Counter Design using STTL
- Brief conclusion
3 Introduction to Threshold Logic
- Dates back to 1950s - J. von Neumann
- The (Linear) Threshold Logic Gate model
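The gate model named above can be illustrated with a minimal behavioural sketch. It assumes the usual textbook convention that the output is 1 when the weighted sum of the binary inputs reaches the threshold T (greater than or equal). The function names `threshold_gate`, `and3`, `or3` and `maj3` are hypothetical and chosen only for this example.

```python
from typing import Sequence


def threshold_gate(inputs: Sequence[int], weights: Sequence[int], threshold: int) -> int:
    """Linear threshold gate: 1 if sum(w_i * x_i) >= threshold, else 0.

    Assumption: the 'greater than or equal' convention is used here.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# With unit weights, the same gate realises AND, OR or majority
# purely by changing the threshold.
def and3(a: int, b: int, c: int) -> int:
    return threshold_gate([a, b, c], [1, 1, 1], 3)


def or3(a: int, b: int, c: int) -> int:
    return threshold_gate([a, b, c], [1, 1, 1], 1)


def maj3(a: int, b: int, c: int) -> int:
    return threshold_gate([a, b, c], [1, 1, 1], 2)
```

Changing only the threshold turns one gate structure into different Boolean functions, which is why a single high fan-in TL gate can replace several conventional gates.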
4 CMOS TL implementations
5 CMOS TL implementations
- Capacitive Threshold-Logic
6 Motivation for interest in TL
- Some motivating reported results
- neuron-MOS full adder: 50% of the area of a CMOS full adder
- neuron-MOS based (7,3) parallel counter based multipliers: 30% faster than CMOS full adder based multipliers
- High performance CTL based (31,5) parallel counter demonstrated
- etc.
- TL designs offer, in some cases, significant area, speed and power advantages over conventional logic
- AIM - To overcome drawbacks in existing TL techniques and to develop high performance TL based applications
7 Motivation for interest in TL
- Classic example - the full adder cell
- CMOS vs TL - 28 transistors vs. only 8
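The full adder comparison can be made concrete at the logic level. The sketch below uses the well-known two-gate threshold decomposition of a full adder: carry is the majority of the three inputs, and sum re-uses the same inputs with the carry fed back at weight -2. It is a behavioural illustration only and makes no claim about the transistor-level circuit counted on the slide. The names `tl_gate`, `tl_full_adder` and `check_full_adder` are hypothetical.

```python
from itertools import product


def tl_gate(weighted_inputs, threshold):
    """Threshold gate over (weight, bit) pairs; assumes the >= convention."""
    return 1 if sum(w * x for w, x in weighted_inputs) >= threshold else 0


def tl_full_adder(a, b, cin):
    """Full adder built from two threshold gates."""
    # Gate 1: carry is 1 when at least two of the three inputs are 1.
    carry = tl_gate([(1, a), (1, b), (1, cin)], 2)
    # Gate 2: subtracting 2 * carry leaves exactly the sum bit.
    s = tl_gate([(1, a), (1, b), (1, cin), (-2, carry)], 1)
    return s, carry


def check_full_adder():
    """Exhaustively compare the two-gate adder with integer addition."""
    for a, b, cin in product((0, 1), repeat=3):
        s, carry = tl_full_adder(a, b, cin)
        if 2 * carry + s != a + b + cin:
            return False
    return True
```

Calling `check_full_adder()` walks all eight input combinations and confirms that the two gates reproduce binary addition.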
8 Some problems with existing TL techniques
- Relatively high power dissipation and low speed
- High power dissipation due to
- short circuit current (neu-MOS),
- large clock loads (CTL),
- multi-phase non-overlapping clock (CCCL)
- Low speed due to
- low current available to charge/discharge load
- high internal gate capacitance (LPTL)
9 Charge Recycling Threshold Logic
- Based on Asynchronous Sense Differential Logic
developed by Bai-Sun et al.
10 Operation of CRTL Gate
- 2 phases of operation - Equalize and Evaluate
- E = 0, Ei = 1 - Equalize outputs Y and Yi to Vdd/2
- E = 1, Ei = 0 - Evaluate outputs Y and Yi, depending on whether the weighted input sum > T, after the delay of the Enable inverters
- Charge recycling because charge previously drawn from Vdd is re-used during equalization
- Fast sensing
- Low power
11 CRTL performance evaluation
- 20 input majority gate
- Comparison in 0.25 µm process using CRTL, CCCL, CTL and LPTL
- Cunit = 5 fF, 50-300 MHz clock, power compared with transistors sized for equal delay
- CRTL improved power dissipation by 15-30% at 200 MHz
- 45 corners tested (process, Vdd, and temp) - robust.
12 CRTL power dissipation comparison
13 Self Timed Threshold Logic (STTL)
- Based on differential sense amplifier latch
- Self-timed domino data propagation
- Relatively high speed and low power
- Negative weights possible
- Single phase operation
14 Operation of STTL Gate
- 2 phases of operation - Equalize and Evaluate
- E = 1, Eb = 0 - Equalize nodes A and B to Gnd
- E = 0, Eb = 1 - Evaluate outputs A and B, depending on whether the weighted input sum > T
- All STTL gates in pipeline equalize simultaneously
- Evaluation of 1st gate causes 2nd gate to also evaluate, then 3rd etc.
- Simulations indicate operation at over 400 MHz
15 Carry Lookahead Addition
- O(log2 w) logic depth, one of the fastest, large range of trade-offs possible in terms of latency, power and area
- In prefix notation: definitions of carry generate, carry propagate, carry and sum, combined using the Brent-Kung operator [equations not shown]
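The slide's equations did not survive, so the following is a sketch using the standard textbook definitions rather than a reconstruction of the slide. It assumes bit generate g = a AND b, bit propagate p = a XOR b, and the prefix operator (g, p) o (g', p') = (g OR (p AND g'), p AND p'). The names `bits`, `bk_operator` and `prefix_add` are hypothetical. The scan is written serially for readability; since the operator is associative, the same combination can be arranged as a tree with O(log2 w) levels.

```python
def bits(value, width):
    """LSB-first list of the low `width` bits of `value`."""
    return [(value >> i) & 1 for i in range(width)]


def bk_operator(left, right):
    """Combine (g, p) of a more significant span with a less significant one."""
    g_l, p_l = left
    g_r, p_r = right
    return (g_l | (p_l & g_r), p_l & p_r)


def prefix_add(a, b, width):
    """Add two unsigned `width`-bit numbers via prefix carry computation."""
    a_bits, b_bits = bits(a, width), bits(b, width)
    g = [x & y for x, y in zip(a_bits, b_bits)]  # bit carry generate
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # bit carry propagate

    # Serial prefix scan: prefix[i] covers bits i..0.
    prefix = []
    acc = None
    for i in range(width):
        acc = (g[i], p[i]) if acc is None else bk_operator((g[i], p[i]), acc)
        prefix.append(acc)

    # carries[i] is the carry into bit i (carry-in of the adder is 0).
    carries = [0] + [group_g for group_g, _ in prefix]

    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # sum bit
    total |= carries[width] << width       # carry out
    return total
```

For instance, `prefix_add(a, b, 4)` can be compared with `a + b` over all 4-bit operands to check the definitions.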
16 TL CLA architecture
- Wish to take advantage of high fan-in capability of TL gates
- We group input bits into n-bit blocks, calculate the group generate and propagate signals
- group carry generate = 1 if sum of bits > largest number representable in n bits
- group carry propagate = 1 if sum of n bits ≥ largest number representable in n bits
- This formulation allows us to generate the TL prefix tree directly (bit propagate and bit generate signals need not be computed)
- For example for the 3-bit grouping of bits 3 to 5 [equations not shown]
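The group signals described on this slide can be sketched as two threshold gates per block. The sketch rests on several assumptions: "sum of bits" is read as the sum of the two n-bit operand slices, each bit a_i and b_i carries weight 2^i, the generate condition is "greater than" the largest n-bit number, and the propagate condition is "greater than or equal". The names `tl_gate`, `group_signals` and `check_block` are hypothetical.

```python
from itertools import product


def tl_gate(weights, inputs, threshold):
    """Threshold gate; assumes the >= convention."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def group_signals(a_bits, b_bits):
    """Group generate and propagate for one block (LSB-first bit lists)."""
    n = len(a_bits)
    weights = [2 ** i for i in range(n)] * 2  # weight 2^i for a_i and for b_i
    inputs = list(a_bits) + list(b_bits)
    largest = 2 ** n - 1                      # largest number representable in n bits
    generate = tl_gate(weights, inputs, largest + 1)  # block sum > largest
    propagate = tl_gate(weights, inputs, largest)     # block sum >= largest
    return generate, propagate


def check_block(n=3):
    """Check that G OR (P AND cin) equals the true block carry-out."""
    for a, b, cin in product(range(2 ** n), range(2 ** n), (0, 1)):
        a_bits = [(a >> i) & 1 for i in range(n)]
        b_bits = [(b >> i) & 1 for i in range(n)]
        g, p = group_signals(a_bits, b_bits)
        if (g | (p & cin)) != (a + b + cin) >> n:
            return False
    return True
```

Both gates take the operand bits directly, so no per-bit generate or propagate signals are formed first. `check_block(3)` exercises every 3-bit block with both carry-in values.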
17 4-bit CLA design
18 4-bit Carry-Lookahead Adder Comparison
- CRTL vs. Multiple-Output Differential Cascode
Voltage Switch logic (MODCVS)
19 A 16-bit TL adder
20 A Novel (7,3) STTL Parallel Counter
- Conventional Minnick (7,3) counter
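The slide's diagram is not recoverable, so the following is a behavioural sketch of the Minnick-style (7,3) counter as it is commonly described: three threshold gates, where each less significant output gate sees the seven inputs again plus the more significant outputs fed back with negative weights. The names `tl_gate` and `minnick_counter_7_3` are hypothetical, and nothing here models the capacitor counts or the STTL circuit itself.

```python
from itertools import product


def tl_gate(weighted_inputs, threshold):
    """Threshold gate over (weight, bit) pairs; assumes the >= convention."""
    return 1 if sum(w * x for w, x in weighted_inputs) >= threshold else 0


def minnick_counter_7_3(x):
    """Count the ones among 7 input bits; returns (s2, s1, s0), MSB first."""
    if len(x) != 7:
        raise ValueError("a (7,3) counter takes exactly 7 input bits")
    ones = [(1, bit) for bit in x]
    # MSB: at least four inputs are 1.
    s2 = tl_gate(ones, 4)
    # Middle bit: remove the weight of s2, then test for at least two.
    s1 = tl_gate(ones + [(-4, s2)], 2)
    # LSB: remove the weight of s2 and s1, then test for at least one.
    s0 = tl_gate(ones + [(-4, s2), (-2, s1)], 1)
    return s2, s1, s0
```

In this structure the seven inputs are summed separately in each of the three gates.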
21 A Novel (7,3) STTL Parallel Counter
- Modified Minnick implementation
- Sum input bits once
- Only 22 capacitors vs. 39 using CTL
- Reduced routing cost
- Reduced area
- Reduced power dissipation
22 STTL (7,3) Parallel Counter Simulation Results
- Simulated in HSPICE using 0.25 µm process parameters
- Power dissipation at 300 MHz is 870 µW
- Delay < 1.4 ns
- 22 capacitors
- Cunit = 5 fF
- Vdd = 2 V
23 Ongoing work
- Quantitative robustness analysis of CRTL, STTL
- Mismatch, noise immunity
- Detailed comparison of adder performance, power dissipation and layout area
- Development of a prefix-tree synthesis algorithm
- TL requires only one gate to be designed (capacitive network) hence suitable for automated layout
- Design of a 32-bit multiplier using (7,3) and (15,4) STTL counters (depth six with approx. 1300 TL gates)
- Test chips scheduled for April
24 Conclusions
- Presented design of CRTL gate
- Showed CRTL dissipates significantly less power compared to other TL gate implementations
- Presented design of STTL gate
- Presented a CLA architecture based on TL
- Proposed a novel parallel counter design technique
- Questions?