Title: Digital and Analog IC Designs
1Digital and Analog IC Designs
- Borivoje Nikolic
- bora_at_eecs.berkeley.edu
2Students
- Continuing Students
- Socrates Vamvakos, Ph.D. 2005
- Dejan Markovic, Ph.D. 2005
- Radu Zlatanovici, Ph.D. 2005/6
- Bill Tsang, Ph.D. 2006/7
- Liang-Teck Pang, Ph.D. >2006
- Farhana Sheikh, Ph.D. >2006
- New Students
- Zhengya Zhang, Ph.D.
- Renaldi Winoto, M.S.
- Zheng Guo, Ph.D.
- Melinda Ler, M.S.
- PostDoc
- Sebastian Hoyos
- VIF
- Yasutoshi Aibara (Renesas)
- Undergrads
- Yong Yang, B.S. 2005
- Imran Haque, B.S. 2006
- Daniel Cheng, B.S. 2005
- Ranjeet Jhutti, B.S. 2005
3Graduated Students
- Yun Chiu, Ph.D.: University of Illinois, Urbana-Champaign
- Sean Kao, M.S.: Xilinx, Inc.
- Joshua Garrett, M.Eng.: Manhattan Routing, Inc.
- David Fang, M.S.: Intel
4Recognition
- ISSCC 2004 Jack Kilby Outstanding Student Paper Award
- "A 1.8V 14b 12MS/s Pipelined ADC in 0.18 micron CMOS with 103dB SFDR" by Yun Chiu, Paul R. Gray and Borivoje Nikolic
- Fifth/sixth outstanding paper award at ISSCC for BWRC!
5What Are We Working On?
- Power-performance optimization
- Power-performance optimization methodology
- Implementation in ASICs
- Variability
- Impact of layout on variations
- Variability in SRAM
- Adaptable analog
- Jitter optimization in PLLs
- Background-calibrated ADCs
- Cognitive radios
- TV-band receivers
- TV-band signal synthesis
- LDPC decoding
- Maskless lithography
- Parallel DAC array
6 1. Power-Performance Optimization
[Figure: optimal power-performance tradeoff curve, power vs. cycle time. Labels: initial design, power budget, design within power budget]
- How to find the best performance under the power budget
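One way to make the question above concrete is a toy model. The sketch below is not the optimization framework from these slides: it assumes supply voltage is the only tuning variable, an alpha-power-law delay model, and made-up normalized constants (`VTH`, `ALPHA`, `K`, `C_EFF`, and the allowed VDD range). Under those assumptions power rises and cycle time falls monotonically with VDD, so the fastest design within the budget sits where power meets the budget, and bisection finds it.

```python
# Toy model only: every constant here is a hypothetical, normalized value.
VTH = 0.3      # threshold voltage (V), assumed
ALPHA = 1.3    # exponent of the alpha-power-law delay model, assumed
K = 1.0        # delay scale factor (arbitrary units), assumed
C_EFF = 1.0    # effective switched capacitance (arbitrary units), assumed


def cycle_time(vdd):
    """Alpha-power-law delay: falls as VDD rises above VTH."""
    return K * vdd / (vdd - VTH) ** ALPHA


def power(vdd):
    """Dynamic power C * VDD^2 * f when clocked at f = 1 / cycle_time."""
    return C_EFF * vdd ** 2 / cycle_time(vdd)


def fastest_within_budget(budget, v_lo=0.4, v_hi=1.2, tol=1e-9):
    """Return the highest VDD in [v_lo, v_hi] whose power stays within budget."""
    if power(v_hi) <= budget:
        return v_hi  # budget is not the limiting factor
    if power(v_lo) > budget:
        raise ValueError("budget cannot be met in the allowed VDD range")
    # Invariant: power(v_lo) <= budget < power(v_hi)
    while v_hi - v_lo > tol:
        mid = 0.5 * (v_lo + v_hi)
        if power(mid) <= budget:
            v_lo = mid
        else:
            v_hi = mid
    return v_lo


if __name__ == "__main__":
    vdd = fastest_within_budget(budget=0.5)
    print(f"VDD = {vdd:.3f} V, cycle time = {cycle_time(vdd):.3f}, "
          f"power = {power(vdd):.3f}")
```

A real flow would also tune gate sizes and other variables, which turns the one-dimensional search into a constrained optimization problem; the structure (minimize delay subject to a power constraint) stays the same.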
7Circuit Optimization Framework
[Block diagram. Labels: Models, Netlist, Plug-ins, Variables, Optimization Core, Optimal Design, Results]
Radu Zlatanovici
8Model Comparison and Validation
- Define near-optimality boundary
- Use tabulated models to compute the delay and energy of the optimal design obtained with analytical models
- Validate optimal design obtained with tabulated models if within near-optimality boundary
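The validation step can be pictured as a simple acceptance test. The sketch below is one possible reading, not the method used in this work: it assumes the near-optimality boundary is a relative tolerance on both delay and energy, and the `DesignPoint` type, the `within_boundary` helper, the 5% default, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    delay: float   # e.g. in ps
    energy: float  # e.g. in pJ


def within_boundary(candidate, reference, rel_tol=0.05):
    """True if `candidate` is at most rel_tol worse than `reference`
    in both delay and energy (assumed definition of the boundary)."""
    return (candidate.delay <= reference.delay * (1.0 + rel_tol)
            and candidate.energy <= reference.energy * (1.0 + rel_tol))


if __name__ == "__main__":
    # Hypothetical numbers: the analytical-model optimum re-evaluated with
    # tabulated models, compared against the tabulated-model optimum.
    reevaluated = DesignPoint(delay=212.0, energy=10.2)
    tabulated_optimum = DesignPoint(delay=210.0, energy=10.0)
    print(within_boundary(reevaluated, tabulated_optimum))
```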
9Example: Optimizing 64-Bit Adders
R. Zlatanovici, B. Nikolic, "Power-Performance Optimal 64-Bit Carry-Lookahead Adders," ESSCIRC 2003
10Implementation: Fastest 64-Bit Adder
- Chip taped out 12/04
- ST 90nm 7M 1P technology
- 1.7 x 1.6 mm²
- 98 pads
- 8 adder cores with different power strappings
- 210 ps cycle time @ VDD = 1V (extracted simulations)
- Peak current 350mA / adder
- Test circuitry included
- Additional measurement circuitry to study impact of supply noise
- Sean Kao, Radu Zlatanovici
- with Elad Alon, Valentin Abramzon, Mark Horowitz (Stanford)
11Optimization for ASICs
Farhana Sheikh
12 2. Dealing with Variations
[Figure: scatter plot of normalized frequency vs. normalized leakage (Isb), 0.18 micron, 1000 samples. Annotations: 30% frequency spread, 20X leakage spread. From S. Borkar, Intel]
- 20X variation in SD leakage
- 30% variation in frequency
13Investigating Layout Effects
- Quantify the effects of layout, Vdd and body biasing on variations by measuring ring oscillator frequencies and leakage currents
- Extraction of spatial correlation information from measurement data
[Figure labels: non-stacked gates, stacked gates]
Liang-Teck Pang
14Test Chip
- 90nm CMOS
- 160 tiles of each layout configuration, 40 sets of variable-length ROs
- Die size 1.5mm x 1.2mm (incl. I/O pads)
- Taped out in Sept 04, dies arrived end Dec 04
- Testing started
15Variability in Memory
Has to remain open for 5-6 σ for large arrays
- Memory design is becoming interesting again
Zheng Guo, Sriram Balasubramanian, Radu Zlatanovici (with Prof. T.-J. King)
16FinFET Memory
- 6-T SRAM with back-gating (FinFETs have two separate gates)
[Figure: two FinFET structures. Labels: Source, Drain, Gate, Gate1, Gate2 (Vth control), switching gate, gate length Lg, fin width TSi, fin height HFIN = W/2 and HFIN = W]
17FinFET Memory
18Memory Variability
19 3. Analog Design Challenges
- Variability is a huge problem
- Scaling challenges
- Lower intrinsic gain
- Higher noise
- Lower SNR
- Scaling benefit: fT, lots of digital gates
- Approach: Measure and correct
20PLL Jitter
21PLL Jitter Analysis
Adjusting the loop characteristics (ωN, ζ) modulates the output jitter.
We would like to operate at the minimum all the time!
22Implementation
[Figures: die photo, experimental setup]
Socrates Vamvakos
23Measurement Results
- Non-Gaussian distribution is not the reason for the discrepancy.
- The issue is jitter masking due to correlated noise between the PLL and the jitter block.
24Measurement Results
Correction of dead-zone estimates by correlation
error
25Background-Calibrated ADC
[Block diagram. Labels: Vin, S/H, Pipelined ADC, Σ∆ ADC, ÷n, fclk, fclk/n, FIR filter, Dout]
- Adaptive, data-driven, background, and all-digital
- Analog signal path of ADC under calibration is intact
- Speed and accuracy enabled simultaneously
26Background-Calibrated ADC
- Main pipeline and front-end S/H nearly finished
- Sigma-delta reference ADC re-taped out
- Antenna rule violations
- DLL needed for precise phase alignment (in design)
- Yun Chiu, Bill Tsang, Johan Vanderhaegen, Sebastian Hoyos
27 4. Reusing the TV-Band ADC
Frequency range: UHF band, 400-850 MHz
Bandwidth: 6 MHz (U.S.A.)
High dynamic range (70 dB)
Yasutoshi Aibara, Sebastian Hoyos
28Band-Pass Sigma Delta
Technology: 90nm CMOS
Architecture: Single down-conversion
Freq. range: 400-850 MHz (UHF)
Sampling freq.: 235 MHz (fc x 4)
Center freq. of BPF: 58.75 MHz
Bandwidth: 6 MHz/ch (UHF)
DR: 70 dB
OSR: 20 (235M / 2 / 6M)
Order: 4th (or higher)
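The center-frequency and OSR entries on this slide follow from the sampling frequency and the channel bandwidth. A quick arithmetic check, assuming "fc x4" means the sampling frequency is four times the band-pass center frequency and that OSR is taken as fs / (2 x bandwidth), as the "235M/2/6M" entry suggests (the function name is hypothetical):

```python
def bandpass_sigma_delta_numbers(fs_hz, bw_hz):
    """Return (center frequency, oversampling ratio) for an fs = 4 * fc design."""
    center_hz = fs_hz / 4.0          # fs/4 band-pass center frequency
    osr = fs_hz / (2.0 * bw_hz)      # Nyquist bandwidth divided by signal bandwidth
    return center_hz, osr


if __name__ == "__main__":
    center_hz, osr = bandpass_sigma_delta_numbers(fs_hz=235e6, bw_hz=6e6)
    print(f"center = {center_hz / 1e6:.2f} MHz, OSR = {osr:.2f}")
```

The exact ratio is a little under 20, which the slide rounds to 20.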
29TV Signals: Cable Modem
Cable modem standard: DOCSIS / ITU J.83
[Figure: implementation block diagram]
Zhengya Zhang, Renaldi Winoto (with help from C. Naegli (Cisco))
30QAM Transmitter: BEE at Work
31Transmitted Signal Characteristics
Time-domain distribution of signal amplitude, based on 30K samples
Power spectral density: 4 channels combined, signal power normalized to 0 dB
32QAM Receiver
256-QAM Constellation
Scatter Plot
33Impairments
34BER Curve
BER curve for 4096 symbols. Jitter is simulated using linear interpolation, with Gaussian jitter of 2ps (3.2e-6 UI) RMS
35Low Density Parity Check Codes
- Parity check matrix [matrix figure: rows C1-C4 (checks), columns V1-V8 (bits)]
- Tanner graph [figure: bit nodes V1-V8, check nodes C1-C4]
Lara Dolecek, Zhengya Zhang
36Decoding via Message Passing
- Initialize, then iterate: bits to checks, checks to bits, bits to checks, ...
- Hard decision (either all checks satisfied or maxIter reached)
[Figure: Tanner graph with bit nodes v1-v8 and check nodes c1-c4. Labeled messages: c2→v4, v4→c2, c4→v4, v4→c4. Hard decision on v4: HD(v4) = 1 if ..., 0 else]
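The message-passing schedule on this slide (initialize, bits to checks, checks to bits, hard decision, stop when all checks are satisfied or maxIter is reached) can be sketched in a few lines. Several things below are assumptions rather than content from the slides: the min-sum update rule, the convention that a positive log-likelihood ratio (LLR) means bit 0, and the 4x8 matrix `H`, which is a hypothetical stand-in because the slide's own matrix did not survive.

```python
# Hypothetical 4-check, 8-bit parity-check matrix (not the one from the slide).
H = [
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
]


def decode_min_sum(h, llr, max_iter=20):
    """Min-sum message passing. llr[j] > 0 is taken to mean bit j is likely 0.
    Returns (hard-decision bits, True if all checks are satisfied)."""
    m, n = len(h), len(h[0])
    checks = [[j for j in range(n) if h[i][j]] for i in range(m)]

    def satisfied(bits):
        return all(sum(bits[j] for j in row) % 2 == 0 for row in checks)

    # Initialize: bit-to-check messages start as the channel LLRs.
    v2c = {(i, j): llr[j] for i in range(m) for j in checks[i]}
    bits = [1 if x < 0 else 0 for x in llr]

    for _ in range(max_iter):
        if satisfied(bits):
            return bits, True
        # Checks to bits: sign product and minimum magnitude of the other inputs.
        c2v = {}
        for i in range(m):
            for j in checks[i]:
                others = [v2c[(i, k)] for k in checks[i] if k != j]
                sign = -1.0 if sum(x < 0 for x in others) % 2 else 1.0
                c2v[(i, j)] = sign * min(abs(x) for x in others)
        # Bits to checks: total belief minus the message from the target check.
        total = [llr[j] + sum(c2v[(i, j)] for i in range(m) if h[i][j])
                 for j in range(n)]
        for (i, j) in v2c:
            v2c[(i, j)] = total[j] - c2v[(i, j)]
        # Hard decision: 1 if the total LLR is negative, 0 otherwise.
        bits = [1 if t < 0 else 0 for t in total]

    return bits, satisfied(bits)


if __name__ == "__main__":
    # All-zero codeword sent; the first bit is received weakly wrong.
    received = [-1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
    print(decode_min_sum(H, received))
```

In hardware, each check-to-bit and bit-to-check update maps to a small node processor, which is what the parallel architectures on the following slides distribute.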
37Fully Parallel Architectures
Direct-mapped architecture using an interconnect
fabric
Interconnects can be simplified by exploiting the
structure of the code construction
38Partially Parallel Architecture
Process multiple groups in parallel, and process the checks inside each group serially
Partition the H matrix into regularly structured
groups by choosing H that consists of stacked
permutation sub-matrices
Line up check node (horizontal) and bit node
(vertical) groups
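A parity-check matrix built from stacked permutation sub-matrices can be generated from a small table of parameters. The sketch below assumes the permutations are cyclic shifts of the identity (circulants), which is one common choice but is not stated on the slide; the block size of 7 and the `SHIFTS` table are hypothetical values chosen to give a regular (3,6) structure.

```python
def circulant_permutation(size, shift):
    """size x size identity matrix with each row's 1 moved `shift` columns right."""
    return [[1 if c == (r + shift) % size else 0 for c in range(size)]
            for r in range(size)]


def build_h(shifts, size):
    """Assemble H from a grid of shifts: shifts[i][j] defines sub-matrix (i, j).
    Each block row is a check-node group; each block column is a bit-node group."""
    h = []
    for block_row in shifts:
        blocks = [circulant_permutation(size, s) for s in block_row]
        for r in range(size):
            h.append([bit for block in blocks for bit in block[r]])
    return h


# Hypothetical shift table: 3 check groups x 6 bit groups.
SHIFTS = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 3, 4, 5],
    [0, 2, 4, 6, 1, 3],
]

if __name__ == "__main__":
    h = build_h(SHIFTS, size=7)
    print(len(h), "checks x", len(h[0]), "bits")
    print("row weights:", {sum(row) for row in h})
    print("column weights:", {sum(col) for col in zip(*h)})
```

Because every sub-matrix is a permutation, each check in a group touches exactly one bit from every bit group, which is what lets the groups be lined up and processed with regular, conflict-free memory access.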
39Error Floors
- Flattening of a BER-SNR waterfall curve occurs beyond the reach of a simulation.
- Usually due to the so-called near-codewords, trapping sets, etc.
- Lack of an analytical tool for describing error floors for different channels necessitates the FPGA implementation.
[Figure: simulation and error floor predictions for regular (3,6) LDPC codes. From top: Margulis graph (n = 2640), an n = 2048 code, an n = 2640 code, and an n = 8096 code. T. Richardson, Allerton 2003]
40 5. Mask Costs on the Rise
Mask costs follow Moore's law as well
41Maskless Lithography Project
Maskless Lithography Implementation
Mirror Chip Details
- Achieving 3 wph with a 10 kHz light source requires 12 million mirrors on the mirror chip
- Aspect ratio: 8,000 x 1,500 mirrors
Physical Details
- A single mirror cell is 1.2um by 1.2um and is optically focused to a single 45nm feature-size pixel
- 32 adjustable levels allow for 1nm edge placement on silicon
- Mirrors are controlled electrostatically via tilting comb or parallel-plate mirrors, which have a non-linear displacement response to voltage
Throughput Requirements
- Using 5 bits per mirror requires a data throughput of 1.22 Tbps
D. Fang
42Grayscale Interface
- 22nm features @ 10kHz: 12 million mirrors (3 wafers/hour)
- Maximize aspect ratio for parallelism
- 20mm x 4.25mm (8,000 x 1,500 mirrors) of 2.5µm x 2.5µm mirrors
- Need to write analog inputs in 2ms (for 3w/hr)
[Floorplan figure. Labels: Address, Demux, DACs, Writers, Decoders, I/O, 2.4mm. Annotation: 1,000 (4x MUX) parallel DACs, 10µm pitch; 512 decompression paths; 64 input pins]
43Prototype Design
44Prototype Design
- Chip finished in December, ready to tape out
(90nm CMOS)
45Summary
- Power-performance optimization (R. Zlatanovici, F. Sheikh, D. Markovic)
- Characterizing variations (L.T. Pang)
- FinFET memory (Z. Guo, R. Zlatanovici)
- Adaptable PLL (S. Vamvakos)
- Background-calibrated ADC (Y. Chiu, B. Tsang, J. Vanderhaegen)
- TV-band receiver (Y. Aibara, S. Hoyos)
- Synthesis of TV signals (Z. Zhang, R. Winoto)
- Maskless lithography (D. Fang)
- Optimal power-performance tradeoffs, SVD (D. Markovic)
- Flexible, high-throughput LDPC decoders (L. Dolecek, Z. Zhang)