Title: Finding and Sharing Brick Walls CANDE September 22, 2001 Andrew B. Kahng, UCSD CSE
1. Finding and Sharing Brick Walls
CANDE, September 22, 2001
Andrew B. Kahng, UCSD CSE and ECE Departments
email: abk@ucsd.edu
URL: http://vlsicad.ucsd.edu
2. 1999 ITRS Design Technology Metrics and Red Bricks
(Table legend: Solutions Exist / Solutions Being Pursued / No Known Solutions)
3. Hold These Thoughts
- ITRS is created by SIA companies and top semi/system houses worldwide: all-star customers
- EDA has one chapter out of 12
- EDA is just another part of SISA (semiconductor industry supplier association)
- EDA is small: 6000 R&D worldwide, $4B market
- Hold this thought: Dataquest -> 3.9% annual growth in tool spending per designer; integration costs > tool costs
- Hold this thought: small industry with poor perceived ROI will stay small (vicious cycle)
- Hold this thought: How do we turn a vicious cycle into a virtuous cycle?
4. Six Riffs
- Riff 1: ITRS acceleration, silicon technology, and system drivers
- Riff 2: A big picture on red bricks
- Riff 3: A Dark Riff on D and DT productivity
- Riff 4: On the design-manufacturing handoff
- Riff 5: On cost, variability and value
- Riff 6: It's lunchtime
5. Riff 1: ITRS Acceleration, Silicon Technology, and System Drivers
6. Roadmap Acceleration Since 2000
- Major accelerations continue
- E.g., 90nm node is in 2004, with physical gate length at 45nm
- MPU/ASIC half-pitch were separate, now unified
- ASIC is at the same process node as MPU
- 2-year cycles b/w MPU/ASIC generations through 2004
- Node = 0.7x multiplier of half-pitch or minimum feature size, generally allowing 2x the transistors on the same size die
- Normal pace: 3-year cycle
- MPU/ASIC half-pitch converges w/ DRAM HP in 2004
- Previous ITRS (2000): convergence predicted for 2015
- Extremely aggressive scaling for density, cost improvement and competitive positioning
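The 0.7x-per-node rule on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that transistor area scales with the square of the linear shrink, which is the usual reading of "2x the transistors on the same size die".

```python
def density_gain(linear_shrink: float = 0.7) -> float:
    """Density gain when every linear dimension scales by linear_shrink."""
    # Assumption: area per transistor scales with the square of the shrink.
    return 1.0 / (linear_shrink ** 2)


def annual_density_growth(cycle_years: float, linear_shrink: float = 0.7) -> float:
    """Compound annual density growth implied by one node every cycle_years."""
    return density_gain(linear_shrink) ** (1.0 / cycle_years) - 1.0


if __name__ == "__main__":
    print(f"density gain per node: {density_gain():.2f}x")
    for years in (2, 3):
        print(f"{years}-year cycle: {annual_density_growth(years):.1%} per year")
```

Under these assumptions a 0.7x shrink gives roughly 2x density per node, and moving from the normal 3-year cycle to a 2-year cycle raises the implied annual growth rate noticeably.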
7. Slide courtesy of A. Allan (Intel Corp.)
8. System Drivers
- Define IC products that drive mfg, design technologies
- ORTCs + SDs = consistent framework for tech requirements
- Four system drivers
- MPU: traditional processor core
- SOC (focus on ASIC-LP, high-pins, high-signaling network driver)
- AM/S: four basic circuits and FOMs
- DRAM
- Each driver section
- Nature, evolution, formal definition of this driver
- What market forces apply to this driver?
- What technology elements (process, device, design) does this drive?
- Key figures of merit, and roadmap
9. MPU Driver
- Old MPU model: 3 flavors
- New MPU model: 2 flavors
- Cost-performance at production (CP)
- 140 mm^2 die, desktop
- High-performance at production (HP)
- 310 mm^2 die, server
- Both have multiple cores (helper engines), on-board L3 cache, ...
- Multi-cores: more dedicated, less general-purpose logic; driven by power and reuse considerations; reflect convergence of MPU and SOC
- Doubling of transistor counts is per node, NOT per 18 months
- Clock frequencies stop doubling with each node
10. Example Supporting Analyses (MPU)
- Diminishing returns
- Pollack's Rule: in a given process technology, a new microarchitecture takes 2-3x the area of the previous-generation one, and provides only 50% more performance
- Corroboration: SPECint/MHz, SPECfp/MHz, SPECint/Watt all decreasing
- Power knob running out
- Speed = Power
- Large switching currents, large power surges on wakeup, IR drop control issues: all limited by A&P roadmap (e.g., improvement in bump pitch, package power)
- Power management: 2500% improvement needed by 2016
- Speed knob running out (new clock frequency model)
- Historically, 2x clock frequency every node
- 1.4x/node from device scaling, but running into tox, other limits (PIDS)
- 1.4x/node from fewer logic stages (from 40-100 down to around 14 FO4 INV delays)
- Clocks cannot be generated with period < 6-8 FO4 INV delays
- Pipelining overhead (1-1.5 FO4 INV delay for pulse-mode latch, 2-3 for FF)
- Around 16 FO4 INV delays is the limit for clock period in core (L1 access, 64b add)
- Cannot continue 2x frequency per node trend in ITRS
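The clock-frequency argument on this slide can be sketched as a product of two factors: device speedup and reduction in logic depth per pipeline stage. This is a simplified model, not the ITRS model itself. The function name is hypothetical, and using 16 FO4 as a hard floor is an assumption based on the slide's "around 16 FO4" remark.

```python
DEVICE_SPEEDUP_PER_NODE = 1.4  # slide: 1.4x/node from device scaling
FO4_FLOOR = 16.0               # slide: ~16 FO4 INV delays limits the core clock period


def frequency_gain(fo4_now: float, fo4_next: float,
                   device_speedup: float = DEVICE_SPEEDUP_PER_NODE,
                   fo4_floor: float = FO4_FLOOR) -> float:
    """Clock-frequency multiplier for one node step.

    Clock period is modeled as (logic depth in FO4 delays) x (FO4 delay).
    Frequency improves with faster devices and with shallower logic per
    stage, but the depth is clamped so it never drops below fo4_floor.
    """
    depth_now = max(fo4_now, fo4_floor)
    depth_next = max(fo4_next, fo4_floor)
    return device_speedup * depth_now / depth_next


if __name__ == "__main__":
    # While depth can still shrink by 1.4x, both factors contribute.
    print(frequency_gain(22.4, 16.0))
    # Once the floor is reached, only device scaling is left.
    print(frequency_gain(16.0, 11.4))
```

In this model the historical 2x per node is the product of two 1.4x factors. Once logic depth reaches the floor, only the device factor remains, and the slide notes that it too is running into limits.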
11. SOC-LP Driver
- Power gap
- Must reduce dynamic and static power to avoid zero logic content limit
- Hits low-power SOC before it hits MPU
- SOC degree of freedom: low-power (not high-perf) process
- SOC-LP model drives ASIC-LP (PIDS) device model
- Lgate lags high-performance devices by 2 years, but layout density is the same
- Accompanying device parameter changes
- Vth higher, Vdd higher
- Ig, Ioff start at 100pA/um (LOP, Low Operating Power), 1pA/um (LSTP, Low Standby Power)
- Tox higher
- Slower devices (larger CV/I)
- Even with four LP device flavors, Design still faces large static power management challenge, and must handle multi (Vt, tox, Vdd)
- SOC-LP driver: low-power PDA
- Composition: CPU cores, embedded cores, SRAM/eDRAM
- Roadmap for IO bandwidth, processing power, GOPS/mW efficiency
- Die size grows at 20% per node
12. SOC-LP Driver Model
- Required performance trend of SOC-LP PDA driver
- Drives PIDS/FEP LP device roadmap, Design power management challenges
13. LP Device Roadmap
14. Power Management Gap (x) (with utterly optimistic device assumptions...)
15. Riff 2: The Big Picture on Red Bricks
16. Big Picture
- ITRS takes Moore's Law as a constraint
- Problem: ITRS signed up for the wrong Moore's Law
- 2x frequency, 2x xtors, bits every node -> power, utility contradictions
- Each increment of performance is more and more costly
- Compounding problems
- no architecture awareness
- no application awareness (e.g., low-power networked-embedded SOC)
- planar CMOS-centric (no DGFET, FinFET in requirements)
- uneven acknowledgment of cost (mask NRE cost, design NRE cost, cost of technology development, manufacturing cost, manufacturing test, ...)
- New in 2001: Can Design help solve it?
- PIDS: 17%/year improvement in CV/I metric -> punt Ioff, Rds, ...
- A&P: bump pitch improves slowly -> punt IR drop, power, signaling -> impacts Test as well
- Interconnect, Litho, PIDS/FEP: what variability can Designers tolerate?
17. DT Integration With Other Technologies
- Problem: Design has always been metric-free
- Metric -> red brick wall -> requirement for R&D investment
- EDA Goal 1: show red bricks in Design Technology
- EDA Goal 2: shift red bricks from other supporting technologies
- e.g., lithography CD variability requirement -> solved by new Design techniques that can better handle variability
- e.g., mask data volume requirement -> solved by Design/Mfg interfaces and flows that pass functional requirements, verification knowledge to mask writing and inspection
- e.g., Simplex X initiative -> as much impact as copper?
- It's an ROI issue!!!
- Need metrics of design cost, design quality/value -> DT ROI
- Need serious validation/participation from EDA community before we can expect help from system, ASIC companies
18. Dielectric Permittivity: Near Term Years
Bulk and effective dielectric constants described. Porous low-k requires alternative planarization solutions. Cu at all nodes - conformal barriers.
C. Case, BOC Edwards, ITRS-2001 preliminary
19. Effect of Line Width on Cu Resistivity
Conductor resistivity increases expected to appear around 100 nm linewidth - will impact intermediate wiring first - 2006
Courtesy of SEMATECH
C. Case, BOC Edwards, ITRS-2001 preliminary
20. Device Roadmap Changes
- Process Integration, Devices and Structures (PIDS)
- CV/I delay metric historically decreases by 17%/year
- Since frequency improvement from shorter pipelines is no longer available, perhaps we do need to keep scaling CV/I
- Bottom line: PIDS is running up against limits of planar CMOS, and is shifting at least some of the pain to design/architecture improvements
- Continuing CV/I trend necessitates huge growth in Ioff
- Subthreshold Ioff at room temperature increases from 0.01 uA/um in 2001 to 10 uA/um at end of ITRS (22nm node)
- Ioff increases by at least an order of magnitude at 100 deg C operating temps (40x difference between 25 deg C and 125 deg C)
- Static power becomes a huge problem: multi-Vt, multi-Vdd, substrate biasing, constant-throughput power minimization, etc. must be coherently and simultaneously applied/optimized by automatic tools
- Also necessitates aggressive reduction in tox
- Physical tox thickness hovers at < 1.4nm (down to 1.0nm) starting in 2001, even assuming arrival of high-k gate dielectrics starting in 2004
- Implies huge variability mitigation challenges for Design Technology: 10% < one monolayer
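The CV/I and Ioff figures on this slide can be cross-checked against the per-node numbers quoted earlier in the deck. The sketch below is a rough consistency check, not part of the roadmap. It assumes the 17%/year figure compounds annually, and all names are hypothetical.

```python
CVI_IMPROVEMENT_PER_YEAR = 0.17  # slide: CV/I delay decreases by 17%/year


def delay_scale(years: float, rate: float = CVI_IMPROVEMENT_PER_YEAR) -> float:
    """Fraction of the original CV/I delay remaining after `years`."""
    # Assumption: the yearly improvement compounds.
    return (1.0 - rate) ** years


def speedup(years: float, rate: float = CVI_IMPROVEMENT_PER_YEAR) -> float:
    """Device speedup factor after `years` of CV/I scaling."""
    return 1.0 / delay_scale(years, rate)


def ioff_growth(start_ua_per_um: float = 0.01, end_ua_per_um: float = 10.0) -> float:
    """Ratio of end-of-roadmap Ioff to 2001 Ioff, using the slide's values."""
    return end_ua_per_um / start_ua_per_um


if __name__ == "__main__":
    print(f"speedup over a 2-year node cycle: {speedup(2):.2f}x")
    print(f"Ioff growth over the roadmap: {ioff_growth():.0f}x")
```

Under the compounding assumption, a 2-year node cycle gives a device speedup between 1.4x and 1.5x, in line with the 1.4x/node device-scaling figure on the MPU analysis slide. The Ioff values imply about three orders of magnitude of leakage growth over the roadmap.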
21. Assembly/Packaging Roadmap
- MPU pad counts flat from 2001-2005; chip current draw increases 64%
- Effective bump pitch roughly constant at 350 um
- Bump/pad counts scale with chip area only, do not increase with technology demands (IR drop, L di/dt)
- -> metal resources needed to control < 10% IR drop skyrocket since Ichip and wiring resistance increase -> challenge for DT
- Later technologies (30-40nm) also have too few bumps to carry maximum current draw (e.g., 1250 Vdd pads at 30nm with bump pitch of 250 um can each carry 150mA -> 187.5A max capability, but Ichip/Vdd > 300A)
- A&P Rationale: cost control (puts pain onto Design)
- Design Rationalization: must add power constraints
- ITRS-2001 will have strong power-constrained focus
- Cost of liquid cooling, refrigeration, etc. impractical anyway (???)
- 30-50 W/cm^2 limit for forced-air cooling with fins
- MPU power dissipation capped at 200W; MPU chip area held constant (more area can't be used well within 150W power budget)
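The bump-current example on this slide is simple arithmetic, reproduced here as a sketch. The pad count, per-pad current and 300 A demand are the slide's figures. The function names are hypothetical, and currents are kept in integer milliamps to avoid floating-point rounding.

```python
def supply_capacity_ma(n_vdd_pads: int, ma_per_pad: int) -> int:
    """Total current (mA) that the Vdd pads can deliver."""
    return n_vdd_pads * ma_per_pad


def pads_required(demand_ma: int, ma_per_pad: int) -> int:
    """Minimum number of Vdd pads needed to carry demand_ma."""
    return -(-demand_ma // ma_per_pad)  # ceiling division on integers


if __name__ == "__main__":
    capacity_ma = supply_capacity_ma(1250, 150)  # slide: 1250 Vdd pads, 150 mA each
    demand_ma = 300_000                          # slide: more than 300 A
    print(f"deliverable: {capacity_ma / 1000} A")
    print(f"shortfall:   {(demand_ma - capacity_ma) / 1000} A")
    print(f"pads needed: {pads_required(demand_ma, 150)}")
```

With the slide's numbers the pads deliver 187.5 A, so meeting a 300 A demand at 150 mA per pad would take at least 2000 Vdd pads.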
22. Design Technology and the ITRS
- Cost: biggest hole in ITRS and in DT
- Manufacturing cost, NRE cost (design, mask, ...), technology development cost (who should have/solve red brick walls?)
- Challenges for DT (with respect to ITRS)
- Circuit/layout optimizations in the face of manufacturing variability
- System cost-driven design technology
- Holistic analysis, management of power (both dynamic and static)
- Circuit- and methodology-level IP: global signaling and synchronization, off-chip IO, power delivery and management
- Metrics, needs roadmap for quality/cost/ROI of design and design process
- Verification and test (else cost of mfg test soon exceeds cost of mfg)
- Software
23. Riff 3: A Dark Riff on D and DT Productivity
24. The Productivity Gap
[Figure: Potential Design Complexity and Designer Productivity. Axes: Logic Tr./Chip and Tr./S.M. Annotation: Equivalent Added Complexity]
68%/yr compounded complexity growth rate
21%/yr compound productivity growth rate
How many gates can I get for $N?

Year   Technology   Chip Complexity   Frequency (MHz)   3-Yr. Design Staff   Staff Cost*
-      250 nm       13 M Tr.          400               210                  $90 M
-      250 nm       20 M Tr.          500               270                  $120 M
-      180 nm       32 M Tr.          600               360                  $160 M
2002   130 nm       130 M Tr.         800               800                  $360 M

Source: SEMATECH
* @ $150k / Staff Yr. (In 1997 Dollars)
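The two growth rates and the staff-cost column on this slide can be reproduced with a short calculation. This is a sketch: it uses the slide's 68%/yr and 21%/yr rates and its $150k per staff-year figure, assumes staff cost is simply staff x 3 years x rate, and uses hypothetical function names.

```python
COMPLEXITY_GROWTH = 0.68    # slide: 68%/yr compounded complexity growth
PRODUCTIVITY_GROWTH = 0.21  # slide: 21%/yr compound productivity growth
STAFF_RATE_KUSD = 150       # slide: $150k per staff-year (1997 dollars)


def gap_factor(years: float,
               complexity: float = COMPLEXITY_GROWTH,
               productivity: float = PRODUCTIVITY_GROWTH) -> float:
    """How much faster complexity has grown than productivity after `years`."""
    return ((1.0 + complexity) / (1.0 + productivity)) ** years


def staff_cost_musd(staff: int, design_years: int = 3,
                    rate_kusd: int = STAFF_RATE_KUSD) -> float:
    """Staff cost in $M, assuming cost = staff x design years x yearly rate."""
    return staff * design_years * rate_kusd / 1000


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"gap after {years:2d} years: {gap_factor(years):.1f}x")
    for staff in (210, 270, 360, 800):
        print(f"{staff} staff -> ${staff_cost_musd(staff)} M")
```

Under this cost assumption the staff figures give $94.5M, $121.5M, $162M and $360M, so the table's cost column appears to be the same product rounded down.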
25. Mask Cost
But average only 500 wafers per mask set!
26. Keep the Fabs Full
- Design technology must keep manufacturing facilities fully utilized with
- high-volume parts
- high-margin parts
- Foundry capital cost > $2B
- How much value of new designs is needed to fill the fab???
27. Design Productivity Need DSM 2 EDA Trends
Source: MARCO GSRC
28. Fab Amortization -> Close the Implementation Gap
[Figure axes: Level of Abstraction; Effort/Value]
Source: MARCO GSRC
29. Design Productivity Gap -> Low-Value Designs?
Percent of die area that must be occupied by memory to maintain SOC design productivity
Source: Japanese system-LSI industry
30. Reduce Back-End Effort?
Example: repeating dense wiring fabric pattern at minimum pitch
- Eliminates signal integrity, delay uncertainty concerns
- But has at least 60-80% density cost
Source: MARCO GSRC
31. Improve IP Reuse Productivity?
Source: MARCO GSRC
32. QUALITY Problem: > 1000x Energy-Flexibility Gap
[Figure: Energy Efficiency in MOPS/mW (or MIPS/mW), log scale from 0.1 to 1000, versus Flexibility (Coverage)]
- Dedicated HW: 100-200 MOPS/mW
- Reconfigurable Processor/Logic: 10-50 MOPS/mW
- ASIPs, DSPs: 1 V DSP, 3 MOPS/mW
- Embedded uProcessors: LP ARM, 0.5-2 MIPS/mW
Source: Prof. Jan Rabaey, UC Berkeley
33. Keep the Fabs Full
- Design technology must keep manufacturing facilities fully utilized with
- high-volume parts
- high-margin parts
- What happens when design technology fails?
- not enough high-value designs
- -> the semiconductor industry will find a workaround
- reconfigurable logic
- platform-based design
- extract value somewhere other than silicon differentiation
34. Dark Riff: Conclusions
- Design productivity gap threatens design quality -> design starts, business models at risk
- TAT achieved at cost of QOR
- low QOR -> low silicon value
- electronics industry chooses reprogrammable, platform-based workarounds
- We need to understand cost and quality/value
35. Two CANDE-01 Non-Predictions
- Jim Sproch, Synopsys
- Summary: Rising NRE will force semiconductor manufacturers to produce primarily high-volume, general purpose components such as memory, FPGAs, and standard processors. New EDA tools will then have an impact on only a smaller fraction of the semiconductor industry, and research funding will evaporate, leaving only the service and support functions, which don't need to be centralized.
- Prediction: EDA industry is reduced to a service role as semiconductor design starts decline.
- Prediction: "Design for Cost" EDA tools will reach the marketplace by 2006.
36. Riff 4: Design-Manufacturing Handoff
37. Optical Proximity Correction (OPC)
- Corrective modifications to improve process control
- improve yield (process window)
- improve device performance
38. Phase Shifting Masks (PSM)
39. Field-Dependent Aberration
- Field-dependent aberrations cause placement errors and distortions
R. Pack, Cadence
40. Optical Lithography (it's not going away)
- Process window and yield enhancement: forbidden width-spacing combinations (defocus window sensitivities), generally complex local DRCs
- Lithography equipment choices: forbidden configurations such as wrong-way critical-width doglegs, or diagonal features
- Notch rules, critical-feature rules on local metal due to OPC (subresolution assist features, especially)
Numerical Technologies, Inc.
41. RET Roadmap
[Figure: RET techniques by technology generation. Generations: 0.25 um, 0.18 um, 0.13 um, 0.10 um, 0.07 um. Exposure wavelengths: 248 nm, 248/193 nm, 193 nm. Litho: Rule-based OPC, Model-based OPC, Scattering Bars, AA-PSM, Weak PSM. CMP: Rule-based Tiling, Optimization-driven MB Tiling. Number of affected layers increases with each generation.]
W. Grobman, Motorola, DAC-2001
42. About Mask Data and $1M Mask NRE
- Format proliferation
- Most tools have unique data format
- Raster-VSB conversion, reverse can be inefficient
- Real-time manufacturing tool switch, multiple qualified tools -> duplicate fractures to avoid delays if tool switch required
- Data volume
- OPC drives figure count acceleration
- MEBES format is flat
- ALTA machines slow down with > 1GB data
- Burden on globally distributed mfg resources
- Inefficient refractures
- Refractures!?
- Mask industry historically never touched mask data: unwilling to take risk, not enough margin or reason
- Today, 90% of mask data files manipulated / refractured: process bias sizing (iso-dense, loading effects, linearity, ...), mask write optimization, multiple tool formats, ...
43-46. P. Buck, Dupont Photomasks, ISMT Mask-EDA Workshop, July 2001
47. DT Needs for RET and Mask NRE
- WYSIWYG broken -> (mask) verification bottleneck
- Need function- and cost-aware RET
- RET insertion is for predictable circuit performance, function
- RET tool must understand functional intent
- make only corrections that win, reduce performance variation
- make only corrections that can be manufactured and verified (including mask inspection)
- understand (data volume, verification) costs of breaking hierarchy
- Understand flow issues
- e.g., avoid making same corrections 3x (library, router, PV tool)
- Handoff to manufacturing: MUCH more than GDSII
- Includes sensitivities to patterning variation/error
- Bidirectional pipe: functionally robust layout performed w.r.t. models of manufacturing errors and electrical implications
- Mask verification driven by functional sensitivity information
- Mask and ASIC folks aren't asleep on this, either
48. Another CANDE-01 Non-Prediction
- Prediction: GDSII, in its present form, will no longer be the handoff from design to manufacturing.
49. Riff 5: On Cost, Variability and Value
50. Design is Also Part of NRE Cost
- Design cost model (Gary Smith/Dataquest, 2001)
- engineer cost per year increases 5% per year ($181,568 in 1990)
- EDA tool cost per year (per engineer) increases 3.9% per year ($99,301 in 1990) (separate term for interoperability)
- Productivity due to 8 major Design Technology innovations (3.5 of which are still unavailable): RTL methodology; In-house P&R; Tall-thin engineer; Small-block reuse; Large-block reuse; IC implementation suite; Intelligent testbench; ES-level methodology
- Matched up against SOC-LP PDA content
- SOC-LP PDA design cost: $15M in 2001
- Would have been $342M without EDA innovations and the resulting improvements in design productivity
- (Is this an effective message?)
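The per-engineer cost terms in this model are compound-growth projections from a 1990 base, which can be sketched as follows. The base values and growth rates are the slide's. The function names are hypothetical, and the sketch covers only these two terms plus the 342-to-15 comparison, not the full Dataquest model.

```python
def cost_in_year(base_cost: float, annual_growth: float,
                 year: int, base_year: int = 1990) -> float:
    """Project a yearly cost forward from base_year at a compound growth rate."""
    return base_cost * (1.0 + annual_growth) ** (year - base_year)


def productivity_factor(cost_without_musd: float, cost_with_musd: float) -> float:
    """Ratio of design cost without DT innovations to cost with them."""
    return cost_without_musd / cost_with_musd


if __name__ == "__main__":
    engineer_2001 = cost_in_year(181_568, 0.05, 2001)  # slide: $181,568 in 1990, +5%/yr
    tools_2001 = cost_in_year(99_301, 0.039, 2001)     # slide: $99,301 in 1990, +3.9%/yr
    print(f"engineer cost in 2001:     ${engineer_2001:,.0f}")
    print(f"tool cost per eng in 2001: ${tools_2001:,.0f}")
    print(f"cost avoided by DT:        {productivity_factor(342, 15):.1f}x")
```

The last line restates the slide's claim as a ratio: $342M versus $15M is a design-cost reduction of roughly 23x attributed to design technology.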
51. Design Cost of SOC-LP PDA Driver
52. Process Variation Sources
- Design -> (manufacturing variability) -> Value
- Intrinsic variations
- Systematic: due to predictable sources, can be compensated during design stage
- Random: inherently unpredictable fluctuations that cannot be compensated
- Dynamic variations
- Stem from circuit operation, including supply voltage and temperature fluctuations
- Depend on circuit activity and are hard to compensate
- Correlations
- Tox and Vth0 are correlated due to
- Line width and spacing are anti-correlated (correlation coefficient -1) since the line pitch is fixed; ILD and interconnect thickness are also anti-correlated
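The width/spacing claim follows directly from the fixed pitch: if spacing = pitch - width, any width deviation is mirrored exactly in the spacing. The sketch below demonstrates this numerically. The pitch, nominal width and sigma values are hypothetical, chosen only for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def sample_width_spacing(pitch_nm, nominal_width_nm, sigma_nm, n, seed=0):
    """Draw random line widths; spacing is whatever is left of the fixed pitch."""
    rng = random.Random(seed)
    widths = [rng.gauss(nominal_width_nm, sigma_nm) for _ in range(n)]
    spacings = [pitch_nm - w for w in widths]
    return widths, spacings


if __name__ == "__main__":
    # Hypothetical numbers: 260 nm pitch, 130 nm nominal width, 5 nm sigma.
    widths, spacings = sample_width_spacing(260.0, 130.0, 5.0, 1000)
    print(f"correlation(width, spacing) = {pearson(widths, spacings):.6f}")
```

Because spacing is a deterministic function of width here, the coefficient is -1 up to rounding regardless of the distribution chosen for the width.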
53. Technology Trend Over Generations
- Values are from ITRS, BPTM, and industry; red is 3-sigma
- From ongoing work at UCSD/UCB/Michigan; some values are off (e.g., Rvia)
54. Copper CMP Variability: Near Term Years
Combined dishing/erosion metric for global wires. Cu thinning due to dishing for isolated lines/pads. No significant dishing at local levels - thinning due to erosion over large areas (50% areal coverage).
C. Case, BOC Edwards, ITRS-2001 preliminary
55. Variation Sensitivities: Local Stage
- Sensitivity is evaluated by the percentage change in performance when there is a 3-sigma variation in the parameter
- For local stage, device variations have larger impact on line delay, and interconnect variations have stronger impact on crosstalk noise
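The sensitivity metric described on this slide (percentage change in performance for a 3-sigma shift in one parameter) can be sketched generically. The delay model and all parameter values below are hypothetical stand-ins, not the models or data behind the slide. `stage_delay` is a common textbook RC approximation for a driver, a distributed wire and a load.

```python
from typing import Callable, Dict

Params = Dict[str, float]


def sensitivity_pct(model: Callable[[Params], float], nominal: Params,
                    param: str, three_sigma_fraction: float) -> float:
    """Percent change in model output when `param` shifts by its 3-sigma fraction."""
    base = model(nominal)
    shifted = dict(nominal)
    shifted[param] = nominal[param] * (1.0 + three_sigma_fraction)
    return 100.0 * (model(shifted) - base) / base


def stage_delay(p: Params) -> float:
    """Toy RC delay of driver + distributed wire + load (hypothetical model)."""
    return (0.69 * p["r_drv"] * (p["c_wire"] + p["c_load"])
            + p["r_wire"] * (0.38 * p["c_wire"] + 0.69 * p["c_load"]))


if __name__ == "__main__":
    # Hypothetical nominal values (ohms, farads) and 3-sigma fractions.
    nominal = {"r_drv": 1000.0, "r_wire": 50.0, "c_wire": 10e-15, "c_load": 5e-15}
    three_sigma = {"r_drv": 0.15, "r_wire": 0.20, "c_wire": 0.10, "c_load": 0.05}
    for name, frac in three_sigma.items():
        print(f"{name}: {sensitivity_pct(stage_delay, nominal, name, frac):+.2f}% delay")
```

With a short wire and a comparatively resistive driver, as in these made-up values, the driver term dominates the delay. That is the kind of behavior the slide reports for local stages, though the numbers here carry no weight.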
56. Value and Getting to ROI
57. BTW: Need a Quality Model?
- Normalized transistor quality model normalizes:
- speed, power, density in a given technology
- analog vs. digital
- custom vs. semi-custom vs. generated
- first-silicon success
- other: simple / complex clocking, verification/test effort and coverage, manufacturing cost, ...
- Need design and design process quality models?
- strongly related to establishing DT value?
- several private commercial and/or in-house analogues
- survey methodology being contemplated by MARCO GSRC
58. Riff 6: It's Lunchtime
59. Design Grand Challenges > 65nm
- Scaling of maximum-quality design implementation productivity
- Overall design productivity of quality- (difficulty-) normalized functions on chip must scale at 2x/node
- Reuse (including migration) of design, verification and test effort must scale at > 2x/node
- Develop analog and mixed-signal synthesis, verification and test
- Embedded software productivity
- Power Management
- Off-currents in low-power devices increase 10x/node; design technology must maintain constant static power
- Power dissipation for HP MPU exceeds package limits by 25x in 15 years; design technology must achieve power limits
- Power optimizations must simultaneously and fully exploit many degrees of freedom (multi-Vt, multi-Tox, multi-Vdd in core) while guiding architecture, OS and software
- Deeper integration of Design technology with other ITRS technology areas
- Example: Die-package co-optimization
- Example: Design for Manufacturability (sharing variability burden with Litho/PIDS/FEP and Interconnect, reduction of system NRE cost)
- Example: Design for Test
ITRS-2001 preliminary
60. Design Grand Challenges < 65nm
- (Three Grand Challenges from > 65nm, and)
- Noise Management
- Lower noise headroom, especially in low-power devices; coupled interconnects; supply voltage IR drop and ground bounce; thermal impact on device off-currents and interconnect resistivities; mutual inductance; substrate coupling; single-event upset (alpha particle); increased use of dynamic logic families
- Modeling, analysis and estimation at all levels of design
- Error-Tolerant Design
- Relaxing 100% correctness requirement may reduce manufacturing, verification, test costs
- Both transient and permanent failures of signals, logic values, devices, interconnects
- Novel techniques: adaptive and self-correcting / self-repairing circuits, use of on-chip reconfigurability
- No specific call-outs for verification, cost: implicit in productivity
ITRS-2001 preliminary
61. Conclusions
- Design Technology needs to prove ROI
- Prove quality and value
- Prove costs: hidden costs include TAT/TTM; also include interoperability, integration, designer productivity
- Design Technology must show its Red Bricks
- Need METRICS! (Design Chapter has almost no red/yellow/white)
- Design Technology must share (take co-ownership of) other technology domains' Red Bricks
- Plenty of possibilities
- Design Technology community must educate itself and the rest of the ITRS community (esp. customers!)
- Virtuous cycle: DT gives better ROI, achieves higher value, improves technology delivery,