Title: 3D CMP and 3D IC Physical Design Flow
1 3D CMP and 3D IC Physical Design Flow
- Jason Cong and Guojie Luo
- University of California, Los Angeles
- {cong, gluo}@cs.ucla.edu
2 Outline
- Design Driver
- 3D Chip Multiprocessor
- Based on OpenRISC 1200
- NoC Interconnect
- RF Reconfigurable Interconnects
- Physical Design Flow
- Design Flow for 3DM2
- Design Flow in Development
3 3D Chip Multiprocessor (CMP)
- Three Silicon Layers
- Tier 3: Cache Data Components
- Tier 2: Interconnect and Cache Tags
- Tier 1: Cores
- Non-Uniform Cache Access
- Cores see different latencies to different cache banks
- Data can migrate among distributed caches
- Can hide latency
- Adds interconnect traffic
[Figure: tier stack with heat sink]
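The NUCA bullets above can be illustrated with a small sketch. It assumes a hypothetical linear arrangement of cores and banks, made-up cycle counts (`BASE_CYCLES`, `HOP_CYCLES`), and a gradual migration policy in which a block moves one bank closer to its requester after each access. None of these specifics come from the slides; they only show how migration can lower latency while adding transfers on the interconnect.

```python
BASE_CYCLES = 4   # hypothetical latency of the bank local to a core
HOP_CYCLES = 2    # hypothetical extra latency per bank of distance


def access_latency(core, bank):
    """Latency grows with the distance between the core and the bank."""
    return BASE_CYCLES + HOP_CYCLES * abs(core - bank)


def access(core, bank):
    """One access; afterwards the block migrates one bank toward the core.

    Returns (latency, new_bank, transfers_caused).
    """
    latency = access_latency(core, bank)
    if bank == core:
        return latency, bank, 0
    step = 1 if core > bank else -1
    return latency, bank + step, 1


def replay(core, bank, num_accesses):
    """Repeat accesses from one core and track latency and migration traffic."""
    latencies, transfers = [], 0
    for _ in range(num_accesses):
        latency, bank, moved = access(core, bank)
        latencies.append(latency)
        transfers += moved
    return latencies, transfers


print(replay(core=3, bank=0, num_accesses=5))  # ([10, 8, 6, 4, 4], 3)
```

In this toy run the latency seen by core 3 falls from 10 to 4 cycles, at the price of three block transfers over the interconnect.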
4 3DM2 - MITLL 0.18 µm 3D SOI Technology
5 3D CMP Test Chip Architecture
- Using OpenRISC 1200
- http://www.opencores.org
- Open source in-order RISC uniprocessor
- Has been tested in silicon and runs Linux
- Simple core used due to test chip area constraints
- MIT Lincoln Labs process
- 180 nm, 25 mm² x 3 tiers
- Taped out in Nov 2006
6 3D CMP NoC Interconnect
- One example NoC using two 5-port routers
- Short vertical links to local L2 slices
- Links to NoC fabric for remote L2 traffic
7 3D Reconfigurable Interconnects
- 3D Integration
- Targets interconnect latency by reducing wirelength
- RF Interconnects
- Frequency-Division Multiple Access (FDMA)
- Targets interconnect congestion by improving bandwidth
- Multiple signals can occupy a common interconnect
- Further potential to dynamically tune frequencies
- Adapt to different communication patterns
- Interconnect density can be reduced while minimizing performance impact
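To make "multiple signals can occupy a common interconnect" concrete, here is a minimal, idealized FDMA sketch. It is not the RF circuit from the talk: the sample count, the +/-1 bit mapping, and the noiseless summing "wire" are assumptions chosen only so the example runs. Two senders put their bits on different carriers, the waveforms are summed on one shared medium, and each receiver recovers its own channel by correlating with its carrier.

```python
import math

SAMPLES_PER_BIT = 64  # hypothetical simulation resolution


def modulate(bits, cycles_per_bit):
    """Map each bit to +/-1 and multiply it by a cosine carrier."""
    out = []
    for b in bits:
        amp = 1.0 if b else -1.0
        for n in range(SAMPLES_PER_BIT):
            phase = 2 * math.pi * cycles_per_bit * n / SAMPLES_PER_BIT
            out.append(amp * math.cos(phase))
    return out


def share_medium(*signals):
    """Sum the per-channel waveforms, as if they drive one common wire."""
    return [sum(samples) for samples in zip(*signals)]


def demodulate(wire, cycles_per_bit):
    """Recover one channel by correlating each bit period with its carrier."""
    bits = []
    for start in range(0, len(wire), SAMPLES_PER_BIT):
        corr = sum(
            wire[start + n]
            * math.cos(2 * math.pi * cycles_per_bit * n / SAMPLES_PER_BIT)
            for n in range(SAMPLES_PER_BIT)
        )
        bits.append(1 if corr > 0 else 0)
    return bits


bits_a = [1, 0, 1, 1]
bits_b = [0, 0, 1, 0]
# Carriers with 4 and 8 cycles per bit are orthogonal over one bit period.
wire = share_medium(modulate(bits_a, 4), modulate(bits_b, 8))
print(demodulate(wire, 4) == bits_a, demodulate(wire, 8) == bits_b)
```

Retuning a receiver to a different `cycles_per_bit` value selects a different sender, which is the sense in which a shared fabric could be reconfigured by changing frequencies.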
[Figure: one shared RF fabric can be configured to a wide range of topologies.]
8 Carrier Frequency
- On/off digital switching noise is the main source of noise coupled to the RF interconnect
- Higher-frequency carriers avoid the base-band digital noise
- Clock rates of future CPUs are not expected to exceed 4-5 GHz (due to power consumption issues)
- Bandwidth of the base-band noise will be around the clock rate
- We need to pick carrier frequencies far away from the noise
- => f1 = 8 GHz, f2 = 16 GHz, f3 = 24 GHz, f4 = 32 GHz
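The carrier choice above can be checked with a few lines of arithmetic. The sketch assumes that the listed carriers are integer multiples of 8 GHz (which matches the four values given) and takes 5 GHz, the upper end of the clock estimate, as the edge of the base-band noise. The function names are illustrative only.

```python
MAX_CLOCK_GHZ = 5.0      # upper end of the 4-5 GHz clock estimate
CARRIER_STEP_GHZ = 8.0   # f1; f2..f4 are assumed to be its multiples


def carrier_plan(num_channels, step_ghz=CARRIER_STEP_GHZ):
    """Return carriers f1..fN as integer multiples of step_ghz."""
    return [step_ghz * k for k in range(1, num_channels + 1)]


def margin_above_clock(carriers, max_clock_ghz=MAX_CLOCK_GHZ):
    """Gap between the lowest carrier and the assumed clock-rate ceiling."""
    return min(carriers) - max_clock_ghz


carriers = carrier_plan(4)
print(carriers)                      # [8.0, 16.0, 24.0, 32.0]
print(margin_above_clock(carriers))  # 3.0
```

Under these assumptions the lowest carrier sits 3 GHz above the highest expected clock rate.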
9 Bi-Directional FDMA-Link/Bus
- Advantages
- Higher combined data rate
- Simultaneous, bi-directional communications
- Re-configurable between bands
- Low in-band coupling for parallel bus
- Potentially with fewer I/O pins and smaller routing area
[Figure panels: bi-directional link; bi-directional bus]
10 FDMA-I I/O Data Eye Diagram
11 3D CMP Roadmap
- 3D CMP with direct interconnects
- Four OR1200 cores, four shared L2 cache banks, and a simple, static interconnect topology implemented on an FPGA first and then fabricated at MIT LL
- Simulation infrastructure to explore NUCA and RF design space
- Dynamic adaptation of RF interconnect to a diverse set of multithreaded and multitasking applications
- FPGA prototyping of core, bus structures, NUCA and RF
- Choose the best power/performance point in the design space
- Final implementation on a 3D process (MIT-LL)
12 Physical Design Flow for 3DM2 (1/2)
[Flow diagram; box labels: netlist, RTL, Synthesis, Partition, RC extraction, Floorplan, Trial Route, P/G network, Place, Routing Congestion, Timing Constraint, Clock Tree, Route, RC extraction, DRC/LVS, Layout]
13 Physical Design Flow for 3DM2 (2/2)
- Most 3D features are handled manually
- More 3D CAD tools are needed
14 Thermal-Aware 3D Physical Design Flow
[Flow diagram; box labels: Technology, Design constraints, Netlist (LEF/DEF), Thermal-Driven 3D Floorplanner, Thermal Simulation, Compact Thermal Model, OpenAccess, Thermal-Driven 3D Placement, Timing Analysis, Thermal-Aware 3D Router w/ Thermal Via Planning, Parasitic Extraction, CIF/GDSII, Layout Verification]
15 Thermal Resistive Network [Wilkerson04]
- Circuit stack partitioned into tiles
- Tiles connected through thermal resistances
- Lateral resistances fixed
- Vertical resistances ∝ 1/(number of vias)
- Heat sources modeled as current sources
- Current value = power
- Heat sinks modeled as ground nodes
(a) Tiles stack array
(b) Single tile stack
Accurate and slow
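The electrical analogy above can be sketched for the single tile stack of panel (b). This is a simplified illustration, not the cited model: lateral resistances are left out, and `r_per_via` and `r_sink` are hypothetical values. Each tile injects its power as a current, the heat sink is the ground node, and the resistance between adjacent tiles is inversely proportional to the via count.

```python
def stack_temperatures(powers, via_counts, r_per_via=100.0, r_sink=1.0):
    """Temperature rise (K above the sink) of each tile in one vertical stack.

    powers[i]      heat in W injected at tile i; tile 0 is next to the sink
    via_counts[i]  vias between tile i and tile i + 1
    r_per_via      hypothetical thermal resistance of a single via, in K/W
    r_sink         hypothetical resistance from tile 0 to the sink, in K/W
    """
    if len(via_counts) != len(powers) - 1:
        raise ValueError("need one via count per pair of adjacent tiles")
    heat_through = sum(powers)        # all heat leaves through the sink
    temp = r_sink * heat_through      # the sink itself is ground (0 K rise)
    temps = [temp]
    for i, vias in enumerate(via_counts):
        heat_through -= powers[i]     # only heat from the tiles above crosses
        temp += (r_per_via / vias) * heat_through
        temps.append(temp)
    return temps


print(stack_temperatures([1.0, 1.0, 1.0], [100, 100]))  # [3.0, 5.0, 6.0]
print(stack_temperatures([1.0, 1.0, 1.0], [200, 200]))  # [3.0, 4.0, 4.5]
```

Doubling the via count halves the vertical resistances and lowers the temperature of the tiles farthest from the sink, which is the lever that thermal via planning uses.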
16 Thermal-Aware 3D Floorplanning [ICCAD04]
- Simulated Annealing (SA) Engine
- New local z-neighbor operations
- Cost function
- nwl: normalized wirelength
- narea: normalized chip area
- nvc: normalized interlayer via number
- cT: temperature cost
- Hybrid Thermal Evaluation
- At each move: uses the simplified chain model
- At each SA temperature drop: uses the resistive network model
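The hybrid evaluation scheme can be sketched as an annealing loop that uses a cheap thermal estimate on every move and an accurate one at every temperature drop. The exact cost function is not preserved in these notes, so the weighted sum below, its weights, the toy two-layer problem, and the two stand-in thermal models are all assumptions made for illustration.

```python
import math
import random


def cost(nwl, narea, nvc, c_t, weights=(1.0, 1.0, 1.0, 1.0)):
    """Assumed form: weighted sum of the four terms (weights hypothetical)."""
    w_wl, w_area, w_via, w_temp = weights
    return w_wl * nwl + w_area * narea + w_via * nvc + w_temp * c_t


def anneal(state, propose, metrics, fast_temp, slow_temp,
           t_start=1.0, t_end=0.01, cooling=0.8, moves_per_temp=20, seed=0):
    """Cheap thermal model per move, accurate model per temperature drop."""
    rng = random.Random(seed)

    def fast_cost(s):
        return cost(*metrics(s), fast_temp(s))

    def slow_cost(s):
        return cost(*metrics(s), slow_temp(s))

    current, current_cost = state, fast_cost(state)
    best, best_cost = state, slow_cost(state)
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            cand = propose(current, rng)
            cand_cost = fast_cost(cand)
            delta = cand_cost - current_cost
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = cand, cand_cost
        audited = slow_cost(current)   # accurate check at the temperature drop
        if audited < best_cost:
            best, best_cost = current, audited
        t *= cooling
    return best, best_cost


# Toy problem (all values hypothetical): four blocks on two layers.
POWER = (4.0, 3.0, 2.0, 1.0)


def z_move(layers, rng):
    """Local z-neighbor move: push one random block to the adjacent layer."""
    i = rng.randrange(len(layers))
    moved = list(layers)
    moved[i] = 1 - moved[i]
    return tuple(moved)


def metrics(layers):
    """Return (nwl, narea, nvc); wirelength and vias are ignored in this toy."""
    fuller = max(layers.count(0), layers.count(1))
    return 0.0, fuller / len(layers), 0.0


def chain_temp(layers):
    """Stand-in for the simplified chain model: power share on the far layer."""
    return sum(p for p, z in zip(POWER, layers) if z == 1) / sum(POWER)


def network_temp(layers):
    """Stand-in for the resistive network model."""
    return 1.1 * chain_temp(layers)


start = (1, 1, 1, 1)
best, best_cost = anneal(start, z_move, metrics, chain_temp, network_temp)
print(best, best_cost)
```

Keeping the best solution according to the accurate model means the returned floorplan is never worse, under that model, than the starting point. How the two models are reconciled in the published algorithm is not stated here.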
17 3D Placement via Transformation [ASPDAC07]
- Idea
- Start from 2D placement
- Heuristic 2D to 3D transformation
- Reduce long nets
- Keep local connecting nets
- Window-based transformation
- Balance wirelength and via count
- RCN graph based refinement
- Reduce via count and temperature
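One way to picture a window-based 2D to 3D transformation is the sketch below. It is an assumed illustration rather than the published heuristic: the 2D placement is shrunk by a factor of sqrt(K) for K tiers, which shortens long nets, and the cells that land in the same window are spread across the tiers in turn, which keeps neighbors close in x and y at the cost of vias. The function name, the window size, and the round-robin rule are hypothetical.

```python
import math
from collections import defaultdict


def transform_2d_to_3d(cells, num_tiers, window):
    """Map {name: (x, y)} to {name: (x, y, tier)}.

    Coordinates shrink by sqrt(num_tiers); cells sharing a window of the
    shrunken layout are assigned to tiers round-robin.
    """
    scale = 1.0 / math.sqrt(num_tiers)
    bins = defaultdict(list)
    for name, (x, y) in cells.items():
        sx, sy = x * scale, y * scale
        key = (int(sx // window), int(sy // window))
        bins[key].append((name, sx, sy))
    placed = {}
    for members in bins.values():
        members.sort()  # sort by name so the result is deterministic
        for i, (name, sx, sy) in enumerate(members):
            placed[name] = (sx, sy, i % num_tiers)
    return placed


cells = {"a": (0, 0), "b": (2, 0), "c": (0, 2), "d": (2, 2), "e": (40, 40)}
placed = transform_2d_to_3d(cells, num_tiers=4, window=10)
print(placed["d"])  # (1.0, 1.0, 3)
print(placed["e"])  # (20.0, 20.0, 0)
```

A refinement pass would then move cells between tiers to trade via count against temperature. That step is not sketched here.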
18 Multilevel TS-Via Planning and 3D Routing [ASPDAC05, ICCAD05]
- Alternating Direction TS-Via Planning
- Decompose the NLP into simplified sub-problems
- In a multi-level framework with routing
19 OpenAccess Extension for 3D Design
- Define additional 3D info.
- Device Layer
- Inter-layer via
- Provide interface for 3D CAD tools
- Parameter extraction
- Timing
- LVS
- Compatible with Cadence Encounter
20 Summary
- Design Driver
- 3D Chip Multiprocessor
- Based on OpenRISC 1200
- NoC Interconnect
- RF Reconfigurable Interconnects
- Physical Design Flow
- Design Flow for 3DM2
- Design Flow in Development
21 The End
Thank You!