Title: BWRC and GSRC: Design Problems and Methodologies
1BWRC and GSRC: Design Problems and Methodologies
- Professor Kurt Keutzer
- EECS
- University of California at Berkeley
- www-cad.eecs.berkeley.edu/keutzer
- keutzer_at_eecs.berkeley.edu
2Original GSRC Organization Emphasis
- T1 System-Level Design
- Component-Based Approach
- Heterogeneous (hardware, software, mixed-signal, RF)
- Re-use and technology migration
- T2 Circuit Fabrics for DSM
- Implementation of application-oriented circuit components
- Power-Delay-Area Product Issues
- Novel Circuit Solutions (transistors, interconnect)
M3 Design Collaboration
MARCO Interconnect Focus Center
M1 Design Driver - BWRC
M2 Design Metrics
T5 Design Test
T4 Design Validation
- T3 Technology Modeling
- Silicon Ground Truths and process abstraction
- Manufacturing-related issues in design
- Extraction, Reliability, Yield, Process Tradeoffs
3The Roster
- Thrust T3 Technology Modeling
- Wojciech Maly, CMU (leader)
- Andrzej Strojwas, CMU (second)
- Larry Pileggi, CMU
- Thrust T4 Design Validation
- Randy Bryant, CMU (leader)
- David Dill, Stanford (second)
- Robert Brayton, Berkeley
- Ed Clarke, CMU
- Tom Henzinger, Berkeley
- Kurt Keutzer, Berkeley
- Karem Sakallah, Michigan
- Thrust T5 Design Test
- Tim Cheng, UCSB (leader)
- Sujit Dey, UCSD (second)
- Srinivas Devadas, MIT
- Kurt Keutzer, Berkeley
- Thrust T1 System Design
- Alberto Sangiovanni-Vincentelli, Berkeley (leader)
- Giovanni De Micheli, Stanford (second)
- Srinivas Devadas, MIT
- Kurt Keutzer, Berkeley
- Edward Lee, Berkeley
- Sharad Malik, Princeton
- Richard Newton, Berkeley
- Jan Rabaey, Berkeley
- Thrust T2 Circuit Fabrics for DSM
- Andrew Kahng, UCLA (co-lead)
- Larry Pileggi, CMU (co-lead)
- Robert Brayton, Berkeley (second)
- Jason Cong, UCLA
- Wayne Dai, UCSC
- Kurt Keutzer, Berkeley
- Malgorzata Marek-Sadowska, UCSB
- Karem Sakallah, Michigan
- Alberto Sangiovanni-Vincentelli, Berkeley
4GSRC Focus Research Center
Director: A. Richard Newton
Associate Director: Kurt Keutzer
24 faculty from 9 universities, 40 grad students, 14 post-docs
5The Gigascale Silicon Research Center
http://www-cad.eecs.berkeley.edu/GSRC
Empowering designers to realize the potential of
gigascale silicon by rebuilding the RTL
Foundation and by enabling scaleable,
heterogeneous, component-based design.
6Empowering Designers - at BWRC
- Universal Radios for 4th Generation
- Two generations beyond present digital cellular
- Low energy, high-performance programmable computing platform
- Systems and circuits focus to resolve rules of engagement at the air interface and to allow for peaceful co-existence
- Picoradios
- Ultra-low power, low cost, embedded CMOS radios (< 1 mW)
- EDA systems for rapid, optimized implementation
- Ultra-High Bandwidth Millimeter Radios
- Scaled CMOS solutions for 20 - 60 GHz operation
- Architectures and Device modeling
High Integration
Low Power
High Performance
7Rebuild the RTL Foundation
- Predictable Target for Higher-Level Design
- Forward constrained
- Reliable implementations
- One-Pass Implementation from RTL Down
- Elimination of multiple iterations in timing loops
- Meet complexity of multi-dimensional constraints
- Abstractable Target for Higher Levels of Design
- Easy to estimate, model
8The Long Wire Problem
- Intra-block (local) wires roughly scale
- Inter-block wires create timing closure surprises
- Global (inter-block) wires are the problem
- Don't scale with next generation design
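The scaling argument above can be illustrated with a first-order sketch. It assumes a distributed-RC (Elmore) wire model and ideal scaling, where wire width and thickness shrink with the feature size while capacitance per unit length stays roughly constant. The 0.38 factor, the scaling rules, and all numeric values are textbook-style assumptions chosen for illustration, not data from the GSRC work.

```python
def wire_delay(r_per_mm, c_per_mm, length_mm):
    """Distributed RC wire delay, Elmore approximation: 0.38 * R_total * C_total."""
    return 0.38 * (r_per_mm * length_mm) * (c_per_mm * length_mm)


def scale_wire(r_per_mm, c_per_mm, s):
    """Shrink wire width and thickness by s (< 1).

    Resistance per length grows as 1/s^2 (smaller cross-section);
    capacitance per length is assumed to stay roughly constant.
    """
    return r_per_mm / (s * s), c_per_mm


def delay_ratio(length_mm, s, scales_with_block,
                r_per_mm=100.0, c_per_mm=0.2e-12):
    """Delay after one shrink divided by delay before it."""
    before = wire_delay(r_per_mm, c_per_mm, length_mm)
    r_new, c_new = scale_wire(r_per_mm, c_per_mm, s)
    # Local wires shrink with their block; global wires span a die
    # whose size is assumed not to shrink.
    new_length = length_mm * s if scales_with_block else length_mm
    return wire_delay(r_new, c_new, new_length) / before


local_ratio = delay_ratio(0.5, 0.7, scales_with_block=True)
global_ratio = delay_ratio(10.0, 0.7, scales_with_block=False)
print(f"local wire delay ratio:  {local_ratio:.2f}")   # stays close to 1
print(f"global wire delay ratio: {global_ratio:.2f}")  # grows as 1/s^2
```

Under these assumptions the local wire's delay is unchanged by the shrink, while the global wire's delay grows by 1/s^2, which is the sense in which global wires do not scale.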
9Fabrics and Long Wires
Fabrics can be small blocks, or combinations of blocks.
If the timing and area constraints are loose enough, we can extend the hierarchy that we use today to construct DSM fabrics.
10Impact on PD Synthesis
And if the synthesis blocks are small, the block
will behave much like a complete die shrink ---
no long wire problems created
11Flexible, Constructive Fabrics
- With block- or component-based design, the block sizes will not be growing as fast as the IC size
- The more flexible the blocks are, the better the chance of assembling them and achieving timing closure, etc.
- Constructive Fabrics that provide PDA and porosity trade-offs for blocks via remapping, resynthesis, etc., in combination with physical design
12Raw Technology Enables Components
- 200 - 2,000 50K gate IP modules
- What kind of components?
- How are they designed?
- How do they communicate?
13Enabling Component Based Design
- Cost of fabrication facilities and mask making has increased significantly
- NRE cost of new design has increased significantly
- Physical effects (parasitics, reliability issues, power management) are increasingly significant in the design process
- Design complexity and context complexity are high and only increasing with each process generation, yet time-to-market factors are getting more stringent
- These factors imply fewer design starts, higher design volume, and more highly programmable platforms
14Tool support splits into two
Also, fewer design starts! More design volume!
15SOC Programmable Platforms
- Implemented using Energy-Power-Cost-efficient medium-complexity (O(10M-100M) gates in 50nm technology) chip
- Platform-based approaches will dominate
- Represented to the designer/programmer by means of an integral programmer's model
- Derived from a specific family of microarchitectures
- Extensible through the addition of large blocks of functionality
- These platforms will be highly-programmable
- Functionality via software
- Programmable inter-block communication, via programmable interconnect or on-chip networks.
- They will implement highly-concurrent functionality
- Run-time architectures must support predictable, scaleable, and verifiable models of concurrency at all levels.
16Example PicoNode for Sensor Networks
Heterogeneous Implementation Architecture
allows for Trade-off between Flexibility and
Efficiency
17The Research Playground
Restrict the problem by assuming
a mostly-programmable, parameterized
implementation
Application
What is the Programmer's Model?
Estimation and Evaluation
18Organizational Structure
BWRC -picoradio
Restrict the problem by assuming
a mostly-programmable, parameterized
implementation
BWRC/ GSRC - Weber
Application
Architecture
What is the Programmer's Model?
Microarchitecture
Component Assembly and Synthesis
Estimation and Evaluation - Malik, Devadas
19Architectural Explorations
Embedded Systems Implementation Spectrum
DSP/µP
Extensible
ASIP
ASIC
- An extensible processor has a predefined core ISA with the capability of supporting new instructions for specialized applications
What should coprocessor extensions look like?
What's the right microarchitecture?
20Compiler Research - Son of SPAM
[Figure: tool flow. Labels: ISA Extensions?, .c sources, µArch, Designer, gen, Compiler, .o, gen, Simulator]
- A compiler and simulator are generated for a given micro-architecture
- ISA extensions found from the target applications are incorporated into the compiler and simulator
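The idea of generating both tools from one architecture description can be sketched in miniature. This is a toy illustration only, not the actual tool flow: the ISA table, the `mac` extension, and the names `generate_simulator` and `generate_compiler` are all hypothetical. A single description table drives both a simulator and a compiler, and adding a multiply-accumulate extension to the table changes what both tools do.

```python
import itertools

# Hypothetical ISA description: opcode -> semantics on operand values.
BASE_ISA = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}
# Hypothetical extension found in the target applications: a*b + c in one op.
MAC_EXTENSION = {"mac": lambda a, b, c: a * b + c}


def generate_simulator(isa):
    """Return a simulator for programs of (opcode, dst, src...) tuples."""
    def simulate(program, regs):
        regs = dict(regs)
        for op, dst, *srcs in program:
            regs[dst] = isa[op](*(regs[s] for s in srcs))
        return regs
    return simulate


def generate_compiler(isa):
    """Return a compiler from expression trees to three-address code."""
    def compile_expr(expr):
        program = []
        temps = (f"t{i}" for i in itertools.count())

        def emit(node):
            if isinstance(node, str):          # leaf: a named register
                return node
            op, lhs, rhs = node
            # Select the extension when the ISA has it and the tree matches
            # the pattern add(mul(a, b), c).
            if (op == "add" and "mac" in isa
                    and isinstance(lhs, tuple) and lhs[0] == "mul"):
                a, b, c = emit(lhs[1]), emit(lhs[2]), emit(rhs)
                dst = next(temps)
                program.append(("mac", dst, a, b, c))
                return dst
            a, b = emit(lhs), emit(rhs)
            dst = next(temps)
            program.append((op, dst, a, b))
            return dst

        return program, emit(expr)
    return compile_expr


expr = ("add", ("mul", "x", "y"), "z")
inputs = {"x": 3, "y": 4, "z": 5}

base_prog, base_out = generate_compiler(BASE_ISA)(expr)
base_value = generate_simulator(BASE_ISA)(base_prog, inputs)[base_out]

ext_isa = {**BASE_ISA, **MAC_EXTENSION}
ext_prog, ext_out = generate_compiler(ext_isa)(expr)
ext_value = generate_simulator(ext_isa)(ext_prog, inputs)[ext_out]

print(len(base_prog), len(ext_prog), base_value, ext_value)
```

With the extension in the table, the same expression compiles to one instruction instead of two and still simulates to the same value.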
21Physical assembly of components
- 100 nm > 50 nm technology enables
- 200 - 2,000 50K gate IP modules
- 80,000 -200,000 global nets
- Each IP module may have 100s of pins
- Real design challenge becomes how to assemble 2,000 such modules considering
- Chip-level interconnect delay
- Chip-level noise and integrity issues
- Chip-level power dissipation
Nexsis project
22How do the components communicate?
- 200 - 2,000 50K gate IP modules
- 80,000 -200,000 global nets
- major requirements among high-level
communication as well
23Communication Based Design
- Determine a protocol so that no matter what the communication topology and the nature of the IPs, the functionality of the overall system is guaranteed (TCP/IP like)
- Given the IP set and the interconnections, automatically synthesize protocols and macro-shells
- Given the IP set and a set of time-varying interconnections, automatically synthesize adaptive protocol and macro-shells that optimize performance according to the current topology
24Communication Refinement
- abstract from implementation concerns
- multi-target VC,
- bus-independent VC
system transaction, ANY data structure (e.g.
video line)
hardware or software
ANY BUS operation (data, address...)
VSI-Alliance OCB Group: Virtual Component Interface (VCI)
Physical Bus (e.g. PIBus): fixed bus-width, detailed protocol
Bus Wrapper
Communication Interface (e.g. bounded FIFO)
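The refinement from a system transaction to bus operations can be sketched as a small wrapper. This is a minimal illustration under stated assumptions: it does not implement the VCI or PIBus protocols, and the function names, the 4-byte bus width, the little-endian word packing, and the base address are all hypothetical. It only shows a bus wrapper splitting an arbitrary payload (such as a video line) into fixed-width (address, data) operations and reassembling it on the other side.

```python
def wrap_transaction(base_address, payload, bus_width_bytes=4):
    """Split a payload into fixed-width (address, data_word) bus writes."""
    ops = []
    for offset in range(0, len(payload), bus_width_bytes):
        chunk = payload[offset:offset + bus_width_bytes]
        # Pad the final word with zeros so every operation is full width.
        chunk = chunk.ljust(bus_width_bytes, b"\x00")
        ops.append((base_address + offset, int.from_bytes(chunk, "little")))
    return ops


def unwrap_transaction(ops, length, bus_width_bytes=4):
    """Reassemble the original payload from the bus writes."""
    data = b"".join(word.to_bytes(bus_width_bytes, "little") for _, word in ops)
    return data[:length]  # drop the padding added by the wrapper


video_line = bytes(range(10))              # stand-in for "ANY data structure"
bus_ops = wrap_transaction(0x1000, video_line)
restored = unwrap_transaction(bus_ops, len(video_line))

for address, word in bus_ops:
    print(f"write addr=0x{address:04x} data=0x{word:08x}")
```

The component on either side sees only the transaction; the bus width and addressing stay inside the wrapper, which is what makes the component bus-independent.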
25Models of Computation
- Network of CFSMs
- Globally asynchronous, locally synchronous (GALS)
- Extend the model to loss-less communication (abstract CFSM)
- Communication refined to implementation
- Refinement steps
- preserve desired properties at each transformation
- propagate constraints to lower levels of abstraction (top-down).
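The difference between lossy and loss-less communication, and one way a refinement can preserve the loss-less property, can be sketched as follows. The sketch assumes the base CFSM model communicates through overwriting one-place buffers, and it uses a bounded FIFO with backpressure as one possible implementation of the loss-less abstraction. The class names, the tick-based scheduling, and the consumer period are hypothetical choices for illustration.

```python
from collections import deque


class OnePlaceBuffer:
    """Overwriting one-place buffer: a new event replaces an unread one."""
    def __init__(self):
        self.slot = None

    def try_put(self, item):
        self.slot = item          # always succeeds, may destroy an unread event
        return True

    def try_get(self):
        item, self.slot = self.slot, None
        return item


class BoundedFifo:
    """Bounded FIFO with backpressure: a full queue refuses new events."""
    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def try_put(self, item):
        if len(self.items) >= self.depth:
            return False          # sender keeps the event and retries later
        self.items.append(item)
        return True

    def try_get(self):
        return self.items.popleft() if self.items else None


def run(channel, messages, consumer_period=3, ticks=100):
    """Fast producer and slow consumer, each in its own clock domain."""
    pending = deque(messages)
    received = []
    for tick in range(ticks):
        # Producer tries to send one event every tick.
        if pending and channel.try_put(pending[0]):
            pending.popleft()
        # Consumer only reacts every consumer_period ticks.
        if tick % consumer_period == consumer_period - 1:
            item = channel.try_get()
            if item is not None:
                received.append(item)
    return received


messages = [1, 2, 3, 4, 5, 6]
lossy = run(OnePlaceBuffer(), messages)
lossless = run(BoundedFifo(depth=2), messages)
print("one-place buffer:", lossy)
print("bounded FIFO:    ", lossless)
```

In this run the one-place buffer drops events that the slow consumer never sees, while the bounded FIFO delivers every event in order by stalling the producer. Delivery of all events in order is the kind of property a refinement step would need to preserve.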
26Summary
- GSRC consists of
- 25 faculty, 14 post-docs, 40 students
- representatives of 9 universities
- Driven principally by two BWRC designs
- Piconode
- Universal radio
- Now organized around four themes
- Constructive fabrics
- Measure, then improve
- Component/communication-based design
- Highly/Fully programmable systems
Enabling design
Rebuild RTL foundation
Component based design