Title: Reconfigurable Systems Development Research at BWRC
1 Reconfigurable Systems Development Research at BWRC
- John Wawrzynek
- University of California, Berkeley
- Berkeley Wireless Research Center
2 System Development and Prototyping
A long tradition
- These system prototypes validate novel algorithms and circuits.
- Enable synergy and joint optimization not possible otherwise.
- Provide students with a broad education.
Infopad (Brodersen, Rabaey et al.): first-ever wireless multimedia terminal (1995)
Critical to the continued success of the center is our agility in building working system prototypes.
3 Central Themes in Systems Design Research
- Complexity management: controlling design and verification costs.
- Ease of specification/programming, modularity, and reuse.
- Robustness: designs that work in the presence of uncertainty.
- Scalability, design longevity, tolerance to poorly controlled technologies, defects, faults, etc.
- Power, price, performance
4 Keys to Advancement
- Proper design abstractions
- Models of computation / programming, etc.
- Reconfigurable computing platforms
- Ideal sandboxes (emulation platforms)
- Properly engineered reconfigurable systems
provide huge price/performance/power and
robustness advantages
5 Promising Design Abstractions
Simulink (discrete-time, block-based diagrams) has been used successfully at BWRC.
- Related, more general models of computation have emerged in recent years, e.g. KPN, SCORE, Click.
- Systems as networks of decoupled software/hardware processes with stream-based asynchronous communication links.
- Models make communication explicit.
- Latency tolerance of streams enables efficient, flexible execution schedules, eases placement/routing, and deals with uncertainty in the mapping process.
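The process-network idea above can be illustrated with a minimal Python sketch. It is not taken from the talk: the process names (`source`, `scale`, `accumulate`), the `DONE` end-of-stream sentinel, and the use of bounded `queue.Queue` objects as streams are all assumptions made for illustration. A strict KPN uses unbounded FIFOs; bounded ones are used here only to show that buffer depth affects scheduling, not the result.

```python
import queue
import threading

DONE = object()  # end-of-stream sentinel (an assumption of this sketch)

def source(out_q, n):
    """Emit the tokens 0..n-1, then signal end of stream."""
    for i in range(n):
        out_q.put(i)
    out_q.put(DONE)

def scale(in_q, out_q, k):
    """Multiply every token by k; the blocking get() is the only input."""
    while True:
        tok = in_q.get()
        if tok is DONE:
            out_q.put(DONE)
            return
        out_q.put(tok * k)

def accumulate(in_q, result):
    """Sum all tokens and append the total to `result` at end of stream."""
    total = 0
    while True:
        tok = in_q.get()
        if tok is DONE:
            result.append(total)
            return
        total += tok

def run_network(n=5, k=3, depth=2):
    """Wire source -> scale -> accumulate with FIFOs of the given depth."""
    a = queue.Queue(maxsize=depth)
    b = queue.Queue(maxsize=depth)
    result = []
    threads = [
        threading.Thread(target=source, args=(a, n)),
        threading.Thread(target=scale, args=(a, b, k)),
        threading.Thread(target=accumulate, args=(b, result)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

Calling `run_network(depth=1)` and `run_network(depth=8)` gives the same total. Because processes interact only through blocking reads on explicit streams, changing link buffering or thread scheduling does not change the computed values, which is the latency tolerance the slide refers to.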
6 BEE2 Platform Development
Chen Chang, Pierre Droz, Henry Chen, Andrew Schultz, Dan Burke, Bob Brodersen
- 5 Virtex-II Pro 70 FPGAs (2.5M logic gate equivalents)
- 20GB DRAM
- 20 10Gbps connections
- 10GigE/Infiniband
- Inter-module connections
- I/O, analog interfaces
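As a back-of-the-envelope reading of the figures above, the short Python sketch below derives two aggregate numbers. The constants come from the slide. The assumptions are labeled in the code: that the 20 links can all run at their raw 10 Gbps rate simultaneously, and that DRAM is split evenly across the FPGAs. The function names are hypothetical.

```python
# Figures quoted on the slide.
NUM_FPGAS = 5
TOTAL_DRAM_GB = 20
NUM_LINKS = 20
LINK_RATE_GBPS = 10

def aggregate_link_bandwidth_gbps(links=NUM_LINKS, rate=LINK_RATE_GBPS):
    """Raw aggregate serial bandwidth, assuming all links run at full rate."""
    return links * rate

def dram_per_fpga_gb(total=TOTAL_DRAM_GB, fpgas=NUM_FPGAS):
    """DRAM per FPGA, assuming an even split (an assumption, not stated)."""
    return total / fpgas
```

Under those assumptions a module offers 200 Gbps of raw serial bandwidth and 4 GB of DRAM per FPGA. Usable throughput would be lower once line coding and protocol overhead are counted.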
7 BEE2 Module Design
FPGAs
DRAM
10GigE ports
Compact Flash Card
DVI/HDMI
USB
10/100 Enet
14 x 17 inch, 22-layer PC board; $4K/module w/o FPGAs or DRAM
8 BEE2 Analog Interface
- Use IBOB to fan out the serial Infiniband/Enet connections to parallel LVDS/LVPECL signals
- IBOB can be connected to BEE2 modules or directly to Infiniband/Enet switches
- Several ADC and DAC modules have been developed.
With Dan Werthimer, UC Berkeley Space Sciences Lab
9 If You Build It, They Will Come
Current and Soon to Be Users
- BWRC: ASIC/SOC emulation, Cognitive Radio Algorithm Exploration, PicoRadio simulation, LDPC simulation, EM Antenna Simulation
- SSL, UC Radio Astronomy Lab: SETI, Allen Telescope Array
- GSRC: Home Media Gateway
- RAMP (UCB, Stanford, UW, UT Austin, CMU, MIT, Intel): Multiprocessor Emulation
- Bob Conn / Research Triangle Inst.: SPICE Circuit Simulation
- Rob Rutenbar / CMU: Speech Recognition
- Stanford BioInformatics Group: Biological signaling research
- Chris Dick, Kees Vissers / Xilinx: Signal/Media Processing
- Microsoft Research / Chuck Thacker: Computer System Research
- ST Microelectronics
- Widespread interest and dozens of other requests.
10 Lots of BEE2s
- Hardware
- Module in production use, JPL Deep Space Network (Barstow, CA)
- 10 modules in test/bring-up
- Currently allocated to BWRC, SSL, RAL, Xilinx
- Working with SAE Materials to move from prototype to turn-key production and move production management away from BWRC
- Production of another 25 modules underway
- Gateware/Software
- Linux port
- Simulink/Xilinx-EDA integration for automatic compilation to bit-files
- Several Radio Astronomy applications (with ADC interface) complete (spectrometers, correlators)
- Board-support package close to release (test suites, docs, app notes)
- First BEE2 users hands-on workshop January 17-19.
11 SOCRE (System-on-Chip Realtime Emulation)
Brian Richards, Pierre Droz, Chen Chang, Bob Brodersen
- Builds on success with BEE "Chip in a Day"
- Extends to systems-on-chip
- Heterogeneous mixed-signal systems
- Processor cores, analog blocks
- Uses BEE2 platform for system emulation
- (Challenging issues in analog block emulation)
Could we place cell phone calls on the BEE2?
12 SOCRE Demonstration for DARPA
Correlator for Image Formation
Custom XMAC chips
- Computing requirements grow with the square of the array size.
- Representative of many signal processing tasks.
- Simple control processor for diagnostics and control.
- Demonstrates design flow on BEE2 and 90nm node.
In collaboration with Dan Werthimer, UCB SSL
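The quadratic growth mentioned above can be made concrete with a small Python sketch. It assumes the correlator cross-multiplies every pair of array elements, which is the usual arrangement for imaging arrays but is not spelled out on the slide. The function name `baseline_count` and its `include_autos` flag are hypothetical.

```python
def baseline_count(n_antennas, include_autos=False):
    """Number of antenna pairs a full cross-correlator must process.

    Assumes every distinct pair is correlated: N * (N - 1) / 2 pairs,
    plus N autocorrelations if include_autos is set.
    """
    if n_antennas < 0:
        raise ValueError("n_antennas must be non-negative")
    cross = n_antennas * (n_antennas - 1) // 2
    return cross + n_antennas if include_autos else cross
```

Under this assumption, doubling the array from 32 to 64 elements raises the pair count from 496 to 2016, roughly a fourfold increase in multiply-accumulate work per sample.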
13 Research Accelerator for Multiple Processors
- Problem
- Compilers, operating systems, and architectures are not ready for 1000s of CPUs per chip, but that's where we're headed
- How do we do research on 1000-CPU systems in compilers, OS, architecture?
- Solution
- Create a 1000-CPU system from 40 FPGAs
- Distribute an out-of-the-box Massively Parallel Processor that runs standard binaries of OS and applications to all major research institutes in the US
- $500K committed by Xilinx
- NSF Infrastructure grant under review
Core Team: D. Patterson, J. Wawrzynek, J. Rabaey (UCB), J. Hoe (CMU), D. Chiou (UT Austin), C. Kozyrakis (Stanford), K. Asanovic (MIT), M. Oskin (U Wash.), S. Lu (Intel)
14 RAMP as a Multiprocessing Watering Hole
Parallel file system
Dataflow language/computer
Data center in a box
Thread scheduling
Internet in a box
Security enhancements
Multiprocessor switch design
Router design
Compile to FPGA
Fault insertion to check dependability
Parallel languages
- RAMP as next Standard Research Platform? (e.g., VAX/BSD Unix in 1980s)
- RAMP attracts many communities to a shared artifact → Cross-disciplinary interactions → Accelerate innovation in multiprocessing
15 RAMP Design Framework
- Question: How do we get contributing developers from across the country to work independently on CPUs, network interfaces, memory systems, etc.?
- Answer: RAMP Description Language (RDL)
- Defines and supports standard module interfaces and execution model.
- Supports both cycle-accurate emulation of detailed parameterized machine models and rapid functional-only emulations
- Carefully accounts for target clock cycles
- Units can be written in any hardware design language (will work with Verilog, VHDL, BlueSpec, C, ...)
- RDL used to describe plumbing to connect units
16 RAMP Description Language
Greg Gibeling, Andrew Schultz, Krste Asanovic
- Design composed of units that send messages over channels via ports
- Units (> 10,000 gates)
- e.g., CPU + L1 cache, DRAM controller.
- Channels (FIFO)
- Lossless, point-to-point, unidirectional, in-order message delivery
Similar to process network models (KPN, SCORE, Click/Cliff) with explicit management of target clock cycles.
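To make the unit/channel vocabulary concrete, here is a minimal Python sketch of a channel with the properties listed above (lossless, point-to-point, unidirectional, in-order) and an explicit latency counted in target clock cycles. This is not RDL syntax and not the authors' implementation. The `Channel` class, its `latency` and `capacity` parameters, and the `simulate` loop are hypothetical names chosen for illustration.

```python
from collections import deque

class Channel:
    """Point-to-point, unidirectional, in-order FIFO (hypothetical model).

    `latency` is measured in target clock cycles, not host time.
    """

    def __init__(self, latency=1, capacity=4):
        self.latency = latency
        self.capacity = capacity
        self._fifo = deque()  # entries are (ready_cycle, message)

    def can_send(self):
        return len(self._fifo) < self.capacity

    def send(self, msg, now):
        # Lossless: refuse rather than drop when the channel is full.
        if not self.can_send():
            raise RuntimeError("channel full")
        self._fifo.append((now + self.latency, msg))

    def recv(self, now):
        """Return the oldest message if it has arrived by cycle `now`."""
        if self._fifo and self._fifo[0][0] <= now:
            return self._fifo.popleft()[1]
        return None

def simulate(n_msgs=3, latency=2):
    """One producer unit and one consumer unit joined by one channel."""
    ch = Channel(latency=latency)
    pending = list(range(n_msgs))
    received = []  # (target_cycle, message)
    cycle = 0
    while len(received) < n_msgs:
        msg = ch.recv(cycle)            # consumer unit acts first
        if msg is not None:
            received.append((cycle, msg))
        if pending and ch.can_send():   # producer unit sends one message
            ch.send(pending.pop(0), cycle)
        cycle += 1
    return received
```

Because arrival is decided by the target cycle stamp rather than by how fast the host runs each unit, the recorded arrival cycles are the same on every run. That separation of target time from host time is what "explicit management of target clock cycles" is taken to mean in this sketch.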
17 Smart Home Gateway: The Challenge
- New devices are entering the home environment at an increasing rate, often effectively replacing older ones (VCR → DVD).
- Standards are proliferating: communication, recording and playback, display
- Devices do not interconnect
- Control is a nightmare
18 Dealing with the Myriad of Protocols and Formats
Put the Intelligence in the Network: Smart Home Routers
Home routers provide on-the-fly protocol conversion and transcoding based on properties of source and destination devices
Courtesy SIA-MARCO GSRC
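One way to picture conversion "based on properties of source and destination devices" is a router that compares device profiles and plans only the steps that are needed. The Python sketch below is purely illustrative: the profile fields (`protocol`, `codec`, `protocols`, `codecs`), the function `plan_conversion`, and the example devices are hypothetical and do not describe the GSRC design. Picking the alphabetically first acceptable target is an arbitrary tie-break.

```python
def plan_conversion(source, sink):
    """Return the conversion steps needed to play `source` on `sink`.

    source: dict with the 'protocol' and 'codec' the device emits.
    sink:   dict with the sets of 'protocols' and 'codecs' it accepts.
    """
    steps = []
    if source["protocol"] not in sink["protocols"]:
        target = sorted(sink["protocols"])[0]  # arbitrary tie-break
        steps.append(("convert_protocol", source["protocol"], target))
    if source["codec"] not in sink["codecs"]:
        target = sorted(sink["codecs"])[0]     # arbitrary tie-break
        steps.append(("transcode", source["codec"], target))
    return steps

# Hypothetical device profiles, for illustration only.
camera = {"protocol": "ieee1394", "codec": "mpeg2"}
display = {"protocols": {"ethernet"}, "codecs": {"mpeg2", "mpeg4"}}
handheld = {"protocols": {"ethernet", "wifi"}, "codecs": {"mpeg4"}}
```

With these profiles, sending the camera feed to the display needs only a protocol conversion, while sending it to the handheld also needs a transcode step. A real router would also have to weigh bandwidth, resolution, and available compute.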
19 The Reconfigurable Home Gateway
- Research Opportunities
- Unique high-efficiency codec development
- Reconfigurable and power-aware processor architectures (targeted high-performance, high-computational density)
- Techniques for plug-and-play (discovery, transcoding, etc.)
- High-level control architecture (new level of abstraction to allow feedback to user in a device-independent way)
- Adaptive wireless, soft radios
- User-aware adaptation (e.g., baby-cam feed follows you around the house, audio sweet spot automatically adjusts)
- Protocols, including encryption, compression
20 Reconfigurability is Key
- Adaptation to changing requirements
- New devices with new codecs constantly being added: need true plug-and-play.
- Residents make minute-to-minute system configuration changes: system resources must be reallocated as needed.
- Reconfigurable devices offer the high computational density (ASIC-level performance) needed to efficiently process HD video, etc.
Current work uses off-the-shelf FPGA development boards. New work involves design of novel reconfigurable architectures.
Xilinx XUP Board
21 Current Status
Dan Burke, Chris Baker, Stanley Chen, Yury
Markovskiy, Kaushik Ravindran, Ken Lutz, Jan
Rabaey