Title: Krste Asanovic
1. SMASH: The C++ Layer
- Krste Asanovic
- krste_at_mit.edu
- MIT Computer Science and Artificial Intelligence Laboratory
- http://cag.csail.mit.edu/scale
- RAMP Retreat, UC Berkeley
- January 11, 2007
2. SMASH: SiMulation And SyntHesis
- Goal
  - One framework for both architectural exploration and chip design and verification
- Approach
  - High-level design discipline where the design is expressed as a network of transactors (transactional actors)
  - Transactors (aka units) refined down to RTL implementations
  - Design structure preserved during refinement
- From my perspective, RDL and RAMP are pieces of SMASH
3. Transactor Anatomy
- Transactor unit comprises:
  - Architectural state (registers and RAMs)
  - Input queues and output queues connected to other units
  - Transactions (guarded atomic actions on state and queues)
  - Scheduler (selects next ready transaction to run)
[Figure: a transactor, showing its input queues, transactions, scheduler, and output queues]
- Advantages
  - Handles non-deterministic inputs
  - Allows concurrent operations on mutable state within unit
  - Natural representation for formal verification
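The anatomy above can be sketched in a few lines of standalone C++. This is only an illustration and not SMASH code: the `Queue`, `Transaction`, and `Accumulator` names are hypothetical, the unit (sum non-zero inputs, flush the sum on a zero) is made up for the example, and the scheduler is assumed to be a simple fixed-priority one that fires at most one transaction per tick.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <initializer_list>
#include <vector>

// Hypothetical bounded FIFO standing in for an input or output queue.
template <typename T>
class Queue {
 public:
  explicit Queue(std::size_t capacity) : capacity_(capacity) {}
  bool enqRdy() const { return items_.size() < capacity_; }
  bool deqRdy() const { return !items_.empty(); }
  void enq(const T& v) { assert(enqRdy()); items_.push_back(v); }
  const T& first() const { assert(deqRdy()); return items_.front(); }
  void deq() { assert(deqRdy()); items_.pop_front(); }
 private:
  std::size_t capacity_;
  std::deque<T> items_;
};

// A transaction is a guard plus an atomic action on state and queues.
struct Transaction {
  std::function<bool()> guard;
  std::function<void()> action;
};

// Hypothetical unit: sums non-zero inputs; a zero input flushes the sum.
class Accumulator {
 public:
  Accumulator() : in(4), out(4), sum_(0) {
    // Transaction 1: absorb a non-zero input into the running sum.
    xacts_.push_back({
        [this] { return in.deqRdy() && in.first() != 0; },
        [this] { sum_ += in.first(); in.deq(); }});
    // Transaction 2: a zero input sends the sum to the output queue.
    xacts_.push_back({
        [this] { return in.deqRdy() && in.first() == 0 && out.enqRdy(); },
        [this] { out.enq(sum_); sum_ = 0; in.deq(); }});
  }
  // The lambdas capture `this`, so the unit must not be copied.
  Accumulator(const Accumulator&) = delete;
  Accumulator& operator=(const Accumulator&) = delete;

  // Scheduler: fire the first transaction whose guard holds, if any.
  bool tick() {
    for (auto& x : xacts_) {
      if (x.guard()) { x.action(); return true; }
    }
    return false;
  }

  Queue<int> in;   // input queue
  Queue<int> out;  // output queue

 private:
  int sum_;                         // architectural state
  std::vector<Transaction> xacts_;  // transactions, in priority order
};

// Feeds 3, 4, 0 and runs the unit until no transaction is ready.
std::vector<int> run_accumulator_demo() {
  Accumulator acc;
  for (int v : {3, 4, 0}) acc.in.enq(v);
  while (acc.tick()) {}
  std::vector<int> results;
  while (acc.out.deqRdy()) {
    results.push_back(acc.out.first());
    acc.out.deq();
  }
  return results;
}
```

Because each guard checks both the input data and the output queue's readiness before its action runs, an action never blocks halfway through, which is what makes the transaction atomic from the rest of the system's point of view.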
4. RAMP Design Framework Overview
- Target System: the machine being emulated
  - Describe structure as transactor netlist in RAMP Description Language (RDL)
  - Describe behavior of each leaf unit in favorite language (Verilog, VHDL, Bluespec, C/C++, Java)
- Host Platforms: systems that run the emulation or simulation
  - Can have part of target mapped to FPGA emulation and part mapped to software simulation
SMASH/C++ is the way to write leaf units in C++, either for use with RDL or for standalone C++ simulation
5. What's in SMASH/C++?
- A C++ class library plus conventions for writing transactor leaf units
  - These should work within an RDL-generated C++ harness
- In addition, libraries for channels, configuration, and parameter passing code to support standalone C++ elaboration and simulation
- Also, can convert HDL modules into C++ units for co-simulation
  - Verilog -> C++ using either Verilator or Tenison VTOC
  - Bluespec -> C++ using Bluespec Csim
6. Why C++? I thought RAMP was FPGAs?
- Initial design in C++, eventually mapped into RTL
  - Much faster to spin C++ design than to spin FPGA design
- Hardware verification needs golden model
- Some units might only ever be software
  - Power/temperature models
  - Disk models
7. SMASH/C++ Code Example: Leaf Unit
struct Increment : public smash::IUnit_LeafImpl
{
  // Parameters
  static const smash::Parameter<int> inc_amount;

  // Port functions
  smash::InputPort<IntMsg>&  in()  { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    m_inc = plist.get( Increment::inc_amount, 1 );
  }

  bool tick()
  {
    if ( xactInc() )
      return true;
    else if ( xactBumpInc() )
      return true;
    return false;
  }

 private:
  // Ports
  smash::InputPort<IntMsg>  m_in;
  smash::OutputPort<IntMsg> m_out;

  // Private state
  int m_inc;

  // Private transactions
8. Example Leaf Unit: Transactions
  // Forward a non-zero message, incremented by m_inc
  bool xactInc()
  {
    bool xactIncFired = m_in.deqRdy() && m_out.enqRdy()
                        && (m_in.first() != 0);
    if ( !xactIncFired )
      return false;

    m_out.enq( m_in.first() + m_inc );
    m_in.deq();
    return true;
  }

  // A zero message bumps the increment amount instead of being forwarded
  bool xactBumpInc()
  {
    bool xactBumpIncFired = m_in.deqRdy() && (m_in.first() == 0);
    if ( !xactBumpIncFired )
      return false;

    m_inc += 1;
    m_in.deq();
    return true;
  }
};
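To see what these two transactions do to a message stream, here is a standalone sketch that mimics the unit without the SMASH library. It is not the SMASH implementation: `IncrementSketch` and `trace_increment` are hypothetical names, `std::deque` stands in for the port classes (so output back-pressure is not modeled), and it assumes the operators lost from the slide text were `+ m_inc` in `xactInc` and `+= 1` in `xactBumpInc`.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Stand-in for the slide's Increment unit, using plain deques as ports.
struct IncrementSketch {
  std::deque<int> in;   // assumption: unbounded input queue
  std::deque<int> out;  // assumption: unbounded output queue
  int inc = 1;          // default increment, as in plist.get(..., 1)

  // Guard: a non-zero message is waiting. Action: forward it plus inc.
  bool xactInc() {
    if (in.empty() || in.front() == 0) return false;
    out.push_back(in.front() + inc);
    in.pop_front();
    return true;
  }

  // Guard: a zero message is waiting. Action: bump the increment.
  bool xactBumpInc() {
    if (in.empty() || in.front() != 0) return false;
    inc += 1;
    in.pop_front();
    return true;
  }

  // Same fixed priority as the slide: try xactInc, then xactBumpInc.
  bool tick() { return xactInc() || xactBumpInc(); }
};

// Runs the unit over a list of inputs and returns everything it emitted.
std::vector<int> trace_increment(const std::vector<int>& inputs) {
  IncrementSketch unit;
  unit.in.assign(inputs.begin(), inputs.end());
  while (unit.tick()) {}
  // Every message is either zero or non-zero, so one guard always holds
  // while input remains; the loop can only stop on an empty input queue.
  assert(unit.in.empty());
  return std::vector<int>(unit.out.begin(), unit.out.end());
}
```

Under these assumptions, the input stream 1, 2, 0, 3, 4, 0, 1, 2, 3, 4 yields 2, 3, 5, 6, 4, 5, 6, 7: each zero is consumed silently and raises the increment applied to the messages that follow it.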
9. SMASH/C++ Example: Structural Unit
struct IncPipe : public smash::IUnit_StructuralImpl
{
  // Port functions
  smash::InputPort<IntMsg>&  in()  { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    // Register ports, child units, and channels by name
    regPort   ( "in",      m_in      );
    regUnit   ( "incA",    m_incA    );
    regChannel( "inc2inc", m_inc2inc );
    regUnit   ( "incB",    m_incB    );
    regPort   ( "out",     m_out     );
    elaborateChildUnits( plist );

    // Connect child units and channels
    smash::connect( m_in, m_incA.in() );
    smash::connect( m_incA.out(), m_inc2inc, m_incB.in() );
    smash::connect( m_incB.out(), m_out );
  }

 private:
  // Ports
  smash::InputPort<IntMsg>  m_in;
  smash::OutputPort<IntMsg> m_out;

  // Child units and channels
  Increment m_incA;
  Increment m_incB;
  smash::SimpleChannel<IntMsg> m_inc2inc;
};
[Figure: IncPipe structure: InputPort in -> Incrementer incA -> SimpleChannel inc2inc -> Incrementer incB -> OutputPort out]
10. SMASH/C++ Example: Simulation Loop
int main( int argc, char* argv[] )
{
  // Toplevel channels and unit
  smash::SimpleChannel<IntMsg> iChannel("iChannel",32,3,7);
  smash::SimpleChannel<IntMsg> oChannel("oChannel",32,3,7);
  IncPipe incPipe;
  incPipe.setName("top");

  // Set some parameters and elaborate the design
  smash::ParameterList plist;
  plist.set("top.incB",Increment::inc_amount,2);
  plist.set("top.inc2inc",smash::SimpleChannel<IntMsg>::bandwidth,32);
  plist.set("top.inc2inc",smash::SimpleChannel<IntMsg>::latency,3);
  plist.set("top.inc2inc",smash::SimpleChannel<IntMsg>::buffering,7);
  incPipe.elaborate(plist);

  // Connect the toplevel channels to the toplevel unit
  smash::connect( iChannel, incPipe.in() );
  smash::connect( incPipe.out(), oChannel );

  // Simulation loop
  int testInputs[] = { 1, 2, 0, 3, 4, 0, 1, 2, 3, 4 };
  int inputIndex = 0;
  for ( int cycle = 0; cycle < 20; cycle++ )
  {
    if ( iChannel.enqRdy() && (inputIndex < 10) )
      iChannel.enq( IntMsg(testInputs[inputIndex++]) );

    if ( oChannel.deqRdy() )
    {
      std::cout << oChannel.first() << std::endl;
      oChannel.deq();
    }

    incPipe.tick();   // Hierarchical tick
    iChannel.tick();  // Always tick units before channels
    oChannel.tick();
  }
}
[Figure: simulation setup: iChannel -> IncPipe top (Incrementer incA -> SimpleChannel inc2inc -> Incrementer incB) -> oChannel]
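The slides do not show how a channel is implemented. As a rough illustration of why a loop might tick units before channels, the sketch below models a channel whose messages become visible only after a fixed number of `tick()` calls. Everything here is an assumption: `LatencyChannel` and `arrival_cycles` are hypothetical names, only latency and buffering are modeled (bandwidth is ignored), and the real SimpleChannel semantics may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical channel: a message enqueued during cycle t becomes visible
// to the receiver once tick() has been called `latency` more times.
class LatencyChannel {
 public:
  LatencyChannel(int latency, std::size_t buffering)
      : latency_(latency), buffering_(buffering), now_(0) {}

  bool enqRdy() const { return slots_.size() < buffering_; }
  bool deqRdy() const {
    return !slots_.empty() && slots_.front().ready_at <= now_;
  }
  void enq(int v) {
    assert(enqRdy());
    slots_.push_back({v, now_ + latency_});
  }
  int first() const { assert(deqRdy()); return slots_.front().value; }
  void deq() { assert(deqRdy()); slots_.pop_front(); }

  // End of cycle: time advances, so in-flight messages move one step.
  void tick() { ++now_; }

 private:
  struct Slot { int value; long ready_at; };
  int latency_;
  std::size_t buffering_;
  long now_;
  std::deque<Slot> slots_;
};

// Sends num_msgs messages, one per cycle, and returns the cycle in which
// each one is seen at the receiving end.
std::vector<int> arrival_cycles(int latency, int num_msgs, int num_cycles) {
  LatencyChannel ch(latency, 7);
  std::vector<int> arrivals;
  int sent = 0;
  for (int cycle = 0; cycle < num_cycles; ++cycle) {
    // Unit phase: consumer and producer act on the current channel state.
    if (ch.deqRdy()) { arrivals.push_back(cycle); ch.deq(); }
    if (sent < num_msgs && ch.enqRdy()) { ch.enq(sent); ++sent; }
    // Channel phase: only now does the channel's notion of time advance.
    ch.tick();
  }
  return arrivals;
}
```

In this model every unit sees the same channel state within a cycle regardless of the order in which units are ticked, because nothing becomes visible until the channel itself ticks. That ordering independence is one plausible motivation for the "units before channels" rule, and it is also the kind of property that would help when units run on parallel hosts.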
11. Why didn't we just use SystemC?
- If you're asking, you haven't read the SystemC standard
  - Ugly semantics
  - Too many ways of doing the same thing
- Fundamental assumption is that host is sequential
  - SMASH/C++ designed to support parallel hosts
  - Even worse, simulator is a global object (can't have two engines in one executable)
- In industry, architects use SystemC, hardware designers ignore it when building chips
12. Issues
- Need to figure out flexible type system and bindings from RDL into C++/Bluespec/Verilog
- Need to figure out common (across C++/RDL/Bluespec) interfaces/syntax for:
  - Elaboration
  - Configuration
  - Debugging
  - Monitoring/Tracing