Title: On Controllers, Soft Connections, and Logical Topologies
1 On Controllers, Soft Connections, and Logical Topologies
Michael Pellauer (MIT CSAIL)
Angshuman Parashar, Michael Adler, Joel Emer (Intel VSSAD)
2 The Setup
- (For both our HAsim simulator and the talk)
- Virtex5 110t on HiTechGlobal PCIe accelerator
- Future: FSB-based accelerators. Larrabee?
- Use HAsim's Remote-Request-Response (RRR)
- Protocol of communication between SW/HW
- Allows calls from one to the other
[Figure: Host Processor and FPGA connected over PCIe. Example calls across the link: run program, emulate instr, translate address, dump stats]
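To make the "calls from one to the other" idea concrete, here is a minimal Python sketch of a request-response layer in which either side can register services and call the other side's. Everything in it is an assumption for illustration: the class `RRREndpoint`, the in-process "channel", the direction chosen for each call, and the placeholder return values. The real RRR runs over PCIe and its API is not shown in this talk.

```python
class RRREndpoint:
    """One side (software or hardware) of a hypothetical request-response link."""

    def __init__(self, name):
        self.name = name
        self.services = {}   # service name -> handler function
        self.peer = None     # the endpoint on the other side of the channel

    def register(self, service, handler):
        self.services[service] = handler

    def call(self, service, *args):
        # A remote request: look up the handler on the peer, return its response.
        if service not in self.peer.services:
            raise KeyError(f"{self.peer.name} offers no service {service!r}")
        return self.peer.services[service](*args)


def connect(a, b):
    a.peer, b.peer = b, a


host = RRREndpoint("host")
fpga = RRREndpoint("fpga")
connect(host, fpga)

# One service on each side (directions and values are placeholders).
fpga.register("dump_stats", lambda: {"cycles": 100})
host.register("translate_address", lambda vaddr: vaddr + 0x1000)

stats = host.call("dump_stats")               # software calls hardware
paddr = fpga.call("translate_address", 0x20)  # hardware calls software
```

The point of the sketch is only the symmetry: both sides use the same `register`/`call` pair, so neither is fixed as the master.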
3 The Problem of the Day
- Just because you can talk doesn't mean you have anything interesting to say!
- We must control higher-level interactions between software and hardware
- Example: Dump Stats command
- Transmit requests intra-FPGA, aggregate responses
- Future: think about multiple-FPGA setup
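[Figure: inside the FPGA, a dump stats request arrives through the PCIe Interface and RRR at the Controller, which must reach modules such as Cache and Branch Pred]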
4 The HAsim Controller
- Software sees it as:
[Figure: Host Software sends run, pause, setParam, and dump stats to the Controller over RRR]
- Hardware sees it as:
[Figure: the Controller offers run, pause, setParam, dump stats, enable events, debug, and assertion fail to hardware modules]
Different modules access different services
Which modules use which service is very fluid
5 Problem: HDLs' Inflexible Interfaces
[Figure: the Simulator's hardware module instantiation hierarchy, including Branch Pred]
- Branch Predictor has a bug
- Want to send some debug info to the Controller
- Fundamental problem: HDLs allow communication only up and down the hierarchy
- Verilog OOMRs (out-of-module references) are not an acceptable solution
- Gets worse if we have alternative modules
6 Our Solution: Soft Connections
- Goal: soften rigid communication hierarchy
- Users separately instantiate named endpoints
- Can read and write as if they were half of a guarded FIFO (FI and FO)
- Instantiator's interface does not change
- Bluespec standard ModuleCollect library
[Figure: a mkSend endpoint (send()) and a mkRecv endpoint (recv()), both named "fet2dec". The connection between them is added during the Bluespec compiler's static elaboration phase]
7 Review: Static Elaboration Phase
- Inline function calls and datatypes as combinational logic
- Instantiate modules with specific parameters
- Resolve polymorphism/overloading
8 Elaboration-Time Algorithm
let (sends, recvs) = getCollection()        // Get from ModuleCollect
for each s in sends do
    let rs = matchByName(s.name, recvs)
    if rs = {} then
        if not s.optional then
            error("Unmatched Send: " + s.name)
        // an optional send with no receiver stays unconnected
    else if rs = {r} then
        connect(s, r)                       // instantiate buffering
    else
        error("Multiple Receives connected to: " + s.name)
    recvs = recvs - rs                      // remove matched recvs
for each r in recvs do
    error("Unmatched Receive: " + r.name)
Open Question: Can we do this in SystemVerilog as well?
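The matching loop on this slide can be sketched in ordinary Python. This is an illustration of the algorithm, not the Bluespec implementation: the `Send`, `Recv`, and `ElaborationError` types are hypothetical stand-ins for whatever ModuleCollect actually gathers, `connect_all` returns pairs instead of instantiating buffering, and treating an optional send with no receiver as "skip silently" is an interpretation of `s.optional`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Send:
    name: str
    optional: bool = False


@dataclass(frozen=True)
class Recv:
    name: str


class ElaborationError(Exception):
    pass


def connect_all(sends, recvs):
    """Match sends to recvs by name; return the list of (send, recv) pairs."""
    remaining = list(recvs)
    connections = []
    for s in sends:
        rs = [r for r in remaining if r.name == s.name]
        if not rs:
            if not s.optional:
                raise ElaborationError(f"Unmatched Send: {s.name}")
        elif len(rs) == 1:
            connections.append((s, rs[0]))  # the real pass instantiates buffering here
        else:
            raise ElaborationError(f"Multiple Receives connected to: {s.name}")
        remaining = [r for r in remaining if r.name != s.name]  # remove matched recvs
    for r in remaining:
        raise ElaborationError(f"Unmatched Receive: {r.name}")
    return connections


pairs = connect_all(
    [Send("fet2dec"), Send("debug_out", optional=True)],
    [Recv("fet2dec")],
)
```

Because every error is raised during this pass, a misnamed endpoint shows up at compile time rather than as a silent hang in hardware.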
9 Multicast Connections
- A one-to-many Send (broadcast)
- A many-to-one Recv (listener)
- (Now multiple recvs are no longer an error)
[Figure: one mkBcast endpoint (broadcast()) named "start_prog" connected to three mkRecv endpoints (recv()) with the same name]
10 Building 2-Way Communication
- More complex abstractions from primitives
- Client/Server
- Multicast Client/Server
[Figure: a mkClient endpoint (makeReq(), getResp()) and a mkServer endpoint (getReq(), makeResp()), both named "mem_load"]
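As a rough sketch of "more complex abstractions from primitives", a client/server pair can be built from two point-to-point connections, one per direction. The Python below is an assumption-laden model, not the Bluespec library: `Connection` stands in for a matched Send/Recv pair, the `_req`/`_resp` name suffixes are invented, and the "memory" is a placeholder that returns twice the address.

```python
from collections import deque


class Connection:
    """A named point-to-point FIFO standing in for a matched Send/Recv pair."""

    def __init__(self, name):
        self.name = name
        self.fifo = deque()

    def send(self, msg):
        self.fifo.append(msg)

    def recv(self):
        return self.fifo.popleft()


class Client:
    def __init__(self, req, resp):
        self._req, self._resp = req, resp

    def make_req(self, msg):
        self._req.send(msg)

    def get_resp(self):
        return self._resp.recv()


class Server:
    def __init__(self, req, resp):
        self._req, self._resp = req, resp

    def get_req(self):
        return self._req.recv()

    def make_resp(self, msg):
        self._resp.send(msg)


def mk_client_server(name):
    # Two one-way connections share one logical name.
    req, resp = Connection(name + "_req"), Connection(name + "_resp")
    return Client(req, resp), Server(req, resp)


client, server = mk_client_server("mem_load")
client.make_req(0x40)
addr = server.get_req()
server.make_resp(addr * 2)   # placeholder memory contents
value = client.get_resp()
```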
11 Controller Services Revisited
- Which should get which type of soft connection?
- Commands/Params
- Receive from software, send to many modules
- One-to-Many Broadcast
- Can make a nice abstraction for local commands, params
- Events/Stats
- Receive from software, send to many modules, aggregate responses
- Many-to-one Client
- Assertions/Debug
- Receive from many modules, send to software
- Many-to-one Receive
12 Case Study: span
- span(c) = number of instantiation boundaries crossed between sender and receiver
- Roughly, the pain of changing a communication path
- In HAsim, 118/217 connections are to/from the Controller
- We start to worry about the massive fan-in
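One plausible way to compute span, assuming each endpoint is identified by the instantiation path of the module that contains it: climb from the sender to the lowest common ancestor, then descend to the receiver, counting one boundary per step. The path representation and the module names below are hypothetical.

```python
def span(sender_path, receiver_path):
    """Count instantiation boundaries crossed between two modules.

    Each path is a tuple of instance names from the top-level module
    down to the module holding the endpoint.
    """
    common = 0
    for a, b in zip(sender_path, receiver_path):
        if a != b:
            break
        common += 1
    # Steps up from the sender plus steps down to the receiver.
    return (len(sender_path) - common) + (len(receiver_path) - common)


# Branch predictor nested three levels down, controller directly under top.
bpred = ("top", "cpu", "fetch", "bpred")
controller = ("top", "controller")
s = span(bpred, controller)   # 3 boundaries up, 1 down
```

With hard-wired interfaces, each of those boundaries is a module interface that would have to change to add the path.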
13 Logical Topology vs Physical Topology
- We described the logical communication topology
- Could be implemented with a different physical topology
- Could use Rings/Trees/Grids to offset massive fan-in
- Implemented Rings and Trees
- So far no improvement over physical point-to-point
- Station routing tables made at elaboration
- Connection interface does not change!
[Figure: send and recv endpoints for a connection named "foo" attached to stations on a shared physical network. Callouts: "station has an address for foo: 5"; "this station doesn't have 5"; "station has to know 5 means foo"]
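The station idea can be sketched as a ring in which each station holds a small routing table built ahead of time. This Python model is an assumption, not the HAsim implementation: addresses are assigned by sorting connection names, each station's table maps an address to a local receive callback, and a message is passed along until some station owns its address.

```python
def build_ring(n_stations, recv_sites):
    """Build routing tables, as an elaboration pass might.

    recv_sites maps a connection name to (station index, receive callback).
    Returns (stations, addresses): a list of per-station tables and a map
    from connection name to its integer address.
    """
    stations = [{} for _ in range(n_stations)]
    addresses = {}
    for addr, name in enumerate(sorted(recv_sites)):
        idx, callback = recv_sites[name]
        addresses[name] = addr
        stations[idx][addr] = callback   # this station knows addr means `name`
    return stations, addresses


def send(stations, start, addr, msg):
    """Inject msg at station `start`; forward until a station owns addr.

    Returns the number of hops taken.
    """
    n = len(stations)
    for hops in range(n):
        table = stations[(start + hops) % n]
        if addr in table:
            table[addr](msg)
            return hops
    raise LookupError(f"no station owns address {addr}")


received = []
stations, addresses = build_ring(4, {"foo": (2, received.append)})
hops = send(stations, 0, addresses["foo"], "hello")
```

The endpoints still see only send and recv; only `build_ring` and `send` know that a ring sits underneath.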
14 Takeaways
- FPGA-as-accelerator model is rapidly maturing
- The FPGA-as-raw-fabric model is not ideal
- Something like HAsim's Controller helps
- Coordinates interaction between FPGA/SW
- Need different hardware-design techniques for FPGA accelerators
- More flexibility needed: reconfigurations are common
- Soft Connections bring flexibility to interfaces
- Make it easier to have a fluid set of modules which interact with the controller
- Logical topology ≠ Physical topology
- Designer needs help with both
15 Thank You!
- pellauer_at_csail.mit.edu
17 The Controller's Services
- Commands
- Receive start or pause from software
- Controller distributes to all interested hardware modules
- Params
- Receive dynamic command line values
- Controller distributes to interested hardware modules
- Events
- Software can enable, disable
- Controller aggregates, sends to software
- Stats
- Software requests dump periodically
- Controller passes on request, aggregates responses
- Assertions
- Controller passes failures on to software
- Debug
- Controller passes info on to software
18 Making Gateware More Like Software
- Ultimately we want many distributed services throughout the FPGA talking to software
- They communicate at different rates
- It makes sense for the variable/rare services to share the same interconnect on the FPGA
- Flexibility of communication → easier development
- Today: development plan and issues
19 Review: Soft Connections
- Point-to-Point
- Smart Synthesis Boundaries
[Figure: mkSend (send()) and mkRecv (recv()) endpoints named "fet2dec", linked by try_xfer() / xfer_ack(). At a synthesis boundary, module mkB exposes outgoing ports (outg), one of which is registered with addDanglingSend(mkB.outg3, fet2dec, Inst). Compiler log: Dangling Send fet2dec 3 Inst]
20 Proposed Primitive: One-To-Many
[Figure: one mkBcast endpoint (broadcast()) named "start_prog" feeding four mkRecv endpoints (recv()) with the same name]
Rules inserted for the four receivers:
rule: when (r0 == 0) try_xfer(q.first()); if (ack) r0 <= 1
rule: when (r1 == 0) try_xfer(q.first()); if (ack) r1 <= 1
rule: when (r2 == 0) try_xfer(q.first()); if (ack) r2 <= 1
rule: when (r3 == 0) try_xfer(q.first()); if (ack) r3 <= 1
rule: when (all r == 1) all r <= 0; q.deq()
All rules and registers inserted during static elaboration (don't know how many receivers during instantiation)
- Tougher alternative: many FIFOs
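The per-receiver rules on this slide can be modeled cycle by cycle in Python. This is a behavioral sketch under assumptions, not generated hardware: each receiver is given a one-entry buffer, `ack` is modeled as "the receiver has room", and `step` fires each rule at most once per call.

```python
from collections import deque


class Broadcast:
    """One sender queue, N receivers, one 'delivered' bit per receiver."""

    def __init__(self, n_receivers):
        self.q = deque()
        self.delivered = [False] * n_receivers
        self.inboxes = [deque() for _ in range(n_receivers)]
        self.capacity = 1   # assumed depth of each receiver's buffer

    def broadcast(self, msg):
        self.q.append(msg)

    def step(self):
        """Model one cycle of the inserted rules."""
        if not self.q:
            return
        if all(self.delivered):
            # Final rule: everyone has the message, clear bits and dequeue.
            self.delivered = [False] * len(self.delivered)
            self.q.popleft()
            return
        for i, inbox in enumerate(self.inboxes):
            ack = len(inbox) < self.capacity   # receiver i has room
            if not self.delivered[i] and ack:
                inbox.append(self.q[0])
                self.delivered[i] = True


b = Broadcast(3)
b.broadcast("start_prog")
b.inboxes[1].append("busy")    # receiver 1 cannot accept yet
b.step()                       # receivers 0 and 2 take the message
after_first = list(b.delivered)
b.inboxes[1].popleft()         # receiver 1 drains its buffer
b.step()                       # receiver 1 takes the message
b.step()                       # all delivered: clear bits, dequeue
```

The delivered bits are what keep a slow receiver from causing duplicates at the fast ones, at the cost of the head of the queue waiting for the slowest receiver.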
21 Proposed Primitive: Many-to-One
[Figure: four mkSend endpoints (send()) named "debug_out" feeding one mkListener endpoint (listen())]
Rules inserted for the four senders:
rule: when (q0.notEmpty) try_xfer(q0.first(), 0); if (ack) q0.deq()
rule: when (q1.notEmpty) try_xfer(q1.first(), 1); if (ack) q1.deq()
rule: when (q2.notEmpty) try_xfer(q2.first(), 2); if (ack) q2.deq()
rule: when (q3.notEmpty) try_xfer(q3.first(), 3); if (ack) q3.deq()
All rules inserted during static elaboration (don't know IDs during instantiation)
- Is a fairness guarantee needed?
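If a fairness guarantee does turn out to be needed, round-robin arbitration is one common way to provide it. The Python sketch below is a swapped-in illustration, not something the slide proposes: the `Listener` class and its round-robin pointer are assumptions, while the sender-ID tag on each message follows the `try_xfer(q.first(), id)` form above.

```python
from collections import deque


class Listener:
    """Many senders, one receiver; each message is tagged with its sender ID."""

    def __init__(self, n_senders):
        self.queues = [deque() for _ in range(n_senders)]
        self.next_id = 0   # round-robin pointer (assumed arbitration policy)

    def send(self, sender_id, msg):
        self.queues[sender_id].append(msg)

    def listen(self):
        """Return (sender_id, msg) from the next non-empty queue, or None."""
        n = len(self.queues)
        for offset in range(n):
            i = (self.next_id + offset) % n
            if self.queues[i]:
                self.next_id = (i + 1) % n   # start after the winner next time
                return i, self.queues[i].popleft()
        return None


listener = Listener(3)
for k in range(3):
    listener.send(0, f"a{k}")   # sender 0 is chatty
listener.send(2, "c0")          # sender 2 has a single message
order = [listener.listen() for _ in range(5)]
```

Here sender 2's single message is served second rather than after all of sender 0's backlog, which is the property a fixed-priority scheme would not give.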
22 Proposed Primitive: Hub Servers
- Hub Server, Distributed Clients
- 1 Many-to-One Connection
- Reverse is many One-to-One connections
- Remove the ID and send it to the appropriate destination
[Figure: two mkClient endpoints (makeReq(), getResp()) and one mkHub Server endpoint (getReq(), makeResp()), all named "mem_load"]
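A small Python model of the hub-server shape described above: requests from all clients share one ID-tagged queue, and each response travels back on that client's own queue with the ID stripped. The class layout and the placeholder service (address plus one) are assumptions for illustration.

```python
from collections import deque


class HubServer:
    def __init__(self, n_clients):
        self.requests = deque()                               # the one many-to-one connection
        self.responses = [deque() for _ in range(n_clients)]  # one one-to-one connection per client

    # Client side
    def make_req(self, client_id, req):
        self.requests.append((client_id, req))

    def get_resp(self, client_id):
        return self.responses[client_id].popleft()

    # Server side
    def get_req(self):
        return self.requests.popleft()

    def make_resp(self, client_id, resp):
        # The ID selects the return path; only the payload is delivered.
        self.responses[client_id].append(resp)


hub = HubServer(2)
hub.make_req(1, 0x10)
hub.make_req(0, 0x20)
while hub.requests:
    cid, addr = hub.get_req()
    hub.make_resp(cid, addr + 1)   # placeholder service
r0, r1 = hub.get_resp(0), hub.get_resp(1)
```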
23 Proposed Primitive: Hub Client
- Hub Client, Distributed Servers
- 1 One-to-Many Connection
- 1 Many-to-One Connection
[Figure: one mkHub Client endpoint (broadcastReq(), getResp()) and two mkServer endpoints (getReq(), makeResp()), all named "stats_count"]
- Ability to send to individuals as well?
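The hub-client shape matches the dump-stats use case: one request goes out to every server and the responses are collected on a single return path. The Python below is a sketch under assumptions: `StatsServer`, the `"dump"` request string, and the counter values are all hypothetical, and delivery is modeled as a direct method call rather than as rules.

```python
from collections import deque


class StatsServer:
    """A distributed server holding one named counter."""

    def __init__(self, name, count):
        self.name, self.count = name, count

    def handle(self, req):
        if req == "dump":
            return (self.name, self.count)
        raise ValueError(f"unknown request {req!r}")


class HubClient:
    def __init__(self, servers):
        self.servers = servers
        self.resp_q = deque()   # the single many-to-one return path

    def broadcast_req(self, req):
        # One-to-many: every server sees the same request.
        for s in self.servers:
            self.resp_q.append(s.handle(req))

    def get_resp(self):
        return self.resp_q.popleft()


hub = HubClient([StatsServer("cache", 7), StatsServer("branch_pred", 3)])
hub.broadcast_req("dump")
stats = dict(hub.get_resp() for _ in hub.servers)
```

Sending to one server only, as the last bullet asks, would need an address on the request; this sketch does not attempt it.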