Title: Access Regulation to Hot-Modules in Wormhole NoCs
1. Access Regulation to Hot-Modules in Wormhole NoCs
Or Hot-Modules, Cool NoCs
- Isaskhar (Zigi) Walter
- Supervised by
- Israel Cidon, Ran Ginosar and Avinoam Kolodny
Technion Israel Institute of Technology
2. Hot-Modules
- NoC is designed and dimensioned to meet QoS requirements
- Buffer sizing, routing, router arbitration, link capacities, ...
- NoC designers cannot tune everything
- Modules typically have limited capacity
- Highly demanded, bandwidth-limited modules create edge bottlenecks
- In SoCs, often known in advance
- Off-chip DRAM, on-chip special-purpose processor
- System performance is strongly affected
- Even if the NoC has infinite bandwidth
3. Hot Module (HM) in NoC
- Wormhole, BE NoC
- At high hot-module utilization, multiple worms get stuck in the network
- Two problems arise
- System performance
- Source fairness
[Figure: IP (HM) attached to the NoC through its interface]
4. Hot Module Affects the System
[Figure: Problem 1. IP1 (HM), IP2, and IP3 attached to the NoC through interfaces]
- HM is not a local problem. Traffic not destined for the HM suffers too!
5. Source Fairness Problem
[Figure: Problem 2. The HM and its interface]
- Multiple locally fair decisions ≠ global fairness
- The limited, expensive HM resource isn't fairly shared
6. Our Approach
- Problem is not caused by the NoC
- But rather by a congested end-point
- Solution should address the root cause
- Not the symptoms
- Utilize existing NoC infrastructure
- Solve both problems
- Simple and efficient
7. Hot Module Congestion
- During congested periods, sources should not inject packets towards the HM
- They will experience increased delay anyway
- Better to wait at the source, not in the network
- Keep routers unmodified!
8. HM Allocation Control Basics
[Figure: IP1, IP3, IP4, and the hot module IP2 (HM) attached to the NoC through interfaces, with an Allocation Controller]
9. HM Allocation Control Basics
[Figure: the same system, IP1, IP3, IP4, and IP2 (HM) attached to the NoC through interfaces, with control traffic to the Allocation Controller]
11. HM Control Packets
[Figure: credit request packet and credit reply packet]
- The HM controller receives all requests and can employ any scheduling policy
- Control packets are sent using a high service level
- Bypassing (blocked) data packets!
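The request/reply exchange can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presented design: the names `CreditRequest`, `CreditReply`, and `HMController`, the `flits` field, the FIFO policy, and the one-transfer-at-a-time rule are all hypothetical choices made for the example (the slides allow any scheduling policy).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CreditRequest:
    source: int  # id of the requesting module (assumed field)
    flits: int   # amount of data the source wants to send (assumed field)


@dataclass
class CreditReply:
    source: int  # module that is granted access
    flits: int   # amount of credit granted


class HMController:
    """Grants access to the hot module, one transfer at a time (assumed)."""

    def __init__(self):
        self.pending = deque()  # requests waiting for a grant, FIFO order
        self.busy = False       # True while a granted transfer is in flight

    def on_request(self, request):
        """Queue a request; return a reply if it can be granted right away."""
        self.pending.append(request)
        return self._try_grant()

    def on_transfer_done(self):
        """Called when the granted data has fully arrived at the HM."""
        self.busy = False
        return self._try_grant()

    def _try_grant(self):
        if self.busy or not self.pending:
            return None
        request = self.pending.popleft()
        self.busy = True
        return CreditReply(request.source, request.flits)
```

A source that gets no reply simply holds its data packet locally, so the waiting happens at the source rather than inside the network.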
12. Multiple Priority Router
[Figure: control packets in a multiple-priority router]
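How a higher service level lets control packets bypass blocked data packets can be illustrated with a simplified output port. This is only a sketch: real wormhole routers arbitrate per flit and keep separate buffers per service level, whereas the hypothetical `PriorityOutputPort` below works on whole packets, and the `CONTROL`/`DATA` encoding and the `data_blocked` flag are assumptions of the example.

```python
from collections import deque

CONTROL, DATA = 0, 1  # assumed encoding: lower number = higher service level


class PriorityOutputPort:
    """One output port with a separate queue per service level."""

    def __init__(self):
        self.queues = {CONTROL: deque(), DATA: deque()}

    def enqueue(self, level, packet):
        self.queues[level].append(packet)

    def next_packet(self, data_blocked=False):
        """Pick the next packet to forward.

        Control packets always go first, and they keep flowing even when
        data packets are blocked downstream (for example by a busy HM).
        """
        if self.queues[CONTROL]:
            return self.queues[CONTROL].popleft()
        if self.queues[DATA] and not data_blocked:
            return self.queues[DATA].popleft()
        return None
```

Because the control level has its own queue, a credit request queued behind a stalled data worm is still delivered.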
13. Enhanced Request Packet
- The request may include additional data as needed
- Payload's priority, deadline, expiration time, etc.
[Figure: credit request packet with optional fields]
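One way to picture the optional fields is a record whose extra attributes default to "absent". The sketch below is hypothetical: the class `EnhancedCreditRequest`, its field names, the use of cycle counts, and the earliest-deadline-first policy are assumptions chosen to show how a controller could use such fields, not the format used in the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnhancedCreditRequest:
    source: int                       # requesting module (assumed field)
    flits: int                        # payload length (assumed field)
    priority: Optional[int] = None    # optional: payload priority
    deadline: Optional[int] = None    # optional: cycle by which data is needed
    expiration: Optional[int] = None  # optional: cycle after which request is void


def earliest_deadline_first(requests, now):
    """Example policy: drop expired requests, then serve by deadline.

    Requests without a deadline are served last, in arrival order.
    """
    live = [r for r in requests if r.expiration is None or r.expiration > now]
    return sorted(
        live,
        key=lambda r: (r.deadline is None, r.deadline if r.deadline is not None else 0),
    )
```

A plain request that sets none of the optional fields still works, so the basic protocol is a special case of the enhanced one.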
14. HM Allocation Controller
- The HM Allocation Controller is customized according to the system's requirements
[Figure: HM Access Controller block diagram: Credit Requests, Requests Decoder, Pending Requests Table, Local Arbiter, Reply Encoder, Credit Replies]
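The blocks named in the diagram can be mirrored in a small software model. This is a sketch under assumptions: the tuple and dictionary packet formats, the method names, and the one-entry-per-source table are hypothetical, and round-robin is used as the local arbiter only because it is the policy named for the synthetic results.

```python
class AllocationController:
    """Software model of the decoder, pending table, arbiter, and encoder."""

    def __init__(self, num_sources):
        self.num_sources = num_sources
        self.pending = {}        # pending requests table: source -> flits
        self.last_granted = -1   # round-robin pointer

    def decode_request(self, packet):
        """Requests decoder: a packet is assumed to be a (source, flits) tuple."""
        source, flits = packet
        self.pending[source] = flits

    def arbitrate(self):
        """Local arbiter: scan sources round-robin, starting after the last grantee."""
        for offset in range(1, self.num_sources + 1):
            source = (self.last_granted + offset) % self.num_sources
            if source in self.pending:
                flits = self.pending.pop(source)
                self.last_granted = source
                return self.encode_reply(source, flits)
        return None  # nothing pending

    def encode_reply(self, source, flits):
        """Reply encoder: build the credit reply sent back to the source."""
        return {"to": source, "granted_flits": flits}
```

Swapping the body of `arbitrate` is enough to try a different scheduling policy, which is the kind of customization the slide refers to.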
15. Further Enhancements
- Short packets are not negotiated
- Source's quota is slowly self-refreshing
- The mechanism is turned off when the network is not congested
- Crediting modules ahead of time hides request-grant latency
- For light-load periods
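The enhancements listed above could combine at the source roughly as follows. This is one possible reading, not the presented mechanism: the class `SourceRegulator`, the threshold and quota parameters, the per-cycle refresh, and the rule that pre-granted quota is spent only during congestion are all assumptions of the sketch.

```python
class SourceRegulator:
    """Source-side decision: send immediately or negotiate with the HM first."""

    def __init__(self, short_threshold, quota_max, refresh_per_cycle):
        self.short_threshold = short_threshold  # packets this short skip negotiation
        self.quota_max = quota_max              # cap on credit held ahead of time
        self.refresh = refresh_per_cycle        # slow self-refresh rate (flits/cycle)
        self.quota = quota_max                  # credit currently available

    def tick(self, cycles=1):
        """Let the quota refresh itself, up to its cap."""
        self.quota = min(self.quota_max, self.quota + self.refresh * cycles)

    def decide(self, flits, congested):
        """Return 'send' to inject now, or 'request' to ask the HM controller."""
        if not congested:
            return "send"      # mechanism is off when the network is not congested
        if flits <= self.short_threshold:
            return "send"      # short packets are not negotiated
        if flits <= self.quota:
            self.quota -= flits
            return "send"      # credit held ahead of time hides request-grant latency
        return "request"
```

Only the last branch costs a request-grant round trip, so lightly loaded sources rarely pay the negotiation latency.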
16. Not Classic Flow-Control
- Flow-control protects the destination's buffer
- A pair-wise protocol
- HM access regulation protects the system
- Many-to-one protocol
17. Results: Synthetic Scenario
- Hotspot traffic
- All-to-one traffic with all-to-all background traffic
- High network capacity
- Limited hot-module bandwidth
- HM controller arbitration: round-robin
18. System Performance
[Figure: average packet latency; annotations: x30, x10]
19. Hot vs. Non-Hot Module Traffic
[Figure: average packet latency; annotation: x40]
Using regulation, non-HM traffic latency is drastically reduced
20. Source Fairness
[Figure: 4-by-4 grid of routers (R) with modules numbered 1 to 16]
21. Fairness in Saturated Network
[Figure: two panels, "No allocation control" and "With allocation control"]
- Simulation results for a 4-by-4 system
- Data packet length: 200 flits
- Control packet length: 2 flits
- Hot-module utilization: 99.99%
- Regulated hot-module utilization: 98.32%
22. MPEG-4 Decoder
- Real SoC
- Over-provisioned NoC
- Two hot-modules
23. Results: MPEG-4 Decoder
[Figure: two panels, "All traffic" and "HM/non-HM traffic breakdown"]
- At 80% load: x8 reduction
- At 80% load: x2 reduction
24. The HMs Are Better Utilized
[Figure: two panels, "No allocation control" and "With allocation control", showing flows destined for HM1, flows destined for HM2, and the total]
Flows to HM1: 1->HM1, 2->HM1, 3->HM1, 4->HM1, 9->HM1, 10->HM1, 11->HM1
Flows to HM2: 8->HM2, 10->HM2, 11->HM2, 12->HM2
- Without regulation, the hot-modules are only 60% utilized
- Traffic to one HM blocks the traffic to the other!
25. Hot-Module Placement
26. Summary
- Hot-modules are common in real SoCs
- Hot-modules ruin system performance and are not fairly shared
- Even in NoCs with infinite capacity
- The network intensifies the problem
- But can also provide tools for resolving it
- Simple mechanism achieves dramatic improvement
- Completely eliminating the HM effects
Hot-Modules, Cool NoCs!
27. Hot-Modules, Cool NoCs!
Thank you! Questions? zigi@tx.technion.ac.il
QNoC Research Group
28. Backup Slides
29. Wormhole Routing
- Well suited to on-chip interconnect
- Small number of buffers
- Low latency
[Figure: IP1 and IP2 connected through interfaces]
30. Router-Based Approach?
- Virtual circuit
- Fair queuing
- Dedicated queues
- Deflective routing
- Packet combining
- Packet dropping
- Backpressure (credit/rate based)
- and more
[Figure: router crossbar (X-Bar)]
- Can a router detect congested periods?
31. Router-Based Solutions?
- Routers must be fast, power and area efficient
- A few buffers
- Efficient routing
- Simple arbitration policy
- No state/flow memory
- Problem caused by end-points
- Address the root cause
[Figure: router crossbar (X-Bar)]
32. Future Work
- Dynamically set hot-modules
- Other scheduling policies at the hot-module controller
- Single/multiple control modules for multiple HMs
- Placement
34. Is This a New Problem?
- Hotspots were comprehensively studied in the past
- Classically, solutions are categorized by the mechanism policy
- Avoidance-based (frequently impossible)
- Detection-based (requires threshold tuning)
- Prevention-based (overhead during light load)
- And by the mechanism implementation
- Central arbitration
- Router-based
- End-to-end flow-control
35. Saturation (Un)Fairness
- A saturated router divides the available bandwidth equally between its inputs
[Figure: grid of routers (R), each with an attached IP module; one module is the HM]
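A small calculation shows why equal sharing at every router is not fair end to end. The sketch assumes a simplified topology that is not taken from the slides: a single chain of saturated routers leading to the HM, where each router splits its output equally between its local source and the upstream link, and the farthest router has only its local source. The function name `chain_shares` is hypothetical.

```python
from fractions import Fraction


def chain_shares(num_sources):
    """Share of HM bandwidth per source; source 0 is adjacent to the HM."""
    shares = []
    remaining = Fraction(1)  # bandwidth available at the current router's output
    for i in range(num_sources):
        if i == num_sources - 1:
            # Last router in the chain: no upstream link, its source takes the rest.
            shares.append(remaining)
        else:
            # Equal split between the local source and the upstream link.
            remaining /= 2
            shares.append(remaining)
    return shares


if __name__ == "__main__":
    for hops, share in enumerate(chain_shares(4)):
        print(f"source {hops} hops from the HM router: {share} of HM bandwidth")
```

With four sources the shares are 1/2, 1/4, 1/8, and 1/8: each router is locally fair, yet the source next to the HM gets four times the bandwidth of the farthest one.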