Enabling Technology for On-Chip Interconnection Networks

1
Enabling Technology for On-Chip Interconnection
Networks
  • William J. Dally, Computer Systems Laboratory,
    Stanford University
  • NOCS-1
  • May 7, 2007

2
Outline
  • Off-Chip Networks
  • Demand for On-Chip Networks
  • What is Unique about On-Chip Networks
  • Enabling Technologies
  • Circuits - set the constraints
  • Topology
  • Micro-Architecture
  • Some Open Problems

3
State of Off-Chip Networks
4
Technology Trends
BlackWidow
5
Some History
Torus Routing Chip 1985
MARS Router 1984
MDP 1991
Network Design Frame 1988
Reliable Router 1994
YARC 2006
MAP 1998
Imagine 2002
6
Some very good books
7
Summary of Off-Chip Networks
  • Topology
  • Fit to packaging and signaling technology
  • High-radix - Clos or FlatBfly gives lowest cost
  • Routing
  • Global adaptive routing balances load w/o
    destroying locality
  • Flow control
  • Virtual channels/virtual cut-through

oversimplified
8
Urgent Demand for OCINs
9
The Future is CMPs: OCINs are a Critical Component
[Figure: timeline spanning 2006 to 2015]
10
Example CMP OCIN
11
Growing Complexity of SoCs Demands an On-Chip
Interconnection Network
Avner Goren, TIEPF 2004
12
So, what's different about on-chip networks?
13
Cost, Channels, Workload are Different
  • Cost
  • Off-chip cost is channels - pins, connectors,
    cables, optics
  • On-chip cost is Si area and Power (storage and
    switches), wires plentiful
  • Drives networks with many long, wide channels,
    few buffers
  • Channel Characteristics
  • On-chip RC lines - need a repeater every 1 mm (or
    less)
  • Short distance - low latency
  • Can put logic in repeaters, motivates low-latency
    routers
  • Workload
  • CMP cache traffic
  • SoC isochronous flows
  • Design issues
  • Floorplanning
  • Different constraints motivate some surprising
    differences in design.

14
Enabling Technology is a Prerequisite
Channels, Buffers, Switches
Topology, Routing, Flow Control
Microarchitecture
15
Circuits Set Cost and Area Constraints for
Architecture
  • Can do substantially (10x-100x) better than
    default circuits

16
Channels
  • 10x to 100x power reduction
  • Equalized signaling for faster propagation and increased
    repeater distance (D&P Chapter 8, Heaton 01)
  • Elastic channels provide free buffers (Mizuno
    01)
  • Send 4-8 bits per cycle per wire (assuming 20 FO4
    cycle)

17
Buffers
  • Dense arrays (vs. Flip-Flops or Latches)
  • 1/10 area/bit
  • 1/10 power for low-swing read
  • Low-swing write 1/10 power for writes.
  • Low-swing read - can keep swing low through muxes.

18
Switches
  • Low-swing bit lines
  • Operate at channel rate
  • Reduces area and hence power
  • Equalized drive
  • Buffered crosspoints
  • Integral allocation

19
Circuits Impact Architecture
  • With standard-cell approach
  • Power is approximately evenly split between
    channels, buffers, and routers
  • With efficient circuits
  • Channels 1/30, buffers 1/3
  • Routers dominate
  • Routing >> Buffering >> Propagating
  • Motivates topologies with fewer hops, longer
    channels.
  • Just propagate bits - avoid buffering, really
    avoid routing
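The shift in the power breakdown can be checked with a little arithmetic. The sketch below takes the even three-way split and the 1/30 and 1/3 factors from the slide; it assumes router power is left unchanged, which the slide implies but does not state, and the function names are illustrative only.

```python
from fractions import Fraction

# Standard-cell baseline: power split evenly three ways (from the slide).
baseline = {
    "channels": Fraction(1, 3),
    "buffers": Fraction(1, 3),
    "routers": Fraction(1, 3),
}

# Reduction factors with efficient circuits (from the slide).
# Router power is assumed unchanged; that is an assumption of this sketch.
scale = {
    "channels": Fraction(1, 30),
    "buffers": Fraction(1, 3),
    "routers": Fraction(1),
}

def rescale(power, factors):
    """Apply a per-component reduction factor to a power breakdown."""
    return {name: power[name] * factors[name] for name in power}

def shares(power):
    """Normalize a power breakdown so the components sum to 1."""
    total = sum(power.values())
    return {name: value / total for name, value in power.items()}

efficient = rescale(baseline, scale)
new_shares = shares(efficient)

for name, share in new_shares.items():
    print(f"{name}: {float(share):.1%} of remaining network power")
print(f"total relative to baseline: {float(sum(efficient.values())):.2f}")
```

Under these assumptions routers account for 30/41 of the remaining power (about 73%), buffers for 10/41, and channels for 1/41, which is the sense in which routers dominate.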

20
Properties of these elements drive optimal
network organization
21
On-Chip Interconnection Network
System Processor Tiles
Source: Balfour and Dally, ICS 06
22
On-Chip Interconnection Network (2)
System Processor Tiles Channels
Source: Balfour and Dally, ICS 06
23
On-Chip Interconnection Network (3)
System Processor Tiles Channels Routers
Source: Balfour and Dally, ICS 06
24
Router Architecture
  • Input-queued
  • Virtual Channel
  • Speculative Pipeline

Source: Balfour and Dally, ICS 06
25
Router Area
Accurate modeling requires floorplan
Source: Balfour and Dally, ICS 06
26
Torus
Source: Balfour and Dally, ICS 06
27
Concentrated Mesh
Source: Balfour and Dally, ICS 06
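A rough way to see what concentration buys: sharing one router among several tiles shrinks the mesh, which lowers the hop count at the price of more ports per router. The sketch below is only an illustration; the 64-tile chip and 4:1 concentration are assumed numbers, and `mesh_summary` is a hypothetical helper, not something taken from the slides.

```python
import math

def mesh_summary(tiles, concentration):
    """Summarize a square 2D mesh where `concentration` tiles share a router."""
    if tiles % concentration:
        raise ValueError("tiles must divide evenly among routers")
    routers = tiles // concentration
    side = math.isqrt(routers)
    if side * side != routers:
        raise ValueError("router count must be a perfect square")
    return {
        "routers": routers,
        "side": side,
        # Worst case: corner to opposite corner, one hop per mesh step.
        "diameter_hops": 2 * (side - 1),
        # Interior router: one port per local tile plus four mesh neighbors.
        "ports": concentration + 4,
    }

plain = mesh_summary(tiles=64, concentration=1)
concentrated = mesh_summary(tiles=64, concentration=4)
print("plain mesh:       ", plain)
print("concentrated mesh:", concentrated)
```

With these assumed numbers the plain mesh has 64 five-port routers and a 14-hop diameter, while the concentrated mesh has 16 eight-port routers and a 6-hop diameter.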
28
Express Links
Source: Balfour and Dally, ICS 06
29
Network Replication
  • Abundant wire resources: build a second
    network
  • Resource allocation tradeoff
  • Wide: Serialization Latency, Router Energy
    Efficiency, - Router Area
  • Replicated: Decoupled Resources, Area Efficiency,
    ? Energy Efficiency, - Serialization Latency

SCALABLE
Source: Balfour and Dally, ICS 06
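The serialization-latency side of this tradeoff can be sketched numerically. The example assumes one flit of `channel_bits` crosses a channel per cycle and that replication splits the same wires into two half-width networks; the packet size and channel width are made-up numbers for illustration.

```python
import math

def serialization_cycles(packet_bits, channel_bits):
    """Cycles needed to push one packet across a channel, one flit per cycle."""
    return math.ceil(packet_bits / channel_bits)

# Hypothetical values chosen only to illustrate the tradeoff.
PACKET_BITS = 512
WIDE_CHANNEL_BITS = 256

# One wide network uses all the wires in a single channel.
wide = serialization_cycles(PACKET_BITS, WIDE_CHANNEL_BITS)

# Two replicated networks share the same wires, so each channel is half as wide.
replicated = serialization_cycles(PACKET_BITS, WIDE_CHANNEL_BITS // 2)

print(f"wide network:       {wide} cycles per packet")
print(f"replicated network: {replicated} cycles per packet")
```

Halving the channel width doubles the serialization latency of each packet here, which is the cost replication pays for its decoupled resources.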
30
Energy Efficiency
Network Energy Completion Time (normalized to
Torus network)
Source: Balfour and Dally, ICS 06
31
Large differences in efficiency. Optimal
topology is not obvious, not regular, and very
sensitive to properties of network elements
32
Where is Energy Expended?
Source: Balfour and Dally, ICS 06
33
On-Chip Flattened Butterfly
Conventional 2D Mesh
2D Flattened Butterfly
Source: Kim and Dally, to appear
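To make the comparison concrete, the sketch below counts router-to-router hops in a k x k mesh and in a 2D flattened butterfly of the same size. It assumes minimal routing and that each flattened-butterfly router links directly to every other router in its row and column; the 4 x 4 size and the function names are illustrative choices.

```python
from itertools import product

def mesh_hops(src, dst):
    """2D mesh: one hop per unit step in each dimension (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

def flatbfly_hops(src, dst):
    """2D flattened butterfly: at most one hop per dimension that differs."""
    return int(src[0] != dst[0]) + int(src[1] != dst[1])

def router_pairs(k):
    """All ordered pairs of distinct routers in a k x k array."""
    routers = list(product(range(k), repeat=2))
    return [(s, d) for s in routers for d in routers if s != d]

def average_hops(k, hops):
    pairs = router_pairs(k)
    return sum(hops(s, d) for s, d in pairs) / len(pairs)

def diameter(k, hops):
    return max(hops(s, d) for s, d in router_pairs(k))

K = 4  # hypothetical array size
for name, hops in (("mesh", mesh_hops), ("flattened butterfly", flatbfly_hops)):
    print(f"{name}: diameter {diameter(K, hops)}, "
          f"average {average_hops(K, hops):.2f} hops")
```

The flattened butterfly never needs more than two hops regardless of k, while the mesh diameter grows as 2(k - 1); the price is longer channels and higher-radix routers.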
34
On-Chip Flattened Butterfly
dimension 1
Layout Mapping
dimension 2
Source: Kim and Dally, to appear
35
Bypass Channels
Conventional Flattened Butterfly
Flattened Butterfly with Bypass Channels
connected to local router
Source: Kim and Dally, to appear
36
Latency Comparison
Source: Kim and Dally, to appear
37
Power Comparison
Source: Kim and Dally, to appear
38
Flow Control
  • Trade channel bandwidth (cheap) for buffer space
    (expensive)
  • Make buffers shallow
  • Compensate for lower duty factor by
    overprovisioning channels
  • Little cost in energy
  • Circuit switching (no buffers)
  • Elastic buffers - use free buffers in the
    channels
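One way to quantify the bandwidth-for-buffers trade is the usual credit-based flow-control bound, where a channel can be busy for at most (buffer depth) / (credit round-trip time) of the cycles. The slides do not say which flow-control scheme is assumed, so treat the model, the function names, and the numbers below as assumptions of this sketch.

```python
def duty_factor(buffer_flits, credit_round_trip_cycles):
    """Fraction of cycles a channel can carry flits under credit flow control.

    With fewer buffers than the credit round trip, the sender runs out of
    credits and idles until credits return.
    """
    return min(1.0, buffer_flits / credit_round_trip_cycles)

def overprovision_factor(buffer_flits, credit_round_trip_cycles):
    """Extra channel bandwidth needed to recover full throughput."""
    return 1.0 / duty_factor(buffer_flits, credit_round_trip_cycles)

ROUND_TRIP = 8  # hypothetical credit round-trip time, in cycles

for depth in (2, 4, 8, 16):
    print(f"{depth:2d} flit buffers: duty factor "
          f"{duty_factor(depth, ROUND_TRIP):.2f}, "
          f"channel overprovisioning x{overprovision_factor(depth, ROUND_TRIP):.1f}")
```

In this model, cutting buffers from 8 flits to 2 drops the duty factor to 0.25, so a channel four times as wide restores the original throughput. That is attractive only because on-chip wires are cheap compared with buffers.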

39
Flow Control in an On-Chip FlatBfly
40
View as Two Buffered Links
[Figure: nodes S, X, and D]
41
Channels Have Repeaters
[Figure: nodes S, X, and D]
42
Buffers Decouple Channel Allocation in Time
[Figure: two diagrams, each with nodes S, X, and D]
43
Circuit Switching
[Figure: two diagrams, each with nodes S, X, and D]
44
With Elastic Buffers
[Figure: two diagrams, each with nodes S, X, and D]
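The idea of using the channel itself as storage can be illustrated with a toy behavioral model: each repeater stage holds at most one flit and passes it on only when the next stage is empty. This is a sketch under that assumption, not the circuit from the talk; `step` and `run` are hypothetical names, and the stall pattern is invented.

```python
def step(stages, incoming, sink_ready):
    """Advance a chain of single-flit elastic stages by one cycle.

    stages[0] is nearest the source, stages[-1] nearest the destination.
    Returns (new_stages, delivered_flit_or_None, accepted).
    """
    stages = list(stages)
    delivered = None
    # Drain at the destination end first so the freed slot can be refilled.
    if stages[-1] is not None and sink_ready:
        delivered, stages[-1] = stages[-1], None
    # Each flit advances one stage if the stage ahead of it is empty.
    for i in range(len(stages) - 1, 0, -1):
        if stages[i] is None and stages[i - 1] is not None:
            stages[i], stages[i - 1] = stages[i - 1], None
    # The source may inject only when the first stage is empty.
    accepted = incoming is not None and stages[0] is None
    if accepted:
        stages[0] = incoming
    return stages, delivered, accepted

def run(num_flits, num_stages, stalled_cycles):
    """Send flits 0..num_flits-1 through the channel; the sink stalls as given."""
    stages = [None] * num_stages
    pending = list(range(num_flits))
    delivered, peak, cycle = [], 0, 0
    while len(delivered) < num_flits:
        incoming = pending[0] if pending else None
        stages, out, accepted = step(stages, incoming, cycle not in stalled_cycles)
        if accepted:
            pending.pop(0)
        if out is not None:
            delivered.append(out)
        peak = max(peak, sum(s is not None for s in stages))
        cycle += 1
    return delivered, peak

delivered, peak = run(num_flits=6, num_stages=3, stalled_cycles=set(range(2, 8)))
print("delivered order:", delivered)
print("peak flits held in the channel:", peak)
```

While the destination is stalled, the three channel stages fill up and hold flits in order, so the channel behaves like a small FIFO without any dedicated buffer at the switch.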
45
Research Directions
46
NSF Workshop Identified 3 Critical Issues
  • Power
  • OCINs will have 10x the required power with
    current approaches
  • Circuit and architecture innovations can close
    this gap
  • Latency
  • OCIN latency currently not competitive with buses
    and dedicated wiring
  • Lower diameter topologies
  • Novel flow-control strategies required
  • Tool Integration
  • OCINs need to be integrated with standard tool
    flows to enable widespread use
  • See http://www.ece.ucdavis.edu/ocin06/

47
A Research Agenda
  • Develop efficient network elements
  • Channels, buffers, switches, allocators
  • Opportunities for 10x-100x improvements in
    efficiency
  • Enabling technology
  • Capture workloads representative of CMPs and SoCs
  • Develop optimal topologies for these network elements and workloads
  • Develop efficient routing and flow-control
    methods
  • Load-balanced routing
  • Buffer-efficient flow control
  • Develop efficient router microarchitectures
  • Single cycle, area efficient
  • Prototype to test assumptions
  • Iterate

48
Some Specific Topics
  • Efficient network elements - enabling building
    blocks
  • Low diameter topologies with minimum cost (area
    and power)
  • Flow control that allows packets to pass with
    elastic buffers
  • Low-latency router microarchitecture - with high
    radix

49
Summary
  • OCINs critically important
  • Vital component of CMPs, SoCs
  • Less mature technology than other components
  • Very different than off-chip networks
  • Cost, Channels, Workloads, Design Issues
  • Efficient network elements are enabling
    technology
  • Energy and area efficient channels, buffers,
    switches
  • Change the equation for network design: channels
    << buffers << routers
  • Topology
  • Minimize diameter
  • Concentrated Mesh with Express Channels
  • Flattened Butterfly
  • Flow Control
  • Minimize buffers at switch points
  • Use elastic buffers to minimize latency
  • Many research opportunities