Multicore - PowerPoint PPT Presentation

Title: Multicore
Author: Anant Agarwal
Last modified by: xiaoping zhu
Created Date: 12/4/1999 11:38:41 AM
Document presentation format: On-screen Show (4:3)

Slides: 28

Transcript and Presenter's Notes

2
The Kill Rule for Multicore
  • Anant Agarwal
  • MIT and Tilera Corp.

3
Multicore is Moving Fast
Corollary of Moore's Law: the number of cores will double every 18 months
What must change to enable this growth?
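To make the doubling claim concrete, here is a minimal sketch of the projection. The starting count of 16 cores and the function name `projected_cores` are illustrative assumptions, not figures from the slide.

```python
def projected_cores(initial_cores: int, years: float,
                    doubling_period_years: float = 1.5) -> float:
    """Core count after `years`, assuming one doubling every 18 months."""
    return initial_cores * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Hypothetical starting point of 16 cores.
    for years in (0, 1.5, 3, 6, 9):
        print(f"after {years:>4} years: {projected_cores(16, years):.0f} cores")
```

Under these assumptions the count grows 64-fold in nine years, which is why the sizing, interconnect, and programming questions that follow matter.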
4
Multicore Drivers Suggest Three Directions
  • Diminishing returns
    • Smaller structures
  • Power efficiency
    • Smaller structures
    • Slower clocks, voltage scaling
  • Wire delay
    • Distributed structures
    • Multicore programming

1. How we size core resources
2. How we connect the cores
3. How programming will evolve
5
How We Size Core Resources
3 cores Small Cache
6
KILL Rule for Multicore
Kill If Less than Linear
A resource in a core must be increased in area
only if the core's performance improvement is at
least proportional to the core's area increase.
Put another way, increase resource size only if
for every 1% increase in core area there is at
least a 1% increase in core performance.
Leads to power-efficient multicore design
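The rule can be read as a simple comparison of relative gains. The sketch below is one possible encoding; the function name and the area and performance numbers are hypothetical and chosen only to show one change that passes and one that fails.

```python
def passes_kill_rule(area_before: float, area_after: float,
                     perf_before: float, perf_after: float) -> bool:
    """Return True if the relative performance gain is at least the relative area gain."""
    area_gain = (area_after - area_before) / area_before
    perf_gain = (perf_after - perf_before) / perf_before
    return perf_gain >= area_gain


if __name__ == "__main__":
    # Hypothetical step 1: +10% core area buys +12% performance -> keep the change.
    print(passes_kill_rule(100.0, 110.0, 100.0, 112.0))
    # Hypothetical step 2: a further +20% area buys only +5% performance -> kill it.
    print(passes_kill_rule(110.0, 132.0, 112.0, 117.6))
```

Applying the check to each successive enlargement, rather than to the total, stops growth at the point where returns become less than linear.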
7
Kill Rule for Cache Size Using Video Codec
8
Well Beyond Diminishing Returns
Madison Itanium2
Cache System
L3 Cache
Photo courtesy Intel Corp.
9
Slower Clocks Suggest Even Smaller Caches
Insight: maintain constant instructions per cycle (IPC)
10
Multicore Drivers Suggest Three Directions
  • Diminishing returns
    • Smaller structures
  • Power efficiency
    • Smaller structures
    • Slower clocks, voltage scaling
  • Wire delay
    • Distributed structures
    • Multicore programming

1. How we size core resources
  • KILL rule suggests smaller caches for multicore
    • If the clock is slower by x, for constant IPC, the cache can be smaller by x^2
  • KILL rule applies to all multicore resources
    • Issue width: 2-way is probably ideal (Simplefit, TPDS 7/2001)
    • Cache sizes and number of memory hierarchy levels
2. How we connect the cores
3. How programming will evolve
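The claim that a slower clock permits a quadratically smaller cache can be illustrated with a toy model. This sketch rests on two assumptions that are not stated on the slide: the common rule of thumb that miss rate scales as 1/sqrt(cache size), and a memory latency that is fixed in nanoseconds. The constant `k` and all sizes and frequencies are hypothetical.

```python
import math


def miss_rate(cache_kb: float, k: float = 0.1) -> float:
    """Assumed rule of thumb: miss rate is proportional to 1/sqrt(cache size)."""
    return k / math.sqrt(cache_kb)


def stall_cycles_per_access(cache_kb: float, clock_ghz: float,
                            mem_latency_ns: float = 100.0) -> float:
    """Average stall cycles per access = miss rate * miss penalty in cycles."""
    penalty_cycles = mem_latency_ns * clock_ghz  # fixed ns latency costs fewer cycles at a slower clock
    return miss_rate(cache_kb) * penalty_cycles


if __name__ == "__main__":
    x = 2.0                                             # clock slowdown factor
    fast = stall_cycles_per_access(256.0, 2.0)          # 256 KB cache at 2 GHz
    slow = stall_cycles_per_access(256.0 / x**2, 2.0 / x)  # 64 KB cache at 1 GHz
    print(fast, slow)  # equal stall cycles per access, so IPC is unchanged in this model
```

In this model, halving the clock halves the miss penalty in cycles, so the miss rate may double, which the square-root rule delivers with a cache one quarter the size.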
11
Interconnect Options
Packet routing through switches
12
Bisection Bandwidth is Important
13
Concept of Bisection Bandwidth
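Bisection bandwidth is the bandwidth across the narrowest cut that splits the cores into two equal halves. As a rough sketch, the code below counts the links crossing that cut for a square 2D mesh and for a single shared bus; the function names are illustrative, each bidirectional link is counted once, and link speed is left out.

```python
import math


def mesh_bisection_links(num_cores: int) -> int:
    """Links cut when an even-sided k x k mesh is split into two k x (k/2) halves."""
    side = math.isqrt(num_cores)
    if side * side != num_cores or side % 2 != 0:
        raise ValueError("this sketch expects an even-sided square mesh")
    return side  # one link per row crosses the middle cut


def bus_bisection_links(num_cores: int) -> int:
    """A single shared bus offers one channel no matter how many cores attach."""
    return 1


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        print(n, mesh_bisection_links(n), bus_bisection_links(n))
```

The mesh's bisection grows with the square root of the core count while the bus stays constant, which is the scaling argument behind the interconnect slides.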
14
Meshes are Power Efficient
[Figure: Energy Savings (Mesh vs. Bus); labels: Number of Processors, Benchmarks]
15
Meshes Offer Simple Layout
Example: MIT's Raw Multicore
  • 16 cores
  • Demonstrated in 2002
  • 0.18 micron
  • 425 MHz
  • IBM SA27E standard cell
  • 6.8 GOPS

www.cag.csail.mit.edu/raw
16
Multicore
  • Single chip
  • Multiple processing units
  • Multiple, independent threads of control, or
    program counters (MIMD)

17
Multicore Drivers Suggest Three Directions
  • Diminishing returns
    • Smaller structures
  • Power efficiency
    • Smaller structures
    • Slower clocks, voltage scaling
  • Wire delay
    • Distributed structures
    • Multicore programming

1. How we size core resources
2. How we connect the cores
3. How programming will evolve
18
Multicore Programming Challenge
  • Multicore programming is hard. Why?
    • New
    • Misunderstood: some sequential programs are harder
    • Current tools are where VLSI design tools were in the mid-80s
    • Standards are needed (tools, ecosystems)
  • This problem will be solved soon. Why?
    • Multicore is here to stay
    • Intel webinar: "Think parallel or perish"
    • Opportunity to create the API foundations
    • The incentives are there

19
Old Approaches Fall Short
  • Pthreads
    • Intel webinar likens it to the assembly of parallel programming
    • Data races are hard to analyze
    • No encapsulation or modularity
    • But evolutionary, and OK in the interim
  • DMA with external shared memory
    • DSP programmers favor DMA
    • Explicit copying from global shared memory to local store
    • Wastes pin bandwidth and energy
    • But evolutionary, simple, modular, and small core memory footprint
  • MPI
    • Province of HPC users
    • Based on sending explicit messages between private memories
    • High overheads and large core memory footprint

But, there is a big new idea staring us in the
face
20
Inspiration from ASICs: Streaming
mem
Stream of data over a hardware FIFO
  • Streaming is energy efficient and fast
  • Concept familiar and well developed in hardware
    design and simulation languages

21
Streaming is Familiar: Like Sockets
  • Basis of networking and internet software
  • Familiar and popular
  • Modular and scalable
  • Conceptually simple
  • Each process can use existing sequential code
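As an illustration of the socket analogy, the sketch below connects two ordinary sequential routines through a stream. It uses Python's standard `socket.socketpair()` and threads as stand-ins for two cores; the function names and the doubling step are hypothetical, and nothing here is a multicore streaming API.

```python
import socket
import struct
import threading


def producer(sock: socket.socket, samples) -> None:
    """Plain sequential code: write each sample to the stream, then close it."""
    with sock:
        for s in samples:
            sock.sendall(struct.pack("!i", s))


def consumer(sock: socket.socket, results: list) -> None:
    """Plain sequential code: read 4-byte samples until the stream ends."""
    with sock:
        buf = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # peer closed the stream
                break
            buf += chunk
            while len(buf) >= 4:
                (value,) = struct.unpack("!i", buf[:4])
                results.append(value * 2)   # stand-in for real per-sample work
                buf = buf[4:]


def run_pipeline(samples) -> list:
    a, b = socket.socketpair()
    results: list = []
    t1 = threading.Thread(target=producer, args=(a, samples))
    t2 = threading.Thread(target=consumer, args=(b, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Neither routine knows about the other's internals or about locking; the stream is the only interface between them, which is the modularity the slide points to.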

22
Core-to-Core Data Transfer Cheaper than Memory Access
  • Energy
    • 32b network transfer over 1mm channel: 3pJ
    • 32KB cache read: 50pJ
    • External access: 200pJ
  • Latency
    • Reg to reg: 5 cycles (RAW)
    • Cache to cache: 50 cycles
    • DRAM access: 200 cycles
Data based on 90nm process node
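The figures on this slide can be turned into relative costs with a few lines of arithmetic. The numbers below are copied from the slide; the dictionary keys and the helper name `ratio_to_cheapest` are illustrative.

```python
# Values from the slide (90nm process node).
ENERGY_PJ = {
    "32b network transfer over 1mm": 3,
    "32KB cache read": 50,
    "external access": 200,
}
LATENCY_CYCLES = {
    "reg to reg": 5,
    "cache to cache": 50,
    "DRAM access": 200,
}


def ratio_to_cheapest(costs: dict) -> dict:
    """Express every cost as a multiple of the cheapest entry."""
    cheapest = min(costs.values())
    return {name: value / cheapest for name, value in costs.items()}


if __name__ == "__main__":
    for table in (ENERGY_PJ, LATENCY_CYCLES):
        for name, ratio in ratio_to_cheapest(table).items():
            print(f"{name}: {ratio:.1f}x")
```

On these figures, an external access costs roughly 67 times the energy of a 1mm on-chip transfer, and a DRAM access takes 40 times as many cycles as a register-to-register transfer.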
23
Streaming Supports Many Models
Pipeline
Not great for Blackboard style
Shared state
But then, there is no one size fits all
24
Multicore Streaming Can be Way Faster than Sockets
  • No fundamental overheads for
    • Unreliable communication
    • High latency buffering
    • Hardware heterogeneity
    • OS heterogeneity
  • Infrequent setup
  • Common-case operations are fast and power efficient
  • Low memory footprint

MCA's CAPI standard
25
CAPI's Stream Implementation 1
Process A (E.g., FIR1)
Process B (E.g., FIR2)
Core 1
Core 2
Multicore Chip
I/O register-mapped hardware FIFOs in SOCs
26
CAPI's Stream Implementation 2
Cache
Cache
Process A (E.g., FIR)
Process B (E.g., FIR)
Core 1
Core 2
On-chip Interconnect
Multicore Chip
On-chip cache to cache transfers over on-chip
interconnect in general multicores
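The two-process picture on these slides (one FIR stage feeding another over a stream) can be sketched in software. This is not the CAPI interface: a bounded `queue.Queue` stands in for the FIFO between the cores, threads stand in for the cores, and the function names, tap values, and FIFO depth are hypothetical.

```python
import queue
import threading

END = object()  # end-of-stream marker


def fir_stage(taps, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Sequential FIR filter: read a sample, emit one filtered sample."""
    history = [0.0] * len(taps)
    while True:
        x = inbox.get()
        if x is END:
            outbox.put(END)
            return
        history = [x] + history[:-1]          # newest sample first
        outbox.put(sum(t * h for t, h in zip(taps, history)))


def run_two_stage(samples, taps1, taps2) -> list:
    q_in, q_out = queue.Queue(), queue.Queue()
    q_mid = queue.Queue(maxsize=8)            # stand-in for a small FIFO between the stages
    stages = [
        threading.Thread(target=fir_stage, args=(taps1, q_in, q_mid)),
        threading.Thread(target=fir_stage, args=(taps2, q_mid, q_out)),
    ]
    for t in stages:
        t.start()
    for s in samples:
        q_in.put(s)
    q_in.put(END)
    results = []
    while (y := q_out.get()) is not END:
        results.append(y)
    for t in stages:
        t.join()
    return results


if __name__ == "__main__":
    # Stage 1 sums adjacent samples; stage 2 takes the first difference.
    print(run_two_stage([1, 2, 3, 4], taps1=[1, 1], taps2=[1, -1]))
```

Each stage is ordinary sequential code that only reads its input stream and writes its output stream, so the same stage could be placed on a different core without changing its body.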
27
Conclusions
  • Multicore is here to stay
  • Evolve core and interconnect
  • Create multicore programming standards: users are ready
  • Multicore success requires
    • Reduction in core cache size
    • Adoption of mesh-based on-chip interconnect
    • Use of a stream-based programming API
  • Successful solutions will offer an evolutionary transition path