Title: Physical Implementation of Computation
1. Physical Implementation of Computation
- André DeHon
- California Institute of Technology
Sastry/ITO May 24, 2000
2.
- How do we design and engineer physical devices which implement computations?
- How do we build programmable VLSI computing devices in the era of billion-transistor silicon die capacity? (and beyond)
- Capacity increase of 1,000-100,000×
  - 1984: 15Mλ² → 1999: 30Gλ² → 2007: 1Tλ²
3. DARPA/ITO Background
- Microsystems
- MIT Large Scale Parallel Systems 1988-1993
- MIT Reinventing Computing 1993-1996
- Adaptive Computing Systems (JITHW)
- UCB BRASS 1996-present
- Polymorphic Computing?
4. Outline
- Programmable Design Space
- Instruction Organization Effects
- Size
- Interconnect
- Requirements of Computation
5. Programmable Design Space
- Basic design parameters:
  - SIMD data width (w)
  - Instruction Depth (c)
  - Retiming Depth (d)
  - Interconnect Richness (p)
  - Control Granularity
- Overview: IEEE Computer, April 2000
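A toy sketch of how these parameters might feed an area/density model. The unit-area constants below are illustrative assumptions, not the figures from the IEEE Computer article:

```python
# Toy area/density model over the (w, c, d) parameters above.
# Area constants are in arbitrary lambda^2-like units and are
# illustrative assumptions, NOT the published model.
A_BITOP = 100   # compute area per bit-processing element
A_INSTR = 50    # area per stored instruction context
A_REG   = 20    # area per retiming register bit

def pe_area(w, c, d):
    """Area of one processing element: w-bit datapath, c instruction
    contexts, d-deep retiming registers per datapath bit."""
    return w * A_BITOP + c * A_INSTR + w * d * A_REG

def peak_density(w, c, d):
    """Peak bit-operations per cycle per unit area (all w bits busy,
    one instruction issued per cycle)."""
    return w / pe_area(w, c, d)

# An FPGA-like point (w=c=d=1) vs. a processor-like point (w=64, c=1024):
fpga = peak_density(1, 1, 1)
cpu  = peak_density(64, 1024, 1)
print(f"FPGA-like/processor-like peak density ratio: {fpga / cpu:.1f}")
```

Even with these made-up constants, the two conventional design points land at very different peak densities, which is the point of the slide.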
6. Architectural Characterization
(figure: temporal vs. spatial organization)
7. Peak Computational Densities from Model
- Small slice of space
  - only 2 parameters
- 100× density variation across the space
- Large difference in peak densities
  - large design space!
8. Yielded Efficiency
- FPGA (c=w=1)
- Processor (c=1024, w=64)
- Large variation in yielded density
  - large design space!
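The mismatch penalty between an application's natural parameters and an architecture's can be sketched with a toy utilization model. This is purely my illustrative assumption, not the model behind the slide's data:

```python
def efficiency(w_app, c_app, w_arch, c_arch):
    """Fraction of an architecture's raw capacity doing useful work for
    an application with natural word width w_app and instruction depth
    c_app.  Toy model: unused datapath bits waste area, and mismatched
    instruction depth wastes either contexts or compute."""
    width_util = min(w_app, w_arch) / w_arch
    depth_util = min(c_app, c_arch) / max(c_app, c_arch)
    return width_util * depth_util

# A bit-level, single-instruction (FPGA-friendly) task on a
# processor-like point (w=64, c=1024) yields a tiny fraction of peak:
print(efficiency(1, 1, 64, 1024))   # orders of magnitude below 1.0
```

Even this crude model reproduces the qualitative claim of the next slide: a conventional processor can be orders of magnitude less efficient than an alternative programmable architecture on a mismatched workload.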
9. Large Design Space: Reflection (1)
- No single conventional architecture is robustly general-purpose across the design space.
- E.g. a processor can be orders of magnitude less efficient than an alternative programmable architecture.
10. Large Design Space: Reflection (2)
- Need to understand the space and application characteristics to tailor Application-Specific Processors.
- Specific applications may have limited/dominating characteristics in the space.
- Can get it wrong by orders of magnitude.
11. UCB BRASS: RISC + HSRA (heterogeneous mix)
- Integrate
  - processor
  - reconfigurable array
- Key Idea
  - best of both worlds: temporal/spatial
12. MIT MATRIX
- Make instruction distribution flexible
- Efficient/flexible word size and depth
- Base unit
  - 8-bit RF+ALU slice
  - c=4 or 256, d=1 or 128
  - w=8, expandable
- FCCM '96 / Hot Chips '97
13. Design Lesson?
- General Purpose
  - BRASS hybrid is a first step
    - integrating two complementary points
    - generalize?
- Application Specific
  - Within an application, requirements vary
  - even here, a single point in the space can be suboptimal
  - identify the best portfolio
14. Generalize: Mix-and-Match
- Heterogeneous Composition
- Heterogeneous Tile
- → Framework to systematize exploration and construction
15. Interconnect
- Along with the instruction store, the dominant area in temporal (processor) designs
  - Also dominant in power, delay...
- The dominant area in spatial designs
16. Can Parameterize Richness
- p=0.5
- p=0.75
- (axis: Interconnect Richness →)
17. Effects of Richness on Area
18. How rich should interconnect be?
- Single design
- Binary tree or 1-D: p=0.0
- Crossbar: p=1.0
- (axis: Interconnect Richness →)
19. Interconnect
- Since it grows faster than linearly in system size:
  - not surprising it is the dominant component
  - not surprising its importance is growing
- Important to develop a systematic understanding of design
  - richness and structure
  - energy, delay, power tradeoffs
  - switching requirements
  - mapping/routing requirements
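The superlinear growth can be sketched with a Rent's-rule-style estimate over a 2-D layout (hierarchical bisection with all constants dropped; the exponent plays the role of the richness parameter p above). A toy sketch under those assumptions:

```python
# Rent's-rule-style sketch of total wiring vs. system size n in a 2-D
# layout.  IO of a block of s elements ~ s**p; wire span ~ sqrt(s).
# Constants are dropped, so only growth rates are meaningful.
def wiring_estimate(n, p):
    total, size = 0.0, n
    while size > 1:
        blocks = n / size        # partitions at this hierarchy level
        io = size ** p           # Rent's rule: IO ~ (block size)^p
        span = size ** 0.5       # wire length ~ side of the block
        total += blocks * io * span
        size /= 2
    return total

# Doubling the number of elements more than doubles the wiring:
for p in (0.5, 0.75):
    ratio = wiring_estimate(2048, p) / wiring_estimate(1024, p)
    print(f"p={p}: wiring grows {ratio:.2f}x when n doubles")
```

Since compute area grows only linearly in n, wiring under this estimate eventually dominates any fixed per-element area, matching the bullet above.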
20. What capacity is required to perform a computation?
- A strong component based on the structural characteristics just identified
  - application interconnect richness, throughput, instruction locality/diversity, state
- Also a component based on dataset characteristics
  - information content of the input
21. Idea
- Program semantics are very general
  - handle any data input
- Specialized code/circuits
  - require less capacity
  - fewer cycles, fewer circuits
- Input data is not random, but structured
  - Exploit this to minimize work
  - very roughly like data compression
22. Examples: Information Content
- Branch predictability
  - e.g. trace-schedule the likely path
- Common/exceptional case
  - e.g. common case: no error condition
- Memoization
  - save/cache a result rather than recompute it
- Binding time
  - a value is unchanged once bound; specialize around it
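Memoization in particular is easy to illustrate. The repeating input stream below is a made-up stand-in for the structured, non-random data the previous slide describes:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def expensive(x):
    """Stand-in for a costly computation; counts how often it really runs."""
    global calls
    calls += 1
    return x * x

# Real input data repeats (it is structured, not random) ...
data = [3, 7, 3, 3, 7, 3]
results = [expensive(x) for x in data]
# ... so six requests cost only two underlying computations.
print(results, calls)
```

The cache converts capacity required into a function of the input's information content (two distinct values) rather than its raw length, which is the sense in which this resembles data compression.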
23. Data Example
- Conventional C semantics
  - compute on 32b quantities
- But many items do not need the full width
- Look at the bit-ops actually used in practice
  - Identify the fraction of bit-ops doing useful work on conventional processors
- Student: Eylon Caspi (UCB)
24. Bit Classification
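A minimal sketch of one such classification (my own toy version, not Caspi's actual analysis): count the significant bits of each value and treat the rest of the 32-bit word as carrying no information.

```python
def useful_bits(x):
    """Significant bits of a two's-complement value: leading zeros
    (or leading sign-copy ones, for negatives) carry no information."""
    if x >= 0:
        return max(1, x.bit_length())
    return (~x).bit_length() + 1   # +1 for the sign bit

values = [3, 100, -2, 255, 12]     # made-up sample data
used = sum(useful_bits(v) for v in values)
total = 32 * len(values)
print(f"{1 - used / total:.0%} of bit positions carry no information")
```

This only classifies static value widths; the actual study tracked bit-operations performed, but even this crude count shows how much of a 32-bit datapath can sit idle.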
25. Lesson
- With simple models, can identify a large amount of redundancy in conventional computations
  - e.g. 60-75% of bit-ops redundant
- Interesting to develop a Specialization Theory, a computational analog to Information Theory
26. Vectors
- Interconnect
  - requirements
  - systematic construction
- Understand real computational requirements
  - specialization theory
- Programming model
  - robust across the architecture space
  - the ISA abstraction is outdated; time to find the next Middleware abstraction
- Acknowledge the space is large
  - Systematize
    - understand tradeoffs
    - elaborate details
  - Intermediate variables
    - capture application (algorithm) fingerprint
  - Mix-and-match, or break the assumptions which force a tradeoff
27. Extra
28. Post-Fabrication
- Examples
  - Personal Computers, microprocessors, PLDs, FPGAs, DSPs, VLIW, Vector, multiprocessors
- More important today
  - Increasing die capacity (1000×+)
  - Greater absolute performance per die
  - SoB → SoC → differentiation
  - Increasing design complexity
- Advantages
  - Economies of scale
  - Reduced Time-To-Market
    - crucial today
  - Robust to change