Introduction to Reversible Computing: Motivation, Progress, and Challenges

M. Frank, 'Introduction to Reversible Computing'

Provided by: Michael2156
Learn more at: https://eng.fsu.edu

1
Introduction to Reversible Computing
Motivation, Progress, and Challenges
  • ACM Computing Frontiers Conference 2005
  • Special Session: 1st Intl. Workshop on Reversible
    Computing
  • Thursday, May 5, 2005

2
Abstract of Talk
  • The practical performance of a computational
    process is ultimately limited by its energy
    efficiency:
  • Useful work accomplished per unit energy
    dissipated.
  • Fundamental physics limits the energy efficiency
    of conventional, irreversible logic.
  • The energy efficiency of conventional devices
    will likely be forced to level off in roughly the
    next 10-20 years.
  • Further advances beyond this point will require
    the use of highly energy-recovering circuit
    techniques,
  • and (eventually) this will require an increasing
    degree of logical reversibility throughout the
    digital design.
  • In this talk, we:
  • explain these motivations for reversible
    computing,
  • summarize some recent progress towards its
    realization,
  • and discuss some outstanding challenges for the
    field.

3
Introduction to Reversible Computing
  • PART 1: Motivation

4
Energy Efficiency
  • The efficiency $\eta$ of a process that consumes a
    valued resource $R$ and produces a valued product $P$
    is the ratio of the amount of product produced to
    the amount of resource consumed:
    $\eta = P_{\text{prod}} / R_{\text{cons}}$.
  • Example 1: A heat engine consumes (which in
    this case means "degrades") an amount $Q$ of
    high-temperature heat energy, and produces an
    amount $W$ of work.
  • The heat engine's efficiency is thus
    $\eta_{\text{h.e.}} = W/Q$. (Dimensionless.)
  • Of course, $\eta_{\text{h.e.}} < 1$ because of the
    conservation of energy.
  • In the 19th century, Sadi Carnot showed that
    $\eta_{\text{h.e.}} \le (T_H - T_L)/T_H$,
  • where $T_H$, $T_L$ are the temperatures of the hot
    and cold thermal reservoirs.
  • Example 2: A computer (i.e., a computational
    engine) consumes an amount $E_{\text{cons}}$ of free
    energy, and performs $N_{\text{ops}}$ useful
    computational operations (produces $N_{\text{ops}}$
    operations' worth of useful computational effort).
  • The computer's (energy) efficiency is thus
    $\eta_{E,\text{comp}} = N_{\text{ops}} / E_{\text{cons}}$.
  • Units: operations per unit energy, or
    ops/sec/watt.

5
Lower Bounds on Energy Dissipation
  • In today's 90 nm VLSI technology, for minimal
    operations (e.g., conventional switching of a
    minimum-sized transistor):
  • $E_{\text{diss,op}}$ is on the order of 1 fJ (femtojoule)
    $\Rightarrow \eta_E \lesssim 10^{15}$ ops/sec/watt.
  • Will be a bit better in coming technologies (65
    nm, maybe 45 nm).
  • But conventional digital technologies are
    subject to several lower bounds on their energy
    dissipation $E_{\text{diss,op}}$ for digital transitions
    (logic / storage / communication operations),
  • and thus, corresponding upper bounds on their
    energy efficiency.
  • Some of the known bounds include:
  • Leakage-based limit for high-performance
    field-effect transistors:
  • Maybe roughly 5 aJ (attojoules)
    $\Rightarrow \eta_E \lesssim 2 \times 10^{17}$ ops/sec/watt
  • Reliability-based limit for all
    non-energy-recovering technologies:
  • Roughly 1 eV (electron-volt)
    $\Rightarrow \eta_E \lesssim 6 \times 10^{18}$ ops/sec/watt
  • von Neumann-Landauer (VNL) bound for all
    irreversible technologies:
  • Exactly $kT \ln 2 \approx 18$ meV
    $\Rightarrow \eta_E \le 3.5 \times 10^{20}$ ops/sec/watt
  • For systems whose waste heat ultimately winds up
    in Earth's atmosphere,
  • i.e., at temperature $T \approx T_{\text{room}} = 300$ K.
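
The bounds above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming T = 300 K and the rounded per-operation energies quoted on the slide; the names `ops_per_joule` and `bounds` are illustrative only. Note that ops/sec/watt is the same unit as operations per joule.

```python
import math

K_B = 1.380649e-23      # Boltzmann's constant, J/K
EV = 1.602176634e-19    # joules per electron-volt
T_ROOM = 300.0          # assumed environment temperature, K

def ops_per_joule(e_diss_joules):
    """Energy efficiency implied by a per-operation dissipation."""
    return 1.0 / e_diss_joules

# von Neumann-Landauer bound, kT ln 2, at room temperature
landauer_j = K_B * T_ROOM * math.log(2)

# Per-operation energies quoted on the slide (rounded figures)
bounds = {
    "90 nm CMOS (~1 fJ)": 1e-15,
    "leakage limit (~5 aJ)": 5e-18,
    "reliability limit (~1 eV)": 1.0 * EV,
    "VNL bound (kT ln 2)": landauer_j,
}

for name, energy in bounds.items():
    print(f"{name}: {energy:.3g} J -> {ops_per_joule(energy):.2g} ops/sec/watt")

print(f"kT ln 2 at 300 K = {landauer_j / EV * 1e3:.1f} meV")
```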

6
Trend of Min. Transistor Switching Energy
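[Figure: minimum transistor switching energy vs. technology node
(nm DRAM half-pitch), based on ITRS '97-'03 roadmaps. The energy axis
is marked at fJ, aJ, and zJ; annotations: "Practical limit for CMOS?"
and "Naïve linear extrapolation".]
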
7
Reliability Bound on Logic Signal Energies
  • Let $E_{\text{sig}}$ denote the logic signal energy:
  • The energy involved in storing, transmitting, or
    transforming a bit's worth of digital
    information.
  • But note that "involved" does not necessarily
    mean "dissipated"!
  • As a result of fundamental thermodynamic
    considerations, it is required that
    $E_{\text{sig}} \ge k_B T_{\text{sig}} \ln R$,
  • where $k_B$ is Boltzmann's constant,
    $1.38 \times 10^{-23}$ J/K,
  • and $T_{\text{sig}}$ is the temperature of the local
    subsystem carrying the signal,
  • and $R$ is the reliability factor, i.e., the
    improbability $1/p_{\text{err}}$ of error.
  • In non-energy-recovering logic technologies
    (totally dominant today):
  • Basically all of the signal energy is dissipated
    to heat on each operation,
  • and often additional energy (e.g., short-circuit
    power) as well.
  • In this case, minimum sustainable dissipation is
    $E_{\text{diss,op}} \gtrsim k_B T_{\text{env}} \ln R$,
  • where $T_{\text{env}}$ is now the temperature of the
    waste-heat reservoir,
  • which averages around 300 K (room temperature) in
    Earth's atmosphere.
  • For a decent $R = 2 \times 10^{17}$, this energy is
    $40\,kT \approx 1$ eV.
  • $\Rightarrow$ For energy efficiency > 1 op/eV, we must
    recover some of the signal energy,
  • rather than dissipating it all to heat with each
    manipulation of the signal.
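
The "40 kT, about 1 eV" figure follows directly from the formula. A minimal sketch, assuming T = 300 K and the slide's R = 2×10^17; the function name `min_signal_energy` is illustrative only.

```python
import math

K_B = 1.380649e-23      # Boltzmann's constant, J/K
EV = 1.602176634e-19    # joules per electron-volt

def min_signal_energy(reliability, temperature_k=300.0):
    """Return k_B * T * ln(R) in joules."""
    return K_B * temperature_k * math.log(reliability)

R = 2e17                          # reliability factor from the slide
energy_in_kt = math.log(R)        # the bound expressed in units of kT
energy_j = min_signal_energy(R)   # the bound in joules at 300 K

print(f"ln R = {energy_in_kt:.1f} (bound in units of kT)")
print(f"E = {energy_j / EV:.2f} eV at 300 K")
```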

8
(von Neumann?)-Landauer (VNL) Bound
A rigorous result first stated clearly by Rolf
Landauer, IBM, 1961.
(von Neumann had suggested something similar in
1949 but did not publish details.)
  • The bound is a simple, direct logical consequence
    of the time-reversibility (invertibility) of all
    fundamental physical dynamics.
  • This in turn is implied by the Hamiltonian
    formulation of all mechanics, e.g., the unitarity
    of quantum mechanics. $\Rightarrow$ Very firmly
    established!
  • Invertibility implies physical information can't
    be destroyed,
  • only reversibly (i.e., mathematically invertibly)
    transformed!
  • When we lose or discard a bit's worth of logical
    information,
  • e.g., by erasing or destructively overwriting a
    bit storage location,
  • the lost information must actually remain in
    existence,
  • if not in a known form, then as a bit's worth
    ($k \ln 2$) of physical entropy.
  • Entropy simply means unknown information residing
    in the physical state.
  • If the logical bit was originally known (not
    entropy),
  • then entropy has increased in this process by
    $\Delta S = 1\ \text{bit} = k \ln 2$.
  • The energy in the heat reservoir must be
    increased by an amount
    $\Delta S\,T_{\text{env}} = k T_{\text{env}} \ln 2$ in
    order to accommodate this additional entropy.

9
VNL Bound on Energy Dissipation from Information
Loss
Follows directly from the reversibility of
fundamental physics!
[Figure: physical microstate trajectories during bit erasure.
Before the operation, logical state 0 and logical state 1 each
comprise N physical microstates (shown as 8 for clarity in this
simple example), so $S = k \ln 8 = 3$ bits for each. After the
operation, both have been mapped to logical state 0, which now
spans 16 microstates: $S = k \ln 16 = 4$ bits. Hence
$\Delta S = 1\ \text{bit} = k \ln 2$ and
$E_{\text{diss}} = \Delta S\,T_{\text{env}} = k T_{\text{env}} \ln 2$.]

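The microstate counting in this example can be reproduced numerically. A minimal sketch, assuming N = 8 microstates per logical state as in the figure and an environment at 300 K; `entropy_bits` is an illustrative helper name.

```python
import math

K_B = 1.380649e-23   # Boltzmann's constant, J/K
T_ENV = 300.0        # assumed environment temperature, K

def entropy_bits(n_microstates):
    """Entropy of a uniform distribution over n microstates, in bits."""
    return math.log2(n_microstates)

n = 8                                  # microstates per logical state
s_before = entropy_bits(n)             # known logical state: 3 bits
s_after = entropy_bits(2 * n)          # merged trajectories: 4 bits
delta_s_bits = s_after - s_before      # entropy increase: 1 bit

# Convert 1 bit to physical units (k ln 2) and multiply by T_env
e_diss = delta_s_bits * K_B * math.log(2) * T_ENV

print(f"S before = {s_before} bits, S after = {s_after} bits")
print(f"E_diss = {e_diss:.3g} J")
```

Because the trajectories cannot merge, the 2N initial microstates must map onto 2N distinct final microstates, which is why the final logical state carries one extra bit of entropy.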
10
Reversible Computing
  • The basic idea is simply this:
  • Don't erase information when performing logic /
    storage / communication operations!
  • Instead, just reversibly (invertibly) transform
    it in place!
  • When reversible digital operations are
    implemented using well-designed energy-recovering
    circuitry,
  • this can result in local energy dissipation
    $E_{\text{diss}} \ll E_{\text{sig}}$
  • (this has already been empirically demonstrated by
    many groups),
  • and even total energy dissipation
    $E_{\text{diss}} \ll kT \ln 2$!
  • This has been shown in theory, but we are not yet
    at the point of demonstrating such low levels of
    dissipation experimentally.
  • Achieving this goal requires very careful design,
  • and verifying it requires very sensitive
    measurement equipment.

11
Introduction to Reversible Computing
  • PART 2: Progress (1973-2005)

12
A Few Highlights Of Reversible Computing History
  • Bennett, 1973-1989:
  • Reversible Turing machines & emulation algorithms
  • Can run virtual irreversible machines on
    reversible architectures.
  • But the emulation introduces some inefficiencies.
  • Early chemical & Brownian-motion models of
    physical implementations.
  • Fredkin and Toffoli, late 1970s/early 1980s:
  • Reversible logic gates and networks
  • Ballistic and adiabatic implementation schemes
  • Groups @ Caltech, ISI, Amherst, Xerox, MIT, '85-'95:
  • Concepts & implementations for adiabatic circuits
    in VLSI
  • Small explosion of adiabatic circuit literature
    since then
  • Mid 1990s-today:
  • Better understanding of overheads, tradeoffs,
    asymptotic scaling
  • A few groups begin exploring post-CMOS
    implementations

13
Early Chemical Implementations
  • How to physically implement reversible logic?
  • Bennett's original inspiration: DNA
    polymerization!
  • Reversible copying of a DNA strand
  • Molecular basis of cell division / organism
    reproduction
  • This (and all) chemical reactions are reversible.
  • Direction (forward vs. backward) & reaction rate
    depend on relative concentrations of reagent and
    product species (these affect the free energy).
  • Energy dissipated per step turns out to be
    proportional to speed.
  • Implies the process is characterized by an
    energy-time constant.
  • I call this the energy coefficient:
    $c_E = E_{\text{diss,op}}\, t_{\text{op}} = E_{\text{diss,op}} / f_{\text{op}}$.
  • For DNA, typical figures are $40\,kT \approx 1$ eV @ 1,000
    bp/s.
  • Thus, the energy coefficient $c_E$ is about 1
    eV/kHz.
  • Can we achieve better energy coefficients?
  • Yes; in fact, we had already beaten DNA's $c_E$ in
    reversible CMOS VLSI technology circa 1995!
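
The DNA figure works out as follows. A minimal sketch, assuming the linear model stated on the slide (dissipation per step proportional to speed); `energy_coefficient` and `dissipation_at` are illustrative names, and the 100 bp/s case is a hypothetical extrapolation of that model, not a measured value.

```python
def energy_coefficient(e_diss_ev, f_op_hz):
    """c_E = E_diss,op / f_op, in eV per Hz (equivalently eV*s)."""
    return e_diss_ev / f_op_hz

def dissipation_at(c_e_ev_per_hz, f_op_hz):
    """Per-step dissipation at a given rate, assuming E_diss = c_E * f."""
    return c_e_ev_per_hz * f_op_hz

# Slide figures for DNA polymerization: about 1 eV per step at 1,000 bp/s
c_dna = energy_coefficient(1.0, 1000.0)

print(f"c_E = {c_dna * 1e3:.1f} eV/kHz = {c_dna * 1e6:.0f} eV/MHz")
# Hypothetical: the same process slowed to 100 bp/s under the linear model
print(f"E_diss at 100 bp/s = {dissipation_at(c_dna, 100.0):.2f} eV per step")
```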

14
Energy Coefficients in Electronics
  • For a transition involving the adiabatic transfer
    of an amount $Q$ of charge along a path with
    resistance $R$:
  • The raw (local) energy coefficient is given by
    $c_E = E_{\text{diss}}\,t = P_{\text{diss}}\,t^2 = IV t^2 = I^2 R\,t^2 = Q^2 R$.
  • Here, $V$ is the voltage drop along the path.
  • Example: In a fairly recent (180 nm) CMOS VLSI
    technology:
  • Energy stored per min.-sized transistor gate: ~1
    fJ @ 2 V
  • Corresponds to charge per gate of $Q = 1$ fC
    $\approx$ 6,000 electrons
  • Resistance per turned-on transistor of ~14 kΩ
  • Order of quantum resistance
    $R = R_0 = 1/G_0 = h/2q^2 = 12.9$ kΩ
  • Ideal energy coefficient for a single-gate
    transition: ~$1.4 \times 10^{-26}$ J/Hz
  • Or in more convenient units, ~80 eV/GHz = 0.08
    eV/MHz!
  • With some expected overheads for a simple test
    circuit, calculated energy coefficient comes out
    to about 8× higher, or ~$10^{-25}$ J·s
  • Or ~600 eV/GHz = 0.6 eV/MHz.
  • Detailed Cadence simulations gave us, per
    transistor:
  • @ 1 GHz: P = 20 µW, E = 20 fJ = 1.2 keV, so $c_E$ =
    1.2 eV/MHz
  • @ 1 MHz: P = 0.35 pW, E = 3.5 aJ = 2.2 eV, so $c_E$ =
    2.1 eV/MHz
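
The ideal single-gate figures can be reproduced from $c_E = Q^2 R$. A minimal sketch using the slide's Q = 1 fC and R = 14 kΩ; the function name is illustrative. The exact conversion gives roughly 87 eV/GHz, the same order as the rounded ~80 eV/GHz quoted above.

```python
H = 6.62607015e-34      # Planck's constant, J*s
Q_E = 1.602176634e-19   # elementary charge, C (also joules per eV)

def energy_coefficient_js(charge_c, resistance_ohm):
    """Raw adiabatic energy coefficient c_E = Q^2 * R, in J*s (J/Hz)."""
    return charge_c ** 2 * resistance_ohm

q = 1e-15     # charge per gate from the slide, 1 fC
r = 14e3      # on-resistance from the slide, 14 kOhm

c_e = energy_coefficient_js(q, r)     # ideal single-gate coefficient
electrons = q / Q_E                   # number of electrons in 1 fC
r0 = H / (2 * Q_E ** 2)               # quantum resistance h / 2q^2
ev_per_ghz = c_e * 1e9 / Q_E          # same coefficient in eV/GHz

print(f"c_E = {c_e:.2g} J*s = {ev_per_ghz:.0f} eV/GHz")
print(f"1 fC = {electrons:.0f} electrons, R0 = {r0 / 1e3:.1f} kOhm")
```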

15
Simulation Results from Cadence
  • Assumptions & caveats:
  • Assumes ideal trapezoidal power/clock
    waveform.
  • Minimum-sized devices, 2λ × 3λ: .18 µm (L) ×
    .24 µm (W)
  • nFET data is shown; pFET data is very
    similar.
  • Various body biases tried; higher $V_{\text{th}}$
    suppresses leakage.
  • Room temperature operation.
  • Interconnect parasitics have not yet been
    included.
  • Activity factor (transitions per
    device-cycle) is 1 for CMOS, 0.5 for 2LAL in
    this graph.
  • Hardware overhead from fully-adiabatic
    design style is not yet reflected: 2×
    transistor-tick hardware overhead in known
    reversible CMOS design styles.

[Figure: energy dissipated per nFET per cycle, on a log scale running
from about 100 yJ up to 1 nJ, with the 1 eV and kT ln 2 levels marked.
Curve labels: Standard CMOS; 2LAL 1.8-2.0V; 2V; 1V; 0.5V; 0.25V.]

16
A Useful Two-Bit Primitive: Controlled-SET or
cSET(a,b)
  • Semantics: If a=1, then set b:=1.
  • Conditionally reversible, if the special
    precondition ab=0 is met.
  • Note it's 1-to-1 on the subset of states used.
  • Sufficient to avoid Landauer's principle.
  • Can implement cSET in dual-rail CMOS with a pair
    of transmission gates:
  • Each needs just 2 transistors,
  • plus one drive signal.
  • This 2-bit semi-reversible operation & its
    inverse are together universal for reversible
    (and irreversible) logic!
  • If we compose them in special ways.
before    after
a  b      a  b
0  0      0  0
0  1      0  1
1  0      1  1

[Figure: hardware diagram with labels drive (0→1), a, switch (T-gate),
and b.]
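
The conditional reversibility of cSET can be checked by brute force. A minimal sketch that models cSET as a function on bit pairs; `cset` and `cunset` are illustrative names for the operation and its inverse, and the model ignores all circuit-level detail.

```python
def cset(a, b):
    """cSET(a,b): if a == 1, set b to 1. The control a is unchanged."""
    return (a, 1 if a == 1 else b)

def cunset(a, b):
    """Inverse of cSET on its image: if a == 1, clear b back to 0."""
    return (a, 0 if a == 1 else b)

# States satisfying the precondition ab = 0
allowed = [(a, b) for a in (0, 1) for b in (0, 1) if a * b == 0]
images = [cset(a, b) for a, b in allowed]

# One-to-one on the allowed subset, so no information is lost there
one_to_one = len(set(images)) == len(allowed)

# Over all four states the map is not injective: (1,0) and (1,1) collide
all_images = {cset(a, b) for a in (0, 1) for b in (0, 1)}

# cunset undoes cset on every allowed state
round_trip = all(cunset(*cset(a, b)) == (a, b) for a, b in allowed)

print(images, one_to_one, len(all_images), round_trip)
```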
17
Reversible OR (rOR) from cSET
  • Semantics: rOR(a,b) ::= if a∨b, c:=1.
  • Set c:=1 on the condition that either a or b is
    1.
  • Reversible under precondition that initially
    a∨b → ¬c.
  • Two parallel cSETs simultaneously driving a
    single output line implement the rOR operation!
  • This type of composition is not traditionally
    considered.
  • Similarly one can do rAND, and reversible
    versions of all operations.
  • Logic synthesis is extremely straightforward.
[Figure: hardware diagram with inputs a and b and output c; spacetime
diagram in which a and b pass through unchanged while c goes from 0
to a OR b.]
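
The rOR composition can be modeled the same way. A minimal sketch, assuming that two cSETs driving one output line behave like a logical OR of their individual results; this is a functional model only, and the precondition is taken to be that c is initially 0 whenever a or b is 1.

```python
from itertools import product

def cset(a, b):
    """cSET(a,b): if a == 1, set b to 1. The control a is unchanged."""
    return (a, 1 if a == 1 else b)

def r_or(a, b, c):
    """Two parallel cSETs, controlled by a and by b, driving output c."""
    _, c_from_a = cset(a, c)
    _, c_from_b = cset(b, c)
    return (a, b, c_from_a | c_from_b)

# States meeting the precondition: (a OR b) implies c == 0
allowed = [s for s in product((0, 1), repeat=3)
           if not ((s[0] or s[1]) and s[2])]
images = [r_or(*s) for s in allowed]

# Reversible on the allowed states: every input has a distinct output
reversible = len(set(images)) == len(allowed)

# With c initially clear, the output is simply a OR b
computes_or = all(r_or(a, b, 0)[2] == (a | b)
                  for a, b in product((0, 1), repeat=2))

print(len(allowed), reversible, computes_or)
```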
18
O(log n)-time carry-skip adder
With this structure, we can do a $2^n$-bit add in
$2(n+1)$ logic levels → $4(n+1)$ reversible ticks →
$n+1$ clock cycles. Hardware overhead is <2×
regular ripple-carry!
  • (8 bit segment shown)

[Figure: adder segment annotated with the 1st, 2nd, 3rd, and 4th
carry ticks.]
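
The latency formulas quoted for this adder are easy to tabulate. A minimal sketch that only evaluates the slide's expressions (it does not model the adder itself); `carry_skip_latency` is an illustrative name.

```python
def carry_skip_latency(n):
    """Latency figures for a 2^n-bit add, per the slide's formulas."""
    bits = 2 ** n
    logic_levels = 2 * (n + 1)
    reversible_ticks = 4 * (n + 1)
    clock_cycles = n + 1
    return bits, logic_levels, reversible_ticks, clock_cycles

for n in range(3, 7):
    bits, levels, ticks, cycles = carry_skip_latency(n)
    print(f"{bits:3d}-bit add: {levels} levels, {ticks} ticks, {cycles} cycles")
```

For example, n = 5 gives a 32-bit add in 12 logic levels, 24 reversible ticks, and 6 clock cycles, so latency grows with the logarithm of the word width.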
19
32-bit Adder Simulation Results
[Figure: two plots comparing 1V CMOS, 0.5V CMOS, and 2V 2LAL (Vsb = 1V).
All results normalized to a throughput level of 1 add/cycle.]

20
CMOS Gate Implementing rLatch / rUnLatch
  • Symmetric Reversible Latch

[Figure: implementation, icon, and spacetime diagram for the crLatch
and crUnLatch operations, with terminals labeled connect, in, and mem.]
  • Just a transmission gate again.
  • This time controlled by a clock, with the data
    signal driving.
  • Concise, symmetric hardware icon: just a short
    orthogonal line.
  • Thin strapping lines denote connection in
    spacetime diagram.

21
Example: Building cNOT from rlXOR
  • rlXOR(a,b,c): Reversible latched XOR.
  • Semantics: c := a⊕b.
  • Reversible under precondition that c is initially
    clear.
  • cNOT(a,b): Controlled-NOT operation.
  • Semantics: b := a⊕b. (No preconditions.)
  • A classic "primitive" in reversible & quantum
    computing.
  • But it turns out to be fairly complex to
    implement cNOT in available fully adiabatic
    hardware.
  • Thus, it's really not a very good building block
    for practical hardware designs!
  • We can (of course) still build it, if we really
    want to,
  • since, as I said, our gate set is universal for
    reversible logic.
22
cNOT from rlXOR: Hardware Diagram
  • A logic block providing an in-place cNOT
    operation (a cNOT gate) can be constructed
    from 2 rlXOR gates and two latched buffers.
  • The key is:
  • Operate some of the gates in reverse!
[Figure: hardware diagram with reversible latches; signals labeled A,
B, and X.]
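
One way the pieces could fit together is sketched below. This is an assumed arrangement, not the wiring from the slide (which the transcript does not preserve): one rlXOR run forward, one run in reverse, then a latched buffer run forward and one run in reverse, with a scratch bit `x`. All function names are illustrative, and the assertions stand in for the reversibility preconditions.

```python
def rlxor(a, b, c):
    """Forward rlXOR: c := a XOR b. Precondition: c is initially clear."""
    assert c == 0
    return a, b, a ^ b

def rlxor_reverse(a, b, c):
    """rlXOR run in reverse: clears c. Precondition: c == a XOR b."""
    assert c == a ^ b
    return a, b, 0

def latch(src, dst):
    """Latched buffer: copy src into dst. Precondition: dst is clear."""
    assert dst == 0
    return src, src

def unlatch(src, dst):
    """Latched buffer in reverse: clear dst. Precondition: dst == src."""
    assert dst == src
    return src, 0

def cnot(a, b):
    """In-place b := a XOR b using a scratch bit that ends up clear."""
    x = 0
    a, b, x = rlxor(a, b, x)            # x now holds a XOR b
    a, x, b = rlxor_reverse(a, x, b)    # b equals a XOR x, so b clears to 0
    x, b = latch(x, b)                  # copy the result into b
    b, x = unlatch(b, x)                # clear the scratch bit using b
    assert x == 0
    return a, b

results = {(a, b): cnot(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

Every step either writes into a bit known to be clear or clears a bit whose value is recomputable from the others, so no step discards information.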
23
Introduction to Reversible Computing
  • PART 3: Challenges for the Field

24
Challenges for the Field
  • If we want our field to go beyond academia,
  • and become a practical computing technology,
  • then we need to address both
  • a few remaining technological challenges,
  • and also a variety of "PR"-type challenges,
  • because these are closely coupled!
  • A convincing technology gets people excited.
  • Positive perceptions ⇒ more funding, workers

25
Technological Challenges
  • Fundamental theoretical challenges
  • Find more efficient reversible algorithms
  • Or prove rigorous lower bounds on complexity
    overheads
  • Study fundamental physical limits of reversible
    computing
  • Implementation challenges
  • Design new devices with lower energy coefficients
  • Design high-quality resonators for driving
    transitions
  • Empirically demonstrate large system-level power
    savings
  • Application development challenges
  • Find a plausible near- to medium-term "killer
    app" for RC
  • Something that's very valuable, and can't be done
    without it
  • Build a prototype RC-based solution

26
Plenty of Room for Device Improvement
Power per device, vs. frequency
  • Recall, irreversible device technology has at
    most 3-4 orders of magnitude of
    power-performance improvements remaining.
  • And then, the firm kT ln 2 limit is encountered.
  • But, a wide variety of proposed reversible device
    technologies have been analyzed by physicists.
  • With theoretical power-performance up to 10-12
    orders of magnitude better than today's CMOS!
  • Ultimate limits are unclear.

[Figure: power per device vs. frequency for .18µm CMOS, .18µm 2LAL,
and various reversible device proposals, with the k(300 K) ln 2 level
marked.]
27
MEMS Resonator (One Concept)
(PATENT PENDING, UNIVERSITY OF FLORIDA)
Arm anchored to nodal points of fixed-fixed beam
flexures, located a little ways away, in both
directions (for symmetry)

[Figure: interdigitated structure with a Phase 0° electrode and a
Phase 180° electrode, drawn on x, y, z axes; two plots of C versus
angle from 0° to 360°. Repeat the interdigitated structure arbitrarily
many times along the y axis, all anchored to the same flexure.]
28
A Challenge for Our Community
  • I suspect that the field's critics will never be
    silenced by theory and simulations alone.
  • To prove to the world that reversible computing
    can really work will require a complete empirical
    demonstration.
  • We thus cannot afford to continue to sweep issues
    such as resonator design under the rug.
  • A convincing demonstration of low total system
    power must be completely self-contained,
    including the resonator,
  • with only DC power input as needed to keep it
    running.
  • My challenge to us:
  • Let's work together to fabricate and empirically
    demonstrate a simple test chip (e.g., a binary
    counter) that measurably dissipates much less
    than the logic signal energy, and eventually much
    less than some small multiple of kT energy
    (within a room temperature environment),
  • where this measures "wall-plug" power, as our
    critics like to put it.

29
Public Relations Challenges
  • Difficulty: Reversible computing is little known,
  • and people have a lot of misconceptions about it.
  • We need to strive to do better at things like:
  • Educating the broader science, engineering, and
    CS community about the field
  • Including overcoming misconceptions and
    prejudices
  • Gaining "political" standing with funding
    agencies, industry, investors, professional
    organizations
  • To lead to the next level of more intensive
    research
  • Working collaboratively with colleagues in other
    disciplines (outside CS) who have relevant skills
  • Device physicists, analog circuit designers, etc.

30
Conclusions
  • Reversible computing will very likely become
    necessary within our lifetimes,
  • if we are to continue progress in computing
    performance/power.
  • Much progress in our understanding of RC has been
    made in the past three decades
  • But much important work still remains to be done.
  • Let's work together to solve the difficult
    technological challenges, as well as to raise
    awareness & improve perceptions of the field.
  • I hope this workshop will help that to happen.

31
Structure of Today's Session
  • Sub-session 1: Perspectives on RC (-11:00 am)
  • Bennett's keynote, this introductory talk
  • Eric DeBenedictis on supercomputing apps
  • Sub-session 2: Novel Impl. Techs. (11:20-12:50)
  • Sarah Frost, Notre Dame, RC with Quantum Dots
  • Erik Forsberg, KTH/Zhejiang, Y-branch switches
  • Sub-session 3: Quasi-reversible circuits (2-3:50)
  • Four talks, groups from USA, Korea, Germany
  • Sub-session 4: Rev. comp. theory (4:20-5:20)
  • Paul Vitanyi, time/space/energy tradeoffs
  • Levitin & Toffoli, on thermodynamic limits of RC
  • Panel Discussion: What next steps should we take?