Title: COMP290-084 Clockless Logic and Silicon Compilers Lecture 3
1. COMP290-084: Clockless Logic and Silicon Compilers, Lecture 3
- Montek Singh
- Tue, Jan 24, 2006
2. Handshaking Example: Asynchronous Pipelines
- Pipelining basics
- Fine-grain pipelining
- Example approach: MOUSETRAP pipelines
3. Background: Pipelining
- What is pipelining? Breaking up a complex operation on a stream of data into simpler sequential operations
- Storage elements (latches/registers) sit between the stages
- Performance impact:
  - Throughput significantly increased (data items processed/second)
  - Latency somewhat degraded (seconds from input to output)
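The throughput/latency trade-off above can be illustrated with a small calculation. This is only a sketch: the function name `pipeline_metrics` and all delay values are made up for illustration, and it assumes the logic splits evenly across stages with a fixed storage-element overhead per stage.

```python
def pipeline_metrics(logic_delay_ns, stages, latch_overhead_ns):
    """Return (throughput in items/ns, latency in ns) for an evenly split pipeline.

    Assumptions (not from the lecture): the logic divides evenly across
    stages, and each stage adds a fixed latch/register overhead.
    """
    stage_delay = logic_delay_ns / stages + latch_overhead_ns
    throughput = 1.0 / stage_delay   # one item completes per stage delay
    latency = stages * stage_delay   # each item passes through every stage
    return throughput, latency


if __name__ == "__main__":
    # Hypothetical numbers: 8 ns of logic, 0.5 ns overhead per storage element.
    for stages in (1, 4, 16):
        tput, lat = pipeline_metrics(8.0, stages, 0.5)
        print(f"{stages:2d} stages: {tput:.3f} items/ns, latency {lat:.1f} ns")
```

With these placeholder numbers, adding stages raises throughput while the total latency grows, because every extra stage contributes its own storage-element overhead.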
4. Focus of Asynchronous Community
- A key focus: extremely fine-grain pipelines
  - gate-level pipelining: use narrowest possible stages
  - each stage consists of only a single level of logic gates
  - some of the fastest existing digital pipelines to date
- Application areas:
  - general-purpose microprocessors
    - instruction pipelines often 20-40 stages
  - multimedia hardware (graphics accelerators, video DSPs, ...)
    - naturally pipelined systems; throughput is critical; input is bursty
  - optical networking
    - serializing/deserializing FIFOs
  - string matching?
    - KMP-style string matching: variable skip lengths
5. MOUSETRAP: Ultra-High-Speed Transition-Signaling Asynchronous Pipelines
- Singh and Nowick, Intl. Conf. on Computer Design (ICCD), September 2001
6. MOUSETRAP Pipelines
- Simple asynchronous implementation style, uses:
  - standard logic implementation: Boolean gates, transparent latches
  - simple control: 1 gate/pipeline stage
- MOUSETRAP uses a "capture protocol". Latches:
  - are normally transparent before new data arrives
  - become opaque after data arrives (capture data)
- Control signaling: transition-signaling (2-phase)
  - simple protocol: req/ack, only 2 events per handshake (not 4)
  - no return-to-zero
  - each transition (up/down) signals a distinct operation
- Our goal: very fast cycle time
  - simple inter-stage communication
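The "2 events per handshake (not 4)" point can be made concrete by listing the wire events each protocol produces. The sketch below is an illustration only: the function names are hypothetical, and the model ignores all timing and simply enumerates req/ack events for a sender/receiver pair.

```python
def two_phase_events(n_items):
    """Transition signaling: every toggle of req or ack is one event."""
    req = ack = 0
    events = []
    for _ in range(n_items):
        req ^= 1                  # sender toggles req: new data available
        events.append(("req", req))
        ack ^= 1                  # receiver toggles ack: data captured
        events.append(("ack", ack))
    return events


def four_phase_events(n_items):
    """Return-to-zero signaling: req and ack each rise and fall per item."""
    events = []
    for _ in range(n_items):
        events += [("req", 1), ("ack", 1), ("req", 0), ("ack", 0)]
    return events


if __name__ == "__main__":
    print(len(two_phase_events(3)), "events for 3 items (2-phase)")
    print(len(four_phase_events(3)), "events for 3 items (4-phase)")
```

In this model, the wire activity that carries two items under transition signaling is the same activity that carries a single item under a return-to-zero protocol.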
7. MOUSETRAP: A Basic FIFO
- Stages communicate using transition-signaling: 1 transition per data item!
[Figure: three FIFO stages (N-1, N, N+1), each with a data latch and a latch controller. Signals: reqN, reqN+1, ackN-1, ackN, doneN, En; data in, data out. The animation shows the 1st and then the 2nd data item flowing through the pipeline.]
8. MOUSETRAP: A Basic FIFO (contd.)
- Latch controller (XNOR) acts as protocol converter:
  - 2 distinct transitions (up or down) -> pulsed latch enable
  - 2 transitions per latch cycle
[Figure: the same three FIFO stages (N-1, N, N+1), each with a data latch and a latch controller. Signals: reqN, reqN+1, ackN-1, ackN, doneN, En; data in, data out.]
9. MOUSETRAP FIFO: Cycle Time
[Figure: one cycle of Stage N shown in three phases: Stage N computes, Stage N+1 computes, Stage N is re-enabled to compute.]
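Reading the three phases of the cycle (N computes, N+1 computes, N re-enabled) as a sum of delays gives a rough estimate of the FIFO cycle time. This is a sketch under stated assumptions: in a FIFO with no logic, each "computes" phase is taken to cost one latch delay, and the re-enable is taken to cost one rising transition of the latch controller. The function names and the delay values are placeholders, not figures from the lecture.

```python
def fifo_cycle_time(t_latch_ps, t_xnor_up_ps):
    """Sum the three phases of one cycle (assumed mapping to delays).

    N computes      -> one latch delay
    N+1 computes    -> one latch delay
    N is re-enabled -> one rising latch-controller (XNOR) delay
    """
    return t_latch_ps + t_latch_ps + t_xnor_up_ps


def throughput_ghz(cycle_time_ps):
    """Convert a cycle time in picoseconds to a throughput in GHz."""
    return 1000.0 / cycle_time_ps


if __name__ == "__main__":
    # Hypothetical delays, chosen only to exercise the functions.
    cycle = fifo_cycle_time(t_latch_ps=100.0, t_xnor_up_ps=50.0)
    print(f"cycle time {cycle:.0f} ps, throughput {throughput_ghz(cycle):.1f} GHz")
```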
10. Detailed Controller Operation
[Figure: Stage N's latch controller, with inputs "ack from N+1" and "done from N", and its output going to the latch.]
- One pulse per data item flowing through:
  - down transition caused by "done" of N
  - up transition caused by "done" of N+1
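The controller behaviour can be replayed in a few lines, using the XNOR latch controller named on the earlier FIFO slide. This is a behavioural sketch only: it ignores gate delays, and the function names and the event encoding ("done" for a toggle of Stage N's done, "ack" for a toggle of the done of Stage N+1) are assumptions made for the example.

```python
def xnor(a, b):
    """Two-input XNOR: 1 when the inputs are equal, else 0."""
    return 1 if a == b else 0


def latch_enable_trace(events, done_n=0, ack_n=0):
    """Replay toggle events and record Stage N's latch enable after each one.

    events: iterable of "done" (Stage N's done toggles) or
            "ack"  (the done of Stage N+1 toggles, seen by N as ack).
    """
    trace = [xnor(done_n, ack_n)]        # enable is 1 (transparent) at rest
    for event in events:
        if event == "done":
            done_n ^= 1                  # data captured by N: enable falls
        elif event == "ack":
            ack_n ^= 1                   # N+1 has the data: enable rises
        else:
            raise ValueError(f"unknown event: {event!r}")
        trace.append(xnor(done_n, ack_n))
    return trace


if __name__ == "__main__":
    # Two data items pass through Stage N.
    print(latch_enable_trace(["done", "ack", "done", "ack"]))
```

For two data items the enable trace is 1, 0, 1, 0, 1: one low pulse per item, regardless of whether the underlying done/ack transitions were rising or falling.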
11. MOUSETRAP: Pipeline With Logic
Simple extension to FIFO: insert logic block + matching delay in each stage
[Figure: Stages N-1, N, N+1, each with a data latch, a latch controller, and a delay element. Signals: reqN, reqN+1, ackN-1, ackN, doneN.]
- Logic blocks can use standard single-rail (non-hazard-free) logic
- "Bundled data" requirement:
  - each "req" must arrive after data inputs are valid and stable
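The bundled-data requirement amounts to a per-stage comparison: the delay on the req line must be at least the worst-case delay of the logic block it accompanies. The check below is a sketch; the tuple layout, the function name `bundling_violations`, the optional safety margin, and the example numbers are all assumptions made for illustration.

```python
def bundling_violations(stages, margin=0.0):
    """Return the names of stages whose req could arrive before data is stable.

    stages: iterable of (name, worst_case_logic_delay, matched_delay) tuples,
            all delays in the same unit.
    margin: extra safety margin required on top of the logic delay
            (an assumption of this sketch, not a value from the lecture).
    """
    return [name
            for name, logic_delay, matched_delay in stages
            if matched_delay < logic_delay + margin]


if __name__ == "__main__":
    # Hypothetical delays in picoseconds.
    pipeline = [
        ("N-1", 120.0, 140.0),   # req is slower than logic: fine
        ("N",   200.0, 190.0),   # req is faster than logic: violation
        ("N+1",  90.0, 100.0),
    ]
    print(bundling_violations(pipeline))
    print(bundling_violations(pipeline, margin=15.0))
```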
12. Complex Pipelining: Forks & Joins
- Problems with linear pipelining:
  - handles limited applications; real systems are more complex
- Contribution: introduce efficient circuit structures
  - Forks: distribute data + control to multiple destinations
  - Joins: merge data + control from multiple sources
- Enabling technology for building complex async systems
13. Forks and Joins: Implementation
- Join: merge multiple requests
- Fork: merge multiple acknowledges
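The slide does not say which circuit performs the merging. A standard asynchronous component for combining transition-signaled events is the Muller C-element, whose output changes only once all of its inputs have changed. The behavioural model below is an illustrative assumption, not the lecture's circuit; the class and method names are hypothetical.

```python
class CElement:
    """Behavioural Muller C-element: output follows the inputs only when all agree."""

    def __init__(self, n_inputs=2, initial=0):
        self.inputs = [initial] * n_inputs
        self.out = initial

    def set_input(self, index, value):
        """Drive one input and return the (possibly unchanged) output."""
        self.inputs[index] = value
        if all(v == self.inputs[0] for v in self.inputs):
            self.out = self.inputs[0]    # all inputs agree: output switches
        return self.out                  # otherwise the output holds its value


if __name__ == "__main__":
    # Merging acknowledges from two hypothetical successor stages of a fork.
    merged_ack = CElement(n_inputs=2)
    print(merged_ack.set_input(0, 1))    # only one successor has acknowledged
    print(merged_ack.set_input(1, 1))    # both have acknowledged
```

In this model the merged acknowledge toggles only after every successor has toggled its own acknowledge, which is the behaviour a fork needs before the sender may proceed.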
14. Performance, Timing and Optimization
15. Timing Analysis
- Main timing constraint: avoid "data overrun"
  - data must be safely "captured" by Stage N before new inputs arrive from Stage N-1
  - simple 1-sided timing constraint: fast latch disable
  - Stage N's "self-loop" must be faster than the entire path through the previous stage
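The one-sided constraint can be written as a single inequality: the time for Stage N to disable its own latch must be less than the time for new data to travel through Stage N-1. The sketch below encodes that comparison; the function names, the optional hold-time term, and the example delays are assumptions for illustration rather than values from the lecture.

```python
def overrun_slack(t_self_loop, t_prev_stage_path, t_hold=0.0):
    """Slack of the data-overrun constraint (positive means safe).

    t_self_loop:       time for Stage N to close its latch after data arrives
    t_prev_stage_path: time for new data to pass through Stage N-1
    t_hold:            latch hold time (assumed extra term, default 0)
    """
    return t_prev_stage_path - (t_self_loop + t_hold)


def overrun_safe(t_self_loop, t_prev_stage_path, t_hold=0.0):
    """True if Stage N captures its data before new inputs can arrive."""
    return overrun_slack(t_self_loop, t_prev_stage_path, t_hold) > 0


if __name__ == "__main__":
    # Hypothetical delays in picoseconds.
    print(overrun_safe(t_self_loop=60.0, t_prev_stage_path=150.0, t_hold=20.0))
    print(overrun_safe(t_self_loop=160.0, t_prev_stage_path=150.0))
```

Because the constraint is one-sided, making the latch disable path faster can only increase the slack.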
16. Experimental Results
- Simulations of FIFOs:
  - 3 GHz (in 0.13 µm IBM process)
- Recent fabricated chip: GCD
  - 2 GHz simulated speed
  - chips awaited