Title: A Combinatorial Architecture for Instruction-Level Parallelism
1A Combinatorial Architecture for
Instruction-Level Parallelism
2Regulated Elements By Universal Scheme (REBUS)
[Diagram labels, in order: EXECUTABLE PROGRAM; Partitioned Instruction Streams; Processing Elements with Replicated Scratchpad Registers; Combinatorial Interconnection Structure; MCUs; Sliced Memory Hierarchy; MEMORY SYSTEM]
3Processing Elements (PE) and Memory Coordination
Units (MCU)
Register triples, one row per PE (row n read as PE n):
1 7 5
2 1 6
3 2 7
4 3 1
5 4 2
6 5 3
7 6 4
[Diagram labels: Reg, PE, MCU; elements numbered 1 to 7]
(X1, X2, X3, X4, X5, X6, X7) using (7, 7, 3, 3, 1)
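The seven rows above follow a cyclic pattern, which makes the (7, 7, 3, 3, 1) property easy to check by machine. The sketch below is illustrative only: the rule "row i = (i, i-1, i+4) with wrap-around" is inferred from the table, and the names `wrap`, `blocks`, `pair_count` and `reg_count` are hypothetical, not taken from the slides.

```python
from itertools import combinations

V = 7  # number of registers (elements); the table also has 7 rows (blocks)

def wrap(n):
    """Map any integer onto the labels 1..7 used in the table."""
    return (n - 1) % V + 1

# Assumed rule read off the table: row i holds (i, i-1, i+4), wrapping past 7.
blocks = {pe: (pe, wrap(pe - 1), wrap(pe + 4)) for pe in range(1, V + 1)}

# Count how many rows contain each unordered pair of registers.
pair_count = {pair: 0 for pair in combinations(range(1, V + 1), 2)}
for regs in blocks.values():
    for pair in combinations(sorted(regs), 2):
        pair_count[pair] += 1

# Count how many rows contain each individual register.
reg_count = {r: sum(r in regs for regs in blocks.values())
             for r in range(1, V + 1)}

for pe, regs in blocks.items():
    print(pe, regs)
```

If the inferred rule matches the table, every value in `pair_count` should be 1 and every value in `reg_count` should be 3, which is what the last three parameters (3, 3, 1) require.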
4Structure of MCU and its connections
[Diagram labels: Processing Element (x3); To other MCUs; Scratchpad Registers; Unit Controller; Cache Memory; To and From Main Memory]
5Structure of PE
[Diagram labels: Global Signals Management; Processor With Private Memory; Queues of Scratchpad Copies; R1, R2, R3; MCU Interface]
6Pairwise-balanced combinatorial interconnection
- X = {X1, X2, X3, X4, X5, X6, X7, X8, X9}
- A Balanced Incomplete Block (BIB) design with configuration (b, v, r, k, λ)
  - v: number of elements
  - b: number of k-subsets
  - r: each element appears in exactly r subsets
  - λ: each pair of elements appears in exactly λ subsets
- vr = bk
- For example, (12, 9, 4, 3, 1) is a BIB
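The definition above can be turned into a small checker. The sketch below is one possible illustration, not the authors' tooling: `bib_parameters` is a hypothetical helper, and the (12, 9, 4, 3, 1) instance is built here as the 12 lines of a 3x3 grid with wrap-around, which is an assumed construction since the slide does not say which one was intended. The second identity checked at the end, r(k-1) = λ(v-1), is a standard BIB relation that does not appear on the slide.

```python
from itertools import combinations

def bib_parameters(points, blocks):
    """Return (b, v, r, k, lam) if blocks form a BIB over points, else None."""
    points = list(points)
    blocks = [frozenset(b) for b in blocks]
    sizes = {len(b) for b in blocks}                       # block sizes k
    reps = {sum(p in b for b in blocks) for p in points}   # replications r
    lams = {sum(p in b and q in b for b in blocks)         # pair counts
            for p, q in combinations(points, 2)}
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        return None  # not uniform, so not a BIB
    return (len(blocks), len(points), reps.pop(), sizes.pop(), lams.pop())

# Assumed construction: 9 grid points and the 12 lines through them (mod 3).
points = [(x, y) for x in range(3) for y in range(3)]
lines = [[(c, y) for y in range(3)] for c in range(3)]     # vertical lines x = c
for m in range(3):                                         # lines y = m*x + c
    for c in range(3):
        lines.append([(x, (m * x + c) % 3) for x in range(3)])

params = bib_parameters(points, lines)
b, v, r, k, lam = params
print(params)
print(v * r == b * k, r * (k - 1) == lam * (v - 1))
```

Dropping any one line from `lines` leaves some points with fewer appearances than others, so the checker returns `None` for the reduced set.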
7Cont
- A program can be partitioned among the PEs by having each instruction's operand pair determine the PE to which the instruction is assigned
  - ADD R1 R7 -> PE 1
  - MULT R2 R6 -> PE 2
  - DIV R4 R5 -> PE 5
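The three assignments above are consistent with the register triples on slide 3 when row n is read as PE n. A minimal sketch of that lookup follows. The names `PE_REGISTERS` and `assign_pe` and the "OP Ra Rb" text format are assumptions made for illustration, and the slides do not describe how the real dispatch is implemented.

```python
# Register triples copied from the slide 3 table, keyed by assumed PE index.
PE_REGISTERS = {
    1: {1, 7, 5},
    2: {2, 1, 6},
    3: {3, 2, 7},
    4: {4, 3, 1},
    5: {5, 4, 2},
    6: {6, 5, 3},
    7: {7, 6, 4},
}

def assign_pe(instruction):
    """Return the PE whose registers include every operand of 'OP Ra Rb'."""
    _opcode, *operands = instruction.split()
    regs = {int(op.lstrip("R")) for op in operands}
    matches = [pe for pe, held in PE_REGISTERS.items() if regs <= held]
    if not matches:
        raise ValueError(f"no PE holds all operands of {instruction!r}")
    # With two distinct operands there is exactly one match (each pair lies
    # in one triple); a repeated operand matches several PEs and the
    # lowest-numbered one is returned.
    return matches[0]

for instr in ("ADD R1 R7", "MULT R2 R6", "DIV R4 R5"):
    print(instr, "-> PE", assign_pe(instr))
```

Because each register pair occurs in exactly one triple, the lookup never has to choose between PEs for a two-operand instruction with distinct registers.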
8Excellent ideas
- Implements ultra-parallelism using balanced incomplete block (BIB) designs
- Expands parallelism from the instruction level to the assembly-code level
- Parallelism is not restricted to a small window of code
- Supports parallelism among a group of connected processors
- Compatible with current compiler and superscalar technologies
- Could benefit both RISC and CISC
9Characteristics
- Uses a fixed assembly-code format
- Uses memory coordination units (MCUs)
- Requires data replication
10Future work
- Apply to multi-threaded processing
- Support various instruction formats