Code Compression for VLIW Processors Using Variable-to-fixed Coding

1
Code Compression for VLIW Processors Using
Variable-to-fixed Coding

Yuan Xie, Wayne Wolf, Haris Lekatsas
Princeton University, ISSS '02
2004/07/03

2
Outline

Introduction
Related works
Compression algorithm
Decompression architecture
Power reduction for instruction bus
Experimental results
Conclusion and future work

3
Introduction

Important issues for embedded systems
Restricted memory size
poses serious constraints on program size.
Power consumption
busses in a typical IC consume half of total chip
power.
Main contributions in this paper
presents novel code compression schemes.
based on variable-to-fixed (V2F) coding
makes decompressor design easier.
enables parallel decompression.
proposes a novel instruction bus power reduction
scheme.

4
Related Works

Ishiura et al.
Instruction code compression for application
specific VLIW processors based on automatic field
partitioning (SASIMI97)
proposed dictionary-based schemes.
not feasible for modern VLIW processors
Y. Xie et al.
A code decompression architecture for VLIW
processors (MICRO-34)
assumed modern VLIW processors that adopt a
VLES (variable-length execution set) scheme.
extended present compression algorithms and
proposed the decompression architecture for
modern VLIW architectures.
B. P. Tunstall
Synthesis of noiseless compression codes (PhD
thesis, Georgia Institute of Technology)
investigated variable-to-fixed (V2F) coding.

5
Compression Algorithm- Memoryless V2F Coding
Algorithm (1)

Algorithm to construct N-bit
Tunstall codewords

Encoding example
000 01 001 -> 11 01 10
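The construction figure is missing from this transcript, so the following is only a minimal sketch of Tunstall (memoryless V2F) codebook construction for a binary source. The probability P(0) = 0.8 and the helper names are assumptions made for illustration. The codebook used for encoding is read off the slide's example, with the fourth entry ("1" -> "00") inferred as the only 2-bit codeword left over.

```python
import heapq

def tunstall_leaves(p0, n_bits):
    """Split the most probable leaf until the tree has 2**n_bits leaves."""
    symbol_probs = {"0": p0, "1": 1.0 - p0}
    # Max-heap via negated probabilities; each entry is (-P(phrase), phrase).
    heap = [(-p, s) for s, p in symbol_probs.items()]
    heapq.heapify(heap)
    while len(heap) < 2 ** n_bits:
        neg_p, phrase = heapq.heappop(heap)
        for bit, p in symbol_probs.items():
            heapq.heappush(heap, (neg_p * p, phrase + bit))
    return sorted(phrase for _, phrase in heap)

# Phrase -> codeword mapping taken from the slide example (last entry inferred).
SLIDE_CODEBOOK = {"000": "11", "01": "01", "001": "10", "1": "00"}

def encode(bits, codebook):
    """Walk the input; emit a fixed-length codeword each time a leaf phrase completes."""
    codewords, phrase = [], ""
    for bit in bits:
        phrase += bit
        if phrase in codebook:
            codewords.append(codebook[phrase])
            phrase = ""
    return codewords, phrase  # a non-empty phrase means the input ended at a non-leaf node

print(tunstall_leaves(0.8, 2))              # ['000', '001', '01', '1']
print(encode("00001001", SLIDE_CODEBOOK))   # (['11', '01', '10'], '')
```

Every codeword has the same length (N = 2 here) while the phrases it stands for vary in length, which is what makes the code variable-to-fixed.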

6
Compression Algorithm- Memoryless V2F Coding
Algorithm (2)

Two possible problems
end of block
problem
compression is done block by block.
tree traversal may end at a non-leaf node.
solution
pad extra bits onto the block so that the traversal
reaches a leaf node
the extra bits are simply truncated during
decompression
byte alignment
problem
the compressed block must be byte aligned.
solution
a few extra bits are padded if the size of the
compressed block is not a multiple of 8 (in bits).
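Both fixes can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the codebook is the 2-bit one from the earlier encoding example, padding uses zero bits, and the decompressor is assumed to know the original block length (for instance because blocks have a fixed size).

```python
CODEBOOK = {"000": "11", "001": "10", "01": "01", "1": "00"}  # phrase -> codeword
N_BITS = 2

def compress_block(block):
    out, phrase = [], ""
    for bit in block:
        phrase += bit
        if phrase in CODEBOOK:
            out.append(CODEBOOK[phrase])
            phrase = ""
    # End-of-block fix: pad with zeros until the traversal reaches a leaf.
    while phrase and phrase not in CODEBOOK:
        phrase += "0"
    if phrase:
        out.append(CODEBOOK[phrase])
    packed = "".join(out)
    # Byte-alignment fix: pad the compressed block to a multiple of 8 bits.
    return packed + "0" * (-len(packed) % 8)

def decompress_block(packed, block_len):
    inverse = {code: phrase for phrase, code in CODEBOOK.items()}
    bits = ""
    for i in range(0, len(packed), N_BITS):
        if len(bits) >= block_len:
            break  # everything after this point is alignment padding
        bits += inverse[packed[i:i + N_BITS]]
    return bits[:block_len]  # truncate the end-of-block padding

packed = compress_block("0000100")
print(packed)                          # 11011100
print(decompress_block(packed, 7))     # 0000100
```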

7
Compression Algorithm- Markov V2F Coding
Algorithm (1)

How to improve the compression ratio?
exploit the statistical dependencies among bits
in the instructions.
use a more complicated probability model.
Markov model is used in this paper.
Markov model
consists of
a number of states
transitions between states with certain
probability
two main variables describe the proposed model
model depth: should divide the instruction size
evenly or be a multiple of the instruction size.
model width: determines the model's ability to
remember the path to a certain node
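As a rough illustration of how depth and width might define the states, the sketch below identifies a state by the bit position modulo the depth (the layer) and by the last few bits seen (the remembered path), then counts how often each state is followed by a 0. This state layout, the parameter `history_bits` (so that the width is at most 2**history_bits), and the sample instructions are all assumptions; the slides do not spell out the exact model.

```python
from collections import defaultdict

def gather_statistics(instructions, depth, history_bits):
    """Estimate P(next bit = 0) for every (layer, remembered path) state."""
    counts = defaultdict(lambda: [0, 0])  # state -> [zeros seen, ones seen]
    for word in instructions:
        history = ""
        for pos, bit in enumerate(word):
            layer = pos % depth
            if layer == 0:
                history = ""  # the model wraps around to its root state
            counts[(layer, history)][int(bit)] += 1
            # Keep only the most recent history_bits bits of the path.
            history = (history + bit)[-history_bits:] if history_bits > 0 else ""
    return {state: c[0] / (c[0] + c[1]) for state, c in counts.items()}

probs = gather_statistics(["0000", "0001", "0100"], depth=4, history_bits=2)
print(probs[(0, "")])     # 1.0
print(probs[(3, "00")])   # 0.5
```

With depth 4 and two remembered bits, no layer has more than four states, which loosely corresponds to a 4x4 model.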

8
Compression Algorithm- Markov V2F Coding
Algorithm (2)

Example of 4X4 Markov model

Markov model
2-bit V2F coding tree and codebook for Markov
state 0
9
Compression Algorithm- Markov V2F Coding
Algorithm (3)

Code compression procedure
statistics-gathering phase
choose the width and depth for Markov model.
gather the probability for each transition by
going through the whole program.
codebook construction phase
generate an N-bit V2F coding tree and
codebook for each state.
an M-state Markov model yields M codebooks,
each with 2^N codewords.
codeword assignment can be arbitrary.
compression phase
traverse the coding tree for each state from the
root until a leaf node is met.
emit the codeword associated with the leaf node.
jump to the other coding tree indicated by the
leaf node.
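The three phases can be sketched end to end on a deliberately tiny model. Everything specific here is an assumption for illustration: a 2-state Markov model whose state is simply the previous bit, made-up transition probabilities standing in for the gathered statistics, and codewords assigned in sorted-phrase order (the slides note the assignment can be arbitrary).

```python
import heapq

# Hypothetical statistics: P0[s] = P(next bit is 0 | current state s).
P0 = {0: 0.9, 1: 0.3}

def next_state(state, bit):
    return int(bit)  # assumed model: the state is the previous bit

def build_codebook(start, n_bits):
    """Grow the most probable leaf until the tree has 2**n_bits leaves."""
    heap = [(-1.0, "", start)]  # (-probability, phrase, state reached)
    while len(heap) < 2 ** n_bits:
        neg_p, phrase, state = heapq.heappop(heap)
        for bit, p in (("0", P0[state]), ("1", 1.0 - P0[state])):
            heapq.heappush(heap, (neg_p * p, phrase + bit, next_state(state, bit)))
    leaves = sorted((phrase, state) for _, phrase, state in heap)
    # phrase -> (codeword, codebook to use next)
    return {ph: (format(i, f"0{n_bits}b"), st) for i, (ph, st) in enumerate(leaves)}

def compress(bits, codebooks, state=0):
    out, phrase = [], ""
    for bit in bits:
        phrase += bit
        if phrase in codebooks[state]:
            codeword, state = codebooks[state][phrase]  # jump to the indicated tree
            out.append(codeword)
            phrase = ""
    return out, phrase

def decompress(codewords, codebooks, state=0):
    bits = ""
    for cw in codewords:
        inverse = {c: (ph, st) for ph, (c, st) in codebooks[state].items()}
        phrase, state = inverse[cw]  # the next codebook is known only now
        bits += phrase
    return bits

codebooks = {s: build_codebook(s, 2) for s in P0}
codewords, leftover = compress("0001110001", codebooks)
print(codewords, repr(leftover))         # ['00', '11', '10', '01'] ''
print(decompress(codewords, codebooks))  # 0001110001
```

In this sketch the two states end up with different phrase sets ("000", "001", "01", "1" for state 0 and "0", "10", "110", "111" for state 1), so the same codeword decodes differently depending on the state the previous codeword left behind.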

10
Decompression Architecture

Decoder
N-bit table (i.e. codebook) lookup unit
very small
less than 100 gates
the size is only 4 um^2 (TSMC 0.25 um cell library).
Parallel decompression
memoryless V2F: possible
all codewords in the compressed code are
independent.
Markov V2F: impossible
the codebook for the next N-bit chunk is known
only after the current N-bit chunk is
decompressed.
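The independence of memoryless codewords can be illustrated in software. The sketch below is only an analogy for the hardware: it assumes the 2-bit codebook from the earlier encoding example and uses a thread pool in place of parallel lookup units. Each N-bit chunk is decoded with the same table, so the compressed stream can be cut at any codeword boundary.

```python
from concurrent.futures import ThreadPoolExecutor

DECODE_TABLE = {"11": "000", "10": "001", "01": "01", "00": "1"}  # codeword -> phrase
N_BITS = 2

def decode_chunk(chunk):
    """Plain table lookup, one N-bit codeword at a time."""
    return "".join(DECODE_TABLE[chunk[i:i + N_BITS]]
                   for i in range(0, len(chunk), N_BITS))

def parallel_decode(compressed, workers=2):
    n_codewords = len(compressed) // N_BITS
    # Codewords per worker (rounded up), converted back to a length in bits.
    per_worker = max(1, -(-n_codewords // workers)) * N_BITS
    chunks = [compressed[i:i + per_worker]
              for i in range(0, len(compressed), per_worker)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so concatenation is safe.
        return "".join(pool.map(decode_chunk, chunks))

print(parallel_decode("110110"))   # 00001001
```

The same split would not work for Markov V2F, because the table needed for a chunk depends on the result of decoding the chunk before it.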

11
Power Reduction for Instruction Bus (1)

Codeword assignment
does not affect compression ratio.
can reduce bit toggling if it is done carefully.
power consumption on the bus is proportional to
the number of bit toggles on the bus
Formulation
each codeword can be represented by (C_i, W_j)
C_i: one of the M codebooks (i = 1, 2, ..., M)
W_j: one of the codewords in codebook C_i (j = 1,
2, ..., 2^N)

12
Power Reduction for Instruction Bus (2)

Formulation
codeword transition graph
E_i: the number of times transition i
occurs
H_i: the Hamming distance between the two N-bit
binary codewords joined by transition i
goal
find the codeword assignment that minimizes
the total bus toggles (i.e. the sum of H_i * E_i).
proved to be an NP problem when M 1.
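The objective can be written down directly. In this sketch the graph is a dictionary from node pairs to transition counts and the assignment maps each node to an N-bit codeword; the node names and counts are made up for illustration, and a single codebook is assumed.

```python
def hamming(a, b):
    """Number of bit positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def total_toggles(edges, assignment):
    """Sum of (transition count) * (Hamming distance) over all edges."""
    return sum(count * hamming(assignment[u], assignment[v])
               for (u, v), count in edges.items())

# Hypothetical codeword transition graph: (node, node) -> number of transitions.
edges = {("a", "b"): 10, ("b", "c"): 3, ("a", "c"): 1}

careful = {"a": "00", "b": "01", "c": "11"}
careless = {"a": "00", "b": "11", "c": "01"}
print(total_toggles(edges, careful))    # 15
print(total_toggles(edges, careless))   # 24
```

Both assignments use the same set of codewords, so the compression ratio is unchanged; only the number of bus toggles differs.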

13
Power Reduction for Instruction Bus (3)

A greedy heuristic codeword assignment algorithm
1. sort all the edges by weights in decreasing
order
2. for each edge, if either node is not assigned,
assign valid codewords with minimal Hamming
distance.
3. go to step 2 until all nodes are assigned.
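A minimal sketch of the greedy steps above for a single codebook follows. Several details are assumptions, since the slides leave them open: ties are broken by iteration order, an edge with both ends unassigned takes the closest pair of free codewords, and nodes that appear in no edge receive whatever codewords remain.

```python
from itertools import combinations

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def greedy_assign(edges, nodes, n_bits):
    """edges: {(u, v): transition count}; returns node -> codeword."""
    free = [format(i, f"0{n_bits}b") for i in range(2 ** n_bits)]
    assigned = {}
    # Step 1: visit edges from heaviest to lightest.
    for (u, v), _ in sorted(edges.items(), key=lambda kv: -kv[1]):
        if u == v or (u in assigned and v in assigned):
            continue  # self-loops never toggle; fully assigned edges are fixed
        if u not in assigned and v not in assigned:
            # Step 2, both ends free: take the closest pair of unused codewords.
            a, b = min(combinations(free, 2), key=lambda pair: hamming(*pair))
            free.remove(a)
            free.remove(b)
            assigned[u], assigned[v] = a, b
            continue
        if u not in assigned:
            u, v = v, u  # make u the end that already has a codeword
        # Step 2, one end free: take the unused codeword closest to the other end.
        best = min(free, key=lambda c: hamming(c, assigned[u]))
        free.remove(best)
        assigned[v] = best
    for node in nodes:  # nodes that appear in no edge
        if node not in assigned:
            assigned[node] = free.pop(0)
    return assigned

edges = {("a", "b"): 10, ("b", "c"): 3, ("a", "c"): 1, ("c", "d"): 2}
print(greedy_assign(edges, ["a", "b", "c", "d"], n_bits=2))
# {'a': '00', 'b': '01', 'c': '11', 'd': '10'}
```

In this small example the three heaviest edges all end up with a Hamming distance of 1, and only the lightest edge (a, c) pays 2 toggles per transition.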

14
Experimental Results- Compression ratio using
memoryless V2F
Best compression ratio (72.7%) when N = 4
15
Experimental Results- Compression ratio using
Markov V2F
Average compression ratio is 56% when N = 4.
16
Experimental Results- Instruction Bus Toggles
17
Conclusion and Future Work

VLIW code compression schemes using V2F are
proposed.
A greedy codeword assignment algorithm is
presented to reduce instruction bus toggles.
Future work
better heuristic algorithm for low-power
codeword assignment
the ASIC design of the decompression architecture

18
Very Long Instruction Word (VLIW) Architecture

A single instruction specifies more than one
concurrent operation
The instruction is quite large.
VLIW processor relies on compiler to pack the
operations into an instruction.
VLIW processor is not software compatible with
any general purpose processor.
Compaction depends on the instruction level
parallelism.
VLIW leads to simple hardware implementation
(compared to superscalar).
