1. A Dictionary Construction Technique for Code Compression Systems with Echo Instructions
Philip Brisk
Jamie Macbeth
Ani Nahapetian
Majid Sarrafzadeh
{philip, macbeth, ani, majid}@cs.ucla.edu
Embedded and Reconfigurable Systems Lab
Computer Science Department
University of California, Los Angeles
LCTES '05, June 16, 2005, Chicago, IL
2. Outline
- Introduction: Code Compression
- Dictionary Compression
- Dictionary Construction
- Overview of the Algorithm
- Experimental Methodology and Results
- Summary
3. Introduction: Code Compression for Embedded Systems
- Why Reduce Program Size?
  - Reduces Memory Requirements
  - Silicon Cost of Program Storage in on-chip ROMs
  - As Embedded Systems Become More Complex, Ever-More Functionality Will Migrate to Software
- Costs of Runtime Decompression
  - Performance Overhead
  - Area of the Decoder Circuitry
4. Dictionary Compression
- Find Repeated Code Sequences
- Place Each Sequence Into a Dictionary
- Replace Each Sequence in the Program with a Codeword that Accesses the Dictionary
[Figure: a Program whose codewords point into a Dictionary]
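The three steps on this slide can be sketched in a few lines. This is a minimal illustration only, not the construction presented in this talk: it assumes a fixed sequence length, treats instructions as opaque strings, and uses a hypothetical `("CODEWORD", k)` tuple as the codeword encoding.

```python
from collections import Counter

def compress(program, length):
    """Replace repeated runs of `length` instructions with codewords."""
    # Step 1: find repeated sequences by counting every window of `length`.
    windows = Counter(tuple(program[i:i + length])
                      for i in range(len(program) - length + 1))
    # Step 2: each window seen more than once becomes a dictionary entry.
    dictionary = [seq for seq, n in windows.items() if n > 1]
    index = {seq: k for k, seq in enumerate(dictionary)}
    # Step 3: scan left to right and replace matches with codewords.
    out, i = [], 0
    while i < len(program):
        seq = tuple(program[i:i + length])
        if seq in index:
            out.append(("CODEWORD", index[seq]))  # hypothetical encoding
            i += length
        else:
            out.append(program[i])
            i += 1
    return out, dictionary

def decompress(compressed, dictionary):
    """Expand every codeword back into its dictionary sequence."""
    out = []
    for item in compressed:
        if isinstance(item, tuple) and item[0] == "CODEWORD":
            out.extend(dictionary[item[1]])
        else:
            out.append(item)
    return out

program = ["ld", "add", "st", "mul", "ld", "add", "st"]
compressed, dictionary = compress(program, 3)
# compressed == [("CODEWORD", 0), "mul", ("CODEWORD", 0)]
```

The sketch counts overlapping windows and never checks whether a replacement actually saves space; a real compressor would weigh the dictionary size against the codewords saved.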
5. CALD and Echo Instructions
- CALD Instructions
  - Place each sequence in a dictionary
  - All Codewords Point to the Dictionary
- Echo Instructions
  - Leave one Instance of the Sequence Inline
  - All Codewords Point to the Sequence
[Figure: CALD, a Program whose codewords point into a separate Dictionary; Echo, a Program whose codewords point back into the Program]
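To make the echo variant concrete, the sketch below expands a program in which one instance of a sequence stays inline and later occurrences refer back to it. The `("ECHO", start, length)` tuple is a hypothetical encoding chosen for illustration, and the sketch assumes an echoed region contains no echo of its own.

```python
def expand_echo(program):
    """Expand ("ECHO", start, length) items into the instructions they reference."""
    out = []
    for item in program:
        if isinstance(item, tuple) and item[0] == "ECHO":
            _, start, length = item
            # The referenced sequence lives inline in the program itself,
            # so no separate dictionary is needed.
            out.extend(program[start:start + length])
        else:
            out.append(item)
    return out

compressed = ["ld", "add", "st", "mul", ("ECHO", 0, 3)]
expanded = expand_echo(compressed)
# expanded == ["ld", "add", "st", "mul", "ld", "add", "st"]
```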
6. Compression Algorithms
- The Traditional Approach: Compression Performed at Link Time
  - Substring Matching [Fraser et al., 1984]
  - Register Renaming [Cooper and McIntosh, 1999], [Debray et al., 2000]
  - Instruction Rescheduling [De Sutter et al., 2002]
- Our Approach is Somewhat Different
  - Identify Repeated Isomorphic Patterns that Occur within the Intermediate Representation PRIOR TO Register Allocation [Brisk et al., 2004]
7. Dictionary Construction
[Figure: two DAGs, their schedules, and two candidate dictionaries]
- Sequence 1 / DAG 1: A: R1 ← R2 R3; B: R4 ← R5 R6; C: R7 ← R1 R4
- Sequence 2 / DAG 2: A: R1 ← R2 R3; C: R7 ← R1 R4
- Dictionary 1 (5 instructions): A, B, C and A, C stored separately
- Dictionary 2 (3 instructions): B, A, C, with A, C shared
- 2 Schedules Exist for DAG 1
- DAG 2 is isomorphic to a subgraph of DAG 1
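The claim that two schedules exist for DAG 1 can be checked by enumerating topological orders. The sketch below assumes DAG 1 has edges A to C and B to C (C reads R1 and R4, which A and B write); the function name `schedules` is illustrative only.

```python
def schedules(nodes, edges):
    """Yield every topological order of the DAG as a string."""
    preds = {n: {u for u, v in edges if v == n} for n in nodes}

    def extend(prefix, remaining):
        if not remaining:
            yield "".join(prefix)
            return
        for n in sorted(remaining):
            # n may be scheduled once all of its predecessors are placed.
            if preds[n] <= set(prefix):
                yield from extend(prefix + [n], remaining - {n})

    yield from extend([], set(nodes))

dag1 = list(schedules("ABC", [("A", "C"), ("B", "C")]))  # ["ABC", "BAC"]
dag2 = list(schedules("AC", [("A", "C")]))               # ["AC"]
```

Only the schedule BAC contains AC as a contiguous run, which is why the second dictionary needs 3 instructions instead of 5.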
8. Isomorphic Pattern Generation
- Edge Contraction
- Add an Operation to a Pattern
- Combine 2 Patterns into a Larger One
- Build a Subgraph Hierarchy (SH)
[Slides 9-19 repeat these bullets while a figure builds the SH step by step: patterns T1, T2, T3, and T4 are identified in turn and added to the SH.]
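A rough sketch of one edge-contraction step follows. It is a simplification, not the pattern generator of [Brisk et al., 2004]: it assumes a pattern is identified only by the pair of endpoint labels, contracts non-overlapping occurrences greedily, and does not check convexity or full isomorphism. All names are hypothetical.

```python
from collections import Counter

def contract_most_frequent(labels, edges, new_label):
    """labels: dict node -> label; edges: set of (u, v) pairs."""
    # Tally edge types by the labels of their endpoints.
    counts = Counter((labels[u], labels[v]) for u, v in edges)
    if not counts:
        return labels, edges, None
    best, _ = max(sorted(counts.items()), key=lambda kv: kv[1])
    merged, used = {}, set()   # merged maps an absorbed node to its representative
    for u, v in sorted(edges):
        if (labels[u], labels[v]) == best and u not in used and v not in used:
            used.update((u, v))
            merged[v] = u
    reps = set(merged.values())
    # Representatives take the new pattern label; absorbed nodes disappear.
    new_labels = {n: (new_label if n in reps else lab)
                  for n, lab in labels.items() if n not in merged}
    rep = lambda n: merged.get(n, n)
    new_edges = {(rep(u), rep(v)) for u, v in edges if rep(u) != rep(v)}
    return new_labels, new_edges, best

labels = {1: "ld", 2: "add", 3: "ld", 4: "add", 5: "st"}
edges = {(1, 2), (3, 4), (2, 5)}
new_labels, new_edges, pattern = contract_most_frequent(labels, edges, "T1")
# pattern == ("ld", "add"); both ld/add pairs collapse into T1 nodes
```

Repeating the step on the contracted graph would combine T1 nodes with further operations or with each other, which is how larger patterns enter the hierarchy.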
20. An SH Grammar
- The SH is also a DAG
- Generate a pattern Tk from sub-patterns Ti and Tj
- Contract edge (Ti, Tj)
- Create a Production Tk → TiTj
[Figure: example SH over an operation x and patterns T1 through T4]
T2 → xT1
T4 → T3T2
21. Derivations and Scheduling
Grammar:
G1 → G2G3
G2 → G4b
G3 → G5g
G4 → ac
G5 → G6f
G5 → G7e
G6 → de
G7 → df
[Figure: DAG G1 over operations a through g, with the derivation trees of G1 that expand G5 through G6 and through G7]
Derivations: acbdefg, acbdfeg
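The two derivations can be reproduced mechanically from the grammar on this slide. The sketch below expands every production recursively; the dictionary layout and the function name `derive` are illustrative choices.

```python
from itertools import product

# Productions from the slide; lowercase letters are terminals (operations).
GRAMMAR = {
    "G1": [["G2", "G3"]],
    "G2": [["G4", "b"]],
    "G3": [["G5", "g"]],
    "G4": [["a", "c"]],
    "G5": [["G6", "f"], ["G7", "e"]],
    "G6": [["d", "e"]],
    "G7": [["d", "f"]],
}

def derive(symbol):
    """Return every terminal string derivable from `symbol`."""
    if symbol not in GRAMMAR:          # terminal: derives only itself
        return [symbol]
    results = []
    for rhs in GRAMMAR[symbol]:
        parts = [derive(s) for s in rhs]
        # Concatenate one choice per right-hand-side symbol, in order.
        results.extend("".join(combo) for combo in product(*parts))
    return results

print(derive("G1"))  # ['acbdefg', 'acbdfeg']
```

Each derived string is one schedule of G1, so the choice between G5 → G6f and G5 → G7e is a choice between two schedules.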
22. Compatibility
Ti, Tj: patterns; Si, Sj: schedules for Ti, Tj
Assume Ti is a Subgraph of Tj
We want Ti and Tj to Share the Same Dictionary Entry
Then Si must be a Contiguous Subsequence of Sj.
AC is a Contiguous Subsequence of BAC but not of ABC
[Figure: the schedule A, B, C next to the schedule B, A, C, in which A, C appears contiguously; A: R1 ← R2 R3, B: R4 ← R5 R6, C: R7 ← R1 R4]
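The compatibility condition is a plain contiguous-subsequence test. A minimal sketch, with schedules represented as strings of operation names (an assumption made for brevity):

```python
def is_contiguous_subsequence(si, sj):
    """True if schedule si appears as an unbroken run inside schedule sj."""
    n = len(si)
    return any(list(sj[k:k + n]) == list(si)
               for k in range(len(sj) - n + 1))

print(is_contiguous_subsequence("AC", "BAC"))  # True
print(is_contiguous_subsequence("AC", "ABC"))  # False
```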
23. Convex Cuts in DAGs
- Let G = (V, E) be a DAG
- A Cut is a Partition of V
- A Convex Cut cannot have edges that cross the boundary of a cut in BOTH directions
- SH Construction Ensures Convex Cuts
[Figure: a DAG shown with a Convex Cut / Scheduling and with a Non-Convex Cut]
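The convexity condition translates directly into a check over the edge list. In this sketch a cut is given by one side of the partition, and the three-node chain is a hypothetical example rather than the DAG from the figure.

```python
def is_convex_cut(edges, part):
    """A cut (part, rest) is convex if edges cross it in at most one direction."""
    part = set(part)
    forward = any(u in part and v not in part for u, v in edges)
    backward = any(u not in part and v in part for u, v in edges)
    return not (forward and backward)

chain = [("a", "b"), ("b", "c")]           # a -> b -> c
print(is_convex_cut(chain, {"a", "b"}))    # True: only b -> c crosses
print(is_convex_cut(chain, {"a", "c"}))    # False: a -> b and b -> c cross in opposite directions
```

With a convex cut, one side can be scheduled entirely before the other, which is what lets a sub-pattern occupy a contiguous run of the schedule.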
24. Convex Cuts and Compatibility
[Figure: DAG G1 over operations a through g with two cuts, corresponding to the productions G1 → G2G3 and G1 → G4G5. Panels show G1 with cut (2,3), G1 with cut (4,5), and G1 with both cuts (2,3),(4,5); the figure carries the annotation CYCLE!]
25. Generalized Compatibility
Given a Set of Productions with G1 on the LHS:
G1 → G2G3, G1 → G4G5, …, G1 → G2kG2k+1
How can we Tell if they are Compatible?
- Three Criteria Equivalent to Compatibility
  - The graph obtained from G1 by applying the cuts (2,3), (4,5), …, (2k, 2k+1) is Acyclic
  - G2 ⊆ G4 ⊆ … ⊆ G2k
  - G2k+1 ⊆ … ⊆ G5 ⊆ G3
The Pragmatic Question:
If all Productions are NOT Compatible, what is the Largest Compatible Subset?
26. The Subset/Subgraph View of Compatibility and Scheduling
[Figure: Gi ⊆ Gj; Gi has schedule Si and Gj - Gi has schedule Sj-i]
- Construct a Schedule Si for Gi
- Construct a Schedule Sj-i for Gj - Gi
- Construct a Schedule Sj = SiSj-i for Gj
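The three construction steps can be sketched with the standard-library `graphlib` module. One assumption goes beyond the slide: the slide writes the result as Si followed by Sj-i, while this sketch places Sj-i first whenever an edge runs from Gj - Gi into Gi, so that the concatenation remains a valid schedule for a convex cut in either direction. The function names are hypothetical.

```python
from graphlib import TopologicalSorter

def schedule(nodes, edges):
    """One topological order of the subgraph induced by `nodes`."""
    ts = TopologicalSorter({n: set() for n in nodes})
    for u, v in edges:
        if u in nodes and v in nodes:
            ts.add(v, u)               # v depends on u
    return list(ts.static_order())

def nested_schedule(gj, gi, edges):
    """Schedule Gj so that the schedule of Gi is a contiguous run."""
    gi = set(gi)
    rest = set(gj) - gi
    si = schedule(gi, edges)           # schedule for Gi
    s_rest = schedule(rest, edges)     # schedule for Gj - Gi
    # Assumption: choose the concatenation order from the edge direction.
    rest_first = any(u in rest and v in gi for u, v in edges)
    return s_rest + si if rest_first else si + s_rest

edges = [("A", "C"), ("B", "C")]
print(nested_schedule("ABC", "AC", edges))  # ['B', 'A', 'C']
```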
27. A Production Compatibility Graph
- Represent the Subgraph Relation as a DAG, called the Production Compatibility Graph (PCG)
- Productions G1 → Gi and G1 → Gj create vertices Gi and Gj
- Add an Edge (Gi, Gj) to the PCG if
  - 1. Gi ⊂ Gj
  - 2. There is no Gk such that Gj ⊃ Gk ⊃ Gi
- Any PATH in the PCG Corresponds to a Subset of Patterns that can be Scheduled Contiguously within a Dictionary entry for G1.
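The two edge rules amount to keeping only the immediate containment relations between patterns. The sketch below assumes each pattern can be represented by its set of operations, with set inclusion standing in for the subgraph relation; the pattern contents are hypothetical and are not taken from the example figure.

```python
def build_pcg(patterns):
    """patterns: dict name -> frozenset of operations. Returns the PCG edge set."""
    edges = set()
    for i, gi in patterns.items():
        for j, gj in patterns.items():
            # Rule 1: Gi is a proper subset of Gj.
            # Rule 2: no Gk lies strictly between Gi and Gj.
            if gi < gj and not any(gi < gk < gj for gk in patterns.values()):
                edges.add((i, j))
    return edges

patterns = {
    "G2": frozenset("a"),
    "G4": frozenset("ab"),
    "G6": frozenset("abc"),
}
print(sorted(build_pcg(patterns)))  # [('G2', 'G4'), ('G4', 'G6')]
```

The edge (G2, G6) is omitted because G4 lies between the two, so the path G2, G4, G6 already expresses that all three can share one entry.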
28. PCG Example
[Figure: DAG G1 over operations a through g, its sub-patterns G2 through G11, and the resulting PCG]
29. Algorithm Overview
- Recall that the Subgraph Hierarchy is a DAG
- Process SH Entries in Topological Order
- All Sub-Patterns Processed Before Each Pattern
- Construct a PCG for each SH Entry
- Assign Vertex Weights to Each Pattern based on the Number of Sub-Patterns in the Dictionary Entry
- Find Max Vertex-Weighted Path in the PCG
- Determine the Maximum Gain Pattern in the SH
- Remove the Max Gain Pattern and all Sub-Patterns Selected for its Dictionary Entry
- Repeat until the SH is Empty
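The path-finding step can be sketched as dynamic programming over a topological order of the PCG. The vertex names and weights below are hypothetical, and the sketch assumes positive weights; how the weights are derived from sub-pattern counts is not reproduced here.

```python
from graphlib import TopologicalSorter

def max_weight_path(weights, edges):
    """Return (total weight, path) of a maximum vertex-weighted path in a DAG."""
    preds = {v: set() for v in weights}
    for u, v in edges:
        preds[v].add(u)
    best = {}   # vertex -> (best total, best path ending at that vertex)
    for v in TopologicalSorter(preds).static_order():
        # Extend the best path among predecessors, or start a new path at v.
        total, path = max((best[u] for u in preds[v]), default=(0, []))
        best[v] = (total + weights[v], path + [v])
    return max(best.values())

weights = {"G2": 1, "G4": 2, "G6": 3, "G8": 1}
edges = [("G2", "G4"), ("G4", "G6"), ("G2", "G8")]
print(max_weight_path(weights, edges))  # (6, ['G2', 'G4', 'G6'])
```

The patterns on the returned path are the ones that would share a single dictionary entry.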
30. Experimental Framework
- Algorithm Built into the Machine SUIF Compiler
- Consolidate Each Application using link_suif Pass
- All Unrolled Loops Manually Re-rolled
- Standard Front End Compilation Script
- One Round of Constant Folding/DCE
- Instruction Selection for Alpha Architecture
- ARM Back End Recently Released
- Detect Recurring Isomorphic Patterns in the IR
- Analysis described in Brisk et al., 2004
- Dictionary Construction as Described Here
31. Experimental Methodology
- Cannot Compare with Substring Matching
- Many Schedules Exist for Each DAG
- Substring Matching Assumes Scheduled Code
- How to Determine the Best Schedule for Each DAG?
- Our Algorithm Determines a Schedule for the Entire Set of DAGs to Maximize Pattern Overlap
- Naïve Approach: Each Pattern Gets Its Own Dictionary Entry
- Our Approach: Isomorphism/Scheduling
32. Experimental Results
Applications Taken from MediaBench [Lee et al., 1997]
33. Compilation Time
Benchmark     Total (sec)   Dictionary (sec)   (%)
Epic          9.88          0.524              5.30
G.721         2.71          0.196              7.23
GSM           33.6          0.821              2.44
JPEG          362           16.1               4.45
MPEG2 Dec     32.3          1.31               4.06
MPEG2 Enc     65.1          1.99               3.06
Pegwit        32.6          1.10               3.37
PGP           198           5.64               2.85
PGP (RSA)     9.06          0.520              5.74
Rasta         18.1          0.871              4.81
34. Conclusion
- Algorithm Given for Dictionary Construction
  - What Is Built is Actually an Intermediate Representation of a Dictionary
- Combination of 3 Classically Hard Problems
  - Graph/Subgraph Isomorphism
  - Scheduling
  - Dictionary Construction/Compression
- Future Work: Register Allocation and Assignment
  - Make a Best Effort to Assign Registers So that Isomorphic Patterns have Identical Register Usage
35. References
- 1. Brisk, P., Nahapetian, A., and Sarrafzadeh, M. Instruction Selection for Compilers that Target Architectures with Echo Instructions. SCOPES 2004.
- 2. Fraser, C. W., Myers, E., and Wendt, A. Analyzing and Compressing Assembly Code. Symposium on Compiler Construction, 1984.
- 3. Cooper, K. D., and McIntosh, N. Enhanced Code Compression for Embedded RISC Processors. PLDI 1999.
- 4. De Sutter, B., De Bus, B., and De Bosschere, K. Sifting out the Mud: Low-Level C++ Code Reuse. OOPSLA 2002.
- 5. Debray, S., Evans, W., Muth, R., and De Sutter, B. Compiler Techniques for Code Compaction. TOPLAS, 2000.
- 6. Lee, C., Potkonjak, M., and Mangione-Smith, W. H. MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems. MICRO-30, 1997.
36. Questions?