Translation Validation for an Optimizing Compiler

1
Translation Validation for an Optimizing Compiler
  • Guy Erez

Based on the article by George C. Necula (ACM SIGPLAN 2000)
Advanced Programming Languages Seminar, Winter 2000
2
In a Nutshell
  • The Problem: Verify that the optimized code and the
    source code are equivalent
  • Partial (heuristic) Solution: Independently prove
    the validity of each translation pass
  • Motivation: Optimizer testing

3
Outline
  • Introduction
  • Intermediate Language
  • An extensive example
  • Simulation Relation
  • Execution Pair
  • Equivalence Checking
  • Branch Navigation
  • Results and Limitations

4
Methods of Proving Compiler Correctness
  • Prove the general correctness of the compiler
      • absolute
      • tedious
      • impractical for large programs
      • highly dependent on the compiler's code

5
Methods of Proving Compiler Correctness (cont.)
  • Show that each translation phase was valid
      • weaker
      • one proof per program
      • applicable to large programs
      • independent of the compiler's code

6
Compilation Process
Source Code -> Intermediate Language (IL) -> Target Code
7
Optimization Process
[Diagram: IL Code 0 -> Optimize Pass -> IL Code 1 -> ... -> IL Code n; a Validator checks each pass]
8
The IL in GNU C (subset)
  • Instructions
  • Expressions
  • Operators

9
An Example
extern int g;
extern int a[];
int main() {
  int n;  /* n contains the length of the array */
  int i;
  for (i = 0; i < n; i++) a[i] = g * i + 3;
  return i;
}
10
And in IL
for (i = 0; i < n; i++) a[i] = g * i + 3;
return i;
[Figure: the corresponding IL code]
11
After Transformation
  • Use registers
  • Transform the while loop into a repeat loop
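The two transformations named on this slide can be sketched by hand in C. This is only an illustration under stated assumptions: `run_target` is a hand-written stand-in for the optimized code, not actual gcc output, and `g` and `a` are passed as parameters (they are extern globals on the slide) so that both versions can be run side by side. The helper `same_behaviour` is hypothetical and merely tests one input; the validator described in this talk proves equivalence symbolically instead of testing.

```c
/* Source version: the loop from the "An Example" slide. */
int run_source(int g, int *a, int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = g * i + 3;
    return i;
}

/* Hand-written "optimized" version (illustrative, not gcc output):
   g and i live in locals standing in for registers, and the loop is
   rewritten as a guarded repeat (do-while) loop. */
int run_target(int g, int *a, int n)
{
    int r_g = g;               /* g kept in a "register" */
    int r_i = 0;
    if (r_i < n) {             /* guard replaces the first loop test */
        do {
            a[r_i] = r_g * r_i + 3;
            r_i++;
        } while (r_i < n);
    }
    return r_i;
}

/* Returns 1 if both versions return the same value and leave the same
   array contents for this one input, 0 otherwise (sketch: n <= 16). */
int same_behaviour(int g, int n)
{
    int src[16] = {0}, tgt[16] = {0};
    if (n > 16)
        return 0;
    if (run_source(g, src, n) != run_target(g, tgt, n))
        return 0;
    for (int k = 0; k < 16; k++)
        if (src[k] != tgt[k])
            return 0;
    return 1;
}
```

Calling `same_behaviour(5, 10)` compares the two versions for a ten-element array; agreement on sampled inputs is weaker evidence than the per-pass proof the validator aims for.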
12
Equivalence
  • x1, ..., xn: variables in the source
  • y1, ..., ym: variables in the target
  • Variable equivalence: x1 = y3
  • Expression equivalence: x1 + x2 = y3 + 6

13
Simulation Relation
  • A set of equivalences between a source block and
    a target block

14
Execution Pair
  • Definition: an execution path in the source and
    its corresponding path in the target

[Diagram: a source path alongside its target path]
15
Checking Equivalence
  • Equivalence is checked at the end of a specific
    execution pair
  • A variable's value after the run is marked with a
    prime

Symbolic substitution:
x' = x + 1
y' = y + 3
16
Equivalence Simplification
  • An equivalence can be simplified using
      • arithmetic rules
      • already proven equivalences
  • Example: if x' = x + 1 and y' = y + 5, then
    3x' + y' => 3(x + 1) + y + 5 => 3x + 3 + y + 5
  • An equivalence holds if it can be simplified to
    an already proven equivalence
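The substitute-then-simplify step can be sketched for the special case of linear expressions. The sketch assumes the slide's example reads x' = x + 1 and y' = y + 5 (the operators were lost in the transcript, so this reading is an assumption), and the names `Lin`, `substitute` and `slide_example_holds` are hypothetical. Expressions are normalized to coefficient form, so checking an equivalence reduces to comparing coefficients.

```c
/* A linear expression cx*x + cy*y + k over the start-of-run values x, y. */
typedef struct { int cx, cy, k; } Lin;

Lin lin_add(Lin a, Lin b)
{
    return (Lin){ a.cx + b.cx, a.cy + b.cy, a.k + b.k };
}

Lin lin_scale(int c, Lin a)
{
    return (Lin){ c * a.cx, c * a.cy, c * a.k };
}

int lin_equal(Lin a, Lin b)
{
    return a.cx == b.cx && a.cy == b.cy && a.k == b.k;
}

/* Substitutes the end-of-run values x' and y' into p*x' + q*y' + r and
   returns the result in normal form over x and y. */
Lin substitute(int p, int q, int r, Lin x_after, Lin y_after)
{
    Lin out = lin_add(lin_scale(p, x_after), lin_scale(q, y_after));
    out.k += r;
    return out;
}

/* Assumed reading of the slide example: 3x' + y' simplifies to 3x + y + 8. */
int slide_example_holds(void)
{
    Lin x_after = { 1, 0, 1 };   /* x' = x + 1 */
    Lin y_after = { 0, 1, 5 };   /* y' = y + 5 */
    Lin lhs = substitute(3, 1, 0, x_after, y_after);
    Lin rhs = { 3, 1, 8 };       /* 3x + 3 + y + 5 collected */
    return lin_equal(lhs, rhs);
}
```

A real validator has to handle far more than linear arithmetic (memory, comparisons, already proven equivalences); the coefficient form is chosen here only because it makes the arithmetic rules trivially decidable.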

17
Checking Simulation Relations
  • A relation is correct if for each execution pair
    entering it, all of its equivalences hold

Example equivalence: x = y + 1
18
Something fishy
  • What's the point of proving something using the
    same rules that created it?
      • The check is simpler than the optimization
      • It provides an independent perspective on the
        final code

19
Showtime
A. Element #1 holds.
B. There is only one ...
C. Prove elem. #2 (Trivial)
20
Element #5
  • Two execution pairs

21
Element #5 (cont.)
  • The other pair

22
Known Equivalences
  • Equivalences from the start of the run
  • Equivalences at the end of the run

23
Need to Prove
  • The path condition is correct
  • The equivalences hold, mainly

24
Element #5: Path Condition
25
Element #5: The Equivalence
Q.E.D.
26
Algorithm Parts
  • Inferring Simulation Relations
  • Finding execution pairs
  • Solving Constraints

27
Navigating Branches
  • An optimizer might eliminate or reverse branches
  • Problem: did branch B' in the target originate from
    branch B in the source?
  • Solution: use heuristics

28
A Typical Case
29
Similarity
  • The similarity between two branches depends on the
    similarity of their
      • preceding instruction sequences
      • boolean conditions
      • two branching sequences

30
Similarity (cont.)
  • Formally
      • similarity is a numeric relation with values in (0..1)
      • "and" is multiplication
      • "or" is maximum
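The numeric reading of "and" and "or" can be sketched as below. Only `sim_and` and `sim_or` follow directly from the slide; the formula on the slide did not survive in the transcript, so the way `branch_similarity` combines the scores (a straight match or a reversed-branch match, preceded by matching instructions) is an assumption made for illustration, and all names are hypothetical.

```c
/* Similarity scores are numbers between 0 and 1. */
double sim_and(double a, double b) { return a * b; }          /* "and" = multiplication */
double sim_or(double a, double b)  { return a > b ? a : b; }  /* "or"  = maximum */

/* Hypothetical inputs: pairwise similarity scores computed elsewhere. */
typedef struct {
    double preceding;      /* preceding instruction sequences            */
    double cond_same;      /* boolean conditions, compared as written    */
    double cond_negated;   /* boolean conditions, one side negated       */
    double taken_taken;    /* taken sequence vs. taken sequence          */
    double fall_fall;      /* fall-through vs. fall-through sequence     */
    double taken_fall;     /* taken vs. fall-through (branch reversed)   */
    double fall_taken;     /* fall-through vs. taken (branch reversed)   */
} BranchScores;

/* Assumed combination, not the formula from the slide: the branches match
   either straight or with the branch reversed, and in both cases the
   preceding instruction sequences must match as well. */
double branch_similarity(BranchScores s)
{
    double straight = sim_and(s.cond_same,
                              sim_and(s.taken_taken, s.fall_fall));
    double reversed = sim_and(s.cond_negated,
                              sim_and(s.taken_fall, s.fall_taken));
    return sim_and(s.preceding, sim_or(straight, reversed));
}
```

Using multiplication for "and" means one weak component pulls the whole score down, while maximum for "or" lets the better of the two alternatives decide.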

31
Boolean Similarity
  • Branches are similar if
  • one can be simplified into the other using simple
    transforms, such as

32
Instruction Similarity
  • Instruction similarity depends on
      • the number of function calls
      • whether the sequences lead to already related
        branches (in that case, similarity is 1.0)

33
Instruction Similarity (cont.)
  • gcc-specific features
      • IL instruction serial numbers
      • source line number information (for code
        duplication detection)

34
Results
  • Detected a known bug in gcc 2.7.2.2
  • Used on large programs
  • Increased compile time by a factor of 4

35
Limitations
  • Cannot handle loop unrolling
  • Cannot resolve all types of equivalences
  • Produces several false alarms (e.g. the gcc bug
    was accompanied by 3 false alarms)

36
Conclusion
  • Automatically infers equivalences
  • Uses
      • simple rules and substitution
      • heuristics
  • Good results
  • Problems
      • false alarms
      • compile-time overhead