Title: An Example of Translation and Proof using Higher-Order Abstract Syntax
1. An Example of Translation and Proof using Higher-Order Abstract Syntax
- Michael W. Whalen
- Advanced Technology Center
- Rockwell Collins Inc.
2. Safety-Critical Systems
3. Code Generation Requirements
- Automatic
- Formally-Defined
- Formal description of source/target language
- Proof that generated code implements specification
- Correctly-Implemented
- Transparent transliteration of translation rules
- Implementation should be rigorously tested
- Usable for Safety-Critical Systems
- Human-Understandable and traceable
- Necessary for fault analysis, code instrumentation
- Required by regulatory agencies
- Fast enough for target environment
4. Aspects of Translation
- 4. Implementation: Designing a translator that transparently implements rules
5. Formal Definition of Compiler Correctness
[Figure: RSML-e Syntax, Compiler Definition, Program Syntax]
6. Operational Semantics and Proof
- Operational semantics provides framework for evaluation, static semantics, and transformations
- Several different flavors of operational semantics
- SOS, Natural Semantics, Abstract Machines
- We want formalism that leads to elegant transformations and proofs
7. Managing Identifiers
- Large part of translation and proof complexity
- Explicit Environments
- Environment-carrying functions [Plotkin SOS, Despeyroux Mini-ML]
- Renaming over scopes [Drossopoulou Java]
- Implicit Environments
- Substitution as meta-rule [Pierce PL Book]
- Lambda variables in object language
- Metalevel support [Hannan93, Whalen05]
- Lambda variables in metalanguage
- Proofs describing substitution behavior provided by metalogic
8. Extended Natural Semantics: Example
Concrete Syntax
function sum(y : int, z : int) : int return y + z
Higher-Order Abstract Syntax
(function_def int (param int (λy.
  (param int (λz. (body
    (binary_expr (lit_expr y) plus (lit_expr z))))))))
Evaluation Rules
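The following is a minimal Python sketch of the higher-order abstract syntax idea, not the λProlog encoding used in the talk. Binders of the object language are represented by Python functions, so substituting an argument for a parameter is just function application. The constructor names mirror the term above; `Lit`, `evaluate`, and `apply_function` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lit:
    """A literal integer value (hypothetical helper, not in the slides)."""
    value: int


@dataclass(frozen=True)
class LitExpr:
    """(lit_expr e): wraps a value or a bound variable."""
    arg: object


@dataclass(frozen=True)
class BinaryExpr:
    left: object
    op: str
    right: object


@dataclass(frozen=True)
class Body:
    expr: object


@dataclass(frozen=True)
class Param:
    type_name: str
    binder: Callable[[object], object]  # meta-level lambda, plays the role of λy.


@dataclass(frozen=True)
class FunctionDef:
    return_type: str
    params: object  # a Param chain ending in a Body


# The term from the slide, with Python lambdas as the binders.
sum_def = FunctionDef(
    "int",
    Param("int", lambda y:
          Param("int", lambda z:
                Body(BinaryExpr(LitExpr(y), "plus", LitExpr(z))))),
)


def evaluate(expr):
    """Evaluate a closed expression; no environment is needed."""
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, LitExpr):
        return evaluate(expr.arg)
    if isinstance(expr, BinaryExpr):
        left, right = evaluate(expr.left), evaluate(expr.right)
        if expr.op == "plus":
            return left + right
        raise ValueError(f"unsupported operator: {expr.op}")
    raise TypeError(f"not an expression: {expr!r}")


def apply_function(fdef, args):
    """Instantiate each binder with an argument, then evaluate the body."""
    node = fdef.params
    for arg in args:
        if not isinstance(node, Param):
            raise TypeError("too many arguments")
        node = node.binder(Lit(arg))  # substitution is function application
    if not isinstance(node, Body):
        raise TypeError("too few arguments")
    return evaluate(node.expr)


result = apply_function(sum_def, [2, 3])
```

Because the meta-language performs the substitution, the evaluator never looks up an identifier by name, which is the property the talk relies on to avoid explicit environments.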
9. Extended Natural Semantics: Typing
Higher-Order Abstract Syntax
(function_def int (param int (λy.
  (param int (λz. (body
    (binary_expr (lit_expr y) plus (lit_expr z))))))))
Typing Rules
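As a rough Python sketch of typing over such a term (again not the talk's λProlog rules): to go under a binder, the checker applies it to a fresh constant and records that constant's type as a hypothesis. In λProlog this would be expressed with universal quantification and implication; here a dictionary stands in for the hypotheses. `Const`, `type_of_expr`, and `type_of_function` are hypothetical names, and terms are plain tuples to keep the example short.

```python
import itertools

_fresh = itertools.count()


class Const:
    """A fresh constant standing in for a bound variable (hypothetical helper)."""

    def __init__(self):
        self.name = f"c{next(_fresh)}"


def type_of_expr(expr, hyps):
    """Return the type of an expression under the typing hypotheses `hyps`."""
    tag = expr[0]
    if tag == "lit_expr":
        arg = expr[1]
        if isinstance(arg, Const):
            return hyps[arg]  # type assumed when the binder was opened
        if type(arg) is int:
            return "int"
        raise TypeError(f"unknown literal: {arg!r}")
    if tag == "binary_expr":
        _, left, op, right = expr
        left_ty, right_ty = type_of_expr(left, hyps), type_of_expr(right, hyps)
        if op == "plus" and left_ty == right_ty == "int":
            return "int"
        raise TypeError(f"ill-typed {op}: {left_ty}, {right_ty}")
    raise TypeError(f"not an expression: {expr!r}")


def type_of_function(fdef):
    """Check a function_def term and return (parameter types, return type)."""
    _, return_type, node = fdef
    hyps, param_types = {}, []
    while node[0] == "param":
        _, param_type, binder = node
        const = Const()            # fresh constant for this parameter
        hyps[const] = param_type   # hypothesis: const has type param_type
        param_types.append(param_type)
        node = binder(const)       # open the binder
    if node[0] != "body":
        raise TypeError("malformed function definition")
    body_type = type_of_expr(node[1], hyps)
    if body_type != return_type:
        raise TypeError(f"body has type {body_type}, expected {return_type}")
    return tuple(param_types), return_type


sum_def = ("function_def", "int",
           ("param", "int", lambda y:
            ("param", "int", lambda z:
             ("body", ("binary_expr", ("lit_expr", y), "plus", ("lit_expr", z))))))

signature = type_of_function(sum_def)
```

The hypotheses are keyed by the constants themselves rather than by identifier strings, so no renaming or name-clash handling is required.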
10. Extended Natural Semantics: Transformation
Higher-Order Abstract Syntax
(function_def int (param int (λy.
  (param int (λz. (body
    (binary_expr (lit_expr y) plus (lit_expr z))))))))
Transformation Rules
11. ENS Transformation, Expanded
(? x ? ((λz.<Body>) x)
((λz.<Body>) x))
12. Aspects of Translation
- 4. Implementation: Designing a translator that transparently implements rules
13. Notions of Completeness and Determinism
[Figure: Source Syntax, Compiler Rules, Program Syntax]
14. Correctness Obligations for SOS Rules
- Obligations for deterministic language
- Obligations are equivalent if source semantics are complete.
15. Translation in Layers
[Figure: layered translation from RSML-e to C, Ada, Java, ...; labels: Semantics Rules, Completeness Proofs, Translation Rules]
16. Evaluation Rules in Translation
Source AST Grammar
... v_expr ::= unknown | id(expr list) | expr. ...
Target AST Grammar
... v_expr ::= if expr then v_expr else v_expr | expr. ...
New Syntax
if expr then v_expr else v_expr
[Figure: Target Evaluation Rules = Source Evaluation Rules + Evaluation rules for new syntax - Rules for Removed Syntax]
17. Translation Proof Structure
- Describe the correctness of contexts
- Describe equivalence of program states
- Describe completeness obligation using evaluation rules for source and target languages and transformation rules
18. Aspects of Translation
- 4. Implementation: Designing a translator that transparently implements rules
19. Source Language: RSML-e
- RSML-e is a Reactive Synchronous Dataflow Language
- Reactive: Specification reacts to changes in external environment at discrete intervals
- Synchronous: those reactions take (logically) zero time
- Dataflow: value of object (variable or interface) can be computed as soon as objects on which it is dependent have been computed.
- Specification consists of Variables and Interfaces
- Variables maintain internal state of model
- Interfaces describe interaction with the external environment
- Two-state model
- Values of variables from previous step can be referenced
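The reactive, dataflow, two-state behaviour described on this slide can be illustrated with a small Python sketch. This is not RSML-e syntax or semantics: the specification format, the variables `count` and `alarm`, the input `tick`, and the function `step` are all hypothetical. Each step computes every variable once, in dependency order, and may read the values from the previous step.

```python
from graphlib import TopologicalSorter

# name -> (dependencies within the current step, function of (current, previous))
SPEC = {
    # count reads its own previous value (two-state model) and the current input
    "count": (["tick"],
              lambda cur, prev: prev["count"] + 1 if cur["tick"] else prev["count"]),
    # alarm depends on the value of count computed in the same step
    "alarm": (["count"], lambda cur, prev: cur["count"] >= 3),
}


def step(spec, inputs, prev):
    """Compute one synchronous step: each variable is evaluated once its
    current-step dependencies are available."""
    graph = {name: set(deps) for name, (deps, _) in spec.items()}
    cur = dict(inputs)
    for name in TopologicalSorter(graph).static_order():
        if name in spec:  # inputs are already present in cur
            cur[name] = spec[name][1](cur, prev)
    return cur


state = {"count": 0, "alarm": False}
history = []
for tick in [True, False, True, True]:
    state = step(SPEC, {"tick": tick}, state)
    history.append((state["count"], state["alarm"]))
```

The sketch uses `graphlib` from the standard library (Python 3.9 or later) to obtain an evaluation order; a forward reference within a step would appear here as a dependency cycle.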
20. Source Language: RSML-e
[Figure: Altitude Switch Specification. Labels: Input Frames, Output Frames, Frame Being Evaluated, Evaluation Result. Frame contents include Reset_Receiver, DOI_Receiver, Clock, Fault_Sender, DOICmd_Sender, <empty>]
21. Source Language: RSML-e
22. Source Language: RSML-e
- Each variable or interface has an assignment
23. Translation: Intermediate Languages
- We move the language successively closer to an imperative language
- RSMLp: We move from the RSML-e synchronous specification language to a synchronous programming language; remove undefined and case lists.
- RSMLt: Switch from a structural to a nominal type system
- RSMLv: Switch from two-state variables to one-state variables
- SIMPLr: Add imperative, rather than functional, assignments to variables (subset of Ada)
- SIMPL: Remove record assignments from SIMPLr (subset of C, Java)
24. Example: RSML-e to RSML-p
- This transformation does two things
- Replaces assignment case lists with assignment expressions
- Removes undefined_val from the type system
- To remove undefined_val we transform all variables in the specification
- var x : T becomes var x : record val : T, def : Boolean
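A small Python sketch of the variable transformation described above: a variable of type T that may hold undefined_val becomes a record with a `val` field of type T and a Boolean `def` field. The reading of `def` as "the value is defined", the placeholder defaults, and the helper names `lift_type`, `lift_value`, and `lower_value` are assumptions made for illustration, not taken from the talk.

```python
UNDEFINED = object()  # stands in for RSML-e's undefined_val

# Assumption: some placeholder value of each type is stored when def is False.
DEFAULTS = {"int": 0, "real": 0.0, "Boolean": False}


def lift_type(type_name):
    """Type T becomes a record type with fields val : T and def : Boolean."""
    return {"val": type_name, "def": "Boolean"}


def lift_value(value, type_name):
    """Encode a possibly-undefined source value as a (val, def) record."""
    if value is UNDEFINED:
        return {"val": DEFAULTS[type_name], "def": False}
    return {"val": value, "def": True}


def lower_value(record):
    """Decode a record back to the source-level value it represents."""
    return record["val"] if record["def"] else UNDEFINED


lifted_type = lift_type("int")
defined = lift_value(7, "int")
undefined = lift_value(UNDEFINED, "int")
```

A similarity relation between source and target states, of the kind the proof obligations need, would relate a source value `v` to any record `r` with `lower_value(r)` equal to `v`.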
25. Transformation Rules
26. Proof Obligations
Context Relation
State Variable Value Similarity Relation
State Relation
27. Proof Obligation: Expressions
Expression Obligation
Lemma about deref
28. Example Proof: pre_expr
Transformation Rule
RSML-e Evaluation Rule
29. Aspects of Translation
- 4. Implementation: Designing a translator that transparently implements rules
30. Implementation
- Prototype Translator in λProlog
- Transparently Implements ENS Rules
[Figure: an ENS rule and the λProlog clause it becomes]
31. Translator Architecture
32. Implementation
- Translator Stats
- Source Code: ~100KB in 27 source/header files

Rule Type                  Lines of Code   Number of Rules
Translation                ~2000           278
RSML-e Static Semantics    ~1000           141
Scaffolding                ~500            45
RSML-e Evaluation          ~350            100
SIMPL Evaluation           ~320            91
33. Implementation

File Name                  Size (LOC)   Compilation Time
records3.rsmle             71           1s
three_altimeters.rsmle     131          2s
numeric_ops.rsmle          230          DNF (ran out of memory)
function_test.rsmle        215          DNF (ran out of memory)

- Teyjus Needs Garbage Collection!
34. Post-Mortem
35. Discussion
- Original work was in first-order system
- Used ID-substitution (Drossopoulou)
- Requires additional rules describing which ids should be substituted (e.g. no record fields)
- Required significant additional lemmas about how terms behave under id substitutions
- I was struggling to complete proofs (and bored) due to sheer number of details related to identifiers
36. Discussion
- HOAS and λProlog made my dissertation much more straightforward
- Language descriptions became simpler
- Translation became much simpler
- Use of implication allowed immediate and simple constructions of compiler environment
- Relations over correct environments are straightforward to construct
- Proofs became much simpler
- No substitution lemmas [Pierce, Despeyroux]
- Proofs 2-3x shorter
37. Binding I: Removing Names
- One goal of HOAS: make identifier names irrelevant
- I was not totally able to do this
- Record fields still keyed by id
- λ-bindings assume a specific order; record expressions allow arbitrary order
- Question: is it possible / a good idea to remove field identifiers?
38. Binding II: Adding Variables
- Translation from higher-level to lower-level language often requires introduction of new variables
- Difficult to motivate translation rules at first
- Led to some odd rule constructions where bindings and code were constructed in parallel
- Example: moving from a language with record-creation expressions (a la ML) to one that does not (a la C)
39. Remove Record Expressions: Example
- Given: type a = record f1 : int, f2 : real
- Want to change something like: f1 = 2y, f2 = 3.1
- Into: create_a(2y, 3.1)
- Need to create:
fun create_a(f1 : int, f2 : real) : a
  var r_result : a in
    r_result.f1 := f1;
    r_result.f2 := f2;
    return r_result
end
40. Remove Record Expressions: Example
Rule: create_type_fn_body Var Type Fields StmtList Block
- Var is the fresh constant bound to the r_result local variable
- Type is the return type of Var
- Fields describes the remaining fields to be assigned within the record
- StmtList defines the field assignments performed thus far
- Block is the returned function block
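The shape of this rule can be sketched in Python as a recursion that consumes the remaining fields while accumulating the assignments made so far, then emits the block once no fields remain. This is only an illustration: it produces text rather than HOAS terms, it uses a fixed name where the rule uses a fresh constant for Var, and the output punctuation (`:`, `:=`, `;`) is an assumption about the target syntax. `create_constructor` and `rewrite_record_expr` are hypothetical helpers.

```python
def create_type_fn_body(var, type_name, fields, stmts):
    """Mirror of the rule's arguments: `fields` are the fields still to be
    assigned, `stmts` the assignments performed thus far; returns the block."""
    if not fields:
        lines = [f"var {var} : {type_name} in"]
        lines += [f"  {stmt};" for stmt in stmts]
        lines += [f"  return {var}", "end"]
        return "\n".join(lines)
    (name, _field_type), rest = fields[0], fields[1:]
    return create_type_fn_body(var, type_name, rest,
                               stmts + [f"{var}.{name} := {name}"])


def create_constructor(type_name, fields):
    """Build the create_<type> function for a record type."""
    params = ", ".join(f"{name} : {ftype}" for name, ftype in fields)
    header = f"fun create_{type_name}({params}) : {type_name}"
    return header + "\n" + create_type_fn_body("r_result", type_name, fields, [])


def rewrite_record_expr(type_name, fields, field_values):
    """Replace a record expression by a call, ordering arguments by declaration."""
    args = ", ".join(field_values[name] for name, _ in fields)
    return f"create_{type_name}({args})"


FIELDS = [("f1", "int"), ("f2", "real")]
constructor = create_constructor("a", FIELDS)
# Field values may be given in any order; the call uses declaration order.
call = rewrite_record_expr("a", FIELDS, {"f2": "3.1", "f1": "2y"})
```

Building the statement list and the binding for the result variable in the same recursion is the "constructed in parallel" pattern the earlier slide describes.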
41. Binding II: Adding Variables
- Similar project at RCI: Translating Lustre to several languages (NuSMV, PVS, SAL)
- Lustre supports PRE-operator that allows reference to previous values of variables
- Fibonacci: x = pre(pre(x, 0), 0) + pre(x, 1)
- To translate to C, we must introduce additional variables for each pre-operator
- Seems tricky to do in HOAS!
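To make the extra variables concrete, here is a hedged Python sketch of what an imperative translation might look like, assuming the Fibonacci equation reads x = pre(pre(x, 0), 0) + pre(x, 1), where pre(e, i) denotes the previous value of e and i on the first step. One state variable is introduced per pre-operator; the variable names are hypothetical, and Python stands in for C.

```python
def fibonacci(steps):
    """Run the dataflow equation for `steps` steps with one state variable
    per pre-operator, each initialised with that operator's initial value."""
    pre_inner = 0  # state for pre(x, 0)
    pre_outer = 0  # state for pre(pre(x, 0), 0)
    pre_x1 = 1     # state for pre(x, 1)
    outputs = []
    for _ in range(steps):
        x = pre_outer + pre_x1
        outputs.append(x)
        # Update state for the next step. pre_outer must read the old
        # value of pre_inner before pre_inner is overwritten.
        pre_outer = pre_inner
        pre_inner = x
        pre_x1 = x
    return outputs


sequence = fibonacci(6)
```

The ordering constraint among the state updates is part of what the translation has to get right, in addition to introducing the new variables themselves.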
42. Binding III: Non-Lexical Scoping
- Many languages allow forward references to identifiers
- Java
- Lustre/SCADE
- I changed the RSML-e semantics to disallow forward references
- (How) Can we represent global scopes in HOAS?
- Alternately, can we add environments for global ids and still get most of the HOAS benefits?
43. Working in a Positivist Logic
- It would be difficult to write semantics and translator entirely without the use of cut
- List non-membership in static semantics
- Evaluation rule for not-equal expressions
- Occasional use of set data structure
- Cuts were not used in rules that referenced structures that could contain meta-level variables or universal constants
- These uses could affect correctness of reasoning
- How will my use of cut affect reasoning in a formal framework?
44. Tool Support
- λProlog gripes
- No syntax for naming commonly used types makes for long type descriptions
- Syntax allows misplaced comma to conjoin two rule instances
- New symbol for reverse implication in rule instance? (<- )
- New rule begins with turnstile? (|- )
- Implication (=>) binds tighter than and (,)
- Teyjus gripes
- No garbage collector
- No warnings on single use of variable
- No warnings on rule declaration without definition
- No warnings on non-use of bound variable within term
- No debugger
45. Conclusion
- Formal approach can be used for real translators
- Difficulty is dependent on choice of formalism
- Original work was in natural semantics
- Much simpler with extended natural semantics
- Some things are still tricky to do in HOAS
- A few improvements to tools would really benefit serious users
46. Conclusion
- SIMPL (Small Imperative Language) semantics may be useful to others
- I didn't want to write it
- YAILS - boring
- However, I needed a small subset of Ada/Java/C
- Literature semantics are cleaner, but no clear correspondence to real languages
- Supports basic records, arrays, block structuring, functions
- Recursion could be added easily
- However, matching C/Java syntax for recursion would be harder
47. Future Work
- Generalizing work to other source languages
- Lustre, SCR
- Adding other target languages
- Extensive testing (if actually to be used on DO-178B development effort)
- Teyjus Improvements
- Optimizations
48. Contact Information
- Crisys Research Group
- on the web: http://www.cs.umn.edu/crisys
- Mike Whalen
- e-mail: mwwhalen@rockwellcollins.com
- phone: (612) 625-4543
- Mats P.E. Heimdahl
- e-mail: heimdahl@cs.umn.edu
- phone: (612) 625-2068