Title: Banana Algebra:
1Banana Algebra
Syntactic Language Extension via an Algebra of
Languages and Transformations
Jacob Andersen (jacand_at_cs.au.dk), Aarhus University
Claus Brabrand (brabrand_at_itu.dk), IT University of Copenhagen
2Abstract
We propose an algebra of languages and
transformations as a means for extending
languages syntactically. The algebra provides a
layer of high-level abstractions built on top of
languages (captured by CFGs) and transformations
(captured by constructive catamorphisms).
The algebra is self-contained in that any term
of the algebra specifying a transformation can be
reduced to a constant catamorphism, before the
transformation is run. Thus, the algebra comes
"for free" without sacrificing the strong safety
and efficiency properties of constructive
catamorphisms. The entire algebra, as
presented in the paper, is implemented as the
Banana Algebra Tool, which may be used to
syntactically extend languages in an incremental
and modular fashion via algebraic composition of
previously defined languages and transformations.
We demonstrate and evaluate the tool via several
kinds of extensions.
3Outline
- Introduction "What is a Banana?"
- Bananas for Language Transformation
- Language Extension Pattern
- Banana Algebra
- Examples
- Implementation
- Related Work
- Conclusion
4What is a 'Banana' ?
- Datatype "list":
  list = Num N | Cons N * list
- Banana ("sum-of-list"), aka. "Catamorphism":
  (| λn.n , λ(n,l).n+l |) : list → N
  Num n     =>  n
  Cons n l  =>  n + l
- Separation of recursion and evaluation
- Implicit recursion on input structure
- Bottom-up re-combination of intermediate results
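The "sum-of-list" banana can be sketched in Python as follows. This is only an illustration of the idea, not the tool's implementation; the names `Num`, `Cons`, `cata`, and `sum_of_list` are chosen here for the example.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar, Union

R = TypeVar("R")


@dataclass(frozen=True)
class Num:
    n: int


@dataclass(frozen=True)
class Cons:
    n: int
    tail: "IntList"


IntList = Union[Num, Cons]


def cata(on_num: Callable[[int], R],
         on_cons: Callable[[int, R], R],
         xs: IntList) -> R:
    """Fold a list bottom-up. All recursion lives here, not in the two cases."""
    if isinstance(xs, Num):
        return on_num(xs.n)
    # Recurse on the tail first, then combine the intermediate result.
    return on_cons(xs.n, cata(on_num, on_cons, xs.tail))


def sum_of_list(xs: IntList) -> int:
    # The banana (| λn.n , λ(n,l).n+l |)
    return cata(lambda n: n, lambda n, l: n + l, xs)


print(sum_of_list(Cons(1, Cons(2, Num(3)))))  # prints 6
```

The two lambdas only say what to do with one node; `cata` supplies the traversal, which is the separation of recursion and evaluation named above.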
5Language Transformation
- Bananas (statically typed):
  (| LS -> LT [τ] c |)
  - Source language 'LS'
  - Target language 'LT'
  - Nonterminal-typing 'τ' : LS -> LT
  - Reconstructors 'c'
- Example:
  list = Num N | Cons N * list
  tree = Nil | Leaf N | Node N * tree * tree
  (| list -> tree [list -> tree]
     Num n     =>  Leaf n
     Cons n l  =>  Node n (Nil) l
  |)
- Type-check'able!
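A minimal Python sketch of the list-to-tree banana, assuming terms are encoded as nested tuples `(constructor, arg, ...)`. The names `TREE_SIG`, `RECONSTRUCTORS`, `list_to_tree`, and `is_tree` are hypothetical. The check here runs on the output at run time, as a stand-in for the static type check the slide refers to.

```python
# Signature of the target datatype 'tree': constructor -> argument sorts.
TREE_SIG = {"Nil": (), "Leaf": ("N",), "Node": ("N", "tree", "tree")}

# One reconstructor per source constructor; results of the recursive
# calls arrive as arguments.
RECONSTRUCTORS = {
    "Num": lambda n: ("Leaf", n),
    "Cons": lambda n, l: ("Node", n, ("Nil",), l),
}


def list_to_tree(term):
    """Catamorphism list -> tree: recurse on sub-terms, then rebuild."""
    ctor, *args = term
    rebuilt = [list_to_tree(a) if isinstance(a, tuple) else a for a in args]
    return RECONSTRUCTORS[ctor](*rebuilt)


def is_tree(term) -> bool:
    """Check that a term is well-formed with respect to TREE_SIG."""
    if not isinstance(term, tuple) or not term or term[0] not in TREE_SIG:
        return False
    sorts = TREE_SIG[term[0]]
    args = term[1:]
    if len(args) != len(sorts):
        return False
    return all(
        isinstance(a, int) if s == "N" else is_tree(a)
        for a, s in zip(args, sorts)
    )


source = ("Cons", 1, ("Cons", 2, ("Num", 3)))
result = list_to_tree(source)
print(result)
print(is_tree(result))
```

Because every reconstructor builds a term of sort `tree` from arguments of the sorts given by the typing, well-formedness of the output can be established by inspecting the reconstructors alone, without running the transformation.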
6Banana Properties
- Banana properties:
  - Simple (corresponds to simple recursion)
  - Safe (syntactically safe and always terminate)
  - Efficient (linear time in size of input and output)
  - (Expressive) (enough for interesting extensions)
- Banana Algebra for free (16 banana ops):
  - Modular
  - Incremental
  - Simple
  - Safe
  - Efficient
  - (Expressive)
8Language Extension Pattern
Numeral extension (source language 'LS'):
  Exp = var Id | lam Id Exp | app Exp Exp
      | zero | succ Exp | pred Exp
Lambda-Calculus (target language 'LT'):
  Exp = var Id | lam Id Exp | app Exp Exp
Nonterminal typing 'τ':
  Exp -> Exp
Reconstructors 'c':
  var V      =>  var V
  lam V E    =>  lam V E
  app E1 E2  =>  app E1 E2
  zero       =>  lam z (var z)
  succ E     =>  lam s E
  pred E     =>  app E (lam z (var z))
Catamorphism:
  (| LS -> LT [τ] c |)
Using very simple numeral encoding
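The numeral extension can be sketched in Python, assuming terms are nested tuples and identifiers are plain strings. The names `RECONSTRUCTORS`, `desugar`, and `constructors` are invented for this sketch; the real tool works on grammars and parse trees instead.

```python
# One reconstructor per source production. The first three are the identity
# on the lambda calculus; the last three encode the numerals.
RECONSTRUCTORS = {
    "var": lambda v: ("var", v),
    "lam": lambda v, e: ("lam", v, e),
    "app": lambda e1, e2: ("app", e1, e2),
    "zero": lambda: ("lam", "z", ("var", "z")),
    "succ": lambda e: ("lam", "s", e),
    "pred": lambda e: ("app", e, ("lam", "z", ("var", "z"))),
}


def desugar(term):
    """Catamorphism LS -> LT: transform sub-terms, then reconstruct."""
    ctor, *args = term
    rebuilt = [desugar(a) if isinstance(a, tuple) else a for a in args]
    return RECONSTRUCTORS[ctor](*rebuilt)


def constructors(term):
    """Collect the constructor names used in a term."""
    ctor, *args = term
    used = {ctor}
    for a in args:
        if isinstance(a, tuple):
            used |= constructors(a)
    return used


two = ("succ", ("succ", ("zero",)))
core = desugar(two)
print(core)
print(constructors(core) <= {"var", "lam", "app"})
```

The last line checks that the output only uses the three productions of the target language, so the numerals have been fully eliminated.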
9Algebraic Solution
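Languages:
  l  :  Exp = var Id | lam Id Exp | app Exp Exp
  ln :  Exp = zero | succ Exp | pred Exp
Transformations:
  idx(l)  :  l -> l
  (| ln -> l [Exp -> Exp]
     zero    =>  lam z (var z)
     succ E  =>  lam s E
     pred E  =>  app E ...
  |)  :  ln -> l
Sum of the two:
  idx(l) + (| ln -> l ... |)  :  ln + l -> l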
10Banana Algebra
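- Languages (L):
  - l                 (constant language: a CFG)
  - v                 (language variable)
  - L \ L             (restriction)
  - L + L             (addition)
  - src ( X )         (source extraction)
  - tgt ( X )         (target extraction)
  - let v = L in L
  - letx w = X in L
- Transformations (X):
  - x                 (constant transformation: (| L -> L [τ] c |))
  - w                 (transformation variable)
  - X \ L             (restriction)
  - X + X             (addition)
  - X o X             (composition)
  - idx ( L )         (identity transformation)
  - let v = L in X
  - letx w = X in X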
11Algebraic Laws
- Idempotency of '+':
  L ≡ L + L
- Commutativity of '+':
  L1 + L2 ≡ L2 + L1
- Associativity of '+':
  L1 + (L2 + L3) ≡ (L1 + L2) + L3
- Source-identity:
  L ≡ src(idx(L))
- Target-identity:
  L ≡ tgt(idx(L))
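These laws can be checked on a toy model. The sketch below assumes a language is just a set of named productions and that '+' is set union, which ignores the well-formedness conditions the real algebra imposes; `Transformation`, `add`, `idx`, `src`, and `tgt` are names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    source: frozenset
    target: frozenset


def add(l1: frozenset, l2: frozenset) -> frozenset:
    """Language addition, modelled as union of production sets."""
    return l1 | l2


def idx(l: frozenset) -> Transformation:
    """Identity transformation on a language."""
    return Transformation(source=l, target=l)


def src(x: Transformation) -> frozenset:
    return x.source


def tgt(x: Transformation) -> frozenset:
    return x.target


l1 = frozenset({("Exp.var", "Id"), ("Exp.app", "( Exp Exp )")})
l2 = frozenset({("Exp.zero", "zero")})
l3 = frozenset({("Exp.true", "true")})

assert add(l1, l1) == l1                                  # idempotency
assert add(l1, l2) == add(l2, l1)                         # commutativity
assert add(l1, add(l2, l3)) == add(add(l1, l2), l3)       # associativity
assert src(idx(l1)) == l1                                 # source-identity
assert tgt(idx(l1)) == l1                                 # target-identity
print("all laws hold on this model")
```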
13Example Revisited
--- "l.l" ---
Id = [a-z] [a-z0-9]* ;
Exp.var : Id ;
Exp.lam : "\\" Id "." Exp ;
Exp.app : "(" Exp Exp ")" ;
--- "ln.l" ---
Exp.zero : "zero" ;
Exp.succ : "succ" "(" Exp ")" ;
Exp.pred : "pred" "(" Exp ")" ;
--- "ln2l.x" ---
let l = "l.l" in
let ln = "ln.l" in
idx(l) + (| ln -> l [Exp -> Exp]
  Exp.zero : '\z.z' ;
  Exp.succ : '\s.1' ;
  Exp.pred : '(1 \z.z)' ;
|)
14Numerals Booleans
with Nums:            idx(l) + (ln -> l)  :  ln + l -> l
with Bools:           idx(l) + (lb -> l)  :  lb + l -> l
with Nums and Bools?                         l + ln + lb -> l
15Java Repeat
--- "java.l" --- (575 lines)
Java ... "try" Stm "catch" ...
Name.id : Id ;
--- "repeat.l" ---
Stm.repeat : "repeat" Stm "until" "(" Exp ")" ";" ;
--- "repeat2java.x" --- (7 lines!)
let java = "java.l" in
let repeat = "repeat.l" in
idx(java) + (| repeat -> java [Exp -> Exp, Stm -> Stm]
  Stm.repeat : 'do 1 while (!(2))' ;
|)
16Concrete vs. Abstract Syntax
Concrete syntax:
  Stm.repeat : 'do 1 while (!(2))'
Abstract syntax:
  Stm.repeat : Stm.do(<1>,
                 Exp.exp1(Exp1.exp2(Exp2.exp3(Exp3.exp4(
                 Exp4.exp5(Exp5.exp6(Exp6.exp7(
                 Exp7.neg(Exp8.par(<2>))))))))))
Exp (with explicit assoc./prec.):
  Exp.or    : Exp1 "||" Exp
     .exp1  : Exp1
  Exp1.and  : Exp2 "&&" Exp1
      .exp2 : Exp2
  Exp2.add  : Exp3 "+" Exp2
      .exp3 : Exp3
  ...
  Exp7.neg  : "!" Exp8
      .exp8 : Exp8
  Exp8.par  : "(" Exp ")"
      .var  : Id
      .num  : IntConst
(unambiguous: concrete -> abstract)
NB: Tool supports BOTH!
17"FUN" Example
The "FUN" Language: used for Teaching Functional Programming (at Aarhus University)
Fun: basically the Lambda Calculus with numerals, booleans, arithmetic, boolean logic, local definitions, pairs, literals, lists, signs, comparisons, dynamic types, fixed-point combinators, ...
[Diagram: the Fun grammar transform as a stack of layers. Top: Literals (Literals -> Nums). Middle: unsigned arithmetic, booleans, definitions, pairs (Nums, Bools, Defs, Pairs). Bottom: Lambda Calculus.]
18"FUN" Example
Component re-use: Fun and FunSigned
[Diagram: the Fun grammar transform and the FunSigned grammar transform (GT) share their layers. Top: Literals (Literals -> Nums), plus Signed arith -> Nums for FunSigned. Middle: unsigned arithmetic, booleans, definitions, pairs (Nums, Bools, Defs, Pairs). Bottom: Lambda Calculus.]
19"FUN" Example
Fun, FunSigned, FunCompare, FunTypesafe
[Diagram: Fun GT, FunSigned GT, FunCompare GT, and FunTypesafe GT, all built on the shared layers: unsigned arithmetic, booleans, definitions, pairs (Nums, Bools, Defs, Pairs), and at the bottom the Lambda Calculus.]
245x Banana Algebra ops -> 4 MB Banana!
20"FUN" Usage Statistics
- Usage statistics (245x operators) in "FUN":
  - 58x  cfg                  Constant languages
  - 51x  "file.l"             Language inclusions
  - 28x  L + L                Language additions
  - 23x  v                    Language variables
  - 17x  (| L -> L [τ] c |)   Constant transformations
  - 17x  X + X                Transformation additions
  - 14x  "file.x"             Transformation inclusions
  - 10x  let-in               Local definitions
  -  9x  idx(L)               Identity transformations
  -  8x  X o X                Compositions
  -  4x  L \ L                Language restriction
  -  4x  w                    Transformation variables
  -  2x  src(X)               Source extractions
21Other Examples
- Self-Application (The tool on itself!):
  L1 << L2  =>  '(L1 \ L2) + L2'
  X1 << X2  =>  '(X1 \ src(X2)) + X2'
- SQL embedding (in <bigwig>):
  Stm.select : 'factor (<2>) if (<3>) return ( \ (<1>) ) '
- My-Java (endless variations):
  java ( + sql) ( \ loops) o syntaxe_francais
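The derived override operator `L1 << L2 = (L1 \ L2) + L2` can be sketched on a toy model. The sketch assumes a language is a mapping from production names to right-hand sides, that `\` drops the productions named in its right operand, and that `+` rejects clashing definitions. These are assumptions made for illustration, not the tool's actual semantics; the names `restrict`, `add`, and `override` are hypothetical.

```python
def restrict(l1: dict, l2: dict) -> dict:
    """L1 \\ L2: keep only productions of L1 that L2 does not name."""
    return {name: rhs for name, rhs in l1.items() if name not in l2}


def add(l1: dict, l2: dict) -> dict:
    """L1 + L2: merge, refusing two different definitions of one name."""
    clashes = sorted(n for n in l1.keys() & l2.keys() if l1[n] != l2[n])
    if clashes:
        raise ValueError(f"conflicting productions: {clashes}")
    return {**l1, **l2}


def override(l1: dict, l2: dict) -> dict:
    """L1 << L2, defined purely in terms of the two core operators."""
    return add(restrict(l1, l2), l2)


base = {"Stm.while": '"while" "(" Exp ")" Stm', "Stm.skip": '";"'}
french = {"Stm.while": '"tantque" "(" Exp ")" Stm'}

print(override(base, french))
try:
    add(base, french)
except ValueError as err:
    print("plain + fails:", err)
```

Since `<<` is only syntactic sugar over `\` and `+`, it can be added to the algebra as a language extension of the algebra itself, which is the point of the self-application example.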
22Implementation
The 'Banana Algebra' Tool (3,600 lines of O'Caml):
http://www.itu.dk/people/brabrand/banana-algebra/
Uses (underlying technologies):
- 'dk.brics.grammar' for parsing, unparsing, and ambiguity analysis!
- 'XSugar' for transformation "concrete syntax <-> abstract XML syntax"
- 'XSLT' for transformation "XML -> XML"
24Related Work (I/III)
- "Growing Languages with Metamorphic Syntax Macros", Claus Brabrand and Michael Schwartzbach (PEPM 2002)
- "The metafront System: Safe and Extensible Parsing and Transformation", Claus Brabrand and Michael Schwartzbach (LDTA 2003, SCP J. 2007)
25Related Work (II/III)
- Attribute Grammars:
  - Language transformation (and extension)
  - via computation on AST's
  - (using "inherited" and/or "synthesized" attributes)
  - E.g., Eli, JastAdd, Silver, ...
- Rewrite Systems:
  - Language transformation (and extension)
  - via syntactic rewriting, using encodings
  - gradually rewrite "S-syntax" to "T-syntax"
  - E.g., Elan, TXL, ASF+SDF, Stratego/XT, ...
S -> T
26Related Work (III/III)
- Functional Programming:
  - Catas mimicked by "disciplined style" of fun. programming
  - aided by:
    - Traversal functions (auto-synthesized from datatypes)
    - Combinator libraries
    - "Shortcut fusion" (to eliminate ' ' at compile-time)
- Category Theory:
  - A lot of this work can be viewed as Category Theory
Basically ye olde issue: GPL vs. DSL
27Conclusion
- IF bananas are sufficiently
- (Expressive)
- THEN you get
- Banana Algebra for free (16 banana ops)
- Incremental
- Modular
- Simple
- Safe
- Efficient
"Niche"
28BONUS SLIDES
If you want all the details:
- "Syntactic Language Extension via an Algebra of Languages and Transformations", Jacob Andersen and Claus Brabrand (ITU Technical Report, Dec. 2008)
29Reduction Semantics
- Environments:
  - α ∈ ENVL = VARL -> EXPL   (environment of languages)
  - β ∈ ENVX = VARX -> EXPX   (environment of transformations)
- Reduction relations:
  - '⇓L' ⊆ ENVL × ENVX × EXPL × EXPL
  - '⇓X' ⊆ ENVL × ENVX × EXPX × EXPX
- Abbreviations:
  - α,β ⊢ L ⇓L l   as a short-hand for   (α,β,L,l) ∈ '⇓L'
  - α,β ⊢ X ⇓X x   as a short-hand for   (α,β,X,x) ∈ '⇓X'
30Semantics (L)
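[CONL]   ⊢wfl l
         ------------------
         α,β ⊢ l ⇓L l

[VARL]   ------------------
         α,β ⊢ v ⇓L α(v)

[RESL]   α,β ⊢ L ⇓L l     α,β ⊢ L' ⇓L l'
         ----------------------------------
         α,β ⊢ L \ L' ⇓L l \l l'

[ADDL]   α,β ⊢ L ⇓L l     α,β ⊢ L' ⇓L l'     l ~l l'
         ----------------------------------------------
         α,β ⊢ L + L' ⇓L l +l l'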
31Semantics (L)
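[SRCL]   α,β ⊢ X ⇓X (| lS -> lT [τ] c |)
         ----------------------------------
         α,β ⊢ src (X) ⇓L lS

[TGTL]   α,β ⊢ X ⇓X (| lS -> lT [τ] c |)
         ----------------------------------
         α,β ⊢ tgt (X) ⇓L lT

[LETL]   α,β ⊢ L ⇓L l     α[v↦l],β ⊢ L' ⇓L l'
         ---------------------------------------
         α,β ⊢ let v = L in L' ⇓L l'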
32Semantics (X)
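[CONX]   α,β ⊢ LS ⇓L lS     α,β ⊢ LT ⇓L lT     ⊢wfx (| lS -> lT [τ] c |)
         ------------------------------------------------------------------
         α,β ⊢ (| LS -> LT [τ] c |) ⇓X (| lS -> lT [τ] c |)

[VARX]   ------------------
         α,β ⊢ w ⇓X β(w)

[RESX]   α,β ⊢ X ⇓X x     α,β ⊢ L ⇓L l
         --------------------------------
         α,β ⊢ X \ L ⇓X x \x l

[ADDX]   α,β ⊢ X ⇓X x     α,β ⊢ X' ⇓X x'     x ~x x'
         ----------------------------------------------
         α,β ⊢ X + X' ⇓X x +x x'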
33Semantics (X)
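[COMPX]  α,β ⊢ X ⇓X (| lS -> lT [τ] c |)
         α,β ⊢ X' ⇓X (| lS' -> lT' [τ'] c' |)
         lT ⊑l lS'
         ------------------------------------------------
         α,β ⊢ X' o X ⇓X (| lS -> lT' [τ' o τ] c' o c |)

[IDXX]   α,β ⊢ L ⇓L l
         ------------------------------------------------
         α,β ⊢ idx (L) ⇓X (| l -> l [idτ(l)] idc(l) |)

[LETX]   α,β ⊢ X ⇓X x     α,β[w↦x] ⊢ X' ⇓X x'
         ---------------------------------------
         α,β ⊢ letx w = X in X' ⇓X x'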
34BONUS SLIDES
35Numeral Boolean Extension
- Numeral Extension (catamorphism):
  Source:  Exp = var Id | lam Id Exp | app Exp Exp
               | zero | succ Exp | pred Exp
  Target:  Exp = var Id | lam Id Exp | app Exp Exp
  var V      =>  var V
  lam V E    =>  lam V E
  app E1 E2  =>  app E1 E2
  zero       =>  lam z (var z)
  succ E     =>  lam s E
  pred E     =>  app E (lam z (var z))
- Boolean Extension (catamorphism):
  Source:  Exp = var Id | lam Id Exp | app Exp Exp
               | true | false | if Exp Exp Exp
  Target:  Exp = var Id | lam Id Exp | app Exp Exp
  var V        =>  var V
  lam V E      =>  lam V E
  app E1 E2    =>  app E1 E2
  true         =>  lam a (lam b (var a))
  false        =>  lam a (lam b (var b))
  if E1 E2 E3  =>  app (app E1 E2) E3
36Lambda with Booleans
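Languages:
  l  :  Exp = var Id | lam Id Exp | app Exp Exp
  lb :  Exp = true | false | if Exp Exp Exp
Transformations:
  idx(l)  :  l -> l
  (| lb -> l [Exp -> Exp]
     true         =>  '\a.\b . a'
     false        =>  '\a.\b . b'
     if E1 E2 E3  =>  '((E1 E2) E3)'
  |)  :  lb -> l
Sum of the two:
  idx(l) + (| lb -> l ... |)  :  lb + l -> l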
37Incremental Development
--- "l.l" ---
Id = [a-z] [a-z0-9]* ;
Exp.var : Id ;
Exp.lam : "\\" Id "." Exp ;
Exp.app : "(" Exp Exp ")" ;
--- "li.l" ---
Exp.id : "id" ;
--- "li2l.x" ---
let l = "l.l" in
idx(l) + (| "li.l" -> l [Exp -> Exp]
  Exp.id : '\z.z' ;
|)
--- "ln.l" ---
Exp.zero : "zero" ;
Exp.succ : "succ" Exp ;
Exp.pred : "pred" Exp ;
--- "ln2l.x" (written directly) ---
let l = "l.l" in
idx(l) + (| "ln.l" -> l [Exp -> Exp]
  Exp.zero : '\z.z' ;
  Exp.succ : '\x.1' ;
  Exp.pred : '(1 \z.z)' ;
|)
--- "ln2li.x" ---
let l = "l.l" in
idx(l) + (| ln -> l + "li.l" [Exp -> Exp]
  Exp.zero : 'id' ;
  Exp.succ : '\x.1' ;
  Exp.pred : '(1 id)' ;
|)
--- "ln2l.x" (developed incrementally) ---
"li2l.x" o "ln2li.x"
38Example cont'd
- Both statically reduce to the same catamorphism:
  Source language:
    Id = [a-z] [0-9a-z]* ;
    Exp.app  : "(" Exp Exp ")" ;
    Exp.lam  : "\" Id "." Exp ;
    Exp.pred : "pred" Exp ;
    Exp.succ : "succ" Exp ;
    Exp.var  : Id ;
    Exp.zero : "zero" ;
  Target language:
    Id = [a-z] [0-9a-z]* ;
    Exp.app  : "(" Exp Exp ")" ;
    Exp.lam  : "\" Id "." Exp ;
    Exp.var  : Id ;
  Nonterminal typing:
    Exp -> Exp, Id -> Id
  Reconstructors:
    Exp.app  : Exp.app(1, 2) ;
    Exp.lam  : Exp.lam(1, 2) ;
    Exp.pred : Exp.app(1, Exp.lam(Id("z"), Exp.var(Id("z")))) ;
    Exp.succ : Exp.lam(Id("x"), 1) ;
    Exp.var  : Exp.var(1) ;
    Exp.zero : Exp.lam(Id("z"), Exp.var(Id("z"))) ;
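How a composition can be reduced to a single constant catamorphism before any input is seen can be sketched as follows. The encoding is an assumption made for this example: a transformation is a dict from constructor to template, a template is a nested tuple, an integer `i` stands for the i-th transformed child, and strings are identifier tokens. `instantiate`, `run`, and `compose` are hypothetical names, not the tool's API.

```python
def instantiate(template, children):
    """Fill the holes (integers) of a template with the given children."""
    if isinstance(template, int):
        return children[template - 1]
    if isinstance(template, str):
        return template
    ctor, *args = template
    return (ctor, *[instantiate(a, children) for a in args])


def run(x, term):
    """Apply catamorphism x to a term, bottom-up."""
    if isinstance(term, str):
        return term
    ctor, *args = term
    return instantiate(x[ctor], [run(x, a) for a in args])


def compose(x2, x1):
    """x2 o x1, computed statically: push x2 through the templates of x1."""
    def push(template):
        if isinstance(template, (int, str)):
            return template  # holes and tokens pass through unchanged
        ctor, *args = template
        return instantiate(x2[ctor], [push(a) for a in args])
    return {ctor: push(t) for ctor, t in x1.items()}


IDX_L = {"var": ("var", 1), "lam": ("lam", 1, 2), "app": ("app", 1, 2)}
ID = ("lam", "z", ("var", "z"))
LI2L = {**IDX_L, "id": ID}
LN2LI = {**IDX_L, "zero": ("id",), "succ": ("lam", "x", 1),
         "pred": ("app", 1, ("id",))}
LN2L = {**IDX_L, "zero": ID, "succ": ("lam", "x", 1),
        "pred": ("app", 1, ID)}

fused = compose(LI2L, LN2LI)
print(fused == LN2L)

term = ("pred", ("succ", ("zero",)))
print(run(fused, term) == run(LI2L, run(LN2LI, term)))
```

`compose` never looks at an input term: it only rewrites the finitely many templates of the first transformation, so the cost of composition is paid once, before the transformation is run.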
39Java Repeat
Languages:
  java (575 lines):
    Java ... ... Exp Exp "" Exp ... Stm Exp "" "if" "(" Exp ")" Stm "while" "(" Exp ")" Stm ...
  repeat:
    Stm "repeat" Stm "until" "(" Exp ")" ";"
Transformations:
  idx(java)  :  java -> java
  (| repeat -> java [Exp -> Exp, Stm -> Stm]
     repeat S until (E) ...
  |)  :  repeat -> java
Sum of the two:
  java + repeat -> java
Entire extension: 7 lines!
40Usage Scenarios
- Programmers:
  - May extend existing languages (syntax macros)
- Developers:
  - May embed DSLs into host languages (SQL in Java)
- Developers (and teachers):
  - May incrementally specify multi-layered languages
- Compiler writers:
  - May rely on tool and implement only a small core
  - (and then specify the rest externally as extensions)
41BONUS SLIDES
- Parsing and Error Reporting -
42Parsing
- Parsing (XSugar):
  - Variant of Earley's algorithm: O(n³) in the length n of the input
  - Can parse any context-free grammar
  - Closed under union of languages
  - Support for production priority
- Tool easily adapts to other parsing algorithms
43Ambiguity parsing?unparsing
[Diagram: parsing and unparsing between the strings of L and the ASTs of L]
- Unparsing:
  - Canonical whitespace
- Parsing:
  - Grammar ambiguity
44Ambiguity Analysis
- Ambiguity Analysis:
  - Using implementation ("dk.brics.grammar" by Anders Møller) on:
    - Source language
    - Target language and/or
    - all intermediate languages (somewhat expensive)
  - (Note: Ambiguity analysis comes with XSugar tool)
- "Analyzing Ambiguity of Context-Free Grammars", Claus Brabrand, Robert Giegerich and Anders Møller (CIAA 2007)
45Error Reporting
- Error reporting:
  - Static parse-error (O'Caml-lex)
  - Static transformation error (XSugar)
    - (is actually a parse-error in a cata reconstructor)
  - Dynamic parse-error (XSugar)
  - Dynamic transformation error
    - impossible :-)
Prototype:
  In ln2l.x (4,4)-(4,7): Parse error at "Exp"
  Parse error at character 6 (line 1, column 7) in /tmp/shape84e645.txt
Could be improved:
  Parse error at character 23 (line 1, column 24) in /dev/stdin