Title: Type Analysis and Typed Compilation
1Type Analysis and Typed Compilation
- Stephanie Weirich
- Cornell University
2Outline
- Typed Compilation background
- Type Analysis background
- Initial framework - Type Passing
- Problems with Type Passing
- Type Erasure framework
- Closure Conversion comparison
- Related Work
3Traditional Compilation
Compilation is a series of translations between
several languages
4Typed Compilation
Most of those languages are typed, and the types
are translated with the terms
5Typed Compilation
- Safety
- well-typed programs can't go wrong
- Types describe invariants maintained by the compiler
- Performance
- Types provide information which may be used by the compiler for optimization
- Tag-free garbage collection
- Data layout control
6Type Analysis
- Create functions from types to values
- Code may branch on an unknown type
    fun toString (x : α) =>
      (typecase α of
         int => Int.toString
       | char => Char.toString
       | β × γ => fn (fst, snd) => "(" ^ (toString fst) ^ "," ^ (toString snd) ^ ")") x
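A minimal Python sketch of the toString idea, assuming types are passed around as ordinary run-time descriptor objects. The classes `IntT`, `CharT`, and `PairT` and the function `to_string` are hypothetical names chosen for illustration; the `isinstance` chain stands in for the typecase arms.

```python
from dataclasses import dataclass


# Hypothetical run-time type descriptors (illustrative, not from the slides).
@dataclass(frozen=True)
class IntT:
    pass


@dataclass(frozen=True)
class CharT:
    pass


@dataclass(frozen=True)
class PairT:
    fst: object  # descriptor of the first component's type
    snd: object  # descriptor of the second component's type


def to_string(ty, x):
    """Branch on the type descriptor, one case per typecase arm."""
    if isinstance(ty, IntT):
        return str(x)
    if isinstance(ty, CharT):
        return x
    if isinstance(ty, PairT):
        fst, snd = x
        # Recursive calls receive the component types, like beta and gamma.
        return "(" + to_string(ty.fst, fst) + "," + to_string(ty.snd, snd) + ")"
    raise TypeError(f"no typecase arm for {ty!r}")


print(to_string(PairT(IntT(), PairT(CharT(), IntT())), (1, ("c", 2))))  # (1,(c,2))
```

Python checks nothing statically here, so the sketch only models the run-time branching, not the typing discipline.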
7Data Layout Optimization
- Parametric polymorphism requires uniformity of
representation for all arguments, regardless of
their types.
    kinds  κ ::= T4 | T
    types  τ ::= B4 | τ array | α | τ -> τ
    terms  e,f ::= x | λx.e | e f | Λα:κ.e | e[τ] | ld(e,f) | i | e ? f
                 | if0 then else ... | malloc(e,e ...) | st(e,e,e)
    ld : (τ array × B4) -> τ
    st : (τ array × B4 × τ) -> τ
06/08/99
8Polymorphic Subscript
Because any array may be passed to a polymorphic
function, all arrays must look the same, no
matter the type of their elements.
A : int array
B : bool array
ICFP '98
    sub = fn (A : α array, i : int) => wordsub(A, i)
9Monomorphic subscript
In languages such as C, the type of an array is
always known at its use.
    int A[4]
    bool B[4]
10Type Analysis to the Rescue
Type analysis allows us to determine the type at
run time.
A : int array
B : bool array
    sub = fn (A : α array, i : int) =>
      typecase α of
        int => wordsub(A, i)
      | bool => (wordsub(A, i div 32) & (1 << (i mod 32))) <> 0
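The subscript above can be sketched in Python as follows, assuming 32-bit words and a plain string tag standing in for the type argument. `WORD_BITS`, `pack_bools`, and the string tags are illustrative assumptions; only `wordsub` and `sub` correspond to names on the slide.

```python
WORD_BITS = 32  # assumed word size, matching the "div 32" / "mod 32" on the slide


def wordsub(words, i):
    """Fetch the i-th machine word of the array."""
    return words[i]


def pack_bools(bits):
    """Hypothetical helper: store one bool per bit, 32 bools per word."""
    words = [0] * ((len(bits) + WORD_BITS - 1) // WORD_BITS)
    for i, b in enumerate(bits):
        if b:
            words[i // WORD_BITS] |= 1 << (i % WORD_BITS)
    return words


def sub(elem_type, words, i):
    """Subscript that branches on the element type at run time."""
    if elem_type == "int":
        return wordsub(words, i)
    if elem_type == "bool":
        # Select the word holding bit i, then test that bit.
        return (wordsub(words, i // WORD_BITS) & (1 << (i % WORD_BITS))) != 0
    raise TypeError(f"no typecase arm for {elem_type}")


A = [10, 20, 30, 40]
B = pack_bools([True, False, True] + [False] * 30 + [True])  # 34 bools in 2 words
print(sub("int", A, 3))    # 40
print(sub("bool", B, 33))  # True
```

With this layout a bool array of n elements occupies roughly n/32 words instead of n, which is the space saving that motivates the type-directed subscript.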
11Initial Framework - λ_i^ML
- Make type abstraction/application explicit (as in System F)
- (Λα. e) [int]
- Add typecase operator
- typecase τ of
- int => ...
- β × γ => ...
- Code execution (operational semantics) now relies
on the type system
12Type Passing Semantics
    sub[int](A,3)

    = (Λα. λ(A : α array, i : int).
         typecase α of
           int => wordsub(A, i)
         | bool => (wordsub(A, i div 32) & (1 << (i mod 32))) <> 0) [int] (A,3)

    ↦ typecase int of
        int => wordsub(A, 3)
      | bool => (wordsub(A, 3 div 32) & (1 << (3 mod 32))) <> 0

    ↦ wordsub(A,3)
13Typechecking
To typecheck a typecase term we annotate the term with its return type:

    toString = Λα. λx:α.
      (typecase [δ. δ -> string] α of
         int => Int.toString
       | char => Char.toString
       | β × γ => fn (fst, snd) : β × γ =>
                    "(" ^ (toString[β] fst) ^ "," ^ (toString[γ] snd) ^ ")") x

Then we check that each branch satisfies that type with the appropriate type substitution. The return type of the entire term is α substituted for δ in the annotated type.
14Problems
- Issues about expressiveness
- Complicates low-level constructs
- Can't express some optimizations
- Can't express abstraction boundaries
15Complexity
- Polymorphic closure conversion
- Minamide et al. 1996
- Morrisett et al. 1998
- Duplication of effort
- Optimization of runtime behavior
- Explicit modeling of low level computation
- Allocation Semantics
- Typed Assembly Languages
16Inefficiency
- Must pass all types even if some are never examined
- TIL -- eliminates unexamined run-time types in an ad hoc manner in the translation to an untyped calculus
17Loss of Abstraction
- No way to hold types abstract if they can always be examined
- Clients are allowed to break abstraction barriers and infer more information than desired
[Figure: Host and Client, with labels "bool ref" and "capability"]
18Solution
- Pass terms that represent types
20Type Erasure Semantics
    sub [int] Rint (A,3)

    = (Λα. λx : R(α). λ(A : α array, i : int).
         typecase x of
           Rint => wordsub(A, i)
         | Rbool => (wordsub(A, i div 32) & (1 << (i mod 32))) <> 0) [int] Rint (A,3)

    ↦ typecase Rint of
        Rint => wordsub(A, 3)
      | Rbool => (wordsub(A, 3 div 32) & (1 << (3 mod 32))) <> 0

    ↦ wordsub(A,3)
22Formalization
- Special representation terms
- Rint
- R× (x,y)
- A term e which represents a type τ has the special type R(τ).
- R×(Rint,Rint) : R(int × int)
- Instead of a type, the argument to typecase is a term of type R(τ)
- The type system tracks the correspondence between a type and its representation
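One way to picture the correspondence between a representation term and the type it represents is the Python sketch below. The classes `RInt`, `RBool`, `RTimes` and the functions `represented_type` and `has_rep_type` are hypothetical names; types are written as plain strings purely for illustration, and the check runs dynamically whereas the slides describe a static type system.

```python
from dataclasses import dataclass


# Hypothetical representation terms: ordinary run-time values.
@dataclass(frozen=True)
class RInt:
    pass


@dataclass(frozen=True)
class RBool:
    pass


@dataclass(frozen=True)
class RTimes:
    fst: object  # representation of the first component type
    snd: object  # representation of the second component type


def represented_type(rep):
    """Compute the type tau such that rep would have type R(tau)."""
    if isinstance(rep, RInt):
        return "int"
    if isinstance(rep, RBool):
        return "bool"
    if isinstance(rep, RTimes):
        return f"({represented_type(rep.fst)} * {represented_type(rep.snd)})"
    raise TypeError(f"not a representation term: {rep!r}")


def has_rep_type(rep, tau):
    """Check the judgement rep : R(tau), with tau given as a string."""
    return represented_type(rep) == tau


print(represented_type(RTimes(RInt(), RInt())))  # (int * int)
```

Because each representation term determines exactly one type, a typecase on the term can refine the type in each branch.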
23Typechecking
To typecheck a typecase term we still annotate the term with its return type:

    toString = Λα. λx:α. λy:R(α).
      (typecase [δ. δ -> string] y of
         Rint => Int.toString
       | Rchar => Char.toString
       | R×(r1, r2) as β × γ => fn (fst, snd) : β × γ =>
                    "(" ^ (toString[β] fst r1) ^ "," ^ (toString[γ] snd r2) ^ ")") x

We require that the argument to the typecase be of type R(τ) for the typecase term to typecheck; here y has type R(α). We can then substitute α for δ in the annotated type as before.
24Solution
- Everything that happens at run-time is described by the terms
- Can go to a type erasure semantics
- Optimization
- Traditional code optimizers can optimize type representations
- Sophisticated techniques still possible
- If a representation is not provided a type may not be analyzed
25Typed Closure Conversion
- At run-time, a first-class function is represented by a code pointer
- But now x is unbound -- so we change the type of l2 to take another argument
- A closure is just a code pointer paired with the values of its free variables -- its environment
26Typed Closure Conversion
- Application just extracts the environment and
applies the function to it
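A small Python sketch of the closure representation described on these two slides. The body of `l2` (adding its argument to the captured variable) is an assumption made for illustration, since the slides do not show it; `apply_closure` is likewise a hypothetical helper name.

```python
def l2(y, env):
    """Closed code: the former free variable x now arrives through env."""
    x = env
    return x + y  # assumed body, not shown on the slides


def l1(x):
    """Build a closure: a code pointer paired with its environment."""
    return (l2, x)


def apply_closure(clos, arg):
    """Application extracts the environment and passes it to the code."""
    code, env = clos
    return code(arg, env)


clos = l1(3)
print(apply_closure(clos, 4))  # 7
```

Note that `l2` no longer refers to any variable outside its own parameters, so it could be lifted to top-level code.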
27Existential Types
- Unfortunately, the type of the closure now depends on its free variables
- Existential types hold the types of free variables abstract

    clos : (int × int -> int) × int
    l1 = λx:int. pack (l2, x) as ∃env. ((int × env -> int) × env) hiding int
    unpack (env, clos) = (f 3) in (π1 clos)(4, π2 clos)
28Polymorphic Closure Conversion
- In a type passing semantics a function may also have free type variables at run time
- They also must be part of the closure, via translucent types
- The type environment is then abstracted using existential kinds
29Closure Conversion
- In a type erasure semantics, only the type representation remains at run time.
- At the time of closure creation, the type arguments may be given to the function.
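Under type erasure a representation is just another value, so it can sit in a closure's environment like any free variable. The sketch below illustrates that reading; the string tags `"Rint"` and `"Rbool"` and the functions `show`, `show_code`, `make_shower`, and `apply_closure` are all hypothetical.

```python
def show(rep, x):
    """Analyze a representation term (modeled here as a string tag)."""
    if rep == "Rint":
        return str(x)
    if rep == "Rbool":
        return "true" if x else "false"
    raise TypeError(f"no typecase arm for {rep}")


def show_code(x, env):
    """Closed code: the representation is fetched from the environment."""
    (rep,) = env
    return show(rep, x)


def make_shower(rep):
    """Closure creation: only the representation term is captured;
    no separate type environment is needed."""
    return (show_code, (rep,))


def apply_closure(clos, arg):
    code, env = clos
    return code(arg, env)


print(apply_closure(make_shower("Rbool"), True))  # true
print(apply_closure(make_shower("Rint"), 7))      # 7
```

The environment here is an ordinary tuple of values, so the same closure machinery serves both term variables and representations.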
30Multi-stage Type Analysis
Final step of typed compilation: compile to a typed assembly language
31Multi-stage Type Analysis
- How do we preserve the meaning of typecase when the types themselves change?
- In TALx86 both int and float are compiled into B4, the type of 4-byte values
- int -> int may be compiled into a variety of types depending on the calling convention
32Related Work
- Harper, Morrisett - POPL 95
- Minamide, Morrisett, Harper - POPL 96
- Minamide - 2nd Fuji Intl Workshop on Functional and Logic Programming 96
- Morrisett, Walker, Crary, Glew - POPL 98
- Crary, Weirich, Morrisett - ICFP 98
- Crary, Weirich - ICFP 99
33Low-level Type Analysis
- How do we analyze types with quantifiers?
- In TALx86 every function (polymorphic or not) is
compiled into a polymorphic code pointer
34Areas for Future Work
36Outline
- Introduction
- Typed Compilation
- Type Analysis in general
- toString example
- Type Analysis in compilation
- bit array example
- Initial framework
- syntax -- from examples
- semantics
- type passing
- problems
- complication of theory
- can't express efficient code
- loss of abstraction
- Type erasure semantics
- syntax
- dynamic semantics
- static semantics
- Closure Conversion example
37A note about typechecking
- In this example wordsub has a strange type
- α array × int -> int
- It would be better if it were of type
- int array × int -> int
- Then the argument to subscript must always be an int array. But that forgets its actual type of bool array. So we create a special type, called packed array, with a type-level type analysis operator.
46Type based compilation
Terms
Types
Source Language
Intermediate Language
Machine Language
47Multi-stage Type Analysis
48Type Level Type Analysis
49Type Safe Language
- Gives us guarantees about the run-time behavior of programs
- Types abstractly describe the run-time flow of values
50Traditional Compilation
Source File:
    ( fn x => x + 1 ) 3
Type Inference / Checking:
    ( fn x : int => x + 1 ) 3
Untyped IL:
    ( λ x . x + 1 ) 3
Machine Code:
    l1: push 3
        call l2
        retn
    l2: mov eax, [esp+4]
        add eax, 1
        mov [esp+4], eax
        retn
51Type Based Compilation
Source File:
    ( fn x => x + 1 ) 3
Type Inference / Checking:
    ( fn x : int => x + 1 ) 3
Typed IL:
    ( λ x : int . x + 1 ) 3
Machine Code:
    l1: push 3
        call l2
        retn
    l2: mov eax, [esp+4]
        add eax, 1
        mov [esp+4], eax
        retn
52Why Typed Compilation
- Safety -- assurances about compiler correctness
- Type based optimizations
- For example ...
53But we don't always know the types, what then?
- For example -- Parametric polymorphism
- Introduce an operator into the language that can distinguish types
- Typecase!
54Need a language with this operator - λ_i^ML
55Second example
56Performance and Safety
57 Safety
Terms
Types
Source
IL
Machine
TAL
58Type Passing Semantics
- Used by the language λ_i^ML, the intermediate language of the TIL/ML and FLINT compilers
- Unlike most calculi where types may be erased prior to run-time, types do have an operational significance -- they are arguments to typecase terms.
59Intensional Type Analysis
- Valuable element of type-directed compilers
- Allows otherwise untypeable optimizations
- Specialized data layout
- Tag-free Garbage Collection
- Polymorphic marshalling
- ...
60Solution
- More efficient
- Only pass type representations when necessary
- Traditional code optimizers can help
- Sophisticated techniques still possible
- Recovers abstraction
- Can withhold representation from clients