Title: Parametric Polymorphism for Popular Programming Languages
1 Parametric Polymorphism for Popular Programming Languages
- Andrew Kennedy 
 - Microsoft Research Cambridge
 
2 Or: Forall for all
- Andrew Kennedy
- Microsoft Research Cambridge (joint work with Don Syme)
3 Curriculum Vitae for FOOLs
http://research.microsoft.com/akenn
4 Parametric polymorphism
- Parameterize types and code by types
- Concept: Strachey (1967)
- Language: ML (Milner, 1975), Clu (Liskov, 1975)
- Foundations: System F (Girard, 1971), polymorphic lambda calculus (Reynolds, 1974)
- Engineering benefits are well-known (code re-use, strong typing)
- Implementation techniques are well-researched
 
5 Polymorphic Programming Languages
Standard ML
Eiffel
OCaml
C++
Ada
Clu
GJ
Haskell
Mercury
Miranda
Pizza 
6 Widely-used Polymorphic Programming Languages
C++
7 Widely-used Strongly-typed Polymorphic Programming Languages
8 In 2004?
C#
Visual Basic?
Java
Cobol, Fortran, ...?
9 This talk
- The .NET generics project
- What was challenging?
- What was surprising?
- What's left?
 
10 What is the .NET CLR (Common Language Runtime)?
- For our purposes, the CLR:
  - Executes MS-IL (Intermediate Language) programs using just-in-time or way-ahead-of-time compilation
  - Provides an object-oriented common type system
  - Provides managed services: garbage collection, stack-walking, reflection, persistence, remote objects
  - Ensures security through type-checking (verification) and code access security (permissions + stack inspection)
  - Supports multiple source languages and interop between them
11 Themes
- Design: Can multiple languages be accommodated by a single design? What were the design trade-offs?
- Implementation: How can run-time types be implemented efficiently?
- Theory: How expressive is it?
- Practice: Would you like to program in it?
- Future: Have we done enough?
 
12 Timeline of generics project
- May 1999: Don Syme presents proposal to C# and CLR teams
- Feb 2000: Initial prototype of extension to CLR
- Feb 2001: Product release of .NET v1.0
- Jan 2002: Our code is integrated into the product team's code base
- Nov 2002: Anders Hejlsberg announces generics at OOPSLA'02
- Late 2004?: Product release of .NET v1.2 with generics
14 Design for multiple languages
C++: Give me template specialization
C++: Can I write class C<T> : T
C#: Just give me decent collection classes
C++: And template meta-programming
Visual Basic: Don't touch my language!
Java: Run-time types please
Eiffel: All generic types covariant please
ML: Functors are cool!
Haskell: Rank-n types? Existentials? Kinds? Type classes?
Scheme: Why should I care?
15 Some design goals
- Simplicity: Don't surprise the programmer with odd restrictions
- Consistency: Fit with the object model of .NET
- Separate compilation: Type-check once, instantiate anywhere

16 Non-goals
- C++ style template meta-programming: Leave this to source-language compilers
- Higher-order polymorphism, existentials: Hey, let's get the basics right first!
17 What's in the design?
- Type parameterization for all declarations
  - classes, e.g. class Set<T>
  - interfaces, e.g. interface IComparable<T>
  - structs, e.g. struct HashBucket<K,D>
  - methods, e.g. static void Reverse<T>(T[] arr)
  - delegates (first-class methods), e.g. delegate void Action<T>(T arg)

18 What's in the design (2)?
- Bounds on type parameters
  - single class bound (must extend), e.g. class Grid<T> where T : Control
  - multiple interface bounds (must implement), e.g. class Set<T> where T : IComparable<T>
19 Simplicity => no odd restrictions
  interface IComparable<T> { int CompareTo(T other); }
  class Set<T> : IEnumerable<T> where T : IComparable<T> {
    private TreeNode<T> root;
    public static Set<T> empty = new Set<T>();
    public void Add(T x) { ... }
    public bool HasMember(T x) { ... }
  }
  Set<Set<int>> s = new Set<Set<int>>();
- Interfaces and superclass can be instantiated
- Bounds can reference type parameter (F-bounded polymorphism)
- Even statics can use type parameter
- Type arguments can be value or reference types
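
Three of the properties listed on this slide can be checked in a few lines of C#. The sketch below is illustrative only: `Ordered`, `Registry` and `Demo` are hypothetical names that do not appear in the talk, and the code assumes a C# compiler with generics support.

```csharp
using System;

// F-bounded constraint: the bound on T mentions T itself.
static class Ordered
{
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }
}

// A static field of a generic class exists once per instantiation.
class Registry<T>
{
    public static int Count = 0;
    public Registry() { Count++; }
}

static class Demo
{
    public static void Main()
    {
        // Type arguments may be value types (int) or reference types (string).
        Console.WriteLine(Ordered.Max<int>(3, 7));
        Console.WriteLine(Ordered.Max<string>("apple", "pear"));

        new Registry<int>();
        new Registry<int>();
        new Registry<string>();
        // Registry<int>.Count and Registry<string>.Count are separate fields.
        Console.WriteLine(Registry<int>.Count);
        Console.WriteLine(Registry<string>.Count);
    }
}
```

The constraint `where T : IComparable<T>` is what lets `Max` call `CompareTo` without a cast, and without boxing when `T` is `int`.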
20 Consistency => preserve types at run-time
- Type-safe serialization
- Interop with legacy code
- Reflection

  Object obj = formatter.Deserialize(file);
  LinkedList<int> list = (LinkedList<int>) obj;

  // Just wrap existing Stack until we get round to re-implementing it
  class GStack<T> {
    Stack st;
    public void Push(T x) { st.Push(x); }
    public T Pop() { return (T) st.Pop(); }
  }

  object obj;
  Type ty = obj.GetType().GetGenericArguments()[0];
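
The reflection and cast snippets above can be tried out as follows. This is a minimal sketch, assuming the `System.Collections.Generic.LinkedList<T>` class of the shipped .NET libraries; `RuntimeTypesDemo` and `FirstTypeArgument` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

static class RuntimeTypesDemo
{
    // Returns the first type argument of obj's run-time type,
    // or null if that type is not a generic instantiation.
    public static Type FirstTypeArgument(object obj)
    {
        Type ty = obj.GetType();
        return ty.IsGenericType ? ty.GetGenericArguments()[0] : null;
    }

    // True if casting obj to LinkedList<string> is rejected at run time.
    public static bool CastToStringListFails(object obj)
    {
        try
        {
            ((LinkedList<string>) obj).Clear();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object obj = new LinkedList<int>();
        Console.WriteLine(FirstTypeArgument(obj) == typeof(int));
        Console.WriteLine(obj is LinkedList<int>);
        Console.WriteLine(obj is LinkedList<long>);
        Console.WriteLine(CastToStringListFails(obj));
    }
}
```

Because the instantiation is part of the object's run-time type, the downcast after deserialization is checked exactly rather than against an erased `LinkedList`.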
21 Separate compilation => restrict generic definitions
- No dispatch through a type parameter
- No inheritance from a type parameter

  class C<T> { void meth() { T.othermeth(); } }   // don't know what's in T
  class Weird<T> : T { ... }                      // don't know what's in T
23 Compiling polymorphism, as was
- Two main techniques
  - Specialize code for each instantiation
    - C++ templates, MLton and SML.NET monomorphization
    - advantage: good performance
    - drawback: code bloat
  - Share code for all instantiations
    - Either use a single representation for all types (ML, Haskell)
    - Or restrict instantiations to pointer types (Java)
    - advantage: no code bloat
    - drawback: poor performance (extra boxing operations required on primitive values)
24 Compiling polymorphism in the Common Language Runtime
- Polymorphism is built in to the intermediate language (IL) and the execution engine
- CLR performs just-in-time type specialization
- Code sharing avoids bloat
- Performance is (almost) as good as hand-specialized code
25 Code sharing
- Rule
  - share field layout and code if type arguments have the same representation
- Examples
  - Representation and code for methods in Set<string> can also be used for Set<object> (string and object are both 32-bit pointers)
  - Representation and code for Set<long> is different from Set<int> (int uses 32 bits, long uses 64 bits)
26 Exact run-time types
- We want to support
    if (x is Set<string>) { ... } else if (x is Set<Component>) { ... }
- But representation and code are shared between compatible instantiations, e.g. Set<string> and Set<Component>
- So there's a conflict to resolve
- ...and we don't want to add lots of overhead to languages that don't use run-time types (ML, Haskell)
27 Object representation in the CLR
[Diagram: two object layouts]
- Normal object representation: vtable ptr, then fields; the type is the vtable pointer
- Array representation: vtable ptr, element type, no. of elements, then elements; the type is inside the object
28 Object representation for generics
- Array-style: store the instantiation directly in the object?
  - extra word (possibly more for multi-parameter types) per object instance
  - e.g. every list cell in ML or Haskell would use an extra word
- Alternative: make vtable copies, store instantiation info in the vtable
  - extra space (vtable size) per type instantiation
  - expect no. of instantiations << no. of objects
  - so we chose this option
 
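29 Object representation for generics
[Diagram: objects x : Set<string> and y : Set<object>, each consisting of a vtable ptr followed by fields. Each object points to its own vtable with slots Add, HasMember and ToArray. The slots of both vtables point to the same shared code for Add, HasMember and ToArray, and each vtable also records its instantiation (string and object respectively).]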
30 Type parameters in shared code
- Run-time types with embedded type parameters, e.g.
    class TreeSet<T> { void Add(T item) { ..new TreeNode<T>(..).. } }
- Q: Where do we get T from if the code for the method is shared?
- A: It's always obtainable from the instantiation info in the "this" object
- Q: How do we look up the type rep for TreeNode<T> efficiently at run-time?
- A: We keep a dictionary of such type reps in the vtable for TreeSet<T>
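
The idea of a per-instantiation dictionary with lazily filled entries can be modelled in ordinary C#. The sketch below is a toy model, not the CLR's internal data structure: `TypeDictionary`, `SlowLookups` and `TreeNode<T>` are hypothetical names, and the real reflection call `Type.MakeGenericType` merely stands in for the runtime's slow look-up path.

```csharp
using System;

class TreeNode<T> { }

// One dictionary per instantiation (for example, one for Set<string>).
// Each slot caches a type handle that depends on the type argument.
class TypeDictionary
{
    private readonly Type typeArgument;   // e.g. typeof(string)
    private readonly Type[] slots;        // null means "not looked up yet"
    public int SlowLookups = 0;

    public TypeDictionary(Type typeArgument, int slotCount)
    {
        this.typeArgument = typeArgument;
        this.slots = new Type[slotCount];
    }

    // openType is a generic type definition such as typeof(TreeNode<>).
    public Type Get(int slot, Type openType)
    {
        Type cached = slots[slot];
        if (cached != null) return cached;   // fast path: entry already filled

        SlowLookups++;                       // slow path: taken once per slot
        Type built = openType.MakeGenericType(typeArgument);
        slots[slot] = built;
        return built;
    }
}

static class DictionaryDemo
{
    public static void Main()
    {
        TypeDictionary forString = new TypeDictionary(typeof(string), 1);
        Type first = forString.Get(0, typeof(TreeNode<>));
        Type second = forString.Get(0, typeof(TreeNode<>));
        Console.WriteLine(first == typeof(TreeNode<string>));
        Console.WriteLine(object.ReferenceEquals(first, second));
        Console.WriteLine(forString.SlowLookups);
    }
}
```

Only the first request for a slot pays for the slow look-up; later requests are a load and a null test, which is the shape of the fast path in shared code.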
31 Dictionaries in action
  class Set<T> {
    public void Add(T x) { ... new TreeNode<T>() ... }
    public T[] ToArray() { ... new T[] ... }
  }
  Set<string> s = new Set<string>();
  s.Add("a");
  Set<Set<string>> ss = new Set<Set<string>>();
  ss.Add(s);
  Set<string>[] ssa = ss.ToArray();
  string[] sa = s.ToArray();
32 Dictionaries in action
(same code as slide 31)
- vtable for Set<string>: vtable slots; string
33 Dictionaries in action
(same code as slide 31)
- vtable for Set<string>
34 Dictionaries in action
(same code as slide 31)
- vtable for Set<string>
- vtable for Set<Set<string>>: vtable slots; Set<string>
35 Dictionaries in action
(same code as slide 31)
- vtable for Set<string>
- vtable for Set<Set<string>>: vtable slots; Set<string>; TreeNode<Set<string>>
36 Dictionaries in action
(same code as slide 31)
- vtable for Set<string>
- vtable for Set<Set<string>>: vtable slots; Set<string>; TreeNode<Set<string>>; Set<string>[]
37 Dictionaries in action
(same code as slide 31)
- vtable for Set<string>: vtable slots; string; TreeNode<string>; string[]
- vtable for Set<Set<string>>: vtable slots; Set<string>; TreeNode<Set<string>>; Set<string>[]
38 x86 code for new TreeNode<T>
  ; Retrieve dictionary entry from vtable
          mov   ESI, dword ptr [EDI]
          mov   EAX, dword ptr [ESI+24]
          mov   EAX, dword ptr [EAX]
          add   EAX, 4
          mov   dword ptr [EBP-0CH], EAX
          mov   EAX, dword ptr [EBP-0CH]
          mov   EBX, dword ptr [EAX]
  ; If non-null then skip
          test  EBX, EBX
          jne   SHORT G_M003_IG06
  ; Look up handle the slow way
  G_M003_IG05:
          push  dword ptr [EBP-0CH]
          push  ESI
          mov   EDX, 0x1b000002
          mov   ECX, 0x903ea0
          call  @RuntimeHandle
          jmp   SHORT G_M003_IG07
  G_M003_IG06:
          mov   EAX, EBX
  ; Create the object with run-time type
  G_M003_IG07:
          mov   ECX, EAX
          call  @newClassSmall
39 Is it worth it?
- With no dictionaries, just run-time look-up
  - new Set<T>() is 10x to 100x slower than normal object creation
- With lazy dictionary look-up
  - new Set<T>() is 10% slower than normal object creation
40 Shared code for polymorphic methods
- Polymorphic methods
  - Specialize per instantiation on demand
  - Again share code between instantiations where possible
  - Run-time types issue solved by dictionary-passing style
41 Performance
- Non-generic quicksort: void Quicksort(object[] arr, IComparer comp)
- Generic quicksort: void GQuicksort<T>(T[] arr, GIComparer<T> comp)
- Compare on element types int, string, double
 
42 Performance

44 Transposing F to C#
- As musical keys, F and C# are far apart
- As programming languages, (System) F and (Generic) C# are far apart
- But:

Polymorphism in Generic C# is as expressive as polymorphism in System F
45 System F and C#
46 System F into C#
- Despite the differences, we can formalize a translation from System F into (Generic) C# that
  - is fully type-preserving (no loss of information)
  - is sound (preserves program behaviour)
  - makes crucial use of the fact that

polymorphic virtual methods express first-class polymorphism
47 Polymorphic virtual methods
- Define an interface or abstract class
    interface Sorter { void Sort<T>(T[] a, IComparer<T> c); }
- Implement the interface
    class QuickSort : Sorter { ... }
    class MergeSort : Sorter { ... }
- Use instances at many type instantiations
    void TestSorter(Sorter s, int[] ia, string[] sa) {
      s.Sort<int>(ia, IntComparer);
      s.Sort<string>(sa, StringComparer);
    }
    TestSorter(new QuickSort(), ...);
    TestSorter(new MergeSort(), ...);
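
A complete, compilable variant of the pattern on this slide is sketched below. It keeps the slide's `Sorter` interface and `TestSorter` method, but the two implementations (`InsertionSort` and `LibrarySort`) and the comparers are stand-ins chosen to keep the example short; they are assumptions, not the talk's QuickSort and MergeSort.

```csharp
using System;
using System.Collections.Generic;

interface Sorter
{
    // A generic method in an interface: dispatched virtually,
    // and callable at any type instantiation by the caller.
    void Sort<T>(T[] a, IComparer<T> c);
}

class InsertionSort : Sorter
{
    public void Sort<T>(T[] a, IComparer<T> c)
    {
        for (int i = 1; i < a.Length; i++)
        {
            T key = a[i];
            int j = i - 1;
            while (j >= 0 && c.Compare(a[j], key) > 0)
            {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }
}

class LibrarySort : Sorter
{
    public void Sort<T>(T[] a, IComparer<T> c) { Array.Sort(a, c); }
}

static class SorterDemo
{
    // One Sorter value is used at two different instantiations.
    public static void TestSorter(Sorter s, int[] ia, string[] sa)
    {
        s.Sort<int>(ia, Comparer<int>.Default);
        s.Sort<string>(sa, StringComparer.Ordinal);
    }

    public static void Main()
    {
        foreach (Sorter s in new Sorter[] { new InsertionSort(), new LibrarySort() })
        {
            int[] ia = { 3, 1, 2 };
            string[] sa = { "c", "a", "b" };
            TestSorter(s, ia, sa);
            Console.WriteLine(string.Join(",", ia) + " " + string.Join(",", sa));
        }
    }
}
```

The parameter `s` of `TestSorter` behaves like a value of the System F type "for all T, T array and comparer to unit": the callee decides at which T to apply it.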
48 Compare
- Define an SML signature
    signature Sorter = sig
      val Sort : 'a array * ('a*'a->order) -> unit
    end
- Define structures that match the signature
    structure QuickSort :> Sorter = ...
    structure MergeSort :> Sorter = ...
- Use structures at many type instantiations
    functor TestSorter(S : Sorter) = struct
      fun test (ia, sa) = (S.Sort(ia, Int.compare); S.Sort(sa, String.compare))
    end
    structure TestQS = TestSorter(QuickSort); TestQS.test(...);
    structure TestMS = TestSorter(MergeSort); TestMS.test(...);
49 Or (Russo first-class modules)
- Define an SML signature
    signature Sorter = sig
      val Sort : 'a array * ('a*'a->order) -> unit
    end
- Define structures that match the signature
    structure QuickSort :> Sorter = ...
    structure MergeSort :> Sorter = ...
- Use a function to test the structures
    fun TestSorter (s, ia, sa) =
      let structure S as Sorter = s
      in (S.Sort(ia, Int.compare); S.Sort(sa, String.compare)) end
    TestSorter ([structure QuickSort as Sorter], ...);
    TestSorter ([structure MergeSort as Sorter], ...);
50 Observations
- Translation from System F to C# is global
  - generates new class names for (families of) polymorphic types
- The generics design for Java (GJ) also supports polymorphic virtual methods
- C++ has template methods but not virtual ones
  - for good reason: it compiles by expansion
- Distinctiveness of polymorphic virtual methods shows up in (type-passing) implementations (e.g. CLR)
  - requires execution-time type application
 
52 Type inference?
- ML and Haskell have type inference
- C# programs must be explicitly typed
- Is this a problem in practice?
  - not for the most frequent application: collection classes
  - but try parser combinators in C#...
 
53 Parser combinators (Sestoft)
  class SeqSnd<T,U> : Parser<U> {
    Parser<T> tp;
    Parser<U> up;
    public SeqSnd(Parser<T> tp, Parser<U> up) { this.tp = tp; this.up = up; }
    public Result<U> Parse(ISource src) {
      Result<T> tr = tp.Parse(src);
      if (tr.Success) {
        Result<U> ur = up.Parse(tr.Source);
        if (ur.Success) return new Succ<U>(ur.Value, ur.Source);
      }
      return new Fail<U>();
    }
  }
54 On the other hand
- .NET generics are supported by 
 - debugger 
 - profiler 
 - class browser 
 - GUI development environment
 
55 Try it!
- Rotor = shared-source release of CLR and C#
  - http://msdn.microsoft.com/NET/sscli
- Generics + Rotor = Gyro
  - Gyro extends Rotor with generics support in CLR and C#
  - http://research.microsoft.com/projects/clrgen
 
57 Extension: Variance
- Should we add variance? e.g.
  - IEnumerator<Button> <: IEnumerator<Component>
  - IComparer<Component> <: IComparer<Button>
- Can even use this to support broken Eiffel

  class Cell<T> {                                    // invariant in T
    T val;
    void Set(T newval) { val = newval; }
    T Get() { return val; }
  }
  class Cell<T> {                                    // covariant in T
    T val;
    void Set(object newval) { val = (T) newval; }    // run-time check
    T Get() { return val; }
  }
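
For comparison, later versions of C# (4.0 onwards) did add declaration-site variance for interfaces and delegates, marked with `out` and `in`. The sketch below uses that later syntax rather than anything shown in the talk. `Component`, `Button` and `ByIdComparer` are hypothetical classes, and the array store at the end is only an analogy for the run-time check on the covariant Cell.

```csharp
using System;
using System.Collections.Generic;

class Component { public int Id; }
class Button : Component { }

class ByIdComparer : IComparer<Component>
{
    public int Compare(Component x, Component y) { return x.Id.CompareTo(y.Id); }
}

static class VarianceDemo
{
    // True if storing a plain Component into the array is rejected at run time.
    public static bool StoreRejected(Component[] cells)
    {
        try { cells[0] = new Component(); return false; }
        catch (ArrayTypeMismatchException) { return true; }
    }

    public static void Main()
    {
        // Covariance: IEnumerable<out T> allows Button sequences as Component sequences.
        IEnumerable<Button> buttons = new List<Button> { new Button() };
        IEnumerable<Component> components = buttons;

        // Contravariance: IComparer<in T> allows a Component comparer to compare Buttons.
        IComparer<Button> buttonComparer = new ByIdComparer();
        Console.WriteLine(buttonComparer.Compare(new Button(), new Button()));

        // Arrays are covariant yet writable, so each store is checked at run time.
        Console.WriteLine(StoreRejected(new Button[1]));
        Console.WriteLine(object.ReferenceEquals(buttons, components));
    }
}
```

Restricting variance to interfaces whose type parameter appears only in output (or only in input) positions avoids the run-time check that a covariant, mutable Cell would need.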
58 Extension: Parameterize by superclass
- Can type-check given sufficient constraints

  class D {
    virtual void m1() { ... }
    virtual void m2() { ... }
  }
  class C<T> : T where T : D {                   // T must extend D
    int f;
    override void m2(T x) { ... x.m1(); ... }    // override method D.m2; know m1 exists because of constraint on T
    new virtual void m3() { ... }                // new method, name can clash with method from T
  }
59 Extension: Parameterize by superclass (2)
- Provides a kind of mixin facility
- Unfortunately, implementation isn't easy
- We'd like to share rep and code for C<P> and C<Q> for reference types P and Q, but it may be the case that
  - object size of C<P> ≠ size of C<Q>
  - field offset of C<P>.f ≠ offset of C<Q>.f
  - vtable slot of C<P>.m3 ≠ slot of C<Q>.m3
- => abandon sharing, or do more run-time lookup
 
60 Open problem
- Most widely used polymorphic library is probably the C++ STL (Standard Template Library)
- STL gets expressivity and efficiency from checking and compiling instantiations separately
- Really: ML functors can't match it
- How can we achieve the same expressivity and efficiency with compile-time-checked parametric polymorphism?