Title: Design and Implementation of Generics for the .NET Common Language Runtime
1. Design and Implementation of Generics for the .NET Common Language Runtime
- Andrew Kennedy, Don Syme
- Microsoft Research, Cambridge UK
- Presented at PLDI (Programming Language Design and Implementation) 2001
- Presented by Dmitri Alperovitch
2. Parametric Polymorphism (Generics)
- Benefits
- Code reuse
- Faster code (no runtime casts)
- Safer programming (static type-checking)
- For use in
- Collection classes
- Algorithms
- Structured types
3. Proposal for Generics in CLR
- Very expressive and efficient design, supporting
- Generic types (ex. LinkedList<T>)
- Generic static, instance, and virtual methods (ex. quicksort<T>(T[] arr))
- Unrestricted instantiations: value types are supported (no boxing)
- Polymorphic recursion
- ex. void m<T>(T x)
- ...
- m<IList<T>>(....)
- ...
- F-bounded type parameters: a constraint may involve the type parameter itself (ex. quicksort<T : IComparable<T>>)
- Exact run-time types (ex. if (x is LinkedList<Dog>) )
4. No support for
- No covariance in the type parameters of a generic type (ex. List<String> is not a subtype of List<Object>)
- Type parameter T cannot be used in
- new T() or T.func()
- A type cannot implement a generic interface at more than one instance
- class Matrix : IEnumerable<int>, IEnumerable<int>
- Inheritance from a naked type parameter (ex. class Foo<T> extends T)
5. Implementation Details
- Translation
- Type-checked at declaration, not at use (unlike C++ templates)
- After type check, translated to Generic IL bytecode, which supports explicit type parameters and validation of generic code
6. Implementation Details: VM Changes
- Dynamic Code Expansion and Sharing (NEW IDEA)
- Combination of the C++ (expansion) and Java/GJ (sharing) approaches
- Specialized instances of a generic class C<T> are created at runtime for relevant argument types T (JIT)
- Code for type instances is shared among compatible types T, avoiding most of the code bloat
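A minimal Python sketch of the expansion-plus-sharing idea follows. It is a toy model, not the CLR's actual mechanism: the names ToyJit, code_key and VALUE_TYPES are hypothetical, and the rule "value types get their own code, reference types share one copy" is an assumption about which types count as compatible.

```python
from typing import Dict, Tuple

# Assumption: a toy set of value types; every other name is a reference type.
VALUE_TYPES = {"int32", "float64"}


def code_key(type_arg: str) -> str:
    """Value types need their own code; reference types can share one copy."""
    return type_arg if type_arg in VALUE_TYPES else "<ref>"


class ToyJit:
    """Hypothetical stand-in for the runtime's JIT; not the CLR's real API."""

    def __init__(self) -> None:
        self.compiled: Dict[Tuple[str, str], str] = {}
        self.compile_count = 0

    def code_for(self, generic_class: str, type_arg: str) -> str:
        key = (generic_class, code_key(type_arg))
        if key not in self.compiled:  # expand only when first needed
            self.compile_count += 1
            self.compiled[key] = f"native code #{self.compile_count} for {generic_class}"
        return self.compiled[key]  # compatible instantiations reuse it


jit = ToyJit()
shared_a = jit.code_for("Stack", "string")
shared_b = jit.code_for("Stack", "object")
specialized = jit.code_for("Stack", "int32")
print(shared_a is shared_b, jit.compile_count)
```

In this model Stack<string> and Stack<object> end up with one compiled body, while Stack<int32> gets its own, so three instantiations cost two compilations.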
7. Dynamic Code Expansion and Sharing
- Type instances are created lazily (when encountered), which permits polymorphic recursion
- Code is also recompiled lazily when new method calls are encountered
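Why lazy creation matters for polymorphic recursion can be sketched in Python. This is an illustrative model only: the table `instantiated` and the helper `instantiate` are hypothetical, and type instances are represented as plain strings.

```python
from typing import Dict

# Hypothetical table of the type instances the toy runtime has created so far.
instantiated: Dict[str, int] = {}


def instantiate(type_name: str) -> None:
    """Create a type instance the first time it is encountered, then reuse it."""
    if type_name not in instantiated:
        instantiated[type_name] = len(instantiated)  # record creation order


def m(type_arg: str, depth: int) -> None:
    """Models void m<T>(T x) { ... m<IList<T>>(...) ... }."""
    instantiate(f"m<{type_arg}>")
    if depth > 0:
        # Each recursive call uses a larger type argument than the last.
        m(f"IList<{type_arg}>", depth - 1)


m("int", 3)  # the recursion depth is only known at run time
print(list(instantiated))
```

Because the set of instantiations depends on a run-time value, it cannot be enumerated ahead of time; creating each instance when first encountered sidesteps that.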
8. Implementation Details: VM Changes
- Pass and store runtime type information
- Objects carry exact runtime type information
- Accomplished by duplication of vtables (not shared among Stack<string> and Stack<object>)
- The vtable for a type instance contains the type handles of the exact actual argument types (ex. <string> as last entry for Stack<string>)
- These argument types are used in runtime type casts and instance checks
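The per-instantiation vtable idea can be modelled in a few lines of Python. The classes VTable and StackObject and the function is_stack_of are hypothetical names for this sketch; a real vtable layout is more involved.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class VTable:
    """Toy vtable: method slots, then the exact type arguments as last entry."""
    methods: Tuple[Callable, ...]
    type_args: Tuple[str, ...]


def push(stack, item):
    """One method body, shared by both instantiations below."""
    stack.items.append(item)


SHARED_METHODS = (push,)
# The vtables are duplicated per instantiation even though the code is shared.
VT_STACK_STRING = VTable(SHARED_METHODS, ("string",))
VT_STACK_OBJECT = VTable(SHARED_METHODS, ("object",))


class StackObject:
    def __init__(self, vtable: VTable) -> None:
        self.vtable = vtable
        self.items = []


def is_stack_of(obj: StackObject, type_arg: str) -> bool:
    """Models `x is Stack<type_arg>` by reading the handle stored in the vtable."""
    return obj.vtable.type_args == (type_arg,)


s = StackObject(VT_STACK_STRING)
print(is_stack_of(s, "string"), is_stack_of(s, "object"))
```

The object itself stores no extra type field; the exact type is recovered through the vtable it already points to.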
9. Just-In-Time Compilation
- Some polymorphic code cannot be specialized statically (ex. polymorphic recursion)
- Lazy specialization has a drawback: the compiler could generate more optimal code if it knew ahead of time that the code will not be shared (allocation overhead)
- A type handle dictionary is used to keep track of instantiated vtables. Some type handles may be precomputed and stored in the class vtable to minimize runtime lookup during allocations (ex. D<T> will need Set<T> and List<T>)
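A rough Python model of caching type handles in per-instantiation slots is given below. The names Instantiation, lookup_handle and global_handles are hypothetical, and the slots are filled on first use, which is one possible policy assumed here; the slide only says that some handles may be precomputed.

```python
from typing import Dict, List, Optional

lookups = 0  # counts slow-path lookups in the global table
global_handles: Dict[str, object] = {}


def lookup_handle(type_name: str) -> object:
    """Slow path: find or create the handle (vtable) for an instantiated type."""
    global lookups
    lookups += 1
    return global_handles.setdefault(type_name, object())


class Instantiation:
    """Toy vtable for D<T> with slots for the types its code allocates."""

    def __init__(self, type_arg: str, needed: List[str]) -> None:
        # e.g. needed = ["Set<{T}>", "List<{T}>"]; slots start out empty
        self.slot_names = [name.format(T=type_arg) for name in needed]
        self.slots: List[Optional[object]] = [None] * len(needed)

    def handle(self, index: int) -> object:
        if self.slots[index] is None:  # first allocation: slow lookup
            self.slots[index] = lookup_handle(self.slot_names[index])
        return self.slots[index]  # later allocations: direct slot read


d_int = Instantiation("int", ["Set<{T}>", "List<{T}>"])
for _ in range(1000):
    d_int.handle(0)  # stands in for allocating a Set<int>
    d_int.handle(1)  # stands in for allocating a List<int>
print(lookups)
```

After the first pass through the loop every allocation reads its handle straight from a slot, so the global table is consulted once per needed type rather than once per allocation.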
10. Object-based Stack vs. Generic Stack
Object-based Stack
- .class Stack
- .field private class System.Object[] store
- .method public void Push(class System.Object x)
- .maxstack 4
- .locals (class System.Object, int32)
- ...
- ldarg.0
- ldfld class System.Object[] Stack::store
- ldarg.0
- dup
- ldfld int32 Stack::size
- stelem.ref
- ret
- .method public class System.Object Pop()
- stfld int32 Stack::size
- ldelem.ref
- ret
Generic Stack
- .class Stack<T>
- .field private class !0[] store
- .method public void Push(!0 x)
- .maxstack 4
- .locals (!0, int32)
- ...
- ldarg.0
- ldfld class !0[] Stack<!0>::store
- ldarg.0
- dup
- ldfld int32 Stack<!0>::size
- ...
- stelem.any !0
- ret
- ...
- .method public !0 Pop()
- ...
- stfld int32 Stack<!0>::size
11. Polymorphism in instructions
- Most IL instructions are inherently generic in the sense that they work with many types (ex. ldarg.1 pushes the first argument of a method onto the stack). Contrast with the JVM: different instructions for different types.
- Therefore: no changes
- Other IL instructions are generic, but are followed by other information required for overloading resolution (ex. ldfld class System.Object Stack::store ⇒ ldfld !0 Stack<!0>::store)
- Therefore: use numbered type parameters instead of class names
- A small number of IL instructions do come in different variants (ex. ldelem.ref, stelem.ref, ldelem.i4, stelem.i4 ⇒ ldelem.any, stelem.any)
- Therefore: add 2 generic instructions for array access and update
12. Performance
- Goal: no performance bar to using polymorphic code
- 500% speedup in use with value types (eliminates boxing)
- No casts ⇒ 20% speedup
- Allocation ⇒ 10% slower with generic code
13. Unique Ideas
- Combining the 2 current techniques: code expansion and code sharing
- World's first cross-language generics (not just for C#, but C++ and VB and other languages running on the CLR)
14. Critique
- Backwards compatibility? Forget it!
- New bytecode instructions
- No covariance in type parameters (can't mix old, even recompiled, object-based code with generics)
- 2 API frameworks then? Maybe. No answer in the paper (couldn't get a straight answer from Microsoft sources because of NDA issues)
15. Future
- Bill Gates at last month's OOPSLA 2002
- The next version of the .NET CLR will have generics!
- When? Who knows?
- Implementation details may still change, but major changes are unlikely since the C# team has already gone on record that all the features outlined in this paper will be provided
- Considering type-safe variance design for the future