Title: Untrustworthy Programming Languages
1 Untrustworthy Programming Languages
Andrew Kennedy, MSR Cambridge
2 Do you trust your programming language?
- Modern programming platforms promise security
  - The Java security model is based on a customizable "sandbox" in which Java software programs can run safely, without potential risk to systems or users (java.sun.com/security)
  - The .NET Common Language Runtime implements its own secure execution model that is independent of the host platform (Don Box, MSDN magazine)
- Most articles emphasise type-safety (=> memory safety) of the JVM or CLR
- And of course, special-purpose mechanisms such as Code Access Security (stack-walking), permissions, crypto, etc.
- But that's not the whole story
3 The way it was
- In the past, programming language abstractions
  - made languages high-level, i.e. far from the raw metal of the machine
  - were good software engineering
  - protected programmers from themselves and others
- If the language contained holes, it was just a programming problem
- In any case, nothing was enforced underneath except at coarse boundaries (machine, system/user, process)
4 But now...
- The programming model is part of the security model
  - in particular, its type system
  - but also, other aspects
- Programmers will assume that abstractions are enforced underneath...
- ...and use them to write secure code.
5 Eiffel, 1989
Cook, W.R. (1989). A Proposal for Making Eiffel Type-Safe. In Proceedings of ECOOP'89, S. Cook (ed.), pp. 57-70. Cambridge University Press.
Bertrand Meyer, on unsoundness of Eiffel: "Eiffel users universally report that they almost never run into such problems in real software development."
6 Ten years later: Java
7 Secure programming platforms
[Diagram]
- Java source -> Java compiler -> JVML (bytecodes) -> executed on the JVM (Java Virtual Machine)
- C#, C++ and Visual Basic sources -> C#, C++ and VB compilers -> CIL -> executed on the .NET CLR (Common Language Runtime)
8 Type safety
- Ensures
  - data safety: can access memory only through typed objects
  - code safety: can access components only according to their interface
- Isolates software "processes" (Application Domains in .NET)
  - used for downloadable plug-ins for UI in next version of Windows
- Importance of type safety is now widely appreciated
  - Microsoft would issue an immediate critical update if a type safety bug was discovered
Insert war stories here
9 Type loophole => anything goes
- Exploit a type loophole to execute arbitrary code. Here's a recipe.
  - Define a delegate type D, create a delegate object off an empty method:
        delegate void D();
        public static void DoNothing() { }
        D d = new D(DoNothing);
  - Define a SpoofD class with an int field spoofing the (internal) function pointer field of the delegate type:
        class SpoofD { public int fptr; ... }
  - Now pretend that the delegate object has type SpoofD (via type loophole):
        SpoofD sd = ...loophole magic...(d);
  - Set the spoof function pointer field to the address of your malicious code:
        sd.fptr = my_bad_code;
  - Invoke the delegate:
        d();   // d and sd refer to the same object
10 Beyond type safety
- How do programmers reason about security properties of their code? Or about their code at all? We might hope that
  - A C# programmer can reason about code armed only with the C# language spec and specs for libraries used by the code
- Unfortunately, it seems that a C# programmer also needs
  - Some understanding of how C# is translated into IL
  - Some understanding of the behaviour of IL
  - Some understanding of parts of the standard library not mentioned in the language spec or used by the program
11 Example 1: Privacy through override
- In C# (and Java), overridden methods cannot be invoked directly except by the overriding method
- This property has been used by programmers for security purposes:

    class InsecureWidget {
      // No checking of argument
      public virtual void Put(string s) { ... }
    }
    class SecureWidget : InsecureWidget {
      // Validate argument and pass on
      public override void Put(string s) {
        Validate(s);
        base.Put(s);
      }
    }
    ...
    SecureWidget sw = new SecureWidget();
    // We can't avoid validation of arguments to Put, can we?

    // Oh, yes we can! Direct call on superclass
    ldloc sw
    ldstr "Invalid string"
    call instance void InsecureWidget::Put(string)
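The source-level guarantee that the direct IL call breaks can be sketched in Java, which the slide says shares the property. This is only an illustrative analogue: the class names mirror the slide, while `validate`, `rejects`, and the `last` field are hypothetical additions. It shows what a source-level client can do, not the IL bypass itself.

```java
// Hypothetical Java analogue of the slide's widget classes.
class InsecureWidget {
    protected String last;

    // No checking of the argument.
    void put(String s) {
        last = s;
    }
}

class SecureWidget extends InsecureWidget {
    // Validate the argument, then pass it on to the unchecked base method.
    @Override
    void put(String s) {
        validate(s);
        super.put(s);
    }

    // Assumed validation rule, for illustration only.
    private static void validate(String s) {
        if (s == null || s.isEmpty()) {
            throw new IllegalArgumentException("invalid string");
        }
    }
}

class WidgetDemo {
    // Returns true if the widget rejected the argument.
    static boolean rejects(InsecureWidget w, String s) {
        try {
            w.put(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Even through a reference of the base type, dispatch is virtual,
        // so the validating override runs.
        InsecureWidget w = new SecureWidget();
        System.out.println(rejects(w, ""));      // true
        System.out.println(rejects(w, "hello")); // false
    }
}
```

From source code, the only route to `InsecureWidget.put` on a `SecureWidget` instance is the `super.put(s)` call inside the override, which is exactly the assumption the IL snippet on the slide violates.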
12 Analysis
- What went wrong?
  - In C#, overridden methods can only be invoked through base calls
  - In IL, they can be called directly
  - So there are programs in IL that can provoke behaviour not possible from C#
- What is a good way to characterize this?
  - Translation from C# to IL fails to be fully abstract
  - See "Protection in Programming-Language Translations", Abadi, 1998
- How can we fix it?
  - Not easily: IL was designed for multiple languages, with conflicting goals
13 An ideal: full abstraction
- Ensure that all abstractions of the programming language are enforced by the runtime
  - programmers don't have to know what's underneath
  - if they understand the programming language, they understand the platform programming model
- Ensure that translation from C# to IL is fully abstract
[Diagram] C# program (properties that hold here...) -> IL program (...also hold here)
14 Full abstraction
- Two programs are equivalent if they have the same behaviour in all contexts of the language, e.g. the two versions of Secret below
- A translation is fully abstract if it respects equivalence
- For us
  - the translation is from source language (C# etc.) to MSIL
  - if there exist contexts (e.g. other code) in MSIL that can distinguish equivalent source programs, then the translation fails to be fully abstract

    class Secret {
      private int f;
      public Secret(int fv) { f = fv; }
      public void Set(int fv) { f = fv; }
    }

  is equivalent to

    class Secret {
      public Secret(int fv) { }
      public void Set(int fv) { }
    }
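The idea of a more powerful context telling apart two source-equivalent programs can be sketched in Java. This is an analogy only: reflection stands in for the lower-level MSIL context, the slides do not discuss reflection here, and the names `SecretWithField`, `SecretWithoutField`, and `declaredFieldCount` are hypothetical.

```java
// Two classes that ordinary client code cannot tell apart:
// the private field f is written but never read.
class SecretWithField {
    private int f;
    SecretWithField(int fv) { f = fv; }
    void set(int fv) { f = fv; }
}

class SecretWithoutField {
    SecretWithoutField(int fv) { }
    void set(int fv) { }
}

class DistinguishDemo {
    // A context with more power than plain source-level calls:
    // it inspects the declared fields of a class.
    static int declaredFieldCount(Class<?> c) {
        return c.getDeclaredFields().length;
    }

    public static void main(String[] args) {
        System.out.println(declaredFieldCount(SecretWithField.class));    // 1
        System.out.println(declaredFieldCount(SecretWithoutField.class)); // 0
    }
}
```

Any observer that yields different answers for the two classes is a distinguishing context. Full abstraction asks that the target language contain no such observer for programs that are equivalent at source level.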
15 Full abstraction for Java
- Translation from Java to JVML is not quite fully abstract (Abadi, 1998)
- At least one failure: access modifiers in inner classes
  - a late addition to the language
  - not directly supported by the JVM
  - compiled by translation => impractical to make fully abstract without changing the JVM
16 Full abstraction for C#?
- A number of failures
- Excuse: multiple languages target the CLR, with different goals
  - The JVM was designed for a single language: Java. (Almost) full abstraction was probably an accident, though in retrospect it's a good thing.
- For C#/CLR, we can catalogue failures of full abstraction and propose fixes
  - either change the translation from C# to IL
  - or reduce expressivity of IL (fewer IL contexts)
  - or increase the expressivity of C# (more C# contexts)
- At least: document the failures, educate programmers, provide tools to spot insecure programming patterns
17 Example 2: Encapsulation of object state
- Programmer expectation: instances of types whose API ensures immutability are immutable.
  - Ex: String, DateTime, Int32
- Boxing shouldn't make any difference, should it?

    // A dictionary keyed on strings
    class StringDict {
      private Hashtable dict;
      public object Get(string s) { return dict[s]; }
      internal void Set(string s, object o) { ... }
    }
    static StringDict personalData;

    // In a module far away
    // We cannot update from here
    object salary = personalData.Get("Salary");

    // Oh, yes we can! Just get pointer to interior
    ldloc salary
    unbox int32
    ldc.i4 1000000
    stind.i4
18 Example 2: Encapsulation of object state
- An equivalence that is not preserved:

    public static int Foo(int x) { object y = (object) x; Bar(y); return x; }

  is equivalent in C# to

    public static int Foo(int x) { object y = (object) x; Bar(y); return (int) y; }

- Fix?
  - In CLR type system: disallow update after unboxing
19 Example 3: this is valid object instance?
- Instance methods are always invoked on a valid instance, surely?

    class Foo {
      // Instance registered for privileged action
      private static Foo registered = null;
      // Only called from this module (internal access)
      internal void Register() { registered = this; }
      public void Bar() {
        if (this == registered) {
          // Perform privileged action
        }
      }
    }
    // We can't execute privileged action from another module

    // Oh, yes we can! Just call-direct-with-null
    ldnull
    call instance void Foo::Bar()
20 Example 3: this is valid object instance?
- An equivalence that is not preserved:

    class C { public bool Foo() { return true; } }

  is equivalent in C# to

    class C { public bool Foo() { return this != null; } }

- Fix?
  - In C# compiler: explicit check-for-null at start of method
  - In CLR: check-for-null at call-site (as with virtual call)
21 Example 4: Exceptions are instances of System.Exception?

    try {
      // perform some action, to completion
    } catch (Exception e) {
      // undo action whenever an exception was thrown in try-block
    }
    // Action either ran to completion, or was fully undone

    // Not necessarily! From IL, can throw any object
    newobj instance void System.Object::.ctor()
    throw
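A Java analogue of the undo pattern shows the same gap and one way to close it. Note the difference from the slide: in Java the gap is visible at source level, because a `Throwable` that is not an `Exception` (here an `Error`) can be thrown directly. The `Account` class and the method names are hypothetical.

```java
// Hypothetical state that must be either fully updated or fully restored.
class Account {
    int balance = 100;
}

class UndoDemo {
    // Fragile: undoes the change only when the failure is an Exception.
    static void fragileUpdate(Account a, Runnable step) {
        int saved = a.balance;
        try {
            a.balance -= 50;
            step.run();
        } catch (Exception e) {
            a.balance = saved; // undo
            throw e;
        }
    }

    // More robust: undoes the change unless the action ran to completion,
    // whatever kind of object was thrown.
    static void robustUpdate(Account a, Runnable step) {
        int saved = a.balance;
        boolean completed = false;
        try {
            a.balance -= 50;
            step.run();
            completed = true;
        } finally {
            if (!completed) {
                a.balance = saved; // undo
            }
        }
    }

    // Runs one of the two updates with a step that throws a non-Exception.
    static int balanceAfter(boolean robust) {
        Account a = new Account();
        Runnable failing = () -> { throw new Error("not an Exception"); };
        try {
            if (robust) {
                robustUpdate(a, failing);
            } else {
                fragileUpdate(a, failing);
            }
        } catch (Throwable t) {
            // Swallowed only so the demo can inspect the state afterwards.
        }
        return a.balance;
    }

    public static void main(String[] args) {
        System.out.println(balanceAfter(false)); // 50: change was not undone
        System.out.println(balanceAfter(true));  // 100: change was undone
    }
}
```

The completion flag with `finally` does not depend on the type of the thrown object, so it keeps the "ran to completion, or fully undone" property without assuming that everything thrown is an `Exception`.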
22 Example 5: Booleans are two-valued?

    void Foo(bool b) {
      bool c = !b;
      if (!c != b) {
        Console.WriteLine("This cannot happen");
      }
    }

    // Oh yes it can!
    ldc.i4 2
    call void Foo(bool)
23 Example 5: Booleans are two-valued?
- An equivalence that is not preserved:

    static bool Foo(bool x, bool y) { return (x == false) == (y == false); }

  is equivalent in C# to

    static bool Foo(bool x, bool y) { return x == y; }

- Fix?
  - Change C# compilation of == and != for bool so that it cares only about zero/non-zero-ness
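The boolean failure and the proposed fix can be modelled in Java by representing a CLR boolean as a raw `int`. This is a sketch under two stated assumptions: that `!` is compiled as a comparison with zero, and that `!=` on `bool` compares raw representations. All names are hypothetical.

```java
// Models CLR booleans as raw ints, where any non-zero value counts as true.
class BoolModel {
    // Logical negation as "compare with zero": always yields exactly 0 or 1.
    static int not(int b) {
        return b == 0 ? 1 : 0;
    }

    // Assumed current compilation: compare the raw representations.
    static boolean rawNotEqual(int x, int y) {
        return x != y;
    }

    // Proposed fix: compare only zero/non-zero-ness.
    static boolean normalizedNotEqual(int x, int y) {
        return (x != 0) != (y != 0);
    }

    // Mirrors Foo from Example 5: returns true when the
    // "This cannot happen" branch would be taken.
    static boolean reachesImpossibleBranch(int b, boolean fixed) {
        int c = not(b);
        int notC = not(c);
        return fixed ? normalizedNotEqual(notC, b) : rawNotEqual(notC, b);
    }

    public static void main(String[] args) {
        System.out.println(reachesImpossibleBranch(2, false)); // true
        System.out.println(reachesImpossibleBranch(2, true));  // false
    }
}
```

With `b = 2`, negation twice gives 1, and the raw comparison sees 1 and 2 as different even though both mean true. The normalized comparison treats them as equal, which restores the source-level equivalence.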
24 Weak abstractions
- Some abstractions aren't broken, they're just a bit weak
  - arrays are always mutable
    - developers forget this and define readonly properties with array types
  - run-time types break privacy by subsumption
    - solution to array problem would be to return array as an IEnumerable (a read-only enumerator)
    - but run-time types let programmer cast back to the array
- Other abstractions are broken not by IL but by library classes
  - e.g. delegates (closures) would encapsulate code and object state if it weren't for the System.Delegate.Target and System.Delegate.Method properties.
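The cast-back problem has a close Java analogue. Java arrays are not `Iterable`, so this sketch exposes the array through an `Arrays.asList` view typed as `Iterable<String>`; that substitution, and the names `Roles`, `weakView`, `safeView`, and `tryOverwrite`, are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class Roles {
    private final String[] names = { "admin", "guest" };

    // Weak: the static type is read-only, but the object is a list
    // backed directly by the private array.
    Iterable<String> weakView() {
        return Arrays.asList(names);
    }

    // Stronger: an unmodifiable wrapper over a defensive copy.
    Iterable<String> safeView() {
        return Collections.unmodifiableList(Arrays.asList(names.clone()));
    }

    String first() {
        return names[0];
    }
}

class CastBackDemo {
    // A client that uses the run-time type to try to regain write access.
    static boolean tryOverwrite(Iterable<String> view) {
        try {
            List<String> list = (List<String>) view;
            list.set(0, "intruder");
            return true;
        } catch (ClassCastException | UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Roles r = new Roles();
        System.out.println(tryOverwrite(r.safeView()) + " " + r.first()); // false admin
        System.out.println(tryOverwrite(r.weakView()) + " " + r.first()); // true intruder
    }
}
```

Narrowing the static type alone does not protect the state. The stronger variant works because the object handed out neither aliases the private array nor supports mutation, whatever the client casts it to.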
25 Why bother?
- Even if the translation from C# to IL were fully abstract, reasoning about C# programs would still be hard.
  - Programmers make mistakes in writing secure code
  - Tools for automating reasoning about programs are still in their infancy
  - There are many other pitfalls in the language
- So why bother about full abstraction?
  - Because it's a great starting point
  - The ability to reason about C# programs in C# is hugely simplifying
  - Even better if we could cut down to a subset of C# that suffices
26 Formalize?
- Proofs of full abstraction are hard
  - We don't have a complete formal model of C#
  - We don't have a complete formal model of IL
- So what to do?
  - Optimist: even if we can't formalize, we can identify failures, and fix them all
  - Pessimist: we can never be sure that we have full abstraction. Instead, focus on certain patterns, prove that these are watertight. Example:
    - prove that integers are safe!
    - prove that private fields don't leak
27 Conclusions
- The programming model is a vital part of the security story for .NET and Java
- Programmers need to know what they can trust
- Full abstraction is the ideal
- My choice would be to fix the holes we know about
  - Might be hard to do
  - If we can't or won't, we should educate developers
- Type safety is now taken for granted as a necessity
  - In the future, full abstraction also?