Title: Secure Certifying Compilation
1Secure Certifying Compilation
What do you want to type check today?
- David Walker
- Cornell University
2Extensible Systems
- Many systems have programmable interfaces.
- printers and editors (PostScript printers, Emacs, Word)
- browsers and servers (applets, plugins, CGI scripts)
- operating systems (virus scanners)
- networks (active networks, JINI)
[Diagram: code is downloaded, linked, and executed against the system interface]
3Extensible Systems Pros
- Client-side customization
- plug in your own devices, 3rd-party utilities
- Preservation of market-share
- vendors can add features, improve functionality easily
- System maintenance and evolution
- software subscriptions
4Extensible Systems Cons
- Security
- extensibility opens system to malicious attacks
- how do we prevent misuse of resources?
- Reliability
- flexibility makes it hard to reason about system evolution
- how do we limit damage done by erroneous extensions?
5Extensible Systems Reality
- Strong economic and engineering pros
- Mobile code, systems with programmable interfaces will proliferate
- A necessity: practical technology for increasing the security and reliability of extensible systems
6Outline
- Framework for improved reliability and security
- Idea I: certifying compilation
- Idea II: security via code instrumentation
- An instance (POPL '00)
- Security automaton specifications
- A dependently-typed target language (TAL)
- Related work and research directions
7Certified Code
[Diagram: untrusted code with an attached certificate is downloaded and checked; the resulting secure code is linked and executed against the system interface]
- Attach annotations/certificate (types, proofs, ...) to untrusted object code extensions
- Certificates make verification feasible
- Move away from trust-based security and reliability
8Certifying Compilation
- Low-level certificate generation must be automated
- Necessary components
- 1) a source-level programming language
- 2) a compiler to compile and optimize source programs while preserving the certificate
- 3) a certifying target language
[Diagram: high-level program -> compile -> annotated IR with certificate -> optimize -> transmit]
9Question
How should we obtain the initial certificate?
10Answer
- Use a type-safe language
- Type inference relieves the tedium of proof construction
- Programmers will rewrite programs so they type check
11Certifying Compilation So Far
- 1) a strongly typed source-level programming language
- 2) a type-preserving compiler to compile and optimize source programs
- 3) a certificate language for type-safety properties
[Diagram: type-safe high-level program -> compile -> typed program (certificate: types) -> optimize -> transmit]
12Certifying Compilers
- Proof-Carrying Code (Necula & Lee)
- an expressive base logic that can encode many security policies
- in practice, logic is extended with a type system
- compilers produce type safety proofs
- Typed Assembly Language (Morrisett, Walker, et al.)
- flexible type constructor language that can encode high-level abstractions
- guarantees type safety properties
13Conventional Type Safety
- Conventional types ensure basic safety
- basic operations performed correctly
- abstraction/interfaces hide data representations and system code
- Conventional types don't describe complex security policies
- e.g., policies that depend upon history
- Melissa virus reads Outlook contacts list and then sends 50 emails
14Outline
- Framework for improved reliability and security
- Idea I: certifying compilation
- Idea II: security via code instrumentation
- An instance (POPL '00)
- Security automaton specifications
- A dependently-typed target language (TAL)
- Related work and research directions
15Flexible Security Policies
- Specify policies independently of extensible system
- Compiler instruments extensions
- Easier to understand, debug, evolve policies
[Diagram: the compiler takes a high-level extension and a security policy, then instruments, analyzes, and optimizes the extension]
16Security Policy Specifications
- Requirement: a language for specifying security policies
- Features
- Notation for specifying events of interest
- "network send" and "file read" are security-sensitive
- Notation for specifying illegal behaviour
- a privacy policy: "no send after read"
- A feasible compilation strategy
- must be able to prevent programs from violating the policy
17Examples
- SFI (Wahbe et al.)
- events are read, write, jump
- enforce memory safety properties
- SASI (Erlingsson & Schneider), Naccio (Evans & Twyman)
- flexible policy languages
- not certifying compilers
18Putting it Together
- define policies in a high-level, flexible and system-independent specification language
- instrument system extensions both with dynamic security checks and static information
- preserve proof of security policy during compilation and optimization
- verify certified compiler output to reduce the trusted computing base (TCB)
19Outline
- Framework for improved reliability and security
- Idea I: certifying compilation
- Idea II: security via code instrumentation
- An instance (POPL '00)
- Security automaton specifications
- A dependently-typed target language (TAL)
- Related work and research directions
20Secure Certified Code
- Overview of Architecture
- Security Automata (Erlingsson & Schneider)
- How to specify security properties
- A simple compilation strategy
- A dependently-typed target language (TAL)
- A brief introduction to TAL
- Extensions for certifying security properties
- theoretical core language proven sound
- can express any security automaton policy
21Security Architecture
[Diagram: security automaton specification, high-level extension, system interface -> instrument, annotate -> secure typed extension, secure typed interface -> type check, optimize -> secure executable -> transmit]
22Security Automata
- A general mechanism for specifying security policies
- Specify any safety property
- access control policies
- cannot access file foo
- resource bound policies
- allocate no more than 1M of memory
- the Melissa policy
- no network send after file read
23Example
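[Diagram: automaton with states start, has read, bad; start -send-> start, start -read(f)-> has read, has read -read(f)-> has read, has read -send-> bad]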
- Policy: no send operation after a read operation
- States: start, has read, bad
- Inputs (program operations): send, read
- Transitions (state × input -> state)
- start × read(f) -> has read
24Example Contd
[Diagram: the "no send after read" automaton with states start, has read, bad]
- S.A. monitor program execution
- Entering the bad state = security violation
untrusted program    s.a. (start state: start)
send()               ok -> start
read(f)              ok -> has read
send()               bad, security violation
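The monitored run above can be sketched in Python. This is an illustration only, not the talk's implementation: the names `TRANSITIONS` and `monitor`, and the encoding of states and operations as strings, are assumptions introduced here.

```python
# States of the "no send after read" automaton from the slide.
START, HAS_READ, BAD = "start", "has read", "bad"

# Transition function: (state, operation) -> next state.
TRANSITIONS = {
    (START, "send"): START,
    (START, "read"): HAS_READ,
    (HAS_READ, "read"): HAS_READ,
    (HAS_READ, "send"): BAD,
}


def monitor(operations):
    """Run the automaton over a sequence of operations.

    Returns the state reached after each operation, stopping as soon
    as the bad state is entered (a security violation).
    """
    state = START
    visited = []
    for op in operations:
        state = TRANSITIONS[(state, op)]
        visited.append(state)
        if state == BAD:
            break
    return visited


# The trace from the slide: send(), read(f), send().
print(monitor(["send", "read", "send"]))  # prints ['start', 'has read', 'bad']
```

A real monitor would halt the untrusted program on entering `bad`; here the run simply stops so the visited states can be inspected.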
25Bounding Resource Use
[Diagram: automaton whose states 0, ..., i, ..., n - 1 track allocation, with malloc(i) transitions and a bad state]
- Policy "allocate fewer than n bytes"
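A resource-bound automaton of this kind can be sketched as a transition function over a byte counter. The function name `check_malloc`, the `limit` parameter (standing in for n), and the rejection of negative sizes are assumptions of this sketch, not part of the talk.

```python
BAD = "bad"


def check_malloc(allocated, size, limit):
    """Transition for malloc(size) under "allocate fewer than limit bytes".

    States are the byte counts 0 .. limit - 1, plus BAD.
    """
    # Assumption of this sketch: negative sizes are treated as violations.
    if allocated == BAD or size < 0:
        return BAD
    total = allocated + size
    # Stay inside the counting states only while the total is below limit.
    return total if total < limit else BAD


state = 0
for request in (10, 5, 1):
    state = check_malloc(state, request, 16)
print(state)  # prints bad
```

The first two requests move the automaton to state 15; the third would reach 16, which is not fewer than 16 bytes, so the automaton enters `bad`.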
26Enforcing S.A. Specs
- Every security-relevant operation op has an associated function checkop
- Trusted, provided by policy writer
- checkop implements the s.a. transition function
checksend(state) =
  if state = start then start
  else halt    (terminates execution)
27Enforcing S.A. Specs
- Easy: wrap all function calls in checks
- Improve performance using program analysis
send()
  becomes
let next_state = checksend(current_state) in send()
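The wrapping strategy can be sketched as follows. Everything named here is hypothetical: `SecurityViolation` stands in for halting the program, and `InstrumentedExtension` models an extension whose send and read calls have been wrapped by the compiler.

```python
class SecurityViolation(Exception):
    """Raised in place of halting the program (an assumption of this sketch)."""


def check_send(state):
    # Transition function for send: only allowed from the start state.
    if state == "start":
        return "start"
    raise SecurityViolation("send after read")


def check_read(state):
    # Transition function for read: always allowed, and remembered.
    return "has read"


class InstrumentedExtension:
    """An extension whose security-relevant calls are wrapped in checks."""

    def __init__(self):
        self.state = "start"
        self.log = []  # stands in for the real effects of send and read

    def send(self, message):
        self.state = check_send(self.state)  # check first
        self.log.append(("send", message))   # then perform the operation

    def read(self, name):
        self.state = check_read(self.state)
        self.log.append(("read", name))
```

Because the check runs before the operation, a `send` after a `read` raises `SecurityViolation` and the send itself never happens.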
28Outline
- Technology for improved reliability and security
- Idea I: certifying compilation
- Idea II: security via code instrumentation
- Secure certifying compilation (POPL '00)
- Security automaton specifications
- A dependently-typed target language (TAL)
- Related work and research directions
29Brief TAL Overview
[Diagram: typecheck, then link]
- Assembly or machine code with typing annotations
- Object files checked separately and linked together
- Ensures basic safety without run-time checks
- Memory safety: can't read/write arbitrary memory
- Control-flow safety: can't execute arbitrary data
- Type abstraction: TAL can encode and enforce high-level abstract data types
30A TAL Compiler
- TAL is practical
- We compile "safe C" (aka Popcorn)
- No pointer arithmetic, unsafe casts
- ML-style data types, polymorphism, exceptions
- Some simple optimizations
- null-check elimination, inlining, register allocation
- The compiler bootstraps
- most compiler hacking by Grossman, Morrisett, Smith
31Other TAL Features
- Memory management features
- Stack types
- Aliasing
- Region-based MM
- See Dave's thesis
- Other features
- Dynamic linking
- Run-time code generation
- http://www.cs.cornell/talc
32Typing Assembly Code
- Programs divided into labeled code blocks
- Each block has a code type {eax: τ1, ebx: τ2, ...}
- Code types specify expected register contents
- Assume code type to check the block
- Prove control transfers (jumps) meet the assumptions
Foo: {eax: int, ecx: {eax: int}}
  mov ebx, 3      ; {eax: int, ebx: int, ecx: {eax: int}}
  add eax, ebx    ; OK
  jmp ecx         ; OK
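The checking discipline above (assume the block's code type, then check each instruction and each jump) can be sketched as a toy checker. This is not TAL's actual type checker: the instruction encoding as tuples, the function name `check_block`, and the restriction to `mov`, `add`, and `jmp` are assumptions made for illustration.

```python
def check_block(code_type, instructions):
    """Type check one code block against its assumed code type.

    code_type maps register names to types; a type is either "int" or
    a nested dict (the code type of the label held in that register).
    Returns True if every instruction is well typed.
    """
    regs = dict(code_type)  # assume the code type on entry
    for instr in instructions:
        op = instr[0]
        if op == "mov":                      # mov dst, integer literal
            _, dst, value = instr
            if not isinstance(value, int):
                return False
            regs[dst] = "int"
        elif op == "add":                    # add dst, src
            _, dst, src = instr
            if regs.get(dst) != "int" or regs.get(src) != "int":
                return False
        elif op == "jmp":                    # jmp reg
            _, target = instr
            target_type = regs.get(target)
            if not isinstance(target_type, dict):
                return False                 # cannot jump to non-code
            # Every register the target expects must have that type now.
            for reg, expected in target_type.items():
                if regs.get(reg) != expected:
                    return False
        else:
            return False                     # unknown instruction
    return True


foo_type = {"eax": "int", "ecx": {"eax": "int"}}
foo_body = [("mov", "ebx", 3), ("add", "eax", "ebx"), ("jmp", "ecx")]
print(check_block(foo_type, foo_body))  # prints True
```

Jumping through a register that holds an int, or adding a register with no known type, is rejected; these are the control-flow and basic-operation errors that conventional type safety rules out.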
33Increasing Expressiveness
- Basic types ensure standard type safety
- functions and data used as intended and cannot be confused
- security checks can't be circumvented
- Introduce a logic into the type system to express security invariants
- Use the logic to encode the s.a. policy
- Use the logic to prove checks unnecessary
34Target Language Predicates
- States (for compile-time reasoning)
- constants start, has read, bad, ...
- variables ρ1, ρ2, ...
- Predicates
- describe security states
- instate(ρ)
- describe relationships between states
- transsend(ρ1, ρ2)
- describe dependencies between values
- (see the paper)
35Preconditions
- Code types can specify preconditions
- A typical use
foo: ∀[ρ, instate(ρ), ρ ≠ bad].{eax: τ1, ecx: τ2}
- instantiate polymorphic variable ρ
- prove residual preconditions
- e.g., instate(start), start ≠ bad
- hope proofs are easy (syntactic matching)
- otherwise place explicit proof at call site
- e.g., jmp foo [start, Proof, Proof]
bar: ...
  ...              ; Known: instate(start)
  ...
  jmp foo [start]
36Postconditions
- Expressed as a precondition on the return address type
- bar: {eax: τ1, ecx: ∀[instate(has read)].{eax: τ2}}
- Before returning, bar proves instate(has read)
- After return, assume instate(has read)
37Encoding Security Automata
- Each security-relevant function has a type specifying 3 preconditions, 1 postcondition
- the send function
- P1: instate(ρcurr)
- P2: transsend(ρcurr, ρnext)
- P3: ρnext ≠ bad
- Pre: P1, P2, P3
- P4: instate(ρnext)
- Post: P4
send: ∀[ρcurr, ρnext, P1, P2, P3].{ecx: ∀[P4].{}}
38Technical Note
- State predicates behave linearly
- as in linear logic, each state predicate is used once
- instate(ρcurr) is "consumed" at send call site
- can't be used in future proofs
- can't fool type system into thinking code continues to be in state ρcurr
- instate(ρnext) is "produced" on return
- will be used when next calling a security-sensitive function
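The consume/produce discipline can be modelled at run time with one-shot proof tokens. In the type system this bookkeeping happens statically; the sketch below only mimics it dynamically. The class `InState`, the table `TRANS_SEND`, and the use of `TypeError` for a failed proof obligation are assumptions of this sketch.

```python
# The send transitions of the "no send after read" automaton.
TRANS_SEND = {("start", "start"), ("has read", "bad")}


class InState:
    """A proof token for instate(state) that may be used only once."""

    def __init__(self, state):
        self.state = state
        self.used = False


def send(proof, next_state):
    """Model a call to send: consume instate(curr), produce instate(next)."""
    # P1: instate(curr) must still be available (linearity).
    if proof.used:
        raise TypeError("instate(%s) was already consumed" % proof.state)
    # P2: transsend(curr, next) must hold in the automaton.
    if (proof.state, next_state) not in TRANS_SEND:
        raise TypeError("cannot prove transsend(%s, %s)"
                        % (proof.state, next_state))
    # P3: next must differ from bad.
    if next_state == "bad":
        raise TypeError("next state is bad")
    proof.used = True            # instate(curr) is consumed here
    return InState(next_state)   # P4: instate(next) is produced on return


p0 = InState("start")
p1 = send(p0, "start")
print(p1.state, p0.used)  # prints start True
```

Passing `p0` to `send` a second time fails, which is the dynamic analogue of not being able to reuse a consumed state predicate in a later proof.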
39Compile-time Run-time
- Compile-time reasoning depends on run-time values
foo:  mov eax, state     ; should represent the current state
      mov ecx, ret1
      jmp checksend      ; state argument, state result in eax
ret1: push eax           ; save next state on the stack
      mov ecx, ret2
      jmp send           ; must establish precondition for send
checksend postcond. => precond. for ret1, send
40Checksend
- A type for checksend (first try)
- checksend:
- ∀[ρcurr, P1].{eax: state, ecx: ∀[ρnext, P1, P2, P3].{eax: state}}
- where
- P1 = instate(ρcurr), P2 = transsend(ρcurr, ρnext), P3 = ρnext ≠ bad
41Checksend
- A type for checksend (first try)
- No correspondence between run-time argument and static information
- checksend:
- ∀[ρcurr, P1].{eax: state, ecx: ∀[ρnext, P1, P2, P3].{eax: state}}
- where
- P1 = instate(ρcurr), P2 = transsend(ρcurr, ρnext), P3 = ρnext ≠ bad
  mov eax, wrong_state
  mov ecx, next
  jmp checksend
42Checksend
- Solution: provide very precise types
- Singleton types
- A type containing one value
- eax: state(start)
- means eax contains a data structure that represents exactly the start state and no other state
- eax: state(ρ)
- eax contains data representing the unknown state ρ
- useful in many contexts
- Similar to Dependent ML (Xi & Pfenning)
43Using Singletons
- checksend
- implements the automaton transition function
- intuitively has type state -> state
- singletons help relate run-time values to compile-time predicates
- ∀[ρcurr, P1].{eax: state(ρcurr), ecx: ∀[ρnext, P1, P2, P3].{eax: state(ρnext)}}
- P1 = instate(ρcurr), P2 = transsend(ρcurr, ρnext), P3 = ρnext ≠ bad
44Using Checksend
foo:  ...                      ; Assume instate(ρcurr), eax: state(ρcurr)
      mov ecx, ret1
      jmp checksend [ρcurr]
ret1: ∀[ρnext, instate(ρcurr), transsend(ρcurr, ρnext), ρnext ≠ bad].{eax: state(ρnext)}
      push eax
      mov ecx, ret2
      jmp send [ρcurr, ρnext]  ; P1, P2, P3 => ok
ret2: ...
45Optimization
- Analysis of s.a. structure makes redundant check elimination possible
- e.g.,
- identify transsend(start,start) as valid
[Diagram: the "no send after read" automaton with states start, has read, bad; send loops from start back to start]
46Optimization
[Diagram: the policy and the high-level interface are combined into a low-level interface]
Low-level Interface: types for send, read, checksend, and checkread, plus Axiom A: transsend(start,start)
47Optimization
- Type-checker is simple but general
- Typical optimizations
- redundant check removal
- loop invariant removal
loop: ∀[instate(start)].{}
      mov ecx, loop
      jmp send [start, start, By A]
send: ∀[ρcurr, ρnext, instate(ρcurr), transsend(ρcurr, ρnext), ρnext ≠ bad].{ecx: ∀[P4].{}}
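Redundant check removal can be illustrated with a small instrumentation pass that tracks the automaton state statically and emits a dynamic check only where safety cannot be established. This is a simplified sketch, not the compiler described in the talk: the names `TRANSITIONS` and `instrument`, and the choice to forget the state after any dynamic check, are assumptions.

```python
BAD = "bad"

# Transition function of the "no send after read" automaton.
TRANSITIONS = {
    ("start", "send"): "start",
    ("start", "read"): "has read",
    ("has read", "read"): "has read",
    ("has read", "send"): BAD,
}


def instrument(operations, known_state=None):
    """Insert a dynamic check only where safety is not statically known.

    known_state is the automaton state known at entry, or None if the
    state is statically unknown.
    """
    output = []
    state = known_state
    for op in operations:
        nxt = TRANSITIONS.get((state, op))  # None when state is unknown
        if nxt is None or nxt == BAD:
            output.append("check_" + op)    # keep the dynamic check
            nxt = None                      # simplification: forget the state
        output.append(op)
        state = nxt
    return output


print(instrument(["send", "send", "read", "send"], "start"))
# prints ['send', 'send', 'read', 'check_send', 'send']
```

With `instate(start)` known on entry, the first two sends need no check because the send transition from start returns to start, which is the fact recorded by Axiom A. The send after the read keeps its check, which halts the program at run time.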
48Implementation
- TALx86 implementation is sufficient for these encodings
- includes polymorphism, higher-order type constructors, logical connectives, singleton types, ...
- Lots more work to be done
- axioms in module interfaces
- policy compiler
49Outline
- Technology for improved reliability and security
- Idea I: certifying compilation
- Idea II: security via code instrumentation
- Secure certifying compilation (POPL '00)
- Security automaton specifications
- A certifying target language
- Related work and research directions
50Research Directions
- Design of policy languages
- What kinds of logics can we compile and certify?
- Mawl (Sandholm & Schwartzbach)
- TALres (Crary & Weirich)
- Design of safety architecture
- How do we "clean up" after halting a program?
- Support for mutually distrustful agents
- Policy-directed optimizations
51Summary
- A recipe for secure certified code
- types
- ensure basic safety without run-time overhead
- add a logic to encode complex invariants
- policy-directed code instrumentation
- specify security policies independently of the rest of the system
- use dynamic checking to enforce policies when they can't be proven statically