1
CS711: Overview of PCC

Greg Morrisett
Cornell University
Thanks to G. Necula and P. Lee

2
Papers for this Lecture

G. Necula. Proof-Carrying Code. PoPL '97.
G. Necula and P. Lee. Safe Kernel Extensions Without Run-Time Checking. OSDI '96.
G. Necula and P. Lee. The Design and Implementation of a Certifying Compiler. PLDI '98, June 1998. pldi98.ps
I also highly recommend Necula's PhD thesis (CMU).

3
Ideally
(Diagram labels: trusted computing base, security policy, your favorite language, verifier, system binary, low-level IL, optimizer, machine code)
4
Idea 1: Theorem Prover!
(Diagram labels: trusted computing base, NuPRL, security policy, your favorite language, system binary, low-level IL, optimizer, machine code)
5
Unfortunately...
(Diagram labels: trusted computing base, NuPRL)

6
Observation
Finding a proof is hard, but verifying a proof
is easy.
7
PCC
(Diagram labels: trusted computing base, verifier, security policy, optimizer, system binary, machine code, prover, certified binary, code, proof, invariants)
8
Making Proof Rigorous

Specify machine-code semantics and security policy using axiomatic semantics.
{Pre} ld r2,r1(i) {Post}
Given:
a security policy (i.e., axiomatic semantics and associated logic for assertions)
untrusted code
annotated with invariant assertions
it's possible to calculate a verification condition:
an assertion A such that
if A is true then the code respects the policy.

9
The Client

The client takes its code and the policy, then:
constructs some loop invariants.
constructs the verification condition A from the code, policy, and loop invariants.
constructs a proof that A is true.

(Diagram: certified binary containing code, proof, and invariants)
10
Verification

The Verifier (4-6 pages of C code):
takes code, loop invariants, and policy
calculates the verification condition A
checks that the proof is a valid proof of A:
fails if some step doesn't follow from an axiom or inference rule
fails if the proof is valid, but not a proof of A
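
The two failure modes above can be illustrated with a toy checker. This is a minimal sketch and not Necula's verifier: the proof format (a list of steps citing earlier steps), the rule names `axiom`, `and_intro`, and `mp`, and the tuple encoding of formulas are all hypothetical choices made for illustration.

```python
# Toy proof checker. Formulas are strings (atoms) or nested tuples such as
# ("and", P, Q) and ("imp", P, Q). A proof is a list of steps
# (rule, premise_indices, conclusion); premises refer to earlier steps.

def check_step(rule, premises, concl, axioms):
    """Return True if `concl` follows from `premises` by `rule`."""
    if rule == "axiom":
        return not premises and concl in axioms
    if rule == "and_intro":          # from P and Q infer P /\ Q
        return len(premises) == 2 and concl == ("and", premises[0], premises[1])
    if rule == "mp":                 # from P => Q and P infer Q
        return len(premises) == 2 and premises[0] == ("imp", premises[1], concl)
    return False                     # unknown rule


def check_proof(proof, goal, axioms):
    """Accept only a well-formed proof whose last line is exactly `goal`."""
    derived = []
    for rule, premise_ids, concl in proof:
        if any(i < 0 or i >= len(derived) for i in premise_ids):
            return "invalid step"    # cites a line that is not derived yet
        premises = [derived[i] for i in premise_ids]
        if not check_step(rule, premises, concl, axioms):
            return "invalid step"
        derived.append(concl)
    if not derived or derived[-1] != goal:
        return "valid proof, wrong theorem"
    return "ok"


axioms = {"p", ("imp", "p", "q")}
proof = [
    ("axiom", [], ("imp", "p", "q")),
    ("axiom", [], "p"),
    ("mp", [0, 1], "q"),
]
print(check_proof(proof, "q", axioms))                  # ok
print(check_proof(proof, ("and", "p", "q"), axioms))    # valid proof, wrong theorem
```

The checker never searches: it only replays steps, which is why it can stay small.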

(Diagram: certified binary containing code, proof, and invariants)
11
Advantages of PCC

In Principle
Simple, small, and fast TCB.
No external authentication or cryptography.
No additional run-time checks.
Tamper-proof.
Precise and expressive specification of code
safety policies.

(Diagram: code, proof, invariants)
12
An Experiment: Packet Filters

Safety Policy
given a packet, returns yes/no
packet is read-only, small scratchpad
no loops
Compare
Berkeley Packet Filter Interpreter
Modula-3 (but turn off type-checking)
Software Fault Isolation (sandboxing)
PCC (hand-optimized, proved)

13
Results

PCC wins

14
Is PCC the answer?

PCC seems to offer everything we need
small, simple trusted computing base
optimize all you want, any language, any security
policy, etc.
But how do we make it scale to real programs?

15
Scaling Problem 1

How to generate proofs?
Manual construction is too painful for real
programs.
Interactive theorem provers are really only
feasible for a relatively small fraction of the
code.
We need something that's fully automatic most of
the time.

16
One Approach

Restrict the safety policy to type safety.
Necessary for most policies anyway
cannot execute code or access data for which you
do not have a capability.
type systems are a meta-policy that allow
programmers to define fine-grained notions of
capability and access.
abstract types, interfaces, static scope, etc.
Start with a well-typed, high-level program
you have a proof for the high-level code
preserve the proof as you compile

17
Type-Preserving Compilation
(Diagram labels: source code, type-checker, optimizer, code generator, binary, proof of type-safety (shown twice))
18
Touchstone (Necula)

Compiles type-safe subset of C to certified
binaries for the DEC Alpha.
Security policy is type-safety
parameters of the right type to functions
values of the right type in arrays, structs
array indices in bounds
Highly-optimizing
competitive with GCC, DEC cc
eliminates array bound checks when possible

19
Touchstone Performance
In spite of the fact that C compilers do not
insert array bound checks, Touchstone
is competitive.
20
Touchstone Compilation Time

Geometric means:
compilation 75%
VC generation 2%
proving 21%
proof checking 2%

21
JVM vs. Touchstone

JVM
portable
Touchstone
extremely good performance
extremely small TCB
fast verification

22
However...

Touchstone's type system suits only one very simple language:
no abstract data types, objects, etc.
no threads
Proof size was an issue:
proofs were 1-3x the size of the code, just for a really simple notion of type-safety,
but recent work by Necula shows that this can be compressed down to a tiny overhead (e.g., 10%).

23
Touchstone proof size
Touchstone's proof size relative to code
and invariant annotations.
24
Summary thus far...

Proof-carrying code is great in principle.
It's the right general framework.
For special-purpose applications, it can't be beat.
But for general-purpose extensions:
Need some way to get the proof automatically
(limit policy to type-safety).
Engineering proof size is an issue.
Compiling high-level languages is an issue.

25
Design Details
(Diagram labels: server, client, safety policy, certifying compiler, source, VC generator, logic, VC, code, theorem prover, proof checker, proof. One side is marked "untrusted, complex, slow"; the other "trusted, simple, fast".)
26
Abstract Machine

Instructions (from DEC Alpha):
ADD/SUB rs, Op, rd   (Op ::= n | r)
LD rd, n(rs);  ST rs, n(rd)
BEQ/NE rs, n;  RET
INV(P)
States: (R, pc)
R(r) is a 64-bit integer
R(mem) is memory: Int64 -> Int64
pc is the current program counter
Expressions:
e ::= n | r | e1 + e2 | e1 - e2 | sel(m, e)
m ::= mem | upd(m, e1, e2)

27
Semantics

(R, pc) -> (R', pc')
relative to a fixed instruction sequence S
Rewriting rules:
R' = R[rd := R(rs) + R(rt)], pc' = pc + 1   if S(pc) = ADD rs, rt, rd
R' = R[rd := sel(R(m), R(rs) + n)], pc' = pc + 1   if S(pc) = LD rd, n(rs) and readable(R, rs, n)
R' = R[m := upd(R(m), R(rd) + n, R(rs))], pc' = pc + 1   if S(pc) = ST rs, n(rd) and writeable(R, rd, n)
R' = R, pc' = pc + n + 1   if S(pc) = BEQ rs, n and R(rs) = 0 (and pc + n + 1 in 0..S.size-1)
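
The rewriting rules above can be read as a one-step interpreter. The following is a minimal sketch under stated assumptions: instructions are Python tuples, the register file is a dict whose key `"m"` holds memory, and `readable`/`writeable` are modeled as plain sets of addresses (the slide leaves them abstract). ADD is restricted to register operands, and the fall-through case of BEQ, which the slide does not list, is assumed to advance to `pc + 1`.

```python
MASK = (1 << 64) - 1  # registers hold 64-bit integers


def step(S, R, pc, readable, writeable):
    """One rewriting step (R, pc) -> (R', pc'); raises if no rule applies."""
    op, *args = S[pc]
    R2 = dict(R)                          # R' starts as a copy of R
    if op == "ADD":                       # ADD rs, rt, rd
        rs, rt, rd = args
        R2[rd] = (R[rs] + R[rt]) & MASK
        return R2, pc + 1
    if op == "LD":                        # LD rd, n(rs)
        rd, n, rs = args
        addr = (R[rs] + n) & MASK
        if addr not in readable:
            raise RuntimeError("stuck: address not readable")
        R2[rd] = R["m"].get(addr, 0)      # sel(R(m), R(rs) + n)
        return R2, pc + 1
    if op == "ST":                        # ST rs, n(rd)
        rs, n, rd = args
        addr = (R[rd] + n) & MASK
        if addr not in writeable:
            raise RuntimeError("stuck: address not writeable")
        R2["m"] = {**R["m"], addr: R[rs]}  # upd(R(m), R(rd) + n, R(rs))
        return R2, pc + 1
    if op == "BEQ":                       # BEQ rs, n
        rs, n = args
        target = pc + n + 1 if R[rs] == 0 else pc + 1   # fall-through assumed
        if not 0 <= target < len(S):
            raise RuntimeError("stuck: branch target out of range")
        return R2, target
    raise RuntimeError(f"no rule for {op}")


S = [("LD", "r1", 0, "r0"), ("ST", "r1", 8, "r0")]
R = {"r0": 100, "r1": 0, "m": {100: 7}}
R, pc = step(S, R, 0, readable={100, 108}, writeable={108})
R, pc = step(S, R, pc, readable={100, 108}, writeable={108})
print(R["r1"], R["m"][108], pc)   # 7 7 2
```

A "stuck" state here corresponds to a policy violation: the machine has no rule to apply, which is exactly what a valid proof of the verification condition rules out.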

28
Predicates

P ::= true | false | P1 /\ P2 | P1 => P2 | All x.P | e1 = e2 | e1 != e2 | e : T
T ::= RO | RW
Quantifiers range over numbers and are meant to hold in every state.
e : T is a predicate asserting that e has type T.
Example pre-condition:
r0 : RO /\ (r0 + 8) : RO /\ ((sel(m, r0) != 0) => (r0 + 8) : RW)
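
To make the example pre-condition concrete, it can be evaluated in a single machine state. This is only a sketch: `ro` and `rw` are hypothetical sets of readable and writable addresses standing in for the RO and RW types, memory is a dict, and the "holds in every state" reading of quantifiers is not modeled.

```python
def holds_pre(r0, m, ro, rw):
    """Evaluate the example pre-condition in one concrete state."""
    tag = m.get(r0, 0)                              # sel(m, r0)
    # P => Q is (not P) or Q: the data word must be writable only if tag != 0
    implication = (tag == 0) or ((r0 + 8) in rw)
    return (r0 in ro) and ((r0 + 8) in ro) and implication


ro = {100, 108}          # every writable address is also readable
print(holds_pre(100, {100: 1}, ro, rw={108}))    # True: tag set, data writable
print(holds_pre(100, {100: 1}, ro, rw=set()))    # False: tag set, data read-only
print(holds_pre(100, {100: 0}, ro, rw=set()))    # True: tag clear, nothing to show
```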

29
Axioms and Proof Rules

The usual ones for predicate logic
Some rules for reasoning about 64-bit arithmetic values
Rules for reasoning about memory:
sel(upd(m, e1, e2), e3) = e2   when e1 = e3
sel(upd(m, e1, e2), e3) = sel(m, e3)   when e1 != e3
upd(upd(m, e1, e2), e3, e4) = upd(upd(m, e3, e4), e1, e2)   when e1 != e3
Note: aliasing strikes again!
Rules for reasoning about types:
e : RW => e : RO
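
The memory rules can be checked against a simple model. This is a sketch that assumes memory is a Python dict and that `upd` is a pure function returning a fresh memory; the default value 0 for unwritten cells is an arbitrary choice.

```python
def upd(m, addr, val):
    """Functional update: returns a new memory; `m` itself is unchanged."""
    new = dict(m)
    new[addr] = val
    return new


def sel(m, addr):
    """Read an address (unwritten cells default to 0 in this sketch)."""
    return m.get(addr, 0)


m = {0: 10, 8: 20}
# sel(upd(m, e1, e2), e3) = e2 when e1 = e3
assert sel(upd(m, 8, 99), 8) == 99
# sel(upd(m, e1, e2), e3) = sel(m, e3) when e1 != e3
assert sel(upd(m, 8, 99), 0) == sel(m, 0)
# updates commute when the addresses differ ...
assert upd(upd(m, 0, 1), 8, 2) == upd(upd(m, 8, 2), 0, 1)
# ... but not when they alias: the last write wins
assert upd(upd(m, 8, 1), 8, 2) != upd(upd(m, 8, 2), 8, 1)
```

The side conditions on e1 and e3 are where aliasing shows up: a prover must establish equality or disequality of addresses before it can simplify.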

30
Notes on Axioms

When you scale PCC up
you still need a rich type system to specify
interfaces (i.e., pre-conditions)
you still have to prove the consistency and
soundness of your axioms w.r.t. the machine
i.e., you still have to write down a TAL and
prove its soundness
you'll tend to use the same type invariance
tricks to ensure soundness

31
Verification Conditions

VC(i) =
[rs + rt / rd] VC(i+1)   if S(i) = ADD rs, rt, rd
(rs + n) : RO /\ [sel(m, rs + n) / rd] VC(i+1)   if S(i) = LD rd, n(rs)
(rd + n) : RW /\ [upd(m, rd + n, rs) / m] VC(i+1)   if S(i) = ST rs, n(rd)

32
VC continued

VC(i) =
(rs = 0 => VC(i+n+1)) /\ (rs != 0 => VC(i+1))   when S(i) = BEQ rs, n
PostCondition   when S(i) = RET
P   when S(i) = INV(P)
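
The VC rules translate almost directly into a recursive function. The following is a minimal sketch, not Necula's VCGen: the tuple encoding of instructions, expressions, and predicates is a hypothetical choice, there is no simplification, and the recursion only terminates when every cycle is cut by an INV. The separate obligation that each invariant implies the VC of its continuation is not generated here.

```python
def subst(t, var, e):
    """[e / var] t: replace register (or memory) `var` by expression `e` in t."""
    if t == var:
        return e
    if isinstance(t, tuple):              # keep the tag, rewrite the arguments
        return (t[0],) + tuple(subst(x, var, e) for x in t[1:])
    return t


def vc(S, i, post):
    """Verification condition at instruction i, computed backwards from RET."""
    op, *a = S[i]
    if op == "ADD":                       # [rs + rt / rd] VC(i+1)
        rs, rt, rd = a
        return subst(vc(S, i + 1, post), rd, ("+", rs, rt))
    if op == "LD":                        # LD rd, n(rs)
        rd, n, rs = a
        addr = ("+", rs, n)
        rest = subst(vc(S, i + 1, post), rd, ("sel", "m", addr))
        return ("and", ("type", addr, "RO"), rest)
    if op == "ST":                        # ST rs, n(rd)
        rs, n, rd = a
        addr = ("+", rd, n)
        rest = subst(vc(S, i + 1, post), "m", ("upd", "m", addr, rs))
        return ("and", ("type", addr, "RW"), rest)
    if op == "BEQ":                       # BEQ rs, n
        rs, n = a
        return ("and",
                ("imp", ("eq", rs, 0), vc(S, i + n + 1, post)),
                ("imp", ("ne", rs, 0), vc(S, i + 1, post)))
    if op == "RET":
        return post
    if op == "INV":                       # INV(P): the invariant itself
        return a[0]
    raise ValueError(f"unknown instruction {op}")


def show(t):
    """Render an expression or predicate in the slides' notation."""
    if not isinstance(t, tuple):
        return str(t)
    tag, *xs = t
    if tag == "true":
        return "true"
    if tag == "+":
        return f"{show(xs[0])}+{show(xs[1])}"
    if tag == "type":
        return f"({show(xs[0])}) : {xs[1]}"
    if tag in ("and", "imp", "eq", "ne"):
        sym = {"and": "/\\", "imp": "=>", "eq": "=", "ne": "!="}[tag]
        return f"({show(xs[0])} {sym} {show(xs[1])})"
    return f"{tag}({', '.join(show(x) for x in xs)})"   # sel, upd


# The three-instruction program from the example slides, post-condition true.
S = [("ADD", "r2", 5, "r2"), ("LD", "r1", 3, "r2"), ("ST", "r5", 1, "r1"), ("RET",)]
print(show(vc(S, 0, ("true",))))
```

Because nothing is simplified, the printed condition carries a trailing conjunct `true` and nested parentheses; apart from that it should agree with the final pre-condition worked out by hand in the example slides.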

33
Notes on VCGen

Computes the weakest pre-condition of the program if you start from the post-condition at the RET(s) and work back.
Need to cut cycles (back-edges in the CFG) with INV nodes or more properly.
Note that INV isn't trusted: it's assumed for the continuation, but verified if you ever get to it.
Accomplished by adding INV => VC(i+1) to the final safety predicate.
Now all you need is a proof that the VC produced by VCGen is implied by the pre-condition.

34
Example

add r2,r2,5
ld r1, r2(3)
st r5, r1(1)
{true}

35
Example
add r2,r2,5
ld r1, r2(3)
{(r1+1) : RW /\ true}
st r5, r1(1)
{true}
36
Example
add r2,r2,5
{(r2+3) : RO /\ (sel(m, r2+3) + 1) : RW}
ld r1, r2(3)
{(r1+1) : RW}
st r5, r1(1)
{true}
37
Example
{(r2+5+3) : RO /\ (sel(m, r2+5+3) + 1) : RW}
add r2,r2,5
{(r2+3) : RO /\ (sel(m, r2+3) + 1) : RW}
ld r1, r2(3)
{(r1+1) : RW}
st r5, r1(1)
{true}
38
Proof Representation

Use a variant of LF to represent assertions and
proofs.
write down assertion language
write down inference rules for the logic
proof-checking becomes LF type-checking
decouples the logic and assertion language from
the verifier.
of course, you still have to establish the
soundness and consistency of the logic that you
encode within LF.
and some logics (e.g., linear or temporal or
modal) do not encode so nicely into LF (see Twelf)

39
Representing LF Proofs

In practice, LF proof objects are HUGE.
Recent work on proof oracles compresses this down to nothing (PoPL 2001?).
Assume you can match the goal against the conclusions of the proof rules (e.g., by 1st-order unification). If you can't match with this, then force the representation to contain more information.
Only some (small) subset of the rules will apply (say k of them),
so you only need to spit out lg(k) bits to indicate which rule is actually used in the proof.
The matching then lets you establish the sub-goals that need to be proven.
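
The lg(k)-bit idea can be sketched on a toy rule set. Everything here is an assumption made for illustration: goals are ground atoms, "matching" is plain equality instead of first-order unification, the rule names are made up, and the rules are assumed acyclic so the naive search terminates. `prove` plays the untrusted producer that searches and emits oracle bits; `check` plays the consumer that only follows them.

```python
def bits_needed(k):
    """ceil(lg k) bits select one of k applicable rules (0 bits when k <= 1)."""
    return (k - 1).bit_length() if k > 1 else 0


def prove(goal, rules):
    """Untrusted side: search for a proof; return the oracle bits, or None."""
    applicable = [r for r in rules if r[0] == goal]
    nbits = bits_needed(len(applicable))
    for index, (_, premises) in enumerate(applicable):
        bits = [(index >> s) & 1 for s in reversed(range(nbits))]
        for p in premises:
            sub = prove(p, rules)
            if sub is None:
                break                     # this rule fails, try the next one
            bits += sub
        else:
            return bits                   # every sub-goal was proven
    return None


def check(goal, rules, bits):
    """Trusted side: no search, just follow the oracle (an iterator of bits)."""
    applicable = [r for r in rules if r[0] == goal]
    k = len(applicable)
    if k == 0:
        return False
    index = 0
    for _ in range(bits_needed(k)):
        index = (index << 1) | next(bits, 0)   # missing bits read as 0
    if index >= k:
        return False
    return all(check(p, rules, bits) for p in applicable[index][1])


rules = [
    ("safe", ["signed"]),          # dead end: nothing proves "signed"
    ("safe", ["ro", "rw"]),
    ("ro", []),
    ("rw", ["tag_nonzero"]),
    ("tag_nonzero", []),
]
oracle = prove("safe", rules)
print(oracle)                                  # [1]
print(check("safe", rules, iter(oracle)))      # True
print(check("safe", rules, iter([0])))         # False
```

Only the choice between the two rules for `safe` costs a bit; goals with a single applicable rule cost nothing, which is where the compression comes from.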

40
Where PCC stands

Cedilla has built a certifying compiler for Java.
generates optimized x86 code
but you can write your own code too!
uses a Nelson-Oppen-style prover
The proof checker is actually machine independent
map object code up to a machine-independent IL
(Secure Assembly Language)
proofs are with respect to the SAL code
retargeting the prover to another machine just
involves writing a (correct) mapping from the
machine code to SAL.

41
Foundational PCC (Appel, Felty)

Eliminate more trust from PCC
logic encoded into LF
implicit machine semantics
Rather, encode things from the machine semantics
up.
you prove w.r.t. the semantics that {Pre} C {Post}
is valid.
Interesting observation
to do any reasonable proof, you start introducing
types or invariants that look suspiciously like
TAL
except that you have a semantic encoding as to
what the TAL types mean w.r.t. the machine.