Mobility, Security, and Proof-Carrying Code (Peter Lee, Carnegie Mellon University)



1
Mobility, Security, and Proof-Carrying Code
Peter Lee, Carnegie Mellon University
  • Lecture 3
  • July 12, 2001
  • VC Generation and Proof Representation

Lipari School on Foundations of Wide Area Network Programming
2
Whew!
3
Recap
  • When the host system receives certified code, it
      • inspects the code, generating verification
        conditions (VCs), and
      • finds a proof for each VC (if it can).
  • Abstractly, one thinks of generating a single
    predicate, which is the conjunction of all the
    VCs.
  • Generation of VCs is done relative to a safety
    policy.

4
High-Level Architecture
[Diagram labels: Agent, Code, Explanation, Host, Verification condition generator, Checker, Safety policy]
5
What Is a Safety Policy?
  • Yesterday, we gave the intuition of a reference
    interpreter that aborts the program just prior to
    any unsafe operation.
  • In this case, the reference interpreter
    essentially defines the safety policy.

6
Safety Policies
  • More formally, we begin by defining the
    small-step operational semantics of a machine,
    call it the s86.
  • $(\Pi, \rho, pc) \xrightarrow{\text{instr}} (\rho', pc')$,
    where $\Pi$ is the program, $\rho$ the register
    state, and $pc$ the program counter.
  • We define the machine so that only safe
    executions are defined.
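
One way to picture a machine in which only safe executions are defined is a step function that is partial. The sketch below is not the s86: it is a hypothetical three-instruction machine (the instruction names, `SAFE_ADDRESSES`, and `step` are assumptions made for illustration) whose `step` returns `None` exactly where no safe transition exists.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy machine; this is an illustration, not the s86 itself.
State = Tuple[Dict[str, int], int]      # (register state, program counter)
SAFE_ADDRESSES = range(0x1000, 0x1010)  # assumed set of readable addresses


def step(program: List[tuple], regs: Dict[str, int], pc: int) -> Optional[State]:
    """Take one small step; return None when no safe transition is defined."""
    if not 0 <= pc < len(program):
        return None
    op, *args = program[pc]
    if op == "mov":                     # ("mov", dst, constant)
        dst, value = args
        return ({**regs, dst: value}, pc + 1)
    if op == "add":                     # ("add", dst, src)
        dst, src = args
        return ({**regs, dst: regs[dst] + regs[src]}, pc + 1)
    if op == "load":                    # ("load", dst, address_register)
        dst, addr = args
        if regs[addr] not in SAFE_ADDRESSES:
            return None                 # unsafe read: the machine is stuck
        return ({**regs, dst: 0}, pc + 1)   # memory contents abstracted as 0
    return None
```

A progress proof for a program then amounts to showing that `step` never returns `None` on any reachable state, which is what the verification conditions are meant to establish.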
7
Safety Policies, cont'd
  • For convenience we choose the s86 to be a
    restriction of the x86.
  • Hence all s86 programs will execute faithfully on
    a real x86.
  • The goal then is to prove that any given program
    always makes progress (or returns) in the s86.
  • With such a proof, the x86 is then just as good
    as an s86.

8
Verification Conditions
  • The point of the verification conditions, then,
    is to provide such progress theorems for each
    instruction in the program.
  • In other words, a VC's validity says that the
    corresponding instruction has a defined
    execution in the s86 operational semantics.

9
Symbolic Evaluator
  • We can define the verification condition
    generator (VCGen) via a symbolic evaluator
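  • $\mathrm{SE}_{\Pi,\Sigma,\rho_0,\mathrm{Post}}(i, \rho, L)$
  • The result of symbolic evaluation is a
    conjunction of VCs, so the overall progress
    theorem is then
  • $\mathrm{Pre} \Rightarrow \mathrm{SE}_{\Pi,\Sigma,\rho_0,\mathrm{Post}}(i, \rho, L)$

Slide callouts: annotations, LF signature, entry point, postcondition.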
10
Soundness
  • For particular operational semantics (a safe x86
    and a safe Alpha), we have presented theorems
    that say, essentially
  • Thm: If $\mathrm{Pre} \Rightarrow \mathrm{SE}_{\Pi,\Sigma,\rho_0,\mathrm{Post}}(i, \rho, L)$, then
    execution of $\Pi$, given Pre and $\rho_0$, and starting
    from entry point $i$, will always make progress (or
    return).

11
Getting from Concept to Implementation
  • In an actual implementation, it is also handy to
    have a bit more than just a VC generator.
  • Precise syntax for VCs.
  • Pre/post-conditions for each entry point expected
    by the host in any downloaded code.
  • Precisely specified logical system for proving
    the VCs.

12
Safety Policy Implementations
  • Safety policies are thus given in three parts
  • A verification-condition generator (VCGen).
  • A specification of the pre/post-conditions for
    all required procedures.
  • A specification of the inference rules for
    constructing valid proofs.
  • LF is used for the rule and pre/post
    specifications, C for the VCGen.

13
C?!@@!
  • The use of C to define and implement the VCGen
    is, at best, expedient and at worst dubious.
  • However, since any code-inspection system must
    parse object files (not trivial!) and understand
    the instruction set, this seems to have practical
    benefits.
  • Clearly, a more formal approach would be
    desirable.

14
Example: Java Type-Safety Specification
  • Our largest example of a safety-policy
    specification is for the SpecialJ Java
    native-code compiler.
  • It contains about 140 inference rules.
  • Roughly speaking, these rules can be separated
    into 5 classes.

15
Safety Policy: Rule Excerpts
1. Standard syntax and rules for first-order
logic.

Syntax of predicates:
  /\  : pred -> pred -> pred.
  \/  : pred -> pred -> pred.
  =>  : pred -> pred -> pred.
  all : (exp -> pred) -> pred.

Type of valid proofs, indexed by predicate:
  pf : pred -> type.

Inference rules:
  truei : pf true.
  andi  : {P:pred} {Q:pred} pf P -> pf Q -> pf (/\ P Q).
  andel : {P:pred} {Q:pred} pf (/\ P Q) -> pf P.
  ander : {P:pred} {Q:pred} pf (/\ P Q) -> pf Q.
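
To see what "proofs indexed by predicate" buys, here is a minimal sketch of a checker for the conjunction rules. It does not implement LF: the tuple encoding, the `assume` node, and the `check` function are assumptions made for illustration. The idea it mirrors is that checking a proof means computing the predicate the proof term establishes and rejecting ill-formed rule applications.

```python
# Predicates and proofs are nested tuples; a hypothetical encoding, not LF.
TRUE = ("true",)


def conj(p, q):
    """Build the predicate (/\\ p q)."""
    return ("and", p, q)


def check(proof, assumptions):
    """Return the predicate that `proof` proves, or raise ValueError."""
    rule = proof[0]
    if rule == "assume":                  # ("assume", name)
        return assumptions[proof[1]]
    if rule == "truei":                   # ("truei",)
        return TRUE
    if rule == "andi":                    # ("andi", proof_of_P, proof_of_Q)
        return conj(check(proof[1], assumptions), check(proof[2], assumptions))
    if rule in ("andel", "ander"):        # ("andel", proof_of_P_and_Q)
        pred = check(proof[1], assumptions)
        if pred[0] != "and":
            raise ValueError(f"{rule} needs a conjunction, got {pred}")
        return pred[1] if rule == "andel" else pred[2]
    raise ValueError(f"unknown rule {rule}")
```

For example, from an assumption `h` proving `(/\ p q)`, the term `("andi", ("ander", ("assume", "h")), ("andel", ("assume", "h")))` checks as a proof of `(/\ q p)`.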
16
Safety Policy: Rule Excerpts
2. Syntax and rules for arithmetic and equality.
csuble means ≤ in the x86 machine.

  =  : exp -> exp -> pred.
  <> : exp -> exp -> pred.

  eq_le   : {E:exp} {E':exp} pf (csubeq E E') -> pf (csuble E E').
  moddist : {E:exp} {E':exp} {D:exp}
            pf (= (mod (+ E E') D) (mod (+ (mod E D) E') D)).
  sym     : {E:exp} {E':exp} pf (= E E') -> pf (= E' E).
  <>sym   : {E:exp} {E':exp} pf (<> E E') -> pf (<> E' E).
  =tr     : {E:exp} {E':exp} {E'':exp}
            pf (= E E') -> pf (= E' E'') -> pf (= E E'').
17
Safety Policy: Rule Excerpts
3. Syntax and rules for the Java type system.

  jint    : exp.
  jfloat  : exp.
  jarray  : exp -> exp.
  jinstof : exp -> exp.
  of      : exp -> exp -> pred.

  faddf : {E:exp} {E':exp}
          pf (of E jfloat) -> pf (of E' jfloat) ->
          pf (of (fadd E E') jfloat).
  ext   : {E:exp} {C:exp} {D:exp}
          pf (jextends C D) -> pf (of E (jinstof C)) ->
          pf (of E (jinstof D)).
18
Safety Policy: Sample Rules
4. Rules describing the layout of data structures.

  aidxi    : {I:exp} {LEN:exp} {SIZE:exp}
             pf (below I LEN) ->
             pf (arridx (add (imul I SIZE) 8) SIZE LEN).
  wrArray4 : {M:exp} {A:exp} {T:exp} {OFF:exp} {E:exp}
             pf (of A (jarray T)) ->
             pf (of M mem) ->
             pf (nonnull A) ->
             pf (size T 4) ->
             pf (arridx OFF 4 (sel4 M (add A 4))) ->
             pf (of E T) ->
             pf (safewr4 (add A OFF) E).

Here (sel4 M (add A 4)) means the result of reading 4 bytes
from heap M at address A+4.
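
The layout these rules encode can be read off directly: the array length is the word at A+4, and element I of an array with element size SIZE sits at offset I*SIZE + 8. A small sketch of that arithmetic follows; the function names are hypothetical, and reading `below` as an unsigned "less than" is an assumption.

```python
HEADER_BYTES = 8     # elements start at offset 8, as in the aidxi rule
LENGTH_OFFSET = 4    # the length word sits at A + 4, as in wrArray4


def length_address(base):
    """Address of the array's length word."""
    return base + LENGTH_OFFSET


def element_offset(index, size, length):
    """Offset of element `index`, or None when the bounds check fails."""
    if not 0 <= index < length:      # assumed meaning of (below index length)
        return None
    return index * size + HEADER_BYTES
```

This is the same address computation as the x86 operand 8(ebx, edx, 4) used for array accesses in the compiled example code: base + index*4 + 8.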
19
Safety Policy: Sample Rules
5. Quick hacks.

  nlt0_0 : pf (csubnlt 0 0).
  nlt1_0 : pf (csubnlt 1 0).
  nlt2_0 : pf (csubnlt 2 0).
  nlt3_0 : pf (csubnlt 3 0).
  nlt4_0 : pf (csubnlt 4 0).

Sometimes unclean things are put into the
specification...
20
How Do We Know That It's Right?
21
Homework Exercise
  • 4. Some of the proof rules are specific to the
    type system of the source language (Java), even
    though we are actually verifying x86 machine
    code.
  • Why has this been done?

22
A Note about Memory
  • We define a type for valid heap memory states
  • mem : exp
  • and operators for reading and writing heap
    memory
  • (sel M A)
  • (upd M A E)
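
As a concrete reading of these operators, memory states can be treated as immutable values: `upd` yields a new state and never changes the old one. The sketch below models a memory state as a Python dict, which is an assumption made for illustration; the slides treat `sel` and `upd` as uninterpreted operators of the logic.

```python
def upd(mem, addr, value):
    """Return a new memory state equal to `mem` except at `addr`."""
    new_mem = dict(mem)      # copy, so the old state stays valid
    new_mem[addr] = value
    return new_mem


def sel(mem, addr):
    """Read the value stored at `addr` in memory state `mem`."""
    return mem[addr]
```

With this model, `sel(upd(m, a, e), a)` gives `e`, and reading any other address from `upd(m, a, e)` gives the same result as reading it from `m`.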

23
The VCGen, via Detailed Examples
24
High-Level Architecture
[Diagram labels: Agent, Code, Explanation, Host, Verification condition generator, Checker, Safety policy]
25
Example Source Code
public class Bcopy {
    public static void bcopy(int[] src, int[] dst) {
        int l = src.length;
        int i = 0;
        for (i = 0; i < l; i++) {
            dst[i] = src[i];
        }
    }
}

26
Example Target Code
ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3)
    .text
    .align 4
    .globl _bcopy__6arrays5BcopyAIAI
_bcopy__6arrays5BcopyAIAI:
    cmpl 0, 4(esp)
    je L6
    movl 4(esp), ebx
    movl 4(ebx), ecx
    testl ecx, ecx
    jg L22
    ret
L22:
    xorl edx, edx
    cmpl 0, 8(esp)
    je L6
    movl 8(esp), eax
    movl 4(eax), esi
L7:
    ANN_LOOP(INV = {(csubneq ebx 0), (csubneq eax 0),
                    (csubb edx ecx), (of rm mem)},
             MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
    cmpl esi, edx
    jae L13
    movl 8(ebx, edx, 4), edi
    movl edi, 8(eax, edx, 4)
    incl edx
    cmpl ecx, edx
    jl L7
    ret
L13:
    call __Jv_ThrowBadArrayIndex
    ANN_UNREACHABLE
    nop
L6:
    call __Jv_ThrowNullPointer
    ANN_UNREACHABLE
    nop
27
Cut Points
  • Each loop entry must be annotated as a cut point.
  • VCGen requires this so that checking can be
    performed in a single scan of the code.
  • As a convenience, the modified registers are also
    declared in the cut annotations.
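
What VCGen does on first reaching a cut point can be sketched in three steps: the invariant, instantiated with the current symbolic register values, becomes a proof obligation; each register declared as modified is given a fresh symbolic value; and the invariant is then assumed for the new values. The code below is a hypothetical illustration of that step, not the actual VCGen: `enter_cut`, `instantiate`, and the fresh-name scheme are assumptions.

```python
import itertools


def instantiate(pred, regs):
    """Replace register names in `pred` by their current symbolic values."""
    if isinstance(pred, tuple):
        return tuple(instantiate(part, regs) for part in pred)
    return regs.get(pred, pred)


def enter_cut(regs, invariant, modified, fresh):
    """First visit to a cut point.

    Returns (obligations, new_regs, assumptions):
    obligations  - invariant predicates to prove for the incoming state
    new_regs     - register state with fresh values for modified registers
    assumptions  - invariant predicates assumed for the new state
    """
    obligations = [instantiate(p, regs) for p in invariant]
    new_regs = dict(regs)
    for reg in modified:
        new_regs[reg] = f"{reg}_{next(fresh)}"   # fresh symbolic value
    assumptions = [instantiate(p, new_regs) for p in invariant]
    return obligations, new_regs, assumptions
```

For instance, with `edx` holding 0 and `ecx` holding a symbolic length, the invariant `(csubb edx ecx)` yields the obligation `(csubb 0 length)` and, after `edx` is replaced by a fresh value, an assumption about that fresh value.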

28
Example Target Code
ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3)
    .text
    .align 4
    .globl _bcopy__6arrays5BcopyAIAI
_bcopy__6arrays5BcopyAIAI:
    cmpl 0, 4(esp)
    je L6
    movl 4(esp), ebx
    movl 4(ebx), ecx
    testl ecx, ecx
    jg L22
    ret
L22:
    xorl edx, edx
    cmpl 0, 8(esp)
    je L6
    movl 8(esp), eax
    movl 4(eax), esi
L7:
    ANN_LOOP(INV = {(csubneq ebx 0), (csubneq eax 0),
                    (csubb edx ecx), (of rm mem)},
             MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
    cmpl esi, edx
    jae L13
    movl 8(ebx, edx, 4), edi
    movl edi, 8(eax, edx, 4)
    incl edx
    cmpl ecx, edx
    jl L7
    ret
L13:
    call __Jv_ThrowBadArrayIndex
    ANN_UNREACHABLE
    nop
L6:
    call __Jv_ThrowNullPointer
    ANN_UNREACHABLE
    nop

VCGen requires annotations in order to simplify
the process.
29
Example Source Code
public class Bcopy {
    public static void bcopy(int[] src, int[] dst) {
        int l = src.length;
        int i = 0;
        for (i = 0; i < l; i++) {
            dst[i] = src[i];
        }
    }
}

30
The VCGen Process (1)
Code:
_bcopy__6arrays5BcopyAIAI:
    cmpl 0, src
    je L6
    movl src, ebx
    movl 4(ebx), ecx
    testl ecx, ecx
    jg L22
    ret
L22:
    xorl edx, edx
    cmpl 0, dst
    je L6
    movl dst, eax
    movl 4(eax), esi
L7:
    ANN_LOOP(INV

VCGen state:
    A0 = (type src_1 (jarray jint))
    A1 = (type dst_1 (jarray jint))
    A2 = (type rm_1 mem)
    A3 = (csubneq src_1 0)
    ebx := src_1
    ecx := (sel4 rm_1 (add src_1 4))
    A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
    edx := 0
    A5 = (csubneq dst_1 0)
    eax := dst_1
    esi := (sel4 rm_1 (add dst_1 4))
31
The VCGen Process (2)
Code:
L7:
    ANN_LOOP(INV = {(csubneq ebx 0), (csubneq eax 0),
                    (csubb edx ecx), (of rm mem)},
             MODREG = (EDI, EDX, EFLAGS, FFLAGS, RM))
    cmpl esi, edx
    jae L13
    movl 8(ebx,edx,4), edi
    movl edi, 8(eax,edx,4)

VCGen state:
    A3
    A5
    A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
    edi := edi_1
    edx := edx_1
    rm := rm_2
    A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))
    !!Verify!! (saferd4 (add src_1 (add (imul edx_1 4) 8)))
32
The Checker (1)
The checker is asked to verify that

    (saferd4 (add src_1 (add (imul edx_1 4) 8)))

under assumptions

    A0 = (type src_1 (jarray jint))
    A1 = (type dst_1 (jarray jint))
    A2 = (type rm_1 mem)
    A3 = (csubneq src_1 0)
    A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
    A5 = (csubneq dst_1 0)
    A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
    A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))

The checker looks in the PCC for a proof of this
VC.
33
The Checker (2)
In addition to the assumptions, the proof may use
axioms and proof rules defined by the host, such
as

    szint    : pf (size jint 4).
    rdArray4 : {M:exp} {A:exp} {T:exp} {OFF:exp}
               pf (type A (jarray T)) ->
               pf (type M mem) ->
               pf (nonnull A) ->
               pf (size T 4) ->
               pf (arridx OFF 4 (sel4 M (add A 4))) ->
               pf (saferd4 (add A OFF)).
34
Checker (3)
A proof for

    (saferd4 (add src_1 (add (imul edx_1 4) 8)))

in the Java specification looks like this
(excerpt):

    (rdArray4 A0 A2 (sub0chk A3) szint (aidxi 4 (below1 A7)))

This proof can be easily validated via LF type
checking.
35
VCGen: Summary
  • VCGen is a symbolic evaluator for the object
    language.
  • It essentially implements a reference
    interpreter, except
      • it uses symbolic values in order to model all
        possible executions, and
      • instead of performing run-time checks, it asks a
        Checker to verify the safety of dangerous
        instructions.
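
The summary above can be made concrete with a toy symbolic evaluator for straight-line code. This is a minimal sketch under stated assumptions, not the actual VCGen: the instruction set (`mov`, `addi`, `load`, `assume_ne0`) and the function name `vcgen` are hypothetical, and loops, branches, and writes are left out. Registers hold symbolic expressions, and each memory read produces a VC for a checker rather than a run-time check.

```python
def vcgen(program, regs, assumptions):
    """Symbolically evaluate straight-line code.

    Returns a list of VCs, each a pair (assumptions, goal) that a checker
    would have to verify before the code is accepted.
    """
    regs = dict(regs)
    assumptions = list(assumptions)
    vcs = []
    for instr in program:
        op = instr[0]
        if op == "mov":                   # ("mov", dst, src_reg)
            _, dst, src = instr
            regs[dst] = regs[src]
        elif op == "addi":                # ("addi", dst, src_reg, constant)
            _, dst, src, k = instr
            regs[dst] = ("add", regs[src], k)
        elif op == "assume_ne0":          # fall-through of a "branch if zero"
            _, reg = instr
            assumptions.append(("csubneq", regs[reg], 0))
        elif op == "load":                # ("load", dst, addr_reg)
            _, dst, addr = instr
            # Dangerous instruction: emit a VC instead of checking at run time.
            vcs.append((tuple(assumptions), ("saferd", regs[addr])))
            regs[dst] = ("sel", regs["rm"], regs[addr])
        else:
            raise ValueError(f"unknown instruction {op}")
    return vcs
```

Starting from `ebx` bound to the symbolic value `src_1`, a null test followed by a read at `ebx + 4` yields one VC: prove `saferd (add src_1 4)` assuming `csubneq src_1 0`.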

36
Homework Exercises
  • 5. When a loop invariant is encountered for the
    second time, what actions must the VCGen perform?
  • 6. In principle, how big can a VC get, relative
    to the size of the program?
  • 7. What kind of program might make a VC get very
    large?

37
Another Example (by George Necula)
void fir (int *data, int dlen, int *filter, int flen) {
    int i, j;
    for (i = 0; i <= dlen - flen; i++) {
        int s = 0;
        for (j = 0; j < flen; j++)
            s += filter[j] * data[i + j];
        data[i] = s;
    }
}
Skip this example
38
Compiled Example
/* rd=data, rdl=dlen, rf=filter, rfl=flen */
        ri = 0
        sub t1 = rdl, rfl
L0:     CUT(ri,rj,rs,t2,t3,t4,rm)
        le t2 = ri, t1
        jeq t2, L3
        rs = 0
        rj = 0
L1:     CUT(rj,rs,t2,t3,t4)
        lt t2 = rj, rfl
        jeq t2, L2
        ult t2 = rj, rfl
        jeq t2, Labort
        ld t3 = rf + 4*rj
        add t2 = ri, rj
        ult t4 = t2, rdl
        jeq t4, Labort
        ld t2 = rd + 4*t2
        mul t2 = t3, t2
        add rs = rs, t2
        add rj = rj, 1
        jmp L1
L2:     ult t2 = ri, rdl
        jeq t2, Labort
        st rd + 4*ri = rs
        add ri = ri, 1
        jmp L0
L3:     ret
Labort: call abort
39
The Safety Policy
  • The safety policy defines verification conditions
    of the form
  • true, E = E
  • saferd(M, E), safewr(M, E, E)
  • array(E_A, E_S, E_L), vector(E_A, E_S, E_L)
  • Pre_fir = array(rd,4,rdl),
    vector(rf,4,rfl)
  • Post_fir = true

40
VCGen Example
Code:
        ri = 0
        sub t1 = rdl, rfl
L0:     CUT(ri,rj,rs,t2,t3,t4,rm)
        le t2 = ri, t1
        jeq t2, L3
L3:     ret

VCGen actions:
    Set rd=cd, rdl=cdl, rf=cf, rfl=cfl, rm=cm
    Assume precondition: array(cd,4,cdl), vector(cf,4,cfl)
    Set ri = 0
    Set t1 = sub(cdl,cfl)
    Set ri=ci, rj=cj, rs=cs, t2=c2, t3=c3, t4=c4, rm=cm
    Set t2 = le(ci, sub(cdl,cfl))
    Assume not(le(ci, sub(cdl,cfl)))
    Check postcondition
    Check rd, rdl, rf, rfl have initial values
41
VCGen Example
Code:
        ri = 0
        sub t1 = rdl, rfl
L0:     CUT(ri,rj,rs,t2,t3,t4,rm)
        le t2 = ri, t1
        jeq t2, L3
        rs = 0
        rj = 0
L1:     CUT(rj,rs,t2,t3,t4)
        lt t2 = rj, rfl
        jeq t2, L2
L2:     ult t2 = ri, rdl
        jeq t2, Labort
        st rd + 4*ri = rs

VCGen actions:
    Set ri = 0
    Set t1 = sub(cdl,cfl)
    Set ri=ci, rj=cj, rs=cs, t2=c2, t3=c3, t4=c4, rm=cm
    Set t2 = le(ci, sub(cdl,cfl))
    Assume le(ci, sub(cdl,cfl))
    Set rs = 0
    Set rj = 0
    Set rj=cj, rs=cs, t2=c2, t3=c3, t4=c4
    Set t2 = lt(cj, cfl)
    Assume not(lt(cj, cfl))
    Set t2 = ult(ci, cdl)
    Assume ult(ci, cdl)
    Check safewr(cm, add(cd,mul(4,ci)), cs)
42
More on the Safety Policy
  • The safety policy is defined as an LF signature.

rdarray  : saferd(M,add(A,mul(S,I))) <-
             array(A,S,L), ult(I,L).
rdvector : saferd(M,add(A,mul(S,I))) <-
             vector(A,S,L), ult(I,L).
wrarray  : safewr(M,add(A,mul(S,I)),V) <-
             array(A,S,L), ult(I,L).
43
The Checker
  • When the Checker is invoked on
  • safewr(cm, add(cd,mul(4,ci)), cs)
  • there are assumptions
  • assume0 : ult(ci,cdl).
  • assume1 : not(lt(cj,cfl)).
  • assume2 : le(ci, sub(cdl,cfl)).
  • assume3 : vector(cf,4,cfl).
  • assume4 : array(cd,4,cdl).

44
The Checker, cont'd
  • The VC
  • safewr(cm, add(cd,mul(4,ci)), cs)
  • can be verified by using the rule
  • wrarray : safewr(M,add(A,mul(S,I)),V) <-
    array(A,S,L), ult(I,L).
  • and assumptions
  • assume0 : ult(ci,cdl).
  • assume4 : array(cd,4,cdl).

45
Proof Representation
  • A simple (but somewhat naïve) representation of
    the proof is simply the sequence of proof rules
  • wrarray, assume4, assume0
  • We shall see that better representations are
    possible.
  • LF typechecking is sufficient for proofchecking.
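
To illustrate how a proof given as the sequence wrarray, assume4, assume0 could be checked, here is a minimal sketch that replays the sequence against the rule and the assumptions using one-way pattern matching. It stands in for LF type checking and is not the actual checker: the tuple encoding of terms, the convention that capitalized strings are rule variables, and all function names are assumptions made for illustration.

```python
def is_var(t):
    """Rule variables are capitalized strings such as "A" or "L"."""
    return isinstance(t, str) and t[:1].isupper()


def is_ground(t):
    if isinstance(t, tuple):
        return all(is_ground(x) for x in t)
    return not is_var(t)


def subst(t, env):
    if is_var(t):
        return env.get(t, t)
    if isinstance(t, tuple):
        return tuple(subst(x, env) for x in t)
    return t


def match(pattern, term, env):
    """Bind variables in `pattern` so it equals ground `term`; None on failure."""
    if is_var(pattern):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple) and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None


def check(goal, proof, rules, assumptions):
    """Check a proof given as a preorder list of rule and assumption names."""
    names = iter(proof)

    def prove(ground_goal):
        name = next(names, None)
        if name in assumptions:
            return assumptions[name] == ground_goal
        if name not in rules:
            return False
        head, premises = rules[name]
        env = match(head, ground_goal, {})
        if env is None:
            return False
        for premise in premises:
            premise = subst(premise, env)
            if is_ground(premise):
                if not prove(premise):
                    return False
            else:
                # Unbound variables (such as L) are fixed by an assumption.
                fact = assumptions.get(next(names, None))
                env = match(premise, fact, env) if fact is not None else None
                if env is None:
                    return False
        return True

    return prove(goal) and next(names, None) is None


RULES = {
    "wrarray": (("safewr", "M", ("add", "A", ("mul", "S", "I")), "V"),
                [("array", "A", "S", "L"), ("ult", "I", "L")]),
}
ASSUMPTIONS = {
    "assume0": ("ult", "ci", "cdl"),
    "assume4": ("array", "cd", 4, "cdl"),
}
VC = ("safewr", "cm", ("add", "cd", ("mul", 4, "ci")), "cs")
```

Calling `check(VC, ["wrarray", "assume4", "assume0"], RULES, ASSUMPTIONS)` accepts the proof, while swapping the two assumption names makes it fail, since `assume0` does not match the `array` premise.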

46
Optimized Code
  • The previous example was somewhat simplified.
  • More realistic code is optimized, usually based
    on inferences about integer values.
  • Such optimizations require that arithmetic
    invariants be placed in the cut points.

47
Optimized Example
/* rd=data, rdl=dlen, rf=filter, rfl=flen */
        ri = 0
        sub t1 = rdl, rfl
L0:     CUT(ri>0, ri,rj,...)
        le t2 = ri, t1
        jeq t2, L3
        rs = 0
        rj = 0
L1:     CUT(rj>0, rj,rs,...)
        lt t2 = rj, rfl
        jeq t2, L2
        ld t3 = rf + 4*rj
        add t2 = ri, rj
        ld t2 = rd + 4*t2
        mul t2 = t3, t2
        add rs = rs, t2
        add rj = rj, 1
        jmp L1
L2:     st rd + 4*ri = rs
        add ri = ri, 1
        jmp L0
L3:     ret
48
VCGen Example
Code:
        ri = 0
        sub t1 = rdl, rfl
L0:     CUT(ri>0, ri,rj,rs,t2,t3,t4,rm)
        le t2 = ri, t1
        jeq t2, L3
        rs = 0
        rj = 0

VCGen actions:
    Set ri = 0
    Set t1 = sub(cdl,cfl)
    Set ri=ci, rj=cj, rs=cs, t2=c2, t3=c3, t4=c4, rm=cm
    Assume gt(ci,0)
    Set t2 = le(ci, sub(cdl,cfl))
    Assume le(ci, sub(cdl,cfl))