Title: Bytecode verification Part II: type inference
1 Bytecode verification, Part II: type inference
- Jean-Sébastien Coron,
- David Naccache,
- Christophe Tymen
- ENS seminar presentation
2 A few prerequisites on the Java Virtual Machine
- Emulates a stack-based machine
- A set of typed instructions (bytecodes)
  - working on a stack ...
  - and a set of registers (local variables)
- Works like other interpreted stack-based languages
  - PostScript
  - Forth
  - ...
3 The workspace: a stack frame
4 The instruction set
- Stack and local variable operations
- Integer arithmetic
- Control flow
- Objects and arrays
- Method invocation
- Logic
- Type conversion
- Exceptions
- Finally clauses
- Floating point arithmetic
- Threads synchronization
5 Most instructions are typed
- The beginning of the instruction name indicates the type of the data manipulated
  - E.g. iadd adds two integers
6 Stack and local variable operations
- Push a constant onto the stack
  - sconst_0, bspush x, sspush xx, bipush x, sipush xx
- Load a local variable onto the stack
  - sload_0, sload x, ...
- Store a value from the stack into a local variable
  - sstore_0, sstore x, ...
- Generic stack operations
  - pop, pop2, dup, dup2
7 Example 1: What is the result of the following program?
sconst_0
bspush 18
sstore_0
sstore 1
sload_0
sload 1
pop
dup
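8 Let's execute it!
Instruction   Stack after (top on the right)   Local 0   Local 1
sconst_0      0                                -         -
bspush 18     0 18                             -         -
sstore_0      0                                18        -
sstore 1      (empty)                          18        0
sload_0       18                               18        0
sload 1       18 0                             18        0
pop           18                               18        0
dup           18 18                            18        0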
9 Integer arithmetic
- Addition
  - sadd, iadd
- Subtraction
  - ssub, isub
- Multiplication
  - smul, imul
- Division
  - sdiv, idiv
- Increment local variable x by i
  - sinc x i, iinc
10 Example 2: What is the result of the following program?
sconst_m1
sspush 42
sadd
sconst_3
sdiv
11 Let's execute it!
Instruction   Stack after (top on the right)
sconst_m1     -1
sspush 42     -1 42
sadd          41
sconst_3      41 3
sdiv          13
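Examples 1 and 2 can be replayed with a few lines of Python. This is only a minimal sketch of a stack machine for the short-integer instructions used above: the function name `run`, the tuple encoding of the programs (where `sload_0` is written `("sload", 0)`) and the helper `to_short` are assumptions made for illustration, not part of any real virtual machine.

```python
def to_short(v):
    """Wrap v to a signed 16-bit value, as a short would."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def run(program, n_locals=2):
    """Execute a list of (opcode, *operands) tuples; return (stack, locals)."""
    stack, local = [], [0] * n_locals
    for op, *args in program:
        if op.startswith("sconst_"):
            suffix = op[len("sconst_"):]
            stack.append(-1 if suffix == "m1" else int(suffix))
        elif op in ("bspush", "sspush"):
            stack.append(to_short(args[0]))
        elif op == "sload":
            stack.append(local[args[0]])
        elif op == "sstore":
            local[args[0]] = stack.pop()
        elif op == "pop":
            stack.pop()
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "sadd":
            b, a = stack.pop(), stack.pop()
            stack.append(to_short(a + b))
        elif op == "sdiv":
            b, a = stack.pop(), stack.pop()
            q = abs(a) // abs(b)              # division truncates toward zero
            stack.append(to_short(q if (a < 0) == (b < 0) else -q))
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack, local

EXAMPLE_1 = [("sconst_0",), ("bspush", 18), ("sstore", 0), ("sstore", 1),
             ("sload", 0), ("sload", 1), ("pop",), ("dup",)]
EXAMPLE_2 = [("sconst_m1",), ("sspush", 42), ("sadd",), ("sconst_3",), ("sdiv",)]

print(run(EXAMPLE_1))   # stack [18, 18], locals [18, 0]
print(run(EXAMPLE_2))   # stack [13]
```

The interpreter computes on values; the verifier discussed in the rest of the talk runs the same kind of loop on types instead.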
12 Control flow
- Just an example for short values
  - if<condition> X
  - branches to offset X if the condition is met
  - compares the value on the stack with 0
  - <condition> can be eq, ne, lt, gt, le, ge
- A second example for integers
  - icmp
  - pops two integers from the stack and compares them
  - pushes the result back onto the stack
    - 1 if gt
    - 0 if eq
    - -1 if lt
- Unconditional jump (just goes to the offset X)
  - goto X
13 Example 3: Tests if a short value is odd.
sload_1
sconst_2
srem
ifeq 3
sconst_0
goto 2
sconst_1
sstore 1
14 The workspace
15 The workspace
might be a pointer (reference) on
16-21 Abstract interpretation (1/2)
- Instead of computing on the full semantic domain, we will compute on a restricted abstract domain
- For Java the abstract domain is the set of types
- COMPUTE WITH TYPES INSTEAD OF VALUES
Value   Type                Meaning
6       I                   integer
@346    Ljava/lang/String   address where an object of type String is present
5.4     FH                  MSB of a float
        FL                  LSB of a float
4       I                   another integer
22 Abstract interpretation (2/2)
- In Java the abstract interpreter checks that, at each point of the code, the stack contents and the local variables are correctly typed with respect to the instruction
23 The fixpoint algorithm
- Aim: compute the memory type (stack + local variables) at each instruction
- VERY SIMILAR TO THE STACK ALGORITHM SEEN IN PART 1
- Mark the first instruction
- While there are some marked instructions
  - Select a marked instruction
  - Verify that the memory is compatible with the instruction
  - Model the instruction's effect on the memory
  - Find the following instruction(s)
  - Unify the memory with the memory of the following instructions
  - Mark the instructions whose memory has been modified
- End of while
24-38 A linear verification
While: Select marked, Verif. mem. comp., Model ins., Find nexts, Unify mem., Mark if modified
Notation: "." is the empty stack, i an integer, L a reference, ? a type that is not known yet.
Initially only instruction 0 is marked, with stack "." and local variables L i i ?; every other entry is ?.
Marked instructions are then processed in the order 0, 1, 2 (which marks 5 and 12), 5, 6, 7, 8, 9 (which marks 16), 16, 17, 12, 13, 14, 15.
Memory type before each instruction once no instruction is marked any more:
Instruction        Stack   Local variables
                           0 1 2 3
 0 iload_1         .       L i i ?
 1 iload_2         i       L i i ?
 2 if_icmpne 12    i,i     L i i ?
 5 iload_1         .       L i i ?
 6 iload_2         i       L i i ?
 7 iadd            i,i     L i i ?
 8 istore_3        i       L i i ?
 9 goto 16         .       L i i i
12 iload_1         .       L i i ?
13 iload_1         i       L i i ?
14 imul            i,i     L i i ?
15 istore_3        i       L i i ?
16 iload_3         .       L i i i
17 ireturn         i       L i i i
Success
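The walk-through above can be reproduced with a short worklist loop. The sketch below is a hedged illustration, not a real verifier: it knows only the integer instructions of this example, it encodes `iload_1` as `("iload", 1)`, and the names `PROGRAM`, `step`, `unify`, `verify` and `VerifyError` are assumptions. Unification here is the simplest possible one (equal types stay, different types become `?`).

```python
PROGRAM = {  # byte offset -> instruction, as on the slide
    0: ("iload", 1), 1: ("iload", 2), 2: ("if_icmpne", 12),
    5: ("iload", 1), 6: ("iload", 2), 7: ("iadd",), 8: ("istore", 3),
    9: ("goto", 16),
    12: ("iload", 1), 13: ("iload", 1), 14: ("imul",), 15: ("istore", 3),
    16: ("iload", 3), 17: ("ireturn",),
}

class VerifyError(Exception):
    pass

def step(pc, state, program):
    """Check the instruction at pc against state; return [(next_pc, new_state)]."""
    (op, *args), (stack, local) = program[pc], state
    offsets = sorted(program)
    idx = offsets.index(pc)
    fall = offsets[idx + 1] if idx + 1 < len(offsets) else None

    def pop(n):  # require n integers on top of the stack, return the rest
        if len(stack) < n or any(t != "i" for t in stack[-n:]):
            raise VerifyError(f"{pc}: {op} needs {n} integer(s), stack is {stack}")
        return stack[:-n]

    if op == "iload":
        if local[args[0]] != "i":
            raise VerifyError(f"{pc}: local {args[0]} is not an integer")
        return [(fall, (stack + ("i",), local))]
    if op == "istore":
        rest = pop(1)
        return [(fall, (rest, local[:args[0]] + ("i",) + local[args[0] + 1:]))]
    if op in ("iadd", "imul"):
        return [(fall, (pop(2) + ("i",), local))]
    if op == "if_icmpne":
        rest = pop(2)
        return [(fall, (rest, local)), (args[0], (rest, local))]
    if op == "goto":
        return [(args[0], (stack, local))]
    if op == "ireturn":
        pop(1)
        return []
    raise VerifyError(f"{pc}: unknown instruction {op}")

def unify(old, new):
    """Merge two memory types; None means 'nothing known yet'."""
    if old is None:
        return new
    (s1, l1), (s2, l2) = old, new
    if len(s1) != len(s2):
        raise VerifyError("stack heights differ at a merge point")
    merge = lambda a, b: a if a == b else "?"
    return tuple(map(merge, s1, s2)), tuple(map(merge, l1, l2))

def verify(program, entry_locals):
    states = {pc: None for pc in program}
    states[0] = ((), tuple(entry_locals))
    marked, order = [0], []
    while marked:
        pc = marked.pop()                      # select a marked instruction
        order.append(pc)
        for nxt, new in reversed(step(pc, states[pc], program)):
            if nxt not in program:
                raise VerifyError(f"{pc}: control flows outside the code")
            merged = unify(states[nxt], new)
            if merged != states[nxt]:          # mark only if modified
                states[nxt] = merged
                if nxt not in marked:
                    marked.append(nxt)
    return states, order

states, order = verify(PROGRAM, ("L", "i", "i", "?"))
print(order)
print(states[17])
```

Pushing the successors in reverse makes the loop follow the fall-through path first, which happens to give the same processing order as the slides; any other selection order should reach the same final table.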
39-40 An error
While: Select marked, Verif. mem. comp., Model ins., Find nexts, Unify mem., Mark if modified
Same code, except that the second iload_1 of the multiplication branch is missing.
Instruction        Stack   Local variables
                           0 1 2 3
 0 iload_1         .       L i i ?
 1 iload_2         i       L i i ?
 2 if_icmpne 12    i,i     L i i ?
 5 iload_1         .       L i i ?
 6 iload_2         i       L i i ?
 7 iadd            i,i     L i i ?
 8 istore_3        i       L i i ?
 9 goto 16         .       L i i i
12 iload_1         .       L i i ?
13 imul            i       L i i ?
14 istore_3        ?       ? ? ? ?
15 iload_3         .       L i i i
16 ireturn         i       L i i i
Failure: imul at 13 needs two integers on the stack and finds only one.
41 Unification
While: Select marked, Verif. mem. comp., Model ins., Find nexts, Unify mem., Mark if modified
Instruction        Stack   Local variables
                           0 1 2 3
 0 iload_1         .       L i i ?
 1 iload_2         i       L i i ?
15 istore_3        i       L i i ?    (marked)
16 iload_3         .       L i i i
17 ireturn         i       L i i i
42 General case for unification
- Define a lattice structure on types
- Take the least upper bound of the two types to be unified
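For class types, the least upper bound can be sketched as a walk up the superclass chain. The hierarchy `PARENT` below is hypothetical and chosen only for illustration, and the sketch assumes single inheritance with `Object` as the root; interfaces, arrays and primitive types are deliberately ignored.

```python
# Hypothetical single-inheritance hierarchy: class name -> direct superclass.
PARENT = {
    "Object": None,
    "Animal": "Object",
    "Bird": "Animal",
    "Cat": "Animal",
    "String": "Object",
}

def ancestors(cls):
    """cls followed by its superclasses, most specific first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    return chain

def lub(a, b):
    """Most specific class that is equal to or a superclass of both a and b."""
    above_b = set(ancestors(b))
    for cls in ancestors(a):
        if cls in above_b:
            return cls
    raise ValueError("no common superclass")  # not reached when Object is the root

def is_subtype(a, b):
    """True if a value of class a may be used where class b is expected."""
    return b in ancestors(a)

print(lub("Bird", "Cat"))       # Animal
print(lub("Bird", "String"))    # Object
```

With this definition, unifying a `Bird` and a `Cat` gives an `Animal`: it can be returned from a method declared to return `Animal`, but not from one declared to return `Bird`.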
43-52 Using the unification
- Verifying public Animal myMethod()
- Two branches of an if reach the same code: n aload_3, then n+1 areturn
Step                                    n aload_3            n+1 areturn
                                        Stack   Local 3      Stack     Local 3
Start                                   ?       ?            ?         ?
First branch arrives, Local 3 LBird     .       LBird        ?         ?
Model aload_3                           .       LBird        LBird     LBird
Second branch arrives, Local 3 LCat     .       LAnimal      LBird     LBird
Model aload_3 again                     .       LAnimal      LAnimal   LAnimal
Success: areturn returns an LAnimal, which is compatible with the declared return type Animal.
53-57 An error case
- Verifying public Bird myMethod()
- Two branches of an if reach the same code: n aload_3, then n+1 areturn
Step                                    n aload_3            n+1 areturn
                                        Stack   Local 3      Stack     Local 3
Start                                   ?       ?            ?         ?
First branch arrives, Local 3 LBird     .       LBird        LBird     LBird
Second branch arrives, Local 3 LCat     .       LAnimal      LBird     LBird
Model aload_3 again                     .       LAnimal      LAnimal   LAnimal
Failure: areturn returns an LAnimal, which is not compatible with the declared return type Bird.
58 Backward jump
- In case of a backward jump (e.g. a loop) the verification algorithm can loop, but
- Theorem: it terminates iff
  - We compute on a complete lattice
  - We only use a composition of monotone functions on this lattice
- We obtain a fixed point (an x where f(x) = x)
59 Some constraints on the BC
- We need to have a fixed point
- At each instruction the stack and the local variables must have one type compatible with Java's rules, whatever execution path brought us there.
- It excludes some secure but unverifiable programs
  - For i=1 to 3 push; For i=1 to 3 pop
  - For i=1 to 3 push pop
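The difference between the two loop programs can be seen by tracking nothing but the stack height. The sketch below uses a made-up three-instruction language: `push`, `pop` and a hypothetical `loop X` that either jumps back to X or falls through (the loop counter is not modelled). The names `check_heights`, `UNVERIFIABLE` and `VERIFIABLE` are assumptions for illustration.

```python
def check_heights(program):
    """Return the stack height before each instruction, or raise on a conflict."""
    height = {0: 0}
    marked = [0]
    while marked:
        pc = marked.pop()
        op, *args = program[pc]
        h = height[pc]
        if op == "push":
            succ = [(pc + 1, h + 1)]
        elif op == "pop":
            if h == 0:
                raise ValueError(f"{pc}: pop on an empty stack")
            succ = [(pc + 1, h - 1)]
        elif op == "loop":              # hypothetical: jump to args[0] or fall through
            succ = [(pc + 1, h), (args[0], h)]
        elif op == "return":
            succ = []
        else:
            raise ValueError(f"unknown instruction {op}")
        for nxt, nh in succ:
            if nxt not in height:       # first visit: record and mark
                height[nxt] = nh
                marked.append(nxt)
            elif height[nxt] != nh:     # a second path disagrees: no single type
                raise ValueError(
                    f"{nxt}: stack height {height[nxt]} on one path, {nh} on another")
    return height

# For i=1 to 3 push; For i=1 to 3 pop
UNVERIFIABLE = [("push",), ("loop", 0), ("pop",), ("loop", 2), ("return",)]
# For i=1 to 3 push pop
VERIFIABLE = [("push",), ("pop",), ("loop", 0), ("return",)]

print(check_heights(VERIFIABLE))
try:
    check_heights(UNVERIFIABLE)
except ValueError as err:
    print("rejected:", err)
```

The first program is safe at run time, since it pushes three values and pops three, but its first instruction is reached with a different stack height on each iteration, so no single memory type exists for it.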
60 Introduction to proof carrying code
While: Select marked, Verif. mem. comp., Model ins., Find nexts, Unify mem., Mark if not verified
Proof (memory type before each instruction):
Instruction        Stack   Local variables
                           0 1 2 3
 0 iload_1         .       L i i ?
 1 iload_2         i       L i i ?
 2 if_icmpne 12    i,i     L i i ?
 5 iload_1         .       L i i ?
 6 iload_2         i       L i i ?
 7 iadd            i,i     L i i ?
 8 istore_3        i       L i i ?
 9 goto 16         .       L i i i
12 iload_1         .       L i i ?
13 iload_1         i       L i i ?
14 imul            i,i     L i i ?
15 istore_3        i       L i i ?
16 iload_3         .       L i i i
17 ireturn         i       L i i i
61 Classic verification vs PCC
- Classic verification
  - Need to find an x such that f(x) = x
  - Complicated algorithm, long verification
  - Works in RAM (complexity: number of instructions times the size of the memory (stack + locals) used by the applet). This is an issue in smart cards
- PCC
  - Need to verify that, for a given x, f(x) = x
  - Simpler algorithm (linear scan on the code, no loops).
  - Verification takes less time
  - Fixpoint data needs to be added (bigger applet) but this data is static (can be in EEPROM).
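The PCC side of this comparison can be sketched as a single pass over the code. In the sketch below the proof is the table of memory types shipped with the applet; the checker never searches for it, it only confirms that each instruction maps its claimed state onto the claimed state of its successors. The encoding and the names `CODE`, `PROOF`, `successors` and `check_proof` are assumptions made for illustration, restricted to the integer instructions of the running example.

```python
def successors(pc, op, args, stack, local, next_pc):
    """Abstract effect of one integer instruction: [(target, stack, local)]."""
    def need(n):  # require n integers on top of the stack, return the rest
        if stack[len(stack) - n:] != ("i",) * n:
            raise ValueError(f"{pc}: {op} needs {n} integer(s), stack is {stack}")
        return stack[:len(stack) - n]
    if op == "iload":
        if local[args[0]] != "i":
            raise ValueError(f"{pc}: local {args[0]} is not an integer")
        return [(next_pc, stack + ("i",), local)]
    if op == "istore":
        return [(next_pc, need(1), local[:args[0]] + ("i",) + local[args[0] + 1:])]
    if op in ("iadd", "imul"):
        return [(next_pc, need(2) + ("i",), local)]
    if op == "if_icmpne":
        rest = need(2)
        return [(next_pc, rest, local), (args[0], rest, local)]
    if op == "goto":
        return [(args[0], stack, local)]
    if op == "ireturn":
        need(1)
        return []
    raise ValueError(f"{pc}: unknown instruction {op}")

def check_proof(code, proof, entry):
    """One linear scan: no worklist, no iteration, the proof is only read."""
    if proof[code[0][0]] != entry:
        raise ValueError("proof disagrees with the method's entry state")
    for k, (pc, op, *args) in enumerate(code):
        stack, local = proof[pc]
        next_pc = code[k + 1][0] if k + 1 < len(code) else None
        for target, new_stack, new_local in successors(pc, op, args, stack, local, next_pc):
            if target not in proof:
                raise ValueError(f"{pc}: no proof entry for target {target}")
            want_stack, want_local = proof[target]
            # The claimed type may be the computed one or the less precise "?".
            if new_stack != want_stack or not all(
                    w in (n, "?") for n, w in zip(new_local, want_local)):
                raise ValueError(f"{pc} -> {target}: proof does not match")
    return True

CODE = [(0, "iload", 1), (1, "iload", 2), (2, "if_icmpne", 12),
        (5, "iload", 1), (6, "iload", 2), (7, "iadd"), (8, "istore", 3),
        (9, "goto", 16), (12, "iload", 1), (13, "iload", 1), (14, "imul"),
        (15, "istore", 3), (16, "iload", 3), (17, "ireturn")]
E = ("L", "i", "i", "?")    # locals before local 3 is assigned
F = ("L", "i", "i", "i")    # locals after local 3 is assigned
PROOF = {0: ((), E), 1: (("i",), E), 2: (("i", "i"), E),
         5: ((), E), 6: (("i",), E), 7: (("i", "i"), E), 8: (("i",), E),
         9: ((), F), 12: ((), E), 13: (("i",), E), 14: (("i", "i"), E),
         15: (("i",), E), 16: ((), F), 17: (("i",), F)}

print(check_proof(CODE, PROOF, ((), E)))
```

Only the state of the instruction being checked has to be computed in RAM; the `PROOF` table is read-only, which is why it can live in EEPROM.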
62 How to check types defensively?
- Need to duplicate the stack cells and the local variables (for each variable and stack element keep both the value and the type).
- Checks are done in real time
- Too slow, too big
Value   Type
6       I
@346    Ljava/lang/String
5.4     FH
        FL
4       I
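A defensive machine can be sketched by storing a type tag next to every value and testing the tags at each instruction. The opcodes `ipush` and `apush` below are hypothetical stand-ins for "push an integer" and "push a reference", and `run_defensive` is an assumed name; the point is only to show where the extra memory and the extra run-time checks come from.

```python
def run_defensive(program):
    """Execute (opcode, *operands) tuples on a stack of (type_tag, value) cells."""
    stack = []
    for op, *args in program:
        if op == "ipush":                 # hypothetical: push an integer constant
            stack.append(("I", args[0]))
        elif op == "apush":               # hypothetical: push a reference (an address)
            stack.append(("L", args[0]))
        elif op == "iadd":
            if len(stack) < 2:
                raise TypeError("iadd: stack underflow")
            (tag_b, b), (tag_a, a) = stack.pop(), stack.pop()
            if tag_a != "I" or tag_b != "I":   # run-time check on every execution
                raise TypeError(f"iadd: expected I, I but found {tag_a}, {tag_b}")
            stack.append(("I", a + b))
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack

print(run_defensive([("ipush", 6), ("ipush", 4), ("iadd",)]))
try:
    run_defensive([("apush", 346), ("ipush", 4), ("iadd",)])
except TypeError as err:
    print("rejected at run time:", err)
```

Every cell is twice as large and every instruction pays for its checks each time it runs, whereas a verifier pays once, before execution.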
63 What research can one do here?
- Find strategies for fixpoint calculation with fewer iterations.
- Find strategies for fixpoint calculation with less RAM
- Find memory-time tradeoffs for fixpoint calculation
- Design languages for which fixpoint calculation is particularly efficient.
- Find ways to compact the proof of PCC.
- This is a very active research area with a huge practical impact, due to Java's increased popularity.