Title: CSSV
1. CSSV: C String Static Verifier
- Nurit Dor
- Michael Rodeh
- Mooly Sagiv
- Greta Yorsh
- Tel-Aviv University
- http://www.cs.tau.ac.il/~nurr
2. The Problem: Detecting String Manipulation Errors
- An important problem
- Common errors
- Cause security vulnerabilities
- A challenging problem
- Use of pointers
- Use of pointer arithmetic
- Error point vs. failure point
3. Example: unsafe call to strcpy()
simple()
{
    char s[20];
    char *p;
    char t[10];
    strcpy(s, "Hello");
    p = s + 5;
    strcpy(p, " world!");
    strcpy(t, s);
}
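To see why the last call in simple() is the unsafe one, it helps to count bytes. The sketch below replays the first two copies, which fit in s[20], and reports how many bytes the third copy would write into the 10-byte array t. It assumes the second literal is " world!" with a leading space; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replays the two safe copies of simple() and returns the number of
   bytes (including the terminator) that strcpy(t, s) would write. */
size_t bytes_written_by_last_copy(void)
{
    char s[20];
    char *p;

    strcpy(s, "Hello");        /* 6 bytes including '\0' */
    p = s + 5;                 /* p points at the terminator of "Hello" */
    strcpy(p, " world!");      /* assumed literal; s now holds "Hello world!" */
    assert(strlen(s) < sizeof s);
    return strlen(s) + 1;      /* bytes the copy into t would need */
}
```

The result exceeds sizeof t, so the overflow happens at the third strcpy() even though the string was built up by the first two.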
4. Complicated Example
/* from web2c [fixwrites.c] */
#define BUFSIZ 1024
char buf[BUFSIZ];
char *insert_long(char *cp)
{
    char temp[BUFSIZ];
    for (i = 0; &buf[i] < cp; ++i)
        temp[i] = buf[i];
    strcpy(&temp[i], "(long)");
    strcpy(&temp[i + 6], cp);
}
[Figure: temp with "(long)" written into it]
5. Complicated Example
[Figure: buffers buf (with pointer cp) and temp; "(long)" is written into temp]
7. Real Example
void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
{
    INT32 indice;
    for (indice = 0; indice < NbLine; indice++)
    {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';
    return;
}
[Figure: PtrEndText]
8. Vulnerable String Manipulation
- Pointers to buffers
    char *p = buffer;
    while (...) p++;
- Standard string manipulation functions
- strcpy(), strcat(), ...
- NULL termination
- strncpy(), ...
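The NULL termination point is easy to miss with strncpy(): when the source has at least as many characters as the count, the destination is left without a terminator. A minimal sketch of the usual defensive pattern, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies at most n-1 characters of src into dst and always terminates dst.
   Returns 1 if src had to be truncated, 0 otherwise. Requires n > 0. */
int copy_terminated(char *dst, size_t n, const char *src)
{
    assert(n > 0);
    strncpy(dst, src, n - 1);  /* writes no '\0' if strlen(src) >= n - 1 */
    dst[n - 1] = '\0';         /* explicit terminator */
    return strlen(src) >= n;
}
```

Without the explicit store to dst[n - 1], a later strlen(dst) or strcpy() from dst could read past the end of the buffer.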
9. Are String Violations Common?
- FUZZ study (1995)
- Random test programs on various systems
- 9 different UNIX systems
- 18%-23% hang or crash
- 80% are string related errors
- CERT advisory
- 50% of attacks are abuses of buffer overflows
10. Current Methods
- Runtime
- Safe-C [PLDI'94]
- Purify
- Bound-checking
- ...
- Static + Runtime
- CCured [POPL'02]
11. Current Methods
- Static
- Wagner et al. [NDSS'00]
- LCLint's extension [USENIX'01]
- Dor, Rodeh and Sagiv [SAS'01]
12. Goals
- Static detection of string errors
- References beyond array limit
- Unsafe pointer arithmetic
- Missing null terminator
- Additional properties
- References beyond null
- Specified using preconditions
- Sound
- Never miss errors
- Few false alarms
Is it possible?
13. Challenges in Static Analysis
- Soundness
- Precision
- Combine integer and pointer analysis: *(p+i) = '\0'; strcpy(q, p);
- Scalability to handle real applications
- Complexity of chaotic iterations
- Handles full C
14. CSSV Solution
- Use powerful static domain
- Exponential abstract interpretation
- Use pre- and post-conditions to specify procedure requirements on strings
- No interprocedural analysis
- Modular analysis
- Automatic generation of procedure specification
15. CSSV
[Figure: CSSV architecture. Pointer Analysis yields each procedure's pointer info; given a procedure name, C2IP produces an integer procedure; Integer Analysis reports potential error messages.]
16. Advantages of Specifications
- Allows modular analysis
- Not all the code is available
- Enables more precise analyses
- User control of the verification
- Detect errors at point of logical error
- Improve the precision of the analysis
- Check additional properties
- Beyond ANSI-C
17. Specification and Soundness
- Preconditions are handled conservatively
- All errors are detected
- Inside a procedure's body, OR
- At call statements to the procedure
18. Specification: strcpy
char *strcpy(char *dst, char *src)
    requires ( string(src) ∧ alloc(dst) > len(src) )
    mod
    ensures  ( len(dst) = pre@len(src) ∧ return = pre@dst )
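One way to read the strcpy specification is as a contract that could also be checked at run time. The sketch below is only an illustration of that reading, not part of CSSV: the helper names and the explicit size parameters (dst_alloc, src_alloc) are assumptions, since plain C carries no allocation sizes with its pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* requires: string(src) and alloc(dst) > len(src).
   src_alloc and dst_alloc are the bytes available from src and dst. */
int strcpy_requires(size_t dst_alloc, const char *src, size_t src_alloc)
{
    const char *nul = memchr(src, '\0', src_alloc);   /* string(src) */
    return nul != NULL && dst_alloc > (size_t)(nul - src);
}

/* Returns NULL instead of copying when the precondition fails. */
char *checked_strcpy(char *dst, size_t dst_alloc,
                     const char *src, size_t src_alloc)
{
    size_t pre_len;
    char *ret;

    if (!strcpy_requires(dst_alloc, src, src_alloc))
        return NULL;
    pre_len = strlen(src);
    ret = strcpy(dst, src);
    assert(strlen(dst) == pre_len);   /* ensures len(dst) = pre@len(src) */
    assert(ret == dst);               /* ensures return = pre@dst */
    return ret;
}
```

A static verifier aims to prove the same conditions for every call site without executing the program, which is why the specification is phrased over alloc and len rather than over concrete buffers.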
19. Specification: insert_long()
/* insert_long.c */
#include "insert_long.h"
char buf[BUFSIZ];
char *insert_long(char *cp)
{
    char temp[BUFSIZ];
    int i;
    for (i = 0; &buf[i] < cp; ++i)
        temp[i] = buf[i];
    strcpy(&temp[i], "(long)");
    strcpy(&temp[i + 6], cp);
    strcpy(buf, temp);
    return cp + 6;
}

char *insert_long(char *cp)
    requires ( string(cp) ∧ buf ≤ cp < buf + BUFSIZ )
    mod cp.strlen
    ensures  ( cp.strlen = pre[cp.strlen] + 6 ∧ return_value = cp + 6 )
20. Difficulties with Specifications
- Legacy code
- Complexity of software
- Need to know context
22. CSSV: Pointer Analysis
- Models the string objects
- Precomputes points-to information
- Determines which objects may be updated through a pointer
    char s[20];
    char *p;
    ...
    p = s + 5;
    strcpy(p, " world!");
23. Integrating Pointer Information?
foo(char *p, char *q)
{
    char local[100];
    p = local;
    *q = 0;
}
main()
{
    char s[10], t[20], r[30];
    char *temp;
    foo(s, t);
    foo(s, r);
    temp = s;
}
24. Projection for foo()
foo(char *p, char *q)
{
    char local[100];
    p = local;
}
[Figure: points-to graph projected onto foo(), with nodes local, p, q, param1 and param2]
26. C2IP: C to Integer Program
- Generate an integer program
- Integer variables only
- No function calls
- Nondeterministic
- Goal
- String error in the C program
- Assert violated in the IP
27. C2IP: C to Integer Program
- Inline specification
- Based on points-to information
- Generate constraint variables
- Generate assert statements
- Generate update statements
28. C2IP: Constraint Variable
- For every pointer
- p.offset
[Figure: pointer p into buffer s, with p.offset = 2]
29. C2IP: Constraint Variable
- For every abstract location
- aloc.is_nullt
- aloc.len
- aloc.msize
[Figure: abstract location aloc5, with labels 0, s and t]
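The constraint variables can be made concrete by computing the values they stand for on an actual buffer. This is only a sketch to fix intuition: the struct and function names are hypothetical, and the value -1 for len when no terminator exists is an arbitrary marker chosen here, not something the slides define.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Integer properties tracked for an abstract location. */
struct aloc_vars {
    int  is_nullt;   /* does the location contain a '\0'? */
    long len;        /* index of the first '\0' (-1 if none; marker only) */
    long msize;      /* allocated size in bytes */
};

/* Measures the three properties on a concrete buffer. */
struct aloc_vars measure_aloc(const char *base, size_t msize)
{
    struct aloc_vars v;
    const char *nul;

    assert(base != NULL);
    nul = memchr(base, '\0', msize);
    v.msize = (long)msize;
    v.is_nullt = (nul != NULL);
    v.len = nul ? (long)(nul - base) : -1;
    return v;
}

/* p.offset: distance of pointer p from the start of its location. */
long pointer_offset(const char *base, const char *p)
{
    return (long)(p - base);
}
```

For char s[20] holding "Hello", this gives is_nullt = 1, len = 5, msize = 20, and a pointer s + 2 has offset 2. The static analysis reasons about these integers symbolically instead of measuring them.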
30. C2IP
C:  char buf[BUFSIZ];
IP: int buf.offset = 0;
    int sbuf.msize = BUFSIZ;
    int sbuf.len;
    int sbuf.is_nullt;
C:  char *insert_long(char *cp)
IP: int cp.offset;
C:  char temp[BUFSIZ];
IP: int temp.offset = 0;
    int stemp.msize = BUFSIZ;
    int stemp.len;
    int stemp.is_nullt;
C:  int i;
IP: int i;
C:  require string(cp)
IP: assume(sbuf.is_nullt ∧ 0 ≤ cp.offset ≤ sbuf.len ≤ sbuf.alloc);
C:  for (i = 0; &buf[i] < cp; ++i)
        temp[i] = cp[i];
IP: for (i = 0; i < cp.offset; ++i) {
        assert(0 ≤ i < stemp.msize ∧ (stemp.is_nullt ⇒ i ≤ stemp.len));
        assert(-i ≤ cp.offset < -i + sbuf.len);
        if (sbuf.is_nullt ∧ sbuf.len == i) {
            stemp.len = i;
            stemp.is_nullt = true;
        } else ...
    }
31. C2IP (continued)
C:  strcpy(&temp[i], "(long)");
IP: assert(0 ≤ i < stemp.msize - 6);
    assume(stemp.len == i + 6);
32. C2IP: Assert statements
C:  p = s + 5;
IP: assert(5 <= s.alloc && (!s.is_nullt || 5 <= s.len));
33. C2IP: Update statements
C:  p = s + 5;
IP: p.offset = s.offset + 5;
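The assert and update statements for pointer arithmetic can be combined into one small transfer function over the integer state. The sketch below is an assumption-laden illustration: the struct layout and names are hypothetical, the check is generalized from the constant 5 to any k and from offset 0 to any s.offset, and the bounds are read as non-strict (a pointer may legally point one past the end).

```c
#include <assert.h>
#include <stddef.h>

/* Integer view of a pointer s and the region it points into. */
struct region_vars {
    long offset;     /* s.offset */
    long alloc;      /* s.alloc */
    long len;        /* s.len, meaningful only if is_nullt */
    int  is_nullt;   /* s.is_nullt */
};

/* Models p = s + k. Returns 1 and stores p.offset when the assert
   holds; returns 0 (a potential error message) otherwise. */
int ip_pointer_add(const struct region_vars *s, long k, long *p_offset)
{
    long target;

    assert(s != NULL && p_offset != NULL);
    target = s->offset + k;
    if (!(0 <= target && target <= s->alloc))
        return 0;                        /* beyond the array limit */
    if (s->is_nullt && target > s->len)
        return 0;                        /* beyond the null terminator */
    *p_offset = target;                  /* update: p.offset = s.offset + k */
    return 1;
}
```

With s.alloc = 20 and s.len = 5, adding 5 is accepted and yields p.offset = 5, while adding 6 is flagged because the result would lie beyond the terminator.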
34. C2IP: Use points-to information
[Figure: pointer p may point to aloc1 or aloc5]
C:  *p = 0;
IP: if (...) {
        aloc1.len = p.offset;
        aloc1.is_nullt = true;
    } else {
        aloc5.len = p.offset;
        aloc5.is_nullt = true;
    }
35. C2IP: Inline specification
C:  strcpy(s, "hello");
IP: assert(s.offset < s.alloc ∧ s.alloc - s.offset > s.len);
    eliminate(s.len);
    assume(s.len == s.offset + 5);
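Inlining the strcpy specification at a call with a string literal amounts to an assert on the precondition followed by an assumption about the new length. The sketch below models that step on plain integers. It is an interpretation, not the tool's code: names are hypothetical, the precondition is checked against the length of the source literal (5 for "hello"), following alloc(dst) > len(src), and setting is_nullt afterwards is an added assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Integer view of the destination pointer s and its region. */
struct dst_vars {
    long offset;     /* s.offset */
    long alloc;      /* s.alloc */
    long len;        /* s.len */
    int  is_nullt;   /* s.is_nullt */
};

/* Models strcpy(s, literal) where the literal has src_len characters.
   Returns 0 if the inlined precondition fails, 1 after updating s. */
int ip_strcpy_literal(struct dst_vars *s, long src_len)
{
    assert(s != NULL && src_len >= 0);
    /* assert: room from s.offset to the end must exceed len(src) */
    if (!(s->offset < s->alloc && s->alloc - s->offset > src_len))
        return 0;
    /* eliminate(s.len); assume(s.len == s.offset + src_len) */
    s->len = s->offset + src_len;
    s->is_nullt = 1;                 /* assumption: result is terminated */
    return 1;
}
```

Because the callee is replaced by its specification, no interprocedural analysis of strcpy itself is needed at the call site.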
36. Handling structures
- Pointer analysis handles structures
- C2IP handles pointer arithmetic
- Generate constraint variables per field
37. Handling structures
struct person {
    char first_name[10];
    char last_name[20];
    int nameLen;
};
struct person p;
struct person *p_Ptr;
char *name;
Constraint variables: Sp.first_name.alloc, Sp.first_name.len, Sp.first_name.is_nullt, Sp.last_name, Sp.nameLen, p_Ptr.offset, name.offset
38. Handling structures
C:  struct person p;
    char *name;
    name = p.last_name;
IP: name.offset = offsetof(last_name, person)
40. Integer Analysis
- Interval analysis is not enough
- assert(-i ≤ cp.offset < -i + sbuf.len)
- Use a powerful abstract domain
- Polyhedra (Cousot & Halbwachs, 78)
- Statically analyzes program variable relations and detects constraints a1*var1 + a2*var2 + ... + an*varn ≤ b
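Why intervals are not enough can be shown with two variables whose ranges overlap. An interval domain keeps only a range per variable, so it can prove x ≤ y only when the ranges do not overlap; a relational domain keeps the constraint between the variables itself. The sketch below is a toy illustration with hypothetical names, and it uses a single difference constraint as a stand-in for the much richer polyhedra domain.

```c
#include <assert.h>

/* Non-relational abstraction: one range per variable. */
struct interval { long lo, hi; };

/* Intervals prove x <= y only if max(x) <= min(y). */
int interval_proves_le(struct interval x, struct interval y)
{
    assert(x.lo <= x.hi && y.lo <= y.hi);
    return x.hi <= y.lo;
}

/* Relational abstraction: the single linear constraint x - y <= c. */
struct diff_bound { long c; };

/* The relation proves x <= y whenever c <= 0. */
int relation_proves_le(struct diff_bound d)
{
    return d.c <= 0;
}
```

If i and sbuf.len both range over [0, 1023], the interval test fails even when the program guarantees i ≤ sbuf.len, whereas the relational fact i - sbuf.len ≤ 0 settles it directly. Asserts such as the one on cp.offset and i above need exactly this kind of relation.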
41. Linear Relation Analysis
- Statically analyzes program variable relations and detects constraints a1*var1 + a2*var2 + ... + an*varn ≤ b
- Polyhedron
42. Integer Analysis: insert_long()
buf.offset = 0
temp.offset = 0
0 ≤ cp.offset = i
i ≤ sbuf.len < sbuf.msize
sbuf.msize = 1024
stemp.msize = 1024
[Figure: temp with "(long)" written at index i]
assert(0 ≤ i < stemp.msize - 6);  // strcpy(&temp[i], "(long)")
Potential violation when cp.offset ≥ 1018
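The threshold 1018 follows from the buffer size and the length of "(long)". A minimal sketch that evaluates the assert for a given cp.offset, assuming (as the invariant states) that i equals cp.offset when the loop exits and that stemp.msize is 1024; the function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define STEMP_MSIZE 1024L   /* stemp.msize from the analysis result */

/* Returns 1 if the assert guarding strcpy(&temp[i], "(long)") holds
   for the given cp.offset, 0 if a violation is possible. */
int first_strcpy_assert_holds(long cp_offset)
{
    const long tag_len = (long)strlen("(long)");
    long i = cp_offset;               /* loop exit: i == cp.offset */

    assert(tag_len == 6);
    return 0 <= i && i < STEMP_MSIZE - tag_len;
}
```

At offset 1017 the seven bytes of "(long)" plus its terminator occupy temp[1017] through temp[1023] and still fit; at 1018 the terminator would land at index 1024, one past the end.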
44. CSSV
[Figure: CSSV architecture for a leaf procedure, with labels Pointer Analysis, Procedure's pointer info, Leaf procedure, C2IP, side effect, Mod and Integer proc]
45. CSSV
[Figure: the same architecture extended with Pre, Integer Analysis and Potential Error Messages]
46. AWP
- Approximate the Weakest Precondition
- Backward integer analysis
- Generates a precondition
47. AWP: insert_long()
- Generate the following precondition
- sbuf.is_nullt ∧
- sbuf.len < sbuf.alloc ∧
- 0 ≤ cp.offset ≤ sbuf.len ∧
48. AWP: insert_long()
- Generate the following precondition
- string(cp) ∧
- sbuf.len ≤ cp.offset + 1017
- Not the weakest precondition
- string(cp) ∧
- sbuf.len ≤ 1017
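The bound 1017 can be checked by hand from the strcpy requirement alloc(dst) > len(src), applied to the second copy in insert_long(), strcpy(&temp[i + 6], cp). The derivation below is a sketch under three assumptions that are not stated on this slide: cp points into buf, so len(cp) = sbuf.len - cp.offset; i = cp.offset when the loop exits; and stemp.msize = 1024.

```latex
\begin{align*}
\mathit{alloc}(\&\mathit{temp}[i+6]) &> \mathit{len}(\mathit{cp}) \\
\mathit{stemp.msize} - (i + 6) &> \mathit{sbuf.len} - \mathit{cp.offset} \\
1024 - \mathit{cp.offset} - 6 &> \mathit{sbuf.len} - \mathit{cp.offset}
  && (i = \mathit{cp.offset}) \\
1018 &> \mathit{sbuf.len} \\
\mathit{sbuf.len} &\le 1017
\end{align*}
```

The offset cancels on both sides, which is why the final bound constrains sbuf.len alone.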
49. Implementation
- Using
- ASToolKit (Microsoft)
- GOLF (Microsoft, Manuvir Das)
- New Polka (IMAG, Bertrand Jeannet)
- Main steps
- Simplifier
- Pointer analysis
- C2IP
- Integer Analysis
50. Implementation: step 1
[Figure: step 1, with labels Procedure name, C, Inline Annotation and Simplifier]
51. Core C
- Simplify the analysis implementation
- A limited form of C expressions
- Adds temporaries
- At most one operator per statement
- Convert value into location computation
- No loss of precision
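The effect of the simplifier can be pictured on one call from insert_long(). The sketch below shows an original statement next to a hand-simplified version with temporaries and at most one operator per statement. It illustrates the idea only: the function names and temporaries t1 and t2 are hypothetical, and the actual Core C output may differ.

```c
#include <assert.h>
#include <string.h>

/* Original form: address computation nested inside the call. */
void copy_after_tag(char *temp, int i, const char *cp)
{
    assert(i >= 0);
    strcpy(&temp[i + 6], cp);
}

/* Simplified form: temporaries, one operator per statement, and the
   array element address turned into an explicit location computation. */
void copy_after_tag_simplified(char *temp, int i, const char *cp)
{
    int t1;
    char *t2;

    assert(i >= 0);
    t1 = i + 6;
    t2 = temp + t1;
    strcpy(t2, cp);
}
```

Both functions write the same bytes, so the analysis can work on the simpler statements without losing information about the original program.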
52. Implementation: step 2
[Figure: step 2, with labels GOLF pointer analysis, Procedure name, GFC, visible variables, Global pointer info, Procedure's pointer projection and Procedure's pointer info]
53. Implementation: steps 3 and 4
[Figure: steps 3 and 4, with labels GFC, C2IP, Procedure name, Integer Program, Modular pointer info, Integer Analysis (forward and backward) and Potential Error Messages]
54. Preliminary results (web2c)
Proc                    lines  CoreC lines  time (sec)  space (MB)  errors  false alarms
insert_long             14     64           2.0         13          2       0
fprintf_pascal_string   10     25           0.1         0.3         2       0
space_terminate         9      23           0.1         0.2         0       0
external_file_name      14     28           0.2         1.7         2       0
join                    15     53           0.6         5.2         2       1
remove_newline          25     105          0.6         4.6         0       0
null_terminate          9      23           0.1         0.2         2       0
- Up to four times faster than SAS'01
55. Preliminary results (EADS/RTC_Si)
Proc                    lines  CoreC lines  time (sec)  space (MB)  errors  false alarms
FiltrerCarNonImp        19     34           1.6         0.5         0       0
SkipLine                12     42           0.8         1.9         0       0
StoreIntInBuffer        37     134          7.9         21          0       0
56. Status
- Implemented
- Simplifier
- GFC
- Procedures pointer analysis
- C2IP excluding structures
- AWP excluding side effect
- TBD
- Structures
- Inline specification
- Side effect analysis
- More applications
57. Conclusion
- Static checking for string errors is feasible!
- Can show the absence of string errors in complicated string manipulation procedures
- Identified rare bugs
- Techniques used
- Modular analysis (assume/guarantee)
- Pointer analysis
- Integer analysis
- Open questions
- Can this be fully automated?
- Extension to handle dynamic allocations (ITVLA)