Using System-Specific Compiler Extensions to Find Errors in Systems Code

Learn more at: http://web.stanford.edu

1
Using System-Specific Compiler
Extensions to Find Errors in Systems Code
Talk about how to find hundreds of errors by just
putting a little bit of system-specific knowledge
in a compiler.

Dawson Engler
Ben Chelf, Andy Chou, Seth Hallem
Stanford University

2
Checking systems software
Never do X (do not use floating point; do not allocate
large variables on the 6K kernel stack).
Always do X before/after Y (acquire lock before use,
release after use).

Systems software has many ad hoc restrictions:
acquire lock L before accessing shared variable X
do not allocate large variables on the 6K kernel stack
Error = crashed system. How to find errors?
Formal verification
+ rigorous
- costly, expensive. Very rarely done for software
Testing
+ simple, few false positives
- requires running code; doesn't scale; can be impractical
Manual inspection
+ flexible
- erratic; doesn't scale well
What to do?

Formal verification is practically intractable: far too strenuous,
and even if you do it, the spec isn't the code.
The method of choice if you build systems for money: test.
Problem: O(paths), which is exponential in the length of the code.
So if you build systems, what you wind up with is a system that
only crashes after a week. Further, mapping a crash back to its
cause can be really hard.
Inspection is dead: modify the code, and you have to do it again.
3
Another approach

Observation: rules can be checked with a compiler
scan source for relevant acts; check if they make sense
E.g., to check "disabled interrupts must be re-enabled":
scan for calls to disable()/enable(), check that they match
and are not done twice
Main problem:
compiler has machinery to automatically check, but not knowledge
implementor has knowledge but not machinery
Meta-level compilation (MC):
give implementors a framework to add easily-written,
system-specific compiler extensions

Want to make compilers aggressively system-specific. If you
design a system or interface and see a use of it, you invariably
see ways of doing it better. MC gives you a way to articulate this
knowledge and have the compiler do it for you automatically.
4
Meta-level compilation (MC)
Actual error in the Linux raid5 driver: it disables
interrupts, and then if it fails to allocate a
buffer, returns with them disabled. This kernel
deadlock is actually hidden by an immediate
segmentation fault, since the callers dereference
the pointer without checking for NULL.

Implementation
Extensions dynamically linked into GNU g++ compiler
Applied down all paths in input program source
E.g., 64-line extension to check disable/enable (82 bugs)
Static detection of real errors in real systems
600 bugs in Linux, OpenBSD, FLASH, Xok exokernel
most extensions < 100 lines, written by system outsiders

save(flags);
cli();
if (!(buf = alloc()))
    return NULL;
restore(flags);
return buf;
[Figure: Linux raid5.c as input to the GNU C compiler]
The main results: it works really well, and it's easy.
5
A bit more detail
{ #include "linux-includes.h" }
sm chk_interrupts {
    decl { unsigned } flags;

    // named patterns
    pat enable  = { sti(); } | { restore_flags(flags); };
    pat disable = { cli(); };

    // states
    is_enabled:
        disable ==> is_disabled
      | enable  ==> { err("double enable"); };
    is_disabled:
        enable  ==> is_enabled
      | disable ==> { err("double disable"); }
      | $end_of_path$ ==> { err("exiting w/intr disabled!"); };
}
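
To make the two-state checker above concrete, here is a minimal sketch in plain C of the same state machine run over the events of a single path. It is an illustration only, not the MC implementation: the event encoding, the enum names, and the function check_path are all hypothetical, and the real system matches patterns in the compiler's view of the source rather than in a prebuilt array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding of the pattern matches found along one path. */
enum event { EV_ENABLE, EV_DISABLE };

/* Checker states, mirroring is_enabled / is_disabled on the slide. */
enum state { IS_ENABLED, IS_DISABLED };

/* Diagnostics, one per err() call in the slide's checker. */
enum diag { DIAG_OK, DIAG_DOUBLE_ENABLE, DIAG_DOUBLE_DISABLE, DIAG_EXIT_DISABLED };

/* Run the state machine over one path and return the first violation,
 * or DIAG_OK if the path is clean. */
enum diag check_path(const enum event *ev, size_t n)
{
    assert(ev != NULL || n == 0);
    enum state s = IS_ENABLED;              /* the first state is the start state */
    for (size_t i = 0; i < n; i++) {
        if (s == IS_ENABLED) {
            if (ev[i] == EV_ENABLE)
                return DIAG_DOUBLE_ENABLE;
            s = IS_DISABLED;
        } else {
            if (ev[i] == EV_DISABLE)
                return DIAG_DOUBLE_DISABLE;
            s = IS_ENABLED;
        }
    }
    /* Equivalent of the end-of-path pattern. */
    return s == IS_DISABLED ? DIAG_EXIT_DISABLED : DIAG_OK;
}
```

The failing path in the raid5 fragment contains a single disable event (cli()) before the early return, so check_path reports DIAG_EXIT_DISABLED for it.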

6
X before Y rule: system call pointers

Applications are evil
OS must check all input pointers before use
one missing check = security hole
MC checker
Bind syscall pointers to tainted state
tainted vars only touched with safe routines
or explicit check to make clean

[State diagram: each input pointer p starts as p.tainted. copyin(p) and copyout(p) are safe on a tainted pointer, check(p) makes it clean, and use(p) while tainted is an error.]

/* from sys/kern/disk.c */
int sys_disk_request(struct buf *reqbp, u_int k)
...
    /* bypass for direct scsi commands */
    if (reqbp->b_flags & B_SCSICMD)
        return sys_disk_scsicmd(sn, k, reqbp);
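
The tainted-pointer rule can be sketched in a few lines of C. This is a hedged illustration, assuming a path has already been reduced to a list of operations on one pointer; the enum names and first_unsafe_use are hypothetical and are not part of MC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations on one system call pointer p along a path. */
enum op {
    OP_CHECK,   /* explicit check(p): makes p clean          */
    OP_COPY,    /* copyin(p) / copyout(p): safe on tainted p */
    OP_USE      /* direct use such as p->field or *p         */
};

enum taint { TAINTED, CLEAN };

/* Return the index of the first direct use of a still-tainted pointer,
 * or -1 if the path never uses the pointer unsafely. */
int first_unsafe_use(const enum op *ops, size_t n)
{
    assert(ops != NULL || n == 0);
    enum taint t = TAINTED;             /* syscall pointers start tainted */
    for (size_t i = 0; i < n; i++) {
        switch (ops[i]) {
        case OP_CHECK:
            t = CLEAN;
            break;
        case OP_COPY:
            break;                      /* safe routine, state unchanged */
        case OP_USE:
            if (t == TAINTED)
                return (int)i;
            break;
        }
    }
    return -1;
}
```

In the sys_disk_request fragment, reqbp->b_flags is a direct use with no earlier check, which corresponds to a path whose first operation is OP_USE.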
7
Deriving specification from common usage
Drivers pull them from all sorts of random
locations

Problem: it is difficult to specify all user pointers,
so see what code usually does; deviations are probably errors
if code ever passes a pointer to a paranoid routine, make sure
it always does
Found 5 security errors in Linux.
Canonical example: hole in an ioctl routine for
some obscure device driver.

/* drivers/usb/evdev.c */
static int evdev_ioctl(..., unsigned long arg)
...
    switch (cmd) {
    case EVIOCGVERSION:
        return put_user(EV_VERSION, (__u32 *) arg);
    case EVIOCGID:
        /* copy_to_user(to, from)! */
        return copy_to_user(&dev->id, (void *) arg,
                            sizeof(struct input_id));
    }
8
Kernel alloc/dealloc rules

Must check that alloc succeeded
Must allocate enough space
Must not use after free()
Must free alloc'd object on error

/* from drivers/char/tea6300.c */
client = kmalloc(sizeof *client, GFP_KERNEL);
if (!client)
    return -ENOMEM;
...
tea = kmalloc(sizeof *tea, GFP_KERNEL);
if (!tea)
    return -ENOMEM;     /* error path leaks client */
...
MOD_INC_USE_COUNT;      /* bonus bug: kmalloc could sleep */
47 error leaks, 48 false positives. 72
mod_inc/mod_dec errors
9
Stripped-down kernel malloc/free checker
Interesting since it's so small: a few lines of
code and we have a path-sensitive, high-level
analysis pass.
decl { scalar } sz;           // match any scalar
decl { const int } retv;      // match const ints
state decl { any_ptr } v;     // match any pointer, can bind to a state

// Bind malloc results to unknown until observed
start:
    { v = (any)malloc(sz) } ==> v.unknown
  | { free(v) } ==> v.freed;

// can compare in states unknown, null, not_null
v.unknown, v.null, v.not_null:
    { (v == 0) } ==> true = v.null, false = v.not_null
  | { (v != 0) } ==> true = v.not_null, false = v.null;

// Cannot reach error path with unknown or not-null
v.unknown, v.not_null:
    { return retv; } ==>
        { if (mgk_int_cst(retv) < 0) err("Error path leak!"); };

// No dereferences of null, freed, or unknown ptrs.
v.null, v.freed, v.unknown:
    { *(any *)v } ==> { err("Using ptr illegally!"); };
10
Some amusing bugs

No check (130 errors, 11 false pos). Worst case
(many uses)
use after free (14 errors, 3 false pos) 5
cut&paste of
wrong size (2 errors)

/* include/linux/coda_linux.h: CODA_ALLOC */
ptr = (cast)vmalloc((unsigned long) size);
...
if (ptr == 0)
    printk("kernel malloc returns 0\n");
memset(ptr, 0, size);

/* drivers/isdn/pcbit: pcbit_init_dev */
kfree(dev);
iounmap((unsigned char *)dev->sh_mem);
release_mem_region(dev->ph_mem, 4096);

/* drivers/parport/daisy.c: add_dev:50 */
newdev = kmalloc(GFP_KERNEL, sizeof(struct daisydev));
11
In context Y, don't do X: blocking
400 lines later we have this violation. This is a
common pattern: the implementor just doesn't know
about the rule, so keeps violating it. This happens
because rules are manually enforced and poorly
documented.

Linux: if interrupts are disabled, or a spin lock is
held, do not call an operation that could block.
MC checker
Compute transitive closure of all potentially
blocking functions
On hitting disable/lock: warn of any calls to them
123 errors, 8 false pos

[State diagram: from the clean state, lock(l) or disable() enters a state where blocking is forbidden, and unlock(l) or enable() returns to clean; a blocking call in the forbidden state is an error.]

/* drivers/net/pcmcia/wavelan_cs.c */
spin_lock_irqsave(&lp->lock, flags);              /* 1889 */
switch (cmd)
...
case SIOCGIWPRIV:
    ...
    if (copy_to_user(wrq->u.data.pointer, ...))   /* 2305 */
        ret = -EFAULT;
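
The "transitive closure of all potentially blocking functions" step can be sketched as a fixed-point iteration over a call graph. The following C is a minimal illustration under assumed inputs: the adjacency matrix, the MAX_FN bound, and the name mark_blocking are hypothetical, and the talk does not say how MC represents the call graph.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FN 8    /* arbitrary bound for this sketch */

/* calls[i][j] is true when function i directly calls function j.
 * On entry, blocks[i] is true only for functions known to block.
 * On return, blocks[i] is true for every function that can reach a
 * blocking function through any chain of calls. */
void mark_blocking(bool calls[MAX_FN][MAX_FN], bool blocks[MAX_FN], int n)
{
    assert(n >= 0 && n <= MAX_FN);
    bool changed = true;
    while (changed) {                   /* iterate until nothing new is marked */
        changed = false;
        for (int i = 0; i < n; i++) {
            if (blocks[i])
                continue;
            for (int j = 0; j < n; j++) {
                if (calls[i][j] && blocks[j]) {
                    blocks[i] = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}
```

Once the set is computed, the checker only needs the two-state machine from the diagram: after a lock or disable, any call to a function with blocks[i] set is reported.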
12
Example: statically checking assert
Another way to use MC is to push dynamic checks
to static ones. Code usually has some amount of dynamic
type checking, where a series of if statements at the
beginning of a routine checks for error conditions.
So just pull these into the compiler and check them.

Assert(x) is used to check x at runtime. Abort if
false
compiler is oblivious, so cannot analyze statically
Use MC to build an assert-aware extension
Result: found 5 errors in FLASH.
Common: code cut-and-pasted from other context
Manual detection questionable: 300-line path
explosion between violation and check
General method to push dynamic checks to static

msg.len = 0;
...
assert(msg.len != 0);
[assert checker] line 211: assert failure!
13
Result overview
High bits: small number of LOC, big bug count,
2:1 ratio of false positives
(others)
14
Conclusions
Given a set of uses of some interface you've
built, you invariably see better ways of doing
things. This gives you a way to articulate this
knowledge and have the compiler do it for you
automatically. Let one person do it.

MC goal: make programming much more powerful
How: raise compilation from the level of the programming
language to the meta level of the systems
implemented in that language
MC works well in real, heavily tested systems
We found bugs in every system we've looked at.
Over 600 bugs in total, many capable of crashing
the system
Easily written by people unfamiliar with the checked
system
Currently:
making correctors, using domain knowledge to
extract verifiable specs, deriving errors by
usage deviations, performing meta-level
optimization

15
Conclusions

Meta-level compilation
Make compilers aggressively system-specific
Easy: digest sentence fragment, write
checker/optimizer.
Result: static, precise, immediate error
diagnosis
As outsiders, found errors in every system
looked at
Over 600 bugs, many capable of crashing the system
Currently:
making correctors, using domain knowledge to
extract verifiable specs, deriving errors by
usage deviations, performing meta-level
optimization

16
Bugs as deviant behavior
See system call polarity (return < 0 or > 0 on
error): found 8 places in Linux. If code ever checks a
routine for failure, make sure it always checks for
failure. See what code does after calling a
security check (suser): if it does something a lot and
then deviates, whine. Weakness: misses rules that code
always gets wrong and never gets right.

One way to find bugs:
have a deep understanding of code semantics,
detect when code makes no sense. Hard.
Easier:
see what code usually does; deviations are probably
bugs
x protected by lock(a) 1000 times, by lock(b)
once: probably an error
Find inverses by looking for common pairings
More general: derive temporal orderings. Use
machine learning to derive more sophisticated
patterns?

lock(a) x unlock(a)
lock(b) x unlock(b)
lock(a) x unlock(a)
lock(a) x unlock(a)
lock(a) x unlock(a)
lock(a) x unlock(a)
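
The counting idea behind "x protected by lock(a) 1000 times, by lock(b) once" can be sketched as follows. This is an illustrative C fragment, not the authors' method: the per-lock counts are assumed to have been gathered already, and the ratio threshold, MAX_LOCKS, and find_deviant_lock are hypothetical choices made for the example.

```c
#include <assert.h>

#define MAX_LOCKS 4    /* arbitrary bound for this sketch */

/* uses[k] counts how often the variable was accessed while holding lock k.
 * Return the index of a lock that protects the variable at least once but
 * at most 1/ratio as often as the most common lock, or -1 if none does. */
int find_deviant_lock(const int uses[MAX_LOCKS], int ratio)
{
    assert(ratio > 0);
    int best = 0;
    for (int k = 1; k < MAX_LOCKS; k++)
        if (uses[k] > uses[best])
            best = k;                   /* most common lock = inferred rule */
    for (int k = 0; k < MAX_LOCKS; k++)
        if (k != best && uses[k] > 0 && uses[k] * ratio <= uses[best])
            return k;                   /* rare pairing: probably a bug */
    return -1;
}
```

With the six uses drawn on the slide (five under lock(a), one under lock(b)) and a ratio of 5, the function flags lock(b). When two locks are used equally often there is no dominant pattern and nothing is reported.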
17
What to do when static analysis too weak?

Static analysis works in some cases, not well in
others
hits undecidable problems with loop termination
conditions, data values, pointers
Alternative:
use domain-specific slicing to extract spec from
code
run through verifier
Main lever: a little domain knowledge goes a long
way
e.g., strip out Linux TCP finite-state machine by
keying off of variable sk->state
Real example: checking FLASH code

18
Extracting specs from FLASH code
Deeply nested control structures (29 conditional
compilation directives, 21 if statements)

Embedded software for cache coherence in FLASH machine
errors crash or deadlock machine; can take a week
to track down
typical protocol: 18K lines of hairy C code
Extract specifications from source by simple
slicing
found 9 errors in code
despite 5 years of heavy testing and formal
verification!
How?
Given list of data structure fields and message
operations, slice out all relevant operations
Compose with (manual) specification boilerplate
run through Murphi model checker
Levers: aliasing and globals are present, but in a stylized
way that we can mostly ignore. 4 loops in code.

19
FLASH vs Murphi
FLASH:
HANDLER_GLOBALS(header.nh.len) = LEN_CACHELINE;
if (!HANDLER_GLOBALS(h.hl.Pending))
  if (!HANDLER_GLOBALS(h.hl.Dirty))
    ASSERT(!HANDLER_GLOBALS(h.hl.IO));
    PI_SEND(F_DATA, F_FREE, F_SWAP,,);
    HANDLER_GLOBALS(h.hl.Local) = 1;
    /* ... deleted 14 lines */
  else
    ASSERT(!HANDLER_GLOBALS(h.hl.List));
    ASSERT(!HANDLER_GLOBALS(h.hl.RealPtrs));

Murphi:
nh.len := len_cacheline;
if ((DH.Pending = 0)) then
  if ((DH.Dirty = 0)) then
    assert(nh.len != len_nodata);
    mbResult := pi_send_func(src, PI_Putt);
    DH.Local := 1;
  else
    assert((DH.List = 0));
    assert((DH.RealPtrs = 0));
20
Checkers into Correctors
Malloc: can insert null pointer check (if you can
figure out the right value to return) or can
preallocate. Hoist mod_inc/dec above sleeping
operations.
Empirical observation: we have sent out hundreds
of bug reports. 20-30 have gotten fixed.

Problem: big system, lots of bugs
may not be your system, or take too long to fix
manually
Can turn some classes of checkers into
correctors
"Do not allocate large variables on kernel
stack": if you hit a violation, rewrite code to
dynamically allocate the variable
"Do not call blocking memory allocator with
interrupts disabled": hoist allocation out
"On error paths, roll back side effects":
dynamically track what these are, and reverse them.
Interesting: trade dynamic checks for simplicity

21
MC optimization

Optimization rules are similar to checking
if data is not shared with interrupt handlers,
protect using spin locks rather than interrupt
disable/enable
to save an instruction when setting a message
opcode, xor it with the new and old:
msg.opcode ^= (new ^ old)
replace quicksort with radix sort when sorting
integers
Common rule: "In situation X, do Y rather than
Z"
if a variable is not modified, protect using
read locks
and with a few lines, change an optimization into a checker
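
The opcode rule relies on the identity old ^ (old ^ new) == new: when the current opcode is known to be old, and old and new are both known at compile time, one XOR with a constant replaces the usual clear-then-set pair. The C sketch below illustrates the arithmetic only; the 8-bit field layout, OPCODE_MASK, and both function names are assumptions made for the example, not details from the talk.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: opcode in the low 8 bits of a header word. */
#define OPCODE_MASK 0xffu

/* General update: clear the field, then OR in the new value (two operations). */
uint32_t set_opcode(uint32_t hdr, uint32_t new_op)
{
    return (hdr & ~OPCODE_MASK) | (new_op & OPCODE_MASK);
}

/* Specialized update: valid only when the current opcode is known to be
 * old_op. With constant arguments, (old_op ^ new_op) folds to a constant,
 * leaving a single XOR at run time. */
uint32_t set_opcode_xor(uint32_t hdr, uint32_t old_op, uint32_t new_op)
{
    return hdr ^ ((old_op ^ new_op) & OPCODE_MASK);
}
```

For a header 0xabcd0012 with opcode 0x12, switching to 0x34 XORs the word with 0x26 and yields 0xabcd0034, the same result as the general update. If the opcode is not actually old_op the result is wrong, which is why the same rule can double as a checker.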

22
MC analysis vs. traditional compiler analysis

Meaning is more apparent: domain-specific knowledge
Easier to bound side effects: use knowledge of
abstract state to ignore many concrete actions
Aliasing is less of a problem
typically opaque handles vs normal mess of
pointers
Operations are more coarse-grained
read()/write() vs load/store; matrix ops vs +/-

Bigint a, b, c;
set(a, 3);
mul(b, a, a);
mul(c, b, b);
printf(s, bigint_to_str(c));
Interfaces let compilers treat the implementation as
a black box, just like programmers!
optimized to: printf(81)
Could imagine following bunches of pointers and
memory allocation; instead, we know that it does
a +, - or whatever, and we can ignore it. Similarly
for ignoring pointers.
Many compiler problems are hard because they are static. Not
a problem with runtime checking. Less control is
needed: pushing around function calls, rather
than doing register allocation.
