An Overview of the Saturn Project

Slides: 51
Provided by: yiche2
1
An Overview of the Saturn Project
2
The Three-Way Trade-Off
  • Precision
  • Modeling programs accurately enough to be useful
  • Scalability
  • Saying anything at all about large programs
  • Human Effort
  • How much work must the user do?
  • Either giving specifications, or interpreting
    results

Today's focus
Not so much about this . . .
3
Precision
Primary abstraction is done at function
boundaries.
Intraprocedural analysis with minimal abstraction.
[Figure: int f(int x) { . . . } with its formula Ff; abstractions A(Ff), A(Fg), A(Fh) at function boundaries]
4
Scalability
  • Design constraint
  • SAT formula size ~ function size
  • Analyze one function at a time
  • Parallel implementation
  • Server sends functions to clients to analyze
  • Typically use 50-100 cores to analyze Linux

5
Summaries
  • Abstract at function boundaries
  • Compute a summary of each function's behavior
  • Summaries should be small
  • Ideally linear in the size of the function's
    interface
  • Summaries are our primary form of abstraction
  • Saturn delays abstraction to function boundaries
  • Slogan: Analysis design is summary design!

6
Expressiveness
  • Analyses written in Calypso
  • Logic programs
  • Express traversals of the program
  • E.g., backwards/forwards propagation
  • Constraints
  • For when we don't know traversal order
  • Written 40,000 lines of Calypso code

7
Availability
  • An open source project
  • BSD license
  • All Calypso code available for published
    experiments
  • saturn.stanford.edu

8
People
Isil Dillig
Suhabe Bugrara
Thomas Dillig
Peter Hawkins
9
Outline
  • Saturn overview
  • An example analysis
  • Intraprocedural
  • Interprocedural
  • What else can you do?
  • Survey of results

10
Saturn Architecture
C Program
11
Parsing and C Frontend
Source Code
Build Interceptor
Preprocessed Source Code
CIL frontend
Abstract Syntax Tree Databases
12
Calypso
  • General purpose logic programming language
  • Pure
  • Prolog-like syntax
  • Bottom-up evaluation
  • Magic sets transformation
  • Also a (minor) moon of Saturn
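
Bottom-up evaluation can be illustrated outside Calypso. The following is a minimal Python sketch, not Calypso code and not Saturn's interpreter: it naively evaluates two Datalog-style rules until no new facts appear. The predicate names edge and path and the function bottom_up are hypothetical, and the magic sets transformation is not shown.

```python
def bottom_up(edges):
    """Naive bottom-up evaluation of the two rules
         path(X,Y) :- edge(X,Y).
         path(X,Z) :- path(X,Y), edge(Y,Z).
    starting from the known edge facts and deriving until fixpoint.
    """
    path = set(edges)  # first rule: every edge is a path
    while True:
        # second rule: extend every known path by one edge
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        if derived <= path:  # nothing new: fixpoint reached
            return path
        path |= derived


facts = bottom_up({("a", "b"), ("b", "c")})
```

Facts are derived from what is already known rather than by searching backwards from a query, which is the sense of "bottom-up" here.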

13
Helpful Features
  • Strong static type and mode checking
  • Permanent data (sessions)
  • stored as Berkeley DB databases
  • Sessions are just a named bundle of predicates
  • Support for unit-at-a-time analysis

14
Extensible Interpreter
[Figure: the logic program interpreter and its extension packages: SAT solver (sat predicate), LP solver, DOT graph package, UI package]
15
Scalability
  • Interpreter is not very efficient
  • OK, it's slow
  • But can run distributed analyses
  • 50-100 CPUs
  • Scalability is more important than raw speed
  • Can run intensive analyses of the entire Linux
    kernel (>6 MLOC) in a few hours.

16
Cluster Architecture
[Figure: a master node with the databases, connected to worker nodes 1 through 100, each shown with a Calypso DB]
17
Job Scheduling
Job = a function body
Dynamically track dependencies between jobs
  • Rerun jobs if new dependencies found
  • Optimistic concurrency control

Iterate to fixpoint for circular dependencies
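
The scheduling idea can be sketched as a sequential worklist loop. This is a minimal Python simulation under stated assumptions, not Saturn's distributed scheduler: run_to_fixpoint, analyze, CALLS, and the toy "reachable callees" summary are all hypothetical names chosen for illustration. A job records a dependency whenever it reads another function's summary, and it is rerun when that summary later changes.

```python
from collections import deque


def run_to_fixpoint(functions, analyze):
    """Run one job per function, rerunning jobs whose inputs changed."""
    summaries = {}
    dependents = {name: set() for name in functions}
    worklist = deque(functions)
    queued = set(functions)
    runs = 0
    while worklist:
        job = worklist.popleft()
        queued.discard(job)
        runs += 1

        def get_summary(callee, _job=job):
            # Dependencies are discovered dynamically, as summaries are read.
            dependents.setdefault(callee, set()).add(_job)
            return summaries.get(callee, frozenset())  # optimistic default

        new = analyze(job, get_summary)
        if summaries.get(job) != new:
            summaries[job] = new
            for dep in dependents.get(job, ()):
                if dep not in queued:  # rerun jobs that used the old summary
                    queued.add(dep)
                    worklist.append(dep)
    return summaries, runs


# Toy call graph with a cycle between f and g (an assumption for the demo).
CALLS = {"f": ["g"], "g": ["h", "f"], "h": []}


def analyze(name, get_summary):
    """Toy summary: the set of functions reachable through calls."""
    reach = set(CALLS[name])
    for callee in CALLS[name]:
        reach |= get_summary(callee)
    return frozenset(reach)


summaries, runs = run_to_fixpoint(list(CALLS), analyze)
```

The loop terminates here because the toy summaries only grow and are bounded by the set of function names; the circular dependency between f and g is resolved by rerunning f once g's summary changes.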
18
Calypso Analyses
C Syntax Predicates
Constraint Solvers
19
The Paradigmatic Locking Analysis
  • Check that a thread does not
  • acquire the same lock twice
  • release the lock twice
  • Otherwise the application may deadlock or crash.

20
Specification
[Figure: lock state machine]
unlocked --lock--> locked
locked --unlock--> unlocked
locked --lock--> error
unlocked --unlock--> error
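
The specification is a small finite state machine, which can be written down as a transition table. The following Python sketch is illustrative only; TRANSITIONS, step, and run are hypothetical names, and treating error as an absorbing state is an assumption that the slide does not state.

```python
# (state, operation) -> next state, following the lock specification.
TRANSITIONS = {
    ("unlocked", "lock"): "locked",
    ("locked", "unlock"): "unlocked",
    ("locked", "lock"): "error",      # acquiring the same lock twice
    ("unlocked", "unlock"): "error",  # releasing a lock that is not held
}


def step(state, op):
    """Apply one lock or unlock operation to a lock state."""
    if state == "error":
        return "error"  # assumption: once in error, stay in error
    return TRANSITIONS[(state, op)]


def run(state, ops):
    """Apply a sequence of operations, returning the final state."""
    for op in ops:
        state = step(state, op)
    return state
```

For example, run("unlocked", ["lock", "unlock"]) ends in unlocked, while any double lock or double unlock ends in error.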
21
Basic Setup
  • We assume
  • one locking function lock(l)
  • one unlocking function unlock(l).
  • We analyze one function at a time
  • produce locking summary describing the FSM
    transitions associated with a given lock.

22
An Example Function Summary
f( . . ., lock L, . . .) { lock(L); . . . unlock(L); }
L: unlocked -> unlocked, locked -> error
  • Summaries are input state -> output state
  • The net effect of the function on the lock
  • Summary size is independent of function size
  • Bounded by the square of the number of states
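
A summary of this kind can be viewed as a relation between input and output states, and the summary of a straight-line sequence of calls is the composition of the callees' relations. The Python sketch below is a simplified illustration, not Saturn's representation: LOCK, UNLOCK, and compose are hypothetical names, guards are ignored, and the error -> error pairs are an added assumption so that errors persist through later calls.

```python
STATES = ("locked", "unlocked", "error")

# Summaries as sets of (input state, output state) pairs.
LOCK = {("unlocked", "locked"), ("locked", "error"), ("error", "error")}
UNLOCK = {("locked", "unlocked"), ("unlocked", "error"), ("error", "error")}


def compose(first, second):
    """Relational composition: run `first`, then `second`."""
    return {(s0, s2) for (s0, s1) in first for (t1, s2) in second if s1 == t1}


# Net effect of a function that calls lock(L) and then unlock(L).
f_summary = compose(LOCK, UNLOCK)
```

Because a summary is a set of state pairs, it can never hold more than len(STATES) ** 2 entries, no matter how long the function body is.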

23
Lock States
  • type lockstate ::= locked | unlocked | error.
  • Predicates to describe lock states on nodes and
    edges of the CFG
  • predicate node_state(P:pp,L:t_trace,S:lockstate,G:g_guard).
  • predicate edge_state(P:pp,L:t_trace,S:lockstate,G:g_guard).

24
The Intraprocedural Analysis
  • 1. Initialize lock states at function entry
  • 2. Join operator
  • Combine edges to produce successors node_state
  • 3. Transfer functions for every primitive
  • assignments
  • tests
  • function calls

25
Initializing a Lock
  • Use a fresh boolean variable β
  • Interpretation
  • β is true ⇒ L is locked
  • ¬β is true ⇒ L is unlocked
  • Enforces that L cannot be both locked and
    unlocked simultaneously
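
A small Python sketch of this encoding, with hypothetical names (initial_states, holding): the two entry facts are guarded by one boolean and its negation, so exactly one of them holds under any value of that boolean. A real analysis would keep the variable symbolic; here it is simply enumerated.

```python
def initial_states(lock, beta):
    """Entry facts for one lock under a concrete value of the fresh variable."""
    return [(lock, "locked", beta), (lock, "unlocked", not beta)]


def holding(states):
    """States whose guard is true."""
    return [state for (_lock, state, guard) in states if guard]


when_true = holding(initial_states("L", True))
when_false = holding(initial_states("L", False))
```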

26
Notation
(lock, state, guard) at P
At program point P, the lock is in the given state if
guard is true.
27
Initialization Rules
  • node_state(P0,L,locked,LG) :-
        entry(P0),
        is_lock(L),
        fresh_variable(L, LG).
  • node_state(P0,L,unlocked,UG) :-
        entry(P0),
        node_state(P0,L,locked,LG),
        not(LG, UG).

f( . . ., lock L, . . .) { . . . }
At entry P0: (L, locked, LG) and (L, unlocked, UG)
28
The Intraprocedural Analysis
  • 1. Initialize lock states at function entry
  • 2. Join operator
  • Combine edges to produce successors node_state
  • 3. Transfer functions for every primitive
  • assignments
  • tests
  • function calls

29
Joins
(L, locked, F1) and (L, locked, F2) join to (L, locked, F1 ∨ F2)
node_state(P,L,S,G) :- edge_state(P,L,S,_),
    \/edge_state(P,L,S,EG):or_all(EG,G).
Note: There is no abstraction in the join . . .
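
The join can be sketched in Python as grouping incoming edge facts by (lock, state) and taking the disjunction of their guards. This is an illustration under assumptions, not the Calypso implementation: guards are modeled as Python callables over a dictionary of branch conditions, and the names or_all (borrowed from the rule above), join, f1, and f2 are used hypothetically.

```python
from collections import defaultdict


def or_all(guards):
    """Disjunction of a collection of guards."""
    guards = list(guards)
    return lambda env: any(g(env) for g in guards)


def join(edge_states):
    """Combine (lock, state, guard) facts on incoming edges into node facts."""
    grouped = defaultdict(list)
    for lock, state, guard in edge_states:
        grouped[(lock, state)].append(guard)
    # No information is discarded: each guard survives inside the disjunction.
    return {key: or_all(gs) for key, gs in grouped.items()}


f1 = lambda env: env["c"]
f2 = lambda env: (not env["c"]) and env["d"]
node = join([("L", "locked", f1), ("L", "locked", f2)])
locked_guard = node[("L", "locked")]
```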
30
The Intraprocedural Analysis
  • 1. Initialize lock states at function entry
  • 2. Join operator
  • Combine edges to produce successors node_state
  • 3. Transfer functions for every primitive
  • assignments
  • function calls
  • etc.

31
Assignments
  • Assignments do not affect lock state
  • edge_state(P1,L,S,G) :-
        assign(P0,P1,_),
        node_state(P0,L,S,G).

P0: (L, S, G)  --[ X = E ]-->  P1: (L, S, G)
32
Interprocedural Analysis Basics
  • Function summaries are the building blocks of
    interprocedural analysis.
  • Generating a function summary requires
  • Predicates encoding relevant facts
  • A session to store these predicates.

33
Interprocedural Analysis Outline
  • 1. Generating function summaries
  • 2. Using function summaries
  • How do we retrieve the summary of a callee?
  • How do we map facts associated with a callee to
    the namespace of the currently analyzed function?

34
Summary Declaration
  • session sum_locking(FN:string) containing [lock_trans].
  • predicate lock_trans(L:t_trace, S0:lockstate,
    S1:lockstate).

Declares a persistent database sum_locking
(function name) holding lock_trans facts
35
Summary Generation Primitives
  • Summaries for lock and unlock
  • sum_locking("lock")->lock_trans(arg0,locked,error) :- .
  • sum_locking("lock")->lock_trans(arg0,unlocked,locked) :- .
  • sum_locking("unlock")->lock_trans(arg0,unlocked,error) :- .
  • sum_locking("unlock")->lock_trans(arg0,locked,unlocked) :- .

36
Summary Generation Other Functions
  • sum_locking(F)->lock_trans(L, S0, S1) :-
        current_function(F),
        entry(P0),
        node_state(P0, L, S0, G0),
        exit(P1),
        node_state(P1, L, S1, G1),
        and(G0, G1, G),
        guard_satisfiable(G).

F( . . ., lock L, . . .) { . . . }
P0 (entry): (L, S0, G0)
P1 (exit): (L, S1, G1)
If SAT(G0 ∧ G1), then F: S0 → S1
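
The generation rule pairs every entry fact with every exit fact and keeps the pair when the conjunction of the two guards is satisfiable. The Python sketch below illustrates that check with a brute-force enumeration standing in for a SAT solver; satisfiable, summarize, and the example facts are hypothetical, and the example guards are worked out by hand for a function that calls lock(L) and then unlock(L).

```python
from itertools import product


def satisfiable(guard, variables):
    """Brute-force stand-in for a SAT query over the named booleans."""
    return any(
        guard(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


def summarize(entry_states, exit_states, variables):
    """Emit S0 -> S1 whenever the entry and exit guards can hold together."""
    summary = set()
    for s0, g0 in entry_states:
        for s1, g1 in exit_states:
            both = lambda env, g0=g0, g1=g1: g0(env) and g1(env)
            if satisfiable(both, variables):
                summary.add((s0, s1))
    return summary


# Entry: locked under LG, unlocked under not LG.
entry = [("locked", lambda e: e["LG"]), ("unlocked", lambda e: not e["LG"])]
# Exit after lock(L); unlock(L): error if L was locked, else unlocked.
exit_ = [("error", lambda e: e["LG"]), ("unlocked", lambda e: not e["LG"])]
result = summarize(entry, exit_, ["LG"])
```

The result contains locked -> error and unlocked -> unlocked; the other two pairings are dropped because their guards contradict each other.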
37
Summary Application Rule
  • call_transfer(I, L, S0, S1, G) :-
        direct_call(I, F),
        call(P0, _, I),
        sum_locking(F)->lock_trans(CL, S0, S1),
        instantiate(s_call{I}, P0, CL, L, G).

G( . . .) { . . . F( . . .) . . . }
Before the call at P0: (L, S0, G)
F: S0 → S1
After the call: (L, S1, G)
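
Applying a summary at a call site can be sketched as a lookup: map the callee's lock name into the caller's namespace, then rewrite each matching caller state. This Python sketch is a simplification under assumptions: apply_summary and binding are hypothetical names, guards are treated as opaque labels, the instantiation step is reduced to a dictionary lookup, and locks that the callee's summary does not mention are not handled.

```python
def apply_summary(states_before, callee_summary, binding):
    """Rewrite caller lock states across a call using the callee's summary.

    states_before:  list of (lock, state, guard) facts before the call
    callee_summary: list of (callee_lock, s0, s1) transitions
    binding:        maps callee lock names (e.g. "arg0") to caller locks
    """
    after = []
    for lock, state, guard in states_before:
        for callee_lock, s0, s1 in callee_summary:
            if binding.get(callee_lock) == lock and s0 == state:
                after.append((lock, s1, guard))  # guard carried over unchanged
    return after


LOCK_SUMMARY = [("arg0", "locked", "error"), ("arg0", "unlocked", "locked")]
before = [("m", "locked", "LG"), ("m", "unlocked", "UG")]
after = apply_summary(before, LOCK_SUMMARY, {"arg0": "m"})
```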
38
Applications
  • Bug finding
  • Verification
  • Software Understanding

39
Saturn Bug Finding
  • Early work
  • Locking
  • Scalable Error Detection using Boolean
    Satisfiability. POPL 2005
  • Memory leaks
  • Context- and Path-Sensitive Memory Leak
    Detection. FSE 2005
  • Scripting languages
  • Static Detection of Security Vulnerabilities in
    Scripting Languages. 15th USENIX Security
    Symposium, 2006
  • Recent work
  • Inconsistency Checking
  • Static Error Detection Using Semantic
    Inconsistency Inference. PLDI 2007

40
Examples: Null pointer dereferences
Application       KLOC  Warnings  Bugs  False Alarms  FA Rate (%)
Openssl-0.9.8b     339        55    47             6        11.30
Samba-3.0.23b      516        68    46            19        29.20
Openssh-4.3p2      155         9     8             1        11.10
Pine-4.64          372       150   119            28        19.00
Mplayer-1.0pre8    762       119    89            28        23.90
Sendmail-8.13.8    365         9     8             1        11.10
Linux-2.6.17.1    6200       373   299            66        18.10
Total             8793       783   616           149        19.50
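
The FA Rate column is consistent with false alarms divided by the reports classified as either a bug or a false alarm, rather than by the Warnings column. That reading is inferred from the numbers and is not stated on the slide. A short Python check, with hypothetical names ROWS and fa_rate:

```python
# (application, bugs, false alarms, reported FA rate in percent)
ROWS = [
    ("Openssl-0.9.8b", 47, 6, 11.3),
    ("Samba-3.0.23b", 46, 19, 29.2),
    ("Openssh-4.3p2", 8, 1, 11.1),
    ("Pine-4.64", 119, 28, 19.0),
    ("Mplayer-1.0pre8", 89, 28, 23.9),
    ("Sendmail-8.13.8", 8, 1, 11.1),
    ("Linux-2.6.17.1", 299, 66, 18.1),
    ("Total", 616, 149, 19.5),
]


def fa_rate(bugs, false_alarms):
    """False alarms as a percentage of classified reports."""
    return 100.0 * false_alarms / (bugs + false_alarms)


mismatches = [
    name for name, bugs, fas, reported in ROWS
    if round(fa_rate(bugs, fas), 1) != reported
]
```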
41
Lessons Learned
  • Saturn-based tools improve bug-finding
  • Multiple times more bugs than previous results
  • Lower false positive rate
  • Why?
  • Sounder than previous bug finding tools
  • bit-level modeling, handling casts, aliasing,
    etc.
  • Precise
  • Fully intraprocedurally path-sensitive
  • Partially interprocedurally path-sensitive

42
Lessons Learned (Cont.)
  • Design of function summary is key to scalability
    and precision
  • Summary-based analysis only looks at the relevant
    parts of the heap for a given function
  • Programmers write functions with simple
    interfaces

43
Saturn Verification
  • Unchecked user pointer dereferences
  • Important OS security property
  • Also called probing or user/kernel pointers
  • Precision requirements
  • Context-sensitive
  • Flow-sensitive
  • Field-sensitive
  • Intraprocedurally path-sensitive

44
Current Results for Linux-2.6.1
  • 6.2 MLOC with 91,543 functions
  • Verified 616 / 627 system call arguments
  • 98.2%
  • 11 false alarms
  • Verified 851,686 / 852,092 dereferences
  • 99.95%
  • 406 false alarms

45
Preliminary Lessons Learned
  • Bug finders can be sloppy: ignore functions or
    points-to edges that inhibit scalability or
    precision
  • Soundness substantially more difficult than
    finding bugs
  • Lightweight, sparsely placed annotations
  • Have programmers add some information
  • Makes verification tractable
  • Only 22 annotations needed for user pointer analysis

46
Saturn for Software Understanding
  • A program analysis is a code search engine
  • Generic question: Do programmers ever do X?
  • Write an analysis to find out
  • Run it on lots of code
  • Classify the results
  • Write a paper . . .

47
Examples
  • Aliasing is used in very stylized ways, at least
    in C
  • Cursors into data structures
  • Parent/child pointers
  • And 7 other idioms
  • How is Aliasing Used in Systems Software? FSE
    2006
  • Do programmers take the address of function ptrs?
  • Answer: Almost never.
  • Allows simpler analysis of function pointers

48
Other Things We've Thought About
  • Shape analysis
  • We notice the lack of shape information
  • Interprocedural path-sensitivity
  • Needed for some common programming patterns
  • Proving correctness of Saturn analyses

49
Related Work
  • Lots
  • All bug finding and verification tools of the
    last 10 years
  • Particularly, though
  • Systems using logic programming (bddbddb)
  • ESP
  • Metal
  • CQual
  • Blast

50
saturn.stanford.edu