Provided by: alansu3
Learn more at: http://www.cs.umd.edu

1
Tracking Pointers with Path and Context
Sensitivity for Bug Detection in C Programs
CMSC 838Z Spring 2004
  • V. Benjamin Livshits and Monica S. Lam
  • presented by Mujtaba Ali
  • (Based on a presentation given at ACM SIGSOFT
    FSE-11, September 2003)

2
Bugs are Bad
  • Costs lots of money to fix after deployment
  • Nastiest bugs: security violations
  • Hard to discover effects of malicious attacks
  • Legal/liability issues

3
CVE Classification
  • Study of security reports in 2002
  • Source: SecurityFocus.com
[Chart: CVE classification of the reports; the portion labeled 62 is marked "Would like to address these"]

4
Motivation
  • Goal: report hard-to-find security violations
  • Errors spanning many functions and files
  • Reduce false positives
  • Applications
    • Buffer overflows
    • Format string vulnerabilities

5
Examples
  • Two security vulnerabilities
  • Buffer overrun in gzip, compression utility
  • Format string violation in muh, an IRC proxy
  • Cause: unsafe use of user-supplied data
  • gzip copies data to a statically sized buffer, which may result in an overrun
  • muh uses data as the format argument in a call to vsnprintf, so a user can maliciously embed %n into the format string

6
Buffer Overrun in gzip
gzip.c:593
    0592     while (optind < argc) {
    0593         treat_file(argv[optind++]);
    0594     }

gzip.c:716
    0704 local void treat_file(char *iname)
         ...
    0716     if (get_istat(iname, &istat) != OK) return;

gzip.c:1009 (need a model of strcpy)
    0997 local int get_istat(char *iname, struct stat *sbuf)
         ...
    1009     strcpy(ifname, iname);

gzip.c:233
    0233 char ifname[MAX_PATH_LEN]; /* input file name */
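The overrun above comes from an unchecked strcpy into a fixed-size global. As a minimal sketch of a bounded alternative (not gzip's actual fix), the helper below rejects names that do not fit. The name set_input_name and the value of MAX_PATH_LEN are assumptions made for illustration; the slide gives neither.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PATH_LEN 1024            /* assumed size; the slide does not give it */

char ifname[MAX_PATH_LEN];           /* plays the role of gzip's ifname buffer */

/* Hypothetical helper: copy a user-supplied name into ifname only if it
 * fits, including the terminating NUL. Returns 0 on success, -1 if the
 * name is too long (ifname is left unchanged in that case). */
int set_input_name(const char *iname)
{
    size_t len;

    assert(iname != NULL);
    len = strlen(iname);
    if (len >= sizeof ifname)
        return -1;                   /* would overrun: reject instead of copying */
    memcpy(ifname, iname, len + 1);  /* len + 1 also copies the NUL */
    return 0;
}
```

A caller in the position of get_istat would check the return value and report an error instead of continuing with a truncated or overflowing name.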
7
Format String Violation in muh
muh.c:839
    0838             s = ( char * )malloc( 1024 );
    0839             while( fgets( s, 1023, messagelog ) ) {
    0841                irc_notice( &c_client, status.nickname, s );
    0842             }
    0843             FREESTRING( s );

irc.c:263
    257 void irc_notice( con_type *con, char nick[], char *format, ... )
    259     va_list va;
    260     char buffer[ BUFFERSIZE ];
    261
    262     va_start( va, format );
    263     vsnprintf( buffer, BUFFERSIZE - 10, format, va );
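The violation is that a line read from a file reaches vsnprintf as the format argument. A common remedy, sketched here under assumptions, is to pass the untrusted line as data for a constant "%s" format. The functions notice and notice_line are simplified stand-ins, not muh's real code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define BUFFERSIZE 1024   /* assumed value; the slide does not give it */

/* Simplified stand-in for irc_notice: formats into a caller-supplied buffer. */
static int notice(char *out, size_t outsz, const char *format, ...)
{
    va_list va;
    int n;

    assert(out != NULL && format != NULL);
    va_start(va, format);
    n = vsnprintf(out, outsz, format, va);
    va_end(va);
    return n;
}

/* Safe pattern: the untrusted line is an argument to a fixed format, so any
 * conversion specifiers inside it are copied literally, never interpreted. */
int notice_line(char *out, size_t outsz, const char *line)
{
    return notice(out, outsz, "%s", line);
}
```

With this pattern a log line such as "50%n off %s" is written out character for character, because only the constant format is ever parsed.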
8
Easy Bugs are Boring
  • Programs have security violations despite code
    reviews and years of use
  • Common observations about hard errors
    • Errors on interface boundaries: need to follow data flow between procedures
    • Errors occur along complicated control-flow paths: need to follow long definition-use chains

9
Need for Alias Analysis
  • Both examples involved complex flow of data
  • Tracking data flow in modern languages requires
    alias analysis
  • Steensgaard's or Andersen's analysis?
    • Flow- and context-insensitive
    • Fast, but imprecise: too many false positives
  • But flow- and context-sensitive analyses do not scale

10
Tradeoff: Scalability vs. Precision
[Figure: analyses plotted by precision (low to high) against speed/scalability (slow and expensive to fast). Points shown: 3-value logic, Wilson & Lam, Andersen, Steensgaard, and this analysis.]
11
A Hybrid Analysis
  • Maintain precision selectively
  • Analyze precisely
    • Local variables
    • Function parameters
    • Global variables
    • Their dereferences and fields
    • These are essentially access paths, e.g. p.next.data
  • The rest: break into equivalence classes, represented by abstract locations
    • Recursive data structures
    • Arrays
    • Locations accessed through pointer arithmetic

12
Linearity
  • Regular assignments result in strong updates
  • Assignments to abstract memory locations result in weak updates

Strong updates:
    x = 1;        x0 = 1
    x = 2;        x1 = 2
    y = x;        y0 = x1            (x is 2)

Weak updates:
    A[i] = 1;     m0 = 1
    A[j] = 2;     m1 = φ(m0, 2)
    b = A[k];     b0 = m1            (either 1 or 2)
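The difference between the two kinds of update can be sketched with sets of possible values. This is an illustration only, not the paper's representation: ValueSet is a hypothetical bitmask in which bit v means "value v is possible".

```c
#include <assert.h>

typedef unsigned ValueSet;   /* bit v set => value v is possible (v < 32) */

static ValueSet singleton(unsigned v)
{
    assert(v < 32);
    return 1u << v;
}

/* Strong update: the location is definitely overwritten, so the old
 * values are discarded. */
ValueSet strong_update(ValueSet old, unsigned v)
{
    (void)old;
    return singleton(v);
}

/* Weak update: the store may or may not hit this location, so the old
 * values stay possible alongside the new one. */
ValueSet weak_update(ValueSet old, unsigned v)
{
    return old | singleton(v);
}
```

Applied to the slide's examples, two strong updates to x leave only the value 2, while storing 1 and then 2 into the abstract location for A leaves both values possible.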
13
Static Single Assignment
  • Sparse representation of program
  • Propagate facts about definition where needed
  • Definition-use relationships
  • Each variable is defined only once
  • Give new names (subscripts) to definitions
  • Solves flow-sensitivity problem
  • But a definition may be used many times

14
SSA Join points
  • In standard SSA, use a φ function at joins
    • d3 = φ(d1, d2)
  • In Gated SSA, use a γ function at joins
    • d3 = γ(<P, d1>, <¬P, d2>)
  • Solves path-sensitivity problem

15
IPSSA: Intraprocedurally
  • Extension of Gated SSA
  • Provides pointer resolution
  • Replace indirect pointer dereferences with direct
    accesses of potentially new temporary locations

16
Example of Pointer Resolution
    int a=0, b=1;
    int c=2, d=3;
    if (Q) { p = &a; }
    else   { p = &b; }
    c = *p;
    *p = d;

    a0 = 0, b0 = 1
    c0 = 2, d0 = 3
    p1 = &a
    p2 = &b
    p3 = γ(<Q, p1>, <¬Q, p2>)

Load resolution:
    c1 = γ(<Q, a0>, <¬Q, b0>)
Store resolution:
    a1 = γ(<Q, d0>, <¬Q, a0>)
    b1 = γ(<Q, b0>, <¬Q, d0>)
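One way to check the resolved definitions is to run the program once through the pointer and once in the pointer-free form, then compare the results for both values of Q. The sketch below does that. The names State, gamma2, run_with_pointer, and run_resolved are made up for this illustration.

```c
typedef struct { int a, b, c, d; } State;

/* Two-way gated choice: the value guarded by Q if Q holds, otherwise the
 * value guarded by not-Q. */
static int gamma2(int q, int if_q, int if_not_q)
{
    return q ? if_q : if_not_q;
}

/* The slide's program, executed directly through the pointer p. */
State run_with_pointer(int Q)
{
    int a = 0, b = 1, c = 2, d = 3;
    int *p;

    if (Q) p = &a; else p = &b;
    c = *p;                           /* load through p  */
    *p = d;                           /* store through p */
    return (State){ a, b, c, d };
}

/* The same program after pointer resolution: no dereferences remain. */
State run_resolved(int Q)
{
    int a0 = 0, b0 = 1, d0 = 3;
    int c1 = gamma2(Q, a0, b0);       /* load resolution  */
    int a1 = gamma2(Q, d0, a0);       /* store resolution */
    int b1 = gamma2(Q, b0, d0);       /* store resolution */

    return (State){ a1, b1, c1, d0 };
}
```

Both functions agree for Q true and Q false, which is what the gated form is meant to guarantee: each resolved definition carries the condition under which it applies.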
17
Pointer Resolution Rules
  • When resolving definition d, next step depends on
    RHS of d
  • Expressed as conditional rewrite rules
  • A few sample rules
  • d = &x, result is x
  • d = ι(...), result is d
  • d = γ(<P1, d1>, ..., <Pn, dn>), follow d1 ... dn
  • Refer to the paper for details

18
Interprocedural Example
  • Data flow in and out of functions
  • Create links between formal and actual parameters
  • Reflect stores and assignments to globals at the
    callee

    int f(int *p) { *p = 100; }

    int main() {
        int x = 0;
        int *q = &x;
    c:  f(q);
    }

In f:
    p0 = ι(<c, q0>)       Formal-actual connection for call site c
    p1 = 100
In main:
    x0 = 0
    q0 = &x
    x1 = ρ(<f, 100>)      Reflect store inside of f within main
19
Unsound Unaliasing Assumption
A1: No aliased parameters
  • Assumption: locations accessible through different parameters are distinct
  • Justification: matches how good interfaces are written
  • Consequence: context-independent procedure summaries
A2: No aliased abstract locations
  • Assumption: things pulled out of an abstract location are not aliased
  • Justification: holds in most usage cases
  • Consequence: give unique names when we get data from an abstract location
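A small illustration of what assumption A1 gives up, using a hypothetical function set_both that is not from the paper: a summary built under A1 would record that the function returns 1, which holds for distinct arguments but not when the caller passes the same location twice.

```c
#include <assert.h>

/* Under A1 (a and b point to distinct locations) a context-independent
 * summary of this function would be "*a == 1, *b == 2, returns 1". */
int set_both(int *a, int *b)
{
    assert(a && b);
    *a = 1;
    *b = 2;      /* if b aliases a, this also overwrites *a */
    return *a;
}
```

Calling set_both(&x, &y) returns 1 as the summary predicts, while set_both(&x, &x) returns 2. The analysis accepts this kind of imprecision because, as the slide argues, well-designed interfaces rarely pass aliased parameters.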
20
Interprocedural Algorithm
  • Process one SCC (strongly connected component) of the call graph at a time, bottom-up
  • For each SCC, within each procedure
    • Resolve all pointer operations (loads and stores)
    • Create links between formal and actual parameters
    • Reflect stores and assignments to globals at call sites
  • Iterate within SCC until the representation stabilizes
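The bottom-up order over SCCs can be obtained with Tarjan's algorithm, which completes an SCC only after every SCC reachable from it. The sketch below shows one way to compute that order on a small call graph. It is an assumption that this resembles the actual implementation, and the adjacency-matrix layout and all names are chosen for illustration.

```c
#include <assert.h>

#define MAXN 16                    /* assumed bound on procedures in this sketch */

int calls[MAXN][MAXN];             /* calls[u][v] != 0: procedure u calls v */
int scc_of[MAXN];                  /* SCC id of each procedure */
int n_sccs;                        /* ids are assigned in processing order */

static int n_procs, next_index, sp;
static int index_of[MAXN], low[MAXN], on_stack[MAXN], stack[MAXN];

static void visit(int u)
{
    index_of[u] = low[u] = ++next_index;
    stack[sp++] = u;
    on_stack[u] = 1;
    for (int v = 0; v < n_procs; v++) {
        if (!calls[u][v])
            continue;
        if (index_of[v] == 0) {                 /* callee not visited yet */
            visit(v);
            if (low[v] < low[u]) low[u] = low[v];
        } else if (on_stack[v] && index_of[v] < low[u]) {
            low[u] = index_of[v];               /* edge back into current SCC */
        }
    }
    if (low[u] == index_of[u]) {                /* u is the root of an SCC */
        int v;
        do {
            v = stack[--sp];
            on_stack[v] = 0;
            scc_of[v] = n_sccs;
        } while (v != u);
        n_sccs++;
    }
}

/* Numbers the SCCs of the first n procedures so that a callee's SCC always
 * gets a smaller id than the SCC of any caller outside it. */
void bottom_up_order(int n)
{
    assert(n >= 0 && n <= MAXN);
    n_procs = n;
    sp = next_index = n_sccs = 0;
    for (int u = 0; u < n; u++)
        index_of[u] = on_stack[u] = 0;
    for (int u = 0; u < n; u++)
        if (index_of[u] == 0)
            visit(u);
}
```

An analysis driver would then handle SCC 0, SCC 1, and so on, iterating inside each SCC until its procedures' representations stop changing.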

21
Framework
Framework makes it easy to add new analyses
[Diagram: program sources feed IPSSA construction, which produces interprocedural (IP) data flow info; analyses built on it (buffer overruns, format violations, others) produce error traces]
Abstracts away many details. Makes it easy to write tools.
22
Application
  • Start at roots: sources of user input, such as
    • argv elements
    • Input functions: fgets, gets, recv, getenv, etc.
  • Follow data flow chains provided by IPSSA: for every definition, IPSSA provides a list of its uses
  • A sink is a potentially dangerous usage, such as
    • A buffer of a statically defined length
    • A format argument of vulnerable functions: printf, fprintf, snprintf, vsnprintf
  • Report bug, record full path
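The root-to-sink search can be pictured as a worklist traversal over definition-use edges. The sketch below is a simplification made for illustration: Def, find_tainted_sinks, and the fixed-size arrays are hypothetical, and real IPSSA definitions carry far more information.

```c
#include <assert.h>

#define MAX_DEFS 32          /* assumed bound for this sketch */

typedef struct {
    int n_uses;
    int uses[4];             /* indices of definitions that use this one */
    int is_sink;             /* e.g. destination of strcpy, format argument */
} Def;

/* Follows def-use edges from root and returns the number of sinks reached.
 * pred[i] records which definition led to i (-1 for the root, -2 if i was
 * not reached), so a full path can be reported by walking pred backwards. */
int find_tainted_sinks(const Def *defs, int n, int root, int *pred)
{
    int work[MAX_DEFS], top = 0, sinks = 0;

    assert(n <= MAX_DEFS && root >= 0 && root < n);
    for (int i = 0; i < n; i++)
        pred[i] = -2;
    pred[root] = -1;
    work[top++] = root;
    while (top > 0) {
        int d = work[--top];
        if (defs[d].is_sink)
            sinks++;
        for (int k = 0; k < defs[d].n_uses; k++) {
            int u = defs[d].uses[k];
            if (pred[u] == -2) {     /* first time reached: record and enqueue */
                pred[u] = d;
                work[top++] = u;
            }
        }
    }
    return sinks;
}
```

For the gzip example, the chain would run from an argv element through the iname parameters of treat_file and get_istat to the strcpy destination, and walking pred from the sink back to the root yields the reported path.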

23
Experimental Setup
  • Implementation w/ SUIF2 compiler suite
  • Pentium IV 2GHz, 2GB of RAM running Linux

24
Summary of Results
[Results table not captured in this transcript. Callouts on the slide: "Many definitions", "Many procedures"]
25
False Positive in pcre
  • Copying tainted user data to a statically-sized
    buffer may be unsafe
  • Turns out to be safe in this case

    sprintf(buffer, "%.512s", filename)      (filename is tainted data)
Limits the length of copied data. Buffer is big enough!
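The reason this report is a false positive is the precision in the format: "%.512s" copies at most 512 bytes regardless of how long the tainted string is. A minimal sketch follows, assuming a buffer of 513 bytes; the slide does not give pcre's real buffer size, and copy_name is a hypothetical wrapper.

```c
#include <assert.h>
#include <stdio.h>

#define NAME_LIMIT 512                 /* precision used in the slide's format */
#define BUF_SIZE   (NAME_LIMIT + 1)    /* assumed: 512 characters plus the NUL */

/* Copies at most NAME_LIMIT bytes of filename into buffer and returns the
 * number of characters written, not counting the terminating NUL. */
int copy_name(char buffer[BUF_SIZE], const char *filename)
{
    assert(buffer != NULL && filename != NULL);
    return sprintf(buffer, "%.512s", filename);
}
```

A tool that tracks only the flow of tainted data into the buffer, and not the length limit in the format, will flag this call even though it cannot overrun.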
26
Related Work
  • xgcc
    • Also unsound
    • Many more false positives
    • Even regular developers can specify new analyses
  • Cqual
    • Interprocedural, sound analysis
    • Requires annotations
    • Flow-, context-, and path-insensitive
    • More false positives

27
The Good
  • Unsoundness gives very few false positives
  • Unsoundness allows efficient path- and
    context-sensitive analysis
  • Tested on real code
  • Good presentations to steal slides from

28
The Bad
  • Cqual is much improved now
  • Very few false positives on these types of
    security vulnerabilities
  • Let's see some more interesting vulnerabilities
  • Does not detect buffer overflows due to array
    bounds violations
  • Not tested with large programs

29
Singular Key Idea
  • IPSSA coupled with a (slightly) unsound alias
    analysis facilitates efficient detection of
    hard-to-find security violations with very few
    false positives