Title: Mining Windows Kernel API Rules
1 Mining Windows Kernel API Rules
- Jinlin Yang
- jinlin_at_cs.virginia.edu
- 09/28/2005 CS696
2 My Background
- Bounded exhaustive testing, 09/2001-01/2004
- D. Coppit, J. Yang, S. Khurshid, W. Le, and K. Sullivan. Software Assurance by Bounded Exhaustive Testing. IEEE Transactions on Software Engineering. April 2005
- K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software Assurance by Bounded Exhaustive Testing. ISSTA 04
- Temporal properties inference, 01/2004-present
- J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE 04
- J. Yang and D. Evans. Automatically Inferring Temporal Properties for Program Evolution. ISSRE 04
- J. Yang and D. Evans. Automatically Discovering Temporal Properties for Program Verification. Submitted to FMSD
- J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Terracotta: Mining Temporal API Rules from Imperfect Traces. Submitted to ICSE 06
3 Overview
- Problem: unavailability of specifications is a big issue in defect detection
- Solution: automatically infer specifications from execution traces
- Benefits: better understanding of legacy code and an opportunity to find more defects
- Experiments on finding kernel API rules
- Found one previously unknown bug in Windows
- Found interesting properties that should have been checked
4 Problem
- Defect detection techniques
- Generic properties
- E.g. pointer and buffer usage
- PREfix [Bush et al., SPE 00], PREfast
- Very effective
- Application-specific properties
- E.g. lock/unlock, resource creation/deletion
- SLAM/SDV [Ball et al., SPIN 01], ESP [Das et al., PLDI 02]
- Where do we get such properties?
5 My Approach
[Figure: inference pipeline. Program → Instrumentation → Instrumented Program → Running (with Test Suite) → Execution Traces → Inference (with Property Templates) → Inferred Properties → Post-processing → Report]
J. Yang and D. Evans. Dynamically inferring temporal properties. PASTE 04.
6 An Example
- Alternating template
- (PS)*, written P→S. P and S are placeholders
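The Alternating template can be illustrated with a small check. This is a minimal sketch, not Terracotta's code: it assumes a trace is a list of event names, projects it onto the two events bound to P and S, and tests the projection against the regular expression (PS)*. The function name `satisfies_alternating` and the event names are hypothetical.

```python
import re


def satisfies_alternating(trace, p, s):
    """Return True if the events p and s strictly alternate in trace.

    All other events are ignored; the projection onto {p, s} must
    match the regular expression (PS)*.
    """
    # Keep only occurrences of p and s, encoded as the letters P and S.
    projected = "".join("P" if e == p else "S" for e in trace if e in (p, s))
    return re.fullmatch(r"(PS)*", projected) is not None


if __name__ == "__main__":
    good = ["open", "read", "close", "open", "close"]
    bad = ["open", "open", "close"]
    print(satisfies_alternating(good, "open", "close"))
    print(satisfies_alternating(bad, "open", "close"))
```

Note that a trace containing neither event matches (PS)* trivially, so a real inference engine would likely require each event to occur at least once.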
7 Implementation
- Terracotta
- Inference engine
- Context-aware trace analysis
- Heuristics for prioritizing and presenting properties
- Performance: linear in the length of the trace and the number of distinct events
- More information
- http://www.cs.virginia.edu/terracotta
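One way an inference engine could reach roughly that cost is to track a tiny state machine per ordered event pair and, for each trace event, update only the pairs that involve it. The sketch below is an assumption about how such an engine might look, not the actual Terracotta implementation; `infer_alternating` and the event names are hypothetical. The work per trace event is proportional to the number of distinct events, although the state table itself has one entry per ordered pair.

```python
from itertools import permutations


def infer_alternating(trace):
    """Return all ordered pairs (P, S) whose projection matches (PS)*."""
    events = sorted(set(trace))
    # Per-pair state: 0 = expecting P, 1 = expecting S, None = violated.
    state = {pair: 0 for pair in permutations(events, 2)}
    for e in trace:
        for other in events:
            if other == e:
                continue
            # e plays the role of P in the pair (e, other).
            cur = state[(e, other)]
            if cur is not None:
                state[(e, other)] = 1 if cur == 0 else None
            # e plays the role of S in the pair (other, e).
            cur = state[(other, e)]
            if cur is not None:
                state[(other, e)] = 0 if cur == 1 else None
    # A pair satisfies the template only if every P was closed by an S.
    return sorted(pair for pair, cur in state.items() if cur == 0)


if __name__ == "__main__":
    trace = ["lock", "read", "unlock", "lock", "write", "unlock"]
    print(infer_alternating(trace))
```

On this toy trace the result contains both the expected lock/unlock pair and an incidental read/write pair, which is the kind of uninteresting result that later filtering has to remove.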
8 Lessons
- Missing interesting properties
- Original algorithm requires 100% satisfaction
- Real world is never perfect
- Trace collected by sampling
- Object information unavailable
- Imperfect programs
- Can we develop better inference to handle this?
- Too much noise in results
- Interesting properties are buried in a group of uninteresting ones
- Can we develop heuristics to select interesting ones?
9 Refinement of Inference
- How to detect interesting properties in the face of imperfect traces?
- Example
- PS PS PS PS PS PS PS PS PS PPP
- The dominant behavior is that P and S alternate
- 10 subtraces, 90% satisfy Alternating
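The example above can be worked through in a few lines. This sketch assumes the trace has already been partitioned into subtraces, each given as a string over the letters P and S; how the partitioning is done is not shown on the slide. The function name `satisfaction_rate` is hypothetical, and the 0.90 threshold is the one quoted on the kernel results slide.

```python
import re


def satisfaction_rate(subtraces):
    """Fraction of non-empty subtraces that match the Alternating template."""
    if not subtraces:
        return 0.0
    satisfied = sum(1 for t in subtraces if re.fullmatch(r"(PS)+", t))
    return satisfied / len(subtraces)


if __name__ == "__main__":
    # Nine well-behaved subtraces and one violation, as in the example.
    subtraces = ["PS"] * 9 + ["PPP"]
    rate = satisfaction_rate(subtraces)
    threshold = 0.90  # acceptance threshold, taken from the results slide
    print(rate, rate >= threshold)
```

A strict matcher would reject this property because of the single PPP subtrace, while the approximate version accepts it as the dominant behavior.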
10 Refinement of Inference (2)
- How to pick out interesting properties?
- Which one is more likely to be interesting?
- Heuristic: C→D is often more interesting
- Compute call graph for Windows binaries
- Keep A→B if B is not reachable from A
void A() { ... B(); ... }   (Case 1)
void x() { C(); ... D(); }   (Case 2)
void KeSetTimer() { KeSetTimerEx(); }   (example of Case 1)
void x() { ExAcquireFastMutexUnsafe(m); ... ExReleaseFastMutexUnsafe(m); }   (example of Case 2)
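The call-graph heuristic amounts to a reachability test. The following is a minimal sketch assuming the call graph is available as a dictionary from caller to callees; the toy graph, the caller name `x`, and the function names `reachable` and `filter_by_call_graph` are hypothetical, and only the KeSetTimer and FastMutex examples come from the slide.

```python
from collections import deque


def reachable(call_graph, src, dst):
    """Breadth-first search: can src reach dst through one or more calls?"""
    seen = {src}
    queue = deque([src])
    while queue:
        func = queue.popleft()
        for callee in call_graph.get(func, ()):
            if callee == dst:
                return True
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


def filter_by_call_graph(properties, call_graph):
    """Keep a property A -> B only if B is not reachable from A."""
    return [(a, b) for a, b in properties if not reachable(call_graph, a, b)]


if __name__ == "__main__":
    # Toy call graph (an assumption for illustration).
    call_graph = {
        "KeSetTimer": ["KeSetTimerEx"],
        "x": ["ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"],
    }
    properties = [
        ("KeSetTimer", "KeSetTimerEx"),
        ("ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"),
    ]
    print(filter_by_call_graph(properties, call_graph))
```

The wrapper-style pair is dropped because the second function is simply called by the first, while the acquire/release pair survives because neither calls the other.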
11 Refinement of Inference (3)
- Heuristic: the more similar two events are, the more likely the property is interesting
- Relative edit distance between A and B
- Partition A and B into words
- A has wA words, B has wB words, and they have w words in common
-
- For example
- Ke Acquire In Stack Queued Spin Lock
- → Ke Release In Stack Queued Spin Lock
- Similarity: 85.7%
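The name-similarity heuristic can be sketched as follows. The exact formula did not survive on the slide, so this sketch assumes similarity = 2w / (wA + wB), which reproduces the 85.7% figure for the example (w = 6, wA = wB = 7) but may differ from the formula actually used. The function names `split_words` and `similarity` are hypothetical, and the word splitting is a naive capital-letter split.

```python
import re
from collections import Counter


def split_words(name):
    """Split a CamelCase API name into words, e.g. KeSetTimer -> Ke, Set, Timer."""
    return re.findall(r"[A-Z][a-z0-9]*", name)


def similarity(a, b):
    """Assumed measure: 2w / (wA + wB), with w the number of shared words."""
    words_a, words_b = split_words(a), split_words(b)
    if not words_a and not words_b:
        return 0.0
    # Multiset intersection counts each shared word at most as often
    # as it appears in both names.
    common = sum((Counter(words_a) & Counter(words_b)).values())
    return 2 * common / (len(words_a) + len(words_b))


if __name__ == "__main__":
    a = "KeAcquireInStackQueuedSpinLock"
    b = "KeReleaseInStackQueuedSpinLock"
    print(split_words(a))
    print(round(similarity(a, b), 3))
```

Under this assumption, a pair of names with no words in common scores 0 and identical names score 1.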
12 Results: Kernel
- Approximation
- PAL threshold: 0.90
- 7611 properties
- Call-graph and edit distance based reduction
- Use the call graph of ntoskrnl.exe, edit dist > 0.5
- 142 properties. 53 times reduction!
- Small enough for manual inspection
- 56 apparently interesting properties (40%)
- Locking discipline
- Resource allocation and deletion
13 Results: Kernel (2)
- Found interesting properties that should be checked
- Several types of kernel SpinLock
- The Static Driver Verifier should have checked them
- ESP found one previously unknown bug in ntfs.sys
- Double-acquire of FastMutex
- Confirmed and fixed by the responsible developers
Static Driver Verifier: Finding Bugs in Device Drivers at Compile-Time. WinHEC, April 2004.
M. Das, S. Lerner, and M. Seigle. ESP: Path-Sensitive Program Verification in Polynomial Time. PLDI 02
14 Summary of Experiments
- We inferred interesting rules about kernel APIs!
- SDV already encodes some properties
- http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/SDV-intro.doc
- We inferred undocumented ones too
- Inference scales well to realistic traces
- Approximation is effective in tolerating imperfect traces and detecting dominant patterns
- Call-graph and edit distance based reduction is very effective
- Checking inferred properties with a defect detection tool is promising
- Other experiments: Vulcan APIs, Daisy file system
15 Conclusion
- Constructing interesting properties is important and difficult
- Automatic inference from execution traces is lightweight and effective
- Practical value
- Helping developers understand legacy code
- Giving us the opportunity to leverage sophisticated static analysis tools to find application-specific defects
16 Q&A
- For more information
- jinlin_at_cs.virginia.edu
- http://www.cs.virginia.edu/terracotta
- Great collaborators
- UVa
- David Evans, Ed Mitchell
- Microsoft
- Stephen Adams,
- Deepali Bhardwaj,
- Thirumalesh Bhat,
- Manuvir Das,
- Damian Hasse,
- Marne Staples, Rick Vicik,
- Jason Yang, Zhe Yang