1
Low Overhead Program Monitoring and Profiling
Naveen Kumar, Bruce Childers
  • Department of Computer Science
  • University of Pittsburgh
  • Pittsburgh, Pennsylvania 15260
  • {naveen, childers}@cs.pitt.edu

Mary Lou Soffa
  • Department of Computer Science
  • University of Virginia
  • Charlottesville, Virginia 22904
  • soffa@virginia.edu
2
Introduction
  • Program instrumentation: insertion of additional
    code into a program
  • Used to monitor program behavior or gather information
  • Can be inserted at the source, intermediate, or binary
    level
  • Applications:
  • Detecting program invariants (Ernst)
  • Dynamic slicing (Zhang)
  • Software testing (Misurda)
  • Software security checks (Scott)

3
Running Example
  • Consider a software security system that monitors
    the memory behavior of untrusted programs (e.g.,
    DynamoRIO)
  • Instrumentation at the binary instruction level
  • Instrument all loads and stores
  • The program can be instrumented statically or
    dynamically
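
The running example can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' instrumenter: the `Instr` class, the `instrument` function, and the instruction strings are hypothetical, and a probe is modeled as a marker placed in front of each load or store rather than as a jump to out-of-line code.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str    # "load", "store", "alu", "jmp", or "probe"
    text: str  # human-readable form, for display only


def is_memory_access(instr):
    """Loads and stores are the instrumentation targets in the running example."""
    return instr.op in ("load", "store")


def instrument(block):
    """Return (new_block, probe_names) with a probe before every load/store."""
    out, probes = [], []
    for instr in block:
        if is_memory_access(instr):
            name = f"probe{len(probes) + 1}"
            probes.append(name)
            out.append(Instr("probe", f"call secure()  # {name}"))
        out.append(instr)
    return out, probes


# Hypothetical block; the instruction text is illustrative only.
block = [
    Instr("alu", "ro1 = ro1 << 10"),
    Instr("store", "M[rl0 + 0x10] = ro2"),
    Instr("store", "M[ro1 + 0x228] = ro0"),
    Instr("alu", "ri4 = ro1"),
    Instr("load", "ro1 = M[ro0 + 0x3d0]"),
]
instrumented, probes = instrument(block)
```

Every memory access gets its own probe here, which is the unoptimized starting point that the later slides improve on.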

4
Static instrumentation
  • ro1 = ro1 << 10
  • ro1 = ro1 + 0x228
  • ro0 = ro2 << 0x14
  • rl4 = ro0 << 0x14
  • M[rl0 + 0x10] = ro2
  • M[ro1 + 0x228] = ro0
  • ri4 = ro1
  • rl1 = ro0
  • jmp r31
  • M[rl0 + 0x20] = ro0
  • rsp = rsp - 112
  • ro0 = ro0 << 10
  • ro1 = M[ro0 + 0x3d0]

probe1:
    M[rsp - 20] = rl0
    save
    call save_gp_regs
    ro0 = M[rsp + 0x68]
    ro0 = ro0 + 0x10
    call secure
    ro1 = rg0 + 1
    call restore_gp_regs
    restore
    rsp = rsp + 124
    M[rl0 + 0x10] = ro2
    jmp probe1_ret
probe1: call secure()
probe2: call secure()
probe3: call secure()
probe4: call secure()
jmp probe1
jmp probe2
jmp probe3
jmp probe4
Example from gzip. Instrumentation is performed
before execution starts.
5
Dynamic instrumentation
  • ro1 = ro1 << 10
  • ro1 = ro1 + 0x228
  • ro0 = ro2 << 0x14
  • rl4 = ro0 << 0x14
  • M[rl0 + 0x10] = ro2
  • M[ro1 + 0x228] = ro0
  • ri4 = ro1
  • rl1 = ro0
  • jmp r31
  • M[rl0 + 0x20] = ro0
  • rsp = rsp - 112
  • ro0 = ro0 << 10
  • ro1 = M[ro0 + 0x3d0]

probe1: call secure()
probe2: call secure()
probe3: call secure()
probe4: call secure()
jmp probe1
jmp probe2
jmp probe3
jmp probe4
Instrumentation is performed at run-time, on code
that executes. More powerful than static
instrumentation, and possibly less expensive.
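
One way to see why instrumenting only the code that executes can be cheaper is the toy sketch below. It is an assumption-laden illustration, not how any particular dynamic translator works: `DynamicInstrumenter`, `instrument_block`, and the two-block `program` are hypothetical names, and blocks are reduced to lists of operation kinds.

```python
def instrument_block(block):
    """Insert a 'probe' marker before every load/store in a block."""
    out = []
    for op in block:
        if op in ("load", "store"):
            out.append("probe")
        out.append(op)
    return out


class DynamicInstrumenter:
    """Instruments a block the first time it is fetched for execution."""

    def __init__(self, program):
        self.program = program  # block name -> list of operation kinds
        self.cache = {}         # instrumented blocks, filled lazily

    def fetch(self, name):
        if name not in self.cache:
            self.cache[name] = instrument_block(self.program[name])
        return self.cache[name]


program = {
    "hot": ["alu", "store", "load"],
    "cold": ["store", "store", "alu"],  # never executed below
}

# Static instrumentation touches every block before execution starts.
static_probes = sum(instrument_block(b).count("probe") for b in program.values())

# Dynamic instrumentation only touches blocks that are actually fetched.
dyn = DynamicInstrumenter(program)
for _ in range(3):
    dyn.fetch("hot")
dynamic_probes = sum(b.count("probe") for b in dyn.cache.values())
```

In this toy run the static approach inserts four probes while the dynamic one inserts two, because the cold block is never reached.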
6
Motivation
  • Stumbling block: high overhead
  • Slowdown by an order of magnitude or more (Ernst)
  • Existing solutions are user guided:
  • Sampling (Arnold)
  • Smaller data sets analyzed (the SPEC test data set
    instead of ref) (Mock)
  • Less aggressive uses, especially in dynamic
    settings (Duesterwald)
  • The user has to decide how best to apply
    instrumentation
  • What is needed are automatic techniques that
    mitigate the overheads systematically

7
Goals
  • Gather exact information
  • Separate accuracy from efficiency
  • The user should focus on what to gather, rather than
    how to gather it efficiently
  • Efficient: comparable to hand-optimized instrumentation
  • Automatic: little or no user guidance

8
Instrumentation Optimization
  • Costs associated with instrumentation:
  • Dynamic probe count: number of probes executed
  • Probe cost: number of instructions in a probe
  • Payload cost: frequency of invocation and cost of
    the payload
  • Optimize the instrumentation code to reduce these costs:
  • Dynamic probe coalescing
  • Partial context switches
  • Partial payload inlining
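
The three costs can be combined into a rough overhead estimate. The sketch below is a simplification assumed for illustration, not a model taken from the presentation: the `Probe` fields, the `overhead` function, and all instruction counts are made up.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    executions: int      # times the probe runs (its share of the dynamic probe count)
    probe_instrs: int    # instructions in the probe itself (save, restore, jumps)
    payload_calls: int   # payload invocations per probe execution
    payload_instrs: int  # instructions per payload invocation


def overhead(probes):
    """Total extra instructions executed because of instrumentation."""
    total = 0
    for p in probes:
        per_execution = p.probe_instrs + p.payload_calls * p.payload_instrs
        total += p.executions * per_execution
    return total


# Three separate probes versus one probe that carries all three payload calls.
separate = [Probe(1000, 20, 1, 30) for _ in range(3)]
merged = [Probe(1000, 20, 3, 30)]

separate_cost = overhead(separate)
merged_cost = overhead(merged)
```

With these made-up numbers the merged probe pays the probe cost once instead of three times; shrinking `probe_instrs` or `payload_instrs` reduces the other two terms.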

9
Base Instrumenter
  • ro1 = ro1 << 10
  • ro1 = ro1 + 0x228
  • ro0 = ro2 << 0x14
  • rl4 = ro0 << 0x14
  • M[rl0 + 0x10] = ro2
  • M[ro1 + 0x228] = ro0
  • ri4 = ro1
  • rl1 = ro0
  • jmp r31
  • M[rl0 + 0x20] = ro0
  • rsp = rsp - 112
  • ro0 = ro0 << 10
  • ro1 = M[ro0 + 0x3d0]

probe1: call secure()
probe2: call secure()
probe3: call secure()
probe4: call secure()
jmp probe1
jmp probe2
jmp probe3
jmp probe4
The base instrumenter generates a list of
instrumentation points.
10
Dynamic Probe Coalescing
probe1: call secure()
probe2: call secure()
probe3: call secure()
probe4: call secure()

probe5: call secure(); call secure()
probe3: call secure()
probe4: call secure()
  • ro1 = ro1 << 10
  • ro1 = ro1 + 0x228
  • ro0 = ro2 << 0x14
  • rl4 = ro0 << 0x14
  • M[rl0 + 0x10] = ro2
  • M[ro1 + 0x228] = ro0
  • ri4 = ro1
  • rl1 = ro0
  • jmp r31
  • M[rl0 + 0x20] = ro0
  • rsp = rsp - 112
  • ro0 = ro0 << 10
  • ro1 = M[ro0 + 0x3d0]

probe6: call secure(); call secure(); call secure()
jmp probe1
jmp probe5
jmp probe2
jmp probe3
jmp probe6
jmp probe4
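
A minimal sketch of the coalescing idea follows. It rests on several assumptions that are not in the slides: instructions are modeled as `(kind, payload_calls)` tuples, a `"branch"` ends a basic block, and every payload call can safely be moved to the first probe in its block. A real system would also have to confirm that each effective address can be computed at the merged location. The name `coalesce` is hypothetical.

```python
def coalesce(instrs):
    """Merge all probes within one basic block into a single probe.

    `instrs` is a list of (kind, payload_calls) tuples, where kind is
    "probe", "op", or "branch". A branch ends the current basic block.
    """
    out = []
    merged_at = None  # index in `out` of the current block's merged probe
    for kind, calls in instrs:
        if kind == "probe":
            if merged_at is None:
                merged_at = len(out)
                out.append(("probe", calls))
            else:
                # Fold this probe's payload calls into the earlier probe.
                out[merged_at] = ("probe", out[merged_at][1] + calls)
        else:
            out.append((kind, calls))
            if kind == "branch":
                merged_at = None
    return out


before = [
    ("op", 0), ("probe", 1), ("op", 0), ("probe", 1), ("op", 0),
    ("probe", 1), ("op", 0), ("branch", 0), ("probe", 1), ("op", 0),
]
after = coalesce(before)
probes_before = sum(1 for kind, _ in before if kind == "probe")
probes_after = sum(1 for kind, _ in after if kind == "probe")
```

The number of payload calls is unchanged; only the number of probes executed drops, which is the dynamic probe count from the cost breakdown.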
11
Partial Context Switch
probe6: call secure(); call secure(); call secure()
probe4: call secure()
  • ro1 = ro1 << 10
  • ro1 = ro1 + 0x228
  • ro0 = ro2 << 0x14
  • rl4 = ro0 << 0x14
  • M[rl0 + 0x10] = ro2
  • M[ro1 + 0x228] = ro0
  • ri4 = ro1
  • rl1 = ro0
  • jmp r31
  • M[rl0 + 0x20] = ro0
  • rsp = rsp - 112
  • ro0 = ro0 << 10
  • ro1 = M[ro0 + 0x3d0]

probe6:
    M[rsp - 20] = rl0
    M[rsp - 28] = ro1
    save
    call save_gp_regs
    effective address
    call secure
    effective address
    call secure
    effective address
    call secure
    call restore_gp_regs
    restore
    jmp probe6_ret
jmp probe6
jmp probe4
Analyze register usage in the payload.
Remove the spill and reload of GP registers.
Registers not used in the payload: g0-g7
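
The register analysis can be sketched as a simple set test. This is an illustration under assumptions, not the analysis from the presentation: `registers_to_spill`, `build_probe`, the string-based probe code, and the register sets passed in are all hypothetical.

```python
GLOBAL_REGS = [f"g{i}" for i in range(8)]  # g0..g7


def registers_to_spill(payload_regs, candidates=GLOBAL_REGS):
    """Spill only the candidate registers that the payload actually touches."""
    used = set(payload_regs)
    return [r for r in candidates if r in used]


def build_probe(payload_regs, payload_calls):
    """Emit a probe whose save/restore code covers only the needed registers."""
    spill = registers_to_spill(payload_regs)
    code = [f"spill {r}" for r in spill]
    code += ["call secure"] * payload_calls
    code += [f"reload {r}" for r in reversed(spill)]
    return code


# A payload that touches no global registers needs no spill or reload code.
lean_probe = build_probe(payload_regs=["o0", "o1", "o3"], payload_calls=3)

# A payload that touches g2 still needs that one register saved.
guarded_probe = build_probe(payload_regs=["o0", "g2"], payload_calls=1)
```

The first case corresponds to dropping the general save and restore calls entirely, which lowers the probe cost without changing what the payload observes.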
12
Partial Payload Inlining
probe6:
    M[rsp - 20] = rl0
    M[rsp - 28] = ro1
    rsp = rsp - 140
    effective address
    call secure
    effective address
    call secure
    effective address
    call secure
    rsp = rsp + 140
    jmp probe6_ret
  • ro1 = ro1 << 10
  • ro1 = ro1 + 0x228
  • ro0 = ro2 << 0x14
  • rl4 = ro0 << 0x14
  • M[rl0 + 0x10] = ro2
  • M[ro1 + 0x228] = ro0
  • ri4 = ro1
  • rl1 = ro0
  • jmp r31
  • M[rl0 + 0x20] = ro0
  • rsp = rsp - 112
  • ro0 = ro0 << 10
  • ro1 = M[ro0 + 0x3d0]

void secure(address) {
    if (address > REDZONE) return;
    redAlerts++;
    createReport();
    if (critical(address)) assert(address);
}
    ro1 = M[rg1 + 0]
    ro1 = ro1 - ro0
    ri0 = 1
    jmp r31
    ro3 = M[rg2 + 0]
    ro3 = ro3 + 1
    !call createReport
    !call assert
    call __full_secure
void __inlined_secure(address): calls __full_secure(address, tag)
void __full_secure(address, tag)
jmp probe6
jmp probe4
13
Implementation
  • Strata: dynamic translation system (Scott et al.)
  • Generates code at run-time for an application
  • Suitable for dynamic instrumentation
  • FIST: base instrumentation system (Kumar et al.)
  • Flexible for diverse instrumentation needs
  • Generates a list of instrumentation points (IPs)
  • INS-OP: developed in this work
  • Constructs an IR for the list of IPs obtained
    from FIST
  • Each optimization is a pass that modifies the IR
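
The pass-based structure might look roughly like the sketch below. None of this is the INS-OP, FIST, or Strata API: the `IP` class, `coalesce_pass`, `identity_pass`, and `run_passes` are hypothetical names, and the IR is reduced to a plain list of instrumentation points.

```python
from dataclasses import dataclass, field


@dataclass
class IP:
    """One instrumentation point: where to probe and which payloads to call."""
    block: str
    payloads: list = field(default_factory=lambda: ["secure"])


def coalesce_pass(ir):
    """Merge IPs that share a basic block into one IP with several payloads."""
    merged = {}
    for ip in ir:
        if ip.block in merged:
            merged[ip.block].payloads.extend(ip.payloads)
        else:
            merged[ip.block] = IP(ip.block, list(ip.payloads))
    return list(merged.values())


def identity_pass(ir):
    """Placeholder for another optimization; returns the IR unchanged."""
    return ir


def run_passes(ir, passes):
    """Apply each pass in order; every pass takes an IR and returns an IR."""
    for p in passes:
        ir = p(ir)
    return ir


ips = [IP("B1"), IP("B1"), IP("B1"), IP("B2")]
optimized = run_passes(ips, [coalesce_pass, identity_pass])
```

Keeping each optimization as an independent function over the same IR means passes can be added, removed, or reordered without touching the base instrumenter.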

14
Case Studies
  • Case study 1: Program profiling
  • Lightweight instrumentation application
  • A lower initial overhead implies smaller benefits
  • Demonstrates the efficacy of the optimizations in an
    unfavorable scenario
  • Case study 2: Memory simulation
  • Relatively heavyweight instrumentation
    application
  • Can be compared with state-of-the-art systems to see
    the benefits of optimization

15
Case study 1: Program profiling
  • The benefit of optimization depends
    on the initial overhead
  • Speedups range from 1.26 to 2.63

16
Case study 2: Memory simulation
  • Strata-Embra is a SPARC implementation of the
    cache simulator from SimOS
  • Strata-Embra-Opt is the cache simulator optimized
    using INS-OP
  • INS-OP speeds up the fastest cache simulator
    we could find by 2 to 3.3 times

17
Conclusions
  • Introduced instrumentation optimization to
    reduce the cost of instrumented code
  • Reduced probe count
  • Reduced cost of an individual probe
  • Reduced cost of the payload
  • Speedups between 1.2 and 3.3 times
  • More detailed information gathering
  • Accuracy need not be sacrificed for efficiency
  • Feasibility of certain applications
  • Run-time monitoring becomes more feasible
  • Example: applications that perform continuous
    testing

18
Effectiveness of optimizations