Feedback-directed optimizations with estimated edge profiles from hardware event sampling

1
Feedback-directed optimizations with estimated
edge profiles from hardware event sampling
  • Open64 workshop, CGO 2008
  • April 6, 2008

Vinodha Ramasamy, Robert Hundt (Google Inc.)
Dehao Chen, Wenguang Chen (Tsinghua University)
2
Background
  • Traditional FDO model: Instrument, Run, Recompile
  • Usage Model
      • Difficulties in generating representative
        training datasets
      • High overhead of profile collection
      • Requires dual compilation (tightly coupled
        builds)
  • Benefits
      • Supports both value and edge profiling
      • High performance potential

[Figure: diagram lost in extraction; surviving label: TRAINING DATA]
3
Overview
4
Algorithm
  • Basic block counts
      • Scale samples per source line by # of
        instructions
      • Samples per source line stored in profile
        datafile
      • Annotate IR statements in basic blocks with
        source line sample counts
      • Scale basic block sample count
          • BB.count = (Σ IR.count) / num_IR_stmts

Example:

  pbla.c:60   iplus = iplus->pred;   // 280 / 4 = 70

  Samples  Address  Instruction
      100  804a8b7  mov    0x10(%ebp),%eax
       30  804a8ba  mov    0x8(%eax),%eax
       70  804a8bd  mov    %eax,0x10(%ebp)
       80  804a8c0  jmp    804a94b <primal_iminus+0x137>

  IR1: 70   IR2: 10   IR3: 70   IR4: 0   IR5: 0
  Σ IR.count = 70 + 10 + 70 + 0 + 0 = 150
  BB.count = 150 / 5 = 30
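
The two scaling steps on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the compiler's implementation: the function names are hypothetical, and the use of integer division is an assumption (the slide does not say how fractions are handled). The inputs are the numbers from the pbla.c line 60 example above.

```python
def per_instruction_samples(line_samples, num_instructions):
    """Scale a source line's raw sample total by its instruction count."""
    if num_instructions == 0:
        return 0
    # Integer division is an assumption; the slide only shows exact results.
    return line_samples // num_instructions


def basic_block_count(ir_counts):
    """BB.count = (sum of IR.count) / num_IR_stmts."""
    if not ir_counts:
        return 0
    return sum(ir_counts) // len(ir_counts)


# Raw samples attributed to the four instructions of pbla.c line 60.
instruction_samples = [100, 30, 70, 80]
line_count = per_instruction_samples(sum(instruction_samples),
                                     len(instruction_samples))

# Per-statement counts IR1..IR5 of the basic block in the example.
bb_count = basic_block_count([70, 10, 70, 0, 0])

print(line_count, bb_count)  # 70 30
```

Averaging over IR statements rather than summing keeps a block's count independent of how many statements it happens to contain.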
5
Edge frequency estimation
  • Edge counts from basic block counts
  • Uses higher-level program structure (branches,
    loops, etc.)
  • Recursive algorithm used to smooth sample counts

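[Figure: example control-flow graph annotated with sample counts. Each node
label appears twice in the extracted text; the pairing below follows that
order and presumably shows the counts before and after smoothing.]

  Node    First label   Second label
  ENTRY   0             500
  BODY    0             7954
  BR      7954          7954
  BACK    0             7454
  JOIN    420           7954
  EXIT    0             500

  Branch edge labels: NT 30, T 7922; NT 32, T 7922
  Unattached label: 500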
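
As a rough illustration of deriving edge counts from basic block counts, the sketch below applies plain flow conservation: a block's count equals the sum of its incoming edges and the sum of its outgoing edges, so any edge that is the only unknown on one side of a block can be solved directly. This is a generic technique substituted for illustration, not the recursive, structure-aware algorithm the slide refers to. The function name, the diamond-shaped CFG, and its counts are all hypothetical.

```python
def estimate_edge_counts(block_counts, edges):
    """Solve for edge counts by flow conservation.

    block_counts maps block name -> count; edges is a list of (src, dst).
    Returns a dict with every edge count that could be determined.
    """
    known = {}
    changed = True
    while changed:
        changed = False
        for block, count in block_counts.items():
            outgoing = [e for e in edges if e[0] == block]
            incoming = [e for e in edges if e[1] == block]
            for group in (outgoing, incoming):
                unknown = [e for e in group if e not in known]
                if len(unknown) == 1:
                    solved = count - sum(known[e] for e in group if e in known)
                    # Sampled counts are noisy, so clamp negatives to zero.
                    known[unknown[0]] = max(solved, 0)
                    changed = True
    return known


# Hypothetical if/else diamond with illustrative block counts.
blocks = {"entry": 100, "then": 70, "else": 30, "join": 100}
cfg_edges = [("entry", "then"), ("entry", "else"),
             ("then", "join"), ("else", "join")]

edge_counts = estimate_edge_counts(blocks, cfg_edges)
for edge, count in sorted(edge_counts.items()):
    print(edge, count)
```

With sampled counts the conservation equations rarely balance exactly, which is presumably why the slide describes a smoothing pass; this sketch only clamps negative results and does not reconcile inconsistent blocks.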
6
Challenges
  • Inaccuracies inherent to sampling
  • Source position information issues
      • Missing information due to optimization
        transformations
      • Disambiguating samples per source line
          • if (cond) stmt1 stmt2
  • Edge estimation heuristics
      • Evaluate algorithm proposed by Levin et al.
  • Inlining
      • Annotate early inlined functions with scaled
        sample counts
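
The slide does not say how the counts of an inlined function are scaled. One plausible reading, sketched below purely as an assumption, is to scale each callee block count by the ratio of the call site's count to the callee's entry count, so that the inlined copy carries only the share of samples attributable to that call site. The function name and all numbers are hypothetical.

```python
def scale_inlined_counts(callee_block_counts, callee_entry_count,
                         call_site_count):
    """Scale a callee's block counts to one call site's share.

    Assumed scaling factor: call_site_count / callee_entry_count.
    """
    if callee_entry_count == 0:
        return {block: 0 for block in callee_block_counts}
    ratio = call_site_count / callee_entry_count
    return {block: round(count * ratio)
            for block, count in callee_block_counts.items()}


# Hypothetical callee entered 200 times overall, 50 of them from this site.
callee = {"entry": 200, "loop": 1000, "exit": 200}
inlined = scale_inlined_counts(callee, callee_entry_count=200,
                               call_site_count=50)
print(inlined)  # {'entry': 50, 'loop': 250, 'exit': 50}
```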

7
Results
  • SPEC2006 C benchmarks
  • Intel Core 2 platform using 64-bit binaries
  • -O2 FDO with instrumented runs
      • 45 gain over default -O2 runs
  • -O2 FDO with sampled profiles
      • Profile collection using -O2 binaries
      • 60% of FDO instrumented gain

8
Thank You!
  • Q&A