Phase Capture and Prediction with Applications

About This Presentation
Slides: 60
Provided by: brianp63

Transcript and Presenter's Notes


1
Phase Capture and Prediction with Applications
  • Martin Hock
  • Brian Pellin
  • Karthik Jayaraman
  • Vivek Shrivastava
  • University of Wisconsin-Madison

2
Phases
  • Definition
  • A period of execution that exhibits the same
    characteristics

3
Motivation
  • Programs go through different phases of their
    execution
  • Phases are often repeated at different times in
    execution
  • During each phase hardware is exercised
    differently

4
Sample Phase Behavior: gcc

5
Outline
  • Phase Tracking
  • Phase Prediction
  • Applications
  • Phase Based Branch Prediction
  • Phase Based Cache Configuration
  • Summary / Conclusions

6
Phase Tracking
  • Goal
  • Identify program phases with different behavior
  • Based on "Phase Tracking and Prediction"
    (Sherwood, Sair, Calder)
  • Use reconfigurable hardware to take advantage of
    phase information
  • Reconfigurable caches
  • Instruction window size
  • Dynamic branch predictor

7
Detecting Phases
  • Track groups of 10 million instructions
  • Collect information about instructions and store
  • Build a phase footprint
  • After each 10 million instructions, compare the
    footprint with past footprints
  • If footprint close enough, it is considered a
    repetition of the phase
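
The detection steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eight-entry accumulator, the modulo used as the hash, the threshold of 10, and the names `PhaseTracker`, `on_branch`, and `end_interval` are all hypothetical and are not taken from the slides or from Sherwood et al.

```python
from typing import List

NUM_BUCKETS = 8   # assumed accumulator size, for illustration only
THRESHOLD = 10    # assumed "close enough" Manhattan distance


def manhattan(a: List[int], b: List[int]) -> int:
    """Sum of absolute per-entry differences between two footprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


class PhaseTracker:
    def __init__(self) -> None:
        self.accumulator = [0] * NUM_BUCKETS
        self.past_footprints: List[List[int]] = []

    def on_branch(self, branch_pc: int, insts_since_branch: int) -> None:
        # A plain modulo stands in for the hardware hash of the branch PC.
        bucket = branch_pc % NUM_BUCKETS
        self.accumulator[bucket] += insts_since_branch

    def end_interval(self) -> int:
        """Close the interval and return the phase ID it belongs to."""
        footprint = self.accumulator
        self.accumulator = [0] * NUM_BUCKETS
        # Assumption: the first stored footprint within the threshold wins.
        for phase_id, past in enumerate(self.past_footprints):
            if manhattan(footprint, past) <= THRESHOLD:
                return phase_id
        self.past_footprints.append(footprint)
        return len(self.past_footprints) - 1
```

For example, an interval with 20 instructions charged to entry 2 and 80 to entry 3 becomes phase 0; a later interval with 21 and 79 in the same entries is at distance 2 from it and is classified as phase 0 again.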

8
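Accumulator
[Diagram: the branch PC is hashed to select an accumulator entry, which is incremented by the # of inst. since the last branch. All entries start at 0.]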

9
Accumulator
[Diagram: the branch PC hashes to entry 2, with 20 instructions since the last branch. All accumulator entries are still 0.]

Branch occurs, must increment entry 2 by 20.

10
Accumulator
[Diagram: entry 2 now holds 20. The next branch PC hashes to entry 3, with 80 instructions since the last branch.]

New branch, increment entry 3 by 80.

11
Accumulator
[Diagram: the accumulator holds 20 in entry 2 and 80 in entry 3; all other entries are 0.]

After a phase completes we need somewhere to
store data about previous phases.

12
Past Footprint Table
[Diagram: the accumulator (20 in entry 2, 80 in entry 3) shown next to the Past Footprint Table.]

At 100 instructions

13
Past Footprint Table
[Diagram: the accumulator contents (20 in entry 2, 80 in entry 3) are copied into a Past Footprint entry and the accumulator is reset to 0.]

Accumulator data is stored in the Past Footprint Table.

14
Past Footprint Table
[Diagram: the Past Footprint holds 20 in entry 2 and 80 in entry 3; the accumulator holds 90 in entry 0, 5 in entry 3, and 5 in entry 5.]

At 200 instructions
Take the Manhattan distance between the accumulator
and the Past Footprints:
90 + 20 + 75 + 5 = 190

15
Past Footprint Table
[Diagram: the second footprint (90 in entry 0, 5 in entry 3, 5 in entry 5) is stored as a new Past Footprint entry and the accumulator is reset to 0.]

At 200 instructions

16
Past Footprint Table
[Diagram: two Past Footprint entries; the accumulator holds 21 in entry 2 and 79 in entry 3.]

At 300 instructions
Manhattan distance between this phase and first
phase is 2.
This phase is close enough to the first phase to
be considered the same as phase one.

17
Past Footprint Table
[Diagram: the accumulator compared with the Past Footprint Table entries.]

At 30 million instructions
Manhattan distance between this phase and first
phase is 2.
This phase is close enough to the first phase to
be considered the same as phase one.

18
Outline
  • Phase Tracking
  • Phase Prediction
  • Applications
  • Phase Based Branch Prediction
  • Phase Based Cache Configuration
  • Summary / Conclusions

19
Phase prediction
  • By the time we detect a phase, it is over
  • In order to adjust hardware, we need to know what
    phase we are in
  • Three strategies
  • Last seen
  • Markov with RLE
  • Perceptron

20
Last seen
  • Predict next phase = last phase
  • Because last seen is so simple, another predictor
    would have to beat it significantly to justify
    the added cost

21
RLE Markov
  • Adapted from Sherwood
  • Assumes that if we see phase X exactly Y times in
    a row, followed by phase Z, then if we see phase
    X exactly Y times again, it will again be
    followed by Z
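
The run-length rule above can be illustrated with a small table keyed by (phase, run length). This is a sketch, not the presenters' or Sherwood's implementation: the class name `RLEMarkovPredictor`, the unbounded dictionary, and the fallback to the last-seen phase when no entry exists are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


class RLEMarkovPredictor:
    def __init__(self) -> None:
        # (phase X, run length Y) -> phase Z that followed last time
        self.table: Dict[Tuple[int, int], int] = {}
        self.current: Optional[int] = None
        self.run_length = 0

    def predict(self) -> Optional[int]:
        """Predict the next interval's phase; None before any history."""
        if self.current is None:
            return None
        # Assumed fallback: with no table entry, behave like "last seen".
        return self.table.get((self.current, self.run_length), self.current)

    def update(self, phase: int) -> None:
        """Record the phase actually observed for the latest interval."""
        if phase == self.current:
            self.run_length += 1
            return
        if self.current is not None:
            # The run of `current` ended after `run_length` intervals.
            self.table[(self.current, self.run_length)] = phase
        self.current = phase
        self.run_length = 1
```

After observing phases 0, 0, 1, 0, 0, the table holds the entry (0, 2) -> 1, so the sketch predicts phase 1 next instead of repeating phase 0.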

22
Perceptron
  • Individual perceptrons work in binary (±1)
  • Given history h1, h2, ..., hn (±1) and weights w0,
    w1, w2, ..., wn (integers), compute
    S = w0 + w1*h1 + w2*h2 + ... + wn*hn
  • If S ≥ 0, predict yes, else predict no
  • To train: if hi = current outcome, increment wi, else
    decrement it (for w0, add the current outcome)
  • But there are many phases, not just 2
  • Combine perceptrons for multivalue prediction
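
A minimal sketch of a single binary perceptron following the update rule on this slide. Two points are assumptions rather than statements from the slide: the output is compared with zero using "greater than or equal", and training happens on every outcome with no threshold. The class and method names are hypothetical.

```python
from typing import List


class Perceptron:
    def __init__(self, history_len: int) -> None:
        # weights[0] is the bias w0; weights[1..n] pair with h1..hn
        self.weights = [0] * (history_len + 1)

    def output(self, history: List[int]) -> int:
        """S = w0 + sum(wi * hi), with every hi equal to +1 or -1."""
        return self.weights[0] + sum(
            w * h for w, h in zip(self.weights[1:], history)
        )

    def predict(self, history: List[int]) -> int:
        # Assumption: a sum of exactly zero counts as "yes" (+1).
        return 1 if self.output(history) >= 0 else -1

    def train(self, history: List[int], outcome: int) -> None:
        """`outcome` is +1 or -1."""
        self.weights[0] += outcome
        for i, h in enumerate(history, start=1):
            # Increment when the history bit agrees with the outcome.
            self.weights[i] += 1 if h == outcome else -1
```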

23
Multivalue perceptron
  • We have perceptrons P1, P2, ..., Pn
  • Perceptron Pi tries to predict phase i
  • Train Pi only if in phase i
  • History hi = 1 if it agrees with the current
    phase, -1 if it disagrees
  • Have the perceptrons vote for who is correct:
    the most positive one wins
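
One possible reading of the multivalue scheme, sketched in Python. The slide leaves the input encoding open, so this example assumes that perceptron i sees +1 for each past interval that was in phase i and -1 otherwise, and that ties go to the lowest phase ID. The name `MultiValuePerceptron` is hypothetical.

```python
from typing import List


class MultiValuePerceptron:
    def __init__(self, num_phases: int, history_len: int) -> None:
        self.num_phases = num_phases
        # One weight vector per phase; index 0 is the bias w0.
        self.weights = [[0] * (history_len + 1) for _ in range(num_phases)]

    @staticmethod
    def _inputs(phase: int, history: List[int]) -> List[int]:
        # Assumed encoding: +1 where a past interval was `phase`, else -1.
        return [1 if p == phase else -1 for p in history]

    def _score(self, phase: int, history: List[int]) -> int:
        w = self.weights[phase]
        return w[0] + sum(
            wi * hi for wi, hi in zip(w[1:], self._inputs(phase, history))
        )

    def predict(self, history: List[int]) -> int:
        """The most positive perceptron wins the vote."""
        return max(range(self.num_phases),
                   key=lambda ph: self._score(ph, history))

    def train(self, history: List[int], actual_phase: int) -> None:
        # Only the perceptron for the phase we are in is trained,
        # with outcome +1: agreeing inputs increment, others decrement.
        w = self.weights[actual_phase]
        w[0] += 1
        for i, h in enumerate(self._inputs(actual_phase, history), start=1):
            w[i] += h
```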

24
Phase prediction results
  • GCC
  • Last phase: 96% accurate
  • RLE Markov: 94% accurate
  • Perceptron much lower

25
Phase prediction comments
  • Sherwood had lower accuracy for last phase (70%),
    perhaps due to oscillation
  • Training cost of multiple perceptron means that
    it does not always adapt quickly
  • Not worth improving due to the accuracy of last
    phase

26
Outline
  • Phase Tracking
  • Phase Prediction
  • Applications
  • Phase Based Branch Prediction
  • Phase Based Cache Configuration
  • Summary / Conclusions

27
Phase Based Dynamic Branch Predictor
  • Previous research shows the usefulness of
    adapting branch predictors at run time
  • "Dynamic History-Length Fitting: A Third Level of
    Adaptivity for Branch Prediction" (Juan,
    Sanjeevan, Navarro)
  • "Combining Branch Predictors" (McFarling)
  • A single branch predictor may not perform well
    within and across different executions
  • "A Study of Branch Prediction Strategies" (Smith)
  • Program behavior is almost uniform within a phase ->
    choose the best predictor for each phase

28
Methodology
  • Select a small group of relevant predictors
  • At the beginning of each new phase, sample all
    the predictors and choose the best
  • Save the best for each phase and use it if a
    phase reoccurs
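
The sample-then-reuse methodology can be sketched as a per-phase lookup table. The `sample` callback is hypothetical: it stands for running one profiling period with the named predictor and returning its miss rate, which the slides do not specify as an interface.

```python
from typing import Callable, Dict, List


class PhasePredictorSelector:
    def __init__(self, predictors: List[str],
                 sample: Callable[[str], float]) -> None:
        self.predictors = predictors
        # Hypothetical hook: profile one period, return the miss rate.
        self.sample = sample
        self.best_for_phase: Dict[int, str] = {}

    def on_phase(self, phase_id: int) -> str:
        """Return the predictor to use for this phase."""
        if phase_id not in self.best_for_phase:
            # New phase: sample every predictor and keep the best one.
            rates = {name: self.sample(name) for name in self.predictors}
            self.best_for_phase[phase_id] = min(rates, key=rates.get)
        # Recurring phase: reuse the saved choice without resampling.
        return self.best_for_phase[phase_id]
```

The point of the table is that the sampling cost is paid once per distinct phase, not once per interval.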

34
Dynamic Adaptations
  • Possible dynamic adaptations
  • Multiple Branch Predictors
  • 2Level, Bimodal
  • Sample each for one profiling period
  • Select on basis of miss rate, number of
    mis-speculated instructions,
  • Varying History Lengths
  • History lengths 0,12
  • Some workloads give better performance with
    smaller history

35
Multiple Branch Predictors
  • Set of predictors
  • 2level 1 1024 8 (Baseline predictor)
  • Bimodal 1024
  • 2level 8 512 8
  • 2level 1 512 8
  • Profile period
  • 10 million instructions

36
Multiple Branch Predictors
  • Simulator Used
  • Simplescalar v3.0d
  • Set of benchmarks
  • gcc, vpr, mcf, ammp, art
  • Selection Criterion
  • Least Miss Rate
  • If miss rates of two predictors are within 1%,
    select the less expensive (simpler) one
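
The selection criterion can be written as a short function. This sketch assumes that "within 1%" means one percentage point of miss rate and that the caller lists candidates from simplest to most complex; neither detail is stated on the slide, and `select_predictor` is a hypothetical name.

```python
from typing import List, Tuple


def select_predictor(candidates: List[Tuple[str, float]],
                     tolerance: float = 1.0) -> str:
    """Pick the simplest predictor whose miss rate is near the best.

    `candidates` holds (name, miss rate in percent) pairs, ordered
    from least to most expensive (an assumption of this sketch).
    """
    best = min(rate for _, rate in candidates)
    # The first candidate within the tolerance is the cheapest one.
    return next(name for name, rate in candidates
                if rate - best <= tolerance)
```

With miss rates of 5.4 for a bimodal predictor and 5.0 for a two-level predictor, the function returns the bimodal one; at 7.0 versus 5.0 it returns the two-level one.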

37
Multiple Branch Predictor Results: IPC (gcc)

38
Multiple Branch Predictors Results: Branch Predictor Misses (gcc)

39
Multiple Branch Predictor Results: IPC (vpr)

40
Multiple Branch Predictors Results: Branch Predictor Misses (vpr)

41
Multiple Branch Predictors Results: Branch Predictor Misses (mcf)

42
Multiple Branch Predictors: IPC Comparison

43
Multiple Branch Predictors: Branch Prediction Misses Comparison

44
Varying History Length
  • G-share predictor with varying history lengths
  • Set of history lengths sampled
  • 0,3,6,8,12
  • Selection Criterion
  • Least Miss Rate
  • If miss rates of two predictors are within 1%,
    select the less expensive (simpler) one

45
Varying History Length
  • Set of benchmarks
  • gcc, mcf
  • Simulator Used
  • Simplescalar v3.0d
  • Profile Period
  • 10 million instructions

46
Varying History Length Results: IPC (gcc)

47
Varying History Length Results: Branch Predictor Misses (gcc)

48
Varying History Length Results: Instruction Cache Misses (IL1) (gcc)

49
Outline
  • Phase Tracking
  • Phase Prediction
  • Applications
  • Phase Based Branch Prediction
  • Phase Based Cache Configuration
  • Summary / Conclusions

50
Cache optimization
  • Smaller caches use less power
  • Some phases of execution will use less memory or
    execute a smaller region of code and therefore
    need less cache
  • We can use a smaller cache for these phases
    without affecting performance

51
Methodology
  • Try 4 possibilities of data and instruction cache
    simultaneously
  • Data cache and instruction cache misses should be
    independent
  • Select the best combination

[Diagram labels: Data, Instr, Phase 2]
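
Because data and instruction cache misses are treated as independent, both caches can be resized together in each profiling period, so four periods cover four sizes for each cache. The sketch below illustrates that idea only. The size list, the `run_period` callback, and the rule "smallest size whose miss rate is within a tolerance of the best" are all assumptions; the slides name only the 16K and 64K sizes and do not define "best".

```python
from typing import Callable, List, Tuple

SIZES_KB = [8, 16, 32, 64]  # assumed candidate sizes, smallest first


def choose_caches(run_period: Callable[[int, int], Tuple[float, float]],
                  tolerance: float = 0.01) -> Tuple[int, int]:
    """Return (data cache KB, instruction cache KB) for a phase.

    `run_period(d_kb, i_kb)` is a hypothetical hook that runs one
    profiling period and returns (data miss rate, instr miss rate).
    """
    # One period per size: both caches are changed simultaneously.
    samples = [run_period(size, size) for size in SIZES_KB]
    d_rates = [d for d, _ in samples]
    i_rates = [i for _, i in samples]

    def smallest_good(rates: List[float]) -> int:
        best = min(rates)
        return next(s for s, r in zip(SIZES_KB, rates)
                    if r - best <= tolerance)

    # Each cache is chosen on its own miss rates (independence).
    return smallest_good(d_rates), smallest_good(i_rates)
```
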
52
Cache optimization results
  • GCC IPC
  • Fixed 32K cache (16K + 16K): 1.807
  • Fixed 128K cache (64K + 64K): 1.896
  • Optimizer: 1.855
  • Average 49K total

53
Cache comparison
54
Outline
  • Phase Tracking
  • Phase Prediction
  • Applications
  • Phase Based Branch Prediction
  • Phase Based Cache Configuration
  • Summary / Conclusions

55
Summary
  • Significant reduction in branch mispredictions
    (29.88% - 44.35%) using phase based branch
    predictors
  • Simple predictors beat more complex predictors in
    many phases
  • Marginal gains in IPC using multiple branch
    predictors (2.24% - 4.70%)
  • Marginal gains in IL1 misses using phase based
    multiple branch predictors.

56
Summary (cont...)
  • Phase based dynamic history length fitting does
    not give good gains

57
Conclusions 1
  • Phase based optimizations provide scope for
    improvements using reconfigurable hardware
  • Using phase specific branch predictors gives a
    good reduction in mispredictions
  • A good strategy for saving power, as fewer
    mispredictions may result in fewer
    mis-speculated instructions

58
Conclusion 2
  • However, varying history length does not result
    in substantial savings
  • More benchmarks need to be considered to
    understand the effect of history length
    adaptations

59
Questions??