Phase Capture and Prediction with Applications

Slides: 60

Provided by: brianp63

Learn more at: https://pages.cs.wisc.edu

1
Phase Capture and Prediction with Applications

Martin Hock
Brian Pellin
Karthik Jayaraman
Vivek Shrivastava
University of Wisconsin-Madison

2
Phases

Definition
A period of execution that exhibits the same
characteristics

3
Motivation

Programs go through different phases of their
execution
Phases are often repeated at different times in
execution
During each phase hardware is exercised
differently

4
Sample Phase Behavior (gcc)
5
Outline

Phase Tracking
Phase Prediction
Applications
Phase Based Branch Prediction
Phase Based Cache Configuration
Summary / Conclusions

6
Phase Tracking

Goal
Identify program phases with different behavior
Based on "Phase Tracking and Prediction"
(Sherwood, Sair, Calder)
Use reconfigurable hardware to take advantage of
phase information
Reconfigurable caches
Instruction window size
Dynamic branch predictor

7
Detecting Phases

Track groups of 10 million instructions
Collect information about instructions and store it
Build a phase footprint
After each 10 million instructions, compare the
footprint with past footprints
If the footprint is close enough, it is considered a
repetition of the phase
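
The footprint-building step can be sketched in a few lines of Python. This is only an illustration: the table size (NUM_BUCKETS) and the hash function (bucket_for) are assumptions made for the example, since the slides do not specify either.

```python
NUM_BUCKETS = 8  # assumed table size; the slides do not give one


def bucket_for(branch_pc):
    """Map a branch PC to an accumulator entry (placeholder hash, an assumption)."""
    return (branch_pc >> 2) % NUM_BUCKETS


def build_footprint(branches):
    """Accumulate one interval's footprint.

    branches: iterable of (branch_pc, instructions_since_last_branch) pairs.
    Each branch adds its instruction count to the entry its PC hashes to.
    """
    accumulator = [0] * NUM_BUCKETS
    for branch_pc, inst_count in branches:
        accumulator[bucket_for(branch_pc)] += inst_count
    return accumulator


# Two branches: the first hashes to entry 2 (20 instructions),
# the second to entry 3 (80 instructions).
footprint = build_footprint([(8, 20), (12, 80)])
```

With the placeholder hash, the two example branches land in entries 2 and 3, matching the walkthrough on the following slides.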

8
Accumulator

[Figure: accumulator table with all entries at 0. The branch PC is hashed to select an entry, and the # of inst. since branch is the increment.]

9
Accumulator

[Figure: the branch PC hashes to entry 2; # of inst. since branch is 20; all accumulator entries are still 0.]

Branch occurs, must increment entry 2 by 20.
10
Accumulator

[Figure: accumulator entry 2 now holds 20; the branch PC hashes to entry 3; # of inst. since branch is 80.]

New branch, increment entry 3 by 80.
11
Accumulator

[Figure: accumulator entries 2 and 3 hold 20 and 80; all other entries are 0.]

After a phase completes we need somewhere to
store data about previous phases.
12
Past Footprint Table
Accumulator

[Figure: the Past Footprint Table is introduced alongside the accumulator, which holds 20 and 80.]

At 100 instructions
13
Past Footprint
Past Footprint Table
Accumulator

[Figure: the accumulator contents (20 and 80) are copied into the Past Footprint Table as the first past footprint; the accumulator is reset to 0.]

Accumulator Data is stored in Past Footprint table
14
Past Footprint
Past Footprint Table
Accumulator

[Figure: the first past footprint holds 20 and 80; the accumulator now holds 90, 5, and 5.]

At 200 instructions
Take the Manhattan distance between accumulator
and Past Footprints
90 + 20 + 75 + 5 = 190
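
The arithmetic on this slide reads as 90 + 20 + 75 + 5 = 190. A small Python check reproduces it. The positions of the values inside each vector are an assumption, because the slide layout was lost; only the values themselves (20 and 80 in the past footprint, 90, 5, and 5 in the accumulator) come from the slide.

```python
def manhattan(a, b):
    """Sum of absolute element-wise differences between two footprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Entry positions are assumed for illustration.
past_footprint = [0, 0, 20, 80, 0]
accumulator = [90, 0, 0, 5, 5]

# |90-0| + |0-0| + |0-20| + |5-80| + |5-0| = 90 + 0 + 20 + 75 + 5
distance = manhattan(past_footprint, accumulator)
```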
15
Past Footprint
Past Footprint Table
Accumulator

[Figure: the accumulator contents (90, 5, and 5) are stored as a second past footprint; the accumulator is reset to 0.]

At 200 instructions
16
Past Footprint
Past Footprint Table
Accumulator

[Figure: two past footprints (20 and 80; 90, 5, and 5) and an accumulator holding 21 and 79.]

At 300 instructions
Manhattan distance between this phase and first
phase is 2.
This phase is close enough to the first phase to
be considered the same as phase one.
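
The matching step against the Past Footprint Table might look like the sketch below. The threshold value (10) and the vector layout are assumptions chosen for the example; the slides only say the footprint must be "close enough". Phase IDs here are 0-based, so the slide's "phase one" is ID 0.

```python
def manhattan(a, b):
    """Sum of absolute element-wise differences between two footprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(footprint, table, threshold):
    """Return the ID of the closest past footprint if it is within threshold.

    Otherwise store the footprint as a new phase and return its new ID.
    """
    best_id, best_dist = None, None
    for phase_id, past in enumerate(table):
        dist = manhattan(footprint, past)
        if best_dist is None or dist < best_dist:
            best_id, best_dist = phase_id, dist
    if best_id is not None and best_dist <= threshold:
        return best_id
    table.append(list(footprint))
    return len(table) - 1


table = []
THRESHOLD = 10  # assumed value, not taken from the slides
first = classify([0, 0, 20, 80, 0], table, THRESHOLD)   # new phase, ID 0
second = classify([90, 0, 0, 5, 5], table, THRESHOLD)   # distance 190, new phase, ID 1
third = classify([0, 0, 21, 79, 0], table, THRESHOLD)   # distance 2 from ID 0
```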
17
Past Footprint
Past Footprint Table
Accumulator

[Figure: Past Footprint Table and accumulator state at 30 million instructions.]

At 30 million instructions
Manhattan distance between this phase and first
phase is 2.
This phase is close enough to the first phase to
be considered the same as phase one.
18
Outline

Phase Tracking
Phase Prediction
Applications
Phase Based Branch Prediction
Phase Based Cache Configuration
Summary / Conclusions

19
Phase prediction

When we detect a phase, it's over
In order to adjust hardware, we need to know what
phase we are in
Three strategies
Last seen
Markov with RLE
Perceptron

20
Last seen

Predict next phase = last phase
Because last seen is so simple, another predictor
would have to beat it significantly to justify
the added cost

21
RLE Markov

Adapted from Sherwood
Assumes that if we see phase X exactly Y times in
a row, followed by phase Z, then if we see phase
X exactly Y times again, it will again be
followed by Z
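
A minimal Python sketch of this idea follows. It is not Sherwood's hardware design: the dictionary-based table and the fallback to last-seen prediction when no entry exists are assumptions made for the example.

```python
class RLEMarkov:
    """Predict the next phase from (current phase, run length so far)."""

    def __init__(self):
        self.table = {}      # (phase, run_length) -> phase that followed
        self.current = None  # phase of the most recent interval
        self.run = 0         # how many times in a row it has been seen

    def predict(self):
        """Look up (phase, run length); fall back to last seen (assumption)."""
        return self.table.get((self.current, self.run), self.current)

    def update(self, phase):
        """Record the phase actually observed for the latest interval."""
        if phase == self.current:
            self.run += 1
        else:
            if self.current is not None:
                # The previous run of length self.run was followed by phase.
                self.table[(self.current, self.run)] = phase
            self.current, self.run = phase, 1


predictor = RLEMarkov()
for observed in "AABAA":
    predictor.update(observed)
# A was seen twice in a row and followed by B before, so B is predicted.
next_phase = predictor.predict()
```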

22
Perceptron

Individual perceptrons work in binary (±1)
Given history h1, h2, ..., hn (±1), weights w0,
w1, w2, ..., wn (integers), compute S = w0 + w1h1 +
w2h2 + ... + wnhn
If S ≥ 0, predict yes, else predict no
To train, if hi = current, increment wi, else
decrement (for w0, add current)
But there are many phases, not just 2
Combine perceptrons for multivalue prediction
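
A single binary perceptron of this kind can be sketched as follows. Two points are assumptions: the comparison is read as S >= 0 (the operator was lost in the slide text), and training runs on every outcome because the slide gives no training threshold.

```python
class Perceptron:
    """Binary (+1 / -1) perceptron with integer weights."""

    def __init__(self, history_len):
        self.w = [0] * (history_len + 1)  # w[0] is the bias weight

    def output(self, history):
        """S = w0 + w1*h1 + ... + wn*hn, with each h in {+1, -1}."""
        return self.w[0] + sum(w * h for w, h in zip(self.w[1:], history))

    def predict(self, history):
        """Predict +1 (yes) if S >= 0, else -1 (no)."""
        return 1 if self.output(history) >= 0 else -1

    def train(self, history, current):
        """Add current to w0; increment wi where hi == current, else decrement."""
        self.w[0] += current
        for i, h in enumerate(history, start=1):
            self.w[i] += 1 if h == current else -1


p = Perceptron(2)
history = [1, -1]
before = p.predict(history)   # all weights are 0, so S = 0 and the answer is +1
p.train(history, -1)          # the actual outcome was -1
after = p.predict(history)    # weights are now [-1, -1, 1], so S = -3
```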

23
Multivalue perceptron

We have perceptrons P1, P2, ..., Pn
Perceptron Pi tries to predict phase i
Train Pi only if in phase i
History hi = 1 if it agrees with the current
phase, -1 if it disagrees
Have the perceptrons vote for who is correct:
most positive one wins
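
One possible reading of this scheme is sketched below; the slide leaves several details open, so treat it as an interpretation rather than the authors' implementation. Assumptions: for perceptron Pi each history input is +1 if that past interval was phase i and -1 otherwise, Pi is trained toward +1 only when phase i is actually observed, and ties go to the lowest phase ID.

```python
class MultiPhasePerceptron:
    """One perceptron per phase; the most positive output wins the vote."""

    def __init__(self, num_phases, history_len):
        self.n = history_len
        self.weights = [[0] * (history_len + 1) for _ in range(num_phases)]
        self.history = []  # most recent phase IDs, newest last

    def _inputs(self, phase):
        """+1 where a past interval was this phase, -1 otherwise (or unknown)."""
        padded = [None] * (self.n - len(self.history)) + self.history
        return [1 if p == phase else -1 for p in padded]

    def _score(self, phase):
        w = self.weights[phase]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self._inputs(phase)))

    def predict(self):
        """Return the phase whose perceptron output is most positive."""
        return max(range(len(self.weights)), key=self._score)

    def observe(self, phase):
        """Train only the observed phase's perceptron, then shift the history."""
        w = self.weights[phase]
        w[0] += 1
        for i, h in enumerate(self._inputs(phase), start=1):
            w[i] += 1 if h == 1 else -1
        self.history = (self.history + [phase])[-self.n:]


m = MultiPhasePerceptron(num_phases=3, history_len=2)
for _ in range(3):
    m.observe(1)
guess = m.predict()
```

Because only the perceptron of the observed phase is updated, the others learn nothing from an interval, which is consistent with the later comment that this predictor does not always adapt quickly.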

24
Phase prediction results

GCC
Last phase: 96% accurate
RLE Markov: 94% accurate
Perceptron: much lower

25
Phase prediction comments

Sherwood had lower accuracy for last phase (70%),
perhaps due to oscillation
Training cost of multiple perceptrons means that
it does not always adapt quickly
Not worth improving due to the accuracy of last
phase

26
Outline

Phase Tracking
Phase Prediction
Applications
Phase Based Branch Prediction
Phase Based Cache Configuration
Summary / Conclusions

27
Phase Based Dynamic Branch Predictor

Previous research shows the usefulness of
adapting branch predictors at run time
"Dynamic History-Length Fitting: A Third Level of
Adaptivity for Branch Prediction" (Juan,
Sanjeevan, Navarro)
"Combining Branch Predictors" (McFarling)
Single branch predictor may not perform well
within and across different executions.
"A Study of Branch Prediction Strategies" (Smith)
Program behavior almost uniform within a phase ->
choose best predictor for each phase

28
Methodology

Select a small group of relevant predictors
At the beginning of each new phase, sample all
the predictors and choose the best
Save the best for each phase and use it if a
phase reoccurs
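
The three steps above can be sketched as a small selector. The class name, the callback interface, and the use of miss rate as the score are assumptions for illustration; in the real setup the sampling would happen over profiling periods inside the simulator.

```python
class PhasePredictorSelector:
    """Sample every predictor on a new phase and remember the winner."""

    def __init__(self, predictors):
        self.predictors = list(predictors)
        self.best = {}  # phase_id -> predictor chosen for that phase

    def choose(self, phase_id, sample_miss_rate):
        """Return the predictor to use for phase_id.

        sample_miss_rate(name) is a hypothetical callback that runs the
        named predictor for one profiling period and returns its miss rate.
        It is only called the first time a phase is seen.
        """
        if phase_id not in self.best:
            self.best[phase_id] = min(self.predictors, key=sample_miss_rate)
        return self.best[phase_id]


# Usage with made-up miss rates standing in for real measurements.
fake_rates = {"2level": 5.0, "bimodal": 7.0}
selector = PhasePredictorSelector(["2level", "bimodal"])
choice = selector.choose(1, lambda name: fake_rates[name])
```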

34
Dynamic Adaptations

Possible dynamic adaptations
Multiple Branch Predictors
2Level, Bimodal
Sample each for one profiling period
Select on basis of miss rate, number of
mis-speculated instructions, ...
Varying History Lengths
History lengths 0 to 12
Some workloads give better performance with
smaller history

35
Multiple Branch Predictors

Set of predictors
2level 1 1024 8 (Baseline predictor)
Bimodal 1024
2level 8 512 8
2level 1 512 8
Profile period
10 million instructions

36
Multiple Branch Predictors

Simulator Used
Simplescalar v3.0d
Set of benchmarks
gcc, vpr, mcf, ammp, art
Selection Criterion
Least Miss Rate
If miss rates of two predictors are within 1%,
select the less expensive (simpler) one
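
The selection criterion can be expressed as a short function. This sketch reads the tolerance as one percentage point of miss rate, which is an assumption (the slide could also mean a relative 1%), and it takes the cost ordering of the predictors as an input because the slides do not rank them.

```python
def select_predictor(miss_rates, cost_order, tolerance=1.0):
    """Pick the cheapest predictor whose miss rate is close to the best.

    miss_rates: dict of predictor name -> miss rate in percent.
    cost_order: all predictor names, least expensive (simplest) first.
    tolerance:  allowed gap to the lowest miss rate, in percentage points.
    """
    lowest = min(miss_rates.values())
    for name in cost_order:
        if miss_rates[name] - lowest <= tolerance:
            return name
    return None  # only reached if cost_order omits the best predictor


order = ["bimodal", "2level"]  # assumed ordering: bimodal is the simpler one
close_call = select_predictor({"bimodal": 6.5, "2level": 6.0}, order)
clear_win = select_predictor({"bimodal": 9.0, "2level": 6.0}, order)
```

In the first call the two predictors are within the tolerance, so the simpler one is kept; in the second the gap is too large and the lower miss rate wins.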

37
Multiple Branch Predictor Results: IPC (gcc)
38
Multiple Branch Predictors Results: Branch
Predictor Misses (gcc)
39
Multiple Branch Predictor Results: IPC (vpr)
40
Multiple Branch Predictors Results: Branch
Predictor Misses (vpr)
41
Multiple Branch Predictors Results: Branch
Predictor Misses (mcf)
42
Multiple Branch Predictors: IPC Comparison
43
Multiple Branch Predictors: Branch Prediction
Misses Comparison
44
Varying History Length

G-share predictor with varying history lengths
Set of history lengths sampled
0,3,6,8,12
Selection Criterion
Least Miss Rate
If miss rates of two predictors are within 1%,
select the less expensive (simpler) one

45
Varying History Length

Set of benchmarks
gcc, mcf
Simulator Used
Simplescalar v3.0d
Profile Period
10 million instructions

46
Varying History Length Results: IPC (gcc)
47
Varying History Length Results: Branch Predictor
Misses (gcc)
48
Varying History Length Results: Instruction
Cache Misses (IL1) (gcc)
49
Outline

Phase Tracking
Phase Prediction
Applications
Phase Based Branch Prediction
Phase Based Cache Configuration
Summary / Conclusions

50
Cache optimization

Smaller caches use less power
Some phases of execution will use less memory or
execute a smaller region of code and therefore
need less cache
We can use a smaller cache for these phases
without affecting performance

51
Methodology

Try 4 possibilities of data and instruction cache
simultaneously
Data cache and instruction cache misses should be
independent
Select the best combination

[Figure: Data and Instr cache selection for Phase 2]
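
A rough sketch of this sampling scheme follows. Much of it is assumed: the candidate sizes, the measure callback, and the rule for "best" (the smallest cache whose miss count is within 5% of the lowest observed) are all placeholders, since the slides do not state them. The part taken from the slide is that data and instruction cache sizes are varied in the same sampling interval and then chosen independently.

```python
def sample_and_select(sizes_kb, measure, tolerance=0.05):
    """Try each size on both caches at once, then pick each cache separately.

    measure(d_size, i_size) is a hypothetical callback that runs one
    sampling interval and returns (data_misses, instr_misses).
    """
    d_misses, i_misses = {}, {}
    for size in sizes_kb:
        # One interval per candidate: both caches use this size together.
        d_misses[size], i_misses[size] = measure(size, size)

    def smallest_ok(misses):
        # Smallest size whose misses are within tolerance of the fewest.
        limit = min(misses.values()) * (1 + tolerance)
        return min(s for s, m in misses.items() if m <= limit)

    return smallest_ok(d_misses), smallest_ok(i_misses)


# Made-up miss counts standing in for simulator measurements.
fake_d = {8: 1000, 16: 500, 32: 100, 64: 100}
fake_i = {8: 50, 16: 50, 32: 50, 64: 50}
d_size, i_size = sample_and_select(
    [8, 16, 32, 64], lambda d, i: (fake_d[d], fake_i[i])
)
```

Sampling both caches in the same interval needs only 4 intervals instead of 16, which relies on the slide's point that data and instruction cache misses should be independent.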
52
Cache optimization results

GCC IPC
Fixed 32K cache (16K + 16K): 1.807
Fixed 128K cache (64K + 64K): 1.896
Optimizer: 1.855
Average 49K total

53
Cache comparison
54
Outline

Phase Tracking
Phase Prediction
Applications
Phase Based Branch Prediction
Phase Based Cache Configuration
Summary / Conclusions

55
Summary

Significant reduction in branch mispredictions
(29.88% - 44.35%) using phase based branch
predictors
Simple predictors beat more complex predictors in
many phases
Marginal gains in IPC using multiple branch
predictors (2.24% - 4.70%)
Marginal gains in IL1 misses using phase based
multiple branch predictors.

56
Summary (cont...)

Phase based dynamic history length fitting does
not give good gains

57
Conclusions 1

Phase based optimizations provide scope for
improvements using reconfigurable hardware
Using phase specific branch predictors provides
good improvements in mispredictions
A good strategy for saving power, as fewer
mispredictions may result in fewer mis-speculated
instructions

58
Conclusion 2

However, varying history length does not result
in substantial savings
More benchmarks need to be considered to
understand the effect of history length
adaptations

59
Questions??
