Branches - PowerPoint PPT Presentation

1
Branches
Daniel Ángel Jiménez
Departments of Computer Science
UT San Antonio and Rutgers
2
About Me
  • Born in Fort Hood, Texas in 1969 (80 miles north
    on IH-35)
  • Dad from Mexico, Mom from Texas
  • Lived in Temple, Texas
  • Moved to San Antonio, Texas in 1973 (80 miles
    south on IH-35)
  • B.S. at UTSA, 1992
  • M.S. at UTSA, 1994
  • Moved to San Marcos, Texas in 1995 (30 miles
    south on IH-35)
  • Started Ph.D. program at UT Austin
  • Moved back to San Antonio in 1996
  • Non-tenure-track faculty, UTHSCSA
  • Moved to Austin in 1999
  • Ph.D. UT Austin, 2002
  • Moved to New Jersey in 2002, New York 2003
  • Asst. Professor, Rutgers
  • Sabbatical in Barcelona, Spain in 2005
  • Back to San Antonio in 2007
  • Associate Professor, UTSA
  • Mostly for the breakfast tacos

3
More about me
  • Always liked computer programming
  • First computer was Tandy Color Computer in 1984
  • Fortunate sequence of mentors guided me into my
    career
  • Mom: education is important (didn't believe her
    at the time)
  • Neal Wagner: theory is exciting
  • Hugh Maynard: math is my friend
  • Betty Travis: Research Careers for Minority
    Scholars
  • Calvin Lin: perfect-fit Ph.D. advisor
  • Uli Kremer: welcomed me into being a professor
  • Like taekwondo, piano, traveling, Spanish music
  • Current favorite band: Ojos de Brujo

4
This Talk
  • How an instruction is processed: pipelining
  • Kinds of branches
  • Branch prediction
  • Accuracy
  • Technique
  • Empirical properties of branches
  • How to handle branches
  • Conclusion

5
How an Instruction is Processed
Processing can be divided into five stages:
Instruction fetch
Instruction decode
Execute
Memory access
Write back
6
Instruction-Level Parallelism
To speed up the process, pipelining overlaps
execution of multiple instructions, exploiting
parallelism between instructions
Instruction fetch
Instruction decode
Execute
Memory access
Write back
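A rough sketch of why overlapping helps, assuming every stage takes exactly one cycle and nothing ever stalls. The function names and the stage abbreviations are illustrative and not from the talk.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # fetch, decode, execute, memory, write back


def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Once the pipeline is full, one instruction completes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def pipeline_diagram(n_instructions):
    """One text row per instruction; instruction i is in stage s during cycle i + s."""
    total = pipelined_cycles(n_instructions)
    rows = []
    for i in range(n_instructions):
        cells = ["."] * total
        for s, name in enumerate(STAGES):
            cells[i + s] = name
        rows.append(" ".join(f"{c:>3}" for c in cells))
    return "\n".join(rows)
```

Calling `print(pipeline_diagram(4))` draws the familiar staircase; under these idealized assumptions 100 instructions take 500 cycles without pipelining and 104 with it.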
7
Control Hazards: Branches
Conditional branches create a problem for
pipelining: the next instruction can't be fetched
until the branch has executed, several stages
later.
Branch instruction
8
Pipelining with Branches
Branches cause bubbles in the pipeline, where
some stages are left idle.
Instruction fetch
Instruction decode
Execute
Memory access
Write back
Unresolved branch instruction
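A back-of-the-envelope sketch of what the bubbles cost when there is no prediction at all. It assumes a five-stage pipeline in which a branch resolves at the end of the third stage (execute), that fetch simply waits until then, and that every branch is followed by at least one more instruction. The function name and parameters are hypothetical.

```python
def cycles_with_branch_stalls(n_instructions, n_branches,
                              n_stages=5, resolve_stage=3):
    """Total cycles when fetch stalls behind every unresolved branch.

    resolve_stage is 1-based: a branch fetched in cycle c is resolved at the
    end of cycle c + resolve_stage - 1, so the next fetch slips by
    resolve_stage - 1 cycles compared with the ideal pipeline.
    """
    if n_instructions == 0:
        return 0
    ideal = n_stages + n_instructions - 1
    bubbles_per_branch = resolve_stage - 1
    return ideal + n_branches * bubbles_per_branch
```

With 100 instructions of which 20 are branches, this model gives 144 cycles instead of the ideal 104.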
9
Branch Prediction
A branch predictor allows the processor to
speculatively fetch and execute instructions down
the predicted path.
Instruction fetch
Instruction decode
Execute
Memory access
Write back
Speculative execution
Branch predictors must be highly accurate to
avoid mispredictions!
10
Kinds of Branches
  • Conditional
      • Very common, 1/4 to 1/10 of instructions
      • Must be predicted, can be hard to predict
      • Loop back edges with short fixed trip counts
        can be predicted perfectly
  • Unconditional
      • Targets still have to be predicted with the
        branch target buffer (BTB)
  • Indirect
      • E.g. jumping through a table of addresses
      • Can be predicted, often just use BTB as
        predictor
  • Returns
      • Predicted with a return address stack (RAS)
      • >99% accuracy possible if you avoid deep
        recursion
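A minimal sketch of why deep recursion hurts return prediction, assuming the return address stack is a small circular buffer that silently overwrites its oldest entry on overflow. The class, its default depth of 16, and the test harness are illustrative assumptions; real hardware differs in detail.

```python
class ReturnAddressStack:
    """Fixed-size circular stack; overflow overwrites the oldest entry."""

    def __init__(self, depth=16):
        self.depth = depth
        self.entries = [None] * depth
        self.top = 0  # pushes minus pops; used modulo depth as the slot index

    def push(self, return_address):
        """Called on a call instruction."""
        self.entries[self.top % self.depth] = return_address
        self.top += 1

    def pop(self):
        """Called on a return; the popped value is the predicted target."""
        if self.top == 0:
            return None
        self.top -= 1
        return self.entries[self.top % self.depth]


def return_accuracy(call_depth, ras_depth):
    """Nest call_depth calls with distinct return addresses, then unwind."""
    ras = ReturnAddressStack(ras_depth)
    for address in range(call_depth):
        ras.push(address)
    correct = 0
    for address in reversed(range(call_depth)):
        if ras.pop() == address:
            correct += 1
    return correct / call_depth
```

In this model a call chain that fits in the stack is predicted perfectly, while a chain four times deeper than the stack gets only the innermost quarter of its returns right.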

11
Branch Predictor Accuracy is Critical
  • The cost of a misprediction is proportional to
    pipeline depth
  • Predictor accuracy is more important for deeper
    pipelines
  • Need good branch predictor to feed core with
    right-path insts
  • Deeper pipelines allow higher clock rates by
    decreasing the delay of each pipeline stage
  • Decreasing the misprediction rate from 9% to 4%
    results in a 31% speedup for a 32-stage pipeline
  • Today's pipelines have been scaled back, but only
    temporarily

Simulations with SimpleScalar/Alpha
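The link between misprediction rate, pipeline depth, and performance can be sketched with a simple first-order CPI model. Everything here is an assumption for illustration: the base CPI of 1.0, the branch fraction of 0.2 (inside the 1/4 to 1/10 range quoted earlier), and a penalty equal to the pipeline depth. It is not the simulation behind the 31% figure and is not expected to reproduce it.

```python
def cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Cycles per instruction with a fixed penalty per mispredicted branch."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def speedup(old_rate, new_rate, base_cpi=1.0,
            branch_fraction=0.2, penalty_cycles=32):
    """Ratio of old to new CPI when only the misprediction rate changes."""
    old = cpi(base_cpi, branch_fraction, old_rate, penalty_cycles)
    new = cpi(base_cpi, branch_fraction, new_rate, penalty_cycles)
    return old / new
```

Comparing `speedup(0.09, 0.04, penalty_cycles=32)` with `speedup(0.09, 0.04, penalty_cycles=8)` shows the same accuracy improvement paying off more as the penalty, and hence the pipeline depth, grows.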
12
Conditional Branch Prediction
  • Most predictors are based on 2-level adaptive
    branch prediction [Yeh & Patt 91]
  • Branch outcomes are shifted into a history
    register, 1 for taken, 0 for not taken
  • History bits and address bits combine to index a
    pattern history table (PHT) of 2-bit saturating
    counters
  • Prediction is high bit of counter
  • Counter is incremented if branch is taken,
    decremented if branch is not taken

GAs: a common type of predictor
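The scheme described on this slide can be sketched as follows. This is a simplified model, not any particular processor's predictor: the table sizes, the weakly-not-taken initial counter value, and forming the index by concatenating low address bits with the global history are all assumptions.

```python
class GAsPredictor:
    """Global history plus address bits index a table of 2-bit counters."""

    def __init__(self, history_bits=8, address_bits=4):
        self.history_bits = history_bits
        self.address_bits = address_bits
        self.history = 0  # most recent outcome in the low bit
        # Counters range over 0..3; start at 1 (weakly not taken).
        self.pht = [1] * (1 << (history_bits + address_bits))

    def _index(self, pc):
        address = pc & ((1 << self.address_bits) - 1)
        return (address << self.history_bits) | self.history

    def predict(self, pc):
        """Prediction is the high bit of the counter: 2 or 3 means taken."""
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Saturating increment or decrement, then shift in the outcome."""
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def count_mispredictions(predictor, pc, outcomes):
    """Predict, compare, then train on each outcome in order."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong
```

Feeding it a loop back edge with a trip count of four (taken, taken, taken, not taken, repeated) shows the behavior claimed for short fixed trip counts: after a brief warm-up the 8-bit history identifies the position in the loop and the branch is predicted without error.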
13
Characteristics of Branch Behavior
  • Branches tend to be highly biased
  • 53% are strongly biased, taken at least 98% or at
    most 2% of the time
  • Remaining branches also exhibit weak biases
  • A few branches show no bias
  • Branch outcomes are highly correlated with past
    branch history
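Bias of this kind can be measured from a branch trace. The sketch below assumes a trace of (branch address, taken) pairs and counts static branches; whether the 53% figure above refers to static or dynamic branches is not stated, so that choice is an assumption, as are the function names.

```python
from collections import defaultdict


def taken_rates(trace):
    """Map each branch address to the fraction of its executions that were taken."""
    counts = defaultdict(lambda: [0, 0])  # address -> [times taken, times executed]
    for pc, taken in trace:
        counts[pc][0] += int(taken)
        counts[pc][1] += 1
    return {pc: taken / total for pc, (taken, total) in counts.items()}


def strongly_biased_fraction(trace, threshold=0.98):
    """Fraction of branches taken at least `threshold` or at most 1 - threshold of the time."""
    rates = taken_rates(trace)
    if not rates:
        return 0.0
    strong = [pc for pc, rate in rates.items()
              if rate >= threshold or rate <= 1 - threshold]
    return len(strong) / len(rates)
```

A trace could come from a simulator or from instrumentation; the 98% threshold is the one quoted on this slide.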

14
Important Facts about Branches
  • A taken branch is (often) more costly than an
    untaken branch
      • Trace caches can mitigate this
  • Mispredicted branches are very costly
      • Some mispredictions are more costly than
        others: how to exploit that?
  • Be aware of your machine's indirect branch
    predictor
      • What's the best way to compile dense
        switch/case statements?
      • What to do about virtual dispatch?
  • Some ISAs have hint bits
      • These can help a lot if set correctly
      • But only if the microarchitecture uses them

15
What to do about mispredictions?
  • Capacity/Conflict
      • Too many program paths, collisions in tables
      • Solutions: use the hint bits or align branches
      • Unfortunately, branch predictors are secret, so
        options are limited
  • Branches not correlated with recent history
      • Split loops so trip counts are within history
        length
  • Data-dependent branches with unfriendly
    distributions
      • Predicate if possible
  • Profile
      • Performance counters: tools such as VTune or
        OProfile
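"Predicate if possible" means replacing a data-dependent branch with straight-line code that computes the same result. The sketch below only illustrates the shape of that transformation: the Python interpreter still branches internally, so nothing here is a performance claim, and the function names are hypothetical. In compiled code the same idea typically becomes a conditional move or a predicated instruction.

```python
def count_below_branchy(values, limit):
    """Data-dependent branch: hard to predict if the values look random."""
    count = 0
    for v in values:
        if v < limit:
            count += 1
    return count


def count_below_predicated(values, limit):
    """Same result with the comparison used as data (0 or 1), not control flow."""
    count = 0
    for v in values:
        count += int(v < limit)
    return count


def select(condition, a, b):
    """Branch-free choice between two integers using an all-ones or all-zeros mask."""
    mask = -int(bool(condition))  # -1 (all ones) if true, 0 if false
    return (a & mask) | (b & ~mask)
```

Both counting functions return the same value for any input; `select(x < y, x, y)` computes a minimum without an `if`.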

16
Conclusion
  • Branches can have variable costs due primarily to
    prediction
  • Be aware of the implementation of branches
  • Profiling and ISA support for branches
  • Different causes and effects of mispredictions
  • Impact of mispredictions has crept up in recent
    years

17
The End http://www.cs.utsa.edu/dj
18
Related Compiler Work
  • Profile-guided code placement to improve
    instruction locality
      • Program restructuring for virtual memory
        [Hatfield & Gerald 71]
      • Reducing conflict misses in direct-mapped
        I-caches [McFarling 88, 89]
      • Procedure placement [Pettis & Hansen 90],
        [Gloy & Smith 99]
  • Transformations for reducing branch costs
      • Branch alignment [Calder & Grunwald 94],
        [Young et al. 97]
      • Software trace cache [Ramirez et al. 99]
  • Transformations for improving predictor accuracy
      • Static correlated branch prediction [Young &
        Smith 99]
      • Address adjustment [Chen & King 99]
      • Reverse-engineering branch predictors
        [Milenkovic et al. 04]
      • PHT partitioning [Jiménez 05]