What Programming Language/Compiler Researchers should Know about Computer Architecture

Slides: 31

Provided by: lizy

Learn more at: https://www.cs.utexas.edu

1
What Programming Language/Compiler Researchers
should Know about Computer Architecture

Lizy Kurian John
Department of Electrical and Computer Engineering
The University of Texas at Austin

2
Somebody once said

Computers are dumb actors and compilers/programmers
are the master playwrights.

3
Computer Architecture Basics

ISAs
RISC vs CISC
Assembly language coding
Datapath (ALU) and controller
Pipelining
Caches
Out of order execution
Hennessy and Patterson architecture books

4
Basics

ILP
DLP
TLP
Massive parallelism
SIMD/MIMD
VLIW
Performance and Power metrics

Hennessy and Patterson architecture books
ASPLOS, ISCA, MICRO, HPCA

5
The Bottom Line

Programming language choice affects performance
and power, e.g., Java
Compilers affect performance and power

6
A Java Hardware Interpreter

Radhakrishnan, Ph.D. 2000 (ISCA 2000, ICS 2001)
This technique is used by Nazomi Communications and
Parthus (Chicory Systems)

7
HardInt Performance

Hard-Int performs consistently better than the
interpreter
In JIT mode, significant performance boost in 4
of 5 applications.

8
Compiler and Power
[Figure: a data dependence graph (DDG) of instructions A-F and two alternative four-cycle schedules of it]
Schedule 1: Peak Power 2, Energy 6
Schedule 2: Peak Power 3, Energy 6
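
A minimal sketch of the accounting behind the two captions, assuming one unit of energy per instruction and peak power equal to the largest number of instructions issued in any one cycle. The two schedules below are hypothetical, chosen only to match the captions (peak power 2 vs. 3, energy 6 in both); the slide's exact placement of A-F is not recoverable from the transcript.

```python
def peak_power(schedule):
    """Peak power: the most instructions issued in any single cycle."""
    return max(len(cycle) for cycle in schedule)


def energy(schedule):
    """Energy: total instructions executed, one unit each (assumed)."""
    return sum(len(cycle) for cycle in schedule)


# Hypothetical four-cycle schedules of instructions A-F (not from the slide).
balanced = [["A", "E"], ["B", "C"], ["D"], ["F"]]
bursty = [["A"], ["B", "C", "E"], ["D"], ["F"]]

for name, sched in (("balanced", balanced), ("bursty", bursty)):
    print(name, "peak power", peak_power(sched), "energy", energy(sched))
```

Both schedules do the same work in the same number of cycles, so energy is identical; only the distribution of instructions across cycles, and hence peak power, differs.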
9
Valluri et al., 2001 HPCA workshop

Quantitative study
Influence of state-of-the-art optimizations on
energy and power of the processor examined
Optimizations studied:
Standard levels -O1 to -O4 of the DEC Alpha cc compiler
Four individual optimizations: simple
basic-block instruction scheduling, loop
unrolling, function inlining, and aggressive
global scheduling

10
Standard Optimizations on Power
11
Somebody once said

Computers are dumb actors and compilers/programmers
are the master playwrights.

12
A large part of modern out of order processors

is hardware that could have been eliminated if a
good compiler existed.

13
Let me get more arrogant

A large part of modern out of order processors
was designed because
computer architects thought compiler writers
could not do a good job.

14
Value Prediction

Is a slap in your face
Shen and Lipasti

15
Value Locality

Likelihood that an instruction's computed result,
or a similar predictable result, will recur soon
Observation: a limited set of unique values
constitutes the majority of values produced and
consumed during execution
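
One way to put a number on this, sketched under assumptions: count how often a static instruction produces a value it already produced within its last few executions. The `(pc, value)` trace format, the `depth` parameter, and the sample trace are all hypothetical, introduced here for illustration only.

```python
from collections import defaultdict, deque


def value_locality(trace, depth=1):
    """Fraction of dynamic results that match one of the last `depth`
    values produced by the same static instruction (keyed by PC)."""
    history = defaultdict(lambda: deque(maxlen=depth))
    hits = 0
    for pc, value in trace:
        if value in history[pc]:
            hits += 1
        history[pc].append(value)  # oldest entry drops out once full
    return hits / len(trace) if trace else 0.0


# Hypothetical trace of (pc, value) pairs.
trace = [(0x10, 0), (0x10, 0), (0x14, 7), (0x10, 0),
         (0x14, 8), (0x14, 7), (0x10, 1), (0x10, 0)]

print(value_locality(trace, depth=1))  # 0.25
print(value_locality(trace, depth=4))  # 0.5
```

A deeper history catches values that alternate rather than repeat back to back, which is why the measured locality rises with `depth` in this trace.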

16
Load Value Locality
17
Causes of value locality

Data redundancy: many 0s, sparse matrices, white
space in files, empty cells in spreadsheets
Program constants
Computed branches: the base address for jump tables
is a run-time constant
Virtual function calls: involve code to load a
function pointer, which can be constant

18
Causes of value locality

Memory alias resolution: the compiler conservatively
generates code that may contain stores that alias
with loads
Register spill code: stores and subsequent loads
Convergent algorithms: convergence in parts of
algorithms before global convergence
Polling algorithms
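
The causes listed above all lead to instructions that keep producing the same value, which is what value prediction exploits. Below is a minimal sketch of a last-value predictor, assuming an unbounded table indexed by PC; real hardware tables are finite, and the class name, trace format, and polling trace are hypothetical.

```python
class LastValuePredictor:
    """Predict that an instruction produces the same value as last time."""

    def __init__(self):
        self.table = {}  # pc -> last value seen (unbounded: a simplification)

    def predict(self, pc):
        return self.table.get(pc)  # None means no prediction yet

    def update(self, pc, actual):
        self.table[pc] = actual


def accuracy(trace):
    """Fraction of (pc, value) results the predictor gets right."""
    predictor = LastValuePredictor()
    correct = 0
    for pc, actual in trace:
        if predictor.predict(pc) == actual:
            correct += 1
        predictor.update(pc, actual)
    return correct / len(trace) if trace else 0.0


# Hypothetical polling loop: a flag load returns 0 nine times, then 1.
polling = [(0x20, 0)] * 9 + [(0x20, 1)]
print(accuracy(polling))  # 0.8
```

The predictor misses only on the first load (empty table) and on the load where the flag finally changes.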

19
Two Extremist Views

Anything that can be done in hardware should be
done in hardware.
Anything that can be done in software should be
done in software.

20
What do we need?

The dumb actor
or
the defiant actor who pays very little
attention to the script?

21
Challenging all compiler writers

The last 15 years were the defiant actor's era
What about the next 15? TLP, multithreading,
parallelizing compilers
It's time for a lot more dumb acting from the architects' side.
And it's time for some good scriptwriting from
the compiler writers' side.

22
BACKUP
23
Compiler Optimizations

cc: native C compiler on DEC Alpha 21064 running
the OSF/1 operating system
gcc: used to study the effect of individual
optimizations

24
Standard Optimization Levels on cc

-O0: no optimizations performed
-O1: local optimizations such as CSE, copy
propagation, IVE, etc.
-O2: inline expansion of static procedures and
global optimizations such as loop unrolling and
instruction scheduling
-O3: inline expansion of global procedures
-O4: software pipelining, loop vectorization, etc.

25
Standard Optimization Levels on gcc

-O0: no optimizations performed
-O1: local optimizations such as CSE, copy
propagation, dead-code elimination, etc.
-O2: aggressive instruction scheduling
-O3: inlining of procedures

NOTE
Almost the same optimizations are in each level of cc and
gcc
In cc and gcc, optimizations that increase ILP
are in levels -O2, -O3, and -O4
cc is used wherever possible; gcc is used where
specific hooks are required
26
Individual Optimizations

Four gcc optimizations, each applied on top of -O1
-fschedule-insns: local register allocation
followed by basic-block list scheduling
-fschedule-insns2: postpass scheduling
-finline-functions: integrates all simple
functions into their callers
-funroll-loops: performs loop unrolling
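
To make the last transformation concrete, here is a hand-written sketch of what loop unrolling does to a loop. It is an illustration in Python, not gcc output; the function names and the unroll factor of 4 are assumptions.

```python
def sum_rolled(xs):
    """Original loop: one loop test and index update per element."""
    total = 0
    for i in range(len(xs)):
        total += xs[i]
    return total


def sum_unrolled4(xs):
    """Same loop unrolled by 4: one loop test per four elements."""
    total = 0
    n = len(xs)
    main = n - n % 4  # largest multiple of 4 not exceeding n
    for i in range(0, main, 4):
        total += xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3]
    for i in range(main, n):  # remainder loop for the leftover elements
        total += xs[i]
    return total


data = list(range(10))
print(sum_rolled(data), sum_unrolled4(data))  # 45 45
```

The unrolled body has fewer loop-control operations per element and more independent additions per iteration for a scheduler to overlap, at the cost of larger code.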

27
Some observations

Energy consumption falls when the number of instructions
is reduced, i.e., when the total work done is
less, energy is less
Power dissipation is directly proportional to IPC
28
Observations (contd.)