Performance and Quantitative Principles

ENGS 116, Lecture 2

1
Performance and Quantitative Principles
  • Vincent H. Berk
  • September 26th, 2008
  • Reading for today: Chapter 1.1 - 1.4, Amdahl
    article
  • Reading for Monday: Chapter 1.5 - 1.11, Mazor
    article
  • Homework for Wednesday: 1.1, 1.3, 1.6, 1.7, 1.13

2
Review
  • Task of Computer Designers
      • Determine which attributes are important for a
        new machine
      • Design a machine to maximize performance without
        violating cost/power/functionality constraints
  • 3 Components of Architecture
      • Instruction set design
      • Organization
      • Hardware

3
Benchmarking Games
  • Different configurations used to run the same
    workload on two systems.
  • Compiler customized to optimize the workload.
  • Workload arbitrarily picked to skew results.
  • Test specification written to be biased toward
    one machine.

4
Design benchmarks for
  • Industrial and design
  • Consumer Electronics
  • Networking, routers
  • Office applications
  • Telecommunications
  • Weapon systems

5
Execution time
  • Weighted arithmetic mean: the sum, over all
    programs run, of each program's execution time
    times its relative frequency
  • Normalized execution time: take a reference
    machine, set its execution time to 1, then compute
    the other machines' execution times relative to it
  • Geometric mean of normalized execution times:
    the choice of reference computer becomes
    irrelevant, so the ratio between any two machines
    can be compared directly
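The three summary measures above can be sketched in a few lines of Python. This is only an illustration: the machine names and execution times are made-up numbers, not data from the lecture. It shows that the ratio of two machines' geometric means does not change when the reference machine changes.

```python
from math import prod


def weighted_arithmetic_mean(times, weights):
    """Sum of each program's execution time times its relative frequency."""
    total_weight = sum(weights)
    return sum(t * w for t, w in zip(times, weights)) / total_weight


def normalize(times, reference_times):
    """Execution times relative to a reference machine (reference maps to 1.0)."""
    return [t / r for t, r in zip(times, reference_times)]


def geometric_mean(values):
    """n-th root of the product of n values."""
    return prod(values) ** (1.0 / len(values))


# Hypothetical execution times (seconds) of two programs on three machines.
machine_a = [1.0, 1000.0]
machine_b = [10.0, 100.0]
machine_c = [20.0, 20.0]

# Weighted arithmetic mean for machine A with equal program frequencies.
wam_a = weighted_arithmetic_mean(machine_a, [0.5, 0.5])

# Compare B against C twice, using A and then B as the reference machine.
ratio_ref_a = (geometric_mean(normalize(machine_b, machine_a))
               / geometric_mean(normalize(machine_c, machine_a)))
ratio_ref_b = (geometric_mean(normalize(machine_b, machine_b))
               / geometric_mean(normalize(machine_c, machine_b)))

print(wam_a, ratio_ref_a, ratio_ref_b)
```

Both ratios come out the same, which is the sense in which the reference computer becomes irrelevant.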

6
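Amdahl's Law
$$\text{Execution time after improvement} = \frac{\text{Execution time affected by improvement}}{\text{Amount of improvement}} + \text{Execution time unaffected}$$

Make the common case fast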
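
7
Amdahl's Law
  • Speedup due to enhancement E:
    $$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$
  • Suppose that enhancement E accelerates a
    fraction F of the task by a factor S, and the
    remainder of the task is unaffected. Then:
    $$\text{ExTime}(E) = \text{ExTime w/o } E \times \left[(1 - F) + \frac{F}{S}\right]$$
    $$\text{Speedup}(E) = \frac{1}{(1 - F) + F/S}$$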
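As a minimal sketch, the speedup from an enhancement that accelerates a fraction F of the task by a factor S can be computed as below. The function name and the sample values (F = 0.4, S = 10) are illustrative assumptions, not taken from the slides.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when fraction F of the task is accelerated by factor S.

    Speedup(E) = 1 / ((1 - F) + F / S)
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Hypothetical case: 40% of the task runs 10x faster.
overall = amdahl_speedup(0.4, 10.0)

# Even an enormous S cannot push the speedup past 1 / (1 - F).
upper_bound = 1.0 / (1.0 - 0.4)
nearly_infinite = amdahl_speedup(0.4, 1e12)

print(overall, nearly_infinite, upper_bound)
```

The unaffected fraction (1 - F) caps the achievable speedup, which is why the common case is the one worth making fast.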

8
Amdahl's Law

9
Amdahl's Law
  • Example: Floating-point instructions are improved to
    run 2X faster, but only 10% of actual instructions are
    FP
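The example can be worked through with Amdahl's Law using F = 0.1 and S = 2. This sketch assumes that the 10% of instructions that are FP also account for 10% of the original execution time, which the slide does not state explicitly.

```latex
% F = 0.1 (fraction of the task that is FP, assumed equal to the FP instruction fraction)
% S = 2   (speedup of the FP part)
\begin{align*}
\text{ExTime}_{\text{new}} &= \text{ExTime}_{\text{old}} \times \left[(1 - 0.1) + \frac{0.1}{2}\right]
                            = 0.95 \times \text{ExTime}_{\text{old}} \\
\text{Speedup}_{\text{overall}} &= \frac{1}{0.95} \approx 1.053
\end{align*}
```

Doubling FP speed yields only about a 5% overall gain, because 90% of the task is untouched.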

10
Corollary: Make the Common Case Fast
  • All instructions require an instruction fetch;
    only a fraction require a data fetch/store.
      • Optimize instruction access over data access
  • Programs exhibit locality.
      • Spatial locality
      • Temporal locality
  • Access to small memories is faster.
      • Provide a storage hierarchy such that the most
        frequent accesses are to the smallest (closest)
        memories.

11
Metrics of Performance
[Figure: levels of a computer system, top to bottom, with the performance metric used at each level]
  • Application: answers per month, operations per second
  • Programming language, compiler, ISA: millions of
    instructions per second (MIPS), millions of FP
    operations per second (MFLOPS)
  • Datapath, control: megabytes per second
  • Function units, transistors, wires, pins: cycles per
    second (clock rate)
12
Marketing Metrics
  • MIPS (millions of instructions per second)
      • Machines with different instruction sets?
      • Programs with different instruction mixes?
      • Dynamic frequency of instructions
      • Uncorrelated with performance
  • MFLOPS (millions of FP operations per second)
      • Machine dependent
      • Often not where time is spent
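To see why a rate such as MIPS can be uncorrelated with performance, here is a small sketch using the standard definition MIPS = instruction count / (execution time × 10^6). The two machines and all their numbers are hypothetical, chosen only to show a machine with the higher MIPS rating taking longer to run the same program.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count, exec_time_seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (exec_time_seconds * 1e6)


# Hypothetical machine A: simple instruction set, so the program needs more instructions.
count_a, cpi_a, clock_a = 2e9, 1.0, 1e9
# Hypothetical machine B: richer instruction set, fewer but slower instructions.
count_b, cpi_b, clock_b = 1e9, 1.5, 1e9

time_a = cpu_time_seconds(count_a, cpi_a, clock_a)
time_b = cpu_time_seconds(count_b, cpi_b, clock_b)
mips_a = mips(count_a, time_a)
mips_b = mips(count_b, time_b)

print(f"A: {time_a:.2f} s at {mips_a:.0f} MIPS")
print(f"B: {time_b:.2f} s at {mips_b:.0f} MIPS")
```

Machine A reports the higher MIPS figure yet finishes later, because the two machines execute different numbers of instructions for the same program.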

13
Aspects of CPU Performance
                     Instr. Count   CPI   Clock Rate
  Program
  Compiler
  Instruction Set
  Organization
  Technology

14
Aspects of CPU Performance
                     Instr. Count   CPI   Clock Rate
  Program                 X          -        -
  Compiler                X         (X)       -
  Instruction Set         X          X        -
  Organization            -          X        X
  Technology              -          -        X

15
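Cycles Per Instruction
  • Average Cycles per Instruction:
    $$\text{CPI} = \frac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \frac{\text{Cycles}}{\text{Instruction Count}}$$
    $$\text{CPU time} = \text{Cycle Time} \times \sum_{i=1}^{n} \text{CPI}_i \times I_i$$
  • Instruction Frequency:
    $$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i, \quad \text{where } F_i = \frac{I_i}{\text{Instruction Count}}$$
  • Invest resources where time is spent!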

16
Example: Calculating CPI
  • Base Machine (Reg / Reg)
  Op        Freq   Cycles   CPI(i)   (% Time)
  ALU       50%    1        .5       (33%)
  Load      20%    2        .4       (27%)
  Store     10%    2        .2       (13%)
  Branch    20%    2        .4       (27%)
  Total                     1.5
  • Typical Mix
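The total CPI of 1.5 and the per-class time percentages in this example can be recomputed with a short sketch. The dictionary layout and function names are illustrative choices; the frequencies and cycle counts are the ones given for the base machine.

```python
# Instruction mix of the base machine: class -> (frequency, cycles per instruction).
BASE_MIX = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}


def average_cpi(mix):
    """Average CPI = sum over classes of frequency x cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())


def time_shares(mix):
    """Fraction of execution time spent in each instruction class."""
    total = average_cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}


cpi = average_cpi(BASE_MIX)
shares = time_shares(BASE_MIX)

print(f"CPI = {cpi:.2f}")
for op, share in shares.items():
    print(f"{op:<7} {share:.0%} of time")
```

ALU operations are half of all instructions but only a third of the time, which is why the time column, not the frequency column, shows where to invest resources.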

17
Example
  • Want to add register / memory operations
      • One source operand in memory
      • One source operand in register
      • Cycle count of 2
  • Side effect: Branch cycle count will increase to
    3.
  • What fraction of the loads must be eliminated for
    this to pay off?
  • Base Machine (Reg / Reg):
  Op        Freq   Cycles
  ALU       50%    1
  Load      20%    2
  Store     10%    2
  Branch    20%    2

18
Example Solution
  • $\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
              Old (Reg/Reg)           New (with Reg/Mem)
  Op          Freq   Cycles   CPI     Freq     Cycles   CPI
  ALU         .50    1        .5
  Load        .20    2        .4
  Store       .10    2        .2
  Branch      .20    2        .4
  Reg/Mem
  Total       1.00            1.5

19
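Example Solution
  • $\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
  • X = loads eliminated, as a fraction of the original
    instruction count; each one merges with an ALU
    operation into a single Reg/Mem operation
              Old (Reg/Reg)           New (with Reg/Mem)
  Op          Freq   Cycles   CPI     Freq     Cycles   CPI
  ALU         .50    1        .5      .5 - X   1        .5 - X
  Load        .20    2        .4      .2 - X   2        .4 - 2X
  Store       .10    2        .2      .1       2        .2
  Branch      .20    2        .4      .2       3        .6
  Reg/Mem     -      -        -       X        2        2X
  Total       1.00            1.5     1 - X             (1.7 - X) / (1 - X)
  • $\text{CPI}_{\text{New}}$ must be normalized to the new instruction
    frequency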

20
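Example Solution
  • $\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
              Old (Reg/Reg)           New (with Reg/Mem)
  Op          Freq   Cycles   CPI     Freq     Cycles   CPI
  ALU         .50    1        .5      .5 - X   1        .5 - X
  Load        .20    2        .4      .2 - X   2        .4 - 2X
  Store       .10    2        .2      .1       2        .2
  Branch      .20    2        .4      .2       3        .6
  Reg/Mem     -      -        -       X        2        2X
  Total       1.00            1.5     1 - X             (1.7 - X) / (1 - X)
  • $\text{Instr Cnt}_{\text{Old}} \times \text{CPI}_{\text{Old}} \times \text{Clock}_{\text{Old}} = \text{Instr Cnt}_{\text{New}} \times \text{CPI}_{\text{New}} \times \text{Clock}_{\text{New}}$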

21
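Example Solution
  • $\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
              Old (Reg/Reg)           New (with Reg/Mem)
  Op          Freq   Cycles   CPI     Freq     Cycles   CPI
  ALU         .50    1        .5      .5 - X   1        .5 - X
  Load        .20    2        .4      .2 - X   2        .4 - 2X
  Store       .10    2        .2      .1       2        .2
  Branch      .20    2        .4      .2       3        .6
  Reg/Mem     -      -        -       X        2        2X
  Total       1.00            1.5     1 - X             (1.7 - X) / (1 - X)
  • $\text{Instr Cnt}_{\text{Old}} \times \text{CPI}_{\text{Old}} \times \text{Clock}_{\text{Old}} = \text{Instr Cnt}_{\text{New}} \times \text{CPI}_{\text{New}} \times \text{Clock}_{\text{New}}$
  • $1.00 \times 1.5 = (1 - X) \times \frac{1.7 - X}{1 - X}$
  • $1.5 = 1.7 - X$
  • $0.2 = X$
  • ALL loads must be eliminated for this to be a win!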
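The break-even point can be checked numerically. The sketch below reads X as the number of loads eliminated per original instruction, with each eliminated load merging with one ALU operation into a Reg/Mem operation, and it assumes the clock is unchanged between the two machines. The function and constant names are illustrative.

```python
OLD_CYCLES_PER_ORIGINAL_INSTR = 1.5   # CPI of the base Reg/Reg machine
LOAD_FREQUENCY = 0.20                 # loads as a fraction of original instructions


def new_cycles_per_original_instr(x):
    """Total cycles on the new machine, per original instruction.

    x is the fraction of original instructions that are eliminated loads.
    The sum simplifies to 1.7 - x.
    """
    alu = (0.50 - x) * 1       # each Reg/Mem op also absorbs one ALU op
    load = (0.20 - x) * 2
    store = 0.10 * 2
    branch = 0.20 * 3          # branches slow down from 2 to 3 cycles
    reg_mem = x * 2
    return alu + load + store + branch + reg_mem


def new_cpi(x):
    """CPI normalized to the new, smaller instruction count (1 - x)."""
    return new_cycles_per_original_instr(x) / (1.0 - x)


# The cycle total falls by exactly x, so break-even is where 1.7 - x = 1.5.
break_even_x = new_cycles_per_original_instr(0.0) - OLD_CYCLES_PER_ORIGINAL_INSTR
fraction_of_loads = break_even_x / LOAD_FREQUENCY

print(f"break-even X = {break_even_x:.2f}")
print(f"fraction of loads to eliminate = {fraction_of_loads:.0%}")
```

Under these assumptions the new machine only ties the old one once every load is gone, so anything less than 100% elimination makes it slower.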