Title: CS1104 Computer Organization
1. CS1104 Computer Organization
- Part 2: Computer Architecture
- Lecture 2
- Performance Evaluation and Benchmarking
2. Issues in Performance Evaluation
- Response time: time between the start and completion of an event (execution time)
- Throughput: total amount of work (or number of jobs) done in a given time
- Example: Replace a processor with a faster processor, or add multiple slower processors? How are throughput and response time affected? How does this depend on the job arrival rate?
- Notion of performance: TIME. User-perceived time, CPU time (user CPU time + system CPU time)
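The faster-processor versus more-processors question can be explored with a minimal sketch. It assumes a deliberately simple model that is not part of the lecture: identical jobs that all arrive at time 0 and are served first-come, first-served. The function name `simulate` and all numbers are hypothetical.

```python
def simulate(num_jobs, job_seconds, num_processors, speed):
    """Run identical jobs, all arriving at time 0, on `num_processors`
    processors that are each `speed` times as fast as a baseline processor.
    Returns (mean response time in seconds, throughput in jobs per second)."""
    service = job_seconds / speed        # time one job occupies one processor
    free_at = [0.0] * num_processors     # time at which each processor becomes idle
    completions = []
    for _ in range(num_jobs):
        p = free_at.index(min(free_at))  # next job goes to the earliest-free processor
        free_at[p] += service
        completions.append(free_at[p])
    mean_response = sum(completions) / num_jobs
    throughput = num_jobs / max(completions)
    return mean_response, throughput

# A single job (very low arrival rate): the faster processor responds sooner.
print(simulate(1, 4.0, num_processors=1, speed=2.0))  # (2.0, 0.5)
print(simulate(1, 4.0, num_processors=4, speed=1.0))  # (4.0, 0.25)

# A backlog of 8 jobs (high arrival rate): four slower processors do better.
print(simulate(8, 4.0, num_processors=1, speed=2.0))  # (9.0, 0.5)
print(simulate(8, 4.0, num_processors=4, speed=1.0))  # (6.0, 1.0)
```

Under this model the answer flips with the load: a lone job only benefits from a faster processor, while a queue of waiting jobs benefits from parallel service.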
3. Benchmarks: Choosing Programs to Evaluate Performance
- Measure the performance of a machine using a set of programs which will hopefully emulate the workload generated by the users' programs.
- Benchmarks: programs designed to measure performance
Actual Target Workload
- Disadvantages: overly specific, non-portable, difficult to run, hard to identify source
Full Application Benchmarks
- Advantages: portable, widely used, improvements useful in reality
- Disadvantages: less representative than the above
Small Kernel Benchmarks
- Advantages: easy to use early in the design cycle
Microbenchmarks
- Advantages: identify peak capability and potential bottlenecks
- Disadvantages: peak may be far away from real application performance
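As an illustration of the small-kernel end of this spectrum, here is a minimal timing harness. It is only a sketch: the two kernels (`sum_of_squares`, `sort_reversed`) and the helper names are made up for this example and do not come from any real benchmark suite.

```python
import time

def time_once(func, *args):
    """Return the wall-clock seconds taken by one call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start

def best_of(func, *args, repeats=5):
    """Repeat the measurement and keep the minimum, the run least
    disturbed by other activity on the machine."""
    return min(time_once(func, *args) for _ in range(repeats))

# Hypothetical kernels standing in for a real benchmark suite.
def sum_of_squares(n):
    return sum(i * i for i in range(n))

def sort_reversed(n):
    return sorted(range(n, 0, -1))

results = {
    "sum_of_squares": best_of(sum_of_squares, 100_000),
    "sort_reversed": best_of(sort_reversed, 100_000),
}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Kernels like these are easy to run, but, as the list above notes, their results can be far from what a full application would see.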
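4. Summarizing Performance
Average execution time:
$$\frac{1}{n} \sum_{i=1}^{n} \text{Time}_i$$
Weighted execution time:
$$\sum_{i=1}^{n} \text{Weight}_i \times \text{Time}_i$$
where the sum of the weights is equal to 1.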
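The two summaries can be sketched in a few lines. The execution times and weights below are made-up values used only for illustration.

```python
def arithmetic_mean(times):
    """Average execution time: every program counts equally."""
    return sum(times) / len(times)

def weighted_mean(times, weights):
    """Weighted execution time: each program counts according to its weight."""
    if len(times) != len(weights):
        raise ValueError("need one weight per program")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * t for w, t in zip(weights, times))

times = [2.0, 10.0, 40.0]    # seconds per program (made up)
weights = [0.5, 0.3, 0.2]    # assumed share of each program in the workload

print(f"average:  {arithmetic_mean(times):.2f} s")
print(f"weighted: {weighted_mean(times, weights):.2f} s")
```

With these numbers the weighted time is lower than the plain average because the long-running program is assumed to make up only a small share of the workload.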
5. SPEC Benchmarks
- Normalized execution time (divide the execution time on a Sun SPARCstation by the execution time on the measured machine M) = SPEC ratio
- Performance of a new program on M = performance of the program on the reference machine × SPEC ratio
- Average normalized execution times of multiple benchmarks can be expressed as either an arithmetic or a geometric mean
Geometric mean of normalized execution times:
$$\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$
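A minimal sketch of the normalization and the geometric mean follows. The reference and measured times are invented for illustration and are not real SPEC results; `math.prod` assumes Python 3.8 or later.

```python
import math

def spec_ratio(reference_seconds, measured_seconds):
    """Normalized execution time: reference machine time / measured machine time."""
    return reference_seconds / measured_seconds

def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1.0 / len(values))

reference = [100.0, 200.0, 400.0]   # seconds on the reference machine (made up)
measured = [50.0, 50.0, 50.0]       # seconds on machine M (made up)

ratios = [spec_ratio(r, m) for r, m in zip(reference, measured)]
print("ratios:", ratios)                                   # [2.0, 4.0, 8.0]
print(f"arithmetic mean: {sum(ratios) / len(ratios):.2f}")
print(f"geometric mean:  {geometric_mean(ratios):.2f}")
```

One reason the geometric mean is often preferred for ratios: the geometric mean of the ratios equals the ratio of the geometric means, so the ranking of machines does not depend on which machine is used as the reference.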
6. Design Principle
Make the common case fast: Amdahl's Law

$$\text{Speedup} = \frac{\text{Execution time of entire task without enhancement}}{\text{Execution time of entire task with enhancement}} = \frac{e_{old}}{e_{new}}$$

$$e_{new} = e_{old} \times \left( (1 - frac_{enhanced}) + \frac{frac_{enhanced}}{speedup_{enhanced}} \right)$$

- $frac_{enhanced}$: fraction of the execution time that can benefit from the enhancement
- $speedup_{enhanced}$: how much faster does the enhanced part run?
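Amdahl's Law can be written as a small function. This is a sketch: the function and parameter names are chosen for this example, and the inputs are made up.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the original execution
    time runs `speedup_enhanced` times faster and the rest is unchanged."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# 40% of the time sped up 10x gives far less than 10x overall.
print(f"{amdahl_speedup(0.4, 10):.2f}")    # 1.56

# Even an enormous speedup of that 40% cannot beat 1 / (1 - 0.4).
print(f"{amdahl_speedup(0.4, 1e12):.2f}")  # 1.67
```

The second call shows why the common case matters: the part that is not enhanced puts a ceiling on the overall speedup.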
7. Examples
- Suppose we enhance a machine, making all floating-point instructions run five times faster. If the execution time of some benchmark before the floating-point enhancement is 10 seconds, what will the speedup be if half of the 10 seconds is spent executing floating-point instructions?
- Time = 5 (non-floating-point) + 5 (floating-point) / 5 = 6 sec.
- Speedup = 10 / 6 = 1.67
- We are looking for a benchmark to show off the new floating-point unit described above, and want the overall benchmark to show a speedup of 3. One benchmark we are considering runs for 100 seconds with the old floating-point hardware. How much of the execution time would floating-point instructions have to account for in this program in order to yield our desired speedup on this benchmark?
- Speedup = 3 = 100 / (t_fl / 5 + 100 - t_fl) ⇒ t_fl = 83.33 sec.
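Both examples can be checked numerically. The sketch below assumes only the setup stated in the examples; the helper names are hypothetical. The second function solves Speedup = 1 / ((1 - f) + f / s) for the fraction f.

```python
def time_with_enhancement(total_seconds, enhanced_seconds, speedup_enhanced):
    """New execution time when `enhanced_seconds` of the original
    `total_seconds` runs `speedup_enhanced` times faster."""
    return (total_seconds - enhanced_seconds) + enhanced_seconds / speedup_enhanced

def required_fraction(target_speedup, speedup_enhanced):
    """Fraction of the original time that must benefit from the
    enhancement to reach `target_speedup` overall."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / speedup_enhanced)

# Example 1: 10 s benchmark, 5 s of floating point, floating point 5x faster.
new_time = time_with_enhancement(10.0, 5.0, 5.0)
print(f"time = {new_time:.2f} s, speedup = {10.0 / new_time:.2f}")  # 6.00 s, 1.67

# Example 2: 100 s benchmark, target overall speedup of 3.
t_fl = 100.0 * required_fraction(3.0, 5.0)
print(f"t_fl = {t_fl:.2f} s")                                       # 83.33 s
```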
8. The CPU Performance Equation
- Instead of reporting execution time in seconds, we often use the number of clock cycles spent executing a program
- An instruction always starts at the beginning of a clock cycle
- Clock cycle time = time between two ticks (in seconds)
- Clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec)
- A 200 MHz clock has a cycle time of
$$\text{Cycle time} = \frac{1}{200 \times 10^{6}\ \text{Hz}} = 5 \times 10^{-9}\ \text{s} = 5\ \text{ns}$$
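9. The CPU Performance Equation
How many clock cycles are required to execute a program?
If different instructions require different numbers of clock cycles (which is mostly the case):
$$\text{CPU time} = (instr_A \times cycles_A + instr_B \times cycles_B + \dots) \times \text{cycle time}$$
where A, B, ... are the different possible instructions, $instr_A$ is the number of times instruction A is executed, and $cycles_A$ is the number of clock cycles it requires.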
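The per-instruction sum can be sketched as follows. The instruction classes, counts, and cycle costs are made-up values; the 200 MHz clock matches the clock rate used earlier in the lecture.

```python
def cpu_time_seconds(instruction_mix, clock_rate_hz):
    """instruction_mix maps an instruction class to a pair
    (number of times executed, clock cycles per execution)."""
    total_cycles = sum(count * cycles for count, cycles in instruction_mix.values())
    cycle_time = 1.0 / clock_rate_hz     # seconds per clock cycle
    return total_cycles * cycle_time

# Hypothetical program: counts and cycle costs are for illustration only.
mix = {
    "alu":        (50_000_000, 1),
    "load_store": (30_000_000, 2),
    "branch":     (20_000_000, 3),
}

# 50M*1 + 30M*2 + 20M*3 = 170M cycles; at 200 MHz each cycle takes 5 ns.
print(f"{cpu_time_seconds(mix, 200e6):.2f} s")  # 0.85 s
```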
10. The CPU Performance Equation
- Instruction Count (IC): total number of instructions executed for a program
- Average number of Clock Cycles Per Instruction (CPI) = Total Clock Cycles / IC
- Total CPU time = IC × CPI × Clock Cycle Time
- Therefore, CPU performance depends on:
  - Clock Cycle Time: hardware technology and organization
  - CPI: organization and instruction set architecture
  - IC: instruction set architecture (ISA) and compiler
These are dependent on each other, but at times one can be changed with small and predictable impacts on the other two.
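The equation makes it easy to compare design choices that trade one factor against another. The sketch below uses two hypothetical compilers on the same machine; all numbers are invented for illustration.

```python
def average_cpi(total_cycles, instruction_count):
    """CPI = total clock cycles / instruction count."""
    return total_cycles / instruction_count

def cpu_time(instruction_count, cpi, cycle_time_seconds):
    """Total CPU time = IC x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_seconds

CYCLE_TIME = 5e-9   # 5 ns, i.e. a 200 MHz clock

# Compiler A emits fewer but more expensive instructions (assumed values).
time_a = cpu_time(1.0e9, average_cpi(2.0e9, 1.0e9), CYCLE_TIME)
# Compiler B emits 20% more instructions, but they are cheaper on average.
time_b = cpu_time(1.2e9, average_cpi(1.8e9, 1.2e9), CYCLE_TIME)

print(f"A: {time_a:.2f} s, B: {time_b:.2f} s")   # A: 10.00 s, B: 9.00 s
print(f"B is {time_a / time_b:.2f}x faster")     # 1.11x
```

In this example a higher instruction count still wins, because the drop in CPI outweighs it; neither IC nor CPI alone predicts CPU time.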
11. IC, CPI, and Clock Cycle Time

                instr count    ave. CPI    clock rate
Program              X            (X)
Compiler             X             X
ISA                  X             X
Organization                       X            X
Technology                                      X
12. Be careful of the following concepts
- Machine → ISA and hardware organization
- Machine → cycle time
- ISA + hardware organization → cycles for any instruction (this is not CPI)
- ISA + compiler + program → instructions executed
- Therefore, ISA + compiler + program + hardware organization + cycle time → total CPU time
13. Summary
- What is performance?
- How to determine performance? Use benchmarks.
- How to summarize performance results
- Amdahl's Law: make the common case fast
- CPU performance equation
- What determines the CPU time (running time) of a program? Instructions, cycles per instruction, cycle time.
- How does each of these depend on the ISA, organization, compiler, and hardware technology?
14. Questions!