Title: Computer Performance: TIME, TIME, TIME
1. Computer Performance: TIME, TIME, TIME
- Response Time (latency)
  - How long does it take for my job to run?
  - How long does it take to execute a job?
  - How long must I wait for the database query?
- Throughput
  - How many jobs can the machine run at once?
  - What is the average execution rate?
  - How much work is getting done?
- If we upgrade a machine with a new processor, what do we increase?
- If we add a new machine to the lab, what do we increase?
2. Execution Time
- Elapsed Time
  - counts everything (disk and memory accesses, I/O, etc.)
  - a useful number, but often not good for comparison purposes
- CPU time
  - doesn't count I/O or time spent running other programs
  - can be broken up into system time and user time
- Our focus: user CPU time
  - time spent executing the lines of code that are "in" our program
3. Clock Cycles
- Instead of reporting execution time in seconds, we often use cycles
- Clock "ticks" indicate when to start activities (one abstraction)
- cycle time = time between ticks = seconds per cycle
- clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec)
- A 4 GHz clock has a cycle time of $1 / (4 \times 10^9)$ s $= 0.25$ ns $= 250$ ps
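Since cycle time and clock rate are reciprocals, converting between them is a one-line calculation. A small sketch, with hypothetical function names chosen for illustration:

```python
def cycle_time_seconds(clock_rate_hz):
    """Cycle time (seconds per cycle) is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz


def clock_rate_hz(cycle_time_s):
    """Clock rate (cycles per second) is the reciprocal of the cycle time."""
    return 1.0 / cycle_time_s


if __name__ == "__main__":
    t = cycle_time_seconds(4e9)      # 4 GHz clock
    print(f"{t * 1e12:.0f} ps")      # prints "250 ps"
```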
4. How many cycles are required for a program?
- We could assume that the number of cycles equals the number of instructions.
- This assumption is incorrect: different instructions take different amounts of time on different machines.
- Our focus: CPI = Cycles Per Instruction.
5. Problem
Some program runs in 10 seconds on computer A, which has a 400 MHz clock. We want to build a new machine B that will run this program in 6 seconds. We can increase the clock rate, but this increase will cause machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?

Solution
400 MHz $= 4 \times 10^8$ Hz => machine A provides $4 \times 10^8$ cycles per second.
The program runs 10 seconds on machine A => program execution takes $4 \times 10^9$ cycles
=> on machine B it would take $1.2 \times 4 \times 10^9 = 4.8 \times 10^9$ cycles.
We want our program to run 6 seconds on machine B
=> clock rate of machine B $= 4.8 \times 10^9 / 6 = 0.8 \times 10^9 = 8 \times 10^8$ Hz, or 800 MHz.
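The same steps can be checked numerically. This is only a sketch of the arithmetic above; the function name and parameter names are assumptions made for illustration.

```python
def target_clock_rate_hz(time_a_s, clock_rate_a_hz, cycle_ratio, time_b_s):
    """Clock rate machine B needs to finish in time_b_s seconds.

    cycle_ratio is how many times more cycles B needs than A (1.2 here).
    """
    cycles_a = time_a_s * clock_rate_a_hz   # cycles used on machine A
    cycles_b = cycle_ratio * cycles_a       # cycles needed on machine B
    return cycles_b / time_b_s              # cycles per second


if __name__ == "__main__":
    rate = target_clock_rate_hz(10, 400e6, 1.2, 6)
    print(f"{rate / 1e6:.0f} MHz")          # prints "800 MHz"
```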
6. Problem
Suppose we have two implementations of the same instruction set architecture (ISA). For some program,
- machine A has a clock cycle time of 10 ns (nanoseconds) and a CPI (cycles per instruction, on average) of 2.0.
- machine B has a clock cycle time of 20 ns and a CPI of 1.2.
Which machine runs faster for the given program, and by how much?

Solution
machine A: ns per instruction $= 2.0 \times 10 = 20$.
machine B: ns per instruction $= 1.2 \times 20 = 24$.
=> machine A is faster, by $24 / 20 = 1.2$ times.
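Because both machines implement the same ISA, the instruction count is identical and cancels out, so comparing time per instruction is enough. A brief sketch with hypothetical helper names:

```python
def time_per_instruction_ns(cpi, cycle_time_ns):
    """Average time per instruction = CPI x cycle time."""
    return cpi * cycle_time_ns


def compare(name_1, t1, name_2, t2):
    """Return (name of the faster machine, ratio slower/faster)."""
    if t1 <= t2:
        return name_1, t2 / t1
    return name_2, t1 / t2


if __name__ == "__main__":
    t_a = time_per_instruction_ns(2.0, 10)   # machine A
    t_b = time_per_instruction_ns(1.2, 20)   # machine B
    faster, ratio = compare("A", t_a, "B", t_b)
    print(f"machine {faster} is faster by {ratio:.1f}x")
```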
7. Problem
There are three different classes of instructions: class A, B and C. They require one, two and three cycles respectively. There are two code sequences:
- first code contains 2 instructions of class A, 1 of B, and 2 of C.
- second code contains 4 instructions of class A, 1 of B, and 1 of C.
Which sequence will be faster? By how much? What is the CPI for each sequence?

Solution
first code: $2 \times 1 + 1 \times 2 + 2 \times 3 = 10$ cycles => CPI $= 10 / 5 = 2.0$
second code: $4 \times 1 + 1 \times 2 + 1 \times 3 = 9$ cycles => CPI $= 9 / 6 = 1.5$
=> second code is faster, by $10/9 \approx 1.11$ times.
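The cycle counts and CPI values follow mechanically from the instruction mix. In the sketch below, the dictionary layout and function names are illustrative assumptions. Note that the first sequence has 5 instructions and the second has 6, so each CPI divides by its own instruction count.

```python
# Cycles required by each instruction class, as given in the problem.
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(mix):
    """Total cycles for a sequence described as {class: instruction count}."""
    return sum(count * CYCLES_PER_CLASS[cls] for cls, count in mix.items())


def cpi(mix):
    """Average cycles per instruction for the sequence."""
    return total_cycles(mix) / sum(mix.values())


first = {"A": 2, "B": 1, "C": 2}    # 5 instructions
second = {"A": 4, "B": 1, "C": 1}   # 6 instructions

if __name__ == "__main__":
    for name, mix in (("first", first), ("second", second)):
        print(name, total_cycles(mix), "cycles, CPI", cpi(mix))
```

The second sequence executes more instructions yet needs fewer cycles, which is why instruction count alone is not a reliable performance measure.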
8. Amdahl's Law
- Execution Time After Improvement = Execution Time Unaffected + (Execution Time Affected / Amount of Improvement)
- Example
  - "Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?"
  - How about making it 5 times faster?
9. Solution
E.T. after improvement $= 20$ seconds $+ 80$ seconds $/ x$
=> $100 / 4 = 20 + 80 / x$ => $x = 16$
This means that multiplication should be executed 16 times faster!
Now, to make the run time 5 times faster:
$100 / 5 = 20 + 80 / x$ => $80 / x = 0$ => $x = \infty$ !!!
This means that the multiplication should take zero time! That's impossible.
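Solving Amdahl's Law for the required improvement factor makes the limit explicit. The sketch below is one way to arrange the algebra; the function name and the choice to return `math.inf` or `None` for unreachable targets are assumptions made for illustration.

```python
import math


def required_improvement(total_s, affected_s, target_speedup):
    """Solve total/target = unaffected + affected/x for x.

    Returns math.inf if the affected part would have to take zero time,
    and None if the target is out of reach even then.
    """
    unaffected = total_s - affected_s
    # Time that may still be spent in the affected part after improvement.
    budget = total_s / target_speedup - unaffected
    if budget < 0:
        return None
    if budget == 0:
        return math.inf
    return affected_s / budget


if __name__ == "__main__":
    print(required_improvement(100, 80, 4))   # 16.0
    print(required_improvement(100, 80, 5))   # inf
```

With 20 seconds unaffected, the best possible speedup is $100 / 20 = 5$, and it is reached only in the limit.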
10. Problem
Suppose we enhance a machine, making all floating-point instructions run 5 times faster. If the execution time of some benchmark before the floating-point enhancement is 10 seconds, what will the speedup be if half of the 10 seconds is spent executing floating-point instructions?

Solution
E.T. after improvement $= 5$ seconds $+ 5 / 5 = 6$ seconds
=> speedup $= 10 / 6 \approx 1.67$.
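The forward direction of Amdahl's Law, computing the overall speedup from the improved fraction, can be sketched as follows. The function names and the use of a fraction rather than absolute seconds are illustrative choices.

```python
def time_after_improvement(unaffected_s, affected_s, improvement):
    """Amdahl's Law: only the affected part shrinks."""
    return unaffected_s + affected_s / improvement


def overall_speedup(total_s, affected_fraction, improvement):
    """Speedup of the whole program when a fraction of it is improved."""
    affected = total_s * affected_fraction
    unaffected = total_s - affected
    return total_s / time_after_improvement(unaffected, affected, improvement)


if __name__ == "__main__":
    print(f"{overall_speedup(10, 0.5, 5):.2f}")   # prints "1.67"
```

Even an arbitrarily large floating-point improvement cannot push the speedup for this benchmark past 2, because half of the time is unaffected.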
11. Problem
There are two different classes of instructions: A and B. Machine A has a clock cycle time of 10 ns and a CPI of 2.0 for class A instructions and a CPI of 3.0 for class B instructions. Machine B has a clock cycle time of 20 ns and a CPI of 1.25 for both instruction classes. A given program is 50% class A instructions and 50% class B instructions. Which machine runs this program faster? (N = number of instructions)

Solution
Machine A: ns per class A instruction $= 2.0 \times 10 = 20$.
Machine A: ns per class B instruction $= 3.0 \times 10 = 30$.
Machine B: ns per instruction $= 1.25 \times 20 = 25$.
execution time on machine A $= N \times 0.5 \times 20 + N \times 0.5 \times 30 = N \times 25$.
execution time on machine B $= N \times 1 \times 25 = N \times 25$.
=> the machines have the same performance for the given program.
12. Summary
- performance is specific to a particular program or set of programs
  - Total execution time is a consistent summary of performance.
- for a given architecture, performance increases come from:
  - increases in clock rate (without adverse CPI effects)
  - improvements in processor organization that lower CPI
  - compiler enhancements that lower CPI and/or instruction count
- Pitfall
  - expecting an improvement in one aspect of a machine's performance to improve total performance by a proportional amount.