Title: Cpsc 318 Computer Structures Lecture 2 Performance
1Cpsc 318 Computer Structures Lecture 2
Performance
- Dr. Son Vuong
- (vuong_at_cs.ubc.ca)
- January 6, 2004
2Overview
- Performance
- How do we measure performance?
- Metrics
- Benchmarking
- Fallacies: how to avoid getting fooled by performance calculations and demos!
Readings: Chapter 4
3Performance
- Purchasing perspective: given a collection of machines (or upgrade options), which has the
  - best performance?
  - least cost?
  - best performance / cost?
- Computer designer perspective: faced with design options, which has the
  - best performance improvement?
  - least cost?
  - best performance / cost?
- All require a basis for comparison and a metric for evaluation
- Solid metrics lead to solid progress!
4Cost-performance
Smaller is better
- Divide the cost of the machine by its performance
- (Equivalent to) multiply cost by execution time
- Units of cost / performance: dollar-seconds
5Measurement and Evaluation
Architecture is an iterative process --
searching the space of possible designs --
at all levels of computer systems
[Figure: design cycle linking Creativity and Cost / Performance Analysis, with outcomes labelled Good Ideas, Mediocre Ideas, and Bad Ideas]
6Example
For $400, we can add 256 MBytes of memory and reduce the execution time from 12 down to 8.5 seconds. Does this improve cost-performance?
- Old cost-performance: $3000 x 12 s = 36000 dollar-seconds
- New cost-performance: $3400 x 8.5 s = 28900 dollar-seconds (smaller, so yes)
For an additional $300, we can upgrade to a wider disk interface and decrease execution time to 8 seconds. Does this improve cost-performance?
- Old cost-performance: $3400 x 8.5 s = 28900 dollar-seconds
- New cost-performance: $3700 x 8.0 s = 29600 dollar-seconds (larger, so no)
7Example
For $200, we can add 1 GByte of memory and reduce the execution time from 12 down to 8.5 seconds. Does this improve cost-performance?
- Old cost-performance: $2000 x 12 s = 24000 dollar-seconds
- New cost-performance: $2200 x 8.5 s = 18700 dollar-seconds (smaller, so yes)
For an additional $175, we can upgrade to a wider disk interface and decrease execution time to 8 seconds. Does this improve cost-performance?
- Old cost-performance: $2200 x 8.5 s = 18700 dollar-seconds
- New cost-performance: $2375 x 8.0 s = 19000 dollar-seconds (larger, so no)
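The upgrade decisions in the two examples above can be checked mechanically. The following is a minimal Python sketch; the function names `cost_performance` and `upgrade_helps` are made up for illustration and are not part of the lecture.

```python
def cost_performance(cost_dollars, exec_time_s):
    """Cost-performance in dollar-seconds; smaller is better."""
    return cost_dollars * exec_time_s


def upgrade_helps(old_cost, old_time_s, upgrade_price, new_time_s):
    """True if paying for the upgrade lowers cost x execution time."""
    old = cost_performance(old_cost, old_time_s)
    new = cost_performance(old_cost + upgrade_price, new_time_s)
    return new < old


# Figures taken from the first example above.
print(cost_performance(3000, 12), cost_performance(3400, 8.5))
print("memory upgrade helps:", upgrade_helps(3000, 12, 400, 8.5))
print("disk upgrade helps:", upgrade_helps(3400, 8.5, 300, 8.0))
```

In both examples the memory upgrade lowers the product and the disk-interface upgrade raises it slightly, which matches the numbers on the slides.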
8Metrics of performance
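[Figure: levels of a computer system from top to bottom (Application, Programming Language, Compiler, ISA, Datapath, Control, Function Units, Transistors, Wires, Pins), annotated with metrics: answers per month and useful operations per second near the top, cycles per second (clock rate) near the bottom]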
Each metric has a place and a purpose, and each
can be misused
9Two Notions of Performance
[Table: Plane, Top Speed, DC to Paris, Passengers, Throughput (pmph) for the Boeing 747 and the BAC/Sud Concorde]
- Which has higher performance?
- Time to deliver 1 passenger?
- Time to deliver 400 passengers?
- In a computer, time for 1 job is called Response Time or Execution Time
- In a computer, jobs per day is called Throughput or Bandwidth
10Definitions
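- Performance is in units of things per sec
  - bigger is better
- If we are primarily concerned with response time:
  - performance(X) = 1 / execution_time(X)
  - "X is n times faster than Y" means performance(X) / performance(Y) = execution_time(Y) / execution_time(X) = n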
11Example of Response Time v. Throughput
- Time of Concorde vs. Boeing 747?
  - Concorde is 6.5 hours / 3 hours = 2.2 times faster
- Throughput of Boeing vs. Concorde?
  - Boeing 747: 286,700 pmph / 178,200 pmph = 1.6 times faster
- Boeing is 1.6 times (60%) faster in terms of throughput
- Concorde is 2.2 times (120%) faster in terms of flying time (response time)
- We will focus primarily on execution time for a single job
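The two comparisons above use the same "n times faster" ratio, applied once to times and once to throughputs. A small Python sketch of both, with function names chosen here purely for illustration:

```python
def times_faster_by_time(time_x, time_y):
    """n such that X is n times faster than Y, given execution times.

    Performance is 1 / time, so the ratio of performances is time_y / time_x.
    """
    return time_y / time_x


def times_faster_by_throughput(throughput_x, throughput_y):
    """n such that X is n times faster than Y, given throughputs."""
    return throughput_x / throughput_y


# Figures from the slide: flying times in hours, throughput in pmph.
concorde_vs_747 = times_faster_by_time(3.0, 6.5)
boeing_vs_concorde = times_faster_by_throughput(286_700, 178_200)
print(round(concorde_vs_747, 1), round(boeing_vs_concorde, 1))
```

Which plane "wins" depends entirely on which of the two metrics is chosen, which is the point of the slide.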
12Comparisons
Definitions
- Speed-up = time_old / time_new
- Improvement = 100% x (time_old - time_new) / time_new
- Example
  - Consider the enhancement in a previous example that improved the execution time from 12 down to 8.5 seconds
  - Speed-up of 12 / 8.5 = 1.41
  - Improvement of 100% x (12 - 8.5) / 8.5 = 41%
Execute 41% more applications in the same time. If the old machine could execute 300 programs/hour, this one can execute 423 programs/hour.
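The two definitions above are closely related: the improvement is just the speed-up minus one, expressed as a percentage. A short Python sketch (function names are illustrative only):

```python
def speedup(time_old, time_new):
    """Speed-up = time_old / time_new."""
    return time_old / time_new


def improvement_percent(time_old, time_new):
    """Improvement = 100 x (time_old - time_new) / time_new, in percent."""
    return 100.0 * (time_old - time_new) / time_new


s = speedup(12, 8.5)
imp = improvement_percent(12, 8.5)
print(f"speed-up {s:.2f}, improvement {imp:.0f}%")

# Throughput scales by the speed-up. The slide rounds the speed-up to 1.41
# first, giving 300 x 1.41 = 423 programs/hour.
print(f"{300 * round(s, 2):.0f} programs/hour")
```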
13Confusing Wording on Performance
- We will (try to) stick to "n times faster"; it's less confusing than "m% faster"
- As "faster" means both increased performance and decreased execution time, to reduce confusion we will use "improve performance" or "improve execution time"
14What is Time?
- Straightforward definition of time
  - Total time to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, ... (User Time)
  - "real time", "response time" or "elapsed time"
- Alternative: just the time the processor (CPU) is working only on your program (since multiple processes are running at the same time)
  - "CPU execution time" or "CPU time"
  - Often divided into system CPU time (in OS) and user CPU time (in user program)
15Different aspects
Output of "time myprogram" on cascade: 90.7u 12.9s 2:39 65%
- User CPU time: 90.7 seconds
- System CPU time: 12.9 seconds
- Elapsed time is 2 minutes and 39 seconds
- Percentage of elapsed time that is CPU time is 65%
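The 65% figure above follows from the other three numbers. The sketch below recomputes it and, as an aside, shows one way to observe elapsed time and CPU time for a piece of Python code using the standard `time` module. The helper names `cpu_share_percent` and `measure` are hypothetical.

```python
import time


def cpu_share_percent(user_s, system_s, elapsed_s):
    """Percentage of elapsed time spent as CPU time (user + system)."""
    return 100.0 * (user_s + system_s) / elapsed_s


def measure(fn):
    """Run fn once; return (elapsed seconds, CPU seconds) for this process."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # user + system CPU time
    fn()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


elapsed_s = 2 * 60 + 39  # 2:39 from the output above
print(round(cpu_share_percent(90.7, 12.9, elapsed_s)), "% of elapsed time")

wall, cpu = measure(lambda: sum(range(100_000)))
print(f"elapsed {wall:.4f} s, CPU {cpu:.4f} s")
```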
16How to Measure Time?
- User Time: seconds
- CPU Time: computers are constructed using a clock that runs at a constant rate and determines when events take place in the hardware
  - These discrete time intervals are called clock cycles (or informally clocks or cycles)
  - Length of a clock period: clock cycle time (e.g., 2 nanoseconds or 2 ns); clock rate (e.g., 500 megahertz, or 500 MHz) is the inverse of the clock period; use these!
17Clock Cycle
[Figure: clock waveform showing one Clock Period, from Start of Cycle to End of Cycle]
1 MHz (megahertz) = 1 million cycles/sec, or 1 cycle/microsecond
18Measuring Time using Clock Cycles (1/2)
CPU execution time for a program = Clock Cycles for a program x Clock Cycle Time
- or
- CPU execution time for a program = Clock Cycles for a program / Clock Rate
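The two forms above are the same formula, because clock rate is the inverse of clock cycle time. A brief Python sketch; the cycle count of one billion is an arbitrary illustrative value, not a figure from the lecture.

```python
import math


def clock_rate_hz(cycle_time_s):
    """Clock rate is the inverse of the clock cycle time."""
    return 1.0 / cycle_time_s


def cpu_time_from_cycle_time(clock_cycles, cycle_time_s):
    return clock_cycles * cycle_time_s


def cpu_time_from_clock_rate(clock_cycles, rate_hz):
    return clock_cycles / rate_hz


cycle_time = 2e-9                  # 2 ns
rate = clock_rate_hz(cycle_time)   # about 500 MHz
cycles = 1_000_000_000             # hypothetical program: one billion cycles

t1 = cpu_time_from_cycle_time(cycles, cycle_time)
t2 = cpu_time_from_clock_rate(cycles, rate)
print(f"{rate / 1e6:.0f} MHz, {t1:.1f} s, {t2:.1f} s")
assert math.isclose(t1, t2)
```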
19Measuring Time using Clock Cycles (2/2)
- One way to define clock cycles:
  - Clock Cycles for a program = Instructions for a program (called "Instruction Count") x Average Clock Cycles Per Instruction (abbreviated "CPI")
- CPI is one way to compare two machines with the same instruction set, since Instruction Count would be the same
20Performance Calculation
- CPU execution time for a program = Clock Cycles for a program x Clock Cycle Time
- Substituting for clock cycles:
  - CPU execution time for a program = (Instruction Count x CPI) x Clock Cycle Time = Instruction Count x CPI x Clock Cycle Time
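The three-factor formula above makes it easy to compare two implementations of the same ISA, since the instruction count cancels. The sketch below uses made-up CPI and cycle-time values for two hypothetical machines A and B; they are not figures from the lecture.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU execution time = Instruction Count x CPI x Clock Cycle Time."""
    return instruction_count * cpi * cycle_time_s


instructions = 1_000_000  # same program, same ISA, so same count on both

# Hypothetical machines (illustrative values only).
time_a = cpu_time(instructions, cpi=2.0, cycle_time_s=1e-9)
time_b = cpu_time(instructions, cpi=1.2, cycle_time_s=2e-9)

# "A is n times faster than B" means n = time_b / time_a.
print(f"A: {time_a * 1e3:.1f} ms, B: {time_b * 1e3:.1f} ms, "
      f"A is {time_b / time_a:.1f} times faster")
```

Note that machine B has the lower CPI but still loses, because its clock cycle is twice as long: no single factor decides the outcome.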
21How to Calculate the 3 Components?
- Clock Cycle Time: in the specification of the computer (Clock Rate in advertisements)
- Instruction Count:
  - Count instructions in loop of small program
  - Use simulator to count instructions
  - Hardware counter in special register (Pentium II)
22Example
Two implementations of the same ISA
Which is faster?
Assuming I instructions
23Another way of calculating CPI
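CPU time = Instruction Count x CPI x Clock Cycle Time
CPI = sum over instruction classes i of (CPI_i x F_i), where F_i = I_i / Instruction Count is the "instruction frequency" of class i
Invest resources where time is spent!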
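One common way to compute CPI from instruction frequencies is a weighted sum over instruction classes. The sketch below assumes that reading of the slide; the instruction mix and per-class cycle counts are hypothetical values chosen for illustration. It also reports where the time goes, in the spirit of "invest resources where time is spent".

```python
def average_cpi(mix):
    """mix maps class name -> (frequency, cycles per instruction).

    Frequencies are fractions of the dynamic instruction count and must sum to 1.
    """
    total = sum(freq for freq, _ in mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 1")
    return sum(freq * cpi for freq, cpi in mix.values())


def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    overall = average_cpi(mix)
    return {name: freq * cpi / overall for name, (freq, cpi) in mix.items()}


# Hypothetical instruction mix (not from the lecture).
mix = {
    "alu":    (0.5, 1),
    "load":   (0.2, 5),
    "store":  (0.1, 3),
    "branch": (0.2, 2),
}

print(f"CPI = {average_cpi(mix):.1f}")
for name, share in time_share(mix).items():
    print(f"{name:>6}: {share:.0%} of time")
```

With this mix, loads are only a fifth of the instructions but account for the largest share of time, so speeding up loads would pay off most.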
24Example
Choice of code sequence to generate for some high-level language construct.
Which is the best choice?
Best
25What Programs Measure for Comparison?
- Ideally, run typical programs with typical input before purchase, or even before building the machine
  - Called a "workload". For example:
  - Engineer uses compiler, spreadsheet
  - Author uses word processor, drawing program, compression software
- In some situations it's hard to do
  - Don't have access to the machine to "benchmark" before purchase
  - Don't know the workload in the future
26Benchmarks
- Obviously, apparent speed of processor depends on code used to test it
- Need industry standards so that different processors can be fairly compared
- Companies exist that create these benchmarks: "typical" code used to evaluate systems
- Need to be changed every 2 or 3 years since designers could (and do!) target these standard benchmarks
27Example Standardized Benchmarks 1/2
- Workstations: Standard Performance Evaluation Corporation (SPEC)
- SPEC95
  - 8 integer (gcc, compress, li, ijpeg, perl, ...)
  - 10 floating-point programs (hydro2d, mgrid, applu, turb3d, ...)
- www.spec.org
- Separate average for integer (CINT95) and FP (CFP95) relative to base machine SPARC10/40
- Benchmarks distributed in source code
- Company representatives select workload
- Compiler, machine designers target benchmarks, so try to change every 3 years
28Example Standardized Benchmarks 2/2
- SPEC CPU2000
  - 12 integer (gzip, gcc, crafty, perl, bzip, ...)
  - 14 floating-point (swim, mesa, art, apsi, ...)
- Separate average for integer (CINT2000) and FP (CFP2000) relative to base machine Sun 300MHz 256MB-RAM Ultra5_10, which gets score of 100
- www.spec.org/osg/cpu2000/
- They measure
  - System speed (SPECint2000)
  - System throughput (SPECint_rate2000)
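To make "relative to a base machine, which gets a score of 100" concrete, here is a rough Python sketch of a SPEC-style score. It assumes each program's ratio is the base machine's time divided by the measured time, scaled by 100, and that the ratios are averaged with a geometric mean; both the function name and the timings are invented for illustration and are not SPEC data.

```python
import math


def spec_style_score(base_times_s, measured_times_s):
    """Geometric mean of 100 x (base time / measured time) over all programs.

    Assumption for this sketch: geometric mean as the averaging rule.
    """
    if len(base_times_s) != len(measured_times_s) or not base_times_s:
        raise ValueError("need one measured time per base time")
    ratios = [100.0 * b / m for b, m in zip(base_times_s, measured_times_s)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical timings in seconds for three benchmark programs.
base = [1400.0, 1800.0, 1100.0]
machine = [700.0, 600.0, 1100.0]

print(f"base machine score: {spec_style_score(base, base):.0f}")
print(f"test machine score: {spec_style_score(base, machine):.0f}")
```

By construction the base machine scores 100, and a machine that is uniformly twice as fast would score 200.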
29SPECint95base Performance (Oct. 1997)
[Chart: SPECint95base results for Compaq/DEC Alpha, HP PA, and Intel Pentium Pro]
30SPECfp95base Performance (Oct. 1997)
[Chart: SPECfp95base results for Compaq/DEC Alpha, HP PA, and Intel Pentium Pro]
31Example PC Workload Benchmark
- PCs: Ziff-Davis Benchmark Suite
- Business Winstone is a system-level, application-based benchmark that measures a PC's overall performance when running today's top-selling Windows-based 32-bit applications. It doesn't mimic what these packages do; it runs real applications through a series of scripted activities and uses the time a PC takes to complete those activities to produce its performance scores.
- Also tests for CDs, content creation, audio, 3D graphics, battery life
- http://www.etestinglabs.com/benchmarks/
32Performance Evaluation
- Good products are created when we have:
  - Good benchmarks
  - Good ways to summarize performance
- Given that sales is a function of performance relative to the competition, should we invest in improving the product as reported by the performance summary?
- If benchmarks/summary are inadequate, then choose between improving the product for real programs vs. improving the product to get more sales. Sales almost always wins!
33Performance Evaluation The Demo
- If we're talking about performance, let's discuss the ways shady salespeople have fooled consumers (so that you don't get taken!)
- 5. Never let the user touch it
- 4. Only run the demo through a script
- 3. Run it on a stock machine in which "no expense was spared"
- 2. Preprocess all available data
- 1. Play a movie
34Megahertz Myth Marketing Movie
35And in conclusion 1/2
- Latency v. Throughput
- Performance doesn't depend on any single factor: need to know Instruction Count, Clocks Per Instruction and Clock Rate to get valid estimations
- User Time: time the user needs to wait for the program to execute; depends heavily on how the OS switches between tasks
- CPU Time: time spent executing a single program; depends solely on design of processor (datapath, pipelining effectiveness, caches, etc.)
36And in conclusion 2/2
- Benchmarks
  - Attempt to predict performance
  - Updated every few years
  - Measure everything from simulation of desktop graphics programs to battery life
- Megahertz Myth
  - MHz ≠ performance; it's just one factor