Title: Evaluating Performance
1Evaluating Performance
2Administrivia
- I've graded through lab 4
- will email your grades when I finish lab 5
- homework due tomorrow
- I'll post the HW answers once they're turned in
- Lab on Tuesday: Datapath
- ALU and other Logisim examples from class are on the webpage
- Appendix B is on the web page
3Some questions
- How long does it take Google to look up "Tostitos"?
- How long does it take to factor a large number into primes?
- How long to sort an array?
4Hard to isolate things
- For all of these questions, we're measuring other things
- network
- memory
- hard disk
- compiler
- algorithms
5We're measuring time
- suppose a task takes 30 seconds
- 29 seconds to download a file on a 100 Mbps network, and 1 second to process it
- now we go to a 1 Gbps network connection
- about a 4 second task
- 2.9 seconds to download, 1 second to process
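The arithmetic on this slide can be checked with a short sketch. The file size is not given on the slide; it is inferred here from "29 seconds at 100 Mbps", and the function and variable names are made up for illustration.

```python
def download_seconds(file_megabits: float, link_mbps: float) -> float:
    """Idealized transfer time: size divided by link speed (no latency, no overhead)."""
    return file_megabits / link_mbps


# Assumption: 29 s on a 100 Mbps link implies a file of about 2900 megabits.
file_megabits = 29 * 100
process_seconds = 1.0  # processing time does not change with the network

slow_total = download_seconds(file_megabits, 100) + process_seconds
fast_total = download_seconds(file_megabits, 1000) + process_seconds
speedup = slow_total / fast_total

print(f"100 Mbps: {slow_total:.1f} s, 1 Gbps: {fast_total:.1f} s, speedup: {speedup:.2f}x")
```

The network got 10 times faster, but the whole task only gets a bit under 8 times faster, because the 1 second of processing is untouched.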
6Wall-clock time
- Simple way to measure
- Lots of other variables affect wall-clock time
- speed of the disk
- network
- how many users are on the machine
- many Linux workstations are multi-user
7Multi-user systems are shared
- Examples
- mail server
- blackboard / Marmoset
- Each job gets a timeslice of about 10 ms
- This is not a precise count
- Processing resources are shared, but not
perfectly evenly
8Measure CPU time
- Time your job actually spends using the CPU
- User time
- Time the OS spends doing things that aren't your job, or your job spends waiting for events, like data from disk
- System time
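One way to see the difference between wall-clock time and CPU time is to time a job that both computes and waits. This is only a sketch: `busy_work` and the 0.2 second sleep are invented stand-ins for a real job, and the user/system split is whatever the operating system reports through Python's `os.times()`.

```python
import os
import time


def busy_work(n: int) -> int:
    """Hypothetical CPU-bound job: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


wall_start = time.perf_counter()  # wall-clock reference point
cpu_start = os.times()            # user/system CPU time consumed so far

busy_work(200_000)
time.sleep(0.2)  # waiting: adds to wall-clock time but uses almost no CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_end = os.times()
user_elapsed = cpu_end.user - cpu_start.user
system_elapsed = cpu_end.system - cpu_start.system

print(f"wall-clock: {wall_elapsed:.3f} s")
print(f"user CPU:   {user_elapsed:.3f} s")
print(f"system CPU: {system_elapsed:.3f} s")
```

On a typical machine the wall-clock figure should come out noticeably larger than user plus system time, since the sleep is not charged to the CPU. The Unix `time` command reports the same three numbers for a whole program.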
9Clocks
- We've already seen that clocks regulate how things happen in a CPU
- ticks, clocks, cycles, clock periods, clock ticks
- 2 GHz means 2 × 10^9 cycles per second
- Many instructions take 1 cycle to complete
- Some instructions take multiple cycles to complete
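Clock rate and clock period are reciprocals, which a couple of lines can confirm. The function name is made up for illustration; the 2 GHz figure is the one from the slide.

```python
def clock_period_seconds(clock_rate_hz: float) -> float:
    """Length of one clock cycle, given the clock rate in cycles per second."""
    return 1.0 / clock_rate_hz


rate_hz = 2e9  # 2 GHz
period_ns = clock_period_seconds(rate_hz) * 1e9  # convert seconds to nanoseconds

print(f"{rate_hz:.0e} Hz -> {period_ns:.2f} ns per cycle")
```

So an instruction that takes 1 cycle on a 2 GHz machine occupies half a nanosecond, and a 3-cycle instruction occupies 1.5 ns.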
10Tradeoffs
- Increasing the clock speed often means that some instructions that barely fit in a clock cycle will now require multiple cycles to complete
- Sometimes this is good
11Measuring clock time
- CPU time = (CPU clock cycles) / (clock rate)
- 4 GHz computer takes 10 seconds to perform a task
- We want to drop this down to 6 seconds on a new computer we're designing
- Clock rate can be sped up, but the task will require 1.2 times as many clock cycles
12
- 10 seconds = (Cycles) / (4 GHz)
- 10 = X / (4 × 10^9)
- X = 40 × 10^9 cycles
- 40 × 10^9 × 1.2 = 48 × 10^9 cycles
- 6 seconds = 48 × 10^9 / X
- X = 8 GHz
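The same calculation can be written out as a sketch in code. The variable names are invented here; the numbers are the ones from the slides, with the 1.2 factor applied to the cycle count as in the arithmetic above.

```python
def cpu_time_seconds(cycles: float, clock_rate_hz: float) -> float:
    """CPU time = clock cycles / clock rate."""
    return cycles / clock_rate_hz


old_rate_hz = 4e9   # 4 GHz machine
old_time_s = 10.0   # takes 10 seconds on the old machine

# Rearranging CPU time = cycles / rate gives cycles = time * rate.
old_cycles = old_time_s * old_rate_hz  # 40 x 10^9 cycles

# The new design needs 1.2 times as many cycles for the same task.
new_cycles = 1.2 * old_cycles          # 48 x 10^9 cycles

target_time_s = 6.0
new_rate_hz = new_cycles / target_time_s  # clock rate needed to hit 6 seconds

print(f"old cycles: {old_cycles:.3g}")
print(f"new cycles: {new_cycles:.3g}")
print(f"required clock rate: {new_rate_hz / 1e9:.1f} GHz")
```

Plugging the result back into `cpu_time_seconds(new_cycles, new_rate_hz)` gives the 6 second target, which is a quick sanity check on the algebra.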
13Clock Cycles Per Instruction
- abbreviated CPI
- average number of cycles required for each instruction
- Estimate for a particular workload
- differs for each architecture
- may differ for streams of instructions for
different programs on the same architecture
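Since CPI is an average over the instructions a program actually executes, it can be sketched as a weighted average. The instruction classes, counts, and per-class cycle costs below are invented for illustration and do not describe any real architecture.

```python
# Hypothetical instruction mixes: class -> (instructions executed, cycles per instruction)
program_a_mix = {
    "alu": (50_000, 1),
    "load_store": (30_000, 2),
    "branch": (20_000, 3),
}
program_b_mix = {
    "alu": (80_000, 1),
    "load_store": (15_000, 2),
    "branch": (5_000, 3),
}


def average_cpi(mix: dict) -> float:
    """Total cycles divided by total instructions for one workload."""
    total_instructions = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    return total_cycles / total_instructions


cpi_a = average_cpi(program_a_mix)
cpi_b = average_cpi(program_b_mix)
print(f"program A CPI: {cpi_a:.2f}")
print(f"program B CPI: {cpi_b:.2f}")
```

Both programs run on the same made-up machine with the same per-class costs, yet they end up with different CPIs because their instruction mixes differ.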
14What components affect performance?
- Algorithm
- Instruction count, CPI
- Programming language
- Instruction count, CPI
- Compiler
- Instruction count, CPI
- Instruction set architecture
- Instruction count, CPI, clock rate
15We can't just measure instruction counts
- Some instructions take multiple cycles
- It may be more efficient to execute more
instructions if those instructions each take
fewer cycles
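A small made-up example shows how the longer instruction sequence can win. The three instruction classes and their cycle costs are assumptions chosen for illustration, as are the two candidate code sequences a compiler might pick between.

```python
# Hypothetical cycles per instruction for three instruction classes.
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}

# Two hypothetical code sequences for the same task: class -> instruction count.
sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}


def instruction_count(sequence: dict) -> int:
    """Total number of instructions executed."""
    return sum(sequence.values())


def cycle_count(sequence: dict) -> int:
    """Total clock cycles: each instruction costs its class's cycle count."""
    return sum(count * CYCLES_PER_CLASS[cls] for cls, count in sequence.items())


for name, seq in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    n = instruction_count(seq)
    c = cycle_count(seq)
    print(f"{name}: {n} instructions, {c} cycles, CPI {c / n:.2f}")
```

Sequence 2 executes more instructions but finishes in fewer cycles, so at the same clock rate it is the faster choice.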
16performance means different things in different
contexts
- The performance metric that matters for a server is throughput
- we don't care if a couple of clients are slow, so long as on average everyone is fast enough
- Performance of an operating system should incorporate response time
- Even if Windows hangs for 5 minutes, I better be able to move the mouse!
17performance in context, cont.
- The performance metric that matters for an air-traffic control system is the worst case
- doesn't matter if on average everything is great, we can't have anything run slowly
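To make the contrast concrete, here is a sketch that summarizes one set of response times in different ways. The measurements are invented; the point is only that a single slow request barely shows up in the typical case but dominates the worst case.

```python
import statistics

# Hypothetical response times in milliseconds; one request was very slow.
response_times_ms = [12, 15, 11, 14, 13, 950, 12, 16]

mean_ms = statistics.mean(response_times_ms)      # what an "on average" metric sees
median_ms = statistics.median(response_times_ms)  # the typical request
worst_ms = max(response_times_ms)                 # what a worst-case metric sees

print(f"mean:   {mean_ms:.1f} ms")
print(f"median: {median_ms:.1f} ms")
print(f"worst:  {worst_ms} ms")
```

A server operator might be satisfied with the typical request here, while an air-traffic control system would be judged by the 950 ms outlier.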
18Which computer is faster?
19Which computer is faster?
20Many ways to measure performance
- instruction counts
- CPI
- wall-clock time
21Throughput vs Response Time
- Faster CPUs vs more CPUs
- A faster CPU usually decreases your response time
- you can handle more instructions per unit of time
- great for video games
- Adding more CPUs increases throughput
- Can perform multiple tasks at once
- great for servers
- like the late Marmoset
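A toy model makes the distinction visible. It assumes every task needs the same amount of work, tasks are fully independent, and adding CPUs costs nothing in coordination; real systems are messier. All names and numbers are illustrative.

```python
def response_time_s(work_per_task: float, cpu_speed: float) -> float:
    """Time for one task on one CPU (work in instructions, speed in instructions/second)."""
    return work_per_task / cpu_speed


def throughput_tasks_per_s(work_per_task: float, cpu_speed: float, num_cpus: int) -> float:
    """Tasks finished per second when each CPU works on its own task."""
    return num_cpus * cpu_speed / work_per_task


WORK = 1e9  # hypothetical: each task is 10^9 instructions

configs = {
    "baseline (1 CPU)": (1e9, 1),
    "faster CPU (1 CPU, 2x speed)": (2e9, 1),
    "more CPUs (4 CPUs, same speed)": (1e9, 4),
}

results = {}
for name, (speed, cpus) in configs.items():
    results[name] = (
        response_time_s(WORK, speed),
        throughput_tasks_per_s(WORK, speed, cpus),
    )
    print(f"{name}: response {results[name][0]:.2f} s, throughput {results[name][1]:.1f} tasks/s")
```

In this model the faster CPU halves the response time, while the four slower CPUs leave each task's response time unchanged but finish four tasks per second.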
22Beware of Benchmarks
- small code segments that are easy to run and report results for
- Advantages and disadvantages?
23Benchmarks aren't real programs
- Useful when designing an architecture where there's no existing compiler
- easy to code up
- easy to debug
- Can give extremely misleading performance results
24Should measure performance for real applications
- Harder to get misleading results
- Harder to tweak your compiler/architecture/whatever to get artificially good results
- I.e. harder to cheat!
25How would we measure the performance of
- Queries to Google?
- Factoring large numbers into primes?
- Sorting an array?
- Accounting software?
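For the sorting question, one possible starting point is to time the sort itself and nothing else, and to repeat the measurement. This is a sketch, not a complete methodology: the array size, the repeat count, and the choice to report the best run are all assumptions, and `time_sort` is a made-up helper.

```python
import random
import time


def time_sort(n: int, repeats: int = 5) -> tuple:
    """Return (best wall-clock seconds, best CPU seconds) for sorting n random floats."""
    best_wall = float("inf")
    best_cpu = float("inf")
    for _ in range(repeats):
        # Build the input before starting the clocks so only the sort is timed.
        data = [random.random() for _ in range(n)]

        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        sorted(data)
        cpu_elapsed = time.process_time() - cpu_start
        wall_elapsed = time.perf_counter() - wall_start

        best_wall = min(best_wall, wall_elapsed)
        best_cpu = min(best_cpu, cpu_elapsed)
    return best_wall, best_cpu


wall_s, cpu_s = time_sort(100_000)
print(f"sorting 100,000 floats: wall {wall_s * 1e3:.2f} ms, CPU {cpu_s * 1e3:.2f} ms")
```

Taking the best of several runs reduces noise from other users and processes on a shared machine, but it also hides worst-case behavior, so whether it is the right summary depends on which kind of performance matters.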