1
CSC407 Software Architecture
Summer 2006: Performance
  • Greg Wilson
  • BA 3230
  • gvwilson@cs.utoronto.ca

2
Introduction
  • Getting the right answer is important
  • Getting the right answer quickly is also
    important
  • If we didn't care about speed, we'd do things by
    hand
  • Choosing the right algorithm is part of the
    battle
  • Choosing a good architecture is the other part
  • Only way to tell good from bad is to analyze and
    measure actual performance

3
Example File Server
  • Dedicated server handing out PDF and ZIP files
  • One CPU
  • 4 disks: PDFs on 1 and 2, ZIPs on 3 and 4
  • Have to know the question to get the right answer
  • How heavy a load can it handle?
  • Would it make more sense to spread all files
    across all disks?

4
We Call It Computer Science
  • because it's experimental
  • Collect info on 1000 files downloaded in 200 sec

File Type    Size (KB)    Time (sec)
PDF          303          1.43
ZIP          1233         5.81
ZIP          1077         5.08
PDF          315          1.48
ZIP          1240         5.84
PDF          413          1.95
5
Summary Statistics
  • Analyze all 1000 downloads in a spreadsheet
  • Yes, computer scientists use spreadsheets

Statistic             PDF     ZIP
Number of Files       411     589
Average Size (KB)     377.6   1155.6
Standard Deviation    43.1    85.7
  • We're justified in treating each type of file as
    a single class

6
Modeling Requests
  • The concurrency level is the number of things of
    a particular class going on at once
  • Estimate by adding up total download time for PDF
    and ZIP files separately, and dividing by the
    actual elapsed time
  • N_PDF = 731.5 / 200 ≈ 3.7
  • N_ZIP = 3207.7 / 200 ≈ 16.1
  • Round off: the ZIP-to-PDF download ratio is 4:1
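As a sketch, the concurrency estimate above can be checked in a few lines of Python (the totals and elapsed time are the slide's; the function name is mine):

```python
# Concurrency level per class: total download time for the class,
# divided by the elapsed observation period (both in seconds).
def concurrency(total_download_time, elapsed):
    return total_download_time / elapsed

n_pdf = concurrency(731.5, 200)    # ≈ 3.7 concurrent PDF downloads
n_zip = concurrency(3207.7, 200)   # ≈ 16.0 concurrent ZIP downloads
print(n_zip / n_pdf)               # ≈ 4.4, rounded off to 4:1
```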

7
Measuring Service Demands
  • What load does each request put on the disk and
    CPU?
  • Create N files of various sizes: 10KB, 100KB,
    200KB, …, 1GB
  • Put them on a single-CPU, single-disk machine
    that's doing nothing else
  • Measure download times
  • T_CPU = 0.1046s ± 0.0604
  • Hm…
  • T_disk = 0.4078s ± 0.2919

8
Back To The Data
  • Use Mean Value Analysis to calculate service
    demands
  • Remember to divide disk requirements by 2

Resource   PDF (msec)   ZIP (msec)
CPU        39.4         120.8
Disk 1     77.1         0.0
Disk 2     77.1         0.0
Disk 3     0.0          235.8
Disk 4     0.0          235.8
9
Thinking With Pictures
10
Thinking With Pictures
11
Observations
  • After 20 users, the server saturates
  • Maximum throughput for PDF files:
  • 12 files/sec in original configuration
  • 5 files/sec in balanced configuration
  • Maximum throughput for ZIP files:
  • 4.2 files/sec in original configuration
  • 6.6 files/sec in balanced configuration

12
Service Level Agreements
  • SLA requires average download times of 20 sec
    (ZIP files) and 7 sec (PDF files)
  • Original configuration: ZIP threshold reached at
    approximately 100 users, when PDF download time is
    still only 3 sec
  • Balanced configuration: ZIP threshold reached at
    165 users, and PDF download time is 6.5 sec
  • Balanced configuration is strictly superior

13
How Did We Do That?
  • Key concern is quality of service (QoS)
  • Throughput: transactions/second, pages/second,
    etc.
  • Response time
  • And variation in response time
  • People would rather wait 10 minutes every day
    than 1 minute on 9 days and 20 minutes on the
    tenth
  • Availability
  • 99.99% available = 4.5 minutes lost every 30 days
  • That's not good enough for 911
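The availability arithmetic is easy to verify; a minimal sketch (the function name is mine):

```python
# Downtime implied by an availability level, over a 30-day period.
def downtime_minutes(availability, days=30):
    return days * 24 * 60 * (1 - availability)

print(downtime_minutes(0.9999))  # ≈ 4.3 minutes lost per 30 days
```

That lands close to the roughly 4.5 minutes quoted on the slide.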

14
A Simple Database Server
[Diagram: CPU and disk resources, each fed by a request queue]
  • Circles show resources
  • Boxes show queues
  • Throughput and response times depend on:
  • Service demand: how much time do requests need
    from resources?
  • System load: how many requests are arriving per
    second?

15
Classes of Model
  • An open class is specified by the rate at which
    requests arrive
  • Throughput is an input parameter
  • A closed class is specified by the size of the
    customer population
  • E.g., total number of queries to be processed, or
    total number of system users
  • Throughput is an output
  • Can also have load-dependent and load-independent
    resources, mixed models, etc.

16
Values We Can Measure
  • T = length of observation period
  • K = number of resources in the system
  • B_i = total busy time of resource i in the
    observation period
  • A_i = number of request arrivals for resource i
  • A_0 = total number of request arrivals for the
    whole system
  • C_i = number of service completions for resource i
  • C_0 = completions for the whole system
  • In steady state, for large T, A_i ≈ C_i

17
Values We Can Calculate
  • S_i = mean service time at resource i = B_i / C_i
  • U_i = utilization of resource i = B_i / T
  • X_i = throughput of resource i = C_i / T
  • In steady state, X_i = A_i / T = C_i / T = λ_i
  • V_i = average visit count for resource i = C_i / C_0

18
Utilization Law
  • Utilization U_i = B_i / T = (B_i / C_i) / (T / C_i)
  • But B_i / C_i is S_i, and T / C_i is just 1/λ_i
  • So U_i = λ_i · S_i
  • I.e., utilization is the throughput times the
    service time, which makes sense

19
Service Demand Law
  • Service demand D_i is the total average time
    required per request from resource i
  • D_i = U_i · T / C_0
  • I.e., fraction of time busy, times total time,
    over number of requests
  • But U_i · T / C_0 = U_i / (C_0 / T) = U_i / X_0
  • I.e., service demand is utilization over
    throughput
  • U_i / X_0 = (B_i / T) / (C_0 / T) = B_i / C_0 = V_i · S_i
  • So service demand is average number of visits
    times mean service time per visit
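The operational laws above can be sanity-checked numerically; the measurements below are hypothetical, chosen only to exercise the identities:

```python
# Hypothetical measurements: observation period T, busy time B_i and
# completions C_i at resource i, and system completions C_0.
T, B_i, C_i, C_0 = 200.0, 160.0, 400, 1000

S_i = B_i / C_i        # mean service time per visit
U_i = B_i / T          # utilization of resource i
X_i = C_i / T          # throughput of resource i
X_0 = C_0 / T          # system throughput
V_i = C_i / C_0        # average visits per request
D_i = U_i * T / C_0    # service demand

assert abs(U_i - X_i * S_i) < 1e-9   # Utilization Law
assert abs(D_i - V_i * S_i) < 1e-9   # Service Demand Law
assert abs(D_i - U_i / X_0) < 1e-9   # demand = utilization / throughput
```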

20
Little's Law
  • Average number of requests being processed at any
    time = throughput × average time each request
    stays in the system
  • So:
  • 0.5 requests per second (= throughput)
  • 10 second response time (= time each request
    stays in system)
  • There must be 5 servers
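The arithmetic in this example is just Little's Law applied directly; a one-line sketch:

```python
# Little's Law: N = X * R (requests in system = throughput * residence time).
def little_n(throughput, residence_time):
    return throughput * residence_time

print(little_n(0.5, 10))  # 5.0 requests in flight at any moment
```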

21
Interactive Response Time Law
  • S clients accessing a database
  • Each client thinks for Z seconds between requests
  • Average database response time is R seconds
  • If M is the average number of clients thinking,
    and N is the average number of requests at the
    database, then S = M + N
  • Little's Law applied to clients: M = λZ
  • Little's Law applied to the database: N = λR
  • So M + N = S = λ(Z + R)
  • Or R = S/λ - Z
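A minimal sketch of the law's final form (the numbers plugged in are hypothetical):

```python
# Interactive Response Time Law: R = S/X - Z, with S clients,
# system throughput X, and think time Z.
def response_time(clients, throughput, think_time):
    return clients / throughput - think_time

# Hypothetical: 100 clients, 8 requests/sec, 10 s think time
print(response_time(100, 8.0, 10.0))  # 2.5 seconds
```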

22
The Weakest Link
  • X_0 = U_i / D_i ≤ 1/D_i for all resources
  • So X_0 ≤ 1 / max{D_i}
  • Remember Little's Law: N = R · X_0
  • I.e., number of concurrent transactions is
    response time × throughput
  • But R is at least the sum of the service demand
    times
  • So N ≥ (ΣD_i) · X_0
  • Or X_0 ≤ N / (ΣD_i)
  • So X_0 ≤ min{1 / max{D_i}, N / (ΣD_i)}
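These bounds can be applied to the file server's PDF class, using the service demands measured earlier (converted to seconds); the function name is mine:

```python
# Service demands for one PDF request: CPU, then disks 1-4 (seconds).
demands = [0.0394, 0.0771, 0.0771, 0.0, 0.0]

def throughput_bound(demands, n_users):
    positive = [d for d in demands if d > 0]
    # X_0 <= min(1/max(D_i), N/sum(D_i))
    return min(1.0 / max(positive), n_users / sum(positive))

print(throughput_bound(demands, 1))    # light load: bounded by N / sum(D_i)
print(throughput_bound(demands, 50))   # heavy load: bounded by 1 / max(D_i)
```

The heavy-load bound comes out just under 13 files/sec, in line with the 12 files/sec PDF ceiling observed for the original configuration.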

23
Amdahl's Law
  • Let:
  • t_1 be a program's runtime on one CPU
  • t_p be its runtime on p CPUs
  • β be the algorithm's serial fraction

t_p = β·t_1 + (1 - β)·t_1 / p
s_p = t_1 / t_p = 1 / (β + (1 - β)/p)
s_∞ = 1/β
24
Amdahl's Law
  • Example
  • Want ≥32× speedup on a 64-processor machine
  • So 1 - β must be at least 0.984
  • I.e., 98% of the code must run in parallel
  • Ouch
  • What if only half the code can run in parallel?
  • s_64 is only 1.97
  • Ouch again
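Both numbers on this slide follow directly from the speedup formula; a quick sketch:

```python
# Amdahl's Law: s_p = 1 / (beta + (1 - beta) / p), beta = serial fraction.
def speedup(beta, p):
    return 1.0 / (beta + (1.0 - beta) / p)

print(speedup(0.5, 64))  # ≈ 1.97: half-serial code barely doubles

# Solving s_p = 32 at p = 64 for beta gives the required serial fraction:
beta = (64 / 32 - 1) / (64 - 1)
print(1 - beta)          # ≈ 0.984: fraction that must run in parallel
```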

25
Hockney's Measures
  • Every pipeline has some startup latency
  • So characterize pipelines with two measures:
  • r_∞ is the rate on an infinite data stream
  • n_1/2 is the data volume at which half that rate
    is achieved
  • Improve real-world performance by:
  • Increasing throughput
  • Decreasing latency

[Plot: pipeline rate r versus data volume n]
26
Some Quotations
  • Philosophers have only interpreted the world in
    various ways; the point, however, is to change
    it.
  • Karl Marx
  • You cannot manage what you do not measure.
  • Bill Hewlett
  • Measure twice, tune once.
  • Greg Wilson

27
A Simple CGI
[Diagram: request flow browser → /var/apache/httpd →
/local/bin/python → /site/cgi-bin/app.cgi → /usr/bin/psql, plus disk
I/O, annotated with measured times in seconds: 5.1, 5.3, 3.3, 2.7,
1.8, 0.2, 0.7, 0.3]
28
How Did I Get These Numbers?
  • Shut down everything else on the test machine
  • Use ps and truss on Unix
  • sysinternals.org has lots of tools to help you
    find things
  • Use a script instead of a browser
  • Insert timers in Python and recompile
  • Could wrap in a timing script, but that distorts
    things
  • Measure import times in my own script
  • Rely on PostgreSQL's built-in monitors
  • Use a profiler

29
Profiling
  • A profiler is a tool that can build a histogram
    showing how much time a program spent where
  • Can either instrument or sample the program
  • Both affect the program's performance
  • The more information you collect, the more
    distortion there is
  • Heisenberg's Law
  • Most can accumulate data over many program runs
  • Often want to distinguish the first run(s) from
    later ones
  • Caching, precompilation, etc.
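As one concrete example of an instrumenting profiler, Python's built-in cProfile produces exactly this kind of time-spent-where listing (the workload function here is a stand-in):

```python
import cProfile
import io
import pstats

def busy_work(n):
    # Stand-in workload to give the profiler something to measure.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
busy_work(100_000)
profiler.disable()

# Print the five most expensive entries, sorted by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```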

30
Profiling
31
A Simple CGI Revisited
[Diagram: the same request flow (browser → /var/apache/httpd →
/local/bin/python → /site/cgi-bin/app.cgi → /usr/bin/psql, plus disk
I/O), now annotated with profiling notes: "can't do much about this",
"fork/exec is expensive", "import: 0.6", "what's going on here?",
"waiting out turn at DB", "how many transactions? are they one
class?"; measured times in seconds include 5.1, 5.3, 3.3, 2.7, 1.8,
0.9, 0.2, 0.7, 0.3]
32
Room for Improvement
  • Forking a new Python interpreter for each request
    is expensive
  • So keep an instance of Python running permanently
    beside the web server, and re-initialize it for
    each request
  • FCGI/SCGI
  • Tomcat is usually run this way
  • The ability to do this is one of the reasons
    VM-based languages won the server wars
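The persistent-process idea can be caricatured in a few lines; everything here is hypothetical (real deployments use FCGI/SCGI or an application server), but it shows where the per-request fork/exec cost goes away:

```python
import queue

def handle(request):
    # Hypothetical request handler; a real one would run the application.
    return f"processed {request}"

def worker_loop(requests):
    # Interpreter start-up and imports are paid once, not once per request.
    results = []
    while True:
        req = requests.get()
        if req is None:      # sentinel: shut the worker down
            return results
        results.append(handle(req))

q = queue.Queue()
for r in ["a", "b", None]:
    q.put(r)
print(worker_loop(q))  # ['processed a', 'processed b']
```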

33
Room for Improvement
  • Reimporting the libraries is expensive, too
  • Rely on cached .pyc files
  • Or rewrite application around a request-handling
    loop
  • Modularity is your friend
  • Tightly-coupled components cannot be tuned
    independently
  • On the other hand, machine-independent code has
    machine-independent performance

34
Too Much of a Good Thing
35
After Our Changes
[Diagram: the same request flow after the changes, annotated "was 5.3"
and "this has to be the next target"; measured times in seconds now
include 0.2, 2.6, 2.8, 0.1, 2.5, 0.6, 1.9, 1.8, 0.7, 0.3]
36
When Do You Stop?
  • An optimization problem in its own right
  • Time invested vs. likely performance improvements
  • Plan A: stop when you satisfy SLAs
  • Or beat them: always nice to have some slack
  • Plan B: stop when there are no obvious targets
  • Flat performance profiles are hard to improve
  • Plan C: stop when you run out of time
  • Plan D: stop when performance is "good enough"

37
Five Timescales
  • Human activities fall into natural cognitive
    categories
  • Continuous
  • Sip of coffee
  • Fresh pot
  • Buy some more beans
  • Harvest time
  • Tuning a well-written application usually just
    improves its performance within its category
  • Revolutions happen when things are moved from one
    category to another