Title: Some Systems, Applications and Models I Have Known
1 Some Systems, Applications and Models I Have Known
a retrospective look at many performance
evaluation studies
- Ken Sevcik
- University of Toronto
2 Overview
- In the past 35 years,
- Systems Have Changed
- Applications Have Grown
- Models Have Matured and Adapted
- and some interesting problems have been encountered
3 One-slide Tutorial: Approaches to Performance Evaluation
- How to answer "What if" questions
- (about hardware, software, and workload)
- Three alternatives
- Analysis using queueing theory
- Abstract model, but fast and cheap
- Stochastic Simulation
- Detailed model, and takes some time and work
- Experimentation
- Actual system, but lots of time and work
4 Problems with Voting Systems
- Definition: Majority Winner
- A candidate who wins every pairwise election
- Problems
- Voting for a single candidate
- Primaries and Drop Last can eliminate a majority winner
- Expressing a full preference ordering
- There may be no majority winner!
- Question
- How likely is a cyclical majority (or voter's paradox), where there is no majority winner?
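5 Elections and the will of the people
- Assume voter preferences are
- 30%: L > M > R
- 10%: M > L > R
- 20%: M > R > L
- 40%: R > M > L
- Single Vote: R wins with 40%
- Yet pairwise, M beats both R (60 to 40) and L (70 to 30)
- Preference order: with the last group changed to 40% R > L > M
- Cyclical Majority: L beats M (70 to 30), M beats R (60 to 40), R beats L (60 to 40)
[Figure: pairwise-election diagrams over L, M, and R for the two preference profiles]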
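The pairwise counts on this slide can be checked mechanically. The sketch below is illustrative only: the function names and the encoding of each ranking as a string (best candidate first) are assumptions, and the weights are the percentages listed on the slide.

```python
from itertools import combinations

def pairwise_results(profile):
    """profile: list of (weight, ranking); ranking is a string, best first."""
    candidates = sorted(profile[0][1])
    results = {}
    for a, b in combinations(candidates, 2):
        a_votes = sum(w for w, r in profile if r.index(a) < r.index(b))
        b_votes = sum(w for w, r in profile if r.index(b) < r.index(a))
        results[(a, b)] = (a_votes, b_votes)
    return results

def majority_winner(profile):
    """Return the candidate who wins every pairwise election, or None."""
    results = pairwise_results(profile)
    for c in sorted(profile[0][1]):
        wins_all = True
        for (a, b), (a_votes, b_votes) in results.items():
            if c == a and a_votes <= b_votes:
                wins_all = False
            if c == b and b_votes <= a_votes:
                wins_all = False
        if wins_all:
            return c
    return None

# First profile from the slide: M is the majority winner.
first = [(30, "LMR"), (10, "MLR"), (20, "MRL"), (40, "RML")]
# Second profile: the last group prefers R > L > M, giving a cycle.
second = [(30, "LMR"), (10, "MLR"), (20, "MRL"), (40, "RLM")]

print(majority_winner(first), majority_winner(second))
```

Under these assumptions the first profile yields M and the second yields no majority winner at all.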
6 First Research Application: Exact Probability of a Voter's Paradox
- C candidates for election
- V voters with strict preference orderings
- Can one candidate beat each other pairwise?
Example: V = 3, C = 3
V1: X > Y > Z
V2: Y > Z > X
V3: Z > X > Y
Then, in pairwise elections, X beats Y and Y beats Z, yet Z beats X!
The paradox occurs in 12 of the (3!)^3 = 216 possible configurations.
In general, there are (C!)^V voting configurations.
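For small V and C the count can be reproduced by plain enumeration. This is a brute-force sketch, not the method used for the large cases in the talk; the function names are hypothetical, and it simply walks all (C!)^V configurations.

```python
from itertools import permutations, product

def has_majority_winner(config, candidates):
    """True if some candidate beats every other in pairwise majority votes."""
    half = len(config) / 2
    for c in candidates:
        if all(sum(o.index(c) < o.index(d) for o in config) > half
               for d in candidates if d != c):
            return True
    return False

def count_paradoxes(num_voters, candidates="XYZ"):
    """Return (configurations without a majority winner, total configurations)."""
    orderings = list(permutations(candidates))
    cycles = 0
    total = 0
    for config in product(orderings, repeat=num_voters):
        total += 1
        if not has_majority_winner(config, candidates):
            cycles += 1
    return cycles, total

print(count_paradoxes(3))  # the slide reports 12 of 216 for V = 3, C = 3
```

Enumeration grows as (C!)^V, so it is only practical for the smallest entries.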
7 My first personal computer: IBM System 360 Model 30 with BOS
8 Exact Probabilities of Voter's Paradox
- V = 3, C = 3: 12 cycles in 216 configs.
- V = 7, C = 7:
- 26,295,386,028,643,902,475,468,800 cycles in
- 82,606,411,253,903,523,840,000,000 configs.
- (Computed in approximately 40 hours of CPU time.)
         C = 3      C = 5        C = 7            C = 40
------   --------   ----------   --------------   ------
V = 3    .0555      .1600        .238798185941    .61
V = 5    .06944     .19999525    .295755170299    .71
V = 7    .075017    .215334      .318321370333    .74
V = 40   .09        .24          .36              .80
9 Exact Probabilities of Voter's Paradox
Recent results
- V = 9, C = 7:
- 692,953,571,964,418,337,059,197,419,520,000 cycles in
- 2,098,335,016,107,155,751,174,144,000,000,000 configs.
- V = 5, C = 9:
- 2,312,910,445,872,026,769,020,928,000 cycles in
- 6,292,383,221,978,976,013,516,800,000 configs.
10 Job Sequencing on a Single Processor
(using service time distribution knowledge)
Given N jobs and their service time distributions, specify a schedule that minimizes average completion time.
Example with two jobs:
job 1: t1 = k
job 2: t2 = s with prob. 1 - p, t with prob. p
Candidate schedules: j1 1st; j2 1st; j2, j1, j2
j2, j1, j2 is BEST IFF s / (1 - p) < min{ k, s + p (t - s) }
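The three candidate schedules can be compared by writing out the expected sum of completion times for each. The sketch below is an illustration under the two-job setup on this slide; the function name is hypothetical, and the sample numbers (k = 4, s = 1, t = 10, p = .5) are taken from the quiz slide later in the deck.

```python
def expected_completion_sums(k, s, t, p):
    """Expected sum of both completion times for each candidate schedule.

    Job 1 needs k. Job 2 needs s with probability 1 - p and t with
    probability p, where s < t.
    """
    mean_t2 = (1 - p) * s + p * t
    # j1 first: job 1 finishes at k, job 2 at k + t2.
    j1_first = k + (k + mean_t2)
    # j2 first (run to completion): job 2 finishes at t2, job 1 at t2 + k.
    j2_first = mean_t2 + (mean_t2 + k)
    # j2, j1, j2: run job 2 for s; if it is done, run job 1;
    # otherwise run job 1 and then the remaining t - s of job 2.
    split = (1 - p) * (s + (s + k)) + p * ((s + k) + (k + t))
    return {"j1 first": j1_first, "j2 first": j2_first, "j2, j1, j2": split}

sums = expected_completion_sums(k=4, s=1, t=10, p=0.5)
best = min(sums, key=sums.get)
print(sums, best)
```

With these sample values the interrupted schedule j2, j1, j2 has the smallest expected sum.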
11 Job Sequencing on a Single Processor
(using service time distribution knowledge)
Smallest Rank (SR) Scheduling
- Minimize Rank = Investment (quantum length) / Payoff (Pr[Completion])
Service Time Knowledge:   exact   average   distribution
-----------------------   -----   -------   ------------
Preemption Allowed? No    SPT     SEPT      SEPT
Preemption Allowed? Yes   SRPT    SERPT     SR
12 Job Sequencing with Two Processors and Two Customers
Extending Shortest First to Multiple Resources
SBT-RSBT: based on average service time per visit of each customer at each resource
SBT => A gets priority at k
RSBT => A gets priority at 1
13 In the Beginning
- Single Server Queue
- Many variations
- arrival process, service process
- multiple servers, finite buffer size
- scheduling discipline
- FCFS, RR, FBn, PS, SRPT, ...
[Figure: single-server queue diagram with labels N, Z, and S]
RR, FBn, and PS increased the relevance of models
14 Queuing Network Models
Central Server Model
[Figure: central server model with N customers]
Separable (or product form) models and efficient computational algorithms
Variants: Open, Closed, Mixed; scheduling disciplines
15 The Great Debate: Operational Analysis vs. Stochastic Modeling
- SM
- Ergodic stationary Markov process in equilibrium
- Coxian distributions of service times
- independence in service times and routing
- OA
- finite time interval
- measurable quantities
- testable assumptions
- OA made analytic modelling accessible to capacity planners in large computing environments
16 Uses and Analysis of Queuing Network Models
- Applications
- System Sizing, Capacity Planning, Tuning
- Analysis Techniques
- Global Balance Solution
- Massive sets of Simultaneous Linear Equations
- Bounds Analysis
- Asymptotic Bounds (ABA), Balanced System Bounds (BSB)
- Solutions of Separable Models
- Exact (Convolution, eMVA)
- Approximate (aMVA)
- Generalizations beyond Separable Models
- aMVA with extended equations
17 Bounding Analysis Case Study: Insurance Company with 20 sites
Upgrade   Dcpu   Dio   Dtot   Improvement
-------   ----   ---   ----   -----------
Current   4.6    4.0   10.6   -----
1         5.1    1.9   7.0    1.5 to 2.0
2         3.1    1.9   5.0    2.0 to 3.5
ABA Inputs: N, Z, Dtot, Dmax
Throughput Bound     Response Time Bound
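The bound formulas themselves did not survive on this slide. As a sketch, the commonly stated asymptotic bounds for a closed model are X(N) <= min(1/Dmax, N/(Dtot + Z)) and R(N) >= max(Dtot, N*Dmax - Z); the code below assumes that form. The function name is hypothetical, and Z = 0 is an assumption made only for the sample call, since no think time is given here.

```python
def aba_bounds(n, z, d_tot, d_max):
    """Asymptotic bounds for a closed model with n customers.

    Returns (upper bound on throughput, lower bound on response time).
    """
    x_upper = min(1.0 / d_max, n / (d_tot + z))
    r_lower = max(d_tot, n * d_max - z)
    return x_upper, r_lower

# Upgrade 2 from the table: Dtot = 5.0, Dmax = Dcpu = 3.1 (Z = 0 assumed).
for n in (1, 2, 4, 10):
    x_upper, r_lower = aba_bounds(n, 0.0, 5.0, 3.1)
    print(n, round(x_upper, 3), round(r_lower, 1))
```

At light load the bounds depend on Dtot; at heavy load they depend only on Dmax, which is why each upgrade's improvement is quoted as a range.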
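18 Bounding Analysis Case Study: Insurance Company with 20 sites
Upgrade   Dcpu   Dio   Dtot   Improvement
-------   ----   ---   ----   -----------
Current   4.6    4.0   10.6   -----
1         5.1    1.9   7.0    1.5 to 2.0
2         3.1    1.9   5.0    2.0 to 3.5
[Figure: throughput bound X (axis .1 to .4) versus N (2 to 10), with curves labeled Cur, 1, and 2]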
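19 Bounding Analysis Case Study: Insurance Company with 20 sites
Upgrade   Dcpu   Dio   Dtot   Improvement
-------   ----   ---   ----   -----------
Current   4.6    4.0   10.6   -----
1         5.1    1.9   7.0    1.5 to 2.0
2         3.1    1.9   5.0    2.0 to 3.5
[Figure: response time bound R (axis 5 to 20) versus N (2 to 10), with curves labeled 1, Cur, and 2]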
20 Exact Mean Value Analysis Algorithm
Initialize (for zero customers)
Iterate up to N customers
for n = 1, ..., N
Set Arrival Instant Queue Lengths
Set Residence Time
Understandable and Easy to Implement
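The steps named on this slide can be sketched for the simplest case. The code assumes a single-class closed model whose centers are all queueing centers with service demands D_k, plus an optional think time Z; the function and parameter names are illustrative, not taken from the talk.

```python
def exact_mva(demands, n_customers, think_time=0.0):
    """Exact MVA for a single-class closed separable model.

    demands: service demand D_k at each queueing center.
    Returns (throughput X(N), response time R(N), queue lengths Q_k(N)).
    """
    queue = [0.0] * len(demands)  # initialize for zero customers
    throughput, response = 0.0, 0.0
    for n in range(1, n_customers + 1):
        # Arrival instant queue lengths: an arriving customer sees the
        # queue lengths of the same network with n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        # Little's law at each center gives the new queue lengths.
        queue = [throughput * r for r in residence]
    return throughput, response, queue

print(exact_mva([1.0, 1.0], 2))
```

The loop runs N times with work proportional to the number of centers, which is what makes the single-class case so cheap.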
21 Approximate Mean Value Analysis
Initialize to Equal Queue Lengths
Iterate until convergence
loop until Qk(N) are stable
Revise Arrival Instant Queue Lengths
Revise Residence Times
Substantial time savings; little loss of accuracy
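The slide does not say which approximation is used to revise the arrival instant queue lengths. The sketch below assumes the Schweitzer estimate, in which an arrival sees (N - 1)/N of the queue length at population N; that choice, the names, and the tolerance are assumptions for illustration.

```python
def approximate_mva(demands, n_customers, think_time=0.0,
                    tol=1e-10, max_iter=100000):
    """Approximate MVA (Schweitzer estimate) for a single-class closed model.

    Returns (throughput X(N), response time R(N), queue lengths Q_k(N)).
    """
    k = len(demands)
    queue = [n_customers / k] * k  # initialize to equal queue lengths
    for _ in range(max_iter):
        # Revise arrival instant queue lengths (Schweitzer assumption).
        seen = [(n_customers - 1) / n_customers * q for q in queue]
        # Revise residence times.
        residence = [d * (1.0 + a) for d, a in zip(demands, seen)]
        response = sum(residence)
        throughput = n_customers / (think_time + response)
        new_queue = [throughput * r for r in residence]
        if max(abs(a - b) for a, b in zip(new_queue, queue)) < tol:
            return throughput, response, new_queue
        queue = new_queue
    raise RuntimeError("queue lengths did not converge")

print(approximate_mva([2.0, 1.0], 5))
```

Unlike the exact recursion, the cost here does not grow with N, which is where the time savings come from.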
22 Details of Real Systems
- Going beyond Separable models
- Priority Scheduling
- Alter Residence Time equation
- FCFS with high variance service times
- Reflect coefficient of variation in service times
- Memory Constraints
- Alter MPL limit N, or Dpaging
- I/O Subsystems (simultaneous resource possession)
- Reflect contention by inflating Ddisk
- Enhanced Utility of QNMs for Real Systems
23 QNMs for Capacity Planning and Tuning
- Existing system with measurable workload
- What if
- the workload volume increases?
- the workload mix changes?
- the processor is upgraded?
- memory is added?
- the I/O configuration is enhanced?
- class priorities are adjusted?
- file placements are changed?
- the usage of memory is changed?
- Answer by changing model parameters
CAPACITY PLANNING
TUNING
24 System Sizing Case Study: NASA Numerical Aerodynamic Simulator
- GOAL: to attain a sustainable Gigaflop
[Figure: system configuration with Work Stations, Data Mgmt, Cray 1, Cray 2, Cray 3, and Graphics]
QNMs proved more useful than a simulation model
25 Capacity Planning Case Study: FAA Air Traffic Control System
- 40 distributed air traffic control centers
- Each with the SAME
- software
- hardware family
- 35 transaction types
- But DIFFERENT
- transaction volumes and mixes
- Single QNM (one class per transaction type)
- supports capacity planning for all sites
26 QNMs for System and Architecture Analysis
- Architectures
- caching structures
- Communication networks
- Local Area Networks
- Rings, buses
- Store and Forward
- flow control
- end to end response time
- Interconnection networks
- omega, shuffle-exchange,
27 Network for NASA's Space Station (circa 1984)
- Distributed LAN for many components
Results: Some properties of the FDDI Protocol
Ground Station
28 Architectural Analysis Case Study: NUMAchine
- 4 x 4 x 4 Hierarchical Ring Architecture
Setting Routing Priorities
Message Handling
Contiguous vs. Interleaved Shortest First ?
29 SEEU Interconnection Network
Source:      000 001 010 011 100 101 110 111
Destination: 000 001 010 011 100 101 110 111
Exchange Unshuffle
Shuffle Exchange
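The stage names on this slide refer to standard permutations on node addresses. The sketch below shows the usual definitions on 3-bit addresses: shuffle rotates the address left by one bit, unshuffle rotates it right, and exchange flips the low-order bit. How the SEEU network composes these stages is not recoverable from the slide and is not modeled here; the function names are illustrative.

```python
def shuffle(addr, bits):
    """Perfect shuffle: rotate the address left by one bit."""
    mask = (1 << bits) - 1
    return ((addr << 1) | (addr >> (bits - 1))) & mask

def unshuffle(addr, bits):
    """Inverse of the perfect shuffle: rotate the address right by one bit."""
    return (addr >> 1) | ((addr & 1) << (bits - 1))

def exchange(addr):
    """Exchange: flip the low-order address bit."""
    return addr ^ 1

for src in range(8):
    print(format(src, "03b"),
          format(shuffle(src, 3), "03b"),
          format(unshuffle(src, 3), "03b"),
          format(exchange(src), "03b"))
```

Each row prints a source address followed by its shuffle, unshuffle, and exchange images.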
30 SEEU operation
Combination Lock Algorithm
Up to 40% increase in throughput
31 Job Scheduling for Parallel Processing
Variants: Rigid, Moldable, Evolving, Malleable
Job j: ( tj , pj )
[Figure: jobs 1, 2, and 3 drawn as rectangles, processors (up to P) versus time]
32 Parallelism: Early or Late?
- Problem
- Schedule N jobs of two tasks each on two processors
- to minimize average residence time
- Each pair of jobs can be executed as
PARALLEL or SEQUENTIAL
overhead of parallel execution
33 Parallelism: Early or Late?
- Results of two similar studies
- RN et al.: Start parallel, Finish sequential
[Figure: schedule diagram of parallel (P) and sequential (S) phases]
34 Parallelism: Early or Late?
- Results of two similar studies
- RN et al.: Start parallel, Finish sequential
- KCS: Start sequential, Finish parallel
[Figure: two schedule diagrams of parallel (P) and sequential (S) phases, one per study]
36 Parallelism: Early or Late?
[Figure: plot with both axes labeled "increasing"]
37 The Case for Popt = 1
- (Assume p > 1 => Ej(p) < 1)
- Argument
- Demand is insatiable (unbounded backlog)
- Economies of scale (100s of users)
- Good systems will be heavily used
- Parallelism overhead decreases throughput and increases queuing times
38 Distributed Processing Models
- Processor selection strategies
- local vs. global execution
- Load Sharing
- sender-initiated vs. receiver-initiated
39 Small example: Individual Versus Social Optimum
- Arriving customers must pick one of two processors, one fast and one slow
[Figure: arrivals routed with probability pF to the fast processor F and pS to the slow processor S]
Individual Optimum: Pick the server with the lower response time (=> response times are equalized)
Social Optimum: Control pF to minimize average response time
40 Satisfying Social and Individual Goals
Individual Goal: Equalize Response Times, RF = RS (Individual Optimum)
Social Goal: Minimize Average Response Time, min R = pF RF + pS RS (Social Optimum)
41 Resolution of Social and Individual Goals
1. Charge a Toll on the Fast processor
2. Give a Rebate to users of the Slow processor
3. Set total of Rebates to equal the total of Tolls.
Toll on users of F
Rebate to users of S
RESULT: Individual Choice Yields Social Optimum
So Everybody Wins !!!
42 Resolution of Social and Individual Goals
Example
       pF    RF     RS     R
----   ---   ----   ----   ----
IND    .87   16.7   16.7   16.7
SOC    .85   12.1   27.0   14.3
With Toll 2.2 (and Rebate 12.7)
       pF    RF     RS     R      CF     CS     C
----   ---   ----   ----   ----   ----   ----   ----
Toll   .85   12.1   27.0   14.3   14.3   14.3   14.3
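The toll and rebate in this example follow from two conditions: the perceived costs must be equal (RF + toll = RS - rebate), and total tolls must equal total rebates (pF * toll = pS * rebate). Reading CF and CS as response time plus toll and response time minus rebate is an interpretation of the slide, and the function name is hypothetical.

```python
def toll_and_rebate(p_fast, r_fast, r_slow):
    """Solve r_fast + toll = r_slow - rebate with p_fast*toll = p_slow*rebate."""
    p_slow = 1.0 - p_fast
    gap = r_slow - r_fast
    toll = p_slow * gap    # charged to each user of the fast processor
    rebate = p_fast * gap  # paid to each user of the slow processor
    return toll, rebate

# Social optimum row of the example: pF = .85, RF = 12.1, RS = 27.0.
toll, rebate = toll_and_rebate(0.85, 12.1, 27.0)
cost_fast = 12.1 + toll
cost_slow = 27.0 - rebate
print(round(toll, 1), round(rebate, 1), round(cost_fast, 1), round(cost_slow, 1))
```

Both perceived costs come out equal to the average response time pF*RF + pS*RS, so no customer gains by switching processors.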
43 Anomaly of High Dimensional Spaces
2^k Spheres (radius 1) in a Cube (vol. 4^k, 2k sides) and an Inner sphere
1. Pointy-ness Property
2. Radius of Inner Sphere
R2 = .414
R10 = 2.16 !!!
3. Volume Ratio
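The two radii quoted here are consistent with a simple construction, assumed in the sketch below: the cube is [-2, 2]^k, the 2^k unit spheres are centered at the points (+-1, ..., +-1), and the inner sphere is centered at the origin and tangent to all of them. Each center is sqrt(k) from the origin, so the inner radius is sqrt(k) - 1.

```python
import math

def inner_radius(k):
    """Radius of the origin-centered sphere tangent to the 2^k unit spheres."""
    return math.sqrt(k) - 1.0

for k in (2, 3, 4, 9, 10, 121):
    r = inner_radius(k)
    # The cube's faces are at distance 2 from the origin.
    print(k, round(r, 3), "outside the cube" if r > 2.0 else "inside the cube")
```

Under this reading the inner sphere touches the cube's faces at k = 9 and extends beyond them from k = 10 on, which is the anomaly.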
44 Diagonal of a k-dimensional Cube
(Example: k = 25)
[Figure: view along the diagonal, with labels Corners, Red, and Blues]
45 Diagonals of Cube
K = 1
K = 2
K = 3
K = 4
46 Diagonals of Cube
K = 9
K = 121
(There are 2^121 blue spheres)
47 Multidimensional Databases
Relational View
(Records of k Attributes)
Multidimensional View
(Points in k-dimensional space)
Indexing Support for
-- point search
-- range search
-- similarity search
-- clustering
48 Bounding Spheres and Rectangles
Dim k   circumscribed sphere   cube   inscribed sphere   ratio of sphere volumes
-----   --------------------   ----   ----------------   -----------------------
2       1.57                   1.00   .785               2
4       4.93                   1.00   .308               16
8       64.94                  1.00   .0159              4096
16      15422.64               1.00   .000004            4294967296
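The table entries can be reproduced from the volume of a k-dimensional ball, pi^(k/2) / Gamma(k/2 + 1) * r^k. The sketch assumes a unit cube, so the inscribed sphere has radius 1/2 and the circumscribed sphere has radius sqrt(k)/2; the function names are illustrative.

```python
import math

def ball_volume(k, r):
    """Volume of a k-dimensional ball of radius r."""
    return math.pi ** (k / 2) / math.gamma(k / 2 + 1) * r ** k

def sphere_volumes(k):
    """(circumscribed, inscribed, ratio) for the unit cube in k dimensions."""
    inscribed = ball_volume(k, 0.5)
    circumscribed = ball_volume(k, math.sqrt(k) / 2)
    return circumscribed, inscribed, circumscribed / inscribed

for k in (2, 4, 8, 16):
    circ, insc, ratio = sphere_volumes(k)
    print(k, round(circ, 2), round(insc, 6), round(ratio))
```

The ratio is k^(k/2), so a bounding sphere becomes a very loose fit around a cube as the dimension grows.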
49 Edge Density in High-Dimensions
- Proportion of points near some side
Fraction near some edge = 1 - (1 - 2 eps)^k
k    eps = .002   eps = .020   eps = .200
--   ----------   ----------   ----------
1    .004         .040         .400
2    .007         .078         .640
4    .015         .150         .870
8    .031         .278         .983
16   .062         .479         .999
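The tabulated fractions agree, to the three digits shown, with 1 - (1 - 2*eps)^k: the probability that a point drawn uniformly from the unit cube lies within eps of at least one face. The uniform-distribution reading is an assumption, and the function name is illustrative.

```python
def fraction_near_edge(k, eps):
    """Fraction of the unit k-cube within eps of some face."""
    return 1.0 - (1.0 - 2.0 * eps) ** k

for k in (1, 2, 4, 8, 16):
    print(k, [round(fraction_near_edge(k, eps), 4) for eps in (0.002, 0.02, 0.2)])
```

Even a thin margin captures most of the volume once k is large, so most points in a high-dimensional data set lie near some boundary.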
50 Lessons and Conclusions
- Exact answers are overrated
- accurate approximate answers often suffice
- (e.g., Voter's Paradox and Exact QNM solutions)
- Analytic models have an important role
- quick, inexpensive answers in many situations
- (e.g., Insurance Co., NAS System, and FAA System)
- Assumptions matter
- subtle differences can have big effects
- (e.g., in Early or Late Parallelism, NUMAchine analysis, and PRI vs. FCFS or PS)
51 What is the best way to attain large improvements in computer performance?
- Analysis?
- Simulation?
- Experimentation?
52 What is the best way to attain large improvements in computer performance?
- Analysis?
- Simulation?
- Experimentation?
- None of the above
- Just wait 30 years!!!
53 ACM Sigmetrics, IFIP W.G. 7.3
Dept. of Computer Science
Thanks for the memories
54 Problems with Voting Systems
- Problems have occurred recently in ...
- France (lowest eliminated)
- R > M > L: 40%
- L > M > R: 40%
- M > (R, L): 20%
- Middle eliminated in first round, though its rank score (2.2) beats the rank score of the others (1.9)
- USA (primaries, and electoral college)
- E.g., McCain loses to Bush in primaries, although he might beat both candidates in a final election
55 Exact Mean Value Analysis Algorithm
for n = 1, ..., N
-- Understandable
-- Easy to implement
-- Arrival Instant Theorem
end for
56 Approximate Mean Value Analysis
loop
-- Substantial time savings
-- Little loss of accuracy
exit when X(N) and R(N) converge
end loop
57 System Sizing Case Study: NASA Numerical Aerodynamic Simulator
58 Quiz 1: Sequence Two Jobs on a Processor
- Service Times
t1 = 4
t2 = 1 with prob. .5, 10 with prob. .5
- Rank Calculations
Job   Attained   Investment   Payoff   Rank
---   --------   ----------   ------   ----
1     0          4            1.0      4.0
2     0          1            .5       2.0
2     0          5.5          1.0      5.5
2     1          9            1.0      9.0
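The rank table can be regenerated from the service time distributions. The sketch assumes that, for a job with a given attained service and a candidate quantum, the investment is the expected processing time spent within that quantum and the payoff is the probability of completing within it, both conditioned on the service already attained. That reading reproduces the rows above; the function name and the dictionary encoding of the distribution are illustrative.

```python
def rank_rows(dist, attained):
    """dist: {service_time: probability}. One row per candidate quantum:
    (quantum, investment, payoff, rank = investment / payoff)."""
    # Remaining service times, conditioned on not having finished yet.
    remaining = {t - attained: p for t, p in dist.items() if t > attained}
    total = sum(remaining.values())
    rows = []
    for q in sorted(remaining):
        investment = sum(min(t, q) * p for t, p in remaining.items()) / total
        payoff = sum(p for t, p in remaining.items() if t <= q) / total
        rows.append((q, investment, payoff, investment / payoff))
    return rows

job1 = {4: 1.0}
job2 = {1: 0.5, 10: 0.5}
print(rank_rows(job1, 0))
print(rank_rows(job2, 0))
print(rank_rows(job2, 1))
```

Job 2 has the smallest rank at the start (2.0), so it runs first for one time unit; if it has not finished, its rank rises to 9.0 and job 1 (rank 4.0) runs next.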
59 Exact Probabilities of Voter's Paradox
Recent results
- V = 9, C = 5:
- 1,154,330,758,425,600,000 cycles in
- 5,159,780,352,000,000,000 configs.
- V = 5, C = 9:
- 2,312,910,445,872,026,769,020,928,000 cycles in
- 6,292,383,221,978,976,013,516,800,000 configs.