Title: Influence of heavy-tailed distributions on load balancing
1 Heavy Tails, Performance Models, Scheduling Disciplines
Part IV: Scheduling in Practice: The SYNC Project
Mor Harchol-Balter, Carnegie Mellon University, Computer Science
2 Q: Which minimizes mean response time?
size = service requirement
load $\rho < 1$
3 Q: Which best represents scheduling in Web servers?
[Figure: three single-server queues of jobs, labeled FCFS, PS, and SRPT]
size = service requirement
load $\rho < 1$
4 IDEA: Use SRPT instead of PS in Web servers
[Figure: clients 1, 2, and 3 send "Get File 1", "Get File 2", and "Get File 3" over the Internet to the Apache Web server running on Linux O.S.]
5 Objections to SRPT
- Need to know size of request
- Unfairness to requests for big files
6 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
Papers are joint with Adam Wierman and Bianca Schroeder.
7 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
www.cs.cmu.edu/harchol/
8 SRPT has a long history ...
1966: Schrage & Miller derive M/G/1/SRPT response time
1968: Schrage proves optimality
1979: Pechinkin, Solovyev & Yashkov generalize
1990: Schassberger derives distribution on queue length
BUT WHAT DOES IT ALL MEAN?
9 SRPT has a long history (cont.)
1990-97: 7-year-long study at Univ. of Aachen under Schreiber: SRPT WINS BIG ON MEAN!
1998, 1999: Slowdown for SRPT under adversary (Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc.): SRPT STARVES BIG JOBS!
Various O.S. books (Silberschatz, Stallings, Tanenbaum) warn about starvation of big jobs ...
Kleinrock's Conservation Law: "Preferential treatment given to one class of customers is afforded at the expense of other customers."
10 Real-world job sizes are Heavy Tailed
[Figure: log-log plot of the job-size distribution; x-axis: job size (x seconds)] [Sigmetrics 96]
Heavy-tailed Property: Largest 1% of jobs comprise half the load.
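The heavy-tailed property can be checked numerically for the Bounded Pareto distribution used later in the talk. The sketch below is illustrative only: the shape 1.1 and maximum 10^10 come from the slides, while the minimum job size `k = 1` and the function name are assumptions made for this example.

```python
def top_fraction_of_load(alpha, k, p, top=0.01):
    """Fraction of total load (work) carried by the largest `top` fraction of jobs
    when job sizes are Bounded Pareto(alpha) on [k, p]. Requires alpha != 1."""
    # Size t with P(S > t) = top, using the Bounded Pareto tail
    # P(S > t) = (t^-alpha - p^-alpha) / (k^-alpha - p^-alpha).
    t = (top * (k**-alpha - p**-alpha) + p**-alpha) ** (-1.0 / alpha)
    # Load from sizes in [a, b] is proportional to a^(1-alpha) - b^(1-alpha).
    e = 1.0 - alpha
    return (t**e - p**e) / (k**e - p**e)

# k = 1.0 is an assumed minimum job size; alpha and p follow the slides.
frac = top_fraction_of_load(alpha=1.1, k=1.0, p=1e10)
print(f"largest 1% of jobs carry {frac:.0%} of the load")
```

Under these assumed parameters the largest 1% of jobs carry more than half of the load, which is the property the slide states.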
11 Unfairness Question
Let $\rho = 0.9$. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$).
Question: Which queue does the biggest job prefer?
[Figure: two M/G/1 queues]
12 Results on Unfairness
Let $\rho = 0.9$. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$).
13 Results on Unfairness
Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$).
[Figure: curves for PS and SRPT]
14 Unfairness: General Distribution
All-can-win theorem: For all distributions, if $\rho < 1/2$, then $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.
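The theorem can be spot-checked numerically. The sketch below evaluates the standard M/G/1 formulas for SRPT (the Schrage-Miller response time) and PS on a Bounded Pareto job-size distribution. It is not the authors' code: the minimum job size `K = 1`, the helper names, and the trapezoid-rule integration are assumptions of this example.

```python
import math

ALPHA, K, P = 1.1, 1.0, 1e10   # shape, min size (assumed), max size
C = ALPHA * K**ALPHA / (1.0 - (K / P)**ALPHA)   # normalising constant of the density

def partial_moment(j, x):
    """integral_K^x t^j f(t) dt for the Bounded Pareto density f (j = 1 or 2)."""
    e = j - ALPHA
    return C * (x**e - K**e) / e

def tail(x):
    """P(S > x) for K <= x <= P."""
    return C / ALPHA * (x**-ALPHA - P**-ALPHA)

def mean_T_ps(x, rho):
    """E[T(x)] under M/G/1/PS."""
    return x / (1.0 - rho)

def mean_T_srpt(x, rho, steps=4000):
    """E[T(x)] under M/G/1/SRPT for K <= x <= P: waiting time plus residence time."""
    lam = rho / partial_moment(1, P)                 # arrival rate giving load rho
    rho_upto = lambda t: lam * partial_moment(1, t)  # load from jobs of size <= t
    wait = lam * (partial_moment(2, x) + x * x * tail(x)) / (2.0 * (1.0 - rho_upto(x))**2)
    # Residence time = integral_0^x dt / (1 - rho(t)). It equals K on [0, K]
    # (no job is smaller than K); the rest uses the trapezoid rule in u = log t.
    lo, hi = math.log(K), math.log(x)
    h = (hi - lo) / steps
    g = lambda u: math.exp(u) / (1.0 - rho_upto(math.exp(u)))
    res = K + h * (0.5 * g(lo) + sum(g(lo + i * h) for i in range(1, steps)) + 0.5 * g(hi))
    return wait + res

for rho in (0.4, 0.9):
    for x in (1.0, 1e3, 1e6, 1e10):
        print(rho, x, mean_T_srpt(x, rho), mean_T_ps(x, rho))
```

For loads below 1/2 the SRPT column should stay below the PS column at every job size; the rows for load 0.9 correspond to the "Unfairness Question" setting, where the theorem makes no promise.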
15 All-can-win theorem: For all distributions, if $\rho < 1/2$, then $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.
Proof: Compare $E[T(x)]^{SRPT} = E[Wait(x)]^{SRPT} + E[Res(x)]^{SRPT}$ with $E[T(x)]^{PS}$.
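16 All-can-win theorem: For all distributions, if $\rho < 1/2$, then $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.
Proof cont.
Need sufficient condition s.t. $E[Wait(x)]^{SRPT} + E[Res(x)]^{SRPT} < E[T(x)]^{PS}$.
17 All-can-win theorem: For all distributions, if $\rho < 1/2$, then $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.
Proof cont.
Need sufficient condition s.t. $E[Wait(x)]^{SRPT} + E[Res(x)]^{SRPT} < E[T(x)]^{PS}$.
18 All-can-win theorem: For all distributions, if $\rho < 1/2$, then $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.
Proof cont.
Need sufficient condition s.t. $E[Wait(x)]^{SRPT} + E[Res(x)]^{SRPT} < E[T(x)]^{PS}$.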
Observe
19 All-can-win theorem: For all distributions, if $\rho < 1/2$, then $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all $x$.
Proof cont.
$\frac{1}{2(1-\rho(x))^2} < \frac{1}{1-\rho}$
Suffices that $2(1-\rho(x))^2 > 1-\rho$. Suffices that $\rho(x) < 1/2$.
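The equations on the proof slides did not survive extraction. The following is one way to fill in the steps that lead to the stated sufficient condition, using the standard M/G/1 formulas. It is a reconstruction, not necessarily the argument shown in the talk; the symbols $f$, $F$, $\lambda$, and $A(x)$ are notation introduced here.

```latex
% Notation: f = job-size density, F = its c.d.f., \lambda = arrival rate,
% \rho(x) = \lambda \int_0^x t f(t)\,dt, and \rho = \rho(\infty).
\begin{align*}
E[T(x)]^{PS} &= \frac{x}{1-\rho}, \qquad
E[Res(x)]^{SRPT} = \int_0^x \frac{dt}{1-\rho(t)}, \\
E[Wait(x)]^{SRPT} &= \frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 (1-F(x))}{2(1-\rho(x))^2}.
\end{align*}
% Define A(x) and bound the waiting time, using x^2 (1-F(x)) \le x \int_x^\infty t f(t)\,dt:
\begin{align*}
A(x) &= \int_0^x (\rho - \rho(t))\,dt
      = \lambda \int_0^x t^2 f(t)\,dt + \lambda x \int_x^\infty t f(t)\,dt, \\
E[Wait(x)]^{SRPT} &\le \frac{A(x)}{2(1-\rho(x))^2}.
\end{align*}
% Lower-bound the slack left by the residence time, using 1 - \rho(t) \le 1:
\begin{align*}
E[T(x)]^{PS} - E[Res(x)]^{SRPT}
  = \int_0^x \frac{\rho - \rho(t)}{(1-\rho)(1-\rho(t))}\,dt
  \ge \frac{A(x)}{1-\rho}.
\end{align*}
% Since A(x) > 0 for x > 0, it suffices that
\begin{align*}
\frac{1}{2(1-\rho(x))^2} < \frac{1}{1-\rho}
\iff 2(1-\rho(x))^2 > 1-\rho .
\end{align*}
% If \rho(x) < 1/2, then 2(1-\rho(x))^2 > 1-\rho(x) \ge 1-\rho,
% and \rho(x) \le \rho < 1/2 holds for every x under the theorem's hypothesis.
```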
20 Intuition
21 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
22 What is fair?
Response time for job of size $x$: $T(x)$
Slowdown for job of size $x$: $S(x) = T(x)/x$
Under PS, $E[S(x)]^{PS} = 1/(1-\rho)$: slowdown is independent of size.
PS does not bias towards any particular job size.
Definition: A policy P is fair if $E[S(x)]^{P} \le E[S(x)]^{PS}$ for all $x$. Otherwise, P is unfair.
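As a worked instance of this definition, the sketch below compares the expected slowdown under FCFS (from the Pollaczek-Khinchine mean waiting time) with the PS level. The function names and the choice of exponential job sizes with mean 1 are assumptions made for illustration.

```python
def slowdown_ps(rho):
    """E[S(x)] under M/G/1/PS: the same for every job size x."""
    return 1.0 / (1.0 - rho)

def slowdown_fcfs(x, rho, mean_size, second_moment):
    """E[S(x)] under M/G/1/FCFS: (mean queueing delay + x) / x,
    with the delay given by the Pollaczek-Khinchine formula."""
    lam = rho / mean_size
    wait = lam * second_moment / (2.0 * (1.0 - rho))
    return (wait + x) / x

def unfairly_treated(slowdown, sizes, rho):
    """Sizes x whose expected slowdown exceeds the PS level 1 / (1 - rho)."""
    return [x for x in sizes if slowdown(x) > slowdown_ps(rho)]

rho = 0.5
# Exponential job sizes with mean 1 (so E[S^2] = 2): an illustrative choice.
fcfs = lambda x: slowdown_fcfs(x, rho, mean_size=1.0, second_moment=2.0)
unfair_sizes = unfairly_treated(fcfs, [0.25, 0.5, 2.0, 4.0], rho)
print(unfair_sizes)
```

In this example the small jobs are the ones treated unfairly under FCFS, since every job waits the same expected time regardless of its size.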
23 Classification of Scheduling Policies
24 Testing your intuition
25 Classification of Scheduling Policies
[Figure: scheduling policies placed in three fairness classes: Always Unfair, Sometimes Unfair, Always FAIR]
Non-preemptive: FCFS, LJF, SJF
Age-based policies: FB
Preemptive size-based policies: PSJF
Preemptive remaining-size-based policies: SRPT, LRPT
Other policies shown: PS, PLCFS, FSP
Lots of open problems
26 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair (unfair for all loads and distributions).
Case 1: A finite size, y, receives lowest priority.
Case 2: No finite size receives lowest priority.
(2a) Priorities decrease monotonically -- PSJF
(2b) Priorities decrease non-monotonically.
27 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair (unfair for all loads and distributions).
Case 1: A finite size, y, receives lowest priority.
[Figure: V = work in system, with size y marked]
28 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair (unfair for all loads and distributions).
Case 2a: Priorities decrease monotonically (PSJF).
Infinite-sized job has lowest priority ... but that job is treated fairly?
29 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair (unfair for all loads and distributions).
Case 2a: Priorities decrease monotonically (PSJF).
[Figure: $E[S(x)]$ versus $x$ for PS and PSJF]
Finding a hump shows PSJF is Always Unfair.
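The hump can be seen in a small calculation. The sketch below uses the standard M/G/1/PSJF response-time formula (waiting time $\lambda \int_0^x t^2 f(t)\,dt / (2(1-\rho(x))^2)$ plus residence time $x/(1-\rho(x))$). The exponential job sizes with mean 1 and the load 0.5 are assumptions chosen so that the integrals have closed forms.

```python
import math

def slowdown_psjf_exp(x, rho):
    """E[S(x)] under M/G/1/PSJF for exponential job sizes with mean 1."""
    lam = rho                                               # mean size is 1, so lambda = rho
    rho_x = lam * (1.0 - math.exp(-x) * (1.0 + x))          # load from jobs of size <= x
    m2_x = 2.0 - math.exp(-x) * (x * x + 2.0 * x + 2.0)     # integral_0^x t^2 e^(-t) dt
    wait = lam * m2_x / (2.0 * (1.0 - rho_x) ** 2)
    residence = x / (1.0 - rho_x)
    return (wait + residence) / x

rho = 0.5
ps_level = 1.0 / (1.0 - rho)    # E[S(x)] under PS, for every x
for x in (0.1, 1.0, 3.0, 10.0, 100.0):
    print(x, round(slowdown_psjf_exp(x, rho), 3), "PS level:", ps_level)
```

Small jobs sit below the PS level, while sufficiently large jobs rise above it before the curve drifts back down toward it as $x$ grows. That excursion above the PS level is the hump.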
30 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair (unfair for all loads and distributions).
Case 2b: Priorities decrease non-monotonically.
[Figure: $E[S(x)]$ versus $x$ for PS and PSJF]
Find y beyond which PSJF treats all jobs unfairly. Find x > y, where x has lower priority than y. Then x is treated unfairly.
31 The mysterious hump
[Figure: two plots of $E[S(x)]$ versus $x$, with curves for PS and PSJF]
This hump appears in many common policies:
- PSJF
- FB
- SRPT
- SJF
32 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
33 From theory to practice
What does SRPT mean within a Web server?
- Many devices: Where to do the scheduling?
- Many jobs at once.
34 Server's Performance Bottleneck
Site buys limited fraction of ISP's bandwidth.
[Figure: clients 1, 2, and 3 send "Get File" requests through the rest of the Internet and the ISP to the Web server (Apache) on Linux O.S.]
We schedule bandwidth at server's uplink.
35 Network/O.S. insides of traditional Web server
[Figure: Sockets 1-3, one per client (Client1-Client3), feed the network card; the network card is the BOTTLENECK]
Sockets take turns draining --- FAIR PS.
36 Network/O.S. insides of our improved Web server
[Figure: Sockets 1-3, one per client (Client1-Client3), feed priority queues (1st, 2nd, 3rd; S, M, L) in front of the network card; the network card is the BOTTLENECK]
Socket corresponding to file with smallest remaining data gets to feed first.
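The difference between the two draining disciplines can be illustrated with a toy model. The sketch below is an idealized simulation, not the modified Linux/Apache code: all requests are assumed to arrive at time 0, one packet is sent per time unit, and the socket names and file sizes (in packets) are made up for the example.

```python
import heapq

def drain_fair(remaining):
    """Sockets take turns sending one packet each (round-robin).
    Returns {socket: completion time}."""
    remaining = dict(remaining)
    t, done = 0, {}
    while remaining:
        for sock in list(remaining):
            remaining[sock] -= 1
            t += 1
            if remaining[sock] == 0:
                done[sock] = t
                del remaining[sock]
    return done

def drain_srpt(remaining):
    """The socket with the smallest remaining data always sends next.
    Returns {socket: completion time}."""
    heap = [(left, sock) for sock, left in remaining.items()]
    heapq.heapify(heap)
    t, done = 0, {}
    while heap:
        left, sock = heapq.heappop(heap)
        t += 1
        if left == 1:
            done[sock] = t
        else:
            heapq.heappush(heap, (left - 1, sock))
    return done

def mean(times):
    return sum(times.values()) / len(times)

files = {"socket1": 1, "socket2": 3, "socket3": 10}   # hypothetical sizes in packets
fair, srpt = drain_fair(files), drain_srpt(files)
print("FAIR:", fair, "mean", mean(fair))
print("SRPT:", srpt, "mean", mean(srpt))
```

In this static example the mean completion time drops under SRPT while the largest file finishes at the same time under both disciplines, because the link is never idle in either case.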
37 Experimental Setup
[Figure: three Linux client machines, each running clients 1, 2, 3, ..., 200, connect through WAN EMU boxes and a switch to the Apache Web server on Linux O.S.]
Implementation of SRPT-based scheduling:
1) Modifications to Linux O.S.: 6 priority levels
2) Modifications to Apache Web server
3) Priority algorithm design.
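With only 6 priority levels available, remaining size has to be mapped onto a small number of bands. The sketch below shows one possible shape for such a mapping. The slide does not give the actual algorithm, so the cutoff values and the function name are purely hypothetical.

```python
import bisect

# Hypothetical cutoffs in bytes (5 cutoffs give 6 bands); not taken from the talk.
CUTOFFS = [1_000, 5_000, 20_000, 100_000, 500_000]

def priority_band(remaining_bytes, cutoffs=CUTOFFS):
    """Map remaining response size to a band 0..len(cutoffs); band 0 is served first."""
    return bisect.bisect_right(cutoffs, remaining_bytes)

# A large response moves to higher-priority bands as its remaining data shrinks.
for left in (2_000_000, 300_000, 50_000, 8_000, 900):
    print(left, "->", priority_band(left))
```

Recomputing the band as data drains is what makes the scheme track remaining size rather than original size.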
38 Experimental Setup
[Figure: three Linux client machines, each running clients 1, 2, 3, ..., 200, connect through WAN EMU boxes and a switch to the Apache Web server on Linux O.S.]
Flash / Apache
10Mbps uplink / 100Mbps uplink
Surge / Trace-based
Open system / Partly-open
WAN EMU / Geographically-dispersed clients
Load < 1 / Transient overload
Other effects: initial RTO, user abort/reload, persistent connections, etc.
Trace-based workload:
Number of requests made: 1,000,000
Size of file requested: 41B -- 2 MB
Distribution of file sizes requested has the heavy-tailed (HT) property.
39 Results: Mean Response Time
[Figure: mean response time (sec) versus load, for FAIR and SRPT]
40 Mean Response Time vs. Size Percentile
Load = 0.8
[Figure: mean response time (ms) versus percentile of request size, for FAIR and SRPT]
41 Transient Overload -- Mean response time
[Figure: curves for SRPT and FAIR]
42 Transient overload: Response time as function of job size
[Figure: curves for FAIR and SRPT]
Small jobs win big!
Big jobs aren't hurt!
43 New project: Scheduling dynamic web requests
[Figure: clients send "buy" requests over the Internet to the Web server (e.g., Apache/Linux)]
Need to schedule the database ...
44 Conclusion
Misconceptions about unfairness
Discrimination against high-performance scheduling policies
Classifying policies with respect to unfairness is counter-intuitive.
Good news: Many high-performing policies are also fair in practice!