Title: Influence of heavy-tailed distributions on load balancing
1 Heavy Tails, Performance Models & Scheduling Disciplines
Part IV: Scheduling in Practice: The SYNC Project
Mor Harchol-Balter, Carnegie Mellon University, Computer Science
2 Q: Which minimizes mean response time?
size = service requirement
load ρ < 1
3 Q: Which best represents scheduling in Web servers?
size = service requirement
load ρ < 1
[Figure: three single-server queues fed by jobs, labeled FCFS, PS, and SRPT]
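The question on the two slides above can be explored on a toy instance. The sketch below is a simplification, not the M/G/1 setting of the talk: it assumes a hypothetical batch of three jobs that all arrive at time 0 on a unit-speed server, so SRPT reduces to running the shortest job first. The function names are illustrative only.

```python
def mean(xs):
    return sum(xs) / len(xs)

def fcfs(sizes):
    """Completion times when jobs run to completion in arrival order."""
    t, out = 0.0, []
    for s in sizes:
        t += s
        out.append(t)
    return out

def srpt(sizes):
    """Completion times under SRPT for a batch that arrives at time 0.
    With no later arrivals, SRPT simply runs jobs shortest first."""
    out, t = [0.0] * len(sizes), 0.0
    for i in sorted(range(len(sizes)), key=lambda i: sizes[i]):
        t += sizes[i]
        out[i] = t
    return out

def ps(sizes):
    """Completion times under Processor Sharing for a batch at time 0.
    All unfinished jobs share the server equally, so jobs finish in
    size order and each stage lasts (extra work) * (jobs still active)."""
    out, t, attained, active = [0.0] * len(sizes), 0.0, 0.0, len(sizes)
    for i in sorted(range(len(sizes)), key=lambda i: sizes[i]):
        t += (sizes[i] - attained) * active
        attained = sizes[i]
        out[i] = t
        active -= 1
    return out

sizes = [3, 1, 2]  # hypothetical job sizes (service requirements)
for name, policy in (("FCFS", fcfs), ("PS", ps), ("SRPT", srpt)):
    print(name, policy(sizes), mean(policy(sizes)))
```

On this batch the mean response times are 13/3 for FCFS, 14/3 for PS, and 10/3 for SRPT. A batch example cannot show the effect of load or job-size variability, which is what the rest of the talk addresses.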
4 IDEA: Use SRPT instead of PS in Web servers
[Figure: clients 1, 2, and 3 send "Get File 1", "Get File 2", and "Get File 3" requests over the Internet to an APACHE WEB SERVER running on the Linux O.S.]
5 Objections to SRPT
-  Need to know size of request
-  Unfairness to requests for big files
6 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
Papers are joint with Adam Wierman and Bianca Schroeder
7 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
www.cs.cmu.edu/harchol/
8 SRPT has a long history ...
1966: Schrage & Miller derive M/G/1/SRPT response time
1968: Schrage proves optimality
1979: Pechinkin, Solovyev & Yashkov generalize
1990: Schassberger derives distribution on queue length
BUT WHAT DOES IT ALL MEAN?
9 SRPT has a long history (cont.)
1990-97: 7-year-long study at Univ. of Aachen under Schreiber: SRPT WINS BIG ON MEAN!
1998, 1999: Slowdown for SRPT under adversary (Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc.): SRPT STARVES BIG JOBS!
Various O.S. books (Silberschatz, Stallings, Tanenbaum): Warn about starvation of big jobs ...
Kleinrock's Conservation Law: "Preferential treatment given to one class of customers is afforded at the expense of other customers."
10 Real-world job sizes are Heavy Tailed
[Figure: log-log plot of the job size distribution; x-axis: Job size (x seconds)]
Heavy-tailed Property: Largest 1% of jobs comprise half the load.
[Sigmetrics 96]
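The heavy-tailed property can be checked in closed form for the Bounded Pareto distribution used later in the talk. The sketch below computes the share of total load carried by the largest 1% of jobs. The slides give only α = 1.1 and max = 10^10, so the minimum job size k = 1 is an assumption made here for illustration, and the function name is hypothetical.

```python
def bounded_pareto_top_load(alpha, k, p, top_fraction=0.01):
    """Fraction of the total load (work) contributed by the largest
    `top_fraction` of jobs under Bounded Pareto(alpha, k, p), alpha != 1.

    Density: f(t) = alpha * k**alpha * t**(-alpha - 1) / (1 - (k/p)**alpha)
    on k <= t <= p.
    """
    norm = 1.0 - (k / p) ** alpha
    # Size x with P(X > x) = top_fraction, from F(x) = (1 - (k/x)**alpha) / norm.
    tail = 1.0 - (1.0 - top_fraction) * norm      # equals (k/x)**alpha
    x = k / tail ** (1.0 / alpha)
    # The integral of t*f(t) over [a, b] is proportional to b**e - a**e with
    # e = 1 - alpha; the constant factors cancel in the ratio below.
    e = 1.0 - alpha
    return (p ** e - x ** e) / (p ** e - k ** e)

# alpha and max are from the slides; k = 1 is an assumed minimum job size.
print(bounded_pareto_top_load(alpha=1.1, k=1.0, p=1e10))
```

Under these assumed parameters the largest 1% of jobs carry somewhat more than half of the load, in line with the property stated on the slide.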
11 Unfairness Question
Let ρ = 0.9. Let G: Bounded Pareto(α = 1.1, max = 10^10)
Question: Which queue does the biggest job prefer?
[Figure: two M/G/1 queues]
12 Results on Unfairness
Let ρ = 0.9. Let G: Bounded Pareto(α = 1.1, max = 10^10)
13 Results on Unfairness
Let G: Bounded Pareto(α = 1.1, max = 10^10)
[Figure: curves labeled PS and SRPT]
14 Unfairness: General Distribution
All-can-win theorem: For all distributions, if ρ < ½, E[T(x)]^SRPT < E[T(x)]^PS for all x.
15 All-can-win theorem: For all distributions, if ρ < ½, E[T(x)]^SRPT ≤ E[T(x)]^PS for all x.
Proof:
E[Wait(x)]^SRPT
E[Res(x)]^SRPT
E[T(x)]^PS
16 All-can-win theorem: For all distributions, if ρ < ½, E[T(x)]^SRPT ≤ E[T(x)]^PS for all x.
Proof cont.
Need sufficient condition s.t.
17 All-can-win theorem: For all distributions, if ρ < ½, E[T(x)]^SRPT ≤ E[T(x)]^PS for all x.
Proof cont.
Need sufficient condition s.t.
18 All-can-win theorem: For all distributions, if ρ < ½, E[T(x)]^SRPT ≤ E[T(x)]^PS for all x.
Proof cont.
Need sufficient condition s.t.
Observe
19 All-can-win theorem: For all distributions, if ρ < ½, E[T(x)]^SRPT ≤ E[T(x)]^PS for all x.
Proof cont.
Suffices that 2(1 - ρ(x))² > 1 - ρ.
Suffices that ρ(x) < 1/2.
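The theorem can be sanity-checked numerically. The sketch below is not the proof from the slides: it evaluates the standard M/G/1 expressions E[T(x)]^SRPT = E[Wait(x)] + E[Res(x)] (Schrage & Miller) and E[T(x)]^PS = x/(1 - ρ) for one assumed distribution, exponential job sizes with mean 1, at an assumed arrival rate of 0.4, so that ρ = 0.4 < ½. The constant and function names are illustrative.

```python
import math

LAM = 0.4  # assumed arrival rate; mean job size is 1, so rho = 0.4 < 1/2

def rho_upto(x):
    """rho(x): load from jobs of size <= x, i.e. LAM * int_0^x t e^{-t} dt."""
    return LAM * (1.0 - math.exp(-x) * (1.0 + x))

def srpt_response(x, steps=2000):
    """E[T(x)] under M/G/1/SRPT for exponential(1) job sizes."""
    # int_0^x t^2 e^{-t} dt in closed form
    m2 = 2.0 - math.exp(-x) * (x * x + 2.0 * x + 2.0)
    # E[Wait(x)] = LAM * (m2 + x^2 * P(X > x)) / (2 * (1 - rho(x))^2)
    wait = LAM * (m2 + x * x * math.exp(-x)) / (2.0 * (1.0 - rho_upto(x)) ** 2)
    # E[Res(x)] = int_0^x dt / (1 - rho(t)), approximated by the midpoint rule
    h = x / steps
    res = sum(h / (1.0 - rho_upto((i + 0.5) * h)) for i in range(steps))
    return wait + res

def ps_response(x):
    """E[T(x)] under M/G/1/PS."""
    return x / (1.0 - LAM)

for x in (0.1, 1.0, 5.0, 20.0):
    print(x, srpt_response(x), ps_response(x))
```

For every size printed, the SRPT value stays below the PS value, as the theorem predicts for ρ < ½. Raising LAM above 0.5 takes the example outside the theorem's hypothesis.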
20 Intuition
21 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
www.cs.cmu.edu/harchol/
22 What is fair?
Response time for job of size x
Slowdown for job of size x
Slowdown is independent of size
PS does not bias towards any particular job size.
Definition: A policy P is fair if E[S(x)]^P ≤ E[S(x)]^PS for all x. Otherwise, P is unfair.
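The claim that slowdown under PS is independent of size follows in one line from the standard M/G/1/PS response time. The notation T(x) for response time and S(x) = T(x)/x for slowdown is assumed here, since the slide's own formulas were not preserved.

```latex
S(x) = \frac{T(x)}{x},
\qquad
E[S(x)]^{PS} = \frac{E[T(x)]^{PS}}{x}
             = \frac{1}{x}\cdot\frac{x}{1-\rho}
             = \frac{1}{1-\rho}.
```

Under this reading, a policy P is fair exactly when no job size sees expected slowdown above 1/(1 - ρ).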
23 Classification of Scheduling Policies
 and distributions
24 Testing your intuition
 and distributions
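25 Classification of Scheduling Policies
[Figure: diagram placing scheduling policies into fairness classes]
Fairness classes: Always Unfair, Sometimes Unfair, Always FAIR
Policy types: Non-preemptive; Age-based Policies; Preemptive Size-based Policies; Preemptive Remaining-size based Policies
Policies shown: FCFS, LJF, SJF, PS, FB, FSP, PLCFS, PSJF, SRPT, LRPT
Lots of open problems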
26 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair.
Case 1: A finite size, y, receives lowest priority
Case 2: No finite size receives lowest priority
  (2a) Priorities decrease monotonically -- PSJF
  (2b) Priorities decrease non-monotonically
Always Unfair: Unfair for all loads and distributions
27 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair.
Case 1: A finite size, y, receives lowest priority
Always Unfair: Unfair for all loads and distributions
[Figure: plot with size y marked; V = Work in System]
28 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair.
Case 2a: Priorities decrease monotonically (PSJF)
Infinite-sized job has lowest priority ... but that job is treated fairly?
Always Unfair: Unfair for all loads and distributions
29 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair.
Case 2a: Priorities decrease monotonically (PSJF)
[Figure: E[S(x)] versus x, starting at 0, with curves for PS and PSJF]
Always Unfair: Unfair for all loads and distributions
Finding a hump shows PSJF is Always Unfair
30 Always Unfair
Theorem: Any preemptive, size-based policy, P, is Always Unfair.
Case 2b: Priorities decrease non-monotonically
[Figure: E[S(x)] versus x, starting at 0, with curves for PS and PSJF]
Always Unfair: Unfair for all loads and distributions
Find y beyond which PSJF treats all jobs unfairly. Find x > y, where x has lower priority than y. => x is treated unfairly.
31 The mysterious hump
[Figure: two plots of E[S(x)] versus x, starting at 0, with curves for PS and PSJF]
This hump appears in many common policies:
-  PSJF 
-  FB 
-  SRPT 
-  SJF 
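The hump can be made visible with a few lines of arithmetic. The sketch below uses the standard M/G/1/PSJF response time, E[T(x)] = x/(1 - ρ(x)) + λ∫₀ˣ t² f(t) dt / (2(1 - ρ(x))²), and compares the resulting slowdown with the PS value 1/(1 - ρ). The exponential job-size distribution with mean 1 and the arrival rate 0.5 are assumptions chosen so that every integral has a closed form; they are not taken from the slides.

```python
import math

def psjf_slowdown(x, lam):
    """E[S(x)] under M/G/1/PSJF for exponential job sizes with mean 1."""
    rho_x = lam * (1.0 - math.exp(-x) * (1.0 + x))       # load from sizes <= x
    m2_x = 2.0 - math.exp(-x) * (x * x + 2.0 * x + 2.0)  # int_0^x t^2 e^{-t} dt
    response = x / (1.0 - rho_x) + lam * m2_x / (2.0 * (1.0 - rho_x) ** 2)
    return response / x

def ps_slowdown(lam):
    """E[S(x)] under M/G/1/PS; mean size is 1, so rho = lam."""
    return 1.0 / (1.0 - lam)

lam = 0.5  # assumed arrival rate, i.e. load 0.5
for x in (0.1, 1.0, 10.0, 100.0, 1000.0):
    print(x, psjf_slowdown(x, lam), ps_slowdown(lam))
```

Small jobs see a slowdown well below the PS line, while jobs around x = 10 sit above it, and the excess shrinks again as x grows. That rise above PS followed by a slow decay back toward it is the hump.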
32 Outline of Talk
I. Investigating Unfairness in SRPT (M/G/1, c.f.m.f.v.) [Sigmetrics 01]
II. Unfairness in All Scheduling Policies (M/G/1, c.f.m.f.v.) [Performance 02, Sigmetrics 03]
III. Implementation of SRPT in Web servers [Transactions on Computer Systems 03, ITC 03]
www.cs.cmu.edu/harchol/
33 From theory to practice
What does SRPT mean within a Web server?
-  Many devices: Where to do the scheduling?
-  Many jobs at once.
34 Server's Performance Bottleneck
Site buys limited fraction of ISP's bandwidth
[Figure: clients 1, 2, and 3 send "Get File 1", "Get File 2", and "Get File 3" requests through the rest of the Internet and the ISP to the WEB SERVER (Apache) on the Linux O.S.]
We schedule bandwidth at server's uplink.
35 Network/O.S. insides of traditional Web server
[Figure: Socket 1, Socket 2, and Socket 3 (for Client1, Client2, Client3) drain into the Network Card, which is marked BOTTLENECK]
Sockets take turns draining --- FAIR, i.e. PS.
36 Network/O.S. insides of our improved Web server
[Figure: Socket 1, Socket 2, and Socket 3 (for Client1, Client2, Client3), labeled S, M, and L, feed priority queues ranked 1st, 2nd, and 3rd in front of the Network Card, which is marked BOTTLENECK]
Socket corresponding to file with smallest remaining data gets to feed first.
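The rule on this slide can be sketched in user space with a min-heap keyed on remaining bytes. This is only an illustration of the ordering, not the kernel-level implementation described in the talk. The socket names, the packet size of 1500 bytes, and the absence of new arrivals are all assumptions.

```python
import heapq

def drain_srpt(remaining_bytes, packet=1500):
    """Return the order in which sockets finish when the socket with the
    least remaining data always gets to send the next packet.

    remaining_bytes: dict mapping a (hypothetical) socket name to the
    number of bytes still to be sent on it.
    """
    heap = [(left, name) for name, left in remaining_bytes.items()]
    heapq.heapify(heap)
    finished = []
    while heap:
        left, name = heapq.heappop(heap)   # socket with smallest remaining data
        left -= min(packet, left)          # it feeds one packet to the card
        if left == 0:
            finished.append(name)
        else:
            heapq.heappush(heap, (left, name))
    return finished

print(drain_srpt({"socket1": 4000, "socket2": 1000, "socket3": 20000}))
```

With these sizes the sockets complete in the order socket2, socket1, socket3. A real server would also have to re-rank sockets whenever a new request arrives, which this sketch leaves out.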
37 Experimental Setup
[Figure: three Linux client machines, each labeled 1, 2, 3, ..., 200 and each with a WAN EMU, connected through a switch to the APACHE WEB SERVER on the Linux O.S.]
Implementation of SRPT-based scheduling:
1) Modifications to Linux O.S.: 6 priority levels
2) Modifications to Apache Web server
3) Priority algorithm design.
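With only 6 priority levels available, exact SRPT has to be approximated by bucketing requests on their remaining size. The sketch below shows one way such a mapping could look. The cutoff values are hypothetical placeholders, since the slides state only the number of levels, and the function name is illustrative.

```python
import bisect

# Hypothetical cutoffs in bytes. Five cutoffs split sizes into six levels.
CUTOFFS = [1_000, 5_000, 20_000, 100_000, 500_000]

def priority_level(remaining_bytes, cutoffs=CUTOFFS):
    """Map the remaining bytes of a response to a priority level.
    Level 0 is served first; the level can only improve as data drains."""
    return bisect.bisect_right(cutoffs, remaining_bytes)

for size in (41, 4_000, 6_000, 2_000_000):
    print(size, priority_level(size))
```

A response that starts at 6,000 bytes begins in level 2 and moves to level 1 once fewer than 5,000 bytes remain, which is how a coarse set of levels can still track remaining size.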
38 Experimental Setup
[Figure: the same testbed: three Linux client machines, each labeled 1, 2, 3, ..., 200 and each with a WAN EMU, connected through a switch to the APACHE WEB SERVER on the Linux O.S.]
Options shown on the slide:
-  Flash / Apache
-  10Mbps uplink / 100Mbps uplink
-  Surge / Trace-based
-  Open system / Partly-open
-  WAN EMU / Geographically-dispersed clients
-  Load < 1 / Transient overload
-  Other effects: initial RTO, user abort/reload, persistent connections, etc.
Trace-based workload:
-  Number of requests made: 1,000,000
-  Size of file requested: 41B -- 2 MB
-  Distribution of file sizes requested has HT property.
39 Results: Mean Response Time
[Figure: Mean Response Time (sec) versus Load, with curves for FAIR and SRPT]
40 Mean Response Time vs. Size Percentile
Load = 0.8
[Figure: Mean Response Time (ms) versus Percentile of Request Size, with curves for FAIR and SRPT]
41 Transient Overload -- Mean response time
[Figure: curves for SRPT and FAIR]
42 Transient overload: Response time as function of job size
[Figure: curves for FAIR and SRPT]
small jobs win big!
big jobs aren't hurt!
43 New project: Scheduling dynamic web requests
[Figure: "buy" requests arrive over the Internet at the Web Server (e.g. Apache/Linux)]
Need to schedule the database ...
44 Conclusion
Misconceptions about unfairness
Discrimination against high-performance scheduling policies
Classifying policies with respect to unfairness is counter-intuitive.
Good news: Many high-performing policies are also fair in practice!