Title: Influence of heavy-tailed distributions on load balancing
1. Scheduling Your Network Connections
Mor Harchol-Balter, Carnegie Mellon University
Joint work with Bianca Schroeder
2. Q: Which minimizes mean response time?
[Diagram: queues of jobs; size = service requirement; load ρ < 1]
3. Q: Which best represents scheduling in web servers?
[Diagram: jobs queued under FCFS, PS, and SRPT; size = service requirement; load ρ < 1]
4. IDEA
How about using SRPT instead of PS in web servers?
[Diagram: clients 1-3 send "Get File 1/2/3" across the Internet to the WEB SERVER (Apache) running on the Linux O.S.]
5. Immediate Objections
1) Can't assume known job size.
   Many servers receive mostly static web requests ("GET FILE"). For static web requests the file size is known, so the service requirement of the request is approximately known.
2) But the big jobs will starve ...
6. Outline of Talk
1) Analysis of SRPT Scheduling: Investigating Unfairness
2) Size-based Scheduling to Improve Web Performance
3) Web servers under overload: How scheduling can help
www.cs.cmu.edu/~harchol/
7. SRPT has a long history ...
1966: Schrage & Miller derive the M/G/1/SRPT response time.
1968: Schrage proves optimality.
1979: Pechinkin, Solovyev, Yashkov generalize.
1990: Schassberger derives the distribution of queue length.
BUT WHAT DOES IT ALL MEAN?
8. SRPT has a long history (cont.)
1990-97: 7-year-long study at Univ. of Aachen under Schreiber. SRPT WINS BIG ON MEAN!
1998, 1999: Slowdown for SRPT under adversary (Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc.). SRPT STARVES BIG JOBS!
Various O.S. books (Silberschatz, Stallings, Tanenbaum) warn about starvation of big jobs.
... Kleinrock's Conservation Law: "Preferential treatment given to one class of customers is afforded at the expense of other customers."
9. Unfairness Question
Let ρ = 0.9. Let G = Bounded Pareto(α = 1.1, max = 10^10).
Question: Which queue (PS or SRPT) does the biggest job prefer?
10. Our Analytical Results (M/G/1)
All-Can-Win Theorem: Under workloads with the heavy-tailed (HT) property, ALL jobs, including the very biggest, prefer SRPT to PS, provided the load is not too close to 1.
Almost-All-Win-Big Theorem: Under workloads with the HT property, 99% of all jobs perform orders of magnitude better under SRPT.
11. What's a Heavy Tail?
[Figure: fraction of jobs with CPU duration > x, versus x; Berkeley Unix process CPU lifetimes [HD96]]
12. What's the Heavy-Tail property?
Defn: A heavy-tailed distribution satisfies
    Pr{X > x} ~ x^(-α),   0 < α < 2.
Many real-world workloads are well modeled by a truncated HT distribution.
Key property (HT property): the largest 1% of jobs comprise half the load.
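As a quick illustration of the HT property, the short calculation below asks what fraction of the load the largest 1% of jobs carry under a Bounded Pareto distribution with α = 1.1 and max = 10^10 (the parameters used on the unfairness-question slide); the minimum job size K = 1 is an illustrative choice of mine, not taken from the talk.

```python
# How much of the load do the largest 1% of jobs carry under a
# Bounded Pareto(alpha = 1.1, max = 10^10) distribution?
# K (the minimum job size) is an illustrative choice.

ALPHA, K, P = 1.1, 1.0, 1e10
norm = 1.0 - (K / P) ** ALPHA

# 99th percentile of job size, from inverting the Bounded-Pareto CDF
#   F(x) = (1 - (K/x)^alpha) / norm
x99 = K * (1.0 - 0.99 * norm) ** (-1.0 / ALPHA)

# Fraction of total work in jobs larger than x99:
#   int_{x99}^{P} t f(t) dt / int_{K}^{P} t f(t) dt   (closed form below)
share = ((x99 ** (1.0 - ALPHA) - P ** (1.0 - ALPHA))
         / (K ** (1.0 - ALPHA) - P ** (1.0 - ALPHA)))

print(f"99th-percentile job size: {x99:.1f}")
print(f"largest 1% of jobs carry {100.0 * share:.0f}% of the load")
```

With these illustrative parameters the share works out to roughly 60%, which is the kind of "largest 1% of jobs comprise half the load" behaviour the slide describes.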
13. Our Analytical Results (M/G/1)
All-Can-Win Theorem: Under workloads with the heavy-tailed (HT) property, ALL jobs, including the very biggest, prefer SRPT to PS, provided the load is not too close to 1.
Almost-All-Win-Big Theorem: Under workloads with the HT property, 99% of all jobs perform orders of magnitude better under SRPT.
14. Our Analytical Results (M/G/1)
All-Distributions-Win Theorem: If load ρ < 0.5, then for every job size distribution, ALL jobs prefer SRPT to PS.
Bounding-the-Damage Theorem: For any load ρ, for every job size distribution, for every size x,
    E[T(x)]_SRPT < (1 + ρ / (2(1-ρ))) · E[T(x)]_PS.
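As a sanity check on both theorems, the sketch below evaluates the standard M/G/1 formulas numerically: E[T(x)]_PS = x/(1-ρ) for PS, and the Schrage-Miller waiting-plus-residence expression for SRPT, under the Bounded Pareto(α = 1.1, max = 10^10) workload from the unfairness-question slide at ρ = 0.9. The minimum job size K = 1, the grid resolution, and the helper names are illustrative choices of mine, not part of the talk.

```python
# Numerical check of "All-Can-Win" and "Bounding-the-Damage" for an M/G/1 queue
# with a Bounded Pareto(alpha = 1.1, max = 10^10) job-size distribution at rho = 0.9.
# Formulas used (standard M/G/1 results):
#   E[T(x)]_PS   = x / (1 - rho)
#   E[T(x)]_SRPT = lam*(int_0^x t^2 dF(t) + x^2*(1-F(x))) / (2*(1 - rho(x))^2)
#                  + int_0^x dt / (1 - rho(t))             (Schrage & Miller, 1966)
# where rho(x) = lam * int_0^x t dF(t).

import numpy as np

ALPHA, K, P, RHO = 1.1, 1.0, 1e10, 0.9        # K (minimum job size) is illustrative

norm = 1.0 - (K / P) ** ALPHA                 # Bounded-Pareto normalization
t = np.logspace(np.log10(K), np.log10(P), 20_000)
f = ALPHA * K ** ALPHA * t ** (-ALPHA - 1.0) / norm        # pdf on [K, P]
F = (1.0 - (K / t) ** ALPHA) / norm                        # cdf

def cumtrapz0(y, x):
    """Cumulative trapezoidal integral of y dx, starting from 0."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))

work_up_to = cumtrapz0(t * f, t)              # int_K^x t f(t) dt
lam = RHO / work_up_to[-1]                    # arrival rate so that lam * E[S] = rho
rho_x = lam * work_up_to                      # load made up by jobs of size <= x
m2_x = cumtrapz0(t ** 2 * f, t)               # int_K^x t^2 f(t) dt

T_ps = t / (1.0 - RHO)
T_srpt = (lam * (m2_x + t ** 2 * (1.0 - F)) / (2.0 * (1.0 - rho_x) ** 2)
          + K + cumtrapz0(1.0 / (1.0 - rho_x), t))   # the first K of the residence
                                                     # integral has rho(t) = 0

ratio = T_srpt / T_ps
print("biggest job:  E[T]_SRPT / E[T]_PS =", ratio[-1])     # < 1: All-Can-Win
print("worst ratio over all x =", ratio.max())
print("damage bound 1 + rho/(2(1-rho)) =", 1.0 + RHO / (2.0 * (1.0 - RHO)))
```

With these parameters the ratio for the very biggest job should come out just under 1 (so even it prefers SRPT), and the worst-case ratio stays far below the damage bound 1 + ρ/(2(1-ρ)) = 5.5 at ρ = 0.9.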
15. From theory to practice
What does SRPT mean within a Web server?
- Many devices: where to do the scheduling?
- No longer one job at a time.
16. Server's Performance Bottleneck
The site buys a limited fraction of its ISP's bandwidth.
[Diagram: clients 1-3 send "Get File 1/2/3" through the rest of the Internet and the ISP to the WEB SERVER (Apache) running on the Linux O.S.]
We model the bottleneck by limiting the bandwidth on the server's uplink.
17. Network/O.S. insides of a traditional Web server
[Diagram: Clients 1-3 each have a socket (Sockets 1-3) feeding the Network Card; the uplink is the BOTTLENECK.]
Sockets take turns draining --- FAIR = PS.
18. Network/O.S. insides of our improved Web server
[Diagram: Clients 1-3 each have a socket (Sockets 1-3); the sockets feed the Network Card through S (1st), M (2nd), and L (3rd) priority queues; the uplink is the BOTTLENECK.]
The socket corresponding to the file with the smallest remaining data gets to feed first (see the sketch below).
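As a concrete user-space illustration of that feeding rule, the sketch below always serves the connection with the least remaining data; Connection, CHUNK, and feed_uplink are hypothetical names invented for this sketch, since the server described in the talk enforces the rule inside the kernel with the priority queues pictured above.

```python
# Minimal sketch of the feeding rule: whenever the uplink can take another chunk,
# feed the socket whose file has the least remaining data (SRPT order).
# Connection, CHUNK and feed_uplink are illustrative names, not the real server's code.

import heapq
from dataclasses import dataclass, field

CHUNK = 1500  # bytes handed to the network card per turn (illustrative)

@dataclass(order=True)
class Connection:
    remaining: int                        # bytes of the file still to send
    sock_id: int = field(compare=False)   # which socket this is

def feed_uplink(conns):
    """Drain all connections, always feeding the one with the least remaining data."""
    heap = list(conns)
    heapq.heapify(heap)                   # min-heap keyed on remaining bytes
    schedule = []
    while heap:
        c = heapq.heappop(heap)
        sent = min(CHUNK, c.remaining)    # the real server would write these bytes
        schedule.append((c.sock_id, sent))
        c.remaining -= sent
        if c.remaining > 0:
            heapq.heappush(heap, c)       # re-queue with its smaller remainder
    return schedule

# The 1 KB file completes before the larger files are fed at all.
print(feed_uplink([Connection(50_000, 1), Connection(20_000, 2), Connection(1_000, 3)])[:3])
```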
19. Experimental Setup
Implementation of SRPT-based scheduling:
1) Modifications to the Linux O.S.: 6 priority levels.
2) Modifications to the Apache Web server.
3) Priority algorithm design (a sketch follows below).
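The slides do not spell out the priority algorithm, so the following is only one plausible shape for it: bucket each response into one of 6 bands by the bytes still to be sent, and refresh the socket's band as the remainder shrinks. The cutoff values are invented, and using setsockopt(SO_PRIORITY) from user space is an assumption of this sketch (it is a Linux-specific option whose effect depends on the configured queueing discipline); the talk's actual mechanism is a modification inside the Linux kernel.

```python
# One plausible shape for the priority algorithm: bucket each response into one of
# 6 bands by the bytes still to be sent, and refresh the socket's band as the
# remainder shrinks. Cutoffs are invented; SO_PRIORITY as the user-space hook is an
# assumption of this sketch, not the talk's actual kernel modification.

import socket

CUTOFFS = [2_000, 10_000, 50_000, 200_000, 1_000_000]   # bytes; illustrative only

def band_for(remaining_bytes: int) -> int:
    """Return one of 6 bands: 0 for the least remaining data, 5 for the most."""
    for band, cutoff in enumerate(CUTOFFS):
        if remaining_bytes <= cutoff:
            return band
    return len(CUTOFFS)

def reprioritize(sock: socket.socket, remaining_bytes: int) -> None:
    # SO_PRIORITY (Linux-specific) tags outgoing packets with a priority value;
    # how the bands are drained then depends on the configured queueing discipline.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, band_for(remaining_bytes))
```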
20. Experimental Setup
[Diagram: Apache (and Flash) Web server on the Linux O.S.; 10 Mbps and 100 Mbps uplinks; switch; clusters of 200 Linux client machines behind WAN emulators (WAN EMU); workload generators: Surge and trace-based; open and partly-open system models; geographically-dispersed clients.]
Trace-based workload:
- Number of requests made: 1,000,000
- Size of file requested: 41 B -- 2 MB
- Distribution of file sizes requested has the HT property.
Load < 1; transient overload.
Other effects: initial RTO, user abort/reload, persistent connections, etc.
21. Preliminary Comments
[The same experimental-setup diagram as on the previous slide.]
- Job throughput, byte throughput, and bandwidth utilization were the same under SRPT and FAIR scheduling.
- The same set of requests complete.
- No additional CPU overhead under SRPT scheduling.
- The network was the bottleneck in all experiments.
22. Results: Mean Response Time
[Figure: mean response time (sec) vs. load, FAIR vs. SRPT]
23. Results: Mean Slowdown
[Figure: mean slowdown vs. load, FAIR vs. SRPT]
24. Mean Response Time vs. Size Percentile
[Figure: mean response time (ms) vs. percentile of request size at load 0.8, FAIR vs. SRPT]
25. Summary so far ...
- SRPT scheduling yields significant improvements in mean response time at the server.
- Negligible starvation.
- No CPU overhead.
- No drop in throughput.
26. More questions
- So far we have only shown LAN results. Are the effects of SRPT in a WAN as strong?
- So far we have only shown load < 1. What happens under SRPT vs. FAIR when the server runs under transient overload?
  -> new analysis
  -> implementation study
27. WAN EMU results
Propagation delay has an additive effect. It reduces the improvement factor.
[Figure: FAIR vs. SRPT]
28. WAN EMU results
Loss has a quadratic effect. It reduces the improvement factor a lot.
[Figure: FAIR vs. SRPT]
29. WAN results
Geographically-dispersed clients.
[Figures: load 0.9 and load 0.7]
30. Overload: a 5-minute overview
[Image: a person under overload]
31. Q: What happens under overload? A: A buildup in the number of connections.
[Figure: FAIR vs. SRPT]
Q: What happens to response time?
32. Web server under overload
When the SYN-queue limit is reached, the server drops all connection requests.
[Diagram: clients send connection requests to the server; requests pass through the SYN-queue and the ACK-queue before reaching the Apache processes.]
33. Transient Overload
[Figure: load over time alternates between periods with ρ > 1 and periods with ρ < 1]
34. Transient Overload - Baseline: Mean response time
[Figure: FAIR vs. SRPT]
35. Transient overload: Response time as a function of job size
[Figure: FAIR vs. SRPT]
Small jobs win big! Big jobs aren't hurt!
WHY?
36. FACTORS
- Baseline case
- WAN propagation delays: RTT = 0 - 150 ms
- WAN loss: loss = 0 - 15%
- WAN loss & delay: loss = 0 - 15%, RTT = 0 - 150 ms
- Persistent connections: 0 - 10 requests/conn.
- Initial RTO value: RTO = 0.5 sec - 3 sec
- SYN cookies: ON/OFF
- User abort/reload: abort after 3 - 15 sec, with 2, 4, 6, 8 retries
- Packet length: 536 - 1500 bytes
- Realistic scenario: RTT = 100 ms; loss = 5%; 5 requests/conn.; RTO = 3 sec; pkt len = 1500 B; user aborts after 7 sec and retries up to 3 times.
37. Transient Overload - Realistic: Mean response time
[Figure: FAIR vs. SRPT]
38. Conclusion
- SRPT scheduling is a promising solution for reducing the mean response time seen by clients, particularly when the load at the server bottleneck is high.
- SRPT results in negligible or zero unfairness to large requests.
- SRPT is easy to implement.
- Results are corroborated via implementation and analysis.