Characterizing and Predicting TCP Throughput on the Wide Area Network


1
Characterizing and Predicting TCP Throughput on
the Wide Area Network

Dong Lu, Yi Qiao,
Peter Dinda, Fabian Bustamante
Department of Computer Science
Northwestern University
http://plab.cs.northwestern.edu

2
Overview

Algorithm for predicting TCP throughput as a
function of flow size
Minimal active probing
Dynamic probe rate adjustment
Explaining flow size / throughput correlation
Explaining why simple active probing fails
Large scale empirical study

3
Outline

Why TCP throughput prediction?
Particulars of study
Flow size / TCP throughput correlation
Issues with simple benchmarking
DualPats algorithm
Stability and dynamic rate adjustment

4
Goal

A library call
BW = PredictTransfer(src, dst, numbytes)
Expected Time = numbytes / BW
Ideally, we want a confidence interval
(BWLow, BWHigh) = PredictTransfer(src, dst, numbytes, p)
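
The arithmetic on this slide can be sketched as follows. This is only an illustration of Expected Time = numbytes / BW and of how a bandwidth interval maps to a time interval; the function names `expected_time` and `time_bounds` are hypothetical and are not part of the library call named on the slide.

```python
def expected_time(numbytes: int, bw_bytes_per_s: float) -> float:
    """Expected transfer time in seconds: numbytes / BW."""
    if bw_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return numbytes / bw_bytes_per_s


def time_bounds(numbytes: int, bw_low: float, bw_high: float) -> tuple:
    """Map a bandwidth interval (BWLow, BWHigh) to a (fastest, slowest) time interval.

    Higher bandwidth means a shorter transfer, so the bounds swap order.
    """
    return expected_time(numbytes, bw_high), expected_time(numbytes, bw_low)
```

For example, a 10 MB transfer predicted at 1 to 2 MB/s would be expected to take between 5 and 10 seconds.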

5
Available Bandwidth

Maximum rate a path can offer a flow without
slowing other flows
pathchar, cprobe, nettimer, delphi, IGI,
pathchirp, pathload
Available bandwidth can differ significantly from
TCP throughput
Not real time, takes at least tens of seconds to
run

6
Simple TCP Benchmarking

Benchmark paths with a single small probe
BW = ProbeSize / Time
Widely used: Network Weather Service (NWS) and
others (Remos benchmarking collector)
Not accurate for large transfers on the current
high speed Internet
Numerous papers show this and attempt to fix it

7
Fixing Simple TCP Benchmarking

Logs [Sundharshan]: correlate real transfer
measurements with benchmarking measurements
Recent transfers needed
Similar size transfers needed
Measurements at application chosen times
CDF-matching [Swany]: correlate CDF of real
transfer measurements with CDF of benchmarking
measurements
Recent transfers still needed
Measurements at application chosen times

8
Analysis of TCP

Extensive research on TCP throughput modeling in
networking community
Really intended to build better TCPs
Difficult to use models online because of
hard-to-measure parameters
Future loss rate and RTT
Note: we measure goodput

9
Our Measurement Study

PlanetLab and additional machines
Located all over the world
Measurements of throughput
Wide open socket buffers (1-3 MB)
Simple ttcp-like client/server
scp
GridFTP
Four separate sets of measurements

10
Distribution Set

For analysis of TCP throughput stability and
distributions
60 randomly chosen paths among PlanetLab machines
1.6 million transfers (client/server)
100 KB, 200 KB, 400 KB, 10 MB flows
3000 consecutive transfers per (path, flow size) pair

11
Correlation Set

For studying correlation between throughput and
flow size, initial testing of algorithm
60 randomly chosen paths among PlanetLab machines
2.4 million transfers, 270 thousand runs,
client/server
100 KB, 200 KB, 400 KB, 10 MB flows
Run: sweep of flow sizes for a path

12
Verification Set

Test algorithm
30 randomly chosen paths among PlanetLab machines
and others
4800 transfers, 300 runs, scp and GridFTP
5 KB to 1 GB flows
Run: sweep of flow sizes for a path

13
Online Evaluation Set

Test online algorithm
50 randomly chosen paths among PlanetLab machines
and others
14000 transfers, scp and GridFTP
40 MB or 160 MB file, randomly chosen size
10 days

14
Strong Correlation Between TCP Throughput and
Flow Size
Correlation and Verification Sets
15
Why Does The Correlation Exist?

Slow start and user effects [Zhang]
Extant flows
Non-negligible startup overheads
Control messages in scp and GridFTP
Residual slow start effect
SACK results in slow convergence to equilibrium

16
Why Simple Benchmarking Fails
Need more than one probe to capture correlation
Probes are too small
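
A small numerical sketch of why a single small probe misleads. It assumes, purely for illustration, that a transfer pays a fixed startup cost and then proceeds at a steady rate; the names `startup_s` and `steady_rate` and the numbers used are hypothetical and are not measurements from the study.

```python
def observed_throughput(size_bytes: float, startup_s: float, steady_rate: float) -> float:
    """Throughput seen by a transfer that pays a fixed startup cost.

    Assumed model: time = startup_s + size_bytes / steady_rate.
    """
    return size_bytes / (startup_s + size_bytes / steady_rate)


# Hypothetical path: 0.2 s of startup overhead, 10 MB/s steady rate.
small_probe = observed_throughput(64 * 1024, 0.2, 10e6)   # 64 KB probe
large_flow = observed_throughput(100e6, 0.2, 10e6)        # 100 MB transfer
```

Under these assumptions the small probe reports a small fraction of the rate the large transfer achieves, so BW = ProbeSize / Time from one small probe says little about a large flow.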
17
Our Approach
Two consecutive probes, both larger than the
noise region
18
Our Approach

Two consecutive probes are integrated into a
single probe
400 KB, 800 KB in a single 800 KB probe

[Figure: timeline starting at 0, marking T1 (probe one) and T2 (probe two)]
19
Our Approach
[Figure: transfer time versus flow size]
Solve for A and B
Predict throughput for some other transfer
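
The "solve for A and B" step can be sketched as below. The slide's figure did not survive, so this sketch assumes a linear model of transfer time against flow size, T = A * S + B, fitted through the two probe points (S1, T1) and (S2, T2). The function names are hypothetical.

```python
def fit_linear_time_model(s1: float, t1: float, s2: float, t2: float) -> tuple:
    """Fit T = A * S + B through two (flow size, transfer time) points."""
    if s1 == s2:
        raise ValueError("the two probes must have different sizes")
    a = (t2 - t1) / (s2 - s1)
    b = t1 - a * s1
    return a, b


def predict_throughput(a: float, b: float, size: float) -> float:
    """Predicted throughput for a flow of `size` bytes: S / (A * S + B)."""
    predicted_time = a * size + b
    if predicted_time <= 0:
        raise ValueError("model predicts a non-positive transfer time")
    return size / predicted_time


# Hypothetical timings for the 400 KB and 800 KB points of one probe.
A, B = fit_linear_time_model(400_000, 0.5, 800_000, 0.9)
throughput_10mb = predict_throughput(A, B, 10_000_000)
```

With these made-up timings, B captures the fixed cost (0.1 s) and 1/A the asymptotic rate (1 MB/s), so predicted throughput rises with flow size.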
20
Model Fit is Excellent
Low and Normally Distributed Relative Errors At
All Flow Sizes
Correlation Set
21
Stability

How long does the TCP throughput function remain
stable?
How frequently should we probe the path?
What's the distribution of throughput around the
function (i.e., the error)?

22
Throughput is Stable For Long Periods
Increasing Max/Min Throughput in Interval
Correlation Set
23
Throughput Is Normally Distributed In An Interval
Distribution Set
24
Online DualPats Algorithm

Fetch probe sequence for destination
Start probing process if no data exists
Project probe sequence ahead
20-point moving average over values with current
sampling interval
Apply model using projected data
Return result
Confidence interval computed using normality
assumptions
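
The projection and confidence-interval steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it projects a single series of past values with a moving average and builds a two-sided interval from the sample standard deviation under a normality assumption. The function name and the choice of a per-observation interval are assumptions.

```python
from statistics import NormalDist, mean, stdev


def project_and_interval(history, window=20, p=0.95):
    """Project the next value as the mean of the last `window` samples.

    Returns (projection, (low, high)), where the interval is the two-sided
    level-p interval for one observation, assuming normally distributed values.
    """
    recent = list(history)[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two samples")
    m = mean(recent)
    s = stdev(recent)
    z = NormalDist().inv_cdf(0.5 + p / 2)  # e.g. about 1.96 for p = 0.95
    return m, (m - z * s, m + z * s)
```

A caller would feed the projected value into the fitted model and report the interval alongside the prediction.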

25
Dynamic Sampling Rate

Adjust sampling interval to correspond to the
path's stable intervals
Limit rate (20 to 1200 seconds)
Additive increase / additive decrease of the interval,
based on the difference between the last two probes
< 5% => increase interval
> 15% => decrease interval
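
The adjustment rule can be sketched as below. The 20 to 1200 second limits come from the slide; reading the thresholds as 5% and 15% relative change is an interpretation of the slide text, and the additive step size `step_s` is an assumption, since the slide does not give one.

```python
MIN_INTERVAL_S = 20.0
MAX_INTERVAL_S = 1200.0


def adjust_interval(interval_s, prev_probe, last_probe, step_s=20.0):
    """Additively lengthen or shorten the sampling interval.

    prev_probe and last_probe are the last two probe results (same units).
    step_s is a hypothetical step size, not a value from the slides.
    """
    if prev_probe <= 0:
        raise ValueError("previous probe value must be positive")
    change = abs(last_probe - prev_probe) / prev_probe
    if change < 0.05:        # path looks stable: probe less often
        interval_s += step_s
    elif change > 0.15:      # path is changing: probe more often
        interval_s -= step_s
    return min(MAX_INTERVAL_S, max(MIN_INTERVAL_S, interval_s))
```

Changes between the two thresholds leave the interval untouched, which avoids oscillating on moderate noise.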

26
Finding Sufficiently Large Probe Size

Default values: 400 KB / 800 KB
Upper bound
Additive increase until prediction errors are less
than a threshold, all with the same sign.
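
One way to read the search above, as a sketch only. The 400 KB starting size is from the slide; the step, upper bound, and threshold values are placeholders, and `measure_errors` is a hypothetical callback that returns the relative prediction errors observed with a given probe size.

```python
from typing import Callable, Sequence


def errors_acceptable(errors: Sequence[float], threshold: float) -> bool:
    """True if every error is below threshold in magnitude and all share a sign."""
    if not errors:
        return False
    small = all(abs(e) < threshold for e in errors)
    same_sign = all(e >= 0 for e in errors) or all(e <= 0 for e in errors)
    return small and same_sign


def find_probe_size(measure_errors: Callable[[int], Sequence[float]],
                    start: int = 400_000, step: int = 100_000,
                    upper: int = 4_000_000, threshold: float = 0.2) -> int:
    """Additively grow the first-probe size until the errors are acceptable.

    Falls back to the upper bound if no size in range qualifies.
    """
    size = start
    while size <= upper:
        if errors_acceptable(measure_errors(size), threshold):
            return size
        size += step
    return upper
```

The second probe point would then sit at twice the returned size, mirroring the 400 KB / 800 KB default.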

27
Evaluation

Slight conservative bias
> 90% of predictions have < 35% error

[Figure: P[mean error < X] (0 to 1) versus relative error (-0.4 to 0.4), with curves for mean relative error and mean abs(relative error); Online Evaluation Set]
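
The error measures named in the figure can be computed as sketched below. The sign convention (predicted minus actual, divided by actual) is an assumption; under it, a conservative bias shows up as a negative mean relative error. The function names are hypothetical.

```python
def relative_errors(predicted, actual):
    """Per-transfer relative error: (predicted - actual) / actual."""
    return [(p - a) / a for p, a in zip(predicted, actual)]


def summarize(errors, x=0.35):
    """Mean relative error, mean abs(relative error), and the fraction
    of predictions whose absolute relative error is below x."""
    n = len(errors)
    if n == 0:
        raise ValueError("no errors to summarize")
    return {
        "mean_rel": sum(errors) / n,
        "mean_abs": sum(abs(e) for e in errors) / n,
        "frac_within": sum(1 for e in errors if abs(e) < x) / n,
    }


# Made-up throughput values, for illustration only.
stats = summarize(relative_errors([9.0, 11.0, 5.0, 10.0], [10.0, 10.0, 10.0, 10.0]))
```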
28
Conclusions

Algorithm for predicting TCP throughput as a
function of flow size
Minimal active probing
Dynamic probe rate adjustment
Explaining flow size / throughput correlation
Explaining why simple active probing fails
Large scale empirical study

29
For More Info

Prescience Lab
http://plab.cs.northwestern.edu
Aqua Lab
http://aqualab.cs.northwestern.edu
D. Lu, Y. Qiao, P. Dinda, and F. Bustamante,
Modeling and Taming Parallel TCP on the Wide Area
Network, IPDPS 2005.
Y. Qiao, J. Skicewicz, and P. Dinda, An Empirical
Study of the Multiscale Predictability of Network
Traffic, HPDC 2004.