Characterizing and Predicting TCP Throughput on the Wide Area Network

1
Characterizing and Predicting TCP Throughput on
the Wide Area Network
  • Dong Lu, Yi Qiao,
  • Peter Dinda, Fabian Bustamante
  • Department of Computer Science
  • Northwestern University
  • http://plab.cs.northwestern.edu

2
Overview
  • Algorithm for predicting TCP throughput as a
    function of flow size
  • Minimal active probing
  • Dynamic probe rate adjustment
  • Explaining flow size / throughput correlation
  • Explaining why simple active probing fails
  • Large scale empirical study

3
Outline
  • Why TCP throughput prediction?
  • Particulars of study
  • Flow size / TCP throughput correlation
  • Issues with simple benchmarking
  • DualPats algorithm
  • Stability and dynamic rate adjustment

4
Goal
  • A library call
  • BW = PredictTransfer(src, dst, numbytes)
  • Expected time = numbytes / BW
  • Ideally, we want a confidence interval
  • (BWLow, BWHigh) = PredictTransfer(src, dst,
    numbytes, p)
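
As a rough illustration of how a caller might use such a call, the sketch below turns a predicted bandwidth, or a (low, high) bandwidth interval, into an expected transfer time. The function names and the units (bytes and bytes per second) are assumptions made for this example; the slides only name PredictTransfer.

```python
from typing import Tuple


def expected_time(numbytes: float, bw: float) -> float:
    """Expected transfer time in seconds for numbytes at bw bytes/second."""
    if bw <= 0:
        raise ValueError("bandwidth must be positive")
    return numbytes / bw


def expected_time_interval(
    numbytes: float, bw_low: float, bw_high: float
) -> Tuple[float, float]:
    """Map a bandwidth interval to a (shortest, longest) time interval.

    The high bandwidth bound gives the shortest time, and the low
    bound gives the longest.
    """
    return expected_time(numbytes, bw_high), expected_time(numbytes, bw_low)
```

Note that the bounds swap roles: the upper bandwidth bound yields the lower bound on transfer time.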

5
Available Bandwidth
  • Maximum rate a path can offer a flow without
    slowing other flows
  • pathchar, cprobe, nettimer, delphi, IGI,
    pathchirp, pathload
  • Available bandwidth can differ significantly from
    TCP throughput
  • Not real time, takes at least tens of seconds to
    run

6
Simple TCP Benchmarking
  • Benchmark paths with a single small probe
  • BW = ProbeSize / Time
  • Widely used: Network Weather Service (NWS) and
    others (Remos benchmarking collector)
  • Not accurate for large transfers on the current
    high speed Internet
  • Numerous papers show this and attempt to fix it

7
Fixing Simple TCP Benchmarking
  • Logs [Sundharshan]: correlate real transfer
    measurements with benchmarking measurements
  • Recent transfers needed
  • Similar size transfers needed
  • Measurements at application chosen times
  • CDF-matching [Swany]: correlate the CDF of real
    transfer measurements with the CDF of benchmarking
    measurements
  • Recent transfers still needed
  • Measurements at application chosen times

8
Analysis of TCP
  • Extensive research on TCP throughput modeling in
    networking community
  • Really intended to build better TCPs
  • Difficult to use models online because of hard to
    measure parameters
  • Future loss rate and RTT
  • Note: we measure goodput

9
Our Measurement Study
  • PlanetLab and additional machines
  • Located all over the world
  • Measurements of throughput
  • Wide open socket buffers (1-3 MB)
  • Simple ttcp-like client/server
  • scp
  • GridFTP
  • Four separate sets of measurements

10
Distribution Set
  • For analysis of TCP throughput stability and
    distributions
  • 60 randomly chosen paths among PlanetLab machines
  • 1.6 million transfers (client/server)
  • 100 KB, 200 KB, 400 KB, 10 MB flows
  • 3000 consecutive transfers per path and flow size

11
Correlation Set
  • For studying correlation between throughput and
    flow size, initial testing of algorithm
  • 60 randomly chosen paths among PlanetLab machines
  • 2.4 million transfers, 270 thousand runs,
    client/server
  • 100 KB, 200 KB, 400 KB, 10 MB flows
  • Run = sweep of flow sizes for a path

12
Verification Set
  • Test algorithm
  • 30 randomly chosen paths among PlanetLab machines
    and others
  • 4800 transfers, 300 runs, scp and GridFTP
  • 5 KB to 1 GB flows
  • Run = sweep of flow sizes for a path

13
Online Evaluation Set
  • Test online algorithm
  • 50 randomly chosen paths among PlanetLab machines
    and others
  • 14000 transfers, scp and GridFTP
  • 40 MB or 160 MB file, randomly chosen size
  • 10 days

14
Strong Correlation Between TCP Throughput and
Flow Size
Correlation and Verification Sets
15
Why Does The Correlation Exist?
  • Slow start and user effects [Zhang]
  • Extant flows
  • Non-negligible startup overheads
  • Control messages in scp and GridFTP
  • Residual slow start effect
  • SACK results in slow convergence to equilibrium

16
Why Simple Benchmarking Fails
Need more than one probe to capture correlation
Probes are too small
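
A small numeric illustration of why a single small probe misleads. It assumes, which this slide does not state, that transfer time grows linearly with flow size, time = A * size + B, where B is a fixed startup cost. The values of A and B are invented for the example.

```python
A = 1e-6  # assumed seconds per byte in steady state (about 1 MB/s)
B = 0.5   # assumed fixed startup overhead in seconds


def throughput(size_bytes: float, a: float = A, b: float = B) -> float:
    """Throughput in bytes/second under the assumed linear time model."""
    return size_bytes / (a * size_bytes + b)


probe = throughput(64_000)        # single small benchmark probe
large = throughput(100_000_000)   # large transfer on the same path
print(f"small probe: {probe:,.0f} B/s, large transfer: {large:,.0f} B/s")
```

Under these assumed numbers the small probe is dominated by the startup cost and reports a throughput several times lower than what the large transfer achieves, so one probe size cannot stand in for all flow sizes.
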
17
Our Approach
Two consecutive probes, both larger than the
noise region
18
Our Approach
  • Two consecutive probes are integrated into a
    single probe
  • 400 KB and 800 KB in a single 800 KB probe

[Figure: timeline marked 0, T1, and T2, with labels "Probe one" and "Probe two"]
19
Our Approach
[Figure: transfer time plotted against flow size]
  • Solve for A and B
  • Predict throughput for some other transfer
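
The fit can be sketched as follows, assuming a linear relationship between transfer time and flow size, time = A * size + B. That form is an assumption consistent with plotting transfer time against flow size; the slides do not spell out the formula. The function names and the sample timings are hypothetical.

```python
def fit_two_probes(s1: float, t1: float, s2: float, t2: float):
    """Solve time = A * size + B from two (size, time) measurements.

    With one 800 KB probe, (s1, t1) would be the point where 400 KB
    has arrived and (s2, t2) the point where all 800 KB has arrived.
    """
    if s1 == s2:
        raise ValueError("the two probe sizes must differ")
    a = (t2 - t1) / (s2 - s1)
    b = t1 - a * s1
    return a, b


def predict_throughput(size: float, a: float, b: float) -> float:
    """Predicted throughput (bytes/second) for a transfer of `size` bytes."""
    predicted_time = a * size + b
    if predicted_time <= 0:
        raise ValueError("model predicts a non-positive transfer time")
    return size / predicted_time


# Hypothetical timings: 400 KB after 0.9 s, 800 KB after 1.3 s.
a, b = fit_two_probes(400_000, 0.9, 800_000, 1.3)
bw_10mb = predict_throughput(10_000_000, a, b)
```

In this reading, B captures the startup cost and 1/A is the throughput that large flows approach, which is why two points taken above the noise region are enough to extrapolate to other sizes.
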
20
Model Fit is Excellent
Low and Normally Distributed Relative Errors At
All Flow Sizes
Correlation Set
21
Stability
  • How long does the TCP throughput function remain
    stable?
  • How frequently should we probe the path?
  • What's the distribution of throughput around the
    function (i.e., the error)?

22
Throughput is Stable For Long Periods
Increasing Max/Min Throughput in Interval
Correlation Set
23
Throughput Is Normally Distributed In An Interval
Distribution Set
24
Online DualPats Algorithm
  • Fetch probe sequence for destination
  • Start probing process if no data exists
  • Project probe sequence ahead
  • 20-point moving average over values with current
    sampling interval
  • Apply model using projected data
  • Return result
  • Confidence interval computed using normality
    assumptions
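
A minimal sketch of the projection and interval steps, assuming the projection is the mean of the most recent 20 samples and the interval is centered on it using the sample standard deviation. The exact interval construction is not given in the slides, so this is one plausible reading, and the function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def project_with_interval(samples, p=0.95, window=20):
    """Return (projection, (low, high)) from the last `window` samples.

    The projection is a moving average; the interval assumes the
    samples are normally distributed around it.
    """
    recent = samples[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two samples")
    center = mean(recent)
    spread = stdev(recent)
    # Two-sided z value for confidence level p, e.g. about 1.96 for 0.95.
    z = NormalDist().inv_cdf(0.5 + p / 2.0)
    return center, (center - z * spread, center + z * spread)
```

The projected value would then be fed to the fitted model in place of a fresh probe measurement.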

25
Dynamic Sampling Rate
  • Adjust sampling interval to correspond to the
    path's stable intervals
  • Limit rate (20 to 1200 seconds)
  • Additive increase / additive decrease based on the
    difference between the last two probes
  • < 5%: increase interval
  • > 15%: decrease interval
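
The adjustment rule can be sketched as below. The 20 to 1200 second limits and the 5% and 15% thresholds come from the slide; the step size, and measuring the difference relative to the earlier probe, are assumptions for the example.

```python
MIN_INTERVAL = 20.0    # seconds, lower limit from the slide
MAX_INTERVAL = 1200.0  # seconds, upper limit from the slide


def adjust_interval(interval, prev_bw, last_bw, step=20.0, low=0.05, high=0.15):
    """Additively lengthen or shorten the sampling interval.

    `step` (seconds) is an assumed value; the slides do not give one.
    """
    if prev_bw <= 0:
        raise ValueError("previous throughput must be positive")
    change = abs(last_bw - prev_bw) / prev_bw
    if change < low:
        interval += step   # path looks stable: probe less often
    elif change > high:
        interval -= step   # path is changing: probe more often
    return min(MAX_INTERVAL, max(MIN_INTERVAL, interval))
```

Changes between the two thresholds leave the interval as it is.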

26
Finding Sufficiently Large Probe Size
  • Default values 400 KB / 800 KB
  • Upper bound
  • Additive increase until prediction errors are less
    than the threshold, all with the same sign

27
Evaluation
  • Slight conservative bias
  • > 90% of predictions have < 35% error

[Figure: P(mean error < X) against relative error (axis from -0.4 to 0.4), with curves for mean relative error and mean abs(relative error); Online Evaluation Set]
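
The summary statistics named on this slide can be computed as in the sketch below. The sign convention, (predicted - actual) / actual, is an assumption; under it a conservative bias shows up as a negative mean relative error. The function names are hypothetical.

```python
def relative_error(predicted: float, actual: float) -> float:
    """Signed relative error; negative means the prediction was too low."""
    return (predicted - actual) / actual


def summarize(pairs, bound=0.35):
    """Summarize (predicted, actual) pairs of throughput values."""
    errors = [relative_error(p, a) for p, a in pairs]
    n = len(errors)
    return {
        "mean_rel": sum(errors) / n,
        "mean_abs_rel": sum(abs(e) for e in errors) / n,
        "frac_within": sum(1 for e in errors if abs(e) < bound) / n,
    }
```

Here `frac_within` is the fraction of predictions whose absolute relative error falls below the bound, the quantity the 90% figure refers to.
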
28
Conclusions
  • Algorithm for predicting TCP throughput as a
    function of flow size
  • Minimal active probing
  • Dynamic probe rate adjustment
  • Explaining flow size / throughput correlation
  • Explaining why simple active probing fails
  • Large scale empirical study

29
For More Info
  • Prescience Lab
  • http://plab.cs.northwestern.edu
  • Aqua Lab
  • http://aqualab.cs.northwestern.edu
  • D. Lu, Y. Qiao, P. Dinda, and F. Bustamante,
    Modeling and Taming Parallel TCP on the Wide Area
    Network, IPDPS 2005.
  • Y. Qiao, J. Skicewicz, and P. Dinda, An Empirical
    Study of the Multiscale Predictability of Network
    Traffic, HPDC 2004.