Title: Server-based Characterization and Inference of Internet Performance
1. Server-based Characterization and Inference of Internet Performance
- Venkat Padmanabhan
- Lili Qiu
- Helen Wang
- Microsoft Research
- UCLA/IPAM Workshop
- March 2002
2. Outline
- Overview
- Server-based characterization of performance
- Server-based inference of performance
- Passive Network Tomography
- Summary and future work
3. Overview
- Goals
- characterize end-to-end performance
- infer characteristics of interior links
- Approach: server-based monitoring
- passive monitoring → relatively inexpensive
- enables large-scale measurements
- diversity of network paths
4. [Figure: a Web server sends DATA packets to many clients and receives their ACKs]
5. Research Questions
- Server-based characterization of end-to-end performance
- correlation with topological metrics
- spatial locality
- temporal stability
- Server-based inference of internal link characteristics
- identification of lossy links
6. Related Work
- Server-based passive measurement
- 1996 Olympics Web server study (Berkeley, 1997, 1998)
- characterization of TCP properties (Allman 2000)
- Active measurement
- NPD (Paxson 1997)
- stationarity of Internet path properties (Zhang et al. 2001)
7. Experiment Setting
- Packet sniffer at microsoft.com
- 550 MHz Pentium III
- sits on spanning port of Cisco Catalyst 6509
- packet drop rate < 0.3%
- traces up to 2 hours long, 20-125 million packets, 50-950K clients
- Traceroute source
- sits on a separate Microsoft network, but all external hops are shared
- infrequent and in the background
8. Topological Metrics and Loss Rate
Topological distance is a poor predictor of packet loss rate. All links are not equal → need to identify the lossy links.
9. Spatial Locality
- Do clients in the same cluster see similar loss rates?
- Loss rate is quantized into buckets
- 0-0.5%, 0.5-2%, 2-5%, 5-10%, 10-20%, >20%
- suggested by Zhang et al. (IMW 2002)
- Focus on lossy clusters
- average loss rate > 5%
Spatial locality → there may be a shared cause for packet loss
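The quantization above can be sketched as a simple lookup; the bucket edges are taken from the slide (in percent), and the function name is this sketch's own.

```python
# Loss-rate bucket edges from the slide, in percent; the last bucket is ">20%".
BUCKET_EDGES = [0.5, 2, 5, 10, 20]

def loss_bucket(loss_pct):
    """Map a loss rate (in percent) to its bucket index, 0..5."""
    for k, edge in enumerate(BUCKET_EDGES):
        if loss_pct <= edge:
            return k
    return len(BUCKET_EDGES)
```

For example, a cluster averaging 7% loss falls in the 5-10% bucket (index 3), and anything above 20% lands in the final open-ended bucket.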
10. Temporal Stability
- Loss rate again quantized into buckets
- Metric of interest: stability period (i.e., time until transition into a new bucket)
- Median stability period: 10 minutes
- Consistent with previous findings based on active measurements
11. Putting it all together
- All links are not equal → need to identify the lossy links
- Spatial locality of packet loss rate → lossy links may well be shared
- Temporal stability → worthwhile to try and identify the lossy links
12. Passive Network Tomography
- Goal: determine characteristics of internal network links using end-to-end, passive measurements
- We focus on the link loss rate metric
- primary goal: identifying lossy links
- Why is this interesting?
- locating trouble spots in the network
- keeping tabs on your ISP
- server placement and server selection
13. [Figure: a Web server wonders "Why is it so slow?" while an AOL client complains "Darn, it's slow!"; the path between them crosses multiple ISPs: AT&T, Sprint, C&W, Earthlink, UUNET, Qwest]
14. Related Work
- MINC (Caceres et al. 1999)
- multicast-based active probing
- Striped unicast (Duffield et al. 2001)
- unicast-based active probing
- Passive measurement (Coates et al. 2002)
- look for back-to-back packets
- Shared bottleneck detection
- Padmanabhan 1999, Rubenstein et al. 2000, Katabi et al. 2001
15. Active Network Tomography
[Figure: source S sends striped unicast probes and multicast probes toward receivers A and B]
16. Problem Formulation
[Figure: tree rooted at the server, with links l1-l8 leading down to clients that observe end-to-end loss rates p1-p5]
- Collapse linear chains into virtual links
- (1 - l1)(1 - l2)(1 - l4) = (1 - p1)
- (1 - l1)(1 - l2)(1 - l5) = (1 - p2)
- ...
- (1 - l1)(1 - l3)(1 - l8) = (1 - p5)
- Under-constrained system of equations
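To make the product form concrete, here is a small sketch using the slide's 8-link tree with hypothetical per-link loss rates; it evaluates the right-hand side of each path equation.

```python
# Hypothetical loss rates for the slide's links l1-l8 (this sketch's numbers).
link_loss = {1: 0.01, 2: 0.00, 3: 0.05, 4: 0.00,
             5: 0.02, 6: 0.00, 7: 0.00, 8: 0.10}

# Links traversed by each server-to-client path, per the slide's tree.
paths = {
    "p1": [1, 2, 4],
    "p2": [1, 2, 5],
    "p3": [1, 3, 6],
    "p4": [1, 3, 7],
    "p5": [1, 3, 8],
}

def path_loss(link_ids):
    """End-to-end loss rate: 1 minus the product of per-link success rates."""
    success = 1.0
    for i in link_ids:
        success *= 1.0 - link_loss[i]
    return 1.0 - success

observed = {name: path_loss(ids) for name, ids in paths.items()}
```

Five observed path loss rates constrain eight unknown link loss rates, which is why the system is under-constrained: many link-loss assignments reproduce the same observations.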
17. (1) Random Sampling
- Randomly sample the solution space
- Repeat this several times
- Draw conclusions based on overall statistics
- How to do random sampling?
- determine loss rate bound for each link using best downstream client
- iterate over all links
- pick loss rate at random within bounds
- update bounds for other links
- Problem: little tolerance for estimation error
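A minimal sketch of one sampling pass, assuming the slide's 8-link tree and hypothetical best-client success rates; the paper's exact bound bookkeeping may differ.

```python
import random

# parent[i] = the link immediately above link i (0 = the server root).
parent = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3, 8: 3}
# Hypothetical end-to-end success rate of the *best* client whose path
# crosses each link: the tightest passive bound on that link's loss rate.
best_success = {1: 0.99, 2: 0.99, 3: 0.94, 4: 0.99,
                5: 0.97, 6: 0.94, 7: 0.94, 8: 0.85}

def sample_once(rng):
    """Draw one random link-loss assignment consistent with the bounds."""
    loss = {0: 0.0}
    above = {0: 1.0}  # product of (1 - loss) over links above each link
    for i in sorted(parent):            # top-down: parents come first
        p = parent[i]
        above[i] = above[p] * (1.0 - loss[p])
        # link i cannot lose more than its best client's whole path does,
        # once the loss already assigned upstream is divided out
        bound = max(0.0, 1.0 - best_success[i] / above[i])
        loss[i] = rng.uniform(0.0, bound)
    del loss[0]
    return loss

# Repeat many times; flag links whose sampled loss rate is consistently high.
rng = random.Random(42)
samples = [sample_once(rng) for _ in range(500)]
avg = {i: sum(s[i] for s in samples) / len(samples) for i in parent}
lossy = sorted(i for i, r in avg.items() if r > 0.02)
```

With these (made-up) observations, links 3 and 8 come out as the likely lossy ones; the slide's caveat applies, though, since any error in the end-to-end estimates shifts the bounds directly.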
18. (2) Linear Optimization
- Goals
- Parsimonious explanation
- Robust to estimation error
- Li = log(1/(1-li)), Pj = log(1/(1-pj))
- minimize Σ Li + Σ |Sj|
- L1 + L2 + L4 + S1 = P1
- L1 + L2 + L5 + S2 = P2
- ...
- L1 + L3 + L8 + S5 = P5
- Li ≥ 0
- Can be turned into a linear program
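A sketch of how this LP could be assembled, assuming the slide's five paths and hypothetical measured path loss rates. The |Sj| terms are handled with the standard split into nonnegative parts, and scipy's generic `linprog` stands in for whatever solver the authors used.

```python
import math
from scipy.optimize import linprog  # assumption of this sketch; any LP solver works

# Links on each server-to-client path (the slide's tree) and hypothetical
# measured end-to-end loss rates.
paths = {"p1": [1, 2, 4], "p2": [1, 2, 5], "p3": [1, 3, 6],
         "p4": [1, 3, 7], "p5": [1, 3, 8]}
measured = {"p1": 0.01, "p2": 0.03, "p3": 0.06, "p4": 0.06, "p5": 0.15}

n_links, n_paths = 8, len(paths)
w = 1.0  # relative weight on the slack (error) terms

# Variables: [L1..L8, S1+..S5+, S1-..S5-], all >= 0, with Sj = Sj+ - Sj-.
c = [1.0] * n_links + [w] * (2 * n_paths)   # minimize sum(L) + w*sum(|S|)
A_eq, b_eq = [], []
for j, (name, link_ids) in enumerate(paths.items()):
    row = [0.0] * len(c)
    for i in link_ids:
        row[i - 1] = 1.0                    # Li terms on this path
    row[n_links + j] = 1.0                  # Sj+
    row[n_links + n_paths + j] = -1.0       # Sj-
    A_eq.append(row)
    b_eq.append(math.log(1.0 / (1.0 - measured[name])))  # Pj

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(c))
est_loss = [1.0 - math.exp(-L) for L in res.x[:n_links]]  # back to loss rates
```

The log transform is what makes this linear: products of per-link success rates become sums of the Li, and the slack Sj absorbs measurement error instead of forcing it into the link estimates.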
19. (3) Bayesian Inference
- Basics
- D = observed data
- sj = packets successfully sent to client j
- fj = packets that client j fails to receive
- T = unknown model parameters
- li = packet loss rate of link i
- Goal: determine the posterior P(T|D)
- inference is based on loss events, not loss rates
- Bayes' theorem
- P(T|D) = P(D|T)P(T) / ∫ P(D|T)P(T) dT
- hard to compute since T is multidimensional
[Figure: tree rooted at the server, with links l1-l8 leading down to clients; client j observes (sj, fj)]
20. Gibbs Sampling
- Markov Chain Monte Carlo (MCMC)
- construct a Markov chain whose stationary distribution is P(T|D)
- Gibbs sampling defines the transition kernel
- start with an arbitrary initial assignment of li
- consider each link i in turn
- compute P(li|D) assuming lj is fixed for j ≠ i
- draw a sample from P(li|D) and update li
- after a burn-in period, we obtain samples from the posterior P(T|D)
21. Gibbs Sampling Algorithm
- 1) Initialize link loss rates arbitrarily
- 2) For j = 1 ... burnIn: for each link i, compute P(li | D, l_{-i}), where li is the loss rate of link i and l_{-i} = {lj : j ≠ i}
- 3) For j = 1 ... realSamples: for each link i, compute P(li | D, l_{-i})
- Use all the samples obtained at step 3 to approximate P(T|D)
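The two loops above can be sketched as follows. The conditional P(li | D, l_{-i}) has no convenient closed form here, so this sketch evaluates it on a discrete grid of candidate loss rates ("griddy" Gibbs), an illustrative shortcut rather than the authors' exact sampler; the topology and (sj, fj) counts are hypothetical.

```python
import math
import random

# Links on the path to each client (the slide's tree), with hypothetical
# (s_j, f_j) = (packets received, packets lost) counts per client.
paths = {"c1": [1, 2, 4], "c2": [1, 2, 5], "c3": [1, 3, 6],
         "c4": [1, 3, 7], "c5": [1, 3, 8]}
data = {"c1": (990, 10), "c2": (970, 30), "c3": (940, 60),
        "c4": (940, 60), "c5": (850, 150)}

GRID = [g / 100.0 for g in range(0, 51)]  # candidate loss rates 0..50%

def log_likelihood(loss):
    """log P(D | loss rates) under independent per-link packet loss."""
    ll = 0.0
    for client, link_ids in paths.items():
        q = 1.0                              # path success probability
        for i in link_ids:
            q *= 1.0 - loss[i]
        s, f = data[client]
        ll += s * math.log(max(q, 1e-12)) + f * math.log(max(1.0 - q, 1e-12))
    return ll

def gibbs(n_iter=200, burn_in=100, rng=random.Random(7)):
    loss = {i: 0.0 for i in range(1, 9)}     # arbitrary initialization
    samples = []
    for it in range(n_iter):
        for i in loss:                       # resample each link in turn
            weights = []
            for g in GRID:                   # P(li | D, l_{-i}) on the grid
                loss[i] = g
                weights.append(log_likelihood(loss))
            m = max(weights)
            probs = [math.exp(v - m) for v in weights]
            loss[i] = rng.choices(GRID, weights=probs)[0]
        if it >= burn_in:                    # keep only post-burn-in samples
            samples.append(dict(loss))
    return samples

samples = gibbs()
post_mean = {i: sum(s[i] for s in samples) / len(samples) for i in range(1, 9)}
```

Because the inference is driven by loss events (the raw s and f counts) rather than point estimates of loss rates, clients with more traffic pull the posterior harder, which matches the slide's remark.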
22. Experimental Evaluation
- Simulation experiments
- Internet traffic traces
23. Simulation Experiments
- Advantage: no uncertainty about link loss rates
- Methodology
- Topologies used
- randomly generated: 20-3000 nodes, max degree 5-50
- real topology obtained by tracing paths to microsoft.com clients
- Randomly generated packet loss events at each link
- a fraction f of the links are good, and the rest are bad
- LM1: good links 0-1%, bad links 5-10%
- LM2: good links 0-1%, bad links 1-100%
- Goodness metrics
- Coverage: correctly inferred lossy links
- False positives: incorrectly inferred lossy links
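The loss models and goodness metrics above can be sketched as follows (a toy version; the link count and the fraction f are hypothetical parameters of this sketch):

```python
import random

def assign_loss_rates(n_links, f_good, model, rng):
    """LM1/LM2 from the slide: good links lose 0-1%; bad links lose
    5-10% under LM1 or 1-100% under LM2."""
    rates = []
    for _ in range(n_links):
        if rng.random() < f_good:
            rates.append(rng.uniform(0.00, 0.01))
        elif model == "LM1":
            rates.append(rng.uniform(0.05, 0.10))
        else:  # LM2
            rates.append(rng.uniform(0.01, 1.00))
    return rates

def goodness(true_lossy, inferred_lossy):
    """Coverage: fraction of truly lossy links that were inferred.
    False positives: fraction of inferred links that are not lossy."""
    true_lossy, inferred_lossy = set(true_lossy), set(inferred_lossy)
    coverage = len(true_lossy & inferred_lossy) / max(len(true_lossy), 1)
    false_pos = len(inferred_lossy - true_lossy) / max(len(inferred_lossy), 1)
    return coverage, false_pos

rates = assign_loss_rates(1000, f_good=0.95, model="LM1",
                          rng=random.Random(0))
```

Simulating against known loss rates is what makes these metrics computable at all: with real traces there is no ground truth, which is exactly the validation challenge the later slides address.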
24.-26. Simulation Results
[Figures: coverage and false positives across topologies and loss models]
- High confidence in top few inferences
27. Trade-off

Technique         Coverage  False positives  Computation
Random sampling   High      High             Low
LP                Medium    Low              Medium
Gibbs sampling    High      Low              High
28. Internet Traffic Traces
- Challenge: validation
- Divide client traces into two: a tomography set and a validation set
- Tomography data set → loss inference
- Validation set → check if clients downstream of the inferred lossy links experience high loss
- Results
- false positive rate is between 5% and 30%
- likely candidates for lossy links:
- links crossing an inter-AS boundary
- links having a large delay (e.g., transcontinental links)
- links that terminate at clients
- example lossy links:
- San Francisco (AT&T) → Indonesia (Indo.net)
- Sprint → PacBell in California
- Moscow → Tyumen, Siberia (Sovam Teleport)
29. Summary
- Poor correlation between topological metrics and performance
- Significant spatial locality and temporal stability
- Passive network tomography is feasible
- Trade-off between computational cost and accuracy
- Future directions
- real-time inference
- selective active probing
- Acknowledgements
- MSR: Dimitris Achlioptas, Christian Borgs, Jennifer Chayes, David Heckerman, Chris Meek, David Wilson
- Infrastructure: Rob Emanuel, Scott Hogan
- http://www.research.microsoft.com/padmanab