Transcript and Presenter's Notes

Title: Tomography-based Overlay Network Monitoring


1
Tomography-based Overlay Network Monitoring
Yan Chen, David Bindel, and Randy H. Katz
  • UC Berkeley

2
Motivation
  • Infrastructure ossification has led to a surge of
    overlay and P2P applications
  • Such applications are flexible in choosing paths
    and targets, and thus can benefit from E2E
    distance monitoring
  • Overlay routing/location
  • VPN management/provisioning
  • Service redirection/placement
  • Requirements for an E2E monitoring system
  • Scalable and efficient: small amount of probing
    traffic
  • Accurate: captures congestion/failures
  • Incrementally deployable
  • Easy to use

3
Existing Work
  • General metrics: RON (O(n²) measurements)
  • Latency estimation
  • Clustering-based: IDMaps, Internet Isobar, etc.
  • Coordinate-based: GNP, ICS, Virtual Landmarks
  • Network tomography
  • Focuses on inferring the characteristics of
    physical links rather than E2E paths
  • Limited measurements -> under-constrained system,
    unidentifiable links

4
Problem Formulation
  • Given an overlay of n end hosts and O(n²) paths,
    how can we select a minimal subset of paths to
    monitor so that the loss rates/latency of all
    other paths can be inferred?
  • Assumptions
  • The topology is measurable
  • We can only measure E2E paths, not individual links

5
Our Approach
  • Select a basis set of k paths that fully describes
    all O(n²) paths (k << O(n²))
  • Monitor the loss rates of the k basis paths, and
    infer the loss rates of all other paths
  • Applicable to any additive metric, such as latency

6
Modeling of Path Space
[Figure: example overlay with end hosts A, B, C, D connected by links 1, 2, 3]
  • Path loss rate p, link loss rate l (see the model
    below)
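(The equations behind this slide are lost in the transcript; the reconstruction below shows the standard way loss rates are made additive, using the b, x, and G notation of the later slides.)

```latex
% Path i with loss rate p_i traverses links j with loss rates l_j:
1 - p_i \;=\; \prod_{j \in \text{path } i} (1 - l_j)
% Taking logs turns the product into a sum, so loss becomes an additive metric:
b_i \;:=\; -\log(1 - p_i) \;=\; \sum_{j} G_{ij}\, x_j,
\qquad x_j := -\log(1 - l_j)
% where G_{ij} = 1 if path i traverses link j and 0 otherwise;
% stacking all r paths gives the linear system G x = b.
```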

7
Putting All Paths Together
[Figure: the same topology with all overlay paths drawn together]
In total: r = O(n²) paths over s links (s << r)


8
Sample Path Matrix
  • x1 - x2 unknown => cannot compute x1, x2 (see the
    example after this list)
  • The set of vectors v with Gv = 0 forms the null
    space of G
  • To separate identifiable vs. unidentifiable
    components: x = xG + xN
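(The matrix shown in the original figure is not recoverable from the transcript; the small example below is an illustrative stand-in consistent with the first bullet, not the slide's own matrix.)

```latex
% Illustrative example: path 1 uses links 1 and 2, path 2 uses link 3.
G = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
G x = \begin{pmatrix} x_1 + x_2 \\ x_3 \end{pmatrix} = b
% x_1 + x_2 and x_3 are identifiable, but x_1 - x_2 is not:
% the null space of G is spanned by (1, -1, 0)^T, and
x = x_G + x_N,
\qquad x_G \in \operatorname{row}(G), \quad x_N \in \operatorname{null}(G), \quad G x_N = 0
```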

9
Intuition through Topology Virtualization
  • Virtual links
  • Minimal path segments whose loss rates are
    uniquely identifiable
  • Can fully describe all paths
  • xG is composed of virtual links

All E2E paths are in the path space, i.e., GxN = 0
10
More Examples
Virtualization
[Figure: real links (solid) and all of the overlay paths (dotted) traversing them, with the corresponding virtual links]
11
Algorithms
  • Select k = rank(G) linearly independent paths to
    monitor
  • Use QR decomposition (see the sketch after this
    list)
  • Leverage sparse matrices: time O(rk²) and memory
    O(k²)
  • E.g., 10 minutes for n = 350 (r = 61,075) and
    k = 2,958
  • Compute the loss rates of the other paths
  • Time O(k²) and memory O(k²)
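(The slides contain no code; the sketch below is one possible dense NumPy/SciPy realization of the two steps, using QR with column pivoting on Gᵀ and a least-squares solve. The real system relies on sparse, incremental routines for the quoted time and memory bounds.)

```python
import numpy as np
from scipy.linalg import qr, lstsq

def select_basis_paths(G, tol=1e-10):
    """Pick k = rank(G) linearly independent rows (paths) of the r x s
    path matrix G via QR with column pivoting on G^T."""
    _, R, piv = qr(G.T, mode='economic', pivoting=True)
    diag = np.abs(np.diag(R))
    k = int(np.sum(diag > tol * diag[0]))     # numerical rank of G
    return piv[:k]                            # indices of the paths to monitor

def infer_all_paths(G, basis_idx, b_basis):
    """Given measured additive metrics b_basis on the k basis paths,
    recover the identifiable component x_G and predict every path."""
    x_G, *_ = lstsq(G[basis_idx], b_basis)    # minimum-norm solution lies in row(G)
    return G @ x_G                            # inferred metric for all r paths

if __name__ == "__main__":
    # Toy example: 4 paths over 3 links, rank 3, so only 3 paths need probing.
    G = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]], dtype=float)
    x_true = np.array([0.010, 0.051, 0.002])  # per-link -log(1 - loss rate)
    basis = select_basis_paths(G)
    b_all = infer_all_paths(G, basis, G[basis] @ x_true)
    print(basis, np.allclose(b_all, G @ x_true))   # inferred == true for all paths
```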




12
How many measurements are saved?
  • Is k << O(n²)?
  • For a power-law Internet topology
  • When the majority of end hosts are on the overlay:
    k = O(n) (with proof)
  • When only a small portion of end hosts are on the
    overlay: for reasonably large n (e.g., 100),
    k = O(n log n) (extensive linear regression tests
    on both synthetic and real topologies)
  • If the Internet were a pure hierarchical structure
    (a tree): k = O(n)
  • If the Internet had no hierarchy at all (worst
    case, a clique): k = O(n²)
  • The Internet has a moderate hierarchical structure
    [TGJ02]
13
Practical Issues
  • Tolerance of topology measurement errors
  • Measurement load balancing on end hosts
  • Randomized algorithm
  • Adaptive to topology changes
  • Adding/removing end hosts and routing changes
  • Efficient algorithms for incrementally updating the
    selected paths

14
Evaluation
  • Extensive simulations
  • Experiments on PlanetLab
  • 51 hosts, each from a different organization
  • 51 × 50 = 2,550 paths
  • On average k = 872
  • Results highlights
  • Avg real loss rate: 0.023
  • Absolute error: mean 0.0027, 90% < 0.014
  • Relative error: mean 1.1, 90% < 2.0
  • On average 248 out of 2,550 paths have no or
    incomplete routing information
  • No router aliases resolved

15
Conclusions
  • A tomography-based overlay network monitoring
    system
  • Given n end hosts, characterize all O(n²) paths
    with a basis set of O(n log n) paths
  • Selectively monitor the basis set for its loss
    rates, then infer the loss rates of all other
    paths
  • Both simulations and PlanetLab experiments show
    promising results

16
Backup Slides
17
Problem Formulation
  • Given an overlay of n end hosts and O(n²) paths,
    how can we select a minimal subset of paths to
    monitor so that the loss rates/latency of all
    other paths can be inferred?
  • Key idea: based on the topology, select a basis
    set of k paths that fully describes all O(n²)
    paths (k << O(n²))
  • Monitor the loss rates of the k basis paths, and
    infer the loss rates of all other paths
  • Applicable to any additive metric, such as latency

18
Modeling of Path Space
[Figure: example overlay with end hosts A, B, C, D connected by links 1, 2, 3]
  • Path loss rate p, link loss rate l

Put all r = O(n²) paths together; s links in total
19
Sample Path Matrix
  • x1 - x2 unknown => cannot compute x1, x2
  • The set of vectors v with Gv = 0 forms the null
    space of G
  • To separate identifiable vs. unidentifiable
    components: x = xG + xN
  • All E2E paths are in the path space, i.e., GxN = 0

20
More Examples
Virtualization
[Figure: real links (solid) and all of the overlay paths (dotted) traversing them, with the corresponding virtual links]
21
Linear Regression Tests of the Hypothesis
  • BRITE router-level topologies
  • Barabasi-Albert, Waxman, and hierarchical models
  • Mercator real topology
  • Most fit best with O(n), except the hierarchical
    ones, which fit best with O(n log n)

22
Algorithms

  • Select k = rank(G) linearly independent paths to
    monitor
  • Use a rank-revealing decomposition
  • Leverage sparse matrices: time O(rk²) and memory
    O(k²)
  • E.g., 10 minutes for n = 350 (r = 61,075) and
    k = 2,958
  • Compute the loss rates of the other paths
  • Time O(k²) and memory O(k²)

23
Practical Issues
  • Tolerance of topology measurement errors
  • We care about path loss rates more than those of
    any interior links
  • Poor router alias resolution
  • => similar loss rates are assigned to the same
    underlying links
  • Unidentifiable routers
  • => add virtual links to bypass them
  • Measurement load balancing on end hosts
  • Randomly order the paths for the scan and
    selection of the basis set
  • Topology changes
  • Efficient algorithms for incrementally updating the
    selected paths when adding/removing end hosts or
    under routing changes

24
Work in Progress
  • Provide it as a continuous service on PlanetLab
  • Network diagnostics
  • Which links or path segments are down
  • Iterative methods for better speed and scalability

25
Topology Changes
  • Basic building block: add/remove one path
  • Incremental changes: O(k²) time (vs. O(n²k²) for a
    full re-scan)
  • Add path: check linear dependency against the old
    basis set (see the sketch after this list)
  • Delete path p: hard when essential info is
    described only by p
  • Add/remove end hosts, routing changes
  • Topology is relatively stable on the order of a day
  • => incremental detection
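(A minimal sketch of the "add path" check, assuming the monitored basis is kept as a dense matrix; an O(k²) incremental update would maintain a factorization instead of re-solving a least-squares problem each time.)

```python
import numpy as np

def try_add_path(G_basis, new_row, tol=1e-10):
    """Return (added, new_basis). A newly appearing path (a 0/1 row over the
    links) is added to the monitored basis only if it is linearly independent
    of the current basis paths; otherwise it can be inferred from them."""
    if G_basis.size == 0:
        return True, new_row.reshape(1, -1)
    coeffs, *_ = np.linalg.lstsq(G_basis.T, new_row, rcond=None)
    residual = new_row - G_basis.T @ coeffs   # component outside row(G_basis)
    if np.linalg.norm(residual) > tol:
        return True, np.vstack([G_basis, new_row])
    return False, G_basis

if __name__ == "__main__":
    G_basis = np.array([[1., 1., 0.],
                        [0., 0., 1.]])
    added, G_basis = try_add_path(G_basis, np.array([1., 1., 1.]))
    print(added)   # False: [1,1,1] = [1,1,0] + [0,0,1], so the new path is inferable
```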

26
Evaluation
  • Simulation
  • Topology
  • BRITE: Barabasi-Albert, Waxman, and hierarchical
    models, 1K - 20K nodes
  • Real topology from Mercator: 284K nodes
  • Fraction of end hosts on the overlay: 1% - 10%
  • Loss rate distribution (90% of links are good)
  • Good link: 0-1% loss rate; bad link: 5-10% loss
    rate
  • Good link: 0-1% loss rate; bad link: 1-100% loss
    rate
  • Loss model (see the sketch after this list)
  • Bernoulli: independent packet drops
  • Gilbert: bursty packet drops
  • Path loss rate simulated via transmission of 10K
    packets
  • Experiments on PlanetLab
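(The two loss models can be sketched as follows; the Gilbert transition probabilities here are illustrative placeholders, not the values used in the simulations.)

```python
import random

def bernoulli_losses(n_pkts, loss_rate):
    """Bernoulli model: each packet is dropped independently."""
    return [random.random() < loss_rate for _ in range(n_pkts)]

def gilbert_losses(n_pkts, p_enter_bad=0.01, p_leave_bad=0.3):
    """Gilbert model: a two-state chain; packets sent while in the 'bad'
    state are dropped, which produces bursty rather than independent losses."""
    drops, bad = [], False
    for _ in range(n_pkts):
        if bad:
            bad = random.random() >= p_leave_bad   # stay bad with prob 1 - p_leave_bad
        else:
            bad = random.random() < p_enter_bad
        drops.append(bad)
    return drops

# Path loss rate over one 10K-packet trial, as in the simulation setup:
# loss_rate = sum(gilbert_losses(10_000)) / 10_000
```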

27
Experiments on PlanetLab
  • 51 hosts, each from a different organization
  • 51 × 50 = 2,550 paths
  • Simultaneous loss rate measurement
  • 300 trials, 300 msec each
  • In each trial, send a 40-byte UDP packet to every
    other host
  • Simultaneous topology measurement
  • Traceroute
  • Experiments: 6/24 - 6/27
  • 100 experiments in peak hours

28
Sensitivity Test of Sending Frequency
  • Big jump in the fraction of lossy paths when the
    sending rate exceeds 12.8 Mbps

29
PlanetLab Experiment Results
  • Loss rate distribution
  • Metrics
  • Absolute error: |p̂ - p|
  • Average 0.0027 for all paths, 0.0058 for lossy
    paths
  • Relative error [BDPT02]
  • Lossy path inference: coverage and false positive
    ratio
  • On average k = 872 out of 2,550

30
Accuracy Results for One Experiment
  • 95% of absolute errors < 0.0014
  • 95% of relative errors < 2.1

31
Accuracy Results for All Experiments
  • For each experiment, take its 95th-percentile
    absolute and relative errors
  • Most have absolute error < 0.0135 and relative
    error < 2.0

32
Lossy Path Inference Accuracy
  • 90 out of 100 runs have coverage over 85% and a
    false positive ratio under 10%
  • Many errors are caused by boundary effects of the
    5% lossy-path threshold

33
Topology/Dynamics Issues
  • Out of 13 sets of pair-wise traceroutes
  • On average 248 out of 2,550 paths have no or
    incomplete routing information
  • No router aliases resolved
  • Conclusion: robust against topology measurement
    errors
  • Simulations of adding/removing end hosts and of
    routing changes also give good results

34
Performance Improvement with Overlay
  • With single-node relay
  • Loss rate improvement
  • Among 10,980 lossy paths
  • 5,705 paths (52.0%) have loss rate reduced by
    0.05 or more
  • 3,084 paths (28.1%) change from lossy to
    non-lossy
  • Throughput improvement
  • Estimated with a TCP throughput formula (see the
    sketch after this list)
  • 60,320 paths (24%) have a non-zero loss rate, so
    throughput is computable
  • Among them, 32,939 (54.6%) paths have throughput
    improved, and 13,734 (22.8%) paths have throughput
    doubled or more
  • Implication: use overlay paths to bypass
    congestion or failures
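(The transcript cuts off after "Estimated with"; a common way to estimate TCP throughput from the path loss rate p and the RTT is the simplified formula below, shown here as an assumption rather than the exact expression on the original slide.)

```latex
% Simplified TCP throughput estimate (Mathis-style), defined only for p > 0,
% which is why the slide restricts attention to paths with non-zero loss rate:
\text{throughput} \;\approx\; \frac{\mathrm{MSS}}{\mathrm{RTT}} \cdot \frac{C}{\sqrt{p}},
\qquad C \approx \sqrt{3/2}
```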

35
Adaptive Overlay Streaming Media
[Figure: streaming demo spanning Stanford, UC San Diego, UC Berkeley, and HP Labs, with the affected path marked X]
  • Implemented with a Winamp client and a SHOUTcast
    server
  • Congestion introduced with a Packet Shaper
  • Skip-free playback: server buffering and rewinding
  • Total adaptation time < 4 seconds

36
Adaptive Streaming Media Architecture
37
Conclusions
  • A tomography-based overlay network monitoring
    system
  • Given n end hosts, characterize all O(n²) paths
    with a basis set of O(n log n) paths
  • Selectively monitor the O(n log n) basis paths to
    compute their loss rates, then infer the loss
    rates of all other paths
  • Both simulations and real Internet experiments are
    promising
  • Built an adaptive overlay streaming media system on
    top of the monitoring service
  • Bypasses congestion/failures for smooth playback
    within seconds