Title: IDMaps: A Global Internet Host Distance Estimation Service
1 IDMaps: A Global Internet Host Distance Estimation Service
- P. Francis, S. Jamin, C. Jin, Y. Jin, D. Raz, Y. Shavitt, L. Zhang
- Presenter: Zhenying Liu
2 Contents
- Background
- Goals
- Related work
- Architecture
- Performance Evaluation
- Conclusion
3 Background
- Increasing need to learn network distances and bandwidth
- One method
  - Each host measures the distance by itself (ping, traceroute)
- A general service would be useful: quick, efficient
  - SONAR, Feb. 1996
  - HOPS (Host Proximity Service)
- These need an underlying measurement infrastructure to provide distance measurements
5 IDMaps
- Internet Distance Map Service
- To be the underlying service that provides the distance information used by SONAR/HOPS
- Goals
  - Not near-instantaneous information
  - Determine roughly the best service given technology constraints
  - Consider whether there are applications for which this level of service would be useful
6 Resulting Goals
- Separation of functions
  - Separation of IDMaps and the query/reply service
- Distance metrics
  - Latency (round-trip delay)
    - Useful, easy to provide
  - Bandwidth
    - Useful, difficult to provide, expensive to measure
- Accuracy of the distance information
  - High accuracy is difficult to achieve
  - Aim for accuracy within a factor of 2
8 Alternative Architectures and Related Work
- SPAND and Remos provide distance information only between hosts close to a distance server and remote hosts on the Internet
  - For each server, the cost scales proportionally to the number of destinations
  - For all $N$ sites in the Internet, it scales as $N^2$
- Stemm: passive monitoring
  - Does not perturb actual Internet traffic
  - Measures only regions that traffic has previously traversed
  - Does not adapt to Internet topology changes
  - Requires more human effort
10 IDMaps Architecture
- Address three questions
  - What form does the distance information take?
  - What are the IDMaps components?
  - How should the distance information be disseminated?
11 Various forms of distance information
Form: scale; comments
- Global IP addresses: $H^2$ ($H$ = number of hosts); infeasible
- Address prefix (AP): $P^2$ ($P$ = number of APs, 200,000); easily terabytes
- AS: $A^2 + P$ with $A \ll P$ ($A$ = number of ASs, $P$ = number of BGP-advertised IP address blocks); A 100,000 (large); its accuracy is highly suspect
- Cluster of APs: $B^2 + P$ ($B$ = number of Tracers); manageable if $B$ is around 500; reasonable accuracy
13 The form used
- There are three main components
  - APs, Tracers, and the virtual links (the raw distances)
  - AP: a consecutive range of IP addresses
  - Tracers: systems that are distributed around the Internet
- Assumption
  - We can estimate the distance between two points as the sum of distances between intermediate points
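A minimal sketch of the additive assumption above, written in Python. The function name, the dictionary layout, and all distance values are hypothetical and chosen only for illustration; the sketch also assumes that only direct Tracer-Tracer links are used, with no multi-hop paths between Tracers.

```python
import math


def estimate_distance(ap_src, ap_dst, tracer_ap, tracer_tracer):
    """Estimate an AP-to-AP distance as AP -> Tracer -> Tracer -> AP.

    tracer_ap:     {(tracer, ap): distance}        Tracer-AP virtual links
    tracer_tracer: {(tracer_a, tracer_b): distance} Tracer-Tracer virtual links
    Returns math.inf when no chain of virtual links connects the two APs.
    """
    best = math.inf
    for (t1, ap1), d1 in tracer_ap.items():
        if ap1 != ap_src:
            continue
        for (t2, ap2), d2 in tracer_ap.items():
            if ap2 != ap_dst:
                continue
            if t1 == t2:
                middle = 0.0  # the same Tracer sees both APs
            else:
                # Links are stored once; look them up in either direction.
                middle = tracer_tracer.get(
                    (t1, t2), tracer_tracer.get((t2, t1), math.inf))
            best = min(best, d1 + middle + d2)
    return best


# Hypothetical measurements (for example, round-trip delays in milliseconds).
tracer_ap = {("T1", "AP1"): 10, ("T1", "AP3"): 25, ("T2", "AP2"): 5}
tracer_tracer = {("T1", "T2"): 40}
print(estimate_distance("AP1", "AP3", tracer_ap, tracer_tracer))
print(estimate_distance("AP1", "AP2", tracer_ap, tracer_tracer))
```

With these made-up numbers, a client in AP1 is estimated to be closer to AP3 than to AP2, because the only route to AP2 crosses the Tracer-Tracer link.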
14 An assumption: triangulation
$|a - c| < b < a + c$: is it feasible to estimate distance?
15 To support the triangulation
- Set up 2 experiments: D1 (1995), D2 (1997)
- The figure shows the ratio between the triangulated estimate and the real distance for all shortest-path triangulations in the data sets
- Between 75% and 90% of triangulation estimates fall within a factor of 2 of the real distance
- The resulting estimates are acceptable!
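The check described on this slide can be sketched as follows. This is an illustration only: it assumes that a "shortest-path triangulation" for a pair (a, c) is the smallest two-leg sum through any third node b, and the four-node distance table is invented, not taken from D1 or D2.

```python
from itertools import permutations


def symmetric(pairs):
    """Build a nested dict dist[u][v] from {(u, v): d}, filling both directions."""
    dist = {}
    for (u, v), d in pairs.items():
        dist.setdefault(u, {})[v] = d
        dist.setdefault(v, {})[u] = d
    return dist


def triangulation_ratios(dist):
    """For each ordered pair (a, c), return estimate / real distance, where the
    estimate is the shortest two-leg path a -> b -> c over all third nodes b."""
    nodes = list(dist)
    ratios = []
    for a, c in permutations(nodes, 2):
        legs = [dist[a][b] + dist[b][c] for b in nodes if b not in (a, c)]
        if legs and dist[a][c] > 0:
            ratios.append(min(legs) / dist[a][c])
    return ratios


def fraction_within_factor(ratios, factor=2.0):
    """Share of estimates that lie within `factor` of the real distance."""
    ok = [r for r in ratios if 1.0 / factor <= r <= factor]
    return len(ok) / len(ratios)


# Hypothetical measured delays; real Internet delays need not obey the
# triangle inequality, and neither do these.
dist = symmetric({("A", "B"): 10, ("A", "C"): 15, ("A", "D"): 40,
                  ("B", "C"): 8, ("B", "D"): 12, ("C", "D"): 30})
ratios = triangulation_ratios(dist)
print(fraction_within_factor(ratios))
```

On real data sets, the same two functions would give the fraction that the slide reports as falling between 75% and 90%.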
16 Tracer placement
- Two problems
  - How many tracers are optimal?
  - Given the number of tracers, how should they be placed to minimize the maximum distance between an AP and the nearest tracer?
- Two graph-theoretic approaches can be applied
  - k-HST algorithm
  - Minimum K-center algorithm
- These algorithms are used to determine the placement of fire stations, ambulances, etc., with a priori knowledge of the topology
17 k-HST: deciding the number of tracers
- 1st phase: the graph is recursively partitioned
  - A node is arbitrarily selected from the current (parent) partition, and all the nodes that are within a random radius of this node form a new child partition
  - The radius of the child partition is a factor of k smaller than the diameter of the parent partition
  - Recurse until each node is in a partition of its own
18 k-HST tree
- 2nd phase: a virtual node is assigned to each of the partitions at each level
- The diameter of a partition
  - The furthest distance between two nodes in the partition
  - Equals 2 times the length of the links from a virtual node to its children
19 Using the k-HST tree
- Devise a greedy algorithm to find the number of tracers when the maximum distance is bounded by D
  - Push the tracers down the tree until it discovers a partition with diameter < D
  - The number of such partitions is the minimum number of tracers
  - Set the virtual nodes of these partitions to be the tracers
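The k-HST steps on the preceding slides can be sketched roughly as below. This is not the authors' implementation: the function names, the radius range (drawn uniformly between diameter/(2k) and diameter/k), the requirement k > 2, and the fallback for degenerate inputs are all assumptions made so that the sketch terminates. Each dictionary in the tree plays the role of a virtual node.

```python
import random


def diameter(nodes, dist):
    """Furthest distance between two nodes of a partition."""
    return max((dist[u][v] for u in nodes for v in nodes), default=0.0)


def build_khst(nodes, dist, k, rng):
    """Phase 1 and 2: recursively partition `nodes` into a tree of partitions."""
    if k <= 2:
        raise ValueError("this sketch needs k > 2 so that diameters shrink")
    diam = diameter(nodes, dist)
    tree = {"nodes": list(nodes), "diameter": diam, "children": []}
    if len(nodes) == 1:
        return tree
    remaining, parts = list(nodes), []
    while remaining:
        center = remaining[0]  # arbitrary choice of node
        radius = rng.uniform(diam / (2 * k), diam / k)  # random, at most diam / k
        parts.append([v for v in remaining if dist[center][v] <= radius])
        remaining = [v for v in remaining if dist[center][v] > radius]
    if len(parts) == 1:  # degenerate input (e.g. all distances zero)
        parts = [[v] for v in nodes]
    tree["children"] = [build_khst(p, dist, k, rng) for p in parts]
    return tree


def partitions_within(tree, bound):
    """Greedy descent: stop at the first partitions whose diameter is below `bound`."""
    if tree["diameter"] < bound or not tree["children"]:
        return [tree["nodes"]]
    found = []
    for child in tree["children"]:
        found.extend(partitions_within(child, bound))
    return found


# Hypothetical nodes on a line, indexed 0..6, with distance = position gap.
positions = [0, 1, 2, 50, 51, 52, 100]
dist = [[abs(p - q) for q in positions] for p in positions]
tree = build_khst(list(range(len(positions))), dist, 4, random.Random(0))
parts = partitions_within(tree, 5)
print(len(parts), parts)  # one tracer per partition
```

In this toy input the three well-separated groups are recovered as three partitions, so three tracers would be placed; picking a concrete host inside each partition to stand in for its virtual node is left open here.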
20 Minimum K-Center Algorithm
- K-center problem
  - Place a given number of centers such that the maximum distance from a node to the nearest center is minimized
  - NP-complete
- Willing to tolerate inaccuracies within a factor of 2 (2-approximation)
  - No worse than twice the optimal maximum distance
- Observation: guarantees that the distance from a node to the nearest center is bounded
21 Minimum K-Center Algorithm details
- $G = (V, E)$ with $E = V \times V$; $c(e)$ is the cost of the shortest path between the endpoints $(v_1, v_2)$ of edge $e$
- All the graph edges are arranged in non-decreasing order by cost, $c(e_1) \le c(e_2) \le \dots \le c(e_m)$, and $G_i = (V, E_i)$ with $E_i = \{e_1, \dots, e_i\}$
- $G_i^2$ is the graph containing $V$ and an edge $(u, v)$ whenever there is a path between $u$ and $v$ in $G_i$ of at most two hops, $u \ne v$
- An independent set of a graph $G = (V, E)$ is a subset $V' \subseteq V$ such that, for all $u, v \in V'$, the edge $(u, v)$ is not in $E$
- An independent set of $G_i^2$ is thus a set of nodes that are at least 3 hops apart in $G_i$
- A maximal independent set $M$ is an independent set $V'$ such that all nodes in $V - V'$ are at most one hop away from nodes in $V'$
22 Algorithm 2 (2-approximate minimum K-center [18]) details
1. Construct $G_1^2, G_2^2, \dots, G_m^2$
2. Compute $M_i$ for each $G_i^2$
3. Find the smallest $i$ such that $|M_i| \le K$, say $j$
4. $M_j$ is the set of $K$ centers
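The four steps above can be sketched in Python as follows. The function name and input layout are hypothetical, and two simplifications are assumed: the graphs are built per distinct edge cost rather than per edge index (equivalent except for how ties are grouped), and the maximal independent set is found greedily in node order.

```python
from itertools import combinations


def k_center_2approx(dist, k):
    """dist: symmetric matrix (list of lists) of shortest-path costs.

    Returns (centers, threshold): `centers` is a maximal independent set of the
    squared threshold graph, so every node is within 2 * threshold of a center.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of nodes")
    if n == 1:
        return [0], 0.0
    costs = sorted({dist[u][v] for u, v in combinations(range(n), 2)})
    for threshold in costs:
        # G_i: keep only the edges whose cost is at most the threshold.
        adj = [{v for v in range(n) if v != u and dist[u][v] <= threshold}
               for u in range(n)]
        # G_i^2: connect nodes that are at most two hops apart in G_i.
        adj2 = []
        for u in range(n):
            reach = set(adj[u])
            for w in adj[u]:
                reach |= adj[w]
            reach.discard(u)
            adj2.append(reach)
        # Greedy maximal independent set M_i of G_i^2.
        centers, blocked = [], set()
        for u in range(n):
            if u not in blocked:
                centers.append(u)
                blocked.add(u)
                blocked |= adj2[u]
        if len(centers) <= k:
            return centers, threshold
    raise AssertionError("unreachable: the last graph is complete")


# Hypothetical nodes on a line; the cost is the gap between positions.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in positions] for p in positions]
centers, threshold = k_center_2approx(dist, 2)
worst = max(min(dist[v][c] for c in centers) for v in range(len(positions)))
print(centers, threshold, worst)
```

Here the worst node-to-center distance is at most twice the returned threshold, which is the bound the 2-approximation relies on when costs obey the triangle inequality.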
23 Tracer Heuristics
- Stub-AS
  - Connected to only one other AS
- Transit-AS
  - Connected to more than one other AS
  - Allows itself to be used as a conduit for traffic (transit traffic) between other ASs
  - Most large ISPs are transit ASs
- Mixed
  - Tracers are placed randomly on the network, with uniform distribution
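As a rough illustration of the three heuristics, the sketch below classifies ASs by their number of neighbours and then draws tracer locations from the chosen class. The function names and the sample topology are hypothetical. Treating every AS with more than one neighbour as a transit AS is a simplification, since a multihomed AS may refuse to carry transit traffic.

```python
import random


def classify_ases(as_links):
    """as_links: iterable of (as_a, as_b) adjacency pairs.

    Returns (stub, transit) as sorted lists, using neighbour count only.
    """
    neighbours = {}
    for a, b in as_links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    stub = sorted(a for a, n in neighbours.items() if len(n) == 1)
    transit = sorted(a for a, n in neighbours.items() if len(n) > 1)
    return stub, transit


def place_tracers(as_links, count, heuristic, rng):
    """Pick up to `count` ASs uniformly at random from the chosen class."""
    stub, transit = classify_ases(as_links)
    pool = {"stub": stub, "transit": transit, "mixed": stub + transit}[heuristic]
    return rng.sample(pool, min(count, len(pool)))


# Hypothetical AS-level topology.
links = [(1, 2), (2, 3), (2, 4), (3, 5)]
print(classify_ases(links))
print(place_tracers(links, 2, "transit", random.Random(0)))
```

None of the three options needs distance information about the topology, only the AS adjacency list.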
24 Virtual links
- Tracer-tracer virtual links
  - It is not necessary to list all $B^2$ tracer-tracer distances
  - Given a number of tracers in Seattle and in Boston
    - It would almost certainly not be useful to know all of the distances between them
    - A subset of them allows a sufficient distance approximation between hosts in Seattle and hosts in Boston
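The concluding slides mention applying a t-spanner to the tracer-tracer VLs. One common way to build such a subset is the greedy spanner construction, sketched below; the slides do not say which construction IDMaps uses, so this choice, the function names, and the distance values are assumptions for illustration.

```python
import heapq


def shortest_path(adj, src, dst):
    """Dijkstra over adj = {node: {neighbour: weight}}; inf if unreachable."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def greedy_spanner(distances, t):
    """distances: {(u, v): measured distance}.

    Visit links from shortest to longest and keep a link only if the links
    kept so far cannot already connect u and v within t times its distance.
    """
    adj, kept = {}, []
    for (u, v), w in sorted(distances.items(), key=lambda item: item[1]):
        if shortest_path(adj, u, v) > t * w:
            adj.setdefault(u, {})[v] = w
            adj.setdefault(v, {})[u] = w
            kept.append((u, v))
    return kept


# Hypothetical tracers: two in Seattle (S1, S2) and two in Boston (B1, B2).
distances = {("S1", "S2"): 2, ("B1", "B2"): 3, ("S1", "B1"): 60,
             ("S1", "B2"): 61, ("S2", "B1"): 61, ("S2", "B2"): 62}
print(greedy_spanner(distances, 2))
```

With these made-up values a single Seattle-Boston link is kept and the other three cross-country links are dropped, which matches the intuition on the slide.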
25 Virtual links
Client C in AP1 will be directed to mirror M1 in AP3 instead of M2 in AP2. Had tracer T2 also traced to AP1, the client would have been directed to M2.
- Tracer-AP VLs
  - A dedicated tracer?
  - More than one tracer?
27 Performance Evaluation
- Topology generation
  - Waxman, Tiers, Inet
- Simulating the IDMaps infrastructure
  - Tracer placement: Stub-AS, Transit-AS
  - Distance map computation
  - Tracer-tracer VLs and Tracer-AP VLs
28 Performance Metric Computation
- Nearest mirror selection
- $P_{app}$: the percentage of correct IDMaps answers over the total number of clients
- Consider an IDMaps server selection correct
  - As long as the distance between a client and the nearest mirror determined by IDMaps is within a given factor of the distance between the client and the actual nearest mirror (we use a factor of 2)
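The metric can be computed along the following lines. The function name, the nested-dictionary layout, and the three-client example are hypothetical; the sketch returns a fraction, which can be multiplied by 100 to obtain a percentage.

```python
def fraction_correct(clients, true_dist, est_dist, factor=2.0):
    """Share of clients for which the IDMaps choice counts as correct.

    true_dist[c][m]: actual distance from client c to mirror m
    est_dist[c][m]:  distance estimated from the distance map
    A choice is correct when the chosen mirror is actually within `factor`
    times the distance to the client's true nearest mirror.
    """
    correct = 0
    for c in clients:
        chosen = min(est_dist[c], key=est_dist[c].get)  # mirror IDMaps would pick
        nearest = min(true_dist[c].values())            # actual nearest mirror
        if true_dist[c][chosen] <= factor * nearest:
            correct += 1
    return correct / len(clients)


# Hypothetical distances for three clients and two mirrors.
true_dist = {"c1": {"M1": 10, "M2": 30},
             "c2": {"M1": 20, "M2": 25},
             "c3": {"M1": 50, "M2": 10}}
est_dist = {"c1": {"M1": 12, "M2": 28},
            "c2": {"M1": 30, "M2": 22},
            "c3": {"M1": 15, "M2": 18}}
print(fraction_correct(["c1", "c2", "c3"], true_dist, est_dist))
```

In this example client c2 is sent to a mirror that is not the nearest one, yet the answer still counts as correct because the chosen mirror is within a factor of 2 of the best.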
29 Simulation result
- Mirror selection using IDMaps gives noticeable improvement over random selection
- Network topology can affect IDMaps performance
- Tracer placement heuristics that do not rely on network topology can perform as well as or better than algorithms that require a priori knowledge of the topology
30 Simulation result
- Adding more tracers gives diminishing returns
- The number of tracer-tracer VLs required for good performance can be on the order of B with a small constant
- Increasing the number of tracers tracing to each AP improves IDMaps performance, with diminishing returns
31 Mirror selection
- Transit-AS
  - The probability that at least 80% of all clients will be directed to the correct mirror is 100%
  - The probability that up to 98% of all clients will be directed to the correct mirror is only 85%
32 Mirror selection
- Mirror selection using distance maps outperforms random selection regardless of the tracer placement algorithm
- Qualitatively, the results from agree with the conclusion
  - Mirror selection using distance maps outperforms random selection
33 Effect of Topology
34 Effect of Topology
- Performance on Tiers-generated topologies exhibits qualitatively different behavior than on other topologies
- The transit-AS heuristic gives better IDMaps performance than the k-HST algorithm on topologies generated from Inet and Waxman, but not on the topologies generated from Tiers
36 Conclusion
- A global distance measurement infrastructure called IDMaps is proposed
  - It can be placed on the Internet to collect distance information
- Nearest mirror selection for clients
  - Significant improvement over random selection
  - Does not require full knowledge of the underlying topology
37 Conclusion
- IDMaps overhead can be minimized by grouping Internet addresses into APs to reduce the number of measurements
- Applying a t-spanner to tracer-tracer VLs can result in linear measurement overhead with respect to the number of tracers in the common case
- Overall, this study has provided positive results to demonstrate that a useful Internet distance map service can indeed be built scalably
38 (Stub AS)