Title: computing the relationships between autonomous systems
1computing the relationships between autonomous
systems
- giuseppe di battista
- maurizio patrignani
- maurizio pizzonia
univ. of rome III http://www.dia.uniroma3.it/compunet/
2announcements and traffic flows
- bgp allows a router to offer connectivity to another router
- offering connectivity means promising the delivery to a specific destination
[figure: AS100 and AS200 exchange a bgp announcement for 195.10.14.0/24; ip traffic to be delivered to 195.10.14.0/24 flows in the opposite direction]
3using AS-path to avoid cycles
[figure: AS graph with AS10, AS31, AS40, AS60, AS212 and prefix 193.204.161.0/24, with legend]
4using prepending to implement preferences
[figure: AS graph with AS10, AS31, AS40, AS60, AS212 and prefix 193.204.161.0/24]
5how to ensure cooperation?
[figure: AS graph with AS10, AS31, AS40, AS60, AS212 and prefix 193.204.161.0/24]
why should AS212 propagate the announcement of AS10?
6a basic question
an Autonomous System may be a customer of another Autonomous System (its provider)
how can these kinds of relationships be inferred by observing routing information?
[figure: AS1 and AS2 joined by a customer-provider edge; the provider grants connectivity, the customer pays for connectivity]
7topology discovery does not provide inter-AS
commercial relationships
[figure: AS graph over AS1 to AS9 with edges labeled provider-customer, peer-peer, and sibling-sibling; one route is marked as not available]
8problem history
- the problem was introduced by Lixin Gao (On Inferring Autonomous System Relationships in the Internet, IEEE Trans. Networking, 2001)
- relationships are classified into three categories: customer-provider, peer-peer, and sibling-sibling
- BGP routing tables are used as input
- heuristics are verified with information coming from other sources
9a simpler problem formulation
- due to Subramanian, Agarwal, Rexford, and Katz (Characterizing the Internet Hierarchy from Multiple Vantage Points, INFOCOM 2002) [SARK 2002]
[figure: advertisement rules for routes received from a provider, from a peer, and from a customer, drawn over peer-peer and customer-provider relationships]
10valid and invalid AS-paths
if all ASes respected the advertisement rules, all AS-paths would be valid
[figure: a type-1 and a type-2 AS-path, with the uphill and downhill parts marked as possibly missing]
type-1 valid AS-path: a (possibly missing) uphill path followed by a (possibly missing) downhill path
type-2 valid AS-path: a (possibly missing) uphill path followed by a peer-peer edge, followed by a (possibly missing) downhill path
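The two definitions above can be checked mechanically once every edge of a path carries a label. The following is a minimal sketch, assuming each hop is encoded as `"up"` (customer to provider), `"peer"`, or `"down"` (provider to customer); the encoding and the name `classify_path` are illustrative choices, not taken from the slides.

```python
def classify_path(hops):
    """Classify an AS-path given one label per edge: 'up', 'peer', or 'down'.

    Returns 'type-1', 'type-2', or 'invalid'.
    """
    climbing = True   # True until a peer-peer or downhill edge has been seen
    peers = 0         # number of peer-peer edges met so far
    for hop in hops:
        if hop == "up":
            if not climbing:
                return "invalid"   # going uphill again after the top: a valley
        elif hop == "peer":
            if not climbing:
                return "invalid"   # second peering, or peering after going downhill
            climbing = False
            peers += 1
        elif hop == "down":
            climbing = False
        else:
            raise ValueError(f"unknown hop label: {hop!r}")
    return "type-2" if peers == 1 else "type-1"


print(classify_path(["up", "up", "down"]))    # uphill then downhill
print(classify_path(["up", "peer", "down"]))  # one peer-peer edge at the top
print(classify_path(["down", "up"]))          # contains a valley
```

Both the uphill and the downhill part may be empty, so an empty label sequence is classified as type-1.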
11the ToR (Type of Relationships) problem
ToR problem [SARK 2002]
given an undirected graph G and a set of paths P, give an orientation to some of the edges of G to minimize the number of invalid paths in P
invalid paths: those involving more than one peering or not valley-free
[figure: a path with a valley]
- in [SARK 2002] the ToR problem is conjectured to be NP-hard
12a real-life ToR problem instance
six rows extracted from the BGP routing table of Oregon RouteViews (Apr 18, 2001)
5056 701 6461 4926 4270 4387
5056 701 4926 4926 4926 6461 2914 174 174 174 174 14318
7660 1 5056 701 11334
5056 1239 1 1755 1755 1755 1755 3216 13099
5056 1 1239 8151
1239 5056 701 11334
extract AS paths and eliminate prepending
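Eliminating prepending amounts to collapsing consecutive repetitions of the same AS number. A minimal sketch on the six rows above, assuming each row is already reduced to its AS-path field (the name `strip_prepending` is illustrative):

```python
from itertools import groupby

def strip_prepending(as_path):
    """Collapse runs of the same AS number into a single occurrence."""
    return [asn for asn, _ in groupby(as_path)]

rows = [
    "5056 701 6461 4926 4270 4387",
    "5056 701 4926 4926 4926 6461 2914 174 174 174 174 14318",
    "7660 1 5056 701 11334",
    "5056 1239 1 1755 1755 1755 1755 3216 13099",
    "5056 1 1239 8151",
    "1239 5056 701 11334",
]
paths = [strip_prepending([int(a) for a in row.split()]) for row in rows]
for path in paths:
    print(" ".join(map(str, path)))
```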
14building the corresponding AS graph
[figure: AS graph with 17 vertices: AS1, AS174, AS701, AS1239, AS1755, AS2914, AS3216, AS4270, AS4387, AS4926, AS5056, AS6461, AS7660, AS8151, AS11334, AS13099, AS14318]
5056 701 6461 4926 4270 4387
5056 701 4926 6461 2914 174 14318
7660 1 5056 701 11334
5056 1239 1 1755 3216 13099
5056 1 1239 8151
1239 5056 701 11334
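Building the AS graph from the paths can be sketched as follows: every AS number becomes a vertex and every pair of consecutive ASes in a path becomes an undirected edge. The function name `build_as_graph` and the use of `frozenset` for undirected edges are illustrative choices.

```python
def build_as_graph(paths):
    """Return (vertices, edges) of the undirected AS graph induced by the paths."""
    vertices, edges = set(), set()
    for path in paths:
        vertices.update(path)
        for u, v in zip(path, path[1:]):
            edges.add(frozenset((u, v)))  # frozenset: {u, v} and {v, u} coincide
    return vertices, edges

paths = [
    [5056, 701, 6461, 4926, 4270, 4387],
    [5056, 701, 4926, 6461, 2914, 174, 14318],
    [7660, 1, 5056, 701, 11334],
    [5056, 1239, 1, 1755, 3216, 13099],
    [5056, 1, 1239, 8151],
    [1239, 5056, 701, 11334],
]
vertices, edges = build_as_graph(paths)
print(len(vertices), "vertices,", len(edges), "edges")
```

On the six paths above this gives 17 vertices and 18 edges.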
15an orientation for the AS graph
[figure: the same AS graph with its edges oriented; two valleys are marked]
5056 701 6461 4926 4270 4387
5056 701 4926 6461 2914 174 14318
7660 1 5056 701 11334
5056 1239 1 1755 3216 13099
5056 1 1239 8151
1239 5056 701 11334
16our contributions
- we show that, although the ToR-problem is NP-hard, a solution without invalid paths (if it exists) can be found in linear time
- we propose heuristics for the general problem based on a novel paradigm and show their effectiveness against publicly available data sets
- the experiments show that our heuristics perform significantly better than state-of-the-art heuristics
17formulation as a decision problem
- the ToR minimization problem corresponds to the following decision problem
ToR-D problem
given an undirected graph G, a set of paths P, and an integer k, test if it is possible to give an orientation to some of the edges of G so that the number of invalid paths in P is at most k
what if we give an orientation to all the edges?
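18simplifying the decision problem for k = 0
suppose we have a solution to the ToR problem in which all paths are valid, and consider an edge labeled as a peer-peer edge
orienting that edge in either direction leaves every path valid (uphill, peer-peer, downhill becomes uphill, downhill), so for k = 0 all the edges can be given an orientation
[figure: a peer-peer edge on a valid path]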
19example of orientable AS graph
[figure: AS graph with 12 vertices: AS1740, AS3561, AS3582, AS3967, AS4197, AS6893, AS7018, AS8709, AS8843, AS8938, AS10311, AS15493]
20an orientation leaving all valid paths
[figure: the AS graph of the previous slide with all its edges oriented]
21a different representation
[figure: the oriented AS graph of the previous slide in a different drawing]
22example of not orientable AS graph
[figure: the 17-vertex AS graph built from the six AS-paths below]
5056 701 6461 4926 4270 4387
5056 701 4926 6461 2914 174 14318
7660 1 5056 701 11334
5056 1239 1 1755 3216 13099
5056 1 1239 8151
1239 5056 701 11334
23chestnuts and contradictions
chestnut
contradiction!
24testing if a solution exists with all valid paths
observation
a path $p = v_1, \dots, v_n$ is valid if and only if it does not have a vertex $v_i$ such that the two edges of p incident on $v_i$ are directed away from $v_i$ (it does not have a valley)
[figure: a valley at vertex $v_i$]
- based on this observation the problem can be mapped to 2SAT
2SAT: given a set X of Boolean variables and a formula in conjunctive normal form composed of clauses of two literals, where a literal is a variable or a negated variable, find a truth assignment for the Boolean variables in X such that the formula is satisfied
example: $(x_1 \lor x_2) \land (\lnot x_2 \lor \lnot x_3) \land (x_3 \lor \lnot x_1)$
25mapping the problem to 2SAT
- associate each edge with a Boolean variable
- provide each edge with an arbitrary (say random) orientation
[figure: path v1, v2, v3, v4, v5 with edge variables $x_{1,2}$, $x_{2,3}$, $x_{3,4}$, $x_{4,5}$]
- a truth assignment for the Boolean variables corresponds to an orientation for all the edges of the AS-graph (and vice versa)
- if the variable is true, the associated edge preserves its original direction
- if the variable is false, the associated edge is reversed
26construction of the 2SAT instance
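[figure: path v1, v2, v3, v4, v5 with edge variables $x_{1,2}$, $x_{2,3}$, $x_{3,4}$, $x_{4,5}$]
$(x_{1,2} \lor x_{2,3}) \land (\lnot x_{2,3} \lor \lnot x_{3,4}) \land (x_{3,4} \lor \lnot x_{4,5})$
[figure: two orientations of the path and the corresponding truth assignments]
$x_{1,2}$, $\lnot x_{2,3}$, $\lnot x_{3,4}$, $\lnot x_{4,5}$ (satisfies the formula)
$x_{1,2}$, $\lnot x_{2,3}$, $\lnot x_{3,4}$, $x_{4,5}$ (violates the clause $x_{3,4} \lor \lnot x_{4,5}$)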
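The construction can be sketched in a few lines: for every internal vertex of a path, emit one clause forbidding that both incident edges point away from it. The initial orientation used below (v1→v2, v3→v2, v3→v4, v4→v5) is an assumption, inferred so that the output matches the three clauses on this slide; the slide text itself does not state it. The function name and the literal encoding `(edge, polarity)` are illustrative.

```python
def valley_clauses(path, reference):
    """Build one 2SAT clause per internal vertex of `path`.

    reference: set of (tail, head) pairs, the arbitrary initial orientation.
    A literal is (edge, polarity): polarity True means the edge keeps its
    initial direction, False means it is reversed.
    """
    def away(v, u):
        # Literal under which edge {v, u} is directed away from v.
        edge = tuple(sorted((u, v)))
        return (edge, (v, u) in reference)

    clauses = []
    for u, v, w in zip(path, path[1:], path[2:]):
        (e1, p1), (e2, p2) = away(v, u), away(v, w)
        # No valley at v: the two "away" literals must not both hold.
        clauses.append(((e1, not p1), (e2, not p2)))
    return clauses

reference = {(1, 2), (3, 2), (3, 4), (4, 5)}   # assumed initial orientation
clauses = valley_clauses([1, 2, 3, 4, 5], reference)
for clause in clauses:
    print(clause)
```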
27solving a huge 2SAT
given a 2SAT formula
$(x_{1,2} \lor x_{2,3}) \land (\lnot x_{2,3} \lor \lnot x_{3,4}) \land (x_{3,4} \lor \lnot x_{4,5}) \land (x_{4,5} \lor \lnot x_{2,3})$
compute the graph of the literals
- there is a directed path from some literal to its opposite and vice versa iff a solution without invalid paths does not exist (Aspvall, Plass, Tarjan, IPL 79)
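A direct reading of this statement can be sketched as follows: each clause (a ∨ b) contributes the arcs ¬a → b and ¬b → a, and the formula has no solution exactly when some variable's two literals reach each other. The sketch uses one breadth-first search per variable for clarity, which is not linear time; names such as `literal_graph` and `has_solution` are illustrative.

```python
from collections import defaultdict, deque

def neg(lit):
    var, positive = lit
    return (var, not positive)

def literal_graph(clauses):
    """Graph of the literals: clause (a or b) gives arcs not-a -> b and not-b -> a."""
    graph = defaultdict(set)
    for a, b in clauses:
        graph[neg(a)].add(b)
        graph[neg(b)].add(a)
    return graph

def reaches(graph, src, dst):
    """Breadth-first search for a directed path from src to dst."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def has_solution(clauses):
    graph = literal_graph(clauses)
    variables = {lit[0] for clause in clauses for lit in clause}
    return not any(
        reaches(graph, (v, True), (v, False)) and reaches(graph, (v, False), (v, True))
        for v in variables
    )

def P(v): return (v, True)    # positive literal
def N(v): return (v, False)   # negated literal

# The four-clause formula of this slide.
formula = [(P("x12"), P("x23")), (N("x23"), N("x34")),
           (P("x34"), N("x45")), (P("x45"), N("x23"))]
print(has_solution(formula))
```

In this formula x2,3 reaches its negation but not the other way round, so a solution exists and x2,3 must be false in it.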
28strongly connected components
- compute the strongly connected components of the graph of the literals
- strongly connected component: maximal set of vertices such that for each pair u, v of vertices of the set there exists a directed path from u to v and vice versa
29testing if a solution exists
- a solution for the 2SAT problem does not exist if and only if the two literals of the same Boolean variable (edge) fall into the same strongly connected component
- the test takes O(n + m + q) time
- where n is the number of ASes, m is the number of edges, and q is the sum of the lengths of the AS-paths
[figure: the literals $\lnot x$ and $x$]
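The whole test can be sketched end to end: one variable per edge, one clause per internal vertex of each path, then a strongly-connected-components pass. The sketch below uses Kosaraju's algorithm and takes "directed from the smaller to the larger AS number" as the arbitrary initial orientation; both are choices made here for illustration, and the function name `orientable` is hypothetical.

```python
def orientable(paths):
    """True if the edges used by `paths` can be oriented so that no path has a valley."""
    var = {}  # undirected edge -> variable index

    def away(v, w):
        # Literal "edge {v, w} is directed away from v"; literal 2*i is x_i, 2*i+1 is not x_i.
        idx = var.setdefault(frozenset((v, w)), len(var))
        return 2 * idx if v < w else 2 * idx + 1

    arcs = []
    for path in paths:
        for u, v, w in zip(path, path[1:], path[2:]):
            a, b = away(v, u), away(v, w)
            arcs += [(a, b ^ 1), (b, a ^ 1)]  # not (a and b)

    n = 2 * len(var)
    graph = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for s, d in arcs:
        graph[s].append(d)
        rev[d].append(s)

    # Pass 1: iterative DFS, recording vertices by increasing finish time.
    order, seen = [], [False] * n
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, iter(graph[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if not seen[w]), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen[nxt] = True
                stack.append((nxt, iter(graph[nxt])))

    # Pass 2: sweep the reversed graph in decreasing finish time.
    comp, count = [-1] * n, 0
    for root in reversed(order):
        if comp[root] != -1:
            continue
        comp[root] = count
        stack = [root]
        while stack:
            node = stack.pop()
            for w in rev[node]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1

    return all(comp[2 * i] != comp[2 * i + 1] for i in range(len(var)))

six_paths = [
    [5056, 701, 6461, 4926, 4270, 4387],
    [5056, 701, 4926, 6461, 2914, 174, 14318],
    [7660, 1, 5056, 701, 11334],
    [5056, 1239, 1, 1755, 3216, 13099],
    [5056, 1, 1239, 8151],
    [1239, 5056, 701, 11334],
]
print(orientable(six_paths))
```

On the six RouteViews paths used earlier in the slides the test reports that no valley-free orientation exists, in line with the slide presenting that graph as not orientable.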
30AS graphs from the various sources
http://www.cs.berkeley.edu/sagarwal/research/BGP-hierarchy/
31solvable without invalid paths?
32the general problem is NP-hard
MAX2SAT instance
MAX2SAT assignment
33example
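$(x_1 \lor \lnot x_2) \land (x_2 \lor \lnot x_3) \land (x_2 \lor x_4) \land (\lnot x_1 \lor \lnot x_4) \land (\lnot x_3 \lor x_4)$
[figure: construction for the variables $x_1$, $x_2$, $x_3$, $x_4$]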
34a hot topic
- independently of our work, Erlebach, Hall, and Schank of the Theory of Communication Networks Group of Zurich discovered analogous results concerning the time complexity of the general problem and the linearity in the case of all valid paths
- however, while they put more emphasis on the approximability of the problem, we focus more on the engineering and the experimentation of an effective heuristic approach
35our heuristic approach
- size of the problem
- 3,423,460 AS-paths
- 10,916 vertices
- 23,761 edges
- very simple approach
- find a very large set of AS-paths admitting an orientation without invalid paths
- try to reinsert the left-out AS-paths
36 1) find a very large set of paths admitting an orientation
1.a) rank the edges with respect to the number of paths using them
1.b) construct the graph of the literals
1.c) compute the strongly connected components of it
1.d) find all the variables whose two literals fall into the same strongly connected component
1.e) consider the corresponding edges and eliminate the one with the highest rank (and all the paths using it, too)
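Steps 1.a and 1.e can be sketched as a pruning loop. Steps 1.b to 1.d are abstracted here behind a hypothetical callback `conflicting_edges(paths)`, assumed to return the set of edges whose two literals share a strongly connected component (and only edges used by the given paths). Repeating the steps until no conflict is left, and recomputing the ranks at each round, are assumptions of this sketch; the slide does not spell them out.

```python
from collections import Counter

def path_edges(path):
    """Undirected edges used by a path."""
    return {frozenset(e) for e in zip(path, path[1:])}

def prune(paths, conflicting_edges):
    """Drop paths until the oracle reports no conflicting edge.

    Returns (kept_paths, removed_paths).
    """
    paths = [tuple(p) for p in paths]
    removed = []
    while True:
        bad = conflicting_edges(paths)            # steps 1.b to 1.d (oracle)
        if not bad:
            return paths, removed
        # 1.a) rank = number of paths using each edge
        rank = Counter(e for p in paths for e in path_edges(p))
        # 1.e) eliminate the conflicting edge with the highest rank ...
        worst = max(bad, key=lambda e: rank[e])
        # ... together with all the paths using it
        removed += [p for p in paths if worst in path_edges(p)]
        paths = [p for p in paths if worst not in path_edges(p)]
```

A real oracle would be built from the graph of the literals; any function with the same contract can be plugged in for experiments.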
37 2) try to reinsert the left-out AS-paths
2.a) set x = 1
2.b) add x paths
2.c) if it is still solvable, commit the path addition and set x = 2x; else if x is 1, discard the path; else set x = 1
2.d) repeat 2.b and 2.c until no path is left
some figures: out of 3,423,460 paths, we removed 246,835 paths, ending with 3,176,625 paths after the first step; we reinserted 222,764 with the second step, ending with 3,399,389 valid paths
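The reinsertion loop can be sketched as below, reading the update in step 2.c as a doubling of x (the operator is lost in the slide text, so this reading is an assumption). The predicate `solvable(paths)` is a hypothetical stand-in for the 2SAT test of the previous slides.

```python
def reinsert(kept, left_out, solvable):
    """Try to add the left-out paths back in batches of growing size.

    Returns (kept_paths, discarded_paths).
    """
    kept, pending, discarded = list(kept), list(left_out), []
    x = 1                                   # 2.a
    while pending:                          # 2.d
        batch = pending[:x]                 # 2.b
        if solvable(kept + batch):          # 2.c: still solvable, commit
            kept += batch
            del pending[:len(batch)]
            x *= 2
        elif x == 1:                        # a single path breaks solvability
            discarded.append(pending.pop(0))
        else:                               # batch too large, retry one by one
            x = 1
    return kept, discarded
```

Every iteration either consumes at least one pending path or resets x to 1, after which the next iteration consumes one, so the loop terminates.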
38comparison with the SARK paper
recomputed
telnet sources
web sources
39issues
- how to determine peer-peer relationships once the graph is oriented?
- each AS-path has weight one, irrespective of the size of its prefix; to what extent is this correct?
- could snapshots of the same BGP table taken at different dates help in better understanding the relationships between ASes?
- what if we knew in advance the orientation of some of the edges?
40questions?
41computing a solution without invalid paths
- consider the directed acyclic graph of the strongly connected components of the graph of the literals
- compute a topological sorting of the strongly connected components and assign an integer to each component
- call f(x) the index of the component to which the literal x belongs
- a true value is assigned to variable x if $f(x) > f(\lnot x)$, false otherwise
[figure: components numbered 1, 2, 3, 4 in topological order]
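The last rule is short enough to sketch directly. The function below takes the map f as input; the numbering used in the example was worked out by hand for the four-clause formula $(x_{1,2} \lor x_{2,3}) \land (\lnot x_{2,3} \lor \lnot x_{3,4}) \land (x_{3,4} \lor \lnot x_{4,5}) \land (x_{4,5} \lor \lnot x_{2,3})$, whose graph of the literals is acyclic, so every literal is its own component. Names and the `(variable, polarity)` encoding are illustrative.

```python
def assignment_from_order(f):
    """f maps each literal (variable, polarity) to the topological index of its component."""
    result = {}
    for var in {v for v, _ in f}:
        pos, neg = f[(var, True)], f[(var, False)]
        if pos == neg:
            raise ValueError(f"{var} and its negation share a component: no solution")
        result[var] = pos > neg   # true iff f(x) > f(not x)
    return result

# Hand-derived topological numbering (every arc goes from a lower to a higher index).
f = {
    ("x12", False): 1, ("x23", True): 2, ("x45", True): 3, ("x34", True): 4,
    ("x34", False): 5, ("x45", False): 6, ("x23", False): 7, ("x12", True): 8,
}
values = assignment_from_order(f)
print(values)
```

The resulting assignment sets x1,2 to true and the other three variables to false, which satisfies all four clauses.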
42degrees of freedom in determining the peering
relationships
- need for more semantic information for
determining peer-peer relationships
43conclusions
- we show that the ToR-problem is NP-hard but
- if a solution without invalid paths exists, it can be found in linear time by mapping the problem to 2SAT
- we propose heuristics for the general problem based on the 2SAT mapping
- we show the effectiveness of the heuristics against publicly available data sets
- the experiments show that our heuristics perform significantly better than state-of-the-art heuristics
- we show that discovering peer-peer relationships is a hard problem if one wants to maximize them
44acknowledgements
- Subramanian, Agarwal, Rexford, and Katz for sharing their data and explanations
- Thomas Erlebach, Alexander Hall, and Thomas Schank for their insight
- Debora Donato, Andrea Vitaletti for useful discussions
- Massimo Rimondini for computing statistics on the graphs
45discovering the peer-peer relationships
given an oriented graph, discovering peer-peer
relationships is a hard problem if you want to
maximize them (MAX-INDEPENDENT-SET can be mapped
to it)
[figure: example with vertices v1, ..., v6 and labels 1, ..., 6]
46the experimental setting
- same setting used by SARK 2002
- ten telnet looking glasses are selected as sources of data
- all the AS-paths put together are used to test the algorithms
- four web sources are used to validate the output
47a deeper comparison
48graph of the differences
we inserted one undirected edge (and the two end
vertices if needed) into the graph of the
differences for every oppositely directed edge
49links
- Lixin Gao, On Inferring Autonomous System Relationships in the Internet, http://www-unix.ecs.umass.edu/lgao/
- Subramanian, Agarwal, Rexford, and Katz, Characterizing the Internet Hierarchy from Multiple Vantage Points, http://www.cs.berkeley.edu/sagarwal/research/BGP-hierarchy/
- Erlebach, Hall, and Schank, Classifying Customer-Provider Relationships in the Internet, http://www.tik.ee.ethz.ch/the/
- website dedicated to the algorithms described in this presentation: http://www.dia.uniroma3.it/compunet/relationships/
51the routing process
52routing protocols
- routing protocols are used to automatically update the routing tables
- they fall into two main categories
- link-state routing protocols
- approach: send information about your neighbors to everyone
- each router reconstructs the whole network and computes a shortest path tree to all destinations
- examples: is-is, ospf
- distance-vector routing protocols
- approach: send all your information to your neighbors
- each router updates its routing table based on those of its neighbors
- examples: rip
53flat routing a single routing algorithm
involving all the organizations
[figure: network of routers (router 1, router 2, router 3, router 5) whose routing tables map each lan to a next hop, e.g. send to router 6, send to router 9]
- it was this way before 1984
- problems: slow to converge, difficult to deploy a new routing algorithm, difficult to debug and configure, does not take into account the ownership of the links
54hierarchical routing: more than one routing algorithm involved
- approach
- each organization runs a local routing algorithm by using an interior gateway protocol (igp)
- external targets are injected into the local routing algorithm
55route injection
[figure: routing table entry for a lan]
- the interior routing algorithm spreads into the network the local targets as well as the remote ones injected by border routers
- border routers are aware of remote targets since they speak a higher-level routing protocol with other border routers
56exterior gateway protocol
57exterior gateway protocol
- approach
- hide the interior part of all organizations
58represent the internal targets
- each border router represents its internal
targets as if they were local
59simplify the graph
- consider the (external and internal) reachability
of the routers
- the graph is actually managed through tcp
connections called peerings
60solve the routing problem
- solve the routing problem on the simplified graph
- based on political considerations
61border gateway protocol
- bgp is an exterior gateway protocol: it keeps the routing tables updated and propagates routing information
- takes into account the willingness of the organizations to cooperate in the routing process (commercial agreements, local preferences, priorities, legal issues, ...)