Traffic Engineering for ISP Networks - PowerPoint PPT Presentation

About This Presentation
Title:

Traffic Engineering for ISP Networks

Description:

Traffic Engineering for ISP Networks Jennifer Rexford Computer Science Department Princeton University http://www.cs.princeton.edu/~jrex A Challenge in ISP Backbone ... – PowerPoint PPT presentation


Transcript and Presenter's Notes

Title: Traffic Engineering for ISP Networks


1
Traffic Engineering for ISP Networks
  • Jennifer Rexford
  • Computer Science Department
  • Princeton University
  • http://www.cs.princeton.edu/~jrex

2
A Challenge in ISP Backbone Networks
  • Finding a good way to route the data packets
  • Given the current network topology and offered
    traffic
  • For good performance and efficient use of
    resources

3
Why Is the Problem Hard?
  • IP traffic varies, and the service is best effort
  • The offered traffic is not known in advance
  • The resources in the network are not reserved
  • The routers do not adapt on their own
  • Load-sensitive routing is not widely deployed
  • Due to control overhead and stability challenges
  • Routing protocols were not designed to be managed
  • At best indirect control over the flow of traffic
  • Fine-grain traffic measurements often unavailable
  • E.g., only have coarse-grain link load statistics

4
In This Talk
  • TE with traditional IP routing protocols
  • Shortest-path protocols with configurable link
    weights
  • Two main research challenges
  • Optimization: tuning link weights to the offered
    traffic
  • Tomography: inferring the offered traffic from
    link load
  • Deployed solutions in AT&T's U.S. backbone
  • Our experiences working with the network
    operators
  • And how we improved the tools over time
  • Ongoing research on traffic management

5
Optimization: Tuning Link Weights

6
Routing Inside an Internet Service Provider
  • Routers flood information to learn the topology
  • Routers determine next hop to reach other
    routers
  • By computing shortest paths based on the link
    weights
  • Routers forward packets via the next hop link(s)
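
As a rough illustration of the next-hop computation described above, the sketch below runs Dijkstra's algorithm over a small hypothetical topology. The router names, the weights, and the function name `next_hops` are assumptions made for this example, and it records only one next hop per destination, whereas real routers may keep several equal-cost next hops.

```python
import heapq

def next_hops(graph, source):
    """Return ({destination: first-hop neighbor}, {destination: distance}).

    graph maps each router to {neighbor: link weight}; weights must be positive.
    """
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry, a shorter route was already found
        for nbr, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # A direct neighbor is its own first hop; otherwise inherit it.
                first_hop[nbr] = nbr if node == source else first_hop[node]
                heapq.heappush(heap, (nd, nbr))
    return first_hop, dist

# Hypothetical four-router topology with symmetric link weights.
graph = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 3},
    "D": {"B": 1, "C": 3},
}
hops, dist = next_hops(graph, "A")
```

Here router A reaches D through B, because the path A-B-D has total weight 2 while A-C-D has total weight 4.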

[Figure: example topology annotated with link weights]

7
Link Weights Control the Flow of Traffic
  • Routers compute paths
  • Shortest paths as sum of link weights
  • Operators set the link weights
  • To control where the traffic goes

[Figure: example topology annotated with link weights]

8
Heuristics for Setting the Link Weights
  • Proportional to physical distance
  • Cross-country links have higher weights than
    local ones
  • Minimizes end-to-end propagation delay
  • Inversely proportional to link capacity
  • Smaller weights for higher-bandwidth links
  • Attracts more traffic to links with more capacity
  • Tuned based on the offered traffic
  • Network-wide optimization of weights based on
    traffic
  • Directly minimizes key metrics like max link
    utilization
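
The inverse-capacity heuristic can be sketched in a few lines. The function name, the choice of the largest capacity as the reference value, and the example capacities are all assumptions for illustration, not settings taken from the talk.

```python
def inverse_capacity_weights(capacity_mbps, reference_mbps=None):
    """Give each link an integer weight inversely proportional to its capacity."""
    if reference_mbps is None:
        # Assumption: scale so that the fastest link gets weight 1.
        reference_mbps = max(capacity_mbps.values())
    return {
        link: max(1, round(reference_mbps / cap))
        for link, cap in capacity_mbps.items()
    }

# Hypothetical link capacities in Mbps.
capacities = {("A", "B"): 10000, ("B", "C"): 2500, ("A", "C"): 622}
weights = inverse_capacity_weights(capacities)
```

With these numbers the 10 Gbps link gets weight 1, the 2.5 Gbps link weight 4, and the 622 Mbps link weight 16, so shortest paths prefer the higher-bandwidth links.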

9
Why Are the Link Weights Static?
  • Strawman alternative: load-sensitive routing
  • Link metrics based on traffic load
  • Flood dynamic metrics as they change
  • Adapt automatically to changes in offered load
  • Reasons why this is typically not done
  • Delay-based routing unsuccessful in the early
    days
  • Oscillation as routers adapt to out-of-date
    information
  • Most Internet transfers are very short-lived
  • Research and standards work continues
  • But operators have to work with what they have

10
Big Picture: Measure, Model, and Control
[Figure: control loop. Measurements of the operational network (offered traffic, topology/configuration) feed a network-wide "what if" model, which drives control changes to the network.]

11
Traffic Engineering in an ISP Backbone
  • Topology
  • Connectivity and capacity of routers and links
  • Traffic matrix
  • Offered load between points in the network
  • Link weights
  • Configurable weights for shortest-path routing
  • Performance objective
  • Balanced load, low latency, service level
    agreements
  • Question: Given the topology and traffic matrix
    in an IP network, which link weights should be
    used?

12
Key Ingredients of Our Approach
  • Measurement
  • Topology: monitoring of the routing protocols
  • Traffic matrix: widely deployed traffic
    measurement
  • Network-wide models
  • Representations of topology and traffic
  • What-if models of shortest-path routing
  • Network optimization
  • Efficient algorithms to find good configurations
  • Operational experience to identify key
    constraints

13
Formalizing the Optimization Problem
  • Input: graph $G(R,L)$
  • $R$ is the set of routers
  • $L$ is the set of unidirectional links
  • $c_l$ is the capacity of link $l$
  • Input: traffic matrix
  • $M_{i,j}$ is the traffic load from router $i$ to $j$
  • Output: setting of the link weights
  • $w_l$ is the weight on unidirectional link $l$
  • $P_{i,j,l}$ is the fraction of traffic from $i$ to $j$
    traversing link $l$

14
Multiple Shortest Paths With Even Splitting
[Figure: values of $P_{i,j,l}$ when traffic splits evenly over multiple shortest paths]
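
One way to compute the split fractions for a single source-destination pair is sketched below. It assumes positive integer weights and an even split over all shortest-path next hops at every router. The function names and the diamond topology are hypothetical.

```python
import heapq
from collections import defaultdict

def dist_to(graph, dest):
    """Shortest distance from every router to dest (Dijkstra on reversed links)."""
    rev = defaultdict(dict)
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            rev[v][u] = w
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue
        for prev, w in rev[node].items():
            if d + w < dist.get(prev, float("inf")):
                dist[prev] = d + w
                heapq.heappush(heap, (d + w, prev))
    return dist

def split_fractions(graph, src, dest):
    """Return {(u, v): fraction of src->dest traffic carried on link (u, v)}."""
    dist = dist_to(graph, dest)
    if src not in dist:
        return {}
    inflow = defaultdict(float)
    inflow[src] = 1.0
    frac = {}
    # Visit routers from farthest to nearest, so every router has received
    # all of its inflow before it splits that inflow over its next hops.
    for u in sorted(dist, key=dist.get, reverse=True):
        if u == dest or inflow[u] == 0.0:
            continue
        hops = [v for v, w in graph[u].items()
                if v in dist and dist[u] == w + dist[v]]
        for v in hops:
            frac[(u, v)] = inflow[u] / len(hops)
            inflow[v] += inflow[u] / len(hops)
    return frac

# Hypothetical diamond: two equal-cost paths from A to D.
graph = {"A": {"B": 1, "C": 1}, "B": {"D": 1}, "C": {"D": 1}, "D": {}}
fractions = split_fractions(graph, "A", "D")
```

In the diamond, both A-B-D and A-C-D have cost 2, so each of the four links carries half of the A-to-D traffic.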
15
Defining the Objective Function
  • Computing the link utilization
  • Link load: $u_l = \sum_{i,j} M_{i,j} P_{i,j,l}$
  • Utilization: $u_l / c_l$
  • Objective functions
  • $\min \max_l (u_l / c_l)$
  • $\min \sum_l f(u_l / c_l)$
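
A small worked example of these formulas follows. The traffic matrix, split fractions, and capacities are made up, and the penalty function f is taken to be a simple square purely as a placeholder. The slide only says that f is a function of utilization, so its real shape is not specified here.

```python
def link_loads(traffic, fractions):
    """u_l = sum over pairs (i, j) of M[i, j] * P[i, j, l]."""
    loads = {}
    for pair, demand in traffic.items():
        for link, frac in fractions.get(pair, {}).items():
            loads[link] = loads.get(link, 0.0) + demand * frac
    return loads

# Hypothetical inputs: M[i, j], P[i, j, l], and c_l.
traffic = {("A", "D"): 8.0, ("B", "D"): 2.0}
fractions = {
    ("A", "D"): {("A", "B"): 0.5, ("A", "C"): 0.5,
                 ("B", "D"): 0.5, ("C", "D"): 0.5},
    ("B", "D"): {("B", "D"): 1.0},
}
capacity = {("A", "B"): 10.0, ("A", "C"): 10.0,
            ("B", "D"): 10.0, ("C", "D"): 5.0}

loads = link_loads(traffic, fractions)
utilization = {l: loads.get(l, 0.0) / c for l, c in capacity.items()}

max_util = max(utilization.values())                   # first objective
sum_penalty = sum(x ** 2 for x in utilization.values())  # second, with f(x) = x^2 assumed
```

Link (B, D) carries 4 + 2 = 6 units, but the most utilized link is (C, D), which carries 4 units against a capacity of 5.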

16
Complexity of the Optimization Problem
  • NP-hard optimization problem
  • No efficient algorithm to find the link weights
  • Even for the simple convex objective functions
  • Why can't we just do multi-commodity flow?
  • E.g., solve the multi-commodity flow problem
  • and the link weights pop out as the dual
  • Because IP routers cannot split arbitrarily over
    ties
  • What are the implications?
  • Have to resort to searching through weight
    settings

17
Optimization Based on Local Search
  • Start with an initial setting of the link weights
  • E.g., same integer weight on every link
  • E.g., weights inversely proportional to link
    capacity
  • E.g., existing weights in the operational network
  • Compute the objective function
  • Compute the all-pairs shortest paths to get
    $P_{i,j,l}$
  • Apply the traffic matrix $M_{i,j}$ to get link loads
    $u_l$
  • Evaluate the objective function from the $u_l / c_l$
  • Generate a new setting of the link weights, and repeat

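
The loop above can be sketched as a simple hill climb that changes one weight at a time and keeps any change that lowers the maximum utilization. This is a simplified stand-in, not the search used in the deployed tools. The function names, the small weight range, and the example network are assumptions.

```python
import heapq
from collections import defaultdict

def max_utilization(weights, capacity, traffic):
    """Route traffic on even-split shortest paths; return max over l of u_l / c_l."""
    out, rev = defaultdict(list), defaultdict(list)
    for (u, v) in weights:
        out[u].append(v)
        rev[v].append(u)
    load = defaultdict(float)
    for dest in {j for (_, j) in traffic}:
        dist, heap = {dest: 0}, [(0, dest)]
        while heap:  # Dijkstra toward dest over reversed links
            d, node = heapq.heappop(heap)
            if d > dist[node]:
                continue
            for prev in rev[node]:
                nd = d + weights[(prev, node)]
                if nd < dist.get(prev, float("inf")):
                    dist[prev] = nd
                    heapq.heappush(heap, (nd, prev))
        inflow = defaultdict(float)
        for (i, j), demand in traffic.items():
            if j == dest:
                inflow[i] += demand
        for u in sorted(dist, key=dist.get, reverse=True):
            if u == dest or inflow[u] == 0.0:
                continue
            hops = [v for v in out[u]
                    if v in dist and dist[u] == weights[(u, v)] + dist[v]]
            for v in hops:
                load[(u, v)] += inflow[u] / len(hops)
                inflow[v] += inflow[u] / len(hops)
    return max(load[l] / capacity[l] for l in weights)

def local_search(weights, capacity, traffic, max_weight=5):
    """Repeatedly try single-weight changes; keep each one that improves the cost."""
    best = dict(weights)
    best_cost = max_utilization(best, capacity, traffic)
    links = list(weights)
    improved = True
    while improved:
        improved = False
        for link in links:
            for w in range(1, max_weight + 1):
                if w == best[link]:
                    continue
                candidate = dict(best)
                candidate[link] = w
                cost = max_utilization(candidate, capacity, traffic)
                if cost < best_cost:
                    best, best_cost, improved = candidate, cost, True
    return best, best_cost

# Hypothetical diamond where the initial weights put all traffic on A-B-D.
start = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 2, ("C", "D"): 2}
capacity = {link: 10.0 for link in start}
traffic = {("A", "D"): 10.0}
start_cost = max_utilization(start, capacity, traffic)
best, best_cost = local_search(start, capacity, traffic)
```

In this example a single weight change makes the two paths equal in cost, so the traffic splits evenly and the maximum utilization drops from 1.0 to 0.5.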
18
Making the Search Efficient
  • Avoid repeating the same weight setting
  • Keep track of past values of the weight setting
  • or keep a small signature (e.g., a hash) of
    past values
  • Do not evaluate a weight setting if signatures
    match
  • Avoid computing the shortest paths from scratch
  • Explore weight settings that change just one
    weight
  • Apply fast incremental shortest-path algorithms
  • Limit the number of unique values of link weights
  • Do not explore all 2^16 possible values for each
    weight
  • Stop early, before exploring the whole search
    space
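
The signature idea can be sketched with a hash of a canonical form of the weight setting. The helper names and the choice of SHA-1 truncated to 16 hex digits are assumptions for illustration. Any compact hash would serve, at the cost of a small chance that two different settings collide and one is skipped.

```python
import hashlib

def signature(weights):
    """Compact fingerprint of a weight setting, independent of dict order."""
    canonical = repr(sorted(weights.items())).encode()
    return hashlib.sha1(canonical).hexdigest()[:16]

seen = set()

def is_new(weights):
    """Return True the first time a weight setting is offered, False afterwards."""
    sig = signature(weights)
    if sig in seen:
        return False
    seen.add(sig)
    return True

first = is_new({("A", "B"): 1, ("B", "C"): 2})
repeat = is_new({("B", "C"): 2, ("A", "B"): 1})   # same setting, different order
changed = is_new({("A", "B"): 1, ("B", "C"): 3})  # one weight differs
```

A search loop would call `is_new` before evaluating a candidate and skip the expensive shortest-path computation whenever it returns False.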

19
Incorporating Operational Realities
  • Minimize number of changes to the network
  • Changing just 1 or 2 link weights is often enough
  • Tolerate failure of network equipment
  • Weight settings usually remain good after
    failure
  • or can be fixed by changing one or two weights
  • Limit dependence on measurement accuracy
  • Good weights remain good, despite random noise
  • Limit frequency of changes to the weights
  • Joint optimization for day and night traffic
    matrices

20
Application to AT&T's Backbone Network
  • Performance of the optimized weights
  • Search finds a good solution within a few minutes
  • Much better than link capacity or physical
    distance
  • Competitive with multi-commodity flow solution
  • How AT&T changes the link weights
  • Maintenance done every night from midnight to 6am
  • Predict effects of removing link(s) from the
    network
  • Reoptimize the link weights to avoid congestion
  • Configure new weights before disabling equipment

21
Example from My Visit to AT&T's Operations Center
  • Amtrak repairing/moving part of the train track
  • Need to move some of the fiber optic cables
  • Or, heightened risk of the cables being cut
  • Amtrak notifies us of the time the work will be
    done
  • AT&T engineers model the effects
  • Determine which IP links go over the affected
    fiber
  • Pretend the network no longer has these links
  • Evaluate the new shortest paths and traffic flow
  • Identify whether link loads will be too high
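
The what-if steps above can be sketched as follows: drop the affected links, recompute even-split shortest-path routing, and report any link above a utilization threshold. The 0.8 threshold, the function names, and the example network are assumptions, not values from the operational tools.

```python
import heapq
from collections import defaultdict

def link_loads(weights, traffic):
    """Loads under even-split shortest-path routing; weights maps (u, v) -> weight."""
    out, rev = defaultdict(list), defaultdict(list)
    for (u, v) in weights:
        out[u].append(v)
        rev[v].append(u)
    load = defaultdict(float)
    for dest in {j for (_, j) in traffic}:
        dist, heap = {dest: 0}, [(0, dest)]
        while heap:  # Dijkstra toward dest over reversed links
            d, node = heapq.heappop(heap)
            if d > dist[node]:
                continue
            for prev in rev[node]:
                nd = d + weights[(prev, node)]
                if nd < dist.get(prev, float("inf")):
                    dist[prev] = nd
                    heapq.heappush(heap, (nd, prev))
        inflow = defaultdict(float)
        for (i, j), demand in traffic.items():
            if j == dest:
                inflow[i] += demand
        for u in sorted(dist, key=dist.get, reverse=True):
            if u == dest or inflow[u] == 0.0:
                continue
            hops = [v for v in out[u]
                    if v in dist and dist[u] == weights[(u, v)] + dist[v]]
            for v in hops:
                load[(u, v)] += inflow[u] / len(hops)
                inflow[v] += inflow[u] / len(hops)
    return load

def overloaded_links(weights, capacity, traffic, failed=(), threshold=0.8):
    """Pretend the failed links are gone; return {link: utilization} above threshold."""
    remaining = {l: w for l, w in weights.items() if l not in failed}
    load = link_loads(remaining, traffic)
    return {l: load[l] / capacity[l] for l in remaining
            if load[l] / capacity[l] > threshold}

# Hypothetical diamond with unit weights and 10-unit links.
weights = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 1, ("C", "D"): 1}
capacity = {link: 10.0 for link in weights}
traffic = {("A", "D"): 12.0}

before = overloaded_links(weights, capacity, traffic)
after = overloaded_links(weights, capacity, traffic, failed={("A", "B")})
```

With both paths available each link runs at 60 percent. Once link (A, B) is removed, all 12 units shift to A-C-D and both of its links exceed capacity, which is the signal to reoptimize the weights before the maintenance window.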

22
Example Continued
  • If load will be too high
  • Reoptimize the weights on the remaining links
  • Schedule the time for the new weights to be
    configured
  • Roll back to the old weight setting after Amtrak
    is done
  • Same process applied to other cases
  • Assessing the network's risk to possible failures
  • Planning for maintenance of existing equipment
  • Adapting the link weights to installation of new
    links
  • Adapting the link weights in response to traffic
    shifts

23
Conclusions on Traffic Engineering
  • IP networks do not adapt on their own
  • Routers compute shortest paths based on static
    weights
  • Service providers need to adapt the weights
  • Due to failures, congestion, or planned
    maintenance
  • Leads to an interesting optimization problem
  • Optimize link weights based on topology and
    traffic
  • Optimization problem is computationally difficult
  • Forces the use of efficient local-search
    techniques
  • Results of the local search are pretty good
  • Near-optimal solutions that minimize disruptions

24
Extensions
  • Robust link-weight assignments
  • Link/node failures
  • Range of traffic matrices
  • More complex routing models
  • Destinations reachable via multiple egress
    points
  • Interdomain routing policies
  • Interaction between ISPs
  • Inter-ISP negotiation for joint optimization
  • Grappling with scalability and trust issues

25
Tomography: Inferring the Traffic Matrix

26
Computing the Traffic Matrix $M_{i,j}$
  • Hard to measure the traffic matrix
  • IP networks transmit data as individual packets
  • Routers do not keep traffic statistics, except
    link utilization on (say) a five-minute time
    scale
  • Need to infer the traffic matrix $M_{i,j}$ from
  • Current topology $G(R,L)$
  • Current routing $P_{i,j,l}$
  • Current link load $u_l$
  • Link capacity $c_l$

27
Inference: Network Tomography
[Figure: from link counts to the traffic matrix; sources and destinations with link counts of 3 Mbps, 5 Mbps, 4 Mbps, and 4 Mbps]

28
Tomography: Formalizing the Problem
  • Ingress-egress pairs
  • $p$ is an ingress-egress pair of nodes $(i,j)$
  • $x_p$ is the (unknown) traffic volume for this pair,
    $M_{i,j}$
  • Routing
  • $P_{l,p}$ is the proportion of $p$'s traffic that traverses $l$
  • Links in the network
  • $l$ is a unidirectional edge
  • $u_l$ is the observed traffic volume on this link
  • Relationship: $u = Px$ (work backwards to get $x$)
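
To make the relationship concrete, the sketch below builds the routing matrix for a hypothetical three-router line A, B, C and computes the link counts that a known traffic vector would produce. The topology and the volumes are invented for illustration. Tomography is the reverse direction, where only the link counts are observed.

```python
def link_counts(P, x):
    """u = P x, with one row of P per link and one column per ingress-egress pair."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

# Columns: ingress-egress pairs. Rows: unidirectional links.
pairs = [("A", "C"), ("B", "C"), ("A", "B")]
links = ["A->B", "B->C"]
P = [
    [1, 0, 1],  # A->B carries the (A,C) and (A,B) traffic
    [1, 1, 0],  # B->C carries the (A,C) and (B,C) traffic
]
x = [2.0, 3.0, 1.0]  # hypothetical volumes for the three pairs

u = link_counts(P, x)
```

Two link counts are observed for three unknown pair volumes, which already hints at why a single observation cannot pin down the traffic matrix.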

29
Tomography: One Observation Is Not Enough
  • Linear system for a network of $n$ nodes is underdetermined
  • Number of links $e$ is around $O(n)$
  • Number of ingress-egress pairs $c$ is $O(n^2)$
  • Dimension of solution sub-space at least $c - e$
  • Multiple observations are needed
  • k independent observations (over time)
  • Stochastic model with Poisson iid counts
  • Maximum likelihood estimation to infer matrix
  • Doesn't work all that well in practice
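
The ambiguity is easy to see on a tiny example. The sketch below assumes two sources sending 3 and 5 Mbps and two destinations receiving 4 and 4 Mbps, with one measured link per source and per destination. This layout is an assumed reading of the numbers in the tomography figure, not a confirmed one.

```python
def link_counts(P, x):
    """u = P x, with one row of P per link and one column per pair."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

# Unknowns: x = [S1->D1, S1->D2, S2->D1, S2->D2]
P = [
    [1, 1, 0, 0],  # link out of source S1
    [0, 0, 1, 1],  # link out of source S2
    [1, 0, 1, 0],  # link into destination D1
    [0, 1, 0, 1],  # link into destination D2
]
observed = [3, 5, 4, 4]  # Mbps on the four links

candidate_a = [1, 2, 3, 2]
candidate_b = [3, 0, 1, 4]

counts_a = link_counts(P, candidate_a)
counts_b = link_counts(P, candidate_b)
```

Both candidates reproduce the observed link counts exactly, so the link loads alone cannot tell them apart.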

30
Approach Used at AT&T: Tomo-gravity
  • Gravitational assumption
  • Ingress point $a$ has traffic $v_i^a$
  • Egress point $b$ has traffic $v_e^b$
  • Pair $(a,b)$ has traffic proportional to $v_i^a \cdot v_e^b$
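
A minimal sketch of the gravity estimate follows. The slide states only proportionality, so dividing by the total egress traffic is an assumption here, chosen so that each row of the estimate sums to the ingress volume. The point names and volumes are hypothetical.

```python
def gravity_matrix(ingress, egress):
    """Estimate M[a, b] as ingress[a] * egress[b] / total traffic."""
    total = sum(egress.values())
    return {(a, b): ingress[a] * egress[b] / total
            for a in ingress for b in egress}

# Hypothetical edge measurements in Mbps.
ingress = {"a1": 3.0, "a2": 5.0}
egress = {"b1": 4.0, "b2": 4.0}

M = gravity_matrix(ingress, egress)
```

Because both egress points receive the same volume, each ingress point's traffic is split evenly: 1.5 and 1.5 Mbps from a1, and 2.5 and 2.5 Mbps from a2.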

31
Approach Used at AT&T: Tomo-gravity
  • Problem with gravity model
  • Gravity model ignores the load on the inside
    links
  • Gravity assumption isn't always 100% correct
  • Resulting traffic matrix might not satisfy the
    link loads
  • Combining the two techniques
  • Gravity: find a traffic matrix using the gravity
    model
  • Tomography: find the family of traffic matrices
    consistent with all link load statistics
  • Tomo-gravity: find the tomography solution that
    is closest to the output of the gravity model
  • Works extremely well (and fast) in practice
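
The "closest consistent solution" step can be illustrated with a simplified, unweighted projection. The sketch uses Kaczmarz iterations, which start from the gravity estimate and converge to the nearest point (in plain Euclidean distance) that satisfies $u = Px$ when the system is consistent. This is a stand-in chosen for brevity: it is not claimed to be the method in the tomo-gravity papers, it does not enforce non-negative traffic, and the matrix and numbers are invented.

```python
def closest_consistent(P, u, gravity, sweeps=200):
    """Project the gravity estimate onto {x : P x = u} by cyclic row projections."""
    x = list(gravity)
    for _ in range(sweeps):
        for row, target in zip(P, u):
            norm = sum(p * p for p in row)
            if norm == 0:
                continue  # an all-zero row constrains nothing
            residual = target - sum(p * v for p, v in zip(row, x))
            step = residual / norm
            x = [v + step * p for v, p in zip(x, row)]
    return x

# Hypothetical system: three unknown pair volumes, two measured links.
P = [[1, 1, 0],
     [0, 1, 1]]
u = [5.0, 7.0]
gravity = [2.0, 2.0, 4.0]  # violates both link constraints (4 and 6)

x = closest_consistent(P, u, gravity)
loads = [sum(p * v for p, v in zip(row, x)) for row in P]
```

The result is approximately [2.33, 2.67, 4.33]: it matches both link loads while moving as little as possible from the gravity estimate.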

32
Conclusions
  • Managing IP networks is challenging
  • Routers don't adapt on their own to congestion
  • Routers don't reveal much information about
    traffic
  • Measurement provides a network-wide view
  • Topology
  • Traffic matrix
  • Optimization enables the network to adapt
  • Inferring the traffic matrix from the link loads
  • Optimizing the link weights based on the traffic
    matrix

33
New Research Direction: Design for Manage-ability
  • Two main parts of network management
  • Control: optimization
  • Measurement: tomography
  • Two research approaches
  • Bottom up: do the best with what you have
  • Top down: design systems that are easier to
    manage
  • Design for manage-ability
  • If you are both the professor and the student,
    you create exam questions that are easy to
    answer.

34
Example: Changing the Path Computation
  • Routers split traffic over multiple paths
  • More traffic on shorter paths, less on longer
    ones
  • In proportion to the exponential of path cost
  • Exciting result
  • Can achieve optimal distribution of the traffic
  • With polynomial-time algorithm for setting the
    weights
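
A per-path version of exponential splitting can be sketched as below. It assumes each path's share is proportional to e raised to the negated path cost, which is one reading consistent with "more traffic on shorter paths". The schemes in the referenced papers differ in detail, so this is only an illustration, and the paths and costs are hypothetical.

```python
import math

def exponential_split(path_costs):
    """Share of traffic per path, proportional to exp(-cost)."""
    shortest = min(path_costs.values())
    # Subtracting the shortest cost leaves the ratios unchanged and
    # keeps the exponentials well scaled.
    raw = {p: math.exp(-(c - shortest)) for p, c in path_costs.items()}
    total = sum(raw.values())
    return {p: r / total for p, r in raw.items()}

# Hypothetical paths from A to D with their total link weights.
costs = {"A-B-D": 2, "A-C-D": 3, "A-E-F-D": 5}
shares = exponential_split(costs)
```

Under this assumption, each extra unit of path cost cuts a path's share by a factor of e, so longer paths still carry some traffic but much less than the shortest one.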

35
New Research Direction: Logically-Central Control
  • Traditional division of labor
  • Routers: real-time, distributed protocols
  • Management system: offline, centralized
    algorithms
  • Example: routing protocols and traffic
    engineering
  • Routing: routers react automatically to link
    failures
  • TE: management system sets the link weights
  • The case for separating routing from routers
  • Better decisions with network-wide visibility
  • Routers only collect measurements and forward
    packets

36
Example: Routing Control Platform (RCP)
  • Logically-centralized server
  • Collects measurement data from the network
  • Pushes forwarding tables into the routers
  • Benefits
  • Network-wide policies
  • Flexible, easy to customize
  • Fewer nodes to upgrade
  • Feasibility
  • High-end PC can compute routes for large ISP
  • Simple replication to survive failures

37
References
  • Traffic engineering using traditional protocols
  • http://www.cs.princeton.edu/~jrex/papers/ieeecomm02.pdf
  • http://www.cs.princeton.edu/~jrex/papers/opthand04.pdf
  • http://www.cs.princeton.edu/~jrex/papers/ton-whatif.pdf
  • Tomo-gravity to infer the traffic matrix
  • http://www.cs.utexas.edu/~yzhang/papers/mmi-ton05.pdf
  • http://www.cs.utexas.edu/~yzhang/papers/tomogravity-sigm03.pdf
  • http://www.cs.princeton.edu/~jrex/papers/sfi.pdf

38
References
  • Design for manage-ability
  • http://www.cs.princeton.edu/~jrex/papers/pefti.pdf
  • http://www.cs.princeton.edu/~jrex/papers/optimizability.pdf
  • http://www.cs.princeton.edu/~jrex/papers/tie-long.pdf
  • Routing Control Platform
  • http://www.cs.princeton.edu/~jrex/papers/rcp.pdf
  • http://www.cs.princeton.edu/~jrex/papers/ccr05-4d.pdf
  • http://www.cs.princeton.edu/~jrex/papers/rcp-nsdi.pdf
  • http://www.research.att.com/~kobus/docs/irscp.inm.pdf