Title: Adapting Routing to the Traffic
1. Adapting Routing to the Traffic
- COS 461: Computer Networks
- Spring 2007 (MW 1:30-2:50 in Friend 004)
- Jennifer Rexford
- Teaching Assistant: Ioannis Avramopoulos
- http://www.cs.princeton.edu/courses/archive/spring07/cos461/
2. Goals of Today's Lecture
- Challenges
- Reacting quickly to alleviate congestion
- Avoiding over-reacting and causing oscillations
- Limiting bandwidth and CPU overhead on routers
- Load-sensitive routing
- Routers adapt to link load in a distributed fashion
- At the packet level, or on groups of packets
- Traffic engineering
- Centralized computation of routing parameters
- Network-wide measurements of offered traffic
3. Do IP Networks Manage Themselves?
- TCP congestion control
- Senders react to congestion
- Decrease sending rate
- But the TCP sessions receive lower throughput
- IP routing protocols
- Routers react to failures
- Compute new paths
- But the new paths may be congested
4. Do IP Networks Manage Themselves?
- In some sense, yes
- TCP senders send less traffic during congestion
- Routing protocols adapt to topology changes
- But, does the network run efficiently?
- Congested link when idle paths exist?
- High-delay path when a low-delay path exists?
5. Adapting the Routing to the Traffic
- Goal: modify the routes to steer traffic through the network in the most effective way
- Approach 1: load-sensitive protocols
- Distribute traffic and performance measurements
- Routers compute paths based on load
- Approach 2: adaptive management system
- Collect measurements of traffic and topology
- Management system optimizes the parameters
- The right answer is still debated today
6. Load-Sensitive Routing Protocols
- Advantages
- Efficient use of network resources
- Satisfying the performance needs of end users
- Self-managing network takes care of itself
- Disadvantages
- Higher overhead on the routers
- Long alternate paths consume extra resources
- Instability from out-of-date feedback information
7. Packet-Based Load-Sensitive Routing
- Packet-based routing
- Forward packets based on forwarding table
- Load-sensitive
- Compute table entries based on load or delay
- Questions
- What link metrics to use?
- How frequently to update the metrics?
- How to propagate the metrics?
- How to compute the paths based on metrics?
8. Original ARPANET Algorithm (1969)
- Routing algorithm
- Shortest-path routing based on link metrics
- Instantaneous queue length plus a constant
- Distributed shortest-path algorithm (Bellman-Ford)
[Figure: example network with link metrics; one link is labeled "congested link"]
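A minimal sketch of the scheme on this slide, assuming a synchronous, single-process simulation of the distributed Bellman-Ford exchange. The function names, the `BIAS` constant, and the four-node example are illustrative assumptions, not details from the lecture.

```python
# Hypothetical sketch: shortest paths where each link metric is the
# instantaneous queue length plus a constant.
BIAS = 1  # assumed constant added to every link metric


def link_metric(queue_length, bias=BIAS):
    """Instantaneous queue length plus a constant."""
    return queue_length + bias


def bellman_ford(nodes, queues, dest):
    """Return (dist, next_hop) toward dest.

    queues maps a directed link (u, v) to its current queue length.
    """
    dist = {n: float("inf") for n in nodes}
    next_hop = {n: None for n in nodes}
    dist[dest] = 0
    # Each round stands in for routers exchanging distance vectors.
    for _ in range(len(nodes) - 1):
        changed = False
        for (u, v), q in queues.items():
            candidate = link_metric(q) + dist[v]
            if candidate < dist[u]:
                dist[u], next_hop[u] = candidate, v
                changed = True
        if not changed:
            break
    return dist, next_hop


nodes = ["A", "B", "C", "D"]
queues = {
    ("A", "B"): 0, ("B", "D"): 19,  # B->D has a long queue (congested)
    ("A", "C"): 1, ("C", "D"): 2,
}
dist, next_hop = bellman_ford(nodes, queues, "D")
print(dist["A"], next_hop["A"])  # 5 C
```

In this toy input, router A avoids the congested B->D link and forwards via C, which is the load-sensitive behavior the slide describes.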
9. Performance of ARPANET Algorithm
- Light load
- Delay dominated by transmission and propagation
- So, link metrics don't fluctuate much
- Medium load
- Queuing delay is no longer negligible
- Moderate traffic shifts to avoid congestion
- Heavy load
- Very high metrics on congested links
- Busy links look bad to all of the routers
- All routers avoid the busy links
- Routers may send packets on longer paths
10. Problem: Out-of-Date Information
- Routers make decisions based on old information
- Propagation delay in flooding link metrics
- Thresholds applied to limit number of updates
- Old information leads to bad decisions
- All routers avoid the congested links
- ... leading to congestion on other links
- ... and the whole thing repeats
11. Problem: Frequent Updates
- Update messages
- Link keeps track of its metric (e.g., queuing delay)
- Link transmits updates when the metric changes
- Frequency of updates
- Frequent changes to the metric lead to frequent updates
- Significantly increases the overhead of the protocol
- Oscillation makes the problem worse
- Oscillation leads to wild swings in the link metrics
- Forcing very frequent update messages
- ... that add to the load on the links in the network
12. Second ARPANET Algorithm (1979)
- Link-state protocol
- Old: Distributed path computation leads to loops
- New: Better to flood metrics and have each router compute the shortest paths
- Averaging of the link metric over time
- Old: Instantaneous delay fluctuates a lot
- New: Averaging reduces the fluctuations
- Reduce frequency of updates
- Old: Sending updates on each change is too much
- New: Send updates if change passes a threshold
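The averaging and threshold ideas on this slide can be sketched as follows. The slide only says "averaging" and "threshold", so the exponentially weighted moving average, the absolute threshold, and the class and parameter names are assumptions made for illustration.

```python
class LinkMetricReporter:
    """Hypothetical helper: smooth a delay metric and suppress small updates."""

    def __init__(self, alpha=0.25, threshold=2.0):
        self.alpha = alpha          # assumed weight given to each new sample
        self.threshold = threshold  # assumed absolute change that triggers an update
        self.average = None
        self.last_advertised = None

    def sample(self, delay):
        """Fold in one delay sample; return the metric to flood, or None."""
        if self.average is None:
            self.average = delay
        else:
            self.average = (1 - self.alpha) * self.average + self.alpha * delay
        first = self.last_advertised is None
        if first or abs(self.average - self.last_advertised) > self.threshold:
            self.last_advertised = self.average
            return self.average
        return None


reporter = LinkMetricReporter()
updates = [reporter.sample(d) for d in [10, 12, 9, 30, 30, 30]]
print(updates)
```

With these inputs the small fluctuations (12, 9) produce no update, while the sustained jump to 30 produces a series of gradually rising advertisements instead of one wild swing.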
13. Problem of Long Alternate Paths
- Picking alternate paths
- Long path chosen by one router consumes resources that other packets could have used
- Leads other routers to pick other alternate paths
- Solution: limit path length
- Bound the value of the link metric
- "This link is busy enough to go two extra hops"
- Extreme case
- Limit path selection to the shortest paths
- Pick least-loaded shortest path in the network
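One way to read "bound the value of the link metric" is to cap how expensive a busy link can look relative to an idle one. The linear mapping from utilization to cost, and the names `bounded_metric`, `idle_cost`, and `max_extra_hops`, are assumptions for this sketch rather than the ARPANET's actual formula.

```python
def bounded_metric(utilization, idle_cost=1.0, max_extra_hops=2):
    """Map utilization to a cost between idle_cost and (1 + max_extra_hops) * idle_cost.

    A fully loaded link then looks no worse than a detour that is
    max_extra_hops idle hops longer, which limits alternate-path length.
    """
    u = min(max(utilization, 0.0), 1.0)  # clamp to [0, 1]
    return idle_cost * (1 + max_extra_hops * u)


print(bounded_metric(0.0), bounded_metric(0.5), bounded_metric(1.0))  # 1.0 2.0 3.0
```

Setting `max_extra_hops=0` gives the extreme case on the slide: every link costs the same, so only shortest (hop-count) paths are ever used.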
14. Load-Sensitive Routing
- Timescales
- What timescale of routing decisions?
- What timescale of feedback about link loads?
- Load-sensitive routing at packet level
- Routers receive feedback on load and delay
- Routers re-compute their forwarding tables
- Fundamental problems with oscillation
- Load-sensitive routing for groups of packets
- Routers receive feedback on load and delay
- Routers compute a path for the next flow or circuit
- Less oscillation, as long as circuits last for a while
15. Reducing Effects of Out-of-Date Info
- Send link metrics more often
- But, leads to higher overhead
- But, propagation delay is a fundamental limit
- Make the traffic last longer
- Route on groups of packets, rather than packets
- Fewer routing decisions, and more accurate feedback
- Groups of packets
- Telephone network: phone call (3 minutes long)
- Internet: TCP connection (10 packets long)
- Internet: all traffic between a pair of hosts, or routers, ...
More when we talk about circuit switching later in the course.
16. Traffic Engineering as a Network-Management Problem: Case Study
17. Using Traditional Routing Protocols
- Routers flood information to learn topology
- Determine next hop to reach other routers
- Compute shortest paths based on link weights
- Link weights configured by network operator
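As a rough illustration of how a router turns operator-configured link weights into next hops, here is a minimal Dijkstra sketch. The function name, the dictionary-based topology, and the example weights are assumptions; real link-state routers also handle flooding, timers, and equal-cost ties, which are omitted here.

```python
import heapq


def shortest_path_next_hops(weights, source):
    """Dijkstra over weights[(u, v)] -> weight; returns (dist, first_hop) from source."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, [])
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, None)]
    done = set()
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if hop is not None:
            first_hop[node] = hop  # neighbor of source on the chosen path
        for nbr, w in adj[node]:
            nd = d + w
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr, nbr if node == source else hop))
    return dist, first_hop


# Hypothetical weights, as a network operator might configure them.
weights = {("A", "B"): 1, ("B", "D"): 5, ("A", "C"): 2, ("C", "D"): 2, ("B", "C"): 1}
dist, first_hop = shortest_path_next_hops(weights, "A")
print(dist["D"], first_hop["D"])  # 4 C
```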
18. Approaches for Setting the Link Weights
- Conventional static heuristics
- Proportional to physical distance
- Cross-country links have higher weights
- Minimizes end-to-end propagation delay
- Inversely proportional to link capacity
- Smaller weights for higher-bandwidth links
- Attracts more traffic to links with more capacity
- Tune the weights based on the offered traffic
- Network-wide optimization of the link weights
- Directly minimize metrics like max link
utilization
19. Example of Tuning the Link Weights
- Problem: congestion along the pink path
- Second or third link on the path is overloaded
- Solution: move some traffic to the bottom path
- E.g., by decreasing the weight of the second link
[Figure: example network with link weights, showing the pink path and the bottom path]
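20. Measure, Model, and Control
[Diagram: the operational network is measured to obtain the topology/configuration and the offered traffic; these feed a network-wide "what if" model; the model produces changes to the network, which are applied (control) to the operational network]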
21. Traffic Engineering Problem
- Topology
- Connectivity and capacity of routers and links
- Traffic matrix
- Offered load between points in the network
- Link weights
- Configurable parameters for routing protocol
- Performance objective
- Balanced load, low latency, service level agreements
- Question: Given the topology and traffic matrix, which link weights should be used?
22. Key Ingredients of the Approach
- Instrumentation
- Topology: monitoring of the routing protocols
- Traffic matrix: fine-grained traffic measurement
- Network-wide models
- Representations of topology and traffic
- What-if models of shortest-path routing
- Network optimization
- Efficient algorithms to find good configurations
- Operational experience to identify key
constraints
23. Formalizing the Optimization Problem
- Input: graph $G(R, L)$
- $R$ is the set of routers
- $L$ is the set of unidirectional links
- $c_l$ is the capacity of link $l$
- Input: traffic matrix
- $M_{i,j}$ is the load from router $i$ to router $j$
- Output: setting of the link weights
- $w_l$ is the weight on unidirectional link $l$
- $P_{i,j,l}$ is the fraction of traffic from $i$ to $j$ traversing link $l$
24. Multiple Shortest Paths: Even Splitting
[Figure: values of $P_{i,j,l}$]
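A minimal sketch of how the fractions for one source-destination pair could be computed when traffic is split evenly over all shortest-path next hops. It assumes positive integer weights (so equality tests on path lengths are exact); the function names and the four-node example are illustrative assumptions.

```python
import heapq


def dist_to(weights, dest):
    """Shortest distance from every node to dest (Dijkstra on reversed links)."""
    rev = {}
    for (u, v), w in weights.items():
        rev.setdefault(v, []).append((u, w))
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue
        for prev, w in rev.get(node, []):
            if d + w < dist.get(prev, float("inf")):
                dist[prev] = d + w
                heapq.heappush(heap, (d + w, prev))
    return dist


def even_split_fractions(weights, src, dest):
    """Return {link: fraction of src->dest traffic on that link}."""
    dist = dist_to(weights, dest)
    if src not in dist:
        return {}
    share = {src: 1.0}
    fractions = {}
    # Visit nodes from farthest to nearest so each share is final when used.
    for node in sorted(dist, key=dist.get, reverse=True):
        if node == dest or share.get(node, 0.0) == 0.0:
            continue
        hops = [(u, v) for (u, v), w in weights.items()
                if u == node and v in dist and w + dist[v] == dist[node]]
        for link in hops:
            part = share[node] / len(hops)
            fractions[link] = fractions.get(link, 0.0) + part
            share[link[1]] = share.get(link[1], 0.0) + part
    return fractions


weights = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
fractions = even_split_fractions(weights, "A", "D")
print(fractions)
```

Here the two paths A-B-D and A-C-D have equal length, so every link carries half of the A-to-D traffic.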
25. Defining the Objective Function
- Computing the link utilization
- Link load: $u_l = \sum_{i,j} M_{i,j} P_{i,j,l}$
- Utilization: $u_l / c_l$
- Objective functions
- $\min \left( \max_l \, u_l / c_l \right)$
- $\min \left( \sum_l f(u_l / c_l) \right)$
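The link-load and objective definitions on this slide can be evaluated directly once the traffic matrix and routing fractions are known. The sketch below assumes dictionary inputs and, because the slide leaves the penalty function f unspecified, uses a quadratic penalty purely as a stand-in.

```python
def link_loads(traffic, fractions):
    """Link load: sum over (i, j) of traffic[i, j] * fractions[i, j, link]."""
    loads = {}
    for (i, j, link), p in fractions.items():
        loads[link] = loads.get(link, 0.0) + traffic.get((i, j), 0.0) * p
    return loads


def max_utilization(loads, capacity):
    """First objective: the highest utilization over all links."""
    return max(loads.get(link, 0.0) / c for link, c in capacity.items())


def total_penalty(loads, capacity, f):
    """Second objective: sum of f(utilization) over all links."""
    return sum(f(loads.get(link, 0.0) / c) for link, c in capacity.items())


# Hypothetical example: 10 units from A to D, split evenly over two paths.
traffic = {("A", "D"): 10.0}
fractions = {
    ("A", "D", "AB"): 0.5, ("A", "D", "BD"): 0.5,
    ("A", "D", "AC"): 0.5, ("A", "D", "CD"): 0.5,
}
capacity = {"AB": 10.0, "BD": 5.0, "AC": 10.0, "CD": 10.0}

loads = link_loads(traffic, fractions)
print(max_utilization(loads, capacity))                     # 1.0
print(total_penalty(loads, capacity, lambda x: x ** 2))     # 1.75
```

The max-utilization objective is driven entirely by the bottleneck link BD, while the summed penalty also reflects load on the other three links.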
26. Complexity of the Optimization Problem
- Computationally intractable problem
- No efficient algorithm to find the link weights
- Even for simple objective functions
- What are the implications?
- Must resort to searching through weight settings
27. Optimization Based on Local Search
- Start with an initial setting of the link weights
- E.g., same integer weight on every link
- E.g., weights inversely proportional to capacity
- E.g., existing weights in the operational network
- Compute the objective function
- Compute the all-pairs shortest paths to get $P_{i,j,l}$
- Apply the traffic matrix $M_{i,j}$ to get link loads $u_l$
- Evaluate the objective function from the $u_l / c_l$
- Generate a new setting of the link weights, and repeat
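The loop on this slide can be sketched as a simple first-improvement search that changes one weight at a time. This is an illustrative toy under stated assumptions (integer weights, even splitting over shortest paths, max utilization as the objective, a tiny candidate weight range), not the search heuristic used in practice; all function names are hypothetical.

```python
import heapq


def dist_to(weights, dest):
    """Shortest distance from every node to dest (Dijkstra on reversed links)."""
    rev = {}
    for (u, v), w in weights.items():
        rev.setdefault(v, []).append((u, w))
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue
        for prev, w in rev.get(node, []):
            if d + w < dist.get(prev, float("inf")):
                dist[prev] = d + w
                heapq.heappush(heap, (d + w, prev))
    return dist


def link_loads(weights, traffic):
    """Route every demand over even-split shortest paths; return {link: load}."""
    loads = {link: 0.0 for link in weights}
    for dest in {j for (_, j) in traffic}:
        dist = dist_to(weights, dest)
        inflow = {}  # traffic arriving at each node, headed for dest
        for (i, j), demand in traffic.items():
            if j == dest and i in dist:
                inflow[i] = inflow.get(i, 0.0) + demand
        for node in sorted(dist, key=dist.get, reverse=True):
            if node == dest or inflow.get(node, 0.0) == 0.0:
                continue
            hops = [(u, v) for (u, v), w in weights.items()
                    if u == node and v in dist and w + dist[v] == dist[node]]
            for link in hops:
                part = inflow[node] / len(hops)
                loads[link] += part
                inflow[link[1]] = inflow.get(link[1], 0.0) + part
    return loads


def max_utilization(weights, traffic, capacity):
    loads = link_loads(weights, traffic)
    return max(loads[link] / capacity[link] for link in weights)


def local_search(weights, traffic, capacity, values=range(1, 6)):
    """Repeatedly try single-weight changes; keep any that lowers the objective."""
    best = dict(weights)
    best_cost = max_utilization(best, traffic, capacity)
    improved = True
    while improved:
        improved = False
        for link in sorted(best):
            for w in values:
                candidate = dict(best)
                candidate[link] = w
                cost = max_utilization(candidate, traffic, capacity)
                if cost < best_cost:
                    best, best_cost, improved = candidate, cost, True
    return best, best_cost


weights = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 2, ("C", "D"): 2}
capacity = {link: 10.0 for link in weights}
traffic = {("A", "D"): 12.0}
start_cost = max_utilization(weights, traffic, capacity)
best, best_cost = local_search(weights, traffic, capacity)
print(start_cost, best_cost, best[("A", "B")])  # 1.2 0.6 3
```

In this toy network, raising the weight of one link on the overloaded path until both paths tie lets even splitting halve the maximum utilization.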
28. Making the Search Efficient
- Avoid repeating the same weight setting
- Keep track of past values of the weight setting
- ... or keep a small signature of past values
- Do not evaluate setting if signatures match
- Avoid computing shortest paths from scratch
- Explore settings that change just one weight
- Apply fast incremental shortest-path algorithms
- Limit number of unique link-weight values
- Don't explore all $2^{16}$ possible values for each weight
- Stop early, before exploring all settings
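The "small signature" idea can be sketched with a set of hashes of past weight settings. The class name and the use of Python's built-in `hash` are assumptions for illustration; as with any compact signature, a rare collision would cause an unseen setting to be skipped, which is the trade-off the slide accepts.

```python
class SeenSettings:
    """Hypothetical helper: remember compact signatures of evaluated settings."""

    def __init__(self):
        self._signatures = set()

    def check_and_add(self, weights):
        """Return True if this weight setting was (probably) seen before."""
        signature = hash(tuple(sorted(weights.items())))
        if signature in self._signatures:
            return True
        self._signatures.add(signature)
        return False


seen = SeenSettings()
first = seen.check_and_add({("A", "B"): 1, ("B", "C"): 2})
second = seen.check_and_add({("B", "C"): 2, ("A", "B"): 1})  # same setting, other order
print(first, second)  # False True
```

A search loop would call `check_and_add` before evaluating a candidate and skip the expensive shortest-path computation when it returns True.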
29. Incorporating Operational Realities
- Minimize number of changes to the network
- Changing just 1 or 2 link weights is often enough
- Tolerate failure of network equipment
- Weights usually remain good after failure
- ... or can be fixed by changing 1-2 weights
- Limit effects of measurement accuracy
- Good weights remain good, despite noise
- Limit frequency of changes to the weights
- Joint optimization for day and night traffic matrices
30. Application to AT&T's Backbone
- Performance of the optimized weights
- Search finds a good solution within a few minutes
- Much better than weights based on link capacity or physical distance
- Competitive with multi-commodity flow solution
- How AT&T changes the link weights
- Maintenance every night from midnight to 6am
- Predict effects of removing link(s) from network
- Reoptimize the link weights to avoid congestion
- Configure new weights before disabling equipment
31. Example from AT&T's Operations Center
- Amtrak repairing/moving part of train track
- Need to move some of the fiber optic cables
- Or, heightened risk of the cables being cut
- Amtrak notifies AT&T of the time the work will be done
- AT&T engineers model the effects
- Determine which IP links go over affected fiber
- Pretend the network no longer has these links
- Evaluate the new shortest paths and traffic flow
- Identify whether link loads will be too high
32. Example, Continued
- If load will be too high
- Reoptimize the weights on the remaining links
- Schedule time for new weights to be configured
- Roll back to old weights when Amtrak is done
- Same process applied to other cases
- Assessing the network's risk to possible failures
- Planning for maintenance of existing equipment
- Adapting link weights to installation of new links
- Adapting link weights in response to traffic shifts
33. What About Interdomain Routing?
- Border Gateway Protocol
- Announcements carry very limited information
- E.g., AS path, but nothing about delay, loss, etc.
- Challenging to make a load-sensitive protocol
- Hard to agree upon a common metric
- Hard to scale to such a large network
- Hard to prevent ASes from gaming the system
- Instead, individual ASes act alone
- Change routing policies based on link load
- E.g., moving some traffic to another provider
34. Interdomain Traffic Engineering
- Predict effects of changes to import policies
- Inputs: routing, traffic, and configuration data
- Outputs: flow of traffic through the network
[Diagram: topology, BGP policy configuration, externally learned routes, and offered traffic feed a BGP routing model, which outputs the flow of traffic through the network]
35. Outbound Traffic: Pick a BGP Route
- Easier to control than inbound traffic
- IP routing is destination based
- Sender determines where the packets go
- Control only by selecting the next hop
- Border router can pick the next-hop AS
- Cannot control selection of the entire path
[Figure: Provider 1 offers AS path (1, 3, 4); Provider 2 offers AS path (2, 7, 8, 4)]
36. Outbound Traffic: Shortest AS Path
- No import policy on border router
- Pick route with shortest AS path
- Arbitrary tie break (e.g., smallest router-id)
- Performance?
- Shortest AS path is not necessarily best
- Could have high delays or congestion
- Load balancing?
- Could lead to uneven split in traffic
- E.g., one provider with shorter paths
- E.g., too many ties with skewed tie-break
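The selection rule on this slide (shortest AS path, then smallest router-id) can be sketched in a few lines. The `Route` class, its field names, and the router-id values are assumptions for illustration; the full BGP decision process has more steps than the two modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified view of a BGP route learned from a neighbor."""
    as_path: tuple
    router_id: int


def best_route(routes):
    """Shortest AS path wins; ties go to the smallest router-id."""
    return min(routes, key=lambda r: (len(r.as_path), r.router_id))


# AS paths taken from the two-provider example; router-ids are made up.
via_provider_1 = Route(as_path=(1, 3, 4), router_id=20)
via_provider_2 = Route(as_path=(2, 7, 8, 4), router_id=10)
chosen = best_route([via_provider_1, via_provider_2])
print(chosen.as_path)  # (1, 3, 4)
```

Note that the choice ignores delay and load entirely, which is why the slide warns that the shortest AS path is not necessarily the best one.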
37. Outbound Traffic: Load Balancing
- Selectively use each provider
- Assign local-pref across destination prefixes
- Change the local-pref assignments over time
- Useful inputs to load balancing
- End-to-end path performance data
- E.g., active measurements along each path
- Outbound traffic statistics per destination prefix
- E.g., packet monitors or router-level support
- Link capacity to each provider
- Billing model of each provider
38. Balancing Load, Performance, and Cost
- Balance traffic based on link capacity
- Measure outbound traffic per prefix
- Select provider per prefix for even load splitting
- But, might lead to poor performance and high bill
- Balance traffic based on performance
- Select provider with best performance per prefix
- But, might lead to congestion and a high bill
- Balance traffic based on financial cost
- Select provider per prefix over time to minimize the total financial cost
- But, might lead to bad performance
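For the capacity-based option, one simple possibility is a greedy assignment: take prefixes from heaviest to lightest and give each to the provider whose link would end up least utilized. The greedy rule, the function name, and the traffic volumes are assumptions for this sketch; it ignores performance and billing, which is exactly the drawback the slide points out.

```python
def assign_prefixes(traffic_per_prefix, capacity):
    """Greedily pick a provider per prefix to even out link utilization."""
    load = {provider: 0.0 for provider in capacity}
    choice = {}
    # Heaviest prefixes first; prefix string breaks ties deterministically.
    ordered = sorted(traffic_per_prefix.items(), key=lambda kv: (-kv[1], kv[0]))
    for prefix, volume in ordered:
        provider = min(
            sorted(capacity),
            key=lambda p: (load[p] + volume) / capacity[p],
        )
        choice[prefix] = provider
        load[provider] += volume
    return choice, load


# Hypothetical measurements (documentation prefixes, arbitrary units).
traffic_per_prefix = {"192.0.2.0/24": 60, "198.51.100.0/24": 30, "203.0.113.0/24": 20}
capacity = {"Provider 1": 100, "Provider 2": 50}
choice, load = assign_prefixes(traffic_per_prefix, capacity)
print(choice)
print(load)  # {'Provider 1': 80.0, 'Provider 2': 30.0}
```

An operator could then express the resulting per-prefix choice by giving the selected provider's routes a higher local-pref for that prefix.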
39. A Fundamental Problem
- Everyone is acting alone
- Internet is highly decentralized
- Each AS is adapting its routes alone
- Toward greater coordination
- End hosts or edge routers pick the entire path?
- Neighbor ASes cooperate to pick better paths?
- A largely unsolved problem
- The price of anarchy
- Is there a better way?
40. Conclusions
- Adapting routing to the traffic
- To alleviate congestion
- To minimize propagation delay
- To be robust to future failures
- Two main approaches
- Load-sensitive routing protocol
- Optimization of configurable parameters
- Next class: Overlay Networks
- Read Section 9.4 of the textbook