Title: Yaping Zhu
1 UFO: A Resilient Layered Routing Architecture
- Yaping Zhu
- Advisor: Prof. Jennifer Rexford
- With Andy Bavier and Nick Feamster (Georgia Tech)
2 Scalability and High Availability?
- Scalability: scalability of the routing control plane, efficiency of the routing data plane
- High availability: quick adaptation and re-routing
3 Can We Have the Best of Both Worlds?
Basic idea:
1. Layered routing architecture (borrowing the idea from overlay routing)
2. Underlay support for efficient and scalable overlay routing
4 Outline
- Background
- Internet routing architecture
- Overlay routing (Resilient Overlay Networks)
- Basic idea of Layered routing architecture
- Efficient overlay forwarding
- Scalable overlay monitoring
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
5 Internet Routing Designed for Scalability
Autonomous System (AS)
Peering
6 Internet Routing without High Availability
- Scalability
- Statistics: 25K ASes, 200K prefixes, millions of routers
- Hierarchical intra-domain / inter-domain routing
- Prefix aggregation
- Routing protocols oblivious to performance
- Intra-domain static link weights
- Inter-domain routing policies
- Slow outage detection and recovery
- Disruptions during convergence
- Performance suffers from black-holes and loops
7 Scalable Internet Routing without Customization
- IP does destination-based forwarding
- All traffic follows the same paths
- Independent of the application requirements
- Yet, applications have different needs
- Voice and gaming: low latency and low loss
- File sharing: high throughput
High throughput, but high latency
Low latency, but low throughput
8 Outline
- Background
- Internet routing architecture
- Overlay routing (Resilient Overlay Networks)
- Basic idea of Layered routing architecture
- Efficient overlay forwarding
- Scalable overlay monitoring
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
9 RON: Resilient Overlay Networks (by D. Andersen)
Scalable IP routing substrate
10 RON (Resilient Overlay Networks): System Components
- Overlay Control Plane
- Probing, overlay path evaluation
- Disseminate routing messages, update routes
- Overlay Data Plane
- Tunnel setup: packet encapsulation/decapsulation
- User Opt-in Method
- DNS redirection to overlay server
- Connection to overlay server tunnels (e.g., VPN)
11 Overlay Routing
- Pros
- High availability: end hosts discover network-level path failures and cooperate to re-route
- Customization: forwarding paths tailored to the application
- Applications
- Content distribution (e.g., Akamai SureRoute)
- Application-layer multicast
12 Overlay Routing: Poor Efficiency
- Problem: traffic must traverse the bottleneck link both inbound and outbound
- Additional latency overhead
- Additional traffic consumption
Upstream ISP
13 Overlay Routing: Poor Scalability
Let's just keep probing
Scalable IP routing substrate
I don't know when a failure happens
Shall I re-route if one packet is lost?
14 Overlay Routing: Poor Scalability
- Fundamental trade-off between probing frequency and adaptation speed
- To get quick adaptation:
- Aggressive probing at short time intervals
- Poor scalability
- RON supports only a small (i.e., nodes) set of connected hosts
- Cannot differentiate packet loss caused by different events
- Failure: fast re-route
- Congestion: maybe re-route more slowly? Oscillation?
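The probing trade-off on this slide can be illustrated with a rough count. This is a minimal sketch, assuming full-mesh probing in which every overlay node sends one probe to every other node per interval; the function name and the sample node counts are illustrative, not taken from RON.

```python
def probes_per_second(n_nodes: int, probe_interval_s: float) -> float:
    """Probes sent per second when every node probes every other node once per interval."""
    directed_pairs = n_nodes * (n_nodes - 1)  # assumption: full-mesh probing
    return directed_pairs / probe_interval_s


# Compare a relaxed interval with an aggressive one for a few overlay sizes.
for n_nodes in (10, 50, 200):
    for interval_s in (12.0, 1.0):
        rate = probes_per_second(n_nodes, interval_s)
        print(f"{n_nodes:>3} nodes, {interval_s:>4} s interval: {rate:,.1f} probes/s")
```

Under these assumptions the probe load grows quadratically with the number of nodes and inversely with the probe interval, which is why faster adaptation through probing alone scales poorly.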
15 Outline
- Background
- Internet routing architecture
- Overlay routing (Resilient Overlay Networks)
- Basic idea of Layered routing architecture
- Efficient overlay forwarding
- Scalable overlay monitoring
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
16 Can We Have the Best of Both Worlds?
17 A Resilient Layered Routing Architecture
- Combination of underlay and overlay routing
18 UFO: Underlay Friendly to Overlays
- In-network support for overlays
19 A Resilient Layered Routing Architecture
- Questions
- Which functionality belongs to which layer?
- What are the interfaces between the two layers?
- Cross-layer design
- Efficiency improvement
- Direct control over forwarding table entries
- Scalability improvement
- Explicit notification about changing network conditions
20 Outline
- Efficient overlay forwarding
- Overlay forwarding on line cards
- Hosting the overlay control plane
- Scalable overlay monitoring
- Registration of overlay links
- Notification of network events
- Lazy recovery
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
22 Efficient Overlay Forwarding
- Problem: traffic must traverse the bottleneck link both inbound and outbound
- Solution: reflection points in routers
Upstream ISP
23 Overlay Forwarding on Router Line Cards
24 Where Does the Overlay Control Plane Run? On Routers
- On routers, by router virtualization
- Pros: fast updates of forwarding tables
- Pros: efficient transmission of control messages
- Pros: fate sharing
(Figure: router with processors, switching fabric, and line cards)
25 Where Does the Overlay Control Plane Run? On Servers
26 Where Does the Overlay Control Plane Run? On Servers
- On a separate set of servers
- Update forwarding tables on router line cards
- Data packets reflected in-network
- Pros
- Cheap compared to a router
- Compatibility with legacy overlay servers
- Cons
- Lack of fate sharing
27 Outline
- Efficient overlay forwarding
- Overlay forwarding on line cards
- Hosting the overlay control plane
- Scalable overlay monitoring
- Registration of overlay links
- Notification of different kinds of network events
- Lazy recovery
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
28 Scalable Overlay Monitoring
- Assumption: rich connectivity, multiple alternative overlay paths
- Overlays could even tolerate false-positive notifications
- What to notify? Different applications may want notification of different events
- Notification benefits: accurate adaptation (compared with RON), reduced probing overhead, and increased scalability
29 Scalable Overlay Monitoring
- Notifications preserve the overlay link abstraction
- Message format
- (overlay source, overlay destination, event)
- Routers store state by explicit overlay registration
- Explicit notification about events that affect the performance of overlay applications
- Physical failures of routers or links
- Reachability failures: route withdrawal, routing session failure
- Network congestion
- A few hello packets lost
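The (overlay source, overlay destination, event) message could be modeled as below. This is a minimal sketch: the class names, field names, and event labels are hypothetical illustrations of the three event kinds listed on the slide, not a wire format defined by UFO.

```python
from dataclasses import dataclass
from enum import Enum


class Event(Enum):
    """Event kinds named on the slide; the identifiers are illustrative."""
    PHYSICAL_FAILURE = "physical_failure"
    REACHABILITY_FAILURE = "reachability_failure"
    CONGESTION = "congestion"


@dataclass(frozen=True)
class Notification:
    """One notification about a registered overlay link."""
    overlay_src: str
    overlay_dst: str
    event: Event


def is_relevant(note: Notification, subscribed: set) -> bool:
    """An application may care only about some event kinds."""
    return note.event in subscribed


note = Notification("A", "B", Event.CONGESTION)
print(is_relevant(note, {Event.PHYSICAL_FAILURE, Event.REACHABILITY_FAILURE}))
```

Because the message names only overlay endpoints, the overlay never needs to learn which physical router or link was affected.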
30 Registration of Overlay Links
- Overlay nodes: A, B, C
- Routers: 1, 2, 3, 4
- Register for unidirectional overlay links A-B and A-C
(Figure: topology of overlay nodes A, B, C and routers 1-4)
31 Periodic Registration of Overlay Links
- ACK for successful registration
(Figure: routers 1-4 with registration state (A,B))
32 Periodic Registration of Overlay Links
- Registration kept as soft state
- Periodic re-registration
(Figure: routers 1-4 with registration state (A,B) and (A,C))
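Soft-state registration with periodic refresh can be sketched as a table of expiry times. This is a minimal illustration under stated assumptions: the class name, the lifetime value, and the explicit `now` arguments are hypothetical, and a real router would tie expiry to its own timers.

```python
class RegistrationTable:
    """Per-router soft state: overlay link -> expiry time (illustrative sketch)."""

    def __init__(self, lifetime_s: float):
        self.lifetime_s = lifetime_s
        self._expires_at = {}  # (overlay_src, overlay_dst) -> expiry timestamp

    def register(self, src: str, dst: str, now: float) -> bool:
        """Install or refresh a registration; the return value stands in for the ACK."""
        self._expires_at[(src, dst)] = now + self.lifetime_s
        return True

    def active_links(self, now: float) -> list:
        """Drop entries that were not refreshed in time, then list what remains."""
        self._expires_at = {
            link: t for link, t in self._expires_at.items() if t > now
        }
        return sorted(self._expires_at)


table = RegistrationTable(lifetime_s=30.0)
table.register("A", "B", now=0.0)
table.register("A", "C", now=20.0)
print(table.active_links(now=25.0))  # both links still registered
print(table.active_links(now=35.0))  # (A,B) was never refreshed and has expired
```

Keeping the state soft means a router never has to be told that an overlay node left: entries that are not re-registered simply age out.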
33 Notification of Network Events
(Figure: routers 1-4 with registration state (A,B) and (A,C))
34 Reactive Routing and Lazy Recovery
- Assumption: rich connectivity
- Reactive routing after notification
- Re-route via alternative overlay paths
- Disseminate notification messages to peers
- Lazy recovery
- Stick to alternative overlay paths (e.g. for mins)
- Re-register for the failed overlay link
- Reason: the transient period during convergence after recovery causes loops and black holes
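Reactive routing with lazy recovery can be sketched as a hold-down timer on the failed path. This is a minimal sketch: the class, the path labels, and the 300-second hold-down are hypothetical, since the slide does not give the actual duration.

```python
class OverlayNode:
    """Chooses among candidate overlay paths, avoiding recently failed ones."""

    def __init__(self, candidates: dict, hold_down_s: float):
        self.candidates = candidates    # destination -> paths in preference order
        self.hold_down_s = hold_down_s  # assumption: fixed hold-down period
        self._avoid_until = {}          # (destination, path) -> time it may be reused

    def on_notification(self, dst: str, failed_path: str, now: float) -> None:
        """Reactive step: stop using the failed path immediately."""
        self._avoid_until[(dst, failed_path)] = now + self.hold_down_s

    def choose_path(self, dst: str, now: float):
        """Lazy recovery: return to a failed path only after the hold-down ends."""
        for path in self.candidates[dst]:
            if now >= self._avoid_until.get((dst, path), 0.0):
                return path
        return None  # no usable path is known


node = OverlayNode({"B": ["direct", "via C"]}, hold_down_s=300.0)
node.on_notification("B", "direct", now=100.0)
print(node.choose_path("B", now=150.0))  # alternative path while holding down
print(node.choose_path("B", now=400.0))  # preferred path again after hold-down
```

Staying on the alternative path through the hold-down period keeps traffic away from the underlay while it is still converging, at the cost of using a possibly less preferred path for a while.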
35 Outline
- Efficient overlay forwarding
- Overlay forwarding on line cards
- Hosting the overlay control plane
- Scalable overlay monitoring
- Registration of overlay links
- Notification of network events
- Lazy recovery
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
36 Unicast Registration Is Inefficient
- Overlay nodes: A, B, C, D, E; routers: 1, 2, 3, 4
- Register for overlay links B-A, C-A, D-A, E-A
(Figure: routers 1-4 holding per-link registration state (B,A), (C,A), (D,A), (E,A))
37 Unicast Notification Is Inefficient
(Figure: routers 1-4 holding per-link registration state (B,A), (C,A), (D,A), (E,A))
38 Multicast Registration
(Figure: routers 1-4 each holding a single registration entry, GroupA)
39 Multicast Notification
(Figure: routers 1-4 each holding a single registration entry, GroupA)
40 Benefits of Multicast Registration/Notification
- Reduce registration state stored at routers
- Unicast: store state for each (src, dst) pair, O(n^2)
- Multicast: store state for each multicast group, O(n)
- Reduce notification message overhead
- Deployment benefits
- Exploit IP multicast (which routers already have)
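The state reduction can be checked with a quick count. This is a rough sketch under stated assumptions: a full mesh of n overlay nodes, a worst-case router that lies on every overlay path, and one multicast group per destination node (as with GroupA for node A). The function names are illustrative.

```python
def unicast_entries(n_nodes: int) -> int:
    """One entry per directed (src, dst) overlay link: O(n^2)."""
    return n_nodes * (n_nodes - 1)


def multicast_entries(n_nodes: int) -> int:
    """One entry per multicast group, assuming one group per destination: O(n)."""
    return n_nodes


for n_nodes in (5, 50, 500):
    print(n_nodes, unicast_entries(n_nodes), multicast_entries(n_nodes))
```

For the five nodes A through E this gives 20 unicast entries against 5 group entries, and the gap widens quadratically as the overlay grows.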
41 Outline
- Efficient overlay forwarding
- Overlay forwarding on line cards
- Hosting the overlay control plane
- Scalable overlay monitoring
- Registration of overlay links
- Notification of network events
- Lazy recovery
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
42 Prototype Implementation on VINI
- What's finished?
- RON
- Control plane: probing and reactive routing
- Data plane: overlay tunnel setup
- User opt-in: user data packets delivered by overlays
- UFO: notification of link failure
- What to do next?
- UFO
- Evaluate inter-domain routing convergence
- Notification of link congestion
- Run applications, e.g., VoIP
43 Prototype Implementation on VINI
- Overlay: RON
- Overlay FIB
- Client opt-in
- Notification by filter
(Figure components: UML, RON, XORP IP router, interfaces eth0-eth3, control and data planes, packet forwarding engine, UmlSwitch element, overlay FIB, tunnel table, Click, filters, VPN server, clients)
44 Evaluation Setup
- Topology
- Routers and overlay nodes
(Figure: topology with nodes labeled s, d, and r)
45 Evaluation 1: Reactive Routing of RON
- How much time does RON take to detect an outage?
- RON probe interval: 12 s
- RON probe timeout: 3 s
- Average detection time
- Probe interval / 2 + probe timeout × 3
- What to evaluate?
- Fundamental trade-off between probe frequency and detection time
- Parameter: probe interval
46 Evaluation 1: Reactive Routing of RON
- Detection time: probe interval / 2 + probe timeout × 3
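The detection-time expression can be evaluated for the stated parameters. This sketch assumes the operators lost from the slide are a plus and a times, so that the formula reads probe interval / 2 + probe timeout × 3; that reading is an assumption, not something the slide text confirms.

```python
def avg_detection_time_s(probe_interval_s: float, probe_timeout_s: float,
                         timeouts: int = 3) -> float:
    """Average outage detection time under the assumed reading of the formula.

    On average the failure occurs halfway through a probe interval, after which
    `timeouts` consecutive probe timeouts must elapse before an outage is declared.
    """
    return probe_interval_s / 2 + probe_timeout_s * timeouts


# Parameters from the slide: 12 s probe interval, 3 s probe timeout.
print(avg_detection_time_s(12.0, 3.0))

# Shortening the probe interval reduces detection time but raises probe load.
for interval_s in (12.0, 6.0, 2.0):
    print(interval_s, avg_detection_time_s(interval_s, 3.0))
```

Under this reading, the timeout term sets a floor on detection time that no probe interval can remove, which is the gap that explicit notification targets.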
47 Evaluation 2: Comparison of Convergence Speed
- Controlled Experiment
- Fail a link by filtering all the packets
- Comparison of Convergence Speed
- IP routing (XORP)
- RON reactive routing
- Reactive routing with UFO notification
48 Evaluation 2: Comparison of Convergence Speed
- IP routing (XORP)
- Hello interval: 15 s
- Router dead interval: 45 s
49 Evaluation 2: Comparison of Convergence Speed
- RON
- Probe interval: 12 s
- Probe timeout: 3 s
- Re-route immediately after outage detection
50 Evaluation 2: Comparison of Convergence Speed
- UFO routing with explicit notification
- Re-route immediately after outage notification
51 Outline
- Efficient overlay forwarding
- Overlay forwarding on line cards
- Hosting the overlay control plane
- Scalable overlay monitoring
- Registration of overlay links
- Notification of network events
- Lazy recovery
- Enhancing the scalability of UFO
- Implementation and Evaluation
- Conclusion and deployment
52 Deployability Benefits
- Forwarding support
- Low barriers to entry
- Routers already have hardware for setting up tunnels
- Upgrade a small fraction of routers for overlay forwarding
- Notification support
- Upgrade all routers to support notification (could start with one AS)
- Performance benefits and business incentives
- Better real-time applications: VoIP
53 Related Work
- Overlay routing
- Detour (Collins98)
- Resilient Overlay Networks (Andersen01)
- Improving forwarding efficiency
- Path reflection and path painting (Jannotti02)
- Reducing probing overhead
- Routing Underlay for Overlays (Nakao03)
- Network virtualization
- VINI, GENI, CABO, VERA
54 Conclusion
- Contributions
- Scalable overlay routing is feasible with in-network support
- UFO provides strong reliability and a compelling deployment model
- Future work
- Further performance evaluation
- Applications: VoIP
- Application-layer multicast (with NEC Lab)
55 Acknowledgement
- General Exam Committee
- Prof. Jennifer Rexford (Advisor)
- Prof. Larry Peterson
- Prof. Vivek Pai
- Collaborators
- Andy Bavier and Nick Feamster (Georgia Tech)
- Cabernet Research Group
- VINI Support, PlanetLab Operations
56 Questions?
57 FAQ: Recovery Notification?
- UFO does NOT support notification of recovery, because:
- Alternative overlay paths are available (overlays don't care!)
- Hard for routers to determine intra-domain convergence (requires synchronization to determine data-plane convergence)
- Hard for routers to determine inter-domain convergence