Title: Evolving Toward a Self-Managing Network
1Evolving Toward a Self-Managing Network
- Jennifer Rexford
- Princeton University
- http://www.cs.princeton.edu/~jrex
2Why is Network Management So Darn Hard?
- Oodles and oodles of complex features
- Many protocols
- Many mechanisms
- Many configurable parameters
- Little guidance for network administrators
- How to select and compose features?
- How to set the configurable parameters?
- Managing boxes, rather than networks
- Routers, switches, firewalls, IDSes, servers, etc.
- Low-level, box-specific configuration languages
3The Enemy is Complexity
- Goal: raising the level of abstraction
- Network-level design and configuration
- Composition of protocols and mechanisms
- Idea 1: add abstraction on top
- Compile high-level spec into box configuration
- But, must grapple with inherent complexity
- Idea 2: design system for manageability
- Identify network-level abstractions
- ... and change the boxes and protocols
- But, must grapple with backwards compatibility
4Example: Border Gateway Protocol
- ASes exchange reachability information
- IP prefix: block of destination IP addresses
- AS path: sequence of ASes along the path
- Configurable routing policies
- Path selection (which route to use?)
- Path export (who to tell about the route?)
Figure: AS 88 announces 12.34.158.0/24 with path (88); farther along, AS 7018 announces it with path (7018,1,88). Data traffic flows the other way, toward 12.34.158.5.
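The announcements on this slide can be reproduced with a small sketch. It assumes only that each AS prepends its own number before passing a route on and that a receiver rejects a path that already contains it; the names `Route`, `announce`, `accepts`, and `covers` are hypothetical and not part of any BGP implementation.

```python
import ipaddress
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: str                # block of destination IP addresses
    as_path: Tuple[int, ...]   # sequence of ASes along the path


def announce(asn: int, prefix: str, learned: Tuple[int, ...] = ()) -> Route:
    """Announce a prefix to a neighbor, prepending our own AS number."""
    return Route(prefix, (asn,) + learned)


def accepts(asn: int, route: Route) -> bool:
    """Loop check: reject a route whose AS path already contains us."""
    return asn not in route.as_path


def covers(route: Route, address: str) -> bool:
    """True if the destination address falls inside the route's prefix."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(route.prefix)


from_88 = announce(88, "12.34.158.0/24")                    # origin AS
from_1 = announce(1, from_88.prefix, from_88.as_path)       # AS 1 passes it on
from_7018 = announce(7018, from_1.prefix, from_1.as_path)   # AS 7018 passes it on

print(from_88.as_path)                    # (88,)
print(from_7018.as_path)                  # (7018, 1, 88)
print(covers(from_7018, "12.34.158.5"))   # True
print(accepts(1, from_7018))              # False
```

Path selection and path export policies would sit on top of this: selection picks among several such routes for one prefix, and export decides which neighbors `announce` is called for.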
5Some Things I Hate About BGP
- Routers in an AS have different views
- Effect: protocol oscillation and loops
- Point fix: testing sufficient conditions
- Routing policy distributed across routers
- Effect: routers need to share information
- Point fix: complex tagging of BGP routes
- Policy has only an indirect effect on traffic
- Effect: selecting the right policy is hard
- Point fix: "what if" tools for traffic engineering
- BGP route selection depends on the IGP
- Effect: disruptions from small internal changes
- Point fix: "what if" tools to identify risks
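The last item, BGP route selection depending on the IGP, can be illustrated with a minimal sketch. It assumes a router holds several equally good BGP routes and breaks the tie by choosing the egress point with the lowest IGP cost; the egress names and costs are made up for illustration.

```python
from typing import Dict


def pick_egress(igp_cost: Dict[str, int]) -> str:
    """Choose the egress point with the lowest IGP cost.

    Ties are broken by name so the result is deterministic.
    """
    return min(igp_cost, key=lambda egress: (igp_cost[egress], egress))


# IGP costs from one router to two egress points with equally good BGP routes.
before = {"egress_west": 9, "egress_east": 10}
# A single internal link weight changes and the cost to the west egress rises.
after = {"egress_west": 11, "egress_east": 10}

print(pick_egress(before))  # egress_west
print(pick_egress(after))   # egress_east
```

A change of two cost units inside the AS moves all traffic for the affected prefixes to a different egress point, which is the kind of disruption the slide refers to.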
6Interdomain Routing: Design for Manageability
- Routing Control Platform
- Represents the AS to others
- Has complete view of candidate routes
- Computes answers for the AS's routers
- Communicates with other ASes
- Using BGP or (ideally) a brand new protocol
Figure: AS 1, AS 2, and AS 3 each have an RCP; the RCPs are joined by the inter-AS protocol, alongside the physical peering between the ASes.
7Advantages of RCP Approach
- Lower management complexity
- Complete, network-wide view
- Direct control over the routers
- Single specification of policies and objectives
- Simpler routers
- Much less control-plane software
- Much less configuration state
- Enabling innovation
- New algorithms for selecting paths within an AS
- New approaches to inter-AS routing
8Deployability: Backwards Compatibility using BGP
- Border Gateway Protocol (BGP)
- Protocol: messages sent between routers
- Decision logic: route-selection process
- Policy: configurable rules for path selection/export
- The key point is that BGP has
- Complex decision logic and policies
- Yet a simple protocol (and message format)
- Use BGP messages to program the routers
9Phase 1: Flexible Path Selection in One AS
Before: conventional use of BGP in backbone network
Figure labels: eBGP, iBGP
After: RCP learns routes and sends answers to routers
Figure labels: eBGP, RCP, iBGP
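A rough sketch of what "RCP learns routes and sends answers to routers" could look like in code. It assumes a deliberately simplified decision process (highest local preference, then shortest AS path, then lowest IGP cost to the egress point); real BGP has more steps, and `Candidate`, `rcp_answers`, and the router names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    egress: str                # router where the route was learned
    local_pref: int
    as_path: Tuple[int, ...]


def rcp_answers(
    candidates: List[Candidate],
    igp_cost: Dict[str, Dict[str, int]],
) -> Dict[str, Candidate]:
    """Compute one answer per router from the complete set of candidates."""
    answers = {}
    for router, costs in igp_cost.items():
        answers[router] = min(
            candidates,
            key=lambda c: (-c.local_pref, len(c.as_path), costs[c.egress], c.egress),
        )
    return answers


candidates = [
    Candidate("r1", 100, (1, 88)),
    Candidate("r2", 100, (2, 88)),
]
# igp_cost[router][egress]: IGP distance from each router to each egress point.
igp_cost = {
    "r1": {"r1": 0, "r2": 5},
    "r2": {"r1": 5, "r2": 0},
    "r3": {"r1": 7, "r2": 3},
}

for router, best in rcp_answers(candidates, igp_cost).items():
    print(router, "->", best.egress)
# r1 -> r1
# r2 -> r2
# r3 -> r2
```

Because the computation runs in one place over all candidate routes, changing the selection rule means changing the `key` function once instead of reconfiguring every router.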
10Phase 2: AS-Wide Path Selection and Export
Before: RCP gets best iBGP routes (and IGP feed)
Figure labels: eBGP, RCP, iBGP
After: RCP gets all eBGP routes from neighbors
Figure labels: eBGP, RCP, iBGP
11Phase 3: Direct Communication Between RCPs
Before: RCP gets all eBGP routes from neighbors
Figure labels: eBGP, RCP, iBGP
After: ASes exchange routes via RCP
Figure labels: Inter-AS Protocol, one RCP each for AS 1, AS 2, and AS 3, iBGP, physical peering
12Systems Considerations (NSDI'05)
- Reliability
- Problem: single point of failure
- Solution: replication of RCP components
- Consistency
- Problem: inconsistent decisions by replicas
- Solution: consistency without inter-replica protocol
- Scalability
- Problem: storing and computing for all routers
- Solution: store each route once and amortize work
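One way to picture "store each route once" is a shared table with per-router references into it. The sketch below is an assumption about how such a store might be organized, not the prototype's actual data structure; `RouteStore` and its methods are hypothetical, and it covers only the storage half of the bullet, not the amortized computation.

```python
from typing import Dict, List, Tuple

Route = Tuple[str, str, Tuple[int, ...]]  # (prefix, egress router, AS path)


class RouteStore:
    """Keep a single copy of each route; routers hold only integer ids."""

    def __init__(self) -> None:
        self._routes: List[Route] = []
        self._ids: Dict[Route, int] = {}
        self._per_router: Dict[str, List[int]] = {}

    def add(self, router: str, route: Route) -> int:
        """Record that `router` can use `route`, storing the route only once."""
        route_id = self._ids.get(route)
        if route_id is None:
            route_id = len(self._routes)
            self._routes.append(route)
            self._ids[route] = route_id
        refs = self._per_router.setdefault(router, [])
        if route_id not in refs:
            refs.append(route_id)
        return route_id

    def routes_for(self, router: str) -> List[Route]:
        return [self._routes[i] for i in self._per_router.get(router, [])]

    def stored(self) -> int:
        return len(self._routes)


store = RouteStore()
route = ("12.34.158.0/24", "r1", (1, 88))
for router in ("r1", "r2", "r3"):
    store.add(router, route)

print(store.stored())              # 1
print(len(store.routes_for("r3")))  # 1
```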
13Example Network Management Applications
- Customer-driven route selection
- Customized load-balancing policies
- Geographic rules for route selection
- Blocking denial-of-service attacks
- Blackhole routes that drop traffic
- Only for routers carrying attack traffic
- Hitless maintenance
- Move traffic away from certain routers
- Before the operators bring down the routers
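Two of these applications reduce to rewriting per-router answers before they are sent out. The sketch below assumes the RCP already holds one chosen egress point per router plus a precomputed backup that is not itself being taken down; `BLACKHOLE`, `blackhole_attack`, and `drain` are hypothetical names.

```python
from typing import Dict, Set

BLACKHOLE = "blackhole"  # hypothetical marker for a route that drops traffic


def blackhole_attack(answers: Dict[str, str], carrying_attack: Set[str]) -> Dict[str, str]:
    """Install a blackhole route only on routers carrying attack traffic."""
    return {
        router: (BLACKHOLE if router in carrying_attack else next_hop)
        for router, next_hop in answers.items()
    }


def drain(answers: Dict[str, str], backup: Dict[str, str], down_soon: Set[str]) -> Dict[str, str]:
    """Move traffic away from egress points in `down_soon` before maintenance."""
    return {
        router: (backup[router] if egress in down_soon else egress)
        for router, egress in answers.items()
    }


answers = {"r1": "egress_a", "r2": "egress_b", "r3": "egress_a"}
backup = {"r1": "egress_b", "r2": "egress_a", "r3": "egress_b"}

print(blackhole_attack(answers, {"r2"}))
# {'r1': 'egress_a', 'r2': 'blackhole', 'r3': 'egress_a'}
print(drain(answers, backup, {"egress_a"}))
# {'r1': 'egress_b', 'r2': 'egress_b', 'r3': 'egress_b'}
```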
14Conclusion
- Network management is too hard
- IP was not designed for management
- Complex, distributed operation of routers
- Must reduce complexity
- Network-wide views and objectives
- Direct control over the data plane
- RCP approach is feasible
- Deployable, scalable, and reliable
- Solves important management problems
- Many interesting open problems
15Backup Slides
16Routing Control Platform (RCP)
Figure: RCP components (Route Control Server (RCS), OSPF Viewer, BGP Engine) and the labeled flows among them and the network: OSPF link-state advertisements, BGP updates, Topology, Options, Answers.
17Scalability: Standard Computing Platform
- Prototype on a high-end PC
- 3.2 GHz Pentium-4 with 8 GB of RAM
- Running the Linux 2.6.5 kernel
- Workload from the AT&T backbone
- Replay the BGP and OSPF messages
- Good RCP performance
- Memory usage: less than 2 GB
- Speed, BGP changes: less than 40 msec
- Speed, topology changes: 0.1-0.8 seconds
Short answer: the system can keep up
18Reliability: Replication and Consistency
- Replication: avoid single point of failure
- Multiple RCPs in a network
- Connected at different places
- Consistency: no explicit coordination
- Replica has full view of each partition
- Replicas perform the same algorithm on the same data, and get the same answer
Figure labels: RCP A, RCP B, A, B
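The consistency argument rests on the decision being a deterministic function of the data alone. A minimal sketch, assuming a simplified rule (shortest AS path, with fixed tie-breaks) and two replicas that receive the same routes in different orders; `decide` and the route values are hypothetical.

```python
import random
from typing import List, Tuple

Route = Tuple[str, Tuple[int, ...]]  # (egress router, AS path)


def decide(routes: List[Route]) -> Route:
    """Pick a route using only the routes themselves, never arrival order."""
    return min(routes, key=lambda r: (len(r[1]), r[1], r[0]))


routes = [("r1", (1, 88)), ("r2", (2, 88)), ("r3", (7018, 1, 88))]

seen_by_a = list(routes)             # replica A's arrival order
seen_by_b = list(routes)
random.Random(0).shuffle(seen_by_b)  # replica B sees a different order

print(decide(seen_by_a) == decide(seen_by_b))  # True
```

Since every tie is broken by route contents, the replicas need no protocol between them to agree, as long as each sees the same set of routes.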