Title: Rethinking Network Control


1
Rethinking Network Control & Management: The Case
for a New 4D Architecture
  • David A. Maltz
  • Carnegie Mellon University/Microsoft Research
  • Joint work with
  • Albert Greenberg, Gisli Hjalmtysson
  • Andy Myers, Jennifer Rexford, Geoffrey Xie,
  • Hong Yan, Jibin Zhan, Hui Zhang

2
The Role of Network Control and Management
  • Many different network environments
  • Access, backbone networks
  • Data-center networks, enterprise/campus
  • Sizes 10-10,000 routers/switches
  • Many different technologies
  • Longest-prefix routing (IP), fixed-width routing
    (Ethernet), label switching (MPLS, ATM), circuit
    switching (optical, TDM)
  • Many different policies
  • Routing, reachability, transit, traffic
    engineering, robustness
  • The control plane software binds these elements
    together and defines the network

3
We Can Change the Control Plane!
  • Pre-existing industry trend towards separating
    router hardware from software
  • IETF ForCES, GSMP, GMPLS
  • SoftRouter [Lakshman, HotNets '04]
  • Incremental deployment path exists
  • Individual networks can upgrade their control
    planes and gain benefits
  • Small enterprise networks have most to gain
  • No changes to end-systems required

4
A Clean-slate Design
  • What are the fundamental causes of network
    problems?
  • How to secure the network and protect the
    infrastructure?
  • How to provide flexibility in defining management
    logic?
  • What functionality needs to be distributed, and
    what can be centralized?
  • How to reduce/simplify the software in networks?
  • What would a RISC router look like?
  • How to leverage technology trends?
  • CPU and link speed growing faster than # of
    switches

5
Three Principles for Network Control & Management
  • Network-level Objectives
  • Express goals explicitly
  • Security policies, QoS, egress point selection
  • Do not bury goals in box-specific configuration

[Figure: Management Logic box, labeled "Reachability matrix" and "Traffic engineering rules"]
6
Three Principles for Network Control & Management
  • Network-wide Views
  • Design network to provide timely, accurate info
  • Topology, traffic, resource limitations
  • Give logic the inputs it needs

[Figure: Management Logic box, labeled "Reachability matrix", "Traffic engineering rules", and "Read state info"]
7
Three Principles for Network Control & Management
  • Direct Control
  • Allow logic to directly set forwarding state
  • FIB entries, packet filters, queuing parameters
  • Logic computes desired network state, let it
    implement it

[Figure: Management Logic box, labeled "Reachability matrix", "Traffic engineering rules", "Write state", and "Read state info"]
8
Overview of the 4D Architecture
[Figure: 4D architecture diagram. Planes: Decision, Dissemination, Discovery, Data. Annotations: Network-level objectives, Direct control, Network-wide views]
  • Decision Plane
  • All management logic implemented on centralized
    servers making all decisions
  • Decision Elements use views to compute data plane
    state that meets objectives, then directly write
    this state to routers
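
The loop described above (read a network-wide view, compute data plane state, write it out) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `compute_fibs`, the link-cost dictionary, and the router names are hypothetical, and shortest-path routing stands in for whatever objectives the real decision logic enforces.

```python
import heapq


def compute_fibs(links):
    """Compute a next-hop table for every router from a network-wide view.

    links: {(router_a, router_b): cost}, treated as bidirectional.
    Returns {router: {destination: next_hop}}.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost

    fibs = {}
    for src in graph:
        dist = {src: 0}
        first_hop = {}
        heap = [(0, src, None)]
        while heap:
            d, node, hop = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue  # stale heap entry
            if node != src and node not in first_hop:
                first_hop[node] = hop
            for nbr, cost in graph[node].items():
                nd = d + cost
                if nd < dist.get(nbr, float("inf")):
                    dist[nbr] = nd
                    # The first hop is the neighbor itself when leaving src.
                    heapq.heappush(heap, (nd, nbr, nbr if node == src else hop))
        fibs[src] = first_hop
    return fibs


# Hypothetical three-router view: the direct R1-R3 link is expensive.
view = {("R1", "R2"): 1, ("R2", "R3"): 1, ("R1", "R3"): 5}
fibs = compute_fibs(view)
```

In a 4D-style design the resulting tables would be pushed to each router over the dissemination plane; that step is omitted here because the deck does not specify its interface.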

9
Overview of the 4D Architecture
  • Dissemination Plane
  • Provides a robust communication channel to each
    router; robustness is the only goal!
  • May run over same links as user data, but
    logically separate and independently controlled

10
Overview of the 4D Architecture
  • Discovery Plane
  • Each router discovers its own resources and its
    local environment
  • E.g., the identity of its immediate neighbors

11
Overview of the 4D Architecture
  • Data Plane
  • Spatially distributed routers/switches
  • Can deploy with today's technology
  • Looking at ways to unify forwarding paradigms
    across technologies

12
Concerns and Challenges
  • Distributed Systems issues
  • How will communication between routers and DEs
    survive failures in the network?
  • Latency means the DEs' view of the network is behind
    reality. Will the control loop be stable?
  • What is the overhead to/from the DEs?
  • What happens in a network partition?
  • Networking issues
  • Does the 4D simplify control and management?
  • Can we create logic to meet multiple objectives?

13
The Feasibility of the 4D Architecture
  • We designed and built a prototype of the 4D
    Architecture
  • 4D Architecture permits many designs; the prototype
    is a single, simple design point
  • Decision plane
  • Contains logic to simultaneously compute routes
    and enforce reachability matrix
  • Multiple Decision Elements per network, using
    simple election protocol to pick master
  • Dissemination plane
  • Uses source routes to direct control messages
  • Extremely simple, but can route around failed
    data links
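
A minimal sketch of the two mechanisms named above, assuming details the slides do not give: the election rule (lowest ID among live Decision Elements) and the breadth-first path search are stand-ins chosen for brevity, and all node names are hypothetical.

```python
from collections import deque


def elect_master(live_des):
    """Pick the master Decision Element: here, simply the lowest ID alive."""
    return min(live_des) if live_des else None


def source_route(adjacency, failed_links, src, dst):
    """Breadth-first search for a path that avoids failed links.

    adjacency: {node: list of neighbor nodes}
    failed_links: set of frozenset({a, b}) pairs known to be down
    Returns the full hop list (the source route), or None if unreachable.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in adjacency.get(node, ()):
            if nbr in parent or frozenset((node, nbr)) in failed_links:
                continue
            parent[nbr] = node
            queue.append(nbr)
    return None


adjacency = {
    "DE1": ["S1", "S2"],
    "S1": ["DE1", "S3"],
    "S2": ["DE1", "S3"],
    "S3": ["S1", "S2"],
}
route_ok = source_route(adjacency, set(), "DE1", "S3")
route_failover = source_route(adjacency, {frozenset(("S1", "S3"))}, "DE1", "S3")
master = elect_master({"DE3", "DE1", "DE2"})
```

Because the sender puts the whole path in the message, switches need no routing state of their own to forward control traffic, which is what keeps this plane simple.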

14
Evaluation of the 4D Prototype
  • Evaluated using Emulab (www.emulab.net)
  • Linux PCs used as routers (650-800 MHz)
  • Tested on 9 enterprise network topologies
    (10-100 routers each)

Example network with 49 switches and 5 DEs
15
Performance of the 4D Prototype
  • Trivial prototype has performance comparable to
    well-tuned production networks
  • Recovers from single link failure in < 300 ms
  • < 1 s response considered excellent
  • Faster forwarding reconvergence possible
  • Survives failure of master Decision Element
  • New DE takes control within 1 s
  • No disruption unless second fault occurs
  • Gracefully handles complete network partitions
  • Less than 1.5 s of outage

16
Fundamental Problem: Wrong Abstractions
Shell scripts
Traffic Eng
  • Management Plane
  • Figure out what is happening in network
  • Decide how to change it

Planning tools
Databases
Configs
SNMP
netflow
modems
OSPF
  • Control Plane
  • Multiple routing processes on each router
  • Each router with different configuration program
  • Huge number of control knobs: metrics, ACLs,
    policy

Link metrics
Routing policies
FIB
  • Data Plane
  • Distributed routers
  • Forwarding, filtering, queueing
  • Based on FIB or labels

FIB
FIB
Packet filters
17
Good Abstractions Reduce Complexity
Management Plane
Configs
Decision Plane
Control Plane
FIBs, ACLs
FIBs, ACLs
Dissemination
Data Plane
Data Plane
  • All decision making logic lifted out of control
    plane
  • Eliminates duplicate logic in management plane
  • Dissemination plane provides robust communication
    to/from data plane switches

18
Today, Simple Things are Hard to Do
D
Inter-POP Links
Access Networks
19
Fundamental Problem: Configurations Allow Too
Many Degrees of Freedom
  • Computing configuration files that cause control
    plane to compute desired forwarding states is
    intractable
  • NP-hard in many cases
  • Requires predictive model of control plane
    behavior
  • Configuration files form a program that defines
    a set of forwarding states
  • Very hard to create a program that permits only
    desired states, and doesn't transit through bad
    ones

Forwarding states allowed by configs
Auto-adaptation leads to/thru bad states
Direct Control avoids bad states
20
Fundamental Problem: Conflation of Issues
  • Ideal case: all routing information flooded to
    all routers inside network
  • Robustness achieved via flooding
  • Reality: routing information filtered and
    aggregated extensively
  • Route filtering used to implement security and
    resource policies
  • Route aggregation used to achieve scalability

21
4D Separates Distributed Computing Issues from
Networking Issues
  • Distributed computing issues -> protocols and
    network architecture
  • Overhead
  • Resiliency
  • Scalability
  • Networking issues -> management logic
  • Traffic engineering and service provisioning
  • Egress point selection
  • Reachability control (VPNs)
  • Precomputation of backup paths

22
Future Work
  • Scalability
  • Evaluate over 1-10K switches, 10-100K routes
  • Networks with backbone-like propagation delays
  • Structuring decision logic
  • Arbitrate among multiple, potentially competing
    objectives
  • Unify control when some logic takes longer than
    others
  • Protocol improvements
  • Better dissemination and discovery planes
  • Deployment in today's networks
  • Data center, enterprise, campus, backbone (RCP)

23
Future Work
  • Experiment with network appliances
  • Traffic shapers, traffic scrubbers
  • Expand relationships with security
  • Using 4D as mechanism for monitoring/quarantine
  • Formulate models that establish bounds of 4D
  • Scale, latency, stability, failure models,
    objectives
  • Generate evidence to support/refute principles

24
Questions?
25
Direct Control Provides Complete Control
  • Zero device-specific configuration
  • Supports many models for pushing routes
  • Trivial push: convergence requires time for all
    updates to be received and applied, same as today
  • Synchronized update: updates propagated, but not
    applied till agreed time in the future; clock
    skew defines convergence time
  • Controlled state trajectory: DE serializes
    updates to avoid all incorrect transient states
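
To make the "controlled state trajectory" idea concrete, here is a small sketch for a single destination. It assumes one particular serialization rule that the slides do not specify: update routers in order of their hop distance to the destination in the new state, so a router switches only after its new next hop has. All names are hypothetical, and the new state is assumed to be loop-free.

```python
def has_loop(next_hop, dst):
    """True if some router's packets revisit a router before reaching dst."""
    for start in next_hop:
        seen, node = set(), start
        while node != dst:
            if node in seen:
                return True
            seen.add(node)
            node = next_hop[node]
    return False


def safe_update_order(new_next_hop, dst):
    """Routers closest to dst (in the new state) are updated first."""
    def hops_to_dst(node):
        hops = 0
        while node != dst:
            node = new_next_hop[node]
            hops += 1
        return hops
    return sorted(new_next_hop, key=hops_to_dst)


def apply_serialized(old_next_hop, new_next_hop, dst):
    """Apply one router's update at a time, checking every transient state."""
    state = dict(old_next_hop)
    loop_free = []
    for router in safe_update_order(new_next_hop, dst):
        state[router] = new_next_hop[router]
        loop_free.append(not has_loop(state, dst))
    return state, loop_free


old = {"A": "B", "B": "D"}   # A -> B -> D
new = {"A": "D", "B": "A"}   # B -> A -> D
order = safe_update_order(new, "D")
final, loop_free = apply_serialized(old, new, "D")
naive_has_loop = has_loop({**old, "B": new["B"]}, "D")  # B updated first
```

Updating B before A would leave A pointing at B and B pointing at A, a transient forwarding loop; the serialized order avoids it at the cost of applying updates one step at a time.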

26
Fundamental Problem: Wrong Abstractions

interface Ethernet0
 ip address 6.2.5.14 255.255.255.128
interface Serial1/0.5 point-to-point
 ip address 6.2.2.85 255.255.255.252
 ip access-group 143 in
 frame-relay interface-dlci 28
router ospf 64
 redistribute connected subnets
 redistribute bgp 64780 metric 1 subnets
 network 66.251.75.128 0.0.0.127 area 0
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor 66.253.160.68 remote-as 12762
 neighbor 66.253.160.68 distribute-list 4 in
access-list 143 deny 1.1.0.0/16
access-list 143 permit any
route-map 8aTzlvBrbaW deny 10
 match ip address 4
route-map 8aTzlvBrbaW permit 20
 match ip address 7
ip route 10.2.2.1/16 10.2.1.7
27
Fundamental Problem Wrong Abstractions
[Figure: Size of configuration files in a single enterprise network (881 routers). y-axis: lines in config file (0 to 2000); x-axis: router ID, sorted by file size (0 to 881)]
30
Fundamental Problem: Conflating Distributed
Systems Issues with Networking Issues
[Figure: routing process graph. Labels: Routing Process (x3), D, D left]
  • Distributed Systems Concern: resiliency to link
    failures
  • Solution: multiple paths through routing process
    graph

31
Fundamental Problem: Conflating Distributed
Systems Issues with Networking Issues
[Figure: routing process graph. Labels: Routing Process (x3), D, D left, D right]
  • Distributed Systems Concern: resiliency to link
    failures
  • Solution: multiple paths through routing process
    graph

32
Fundamental Problem: Conflating Distributed
Systems Issues with Networking Issues
[Figure: routing process graph. Labels: Routing Process (x3), Filter routes to D, D, D left]
  • Networking Concern: implement resource or
    security policy
  • Solution: restrict flow of routing information,
    filter routes, summarize/aggregate routes

33
4D Supports Network Evolution & Expansion
  • Decision logic can be upgraded as needed
  • No need for update of distributed protocols
    implemented in software distributed on every
    switch
  • Decision Elements can be upgraded as needed
  • Network expansion requires upgrades only to DEs,
    not every switch

34
Reachability Example
R1
R2
Chicago (chi)
New York (nyc)
Data Center
Front Office
R5
R4
R3
  • Two locations, each with data center & front
    office
  • All routers exchange routes over all links

35
Reachability Example
R1
R2
Chicago (chi)
New York (nyc)
Data Center
Front Office
R5
R4
R3
chi-DC
chi-FO
nyc-DC
nyc-FO
chi-DC
chi-FO
nyc-DC
nyc-FO
36
Reachability Example
Packet filter: Drop nyc-FO ->; Permit
R1
R2
chi
Data Center
Front Office
Packet filter: Drop chi-FO ->; Permit
R5
nyc
R4
R3
37
Reachability Example
Packet filter: Drop nyc-FO ->; Permit
R1
R2
chi
Data Center
Front Office
Packet filter: Drop chi-FO ->; Permit
R5
nyc
R4
R3
  • A new short-cut link added between data centers
  • Intended for backup traffic between centers

38
Reachability Example
Packet filter: Drop nyc-FO ->; Permit
R1
R2
chi
Data Center
Front Office
Packet filter: Drop chi-FO ->; Permit
R5
nyc
R4
R3
  • Oops: new link lets packets violate security
    policy!
  • Routing changed, but
  • Packet filters don't update automatically
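
The failure mode on this slide can be checked mechanically once the policy is stated as a network-level objective. The sketch below is an assumption-heavy illustration: the exact topology and filter placement in the slide's figure did not survive in the transcript, so the link layout, the filter locations, and the `can_reach` helper are all hypothetical. It also asks only whether some unfiltered path exists, which is more conservative than following the routes actually chosen.

```python
from collections import deque


def bidirectional(pairs):
    """Expand undirected router pairs into directed links."""
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}


def can_reach(links, filters, attach, src_net, dst_net):
    """True if some path carries packets sourced in src_net to dst_net.

    links: set of directed (from_router, to_router) pairs
    filters: {(from_router, to_router): set of source subnets dropped there}
    attach: {subnet: router the subnet hangs off}
    """
    start, goal = attach[src_net], attach[dst_net]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for a, b in links:
            if a != node or b in seen:
                continue
            if src_net in filters.get((a, b), set()):
                continue  # this hop drops packets from src_net
            seen.add(b)
            queue.append(b)
    return False


# Hypothetical layout: R1/R2 in chi, R4/R5 in nyc, R3 between the sites.
attach = {"chi-DC": "R1", "chi-FO": "R2", "nyc-FO": "R4", "nyc-DC": "R5"}
links = bidirectional([("R1", "R2"), ("R2", "R3"), ("R3", "R4"), ("R4", "R5")])
filters = {("R2", "R1"): {"nyc-FO"}, ("R4", "R5"): {"chi-FO"}}

before = can_reach(links, filters, attach, "chi-FO", "nyc-DC")
with_shortcut = links | bidirectional([("R1", "R5")])  # new DC-to-DC link
after = can_reach(with_shortcut, filters, attach, "chi-FO", "nyc-DC")
```

With the policy held centrally, a check like this (or a recomputation of filter placement) can run on every topology change instead of relying on someone remembering to edit the filters.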

39
Prohibiting Packets from chi-FO to nyc-DC
40
Reachability Example
Packet filter: Drop nyc-FO ->; Permit
R2
R1
chi
Data Center
Front Office
Packet filter: Drop chi-FO ->; Permit
R5
nyc
R4
R3
  • Typical response: add more packet filters to
    plug the holes in security policy

41
Reachability Example
Drop nyc-FO ->
R2
R1
chi
Data Center
Front Office
R5
nyc
Drop chi-FO ->
R4
R3
  • Packet filters have surprising consequences
  • Consider a link failure
  • chi-FO and nyc-FO still connected

42
Reachability Example
Drop nyc-FO ->
R2
R1
chi
Data Center
Front Office
R5
nyc
Drop chi-FO ->
R4
R3
  • Network has less survivability than topology
    suggests
  • chi-FO and nyc-FO still connected
  • But packet filter means no data can flow!
  • Probing the network won't predict this problem

43
Allowing Packets from chi-FO to nyc-FO
44
Multiple Interacting Routing Processes
Client
Server
45
The Routing Instance Graph of an 881-Router
Network
46
Reconvergence Time Under Single Link Failure
47
Reconvergence Time When Master DE Crashes
48
Reconvergence Time When Network Partitions
49
Reconvergence Time When Network Partitions
50
Many Implementations Possible
Single redundant decision engine
  • Multiple decision engines
  • Hot stand-by
  • Divide network & load share
  • Distributed decision engines
  • Up to one per router
  • Choice can be based on reliability requirements
  • Dissemination plane can be in-band, or leverage
    OOB links
  • Less need for distributed solutions (harder to
    reason about)
  • More focus on network issues, less on distributed
    protocols

51
Direct Expression Enables New Algorithms
D
  • OSPF normally calculates a single path to each
    destination D
  • OSPF allows load-balancing only for equal-cost
    paths to avoid loops
  • Using ECMP requires careful engineering of link
    weights

D
  • Decision Plane with network-wide view can compute
    multiple paths
  • Backup paths installed for free!
  • Bounded stretch, bounded fan-in
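
As a rough sketch of how a decision plane with a network-wide view might compute a primary and a backup path, the code below finds a shortest path by hop count, removes its links, and searches again. This simple two-pass approach is an assumption for illustration: it does not enforce the bounded stretch or bounded fan-in mentioned above, and it can miss a disjoint pair that a more careful algorithm would find. The topology and names are hypothetical.

```python
from collections import deque


def shortest_path(adj, src, dst, banned=frozenset()):
    """Fewest-hop path from src to dst that uses no banned link, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in adj.get(node, ()):
            if nbr in parent or frozenset((node, nbr)) in banned:
                continue
            parent[nbr] = node
            queue.append(nbr)
    return None


def primary_and_backup(adj, src, dst):
    """Return (primary, backup), where backup shares no link with primary."""
    primary = shortest_path(adj, src, dst)
    if primary is None:
        return None, None
    used = {frozenset(pair) for pair in zip(primary, primary[1:])}
    return primary, shortest_path(adj, src, dst, banned=used)


adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "E": ["C", "D"],
    "D": ["B", "E"],
}
primary, backup = primary_and_backup(adj, "A", "D")
```

Both paths can be written into the data plane at once, so the backup is already in place when a link on the primary fails.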

52
Systems of Systems
  • Systems are designed as components to be used in
    larger systems in different contexts, for
    different purposes, interacting with different
    components
  • Example: OSPF and BGP are each complex systems in
    their own right; they are also components in a
    routing system of a network, interacting with each
    other, with packet filters, and with management
    tools
  • Complex configuration to enable flexibility
  • The glue has tremendous impact on network
    performance
  • State of the art: multiple interacting distributed
    programs written in assembly language
  • Lack of intellectual framework to understand
    global behavior

53
Supporting Network Evolution
  • Logic for controlling the network needs to change
    over time
  • Traffic engineering rules
  • Interactions with other networks
  • Service characteristics
  • Upgrades to field-deployed network equipment must
    be avoided
  • Very high cost
  • Software upgrades often require hardware upgrades
    (more CPU or memory)

54
Supporting Network Evolution Today
  • Today's Solution
  • Vendors stuff their routers with software
    implementing all possible features
  • Multiple routing protocols
  • Multiple signaling protocols (RSVP, CR-LDP)
  • Each feature controlled by parameters set at
    configuration time to achieve late binding
  • Feature-creep creates configuration nightmare
  • Tremendous complexity for syntax & semantics
  • Mis-interactions between features are common
  • Our Goal: Separate decision making logic from the
    field-deployed devices

55
Supporting Network Expansion
  • Networks are constantly growing
  • New routers/switches/links added
  • Old equipment rarely removed
  • Adding a new switch can cause old equipment to
    become overloaded
  • CPU/Memory demands on each device should not
    scale up with network size

56
Supporting Network Expansion Today
  • Routers run a link-state routing protocol
  • Size of link-state database scales with # of
    routers
  • Expanding network can exceed memory limits of old
    routers
  • Today's Solution
  • Monitor resources on all routers
  • Predict approach of exhaustion and then:
  • Global upgrade
  • Rearchitecture of routing design to add
    summarization, route aggregation, information
    hiding
  • Our Goal: make demands scale with hardware (e.g.,
    # of interfaces)

57
Supporting Remote Devices
  • Maintaining communication with all network
    devices is critical for network management
  • Diagnosis of problems
  • Monitoring status and network health
  • Updating configuration or software
  • The chicken or the egg:
  • Cannot send device configuration/management
    information until it can communicate
  • Device cannot communicate until it is correctly
    configured

58
Supporting Remote Devices Today
  • Today's Solution
  • Use PSTN as management network of last resort
  • Connect console of remote routers to phone modem
  • Can't be used for customer premise equipment
    (CPE): DSL/cable modems, integrated access
    devices (IADs)
  • In a converged network, PSTN is decommissioned
  • Our Goal: Preserve management communication to
    any device that is not physically partitioned,
    regardless of configuration state

59
Recent Publications
  • G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A.
    Greenberg, G. Hjalmtysson, J. Rexford, "On Static
    Reachability Analysis of IP Networks," IEEE
    INFOCOM 2005, Orlando, FL, March 2005.
  • J. Rexford, A. Greenberg, G. Hjalmtysson, D. A.
    Maltz, A. Myers, G. Xie, J. Zhan, H. Zhang,
    "Network-Wide Decision Making: Toward a
    Wafer-Thin Control Plane," Proceedings of ACM
    HotNets-III, San Diego, CA, November 2004.
  • D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A.
    Greenberg, H. Zhang, "Routing Design in
    Operational Networks: A Look from the Inside,"
    Proceedings of the 2004 Conference on
    Applications, Technologies, Architectures, and
    Protocols for Computer Communications (ACM
    SIGCOMM 2004), Portland, Oregon, 2004.
  • D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G.
    Hjalmtysson, A. Greenberg, J. Rexford, "Structure
    Preserving Anonymization of Router Configuration
    Data," Proceedings of ACM/Usenix Internet
    Measurement Conference (IMC 2004), Sicily, Italy,
    2004.