Rethinking Network Control

1
Rethinking Network Control & Management: The Case
for a New 4D Architecture
  • David A. Maltz
  • Carnegie Mellon University
  • Joint work with
  • Albert Greenberg, Gisli Hjalmtysson
  • Andy Myers, Jennifer Rexford, Geoffrey Xie,
  • Hong Yan, Jibin Zhan, Hui Zhang

2
Is the Network Down Again?
  • You sit at your home computer, trying to access a
    computer at work
  • But no data is getting through
  • Minutes or hours later, data flows again
  • You never find out why

Network operators aren't much better at
predicting outages
3
Outline
  • What do networks look like today?
  • New approach to predicting network behavior
  • A new architecture for controlling networks

4
Many Kinds of Networks
  • Each has a different:
  • Size: generally 10-1000 routers each
  • Owner: company, university, organization
  • Topology: mesh, tree, ring
  • Examples:
  • Enterprise/Campus networks
  • Access networks: DSL, cable modems
  • Metro networks: connect up businesses in cities
  • Data center networks: disk arrays and servers
  • Transit/Backbone networks

5
A Conventional View of a Network
[Figure: example topology of ten nodes, A through J]
  • Physical topology is a graph of nodes and links
  • Run Dijkstra to find route to each node
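
The conventional view treats route computation as a shortest-path problem on the topology graph. A minimal sketch of Dijkstra's algorithm follows; the four-node topology and its link costs are hypothetical and are not the A-J graph from the slide.

```python
import heapq

def shortest_paths(graph, source):
    """Return (dist, prev) for every node reachable from source.

    graph maps node -> {neighbor: link_cost}; costs must be non-negative.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node  # remember the predecessor on the best path
                heapq.heappush(heap, (nd, nbr))
    return dist, prev

# Hypothetical topology, for illustration only.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, prev = shortest_paths(topology, "A")
```

As the next slide argues, a result like this describes connectivity only; it says nothing about filters or routing policy.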

6
A Conventional View of a Network
[Figure: example topology of ten nodes, A through J]
Knowing how the routers are connected says almost
nothing about whether or not two hosts can
communicate
  • Physical topology is a graph of nodes and links
  • Run Dijkstra to find route to each node

7
Network Equipment
Picture from Internet2 Abilene Network
  • Boxes: router, switch
  • Links: Ethernet, SONET, T1, ...

8
The Data Plane of a Network
Hosts/servers
Router/Switch
Interfaces
9
Packets
[Figure: a packet consists of meta-data (source address, destination address, port numbers, ...) and user data]
  • For this talk, networks traffic in packets
  • A sequence of bytes processed as a unit

10
The Data Plane of a Network
Destination NextHop
A left
B right
C left
  • Forwarding Information Base (FIB)
  • Basically a look-up table, each entry is a route
  • Tests fields of packet and determines which
    interface to send packet out
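
A FIB lookup can be sketched as a small table search. The sketch below assumes IP prefixes and longest-prefix matching, which is how IP routers choose among overlapping routes; the prefixes and the interface names are hypothetical and only echo the "left"/"right" labels on the slide.

```python
import ipaddress

# Hypothetical FIB: each entry is (destination prefix, outgoing interface).
FIB = [
    (ipaddress.ip_network("10.0.0.0/8"), "left"),
    (ipaddress.ip_network("10.1.0.0/16"), "right"),
    (ipaddress.ip_network("0.0.0.0/0"), "left"),
]

def lookup(fib, dst):
    """Return the outgoing interface for dst using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, iface) for net, iface in fib if addr in net]
    if not matches:
        return None  # no matching route: the packet cannot be forwarded
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0])[1]
```

For instance, `lookup(FIB, "10.1.2.3")` matches all three entries and the /16 route is chosen.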

11
The Data Plane of a Network
Permit A->B; Drop C->B
  • Packet Filter
  • Specific to a single interface
  • Tests fields of packet and determines whether to
    permit or drop packet
  • Finer granularity than FIB: can test more
    fields, even target specific applications
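
A packet filter can be sketched as an ordered rule list where the first matching rule decides. The rules below mirror the "Permit A->B; Drop C->B" example; the field names `src` and `dst` and the default of dropping unmatched packets are assumptions made for illustration.

```python
def filter_packet(rules, packet, default="drop"):
    """Return the action of the first rule whose fields all match the packet."""
    for action, match in rules:
        if all(packet.get(field) == value for field, value in match.items()):
            return action
    return default  # assumed behavior when no rule matches

# Rules attached to one interface (hypothetical encoding of the slide's example).
rules = [
    ("permit", {"src": "A", "dst": "B"}),
    ("drop", {"src": "C", "dst": "B"}),
]
```

Because a rule may test any field in the packet, adding a key such as a port number to a rule is enough to target one application.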

12
The Data Plane of a Network
  • Many other mechanisms
  • Queueing discipline
  • Packet transformers (e.g., address translation)

13
The Control Plane of a Network
Destination NextHop
A left
B right
C left
  • Where do FIB entries come from?
  • A distributed system called the Control Plane
  • Control plane failures responsible for many of
    the longest, hardest to debug outages!

14
The Control Plane of a Network
Routing Process
FIB
  • Routers run routing processes

15
The Control Plane of a Network
Routing Process
Routing Process
Routing Process
FIB
FIB
FIB
  • Adjacent processes exchange routing information
  • Information format defined by routing protocol
  • Many routing protocols: BGP, OSPF, RIP, EIGRP
  • Adjacent processes must use the same protocol

16
The Control Plane of a Network
Routing Process
Routing Process
Routing Process
Destination NextHop
D left
FIB
FIB
FIB
  • Routing protocols define logic for computing
    routes
  • Combine all available information
  • Pick best route for each destination

17
Control Plane Creates Resiliency
[Figure: three routing processes, each holding the route "D left" toward destination D]
18
Control Plane Creates Resiliency
[Figure: one routing process now holds the route "D right"; the other two still hold "D left"]
19
A Study of Operational Production Networks
  • How complicated/simple are real control planes?
  • What is the structure of the distributed system?
  • Use reverse-engineering methodology
  • There are few or no documents
  • The ones that exist are out-of-date
  • Anonymized configuration files for 31 active
    networks (>8,000 configuration files)
  • 6 Tier-1 and Tier-2 Internet backbone networks
  • 25 enterprise networks
  • Sizes between 10 and 1,200 routers
  • 4 enterprise networks significantly larger than
    the backbone networks

20
Excerpts from a Router Configuration File
  • interface Ethernet0
  • ip address 6.2.5.14 255.255.255.128
  • interface Serial1/0.5 point-to-point
  • ip address 6.2.2.85 255.255.255.252
  • ip access-group 143 in
  • frame-relay interface-dlci 28
  • router ospf 64
  • redistribute connected subnets
  • redistribute bgp 64780 metric 1 subnets
  • network 66.251.75.128 0.0.0.127 area 0
  • router bgp 64780
  • redistribute ospf 64 match route-map
    8aTzlvBrbaW
  • neighbor 66.253.160.68 remote-as 12762
  • neighbor 66.253.160.68 distribute-list 4 in

  • access-list 143 deny 1.1.0.0/16
  • access-list 143 permit any
  • route-map 8aTzlvBrbaW deny 10
  • match ip address 4
  • route-map 8aTzlvBrbaW permit 20
  • match ip address 7
  • ip route 10.2.2.1/16 10.2.1.7
21
Size of Configuration Files in One Network
[Figure: lines in config file (0 to 2000) versus router ID (0 to 881, sorted by file size)]
22
Routing Processes Implement Policy
[Figure: routers R1, R2, R3, each with a routing process and a FIB; the route labels "A,B" and "A" appear between them]
  • Extensive use of policy commands to filter routes
  • Prevent some hosts from communicating (security
    policy)
  • Limit access to short-cut links (resource policy)

23
Packet Filters Implement Policy
  • Packet filters used extensively throughout
    networks
  • Protect routers from attack
  • Implement reachability matrix
  • Define which hosts can communicate
  • Localize traffic, particularly multicast

24
Mechanisms for Action at a Distance
[Figure: routers R1, R2, R3, each with a routing process and a FIB; route A is tagged 12 at one routing process and another tests for the tag]
  • Policy often implemented by tagging routes on one
    router
  • And testing for tag at another router
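
The tag-and-test pattern can be sketched as two small policies applied on different routers. The route representation, the helper names, and the choice to drop tagged routes at the second router are all assumptions for illustration; a real policy could just as well accept only tagged routes.

```python
def tag_routes(routes, prefix, tag):
    """Policy on the first router: attach a tag to routes for one prefix."""
    return [dict(r, tag=tag) if r["prefix"] == prefix else dict(r) for r in routes]

def drop_tagged(routes, tag):
    """Policy on a distant router: filter out routes carrying the tag."""
    return [r for r in routes if r.get("tag") != tag]

# Hypothetical route advertisements.
advertised = [{"prefix": "A"}, {"prefix": "B"}]
tagged = tag_routes(advertised, "A", 12)
accepted = drop_tagged(tagged, 12)
```

The two policies live in different configuration files, which is why the behavior is hard to see by reading either router's configuration alone.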

25
Multiple Interacting Routing Processes
Client
Server
26
The Routing Instance Graph of an 881-Router
Network
27
Take Away Points
  • Networks deal with both creating connectivity
  • and preventing it
  • Networks controlled by complex distributed
    systems
  • Must understand system to understand behavior
  • Focusing on individual protocols is not enough
  • Composition of protocols is important and complex
  • Developed abstractions to model routing design
  • Routing Process Graph: accurately models design
  • Routing Instance: abstracts away details
  • Reverse-engineer routing design from configs

28
Outline
  • What do networks look like today?
  • New approach to predicting network behavior
  • Frame the problem of reachability analysis
  • Sketch algebra for predicting reachability
  • A new architecture for controlling networks

29
Reachability
A
B
j
i
  • Can A send a packet to B?
  • Depends on routing protocols, advertised routes,
    policies, packet filters, ...
  • Predicting reachability is key to network
    survivability and security

30
Reachability
A
B
j
i
  • We focus on two types of policy
  • Survivability: certain packets should always be
    permitted, under all possible network states
  • Security: certain packets should never be
    permitted, under all possible network states

31
Reachability Example
[Figure: routers R1 through R5 spanning Chicago (chi) and New York (nyc); each site has a data center and a front office]
  • Two locations, each with data center & front
    office
  • All routers exchange routes over all links

32
Reachability Example
[Figure: same chi/nyc topology (R1-R5); the subnets chi-DC, chi-FO, nyc-DC, nyc-FO are listed at each site]
33
Reachability Example
[Figure: chi/nyc topology (R1-R5) with two packet filters: "Drop nyc-FO -> ...; Permit ..." and "Drop chi-FO -> ...; Permit ..."]
34
Reachability Example
[Figure: chi/nyc topology (R1-R5) with two packet filters: "Drop nyc-FO -> ...; Permit ..." and "Drop chi-FO -> ...; Permit ..."]
  • A new short-cut link added between data centers
  • Intended for backup traffic between centers

35
Reachability Example
[Figure: chi/nyc topology (R1-R5) with two packet filters: "Drop nyc-FO -> ...; Permit ..." and "Drop chi-FO -> ...; Permit ..."]
  • Oops: new link lets packets violate security
    policy!
  • Routing changed, but
  • Packet filters don't update automatically

36
Reachability Example
[Figure: chi/nyc topology (R1-R5) with two packet filters: "Drop nyc-FO -> ...; Permit ..." and "Drop chi-FO -> ...; Permit ..."]
  • Typical response: add more packet filters to
    plug the holes in security policy

37
Reachability Example
[Figure: chi/nyc topology (R1-R5) with filters "Drop nyc-FO -> ..." and "Drop chi-FO -> ..."]
  • Packet filters have surprising consequences
  • Consider a link failure
  • chi-FO and nyc-FO still connected

38
Reachability Example
[Figure: chi/nyc topology (R1-R5) with filters "Drop nyc-FO -> ..." and "Drop chi-FO -> ..."]
  • Network has less survivability than topology
    suggests
  • chi-FO and nyc-FO still connected
  • But packet filter means no data can flow!
  • Probing the network won't predict this problem

39
State of the Art in Reachability Analysis
  • Build the network, try sending packets
  • ping, traceroute, monitoring tools
  • Only checks paths currently selected by routing
    protocols
  • Cannot be used for "what if" analysis
  • Our goal: Static Reachability Analysis
  • Predict reachability over multiple scenarios
    through analysis of router configuration files

40
Predicting Reachability
  • How can we formalize the reachability provided by
    a network?
  • The set of packets the network will carry from
    router i to router j
  • A function of the forwarding state s
  • s represents the contents of each FIB
  • $R_{i,j}(s)$ is the instantaneous reachability
    from router i to router j
41
Computing Reachability
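$$R_{i,j}(s) = \bigcup_{p \in P(i,j)} \; \bigcap_{(u,v) \in p} F_{u,v}(s)$$
  • $P(i,j)$: the set of all paths from i to j
  • $\bigcap_{(u,v) \in p} F_{u,v}(s)$: the packets allowed along path p
  • $F_{i,j}(s)$: set of packets permitted along the link from
    node i to node j in network state s
[Figure: routers R1 through R4 with per-link sets F1,2(s), F2,1(s), F2,3(s), F3,2(s), F3,4(s), F4,3(s)]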
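
The instantaneous reachability from i to j is the union, over every path from i to j, of the packets permitted on all links of that path. A minimal sketch under simplifying assumptions: packets are represented by hypothetical class labels such as "web" and "ssh", the per-link sets are given directly as a dictionary, and paths are enumerated by brute force, which is only practical for tiny graphs.

```python
def simple_paths(links, src, dst, seen=()):
    """Yield every loop-free path from src to dst as a tuple of nodes."""
    if src == dst:
        yield (src,)
        return
    for (u, v) in links:
        if u == src and v not in seen:
            for rest in simple_paths(links, v, dst, seen + (src,)):
                yield (src,) + rest

def reachability(F, src, dst):
    """Union over paths of the intersection of the per-link packet sets."""
    reachable = set()
    for path in simple_paths(F, src, dst):
        hops = list(zip(path, path[1:]))
        if hops:
            # A packet survives a path only if every link on it permits the packet.
            reachable |= set.intersection(*(F[h] for h in hops))
    return reachable

# Hypothetical per-link packet sets for one forwarding state s.
F = {
    ("R1", "R2"): {"web", "ssh"},
    ("R2", "R3"): {"web"},
    ("R1", "R3"): {"ssh"},
}
```

Here "web" reaches R3 via R2 and "ssh" reaches it over the direct link, so both are in the result for R1 to R3, while nothing flows from R3 to R1 because no links point that way.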
42
Jointly Modeling the Effects of Packet Filters
and Routing
  • Key Problem
  • $F_{i,j}(s)$ is affected by both routing and packet filters
  • Key Insight
  • Treat routes as dynamic packet filters

[Figure: routers R1, R2, R3 and one FIB]
Dest  NextHop
A     R3
B     R1
C     R3
43
Bounding the Instantaneous Reachability
  • Knowing the exact forwarding state s is
    impractical
  • Knowing $R_{i,j}(s)$ doesn't help much, anyway
  • Want to predict behavior over a range of states
  • Luckily, predicting behavior over set of all
    possible states is easier than predicting
    reachability for a single state

44
Reachability Bounds
  • Lower bound on Reachability
  • Packets in this set never prohibited by network
  • Upper bound on Reachability
  • Packets not in this set always prohibited by
    network
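
One way to write these two bounds down, assuming $S$ denotes the set of all forwarding states the network can enter (a symbol introduced here for illustration) and $R_{i,j}(s)$ is the instantaneous reachability defined earlier:

```latex
R^{L}_{i,j} = \bigcap_{s \in S} R_{i,j}(s)
\qquad
R^{U}_{i,j} = \bigcup_{s \in S} R_{i,j}(s)
```

A packet in $R^{L}_{i,j}$ is carried in every state, so it is never prohibited; a packet outside $R^{U}_{i,j}$ is carried in no state, so it is always prohibited. By construction $R^{L}_{i,j} \subseteq R_{i,j}(s) \subseteq R^{U}_{i,j}$ for every $s \in S$.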

45
Example Upper Bound Analysis
  • Before short-cut link added
  • After short-cut link added

46
Example Lower Bound Analysis
[Figure: chi/nyc topology (R1-R5) with two packet filters: "Drop nyc-FO -> ...; Permit ..." and "Drop chi-FO -> ...; Permit ..."]
  • Before extra packet filters added
  • After extra packet filters added

47
Take Away Points
  • We have defined an algebra for modeling
    reachability
  • Packet filters, routing protocols, NAT
  • Griffin & Bush validated RFC 2547 VPNs
  • Status
  • Algebra works on test cases
  • Currently experimenting with production networks
  • Algebra's strength and weakness is static
    analysis
  • Can validate that network meets static objectives
  • Can have false positives
  • Cannot design the network to meet objectives
  • Cannot control network to obey dynamic objectives

48
Outline
  • What do networks look like today?
  • New approach to predicting network behavior
  • A new architecture for controlling networks
  • New principles for network control
  • New architecture embodying those principles
  • Experimental validation

49
Does Network Control Actually Matter?
  • YES!
  • Microsoft: all services fell off the network for
    23 hours due to misconfiguration of routers in
    their network (2001)
  • Major ISP: 50% of outages occur during planned
    maintenance (2005)
  • IP networks have 2-3x as many outages as
    circuit-switched networks (2005)

50
Three Principles for Network Control & Management
  • Network-level Objectives
  • Express goals explicitly
  • Security policies, QoS, egress point selection
  • Do not bury goals in box-specific configuration

Reachability matrix Traffic engineering rules
Management Logic
51
Three Principles for Network Control & Management
  • Network-wide Views
  • Design network to provide timely, accurate info
  • Topology, traffic, resource limitations
  • Give logic the inputs it needs

Reachability matrix Traffic engineering rules
Management Logic
Read state info
52
Three Principles for Network Control & Management
  • Direct Control
  • Allow logic to directly set forwarding state
  • FIB entries, packet filters, queuing parameters
  • Logic computes desired network state, let it
    implement it

Reachability matrix Traffic engineering rules
Write state
Management Logic
Read state info
53
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Decision Plane
  • All management logic implemented on centralized
    servers making all decisions
  • Decision Elements use views to compute data plane
    state that meets objectives, then directly write
    this state to routers
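
A decision element can be pictured as one function from a network-wide view and the objectives to per-router data plane state. The sketch below is a toy illustration, not the algorithm of the 4D prototype: it assumes bidirectional, equal-cost links, uses breadth-first search for next hops, and turns a router-level reachability matrix into ingress drop lists. The function name and data layout are hypothetical.

```python
from collections import deque

def compute_state(topology, reach):
    """Return (fibs, filters) for every router.

    topology: router -> set of neighboring routers (links assumed bidirectional).
    reach: set of (src_router, dst_router) pairs that are allowed to communicate.
    """
    fibs = {r: {} for r in topology}
    for dst in topology:
        # BFS outward from dst gives every other router its next hop toward dst.
        queue = deque([dst])
        seen = {dst}
        while queue:
            node = queue.popleft()
            for nbr in sorted(topology[node]):
                if nbr not in seen:
                    seen.add(nbr)
                    fibs[nbr][dst] = node
                    queue.append(nbr)
    # Ingress filter: each router drops traffic to destinations it may not reach.
    filters = {
        src: sorted(d for d in topology if d != src and (src, d) not in reach)
        for src in topology
    }
    return fibs, filters

# Hypothetical three-router chain and reachability matrix.
topology = {"R1": {"R2"}, "R2": {"R1", "R3"}, "R3": {"R2"}}
reach = {("R1", "R2"), ("R2", "R1"), ("R2", "R3"), ("R3", "R2")}
fibs, filters = compute_state(topology, reach)
```

Because routes and filters come out of the same computation, rerunning it after a topology change updates both together.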

54
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Dissemination Plane
  • Provides a robust communication channel to each
    router
  • May run over same links as user data, but
    logically separate and independently controlled

55
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Discovery Plane
  • Each router discovers its own resources and its
    local environment
  • E.g., the identity of its immediate neighbors

56
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Data Plane
  • Spatially distributed routers/switches
  • No need to change today's technology

57
Control & Management Today
  • Management Plane
  • Figure out what is happening in network
  • Decide how to change it

Shell scripts
Traffic Eng
Planning tools
Databases
Config files
SNMP
netflow
OSPF
  • Data Plane
  • Distributed routers
  • Forwarding, filtering, queueing
  • Based on FIB or labels

Packet filters
58
Good Abstractions Reduce Complexity
Management Plane
Configs
Decision Plane
Control Plane
FIBs, ACLs
FIBs, ACLs
Dissemination
Data Plane
Data Plane
  • All decision making logic lifted out of control
    plane
  • Eliminates duplicate logic in management plane
  • Dissemination plane provides robust communication
    to/from data plane routers

59
Three Key Questions
  • Could the 4D architecture ever be deployed?
  • Is the 4D architecture feasible?
  • Can the 4D architecture actually simplify network
    control and management?

60
Deployment of the 4D Architecture
  • Pre-existing industry trend towards separating
    router hardware from software
  • IETF: ForCES, GSMP, GMPLS
  • SoftRouter [Lakshman, HotNets'04]
  • Incremental deployment path exists
  • Individual networks can upgrade to 4D and gain
    benefits
  • Small enterprise networks have most to gain

61
The Feasibility of the 4D Architecture
  • We designed and built a prototype of the 4D
  • Decision plane
  • Contains logic to simultaneously compute routes
    and enforce reachability matrix
  • Multiple Decision Elements per network, using
    simple election protocol to pick master
  • Dissemination plane
  • Uses source routes to direct control messages
  • Extremely simple, but can route around failed
    data links

62
Performance of the 4D Prototype
  • Evaluated using Emulab (www.emulab.net)
  • Linux PCs used as routers (650-800 MHz)
  • Tested on 9 enterprise network topologies (10-100
    routers each)
  • Recovers from single link failure in < 300 ms
  • < 1 s response considered excellent
  • Survives failure of master Decision Element
  • New DE takes control within 1 s
  • No disruption unless second fault occurs
  • Gracefully handles complete network partitions
  • Less than 1.5 s of outage

63
4D Makes Network Management & Control Error-proof
[Figure: chi/nyc topology (R1-R5) with subnets chi-DC, chi-FO, nyc-DC, nyc-FO and two packet filters: "Drop nyc-FO -> ...; Permit ..." and "Drop chi-FO -> ...; Permit ..."]
64
Prohibiting Packets from chi-FO to nyc-DC
65
4D Makes Network Management & Control Error-proof
[Figure: chi/nyc topology (R1-R5) with filters "Drop nyc-FO -> ..." and "Drop chi-FO -> ..."]
66
Allowing Packets from chi-FO to nyc-FO
67
Related Work
  • Driving network operation from network-wide views
  • Traffic Engineering
  • Traffic Matrix computation
  • Centralization of decision making logic
  • Routing Control Point [Feamster]
  • Path Computation Element [Farrel]
  • Signaling System 7 [Ma Bell]

68
Take Aways
  • No need for complicated distributed system in
    control plane: do away with it!
  • 4D Architecture: a promising approach
  • Power of solution comes from
  • Colocating all decision making in one plane
  • Providing that plane with network-wide views
  • Directly express solution by writing forwarding
    state
  • Benefits
  • Coordinated state updates -> better reliability
  • Separates network issues from distributed systems
    issues

69
Summary
  • Networks must meet many different types of
    objectives
  • Security, traffic engineering, robustness
  • Today, objectives met using control plane
    mechanisms
  • Results in complicated distributed system
  • Ripe with opportunities to set time-bombs
  • Predicting static properties is possible, but
    difficult
  • Refactoring into a 4D Architecture very
    promising
  • Separates network issues from reliability issues
  • Eliminates duplicate logic and simplifies network
  • Enables new capabilities, like joint control

70
Questions?
71
Backup Slides
72
Computing Reachability Bounds
  • Problem reduced to estimating all routes
    potentially in routing table (FIB) of each router
  • Much easier than predicting exactly which routes
    will be in FIB

73
How to Organize the Decision Plane?
  • We have exposed the network control logic --- now
    what?
  • Need a way to structure that logic
  • Mutual optimization of multiple objectives
  • Potentially mutually exclusive
  • Each objective has different time constants
  • Multiple objectives may affect the same bit of
    data-plane state

74
Future Directions
  • 4D in different network contexts
  • Ethernet networks
  • Mixed networks: circuit- and packet-switched
  • Include services in the 4D
  • Domain Name Service
  • HTTP Proxies and load balancers

75
Reverse-Engineering Overview
  • Input: configuration files
  • Find links -> construct Layer 3 Topology
  • Find adjacent routing processes -> construct Routing Process Graph
  • Condense adjacent routing processes -> construct Routing Instance Graph
[Figure: example Routing Instance Graph with nodes OSPF 1, OSPF 2, BGP AS1, and AS2]
76
Reconstruct the Layer 3 Topology
Internet
Router 1 Config:
  interface Serial1/0.5
  ip address 1.1.1.1 255.255.255.252
  ...
Router 2 Config:
  interface Serial2/1.5
  ip address 1.1.1.2 255.255.255.252
  ...
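
Layer 3 links can be recovered by finding interfaces on different routers whose addresses fall in the same subnet. A minimal sketch, assuming the `interface` and `ip address` lines have already been parsed into tuples; the tuple layout, the function name, and placing the `Ethernet0` interface on Router2 are hypothetical.

```python
import ipaddress
from collections import defaultdict

def find_links(interfaces):
    """Group interfaces by IP subnet; a subnet with 2+ interfaces is a link."""
    by_subnet = defaultdict(list)
    for router, ifname, addr, mask in interfaces:
        subnet = ipaddress.ip_interface(f"{addr}/{mask}").network
        by_subnet[subnet].append((router, ifname))
    return {net: ends for net, ends in by_subnet.items() if len(ends) > 1}

# Parsed (router, interface, address, netmask) tuples.
interfaces = [
    ("Router1", "Serial1/0.5", "1.1.1.1", "255.255.255.252"),
    ("Router2", "Serial2/1.5", "1.1.1.2", "255.255.255.252"),
    ("Router2", "Ethernet0", "6.2.5.14", "255.255.255.128"),
]
links = find_links(interfaces)
```

The two serial interfaces share the subnet 1.1.1.0/30 and are reported as one link; the Ethernet interface has no peer in the data and is left out.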
77
Abstract to a Routing Instance Graph
AS2
Policy1
Policy2
OSPF 1
OSPF 2
BGP AS1
  • Pick an unassigned Routing Process
  • Flood fill along process adjacencies, labeling
    processes
  • Repeat until all processes assigned to an
    Instance
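
The three steps above amount to finding connected components of the routing process graph. A minimal sketch, assuming processes are identified by hypothetical strings such as "r1.ospf1" and adjacencies are given as pairs of processes that exchange routes using the same protocol:

```python
def routing_instances(processes, adjacencies):
    """Label each routing process with the id of its routing instance."""
    neighbors = {p: set() for p in processes}
    for a, b in adjacencies:
        neighbors[a].add(b)
        neighbors[b].add(a)
    instance_of = {}
    next_id = 0
    for start in processes:
        if start in instance_of:
            continue  # already labeled by an earlier flood fill
        # Flood fill from an unassigned process along process adjacencies.
        instance_of[start] = next_id
        stack = [start]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if q not in instance_of:
                    instance_of[q] = next_id
                    stack.append(q)
        next_id += 1
    return instance_of

# Hypothetical processes on four routers.
processes = ["r1.ospf1", "r2.ospf1", "r2.bgp", "r3.bgp", "r4.ospf2"]
adjacencies = [("r1.ospf1", "r2.ospf1"), ("r2.bgp", "r3.bgp")]
instances = routing_instances(processes, adjacencies)
```

Route redistribution between two processes on the same router is deliberately not treated as an adjacency here, so the OSPF and BGP processes on r2 end up in different instances.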

78
Textbook Routing Design for Enterprise Networks
EBGP
EBGP
  • Border routers speak eBGP to external peers
  • BGP selects a few key external routes to
    redistribute into OSPF
  • 7 of 25 enterprise networks follow this pattern

AS2
OSPF
BGP AS 1
AS3
79
Reality: A Diversity of Unusual Routing Designs
Rest of the World
BGP AS 2
BGP AS 1
BGP AS 3
BGP AS 4
BGP AS 5
  • Network broken up into compartments, each with
    only 1 to 4 routers
  • Each compartment has its own AS number
  • Hub and spoke logical topology
  • Why? Lots of control over how spokes communicate

80
Reality: A Diversity of Unusual Routing Designs
Rest of the World
BGP AS 1
BGP AS 2
EIGRP
EIGRP
EIGRP
Rest of the World
BGP AS 3
BGP AS 4
  • Network broken up into many compartments, each
    running EIGRP, some with 400 routers
  • BGP used to filter routes passed between
    compartments
  • Compartments themselves pass information between
    BGP speakers
  • Why? Little need for IBGP (few routers speak
    BGP). Lots of control over how packets move
    between compartments

81
Link Down
82
Reconvergence Time Under Single Link Failure
83
Reconvergence Time When Master DE Crashes
84
Reconvergence Time When Network Partitions
85
Reconvergence Time When Network Partitions
86
Slides in Progress or Looking for a Place to Go
87
Separation of Issues
  • The 4D Architecture separates issues
  • Networking logic goes into decision plane

88
Dissemination Plane
  • Make clear that dissem paths can use same
    physical links, but different routing
  • Discovery and dissem packets can be independent
    of data-plane (e.g. IP)
  • IP is very configuration intensive (addresses,
    etc) so we avoid it whenever possible

89
Questions
  • What if I want to take a bunch of hosts and stick
    them together into a small network? Haven't you
    made this common case terrifically hard?
  • Today, I'd use static routes: it's neither
    common nor easy
  • In the 4D model, what do I do?
  • DE co-located on the host
  • Doesn't talk to any other DEs or routers

90
Problems with State of the Art
  • Today: network behavior determined by multiple
    interacting distributed programs, written in
    assembly language
  • No way to visualize or describe routing design
  • Impossible to establish linkage between
    configurations and network objectives
  • Only a few textbook routing designs are widely
    known