Title: Declarative Networking: Extensible Networks with Declarative Queries
1Declarative Networking: Extensible Networks with Declarative Queries
- Boon Thau Loo
- University of California, Berkeley
2Era of change for the Internet
"... in the thirty-odd years since its invention, new uses and abuses ... are pushing the Internet into realms that its original design neither anticipated nor easily accommodates."
Source: Overcoming Barriers to Disruptive Innovation in Networking, NSF Workshop Report '05
3Efforts at Internet Innovation
- Evolution: Overlay Networks
- Commercial (Akamai, VPN, MS Exchange servers)
- P2P (filesharing, telephony)
- Research prototypes on testbed (PlanetLab)
- Revolution: Clean-slate design
- NSF Future Internet Design (FIND) program
- NSF Global Environment for Network Innovations (GENI) initiative
Missing: software tools that can significantly accelerate Internet innovation
4Approach: Declarative Networking
- A declarative framework for networks
- Declarative language: "ask for what you want, not how to implement it"
- Declarative specifications of networks, compiled to distributed dataflows
- Runtime engine to execute distributed dataflows
- Observation: Recursive queries are a natural fit for routing
5P2 Declarative Networking System
http://p2.cs.berkeley.edu
[Figure: P2 Declarative Networking System, with components Query Planner, Dataflow Engine, and Dataflow]
6The Case for Declarative
- Ease of programming
- Compact and high-level representation of protocols
- Orders of magnitude reduction in code size
- Easy customization
- Safety
- Queries are sandboxed within query processor
- Potential for static analysis techniques on safety
- What about efficiency?
- No fundamental overhead when executing standard routing protocols
- Application of well-studied query optimizations
- Note: Same question was asked of relational databases in the 70s.
7Main Contributions
- Declarative Routing [HotNets '04, SIGCOMM '05]
- Extensible Routers (balance of flexibility, efficiency and safety)
- Declarative Overlays [SOSP '05]
- Rapid prototyping of new overlay networks
- Database Fundamentals [SIGMOD '06]
- Network-specific query language and semantics
- Distributed recursive query execution strategies
- Query Optimizations, classical and new
8A Breadth of Use Cases
- Implemented to date
- Textbook routing protocols (3-8 lines, UCB/Wisconsin)
- Chord DHT overlay routing (47 lines, UCB/IRB)
- Narada mesh (16 lines, UCB/Intel)
- Distributed Gnutella/Web crawlers (Dataflow, UCB)
- Lamport/Chandy snapshots (20 lines, Intel/Rice/MPI)
- Paxos distributed consensus (44 lines, Harvard)
- In Progress
- OSPF routing (UCB)
- Distributed Junction Tree statistical inference (UCB)
9Outline
- Background
- The Connection: Routing as a Query
- Execution Model
- Path-Vector Protocol Example
- Query specification → protocol implementation
- More Examples
- Realizing the Connection
- P2 Declarative Routing Engine
- Beyond routing: Declarative Overlays
- Conclusion
10Traditional Router
[Figure: Traditional Router, with the Routing Protocol in the Control Plane and a separate Forwarding Plane]
11Review: Path Vector Protocol
- Advertisement: entire path to a destination
- Each node receives an advertisement, adds itself to the path, and forwards it to its neighbors
[Figure: nodes a, b, c, d. c advertises path [c,d]; b advertises path [b,c,d]; a then holds path [a,b,c,d].]
12Declarative Router
[Figure: Declarative Router next to a Traditional Router. In the Declarative Router, the P2 Engine runs the Routing Protocol in the Control Plane; the Forwarding Plane is unchanged.]
13Datalog rule syntax
<result> :- <condition1>, <condition2>, ..., <conditionN>.
Head: <result>
Body: <condition1>, <condition2>, ..., <conditionN>
- Types of conditions in body
- Input tables: link(src,dst) predicate
- Arithmetic and list operations
- Head is an output table
- Recursive rules: result of head in rule body
14All-Pairs Reachability
R1: reachable(S,D) :- link(S,D)
R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
R1 reads: "For all nodes S,D, if there is a link from S to D, then S can reach D."
link(a,b): "there is a link from node a to node b"
reachable(a,b): "node a can reach node b"
- Input: link(source, destination)
- Output: reachable(source, destination)
15All-Pairs Reachability
R1: reachable(S,D) :- link(S,D)
R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
R2 reads: "For all nodes S,D and Z, if there is a link from S to Z, AND Z can reach D, then S can reach D."
- Input: link(source, destination)
- Output: reachable(source, destination)
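The two reachability rules can be read as a fixpoint computation over the link table. The following is a minimal Python sketch of that reading, not how P2 evaluates the rules; the function name `reachable` and the sample `links` set are illustrative assumptions.

```python
def reachable(links):
    """Naive fixpoint of R1 and R2 over a set of (src, dst) link tuples."""
    reach = set(links)  # R1: reachable(S,D) :- link(S,D)
    while True:
        # R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
        derived = {(s, d) for (s, z) in links for (z2, d) in reach if z == z2}
        if derived <= reach:  # nothing new was derived: fixpoint reached
            return reach
        reach |= derived


# Hypothetical input: a directed chain a -> b -> c -> d.
links = {("a", "b"), ("b", "c"), ("c", "d")}
result = reachable(links)
```

Each pass re-applies R2 to everything derived so far, which is simple but repeats work; the later slides on semi-naive evaluation address exactly that.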
16Towards Network Datalog
- Specify tuple placement
- Value-based partitioning of tables
- Tuples to be combined are co-located
- Rule rewrite ensures body is always single-site
- All communication is among neighbors
- No multihop routing during basic rule execution
- Enforced via simple syntactic restrictions
17Network Datalog
R1: reachable(@S,D) :- link(@S,D)
R2: reachable(@S,D) :- link(@S,Z), reachable(@Z,D)
Query: reachable(@a,N)
Query: reachable(@M,N)
[Figure: nodes a, b, c, d. Each node stores its partition of the input table link and of the output table reachable.]
18Path Vector in Network Datalog
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=S•P2.
Query: path(@S,D,P)
P=S•P2: add S to front of P2
- Input: link(@source, destination)
- Query output: path(@source, destination, pathVector)
19Query Execution
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=S•P2.
Query: path(@a,d,P,C)
[Figure: nodes a, b, c, d, each holding a neighbor table (link) and a forwarding table]
20Query Execution
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=S•P2.
Query: path(@a,d,P,C)
Communication patterns are identical to those in the actual path vector protocol
[Figure: nodes a, b, c, d, each holding a neighbor table (link) and a forwarding table. The forwarding tables now contain path(@b,d,[b,c,d]) and path(@a,d,[a,b,c,d]).]
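To make the execution concrete, here is a small Python simulation of R1 and R2 in which every node keeps its own path table and only sends tuples to direct neighbors. It is a sketch under assumptions: the function `path_vector`, the bidirectional four-node chain, and the cycle check (`s not in path`) are illustrative and are not part of the rules on the slide.

```python
from collections import deque


def path_vector(links):
    """links: set of directed (src, dst) tuples.
    Returns {node: set of (destination, path)} as each node's local path table."""
    neighbors = {}
    for s, d in links:
        neighbors.setdefault(s, set()).add(d)
        neighbors.setdefault(d, set())
    tables = {n: set() for n in neighbors}
    # R1: path(@S,D,P) :- link(@S,D), P=(S,D)
    queue = deque((s, d, (s, d)) for s, d in links)
    while queue:
        node, dst, path = queue.popleft()
        if (dst, path) in tables[node]:
            continue  # already stored at this node
        tables[node].add((dst, path))
        # R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P = S prepended to P2.
        # Here `node` plays Z and advertises its path to each neighbor S.
        for s in sorted(neighbors[node]):
            if s not in path:  # assumption: cycle check, not in the slide's rules
                queue.append((s, dst, (s,) + path))
    return tables


# Hypothetical topology: chain a - b - c - d with links in both directions.
chain = [("a", "b"), ("b", "c"), ("c", "d")]
links = set(chain) | {(y, x) for x, y in chain}
tables = path_vector(links)
```

In this run, node c learns the path to d first, b learns it from c, and a learns it from b, which mirrors the advertisement order of the path vector protocol.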
21Sanity Check
- All-pairs shortest latency path query
- Query convergence time: proportional to the diameter of the network. Same as hand-coded PV.
- Per-node communication overhead: increases linearly with the number of nodes
- Same scalability trends compared with PV/DV protocols
22Outline
- Background
- The Connection: Routing as a Query
- Execution Model
- Path-Vector Protocol Example
- Query specifications → protocol implementation
- Example Queries
- Realizing the Connection
- Declarative Overlays
- Conclusion
23Example Routing Queries
- Best-Path Routing
- Distance Vector
- Dynamic Source Routing
- Policy Decisions
- QoS-based Routing
- Link-state
- Multicast Overlays (Single-Source CBT)
Takeaways
- Compact, natural representation
- Customization: easy to make modifications to get new protocols
- Connection between query optimization and protocols
24All-pairs All-paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
Query: path(@S,D,P,C)
25All-pairs Best-path
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
R3: bestPathCost(@S,D,min<C>) :- path(@S,D,P,C).
R4: bestPath(@S,D,P,C) :- bestPathCost(@S,D,C), path(@S,D,P,C).
Query: bestPath(@S,D,P,C)
26Customizable Best-Paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=FN(C1,C2), P=S•P2.
R3: bestPathCost(@S,D,AGG<C>) :- path(@S,D,P,C).
R4: bestPath(@S,D,P,C) :- bestPathCost(@S,D,C), path(@S,D,P,C).
Query: bestPath(@S,D,P,C)
Customizing C, AGG and FN: lowest RTT, lowest loss rate, highest capacity, best-k
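The customization point can be illustrated by passing FN and AGG in as parameters. The Python sketch below is a centralized illustration only; the function `best_paths`, the sample link costs, and the restriction to simple paths are assumptions made for the example.

```python
import operator


def best_paths(links, fn, agg):
    """links: {(src, dst): cost}. fn combines two costs (FN in R2),
    agg picks the winning cost (AGG in R3).
    Returns {(src, dst): (path, cost)} for the best path per pair."""
    # R1: one-hop paths
    paths = {((s, d), c) for (s, d), c in links.items()}
    while True:
        # R2: prepend S to a path starting at Z; simple paths only (assumption)
        new = {((s,) + p2, fn(c1, c2))
               for (s, z), c1 in links.items()
               for p2, c2 in paths
               if p2[0] == z and s not in p2}
        if new <= paths:
            break
        paths |= new
    by_pair = {}
    for p, c in paths:
        by_pair.setdefault((p[0], p[-1]), []).append((c, p))
    best = {}
    for pair, candidates in by_pair.items():
        best_cost = agg(c for c, _ in candidates)                 # R3
        best[pair] = next((p, c) for c, p in sorted(candidates)   # R4
                          if c == best_cost)
    return best


# Hypothetical link metrics.
links = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 5}
lowest_cost = best_paths(links, fn=operator.add, agg=min)  # additive metric
widest = best_paths(links, fn=min, agg=max)                # bottleneck capacity
```

Swapping `fn` and `agg` changes the protocol's notion of "best" without touching the recursion, which is the point the slide makes about C, AGG and FN.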
27All-pairs All-paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
Query: path(@S,D,P,C)
28Distance Vector
R1: path(@S,D,D,C) :- link(@S,D,C).
R2: path(@S,D,Z,C) :- link(@S,Z,C1), path(@Z,D,W,C2), C=C1+C2.
R3: shortestLength(@S,D,min<C>) :- path(@S,D,Z,C).
R4: nextHop(@S,D,Z,C) :- path(@S,D,Z,C), shortestLength(@S,D,C).
Query: nextHop(@S,D,Z,C)
Count to Infinity problem?
29Distance Vector with Split Horizon
R1: path(@S,D,D,C) :- link(@S,D,C).
R2: path(@S,D,Z,C) :- link(@S,Z,C1), path(@Z,D,W,C2), C=C1+C2, W!=S.
R3: shortestLength(@S,D,min<C>) :- path(@S,D,Z,C).
R4: nextHop(@S,D,Z,C) :- path(@S,D,Z,C), shortestLength(@S,D,C).
Query: nextHop(@S,D,Z,C)
30Distance Vector with Poisoned Reverse
R1: path(@S,D,D,C) :- link(@S,D,C).
R2: path(@S,D,Z,C) :- link(@S,Z,C1), path(@Z,D,W,C2), C=C1+C2, W!=S.
R3: path(@S,D,Z,C) :- link(@S,Z,C1), path(@Z,D,W,C2), C=∞, W=S.
R4: shortestLength(@S,D,min<C>) :- path(@S,D,Z,C).
R5: nextHop(@S,D,Z,C) :- path(@S,D,Z,C), shortestLength(@S,D,C).
Query: nextHop(@S,D,Z,C)
31All-pairs All-Paths
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- link(@S,Z,C1), path(@Z,D,P2,C2), C=C1+C2, P=S•P2.
Query: path(@S,D,P,C)
32Dynamic Source Routing
R1: path(@S,D,P,C) :- link(@S,D,C), P=(S,D).
R2: path(@S,D,P,C) :- path(@S,Z,P1,C1), link(@Z,D,C2), C=C1+C2, P=P1•D.
Query: path(@S,D,P,C)
Predicate reordering: path vector protocol → dynamic source routing
33Other Routing Examples
- Best-Path Routing
- Distance Vector
- Dynamic Source Routing
- Policy Decisions
- QoS-based Routing
- Link-state
- Multicast Overlays (Single-Source CBT)
34Outline
- Background
- The Connection: Routing as a Query
- Realizing the Connection
- Dataflow Generation and Execution
- Recursive Query Processing
- Optimizations
- Semantics in a dynamic network
- Beyond routing: Declarative Overlays
- Conclusion
35Dataflow Graph
[Figure: a single P2 node. Messages enter at Network In, are processed by strands, and leave as messages at Network Out.]
- Nodes in dataflow graph (elements)
- Network elements (send/recv, cc, retry, rate limitation)
- Flow elements (mux, demux, queues)
- Relational operators (selects, projects, joins, aggregates)
36Dataflow Strand
[Figure: a strand. Input tuples pass through strand elements Element1, Element2, ..., Elementn and come out as output tuples.]
Input: Incoming network messages, local table changes, local timer events
Condition: Process input tuple using strand elements
Output: Outgoing network messages, local table updates
37Rule → Dataflow Strands
R2: path(@S,D,P) :- link(@S,Z), path(@Z,D,P2), P=S•P2.
38Localization Rewrite
- Rules may have body predicates at different locations
R2: path(@S,D,P) :- link(@S,Z), path(@Z,D,P2), P=S•P2.
Rewritten rules:
R2a: linkD(S,@D) :- link(@S,D)
R2b: path(@S,D,P) :- linkD(S,@Z), path(@Z,D,P2), P=S•P2.
39Dataflow Strand Generation
R2b: path(@S,D,P) :- linkD(S,@Z), path(@Z,D,P2), P=S•P2.
[Figure: strand elements generated for R2b, fed from Network In]
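One way to picture the strand generated for R2b is as a chain of relational elements applied to each incoming linkD tuple at node Z. The Python sketch below is only an illustration of that idea and does not reproduce P2's element API; the helpers `join`, `project` and `strand`, and the sample table at node b, are hypothetical.

```python
def join(table, in_key, table_key):
    """Element: join each input tuple with matching rows of a local table."""
    def element(tuples):
        for t in tuples:
            for row in table:
                if in_key(t) == table_key(row):
                    yield t + row
    return element


def project(fn):
    """Element: rewrite each tuple into the shape of the rule head."""
    def element(tuples):
        for t in tuples:
            yield fn(t)
    return element


def strand(*elements):
    """Chain elements so the output of one feeds the next."""
    def run(tuples):
        for element in elements:
            tuples = element(tuples)
        return list(tuples)
    return run


# Local path table at node b: path(@b, d, [b,c,d]).
path_at_b = [("b", "d", ("b", "c", "d"))]

# R2b at node Z: join linkD(S,@Z) with path(@Z,D,P2), then emit path(@S,D,S+P2).
r2b = strand(
    join(path_at_b, in_key=lambda t: t[1], table_key=lambda row: row[0]),
    project(lambda t: (t[0], t[3], (t[0],) + t[4])),
)

out = r2b([("a", "b")])  # incoming linkD(a, @b)
```

The resulting tuple is addressed to node a, so in a real deployment it would leave through the network-out side of the dataflow rather than being returned locally.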
40Recursive Query Evaluation
- Semi-naïve evaluation
- Iterations (rounds) of synchronous computation
- Results from the i-th iteration are used in the (i+1)-th
[Figure: tuples numbered 0 to 10 moving among the Link Table, the Path Table, and the Network]
Problem: Unpredictable delays and failures
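Semi-naive evaluation can be sketched in a few lines for the reachability rules: each round joins only the tuples that were new in the previous round. This is a centralized Python illustration under assumptions (the function name `semi_naive_reachable` and the sample chain are not from the slides), and it deliberately keeps the synchronous rounds that the slide identifies as a problem in a real network.

```python
def semi_naive_reachable(links):
    """Semi-naive evaluation of the reachable rules.
    Returns (all reachable tuples, number of rounds executed)."""
    reach = set(links)   # R1 results
    delta = set(links)   # tuples that were new in iteration i
    rounds = 0
    while delta:
        rounds += 1
        # R2 applied only to the delta of iteration i to get iteration i+1
        derived = {(s, d) for (s, z) in links for (z2, d) in delta if z == z2}
        delta = derived - reach
        reach |= delta
    return reach, rounds


# Hypothetical input: a directed chain a -> b -> c -> d.
links = {("a", "b"), ("b", "c"), ("c", "d")}
reach, rounds = semi_naive_reachable(links)
```

Every round has to finish before the next one starts, so a single slow or failed node would stall the whole computation in a distributed setting.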
41Pipelined Semi-naïve (PSN)
- Fully-asynchronous evaluation
- Computed tuples in any iteration pipelined to next iteration
- Natural for distributed dataflows
Relaxation of semi-naïve
[Figure: tuples numbered 0 to 10 moving among the Link Table, the Path Table, and the Network, no longer in iteration order]
42Pipelined Evaluation
- Challenges
- Does PSN produce the correct answer?
- Is PSN bandwidth efficient?
- I.e., does it make the minimum number of inferences?
- Duplicate avoidance: local timestamps
- Theorems
- RS_SN(p) = RS_PSN(p), where RS is the result set
- No repeated inferences in computing RS_PSN(p)
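The flavor of pipelined evaluation can be shown with a simplified Python sketch for the reachability rules: tuples are processed one at a time in an arbitrary order, each is stamped with a local arrival timestamp, and a tuple that already has a timestamp is never joined again. This is a sketch under assumptions, not the algorithm from the paper: the link table is static, the random order stands in for network delays, and the name `psn_reachable` is hypothetical.

```python
import random


def psn_reachable(links, seed=0):
    """Process reachable tuples one at a time, in arbitrary order."""
    rng = random.Random(seed)
    arrival = {}             # tuple -> local arrival timestamp
    pending = list(links)    # R1 results waiting to be processed
    clock = 0
    while pending:
        # Pick any pending tuple: models unpredictable delays and reordering.
        t = pending.pop(rng.randrange(len(pending)))
        if t in arrival:
            continue         # duplicate: already timestamped, never re-joined
        clock += 1
        arrival[t] = clock
        z, d = t
        # R2 fired on this single new tuple, with no round barrier
        for (s, z2) in links:
            if z2 == z:
                pending.append((s, d))
    return set(arrival)


links = {("a", "b"), ("b", "c"), ("c", "d")}
expected = {("a", "b"), ("b", "c"), ("c", "d"),
            ("a", "c"), ("b", "d"), ("a", "d")}
results = [psn_reachable(links, seed) for seed in range(5)]
```

Whatever order the tuples are drawn in, the result set is the same as in the round-based evaluation, which is the property the first theorem states.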
43Outline
- Background
- The Connection: Routing as a Query
- P2 Declarative Networking System
- Dataflow Generation and Execution
- Recursive Query Processing
- Optimizations
- Beyond routing: Declarative Overlays
- Conclusion
44Overview of Optimizations
- Traditional: evaluate in the NW context
- Aggregate Selections
- Magic Sets rewrite
- Predicate Reordering
- New: motivated by NW context
- Multi-query optimizations
- Query Results caching
- Opportunistic message sharing
- Cost-based optimizations (work-in-progress)
- Neighborhood density function
- Hybrid rewrites
45Aggregate Selections
- Prune communication using running state of monotonic aggregate
- Avoid sending tuples that do not affect value of agg
- E.g., shortest-paths query
- Challenge in distributed setting
- Out-of-order (in terms of monotonic aggregate) arrival of tuples
- Solution: Periodic aggregate selections
- Buffer up tuples, periodically send best-agg tuples
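The pruning idea can be sketched for a shortest-path query: a node forwards a path tuple only if it improves the running minimum for that source and destination. The Python below is a centralized simulation under assumptions (positive link costs, simple paths only, a FIFO queue standing in for the network, and the hypothetical function `all_pairs_shortest`); it does not model the periodic variant.

```python
from collections import deque


def all_pairs_shortest(links, aggregate_selection=True):
    """links: {(src, dst): cost} with positive costs.
    Returns (best cost per (src, dst), number of path tuples sent to neighbors)."""
    best = {}   # running state of the monotonic aggregate min<C>
    sent = 0
    queue = deque(((s, d), c) for (s, d), c in links.items())  # R1, derived locally
    while queue:
        path, cost = queue.popleft()
        key = (path[0], path[-1])
        if cost < best.get(key, float("inf")):
            best[key] = cost
        elif aggregate_selection:
            continue  # cannot change min<C>, so do not propagate it
        for (s, z), c1 in links.items():
            if z == path[0] and s not in path:  # simple paths only (assumption)
                queue.append(((s,) + path, c1 + cost))
                sent += 1
    return best, sent


# Hypothetical topology with symmetric link costs.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 5, ("b", "d"): 1}
links = dict(edges)
links.update({(y, x): c for (x, y), c in edges.items()})

best_pruned, sent_pruned = all_pairs_shortest(links, aggregate_selection=True)
best_full, sent_full = all_pairs_shortest(links, aggregate_selection=False)
```

Both runs agree on the final costs, while the pruned run sends fewer tuples. How much is saved depends on arrival order, which is why the slide raises out-of-order arrival as the challenge.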
46Aggregate Selections Evaluation
- P2 implementation of routing protocols on Emulab (100 nodes)
- All-pairs best-path queries (with aggregate selections)
- Aggregate Selections reduces communication overhead
- More effective when link metric correlated with network delay
- Periodic AS reduces communication overhead further
47Outline
- Background
- The Connection: Routing as a Query
- Realizing the Connection
- P2 Declarative Routing Engine
- Beyond routing: Declarative Overlays
- Conclusion
48Recall: Declarative Routing
[Figure: Declarative Router, with the P2 Engine in the Control Plane and a separate Forwarding Plane]
49Declarative Overlays
[Figure: Declarative Overlay Node. The P2 Engine provides the control and forwarding plane at the application level, on top of default Internet routing.]
50Declarative Overlays
- More challenging to specify
- Not just querying for routes using input links
- Rules for generating overlay topology
- Message delivery, acknowledgements, failure detection, timeouts, periodic probes, etc.
- Extensive use of timer-based event predicates
ping(@D,S) :- periodic(@S,10), link(@S,D)
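The ping rule reads: every 10 time units, node S sends a ping tuple to each neighbor D. A minimal Python sketch of that behavior with a simulated clock follows; the function `run_ping_rule`, the choice to fire first at time `period` rather than at time 0, and the sample links are assumptions made for illustration.

```python
def run_ping_rule(links, period, duration):
    """Simulate ping(@D,S) :- periodic(@S,period), link(@S,D).
    Returns a list of (time, D, S) ping tuples in delivery order."""
    delivered = []
    # periodic(@S, period) fires at period, 2*period, ... (assumed start time)
    for now in range(period, duration + 1, period):
        for s, d in sorted(links):
            delivered.append((now, d, s))  # ping stored at D, naming sender S
    return delivered


# Hypothetical link table at node a.
links = {("a", "b"), ("a", "c")}
pings = run_ping_rule(links, period=10, duration=30)
```

The timer event plays the role that an incoming tuple plays in the routing rules: it is simply another input that triggers a strand.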
51P2-Chord
- Chord Routing, including
- Multiple successors
- Stabilization
- Optimized finger maintenance
- Failure detection
- 47 rules
- 13 table definitions
- MIT-Chord: 100x more code
- Another example
- Narada mesh in 16 rules
[Figure: the specification shown in 10 pt font]
52Actual Chord Lookup Dataflow
53P2-Chord Evaluation
- P2 nodes running Chord on 100 Emulab nodes
- Logarithmic lookup hop-count and state (correct)
- Median lookup latency: 1-1.5s
- BW-efficient: 300 bytes/s/node
54Moving up the stack
- Querying the overlay
- Routing tables are views to be queried
- Queries on route resilience, network diameter, path length
- Recursive queries for network discovery
- Distributed Gnutella crawler on PlanetLab [IPTPS '03]
- Distributed web crawler over DHTs on PlanetLab
55Outline
- Background
- The Connection: Routing as a Query
- Realizing the Connection
- Beyond routing: Declarative Overlays
- Conclusion
56A Sampling of Related Work
- Databases
- Recursive queries: software analysis, trust management, distributed systems diagnosis
- Opportunities: Computational biology, data integration, sensor networks
- Networking
- XORP: Extensible Routers
- High-level routing specifications
- Meta-Routing, Routing logic
57Future Directions
- Declarative Networking
- Static checks on desirable network properties
- Automatic cost-based optimizations
- Component-based network abstractions
- Core Internet Infrastructure
- Declarative specifications of ISP configurations
- P2 deployment in routers
58Distributed Data Management on Declarative Networks
- Run-time cross-layer optimizations
- Reoptimize data placement and queries
- Reconfigure networks based on data and query workloads
59Other Work
- Internet-Scale Query Processing
- PIER: Distributed query processor on DHTs
- http://pier.cs.berkeley.edu [VLDB 2003, CIDR 2005]
- P2P Search Infrastructures
- P2P Web Search and Indexing [IPTPS 2003]
- Gnutella measurements on PlanetLab [IPTPS 2004]
- Distributed Gnutella crawler and monitoring
- Hybrid P2P search [VLDB 2004]
60Contributions and Summary
- P2 Declarative Networking System
- Declarative Routing Engine
- Extensible routing infrastructure
- Declarative Overlays
- Rapid prototyping overlay networks
- Database fundamentals
- Query language
- New distributed query execution strategies and optimizations
- Semantics in dynamic networks
- Period of flux in Internet research
- Declarative Networks can play an important role